WO2022236579A1 - Gait recognition method and system based on lightweight attention convolutional neural network - Google Patents

Gait recognition method and system based on lightweight attention convolutional neural network

Info

Publication number
WO2022236579A1
WO2022236579A1 PCT/CN2021/092775 CN2021092775W WO2022236579A1 WO 2022236579 A1 WO2022236579 A1 WO 2022236579A1 CN 2021092775 W CN2021092775 W CN 2021092775W WO 2022236579 A1 WO2022236579 A1 WO 2022236579A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
convolutional neural
features
channel
lightweight
Prior art date
Application number
PCT/CN2021/092775
Other languages
French (fr)
Chinese (zh)
Inventor
孙方敏
李烨
黄浩华
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院 filed Critical 中国科学院深圳先进技术研究院
Priority to PCT/CN2021/092775 priority Critical patent/WO2022236579A1/en
Publication of WO2022236579A1 publication Critical patent/WO2022236579A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition

Definitions

  • the present invention relates to the field of computer application technology, more specifically, to a gait recognition method and system based on a lightweight attentional convolutional neural network.
  • biometric technology is the latest technology for access control of wearable smart devices, which identifies individuals based on unique, stable and measurable physiological or behavioral characteristics of human beings.
  • Physiological characteristics mainly include face, fingerprint and iris, etc.
  • behavioral characteristics are related to a person's behavior patterns, including gait, signature and so on.
  • although biometric technologies based on physiological characteristics have been widely used, they also have many insurmountable shortcomings.
  • the sensors used to acquire physiological characteristics (such as fingerprint scanners, cameras, etc.) are expensive and large in size, which increases the weight and cost of wearable smart devices.
  • biometric technology based on physiological characteristics requires explicit interaction between the user and the device, and cannot achieve long-distance, active, real-time and continuous identification. When the device is lost in the unlocked state, the security risk is huge.
  • gait refers to the walking posture of the human body. Research has shown that each individual's gait is unique and stable, making it difficult to imitate or replicate.
  • Gait-based identification does not require explicit interaction between the user and the device, and is an active, real-time and continuous identification method with high security.
  • with the development of microelectronics technology, inertial sensors with small size, low power consumption and low cost have been integrated into almost all wearable smart devices, which makes it possible to obtain gait information with wearable smart devices and to identify users with corresponding algorithms.
  • Gait identification technology based on wearable smart devices has received extensive attention and research from scholars at home and abroad.
  • gait recognition methods based on wearable smart devices mainly include three categories: template matching methods, machine learning methods and deep learning methods.
  • the template matching method identifies the user by calculating and comparing the similarity between the gait template stored in the wearable smart device and the gait cycle to be detected; if the similarity is higher than a preset threshold, the user is identified as a legitimate user.
  • the methods used to calculate the similarity mainly include Dynamic Time Warping (DTW), the Pearson Correlation Coefficient (PCC) and cross-correlation.
  • DTW Dynamic Time Warping
  • PCC Pearson Correlation Coefficient
  • cross-correlation etc.
  • template matching methods need to detect gait cycles to construct gait templates and test samples, and gait cycle detection is a challenging task because it is sensitive to noise and device position; any change in pace, road conditions or device position can easily lead to failure of gait cycle detection or loss of phases within a gait cycle, which leads to wrong recognition decisions. Therefore, the robustness and accuracy of template matching methods cannot yet meet the needs of practical applications.
  • Machine learning methods achieve identity recognition by extracting features of gait signals for classification.
  • Existing studies have used algorithms such as support vector machines (SVM), k-nearest neighbors (KNN) and random forests (RF) for gait identification, and achieved better performance than template matching methods.
  • SVM support vector machine
  • KNN k-nearest neighbors
  • RF random forest
  • however, the recognition accuracy of machine learning methods is greatly affected by the manually extracted features.
  • manually extracting features requires researchers to have rich professional knowledge and experience in related fields and involves a certain degree of subjectivity; data preprocessing, feature engineering and continuous experimental verification and refinement are required to obtain good results, which is time-consuming and difficult.
  • Deep learning networks have powerful nonlinear representation learning capabilities, which can automatically extract useful features from input data for classification and other tasks.
  • Existing studies have proposed many deep learning-based gait recognition methods, which have been extensively compared with traditional machine learning algorithms and template matching algorithms, and have achieved better performance improvements in recognition accuracy.
  • although the deep learning method can automatically extract useful features from the data and has better robustness and higher recognition performance than the template matching method and the machine learning method, the models proposed by existing research have high complexity (a large number of model parameters) and are not suitable for wearable smart devices with limited computing power and capacity.
  • An object of the present invention is to provide a lightweight attentional convolutional neural network for gait recognition based on wearable smart devices, which can achieve better performance improvement while occupying less memory resources.
  • a gait recognition method based on a lightweight attentional convolutional neural network includes the following steps:
  • Step S1 Input the collected triaxial acceleration and triaxial angular velocity gait data into a lightweight convolutional neural network to extract gait features.
  • the lightweight convolutional neural network performs one-dimensional convolution calculations on the time axis to respectively extract the features in the single-axis acceleration signals and the single-axis angular velocity signals, and uses two-dimensional convolution to fuse the extracted six-axis signal features to obtain the output feature map;
  • Step S2 For the feature map output by the lightweight convolutional neural network, calculate the attention weight parameters of each channel according to the context coding information of each channel;
  • Step S3 For the feature map of each channel output by the lightweight convolutional neural network, use depthwise separable convolution to further extract features, multiply them by the attention weight parameters of the corresponding channel, and then perform gait recognition, wherein the depthwise separable convolution performs convolution operations only in the spatial dimension.
  • a gait recognition system based on a lightweight attention convolutional neural network includes:
  • Lightweight convolutional neural network used to extract gait features and obtain output feature maps with triaxial acceleration and triaxial angular velocity gait data as input, where the lightweight convolutional neural network performs one-dimensional convolution calculations on the time axis, respectively extracting the features in the single-axis acceleration signal and single-axis angular velocity signal, and using two-dimensional convolution to fuse the extracted six-axis signal features;
  • Attention module used to calculate, for the feature map output by the lightweight convolutional neural network, the attention weight parameters of each channel according to the context encoding information of each channel; and, for the feature map of each channel output by the lightweight convolutional neural network, to further extract features by depthwise separable convolution and then multiply them by the attention weight parameter of the corresponding channel to obtain enhanced features, wherein the depthwise separable convolution only performs convolution operations in the spatial dimension;
  • Prediction output module used for gait recognition according to the enhanced features.
  • the present invention has the advantage of proposing a lightweight neural network model suitable for wearable smart devices, which can obtain higher recognition accuracy while occupying fewer memory resources, solving the problem that existing research requires high-complexity models to obtain high recognition accuracy.
  • Fig. 1 is the flowchart of the gait recognition method based on lightweight attention convolutional neural network according to one embodiment of the present invention
  • Fig. 2 is a structural diagram of a lightweight attention convolutional neural network according to one embodiment of the present invention.
  • Fig. 3 is a schematic diagram of extracting a feature map of each channel through depthwise separable convolution according to an embodiment of the present invention.
  • the present invention proposes a lightweight attentional convolutional neural network, which is a new technical solution for realizing gait recognition based on wearable smart devices.
  • the technical solution first uses a lightweight convolutional neural network (CNN) to extract gait features from the three-axis acceleration and three-axis angular velocity data collected by wearable smart devices.
  • CNN convolutional neural network
  • a new attention weight calculation method is proposed, and an attention module is designed based on the attention weight calculation method, context encoding information and depthwise separable convolution; this module is embedded into the lightweight CNN to enhance gait features and simplify the complexity of the model.
  • the enhanced gait features are input into, for example, a Softmax classifier for classification, and then the gait recognition result is output.
  • the provided gait recognition method based on a lightweight attention convolutional neural network includes the following steps.
  • Step S110 taking the three-axis acceleration and three-axis angular velocity gait data as input, and using a lightweight convolutional neural network to extract features.
  • the lightweight attention convolutional neural network is shown in Figure 2, which generally includes an input layer, a convolutional neural network, an attention module (marked as Attention) and an output layer (prediction output module).
  • the input layer receives the three-axis acceleration and three-axis angular velocity gait data collected by wearable smart devices, the convolutional neural network is used to extract gait features from the gait data, and the attention module is used to enhance the extracted gait features; the enhanced features are input into the Softmax classifier for classification and the recognition result is output.
  • L-CNN Lightweight CNN
  • L-CNN contains four convolutional layers and two pooling layers. Two pooling layers are respectively placed after the first and third convolutional layers to further extract the main features of the convolutional layers.
  • BN BatchNormalization
  • ReLU ReLU activation layer
  • the first three convolutional layers of L-CNN use 1D convolution, that is, convolution calculations are performed on the time axis, and the features in the single-axis acceleration and angular velocity signals are extracted respectively; this is conducive to obtaining better feature representations of the single-axis signals.
  • the last convolutional layer of L-CNN uses 2D convolution to fuse the six-axis signal features extracted by the previous three convolutional layers, so as to obtain more useful latent high-level features, which helps the network obtain better recognition performance.
  • the hierarchical structure and parameter settings of L-CNN are shown in Table 1 below.
  • Table 1 L-CNN hierarchy and parameter settings
  • Step S120 for the feature map output by the lightweight convolutional neural network, use the channel attention mechanism to extract enhanced features.
  • the attention module uses the channel attention mechanism to learn the correlation between the information of a single channel and that of all channels, and uses this correlation as the weights of the different channels to multiply with the original feature maps, so as to enhance the feature maps of important channels; the larger the weight value, the more important the information contained in that channel's feature map.
  • in the prior art, an attention weight calculation module is usually composed of Global Average Pooling (GAP) and Fully Connected (FC) layers to obtain the weights of different channels, but the presence of the fully connected layers increases the model parameters of the network.
  • GAP Global Average Pooling
  • FC Fully Connected Layers
  • a new channel weight calculation method is proposed.
  • let F ∈ R^(H×W×C) be a set of feature maps output by L-CNN, where H, W and C denote the height, width and channel dimension of the feature maps respectively; the weight of the i-th channel is defined by formula (1) as γ_i = F_i / ∑_{j=1}^{C} F_j.
  • the numerator F_i represents the context encoding information contained in the i-th channel, which can be represented by a single value or by the sum of a set of data, and the denominator ∑_{j=1}^{C} F_j represents the sum of the context encoding information of all channels.
  • a Context Encoding Module (Context Encoding Module, CEM) is used to capture global context information and selectively highlight feature maps associated with categories.
  • CEM Context Encoding Module
  • This module is described in ("Deep TEN: Texture Encoding Network". IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, USA, July 21-26, 2017, Zhang, H.; Xue, J.; Dana, K.). Since this module combines dictionary learning and residual encoding, it carries domain-specific information and can be transferred to the processing of gait time-series signals. The whole CEM is differentiable, and embedding this module into a convolutional neural network enables end-to-end learning and optimization.
  • CEM contains K D-dimensional encoding vectors.
  • the values of these encoding vectors are generally initialized randomly, and reasonable values are automatically learned during the continuous training of the network.
  • after a single feature map is processed by the CEM, a new vector set E with a fixed length K is obtained, and each element in E contains context encoding information.
  • when the dictionary contains only one encoding vector (K=1), a single feature map yields only one weight parameter containing context encoding information after passing through the context encoding module, and an input feature map containing C channels yields C weight parameters, denoted γ = {E_1, E_2, ..., E_C}, from which the attention weight of each channel can be calculated according to formula (1).
  • the present invention does not directly multiply the channel weight parameter and the original feature map, but makes some improvements to it.
  • the channel attention mechanism can determine which channels are important and enhance the feature maps of important channels, so as to select feature maps that are more relevant to the target task. But the channel attention mechanism ignores that there may still be some useless or redundant features in the feature maps of important channels.
  • to address this problem, features are further extracted from the feature map of each channel before multiplying by the attention weight parameters.
  • depthwise separable convolution (Depthwise Separable Convolution, DS-Conv) is used to implement feature extraction on the feature map of each channel.
  • Depthwise separable convolution is a model lightweighting technique. Unlike traditional convolution, which performs convolution operations in both the spatial and channel dimensions, depthwise separable convolution performs convolution operations only in the spatial dimension. Because convolution is computed only in the spatial dimension, the depthwise separable convolution does not need to specify the number of convolution kernels, which significantly reduces the number of parameters the model needs to learn.
  • the embodiment of the present invention proposes a channel attention method that can effectively improve model recognition performance and simplify model complexity, which is named CEDS-A (Attention with Context Encoding and Depthwise Separable Convolution), its structure is shown in Figure 3.
  • Input F (H,W,C) represents a set of feature maps output by L-CNN
  • DS-Conv represents the depthwise separable convolution
  • γ(1,1,C) represents the channel attention weights
  • Y(H',W',C) is the new set of feature maps obtained.
  • Equation (2) is a mathematical description of Figure 3, where D_C represents the depthwise separable convolution operation (for example, with its convolution kernel size set to 1×3) and δ_N represents BN+Sigmoid.
  • Step S130 using enhanced features for gait recognition.
  • based on the enhanced features extracted above, a classifier such as a Softmax classifier can further be used to judge whether the corresponding gait features are legitimate, thereby realizing personal identity verification.
  • the invention can effectively improve the recognition rate of gait identity authentication, and can be applied to monitoring systems in various occasions.
  • the whuGait dataset contains gait data collected by smartphone from 118 subjects walking outdoors in a completely unconstrained environment; when, where and how each subject walks is not known.
  • the whuGait dataset consists of 8 sub-datasets: dataset #1 to dataset #4 are used for identification, dataset #5 and dataset #6 for authentication, and dataset #7 and dataset #8 for separating walking data from non-walking data.
  • the present invention only uses two of the sub-datasets, dataset #1 and dataset #2.
  • the OU-ISIR dataset is currently the inertial sensor-based gait dataset with the largest number of experimental participants, and it includes gait data of 744 experimenters (389 males, 355 females, ranging in age from 2 to 78 years old).
  • the OU-ISIR and whuGait datasets are available as open-source processed datasets on GitHub (https://github.com/qinnzou/). See Table 2 for details of the datasets used in the experiments. There is no intersection between the training set and the test set used in the experiments, and the sample overlap rate refers to the overlap between samples within the training set and within the test set, respectively.
  • the network model uses Early Stopping to control the number of iterations of network training.
  • the early stopping method is a widely used model training method; it means that during the network training process, if the performance of the network on the validation set has not improved for N consecutive iterations, the learning and training of the network are stopped.
  • by monitoring whether a performance indicator (such as accuracy or average error) improves, the early stopping method saves the model or model parameters that perform best on the validation set during training, which can prevent over-fitting and improve the generalization performance of the model.
  • the accuracy rate is used as the monitoring indicator
  • N is set to 50 to control the training of the network, that is, if the accuracy of the network on the validation set has not improved for 50 consecutive iterations, the training of the network ends.
  • the method proposed by the present invention is mainly compared with the existing technical solution "Deep Learning-Based Gait Recognition Using Smartphones in the Wild" (IEEE Transactions on Information Forensics and Security, 2020, 15, 3197-3212, Zou, Q.; Wang, Y.; Wang, Q.; et al.).
  • the method proposed by the present invention (marked as L-CNN+CEDS-A) achieves recognition accuracy 1.39% and 0.95% higher than the experimental results of the existing CNN+LSTM on dataset #1 and dataset #2 respectively, and 25.16% higher on the OU-ISIR dataset.
  • the number of parameters of the model of the present invention is reduced by 87.8% on average compared with that of the existing CNN+LSTM model, which shows that the model occupies fewer memory resources.
  • the present invention also provides a gait recognition system based on a lightweight attentional convolutional neural network, which is used to realize one or more aspects of the above method.
  • the system includes: a lightweight convolutional neural network, which is used to take three-axis acceleration and three-axis angular velocity gait data as input, extract gait features and obtain an output feature map, wherein the lightweight convolutional neural network performs one-dimensional convolution calculations on the time axis to respectively extract the features in the single-axis acceleration signals and the single-axis angular velocity signals, and uses two-dimensional convolution to fuse the extracted six-axis signal features; an attention module, which is used to calculate, for the feature map output by the lightweight convolutional neural network, the attention weight parameters of each channel according to the context encoding information of each channel, and, for the feature map of each channel output by the lightweight convolutional neural network, to further extract features with depthwise separable convolution and multiply them by the attention weight parameter of the corresponding channel to obtain enhanced features, wherein the depthwise separable convolution performs convolution operations only in the spatial dimension; and a prediction output module, which is used for gait recognition according to the enhanced features.
  • the present invention proposes a new channel attention weight calculation method, which is simple and effective, and hardly increases the number of parameters of the model. Based on the proposed channel attention weight calculation method, context encoding module and depthwise separable convolution, the present invention proposes a channel attention module that can effectively improve model recognition performance and simplify model complexity.
  • the lightweight convolutional neural network and the channel attention module designed by the present invention are combined to form a complete gait recognition network, which achieves better performance improvement while occupying less memory resources.
  • the present invention can be a system, method and/or computer program product.
  • a computer program product may include a computer readable storage medium having computer readable program instructions thereon for causing a processor to implement various aspects of the present invention.
  • a computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device.
  • a computer readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves having instructions stored thereon, and any suitable combination of the above.
  • RAM random access memory
  • ROM read-only memory
  • EPROM erasable programmable read-only memory
  • SRAM static random access memory
  • CD-ROM compact disc read only memory
  • DVD digital versatile disc
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., pulses of light through fiber optic cables), or transmitted electrical signals.
  • Computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or downloaded to an external computer or external storage device over a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • a network adapter card or a network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in each computing/processing device.
  • Computer program instructions for carrying out operations of the present invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, Python, etc., and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • LAN local area network
  • WAN wide area network
  • in some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA) or a programmable logic array (PLA), can execute the computer-readable program instructions to implement various aspects of the present invention.
  • FPGA field programmable gate array
  • PLA programmable logic array
  • These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce an apparatus for realizing the functions/actions specified in one or more blocks of the flowchart and/or block diagram.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium; these instructions cause computers, programmable data processing devices and/or other devices to work in a specific way, so that the computer-readable medium storing the instructions comprises an article of manufacture including instructions for implementing various aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
  • each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which comprises one or more executable instructions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified function or action, or may be implemented by a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation by means of hardware, implementation by means of software, and implementation by a combination of software and hardware are all equivalent.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed in the present invention are a gait recognition method and system based on a lightweight attention convolutional neural network. The method comprises: inputting collected three-axis acceleration gait data and three-axis angular velocity gait data into a lightweight convolutional neural network, so as to extract gait features, wherein the lightweight convolutional neural network performs a one-dimensional convolution calculation on a time axis, respectively extracts features in an acceleration single-axis signal and an angular velocity single-axis signal, and fuses extracted six-axis signal features by means of two-dimensional convolution; for a feature map output by the lightweight convolutional neural network, calculating an attention weight parameter of each channel according to context encoding information of each channel; and for a feature map of each channel that is output by the lightweight convolutional neural network, further extracting features by means of depthwise separable convolution, and then multiplying the extracted features by the attention weight parameter of the corresponding channel, so as to enhance the features, wherein the enhanced features are used for classification, thereby realizing gait recognition. By means of the present invention, the complexity of a model can be reduced, and the accuracy of gait recognition can be improved.

Description

A Gait Recognition Method and System Based on a Lightweight Attention Convolutional Neural Network
Technical Field
The present invention relates to the field of computer application technology, and more specifically, to a gait recognition method and system based on a lightweight attention convolutional neural network.
Background Art
In recent years, the types (such as smartphones and smart watches) and quantities of wearable smart devices have grown dramatically, and their applications have become increasingly common, including mobile payment, instant messaging, social entertainment, positioning and navigation, remote office work and health monitoring. The popularity of wearable smart devices has brought great convenience to people's lives, but because such devices may store and collect sensitive personal information during use, they carry a high risk of privacy leakage, which has drawn much attention to their security. As the first line of defense for protecting information security, identity recognition plays a pivotal role. Gait recognition based on wearable smart devices is an effective identification method, which identifies individuals through their unique walking styles and has the advantages of long-distance, active, real-time and continuous recognition. At present, using deep learning technology for gait recognition has achieved significant performance improvements and has become a promising new trend. However, most existing studies only focus on improving recognition accuracy; their network models usually have high complexity, ignoring the importance of lightweight models for wearable smart devices with limited computing power and storage resources.
Among the existing technologies, biometric technology is the latest technology for access control of wearable smart devices; it identifies individuals based on unique, stable and measurable physiological or behavioral characteristics of human beings. Physiological characteristics mainly include the face, fingerprint and iris, while behavioral characteristics are related to a person's behavior patterns, including gait, signature and so on. Although biometric technologies based on physiological characteristics have been widely used, they also have many insurmountable shortcomings. First, the sensors used to acquire physiological characteristics (such as fingerprint scanners and cameras) are expensive and large, which increases the weight and cost of wearable smart devices. Second, physiological features such as fingerprints and faces carry a risk of being copied; for example, 3D printing can easily replicate a user's fingerprints to unlock a device. Finally, biometric technology based on physiological characteristics requires explicit interaction between the user and the device, and cannot achieve long-distance, active, real-time and continuous identification; when the device is lost in the unlocked state, the security risk is huge.
As a behavioral characteristic, gait refers to the walking posture of the human body. Research has shown that each individual's gait is unique and stable, making it difficult to imitate or replicate. Gait-based identification (gait recognition) does not require explicit interaction between the user and the device, and is an active, real-time and continuous identification method with high security. With the development of microelectronics technology, inertial sensors with small size, low power consumption and low cost have been integrated into almost all wearable smart devices, which makes it possible to obtain gait information with wearable smart devices and to identify users with corresponding algorithms. Gait identification technology based on wearable smart devices has received extensive attention and research from scholars at home and abroad. At present, gait recognition methods based on wearable smart devices mainly fall into three categories: template matching methods, machine learning methods and deep learning methods.
The template matching method identifies the user by calculating and comparing the similarity between the gait template stored in the wearable smart device and the gait cycle to be detected; if the similarity is higher than a preset threshold, the user is identified as a legitimate user. The methods used to calculate the similarity mainly include Dynamic Time Warping (DTW), the Pearson Correlation Coefficient (PCC) and cross-correlation. Many studies have proposed different template matching methods and achieved good performance under laboratory conditions. However, template matching methods need to detect gait cycles to construct gait templates and test samples, and gait cycle detection is a challenging task because it is sensitive to noise and device position; any change in pace, road conditions or device position can easily lead to failure of gait cycle detection or loss of phases within a gait cycle, which leads to wrong recognition decisions. Therefore, the robustness and accuracy of template matching methods cannot yet meet the needs of practical applications.
Machine learning methods achieve identity recognition by extracting features of gait signals for classification. Existing studies have used algorithms such as support vector machines (SVM), k-nearest neighbors (KNN) and random forests (RF) for gait identification, and achieved better performance than template matching methods. However, the recognition accuracy of machine learning methods is greatly affected by the manually extracted features; manual feature extraction requires researchers to have rich professional knowledge and experience in related fields and involves a certain degree of subjectivity, and data preprocessing, feature engineering and continuous experimental verification and refinement are required to obtain good results, which is time-consuming and difficult.
Recent studies have shown that adopting deep learning models, such as convolutional neural networks (CNN), for gait recognition has achieved significant performance improvements and has become a promising new trend. Deep learning networks have powerful nonlinear representation learning capabilities and can automatically extract useful features from input data for classification and other tasks. Existing studies have proposed many deep learning-based gait recognition methods, which have been extensively compared with traditional machine learning algorithms and template matching algorithms and achieve better recognition accuracy. Although deep learning methods can automatically extract useful features from data and have better robustness and higher recognition performance than template matching methods and machine learning methods, the models proposed by existing research have high complexity (a large number of model parameters) and are not suitable for wearable smart devices with limited computing power and capacity.
Summary of the Invention
An object of the present invention is to provide a lightweight attention convolutional neural network for gait recognition based on wearable smart devices, which can achieve better performance improvement while occupying fewer memory resources.
According to a first aspect of the present invention, a gait recognition method based on a lightweight attention convolutional neural network is provided. The method includes the following steps:
Step S1: input the collected three-axis acceleration and three-axis angular velocity gait data into a lightweight convolutional neural network to extract gait features, wherein the lightweight convolutional neural network performs one-dimensional convolution calculations on the time axis to respectively extract the features in the single-axis acceleration signals and the single-axis angular velocity signals, and uses two-dimensional convolution to fuse the extracted six-axis signal features to obtain an output feature map;
Step S2: for the feature map output by the lightweight convolutional neural network, calculate the attention weight parameters of each channel according to the context encoding information of each channel;
Step S3: for the feature map of each channel output by the lightweight convolutional neural network, use depthwise separable convolution to further extract features, multiply them by the attention weight parameter of the corresponding channel, and then perform gait recognition, wherein the depthwise separable convolution performs convolution operations only in the spatial dimension.
According to a second aspect of the present invention, a gait recognition system based on a lightweight attention convolutional neural network is provided. The system includes:
A lightweight convolutional neural network: used to take three-axis acceleration and three-axis angular velocity gait data as input, extract gait features and obtain an output feature map, wherein the lightweight convolutional neural network performs one-dimensional convolution calculations on the time axis to respectively extract the features in the single-axis acceleration signals and the single-axis angular velocity signals, and uses two-dimensional convolution to fuse the extracted six-axis signal features;
An attention module: used to calculate, for the feature map output by the lightweight convolutional neural network, the attention weight parameters of each channel according to the context encoding information of each channel; and, for the feature map of each channel output by the lightweight convolutional neural network, to further extract features with depthwise separable convolution and multiply them by the attention weight parameter of the corresponding channel to obtain enhanced features, wherein the depthwise separable convolution performs convolution operations only in the spatial dimension;
A prediction output module: used for gait recognition according to the enhanced features.
Compared with the prior art, the present invention has the advantage of proposing a lightweight neural network model suitable for wearable smart devices, which can obtain higher recognition accuracy while occupying fewer memory resources, solving the problem that existing research requires high-complexity models to obtain high recognition accuracy.
Other features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments of the present invention with reference to the accompanying drawings.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a flowchart of a gait recognition method based on a lightweight attention convolutional neural network according to an embodiment of the present invention;
Fig. 2 is a structural diagram of a lightweight attention convolutional neural network according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of extracting the feature map of each channel through depthwise separable convolution according to an embodiment of the present invention.
Detailed Description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that the relative arrangements of components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless specifically stated otherwise.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the invention, its application or its uses.
Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate, such techniques, methods and devices should be considered part of the description.
In all examples shown and discussed herein, any specific values should be construed as exemplary only, and not as limitations; therefore, other instances of the exemplary embodiments may have different values.
It should be noted that like numerals and letters denote like items in the following figures; therefore, once an item is defined in one figure, it does not require further discussion in subsequent figures.
The present invention proposes a lightweight attention convolutional neural network as a new technical solution for realizing gait recognition based on wearable smart devices. In short, the technical solution first uses a lightweight convolutional neural network (CNN) to extract gait features from the three-axis acceleration and three-axis angular velocity data collected by wearable smart devices. Then, a new attention weight calculation method is proposed, and an attention module is designed based on this attention weight calculation method, context encoding information and depthwise separable convolution; this module is embedded into the lightweight CNN to enhance the gait features and simplify the complexity of the model. Finally, the enhanced gait features are input into, for example, a Softmax classifier for classification, and the gait recognition result is output.
Specifically, as shown in Fig. 1, the provided gait recognition method based on a lightweight attention convolutional neural network includes the following steps.
Step S110: take the three-axis acceleration and three-axis angular velocity gait data as input, and use a lightweight convolutional neural network to extract features.
In one embodiment, the lightweight attention convolutional neural network is shown in Fig. 2; overall, it includes an input layer, a convolutional neural network, an attention module (marked as Attention) and an output layer (prediction output module).
The input layer receives the three-axis acceleration and three-axis angular velocity gait data collected by the wearable smart device, the convolutional neural network is used to extract gait features from the gait data, and the attention module is used to enhance the extracted gait features; the enhanced features are input into a Softmax classifier for classification, and the recognition result is output.
The convolutional neural network in Fig. 2 is designed as a lightweight network structure, hereinafter referred to as L-CNN (Lightweight CNN); it is the front end of the entire network and is used to extract features from the input data. For example, L-CNN contains four convolutional layers and two pooling layers. The two pooling layers are placed after the first and third convolutional layers respectively, to further extract the main features of those layers. A batch normalization layer (BN) and a ReLU activation layer are set after each convolutional or pooling layer; the BN and ReLU layers can speed up network training and convergence, and prevent gradient vanishing or explosion as well as overfitting. The first three convolutional layers of L-CNN use 1D convolution, that is, convolution calculations are performed on the time axis to respectively extract the features in the single-axis acceleration and angular velocity signals; this is conducive to obtaining better feature representations of the single-axis signals. The last convolutional layer of L-CNN uses 2D convolution to fuse the six-axis signal features extracted by the previous three convolutional layers, so as to obtain more useful latent high-level features, which helps the network obtain better recognition performance. The hierarchical structure and parameter settings of L-CNN are shown in Table 1 below.
Table 1: L-CNN hierarchy and parameter settings
(Table 1 is provided only as an image in the original publication and is not reproduced here.)
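Since Table 1 is available only as an image, the following PyTorch sketch of an L-CNN-style backbone uses placeholder kernel sizes and filter counts; the (batch, 1, T, 6) input layout (T time steps by six sensor axes) and the specific layer widths are assumptions rather than the patented configuration.

```python
import torch
import torch.nn as nn

class LCNN(nn.Module):
    """L-CNN-style backbone: three 1D (time-axis) conv layers, one 2D fusion layer."""
    def __init__(self):
        super().__init__()
        def block(cin, cout, ksize, pad, pool=False):
            layers = [nn.Conv2d(cin, cout, kernel_size=ksize, padding=pad),
                      nn.BatchNorm2d(cout), nn.ReLU(inplace=True)]
            if pool:                                  # pooling along the time axis only
                layers.append(nn.MaxPool2d(kernel_size=(2, 1)))
            return nn.Sequential(*layers)

        # 1D convolutions over time: each of the six sensor axes is filtered separately
        self.conv1 = block(1, 16, (9, 1), (4, 0), pool=True)
        self.conv2 = block(16, 32, (9, 1), (4, 0))
        self.conv3 = block(32, 64, (9, 1), (4, 0), pool=True)
        # 2D convolution that fuses the features of the six axes into one column
        self.conv4 = block(64, 64, (3, 6), (1, 0))

    def forward(self, x):                             # x: (batch, 1, T, 6)
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        return self.conv4(x)                          # (batch, 64, T/4, 1) feature maps
```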
Step S120: for the feature map output by the lightweight convolutional neural network, use the channel attention mechanism to extract enhanced features.
In Fig. 2, the attention module uses the channel attention mechanism to learn the correlation between the information of a single channel and that of all channels, and uses this correlation as the weights of the different channels to multiply with the original feature maps, so as to enhance the feature maps of important channels; the larger the weight value, the more important the information contained in that channel's feature map. In the prior art, an attention weight calculation module is usually composed of Global Average Pooling (GAP) and Fully Connected (FC) layers to obtain the weights of the different channels, but the presence of the fully connected layers increases the model parameters of the network.
In one embodiment, a new channel weight calculation method is proposed. Let F ∈ R^(H×W×C) be a set of feature maps output by L-CNN, where H, W and C denote the height, width and channel dimension of the feature maps respectively. The weight calculation formula of the i-th channel is defined as:
γ_i = F_i / ∑_{j=1}^{C} F_j        (1)
where the numerator F_i represents the context encoding information contained in the i-th channel, which can be represented by a single value or by the sum of a set of data, and the denominator ∑_{j=1}^{C} F_j represents the sum of the context encoding information of all channels.
In order to obtain the context encoding information, preferably, a Context Encoding Module (CEM) is used to capture global context information and selectively highlight the feature maps associated with categories. This module is described in "Deep TEN: Texture Encoding Network" (IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, Hawaii, USA, July 21-26, 2017, Zhang, H.; Xue, J.; Dana, K.). Since this module combines dictionary learning and residual encoding, it carries domain-specific information and can be transferred to the processing of gait time-series signals. The whole CEM is differentiable, and embedding this module into a convolutional neural network enables end-to-end learning and optimization.
For example, the CEM contains K D-dimensional encoding vectors; the values of these encoding vectors are generally initialized randomly, and reasonable values are learned automatically as the network is trained. After a single feature map is processed by the CEM, a new vector set E with a fixed length K is obtained, and each element of E contains context encoding information. In one embodiment, the dictionary contains only one encoding vector, i.e. K=1; in this case, a single feature map yields only one weight parameter containing context encoding information after passing through the context encoding module, and an input feature map containing C channels yields C weight parameters, denoted γ = {E_1, E_2, ..., E_C}. The attention weight of each channel can be calculated according to formula (1) and γ = {E_1, E_2, ..., E_C}.
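For illustration, a minimal sketch of the per-channel weight normalization of formula (1) follows; the tensor E of per-channel context codes is assumed to come from a context encoding module with K = 1, which is not reproduced here.

```python
import torch

def channel_attention_weights(E: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Formula (1): gamma_i = E_i / sum_j E_j, computed independently per sample."""
    return E / (E.sum(dim=1, keepdim=True) + eps)     # shape (batch, C)

# Example: 8 feature-map channels for a batch of 2 samples.
E = torch.rand(2, 8)                                  # placeholder context codes
gamma = channel_attention_weights(E)
print(gamma.sum(dim=1))                               # each row sums to ~1
```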
After obtaining the channel attention weights, the present invention does not directly multiply the channel weight parameters by the original feature maps, but makes some improvements. The channel attention mechanism can determine which channels are important and enhance the feature maps of the important channels, so as to select feature maps that are more relevant to the target task. However, the channel attention mechanism ignores the fact that there may still be some useless or redundant features in the feature maps of important channels. To address this problem, features are further extracted from the feature map of each channel before multiplying by the attention weight parameters.
In one embodiment, depthwise separable convolution (DS-Conv) is used to extract features from the feature map of each channel. Depthwise separable convolution is a model lightweighting technique. Unlike traditional convolution, which performs convolution operations in both the spatial and channel dimensions, depthwise separable convolution performs convolution operations only in the spatial dimension. Because convolution is computed only in the spatial dimension, the depthwise separable convolution does not need to specify the number of convolution kernels, which significantly reduces the number of parameters the model needs to learn.
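The per-channel behaviour described above can be obtained in PyTorch by setting the groups argument equal to the channel count; the sketch below uses the 1x3 kernel mentioned for formula (2), while the channel count and feature-map size are arbitrary placeholders.

```python
import torch
import torch.nn as nn

C = 64                                    # number of feature-map channels (placeholder)
ds_conv = nn.Conv2d(C, C, kernel_size=(1, 3), padding=(0, 1), groups=C, bias=False)

f = torch.randn(2, C, 1, 30)              # (batch, C, H, W) feature maps
print(ds_conv(f).shape)                   # torch.Size([2, 64, 1, 30])
print(sum(p.numel() for p in ds_conv.parameters()))   # C * 1 * 3 = 192 parameters
```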
基于公式(1)、上下文编码模块和深度可分离卷积,本发明实施例提出了一种能有效提高模型识别性能并简化模型复杂度的通道注意力方法,将其命名为CEDS-A(Attention with Context Encodeing and Depthwise Separable Convolution),其结构如图3所示。输入F (H,W,C)表示L-CNN输出的一组特征图,DS-Conv代表深度可分离卷积,γ (1,1,C)代表通道注意力权重,Y (H',W',C)是获得的一组新的特征图。 Based on formula (1), context coding module and depthwise separable convolution, the embodiment of the present invention proposes a channel attention method that can effectively improve model recognition performance and simplify model complexity, which is named CEDS-A (Attention with Context Encoding and Depthwise Separable Convolution), its structure is shown in Figure 3. Input F (H,W,C) represents a set of feature maps output by L-CNN, DS-Conv represents depth separable convolution, γ (1,1,C) represents channel attention weights, Y (H',W ', C) is a new set of feature maps obtained.
Formula (2) is a mathematical description of Figure 3, where D_C denotes the depthwise separable convolution operation (its kernel size being set, for example, to 1x3) and δ_N denotes BN + Sigmoid.
Figure PCTCN2021092775-appb-000004 (formula (2))
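A minimal, hypothetical sketch of a CEDS-A block consistent with the Figure 3 description is given below; the exact order of operations in formula (2), the placement of BN + Sigmoid (δ_N) on the channel weights, and the reuse of the ContextEncoding1 sketch given earlier are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class CEDSAttention(nn.Module):
    """Hypothetical CEDS-A block: a depthwise (spatial-only) 1x3 convolution
    refines every channel's feature map, a K = 1 context-encoding branch
    produces one weight per channel, BN + Sigmoid (delta_N) squashes the
    weights, and the two branches are multiplied channel-wise.
    ContextEncoding1 refers to the sketch given earlier."""

    def __init__(self, channels):
        super().__init__()
        self.ds_conv = nn.Conv2d(channels, channels, kernel_size=(1, 3),
                                 padding=(0, 1), groups=channels, bias=False)
        self.encode = ContextEncoding1()      # context-encoding branch (K = 1)
        self.bn = nn.BatchNorm1d(channels)
        self.act = nn.Sigmoid()

    def forward(self, f):                            # f: (B, C, H, W), the L-CNN output
        gamma = self.act(self.bn(self.encode(f)))    # delta_N applied to the channel weights
        y = self.ds_conv(f)                          # per-channel refined feature maps
        return y * gamma[:, :, None, None]           # channel-wise re-weighting -> Y

# e.g. CEDSAttention(64)(torch.randn(8, 64, 1, 128)) returns an (8, 64, 1, 128) tensor
```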
Step S130: performing gait recognition using the enhanced features.
Based on the enhanced features extracted above, a classifier, such as a Softmax classifier, can further be used to judge whether the corresponding gait features are legitimate, thereby realizing personal identity verification. The present invention can effectively improve the recognition rate of gait-based identity authentication and can be applied to monitoring systems in various settings.
To further verify the effect of the present invention, experiments were carried out. The results show that, compared with existing similar studies in terms of recognition accuracy and number of model parameters, the proposed model achieves higher recognition performance while its complexity is reduced by 87.8% on average. The specific experimental procedure is as follows.
1) Experimental data
Experiments are conducted on the whuGait dataset, collected in real-world scenarios, and on the OU-ISIR dataset, which has the largest number of participants, to evaluate the performance of the proposed network model. The whuGait dataset contains gait data collected via smartphones from 118 subjects walking outdoors in a completely unconstrained manner; when, where and how each subject walked is unknown. The whuGait dataset consists of 8 sub-datasets: dataset #1 to dataset #4 are used for identification, dataset #5 and dataset #6 for authentication, and dataset #7 and dataset #8 for separating walking data from non-walking data. The present invention uses only two of these sub-datasets, dataset #1 and dataset #2. The OU-ISIR dataset is currently the inertial-sensor-based gait dataset with the largest number of participants; it includes gait data from 744 subjects (389 males and 355 females, aged from 2 to 78 years).
Preprocessed versions of the OU-ISIR and whuGait datasets are available as open source on GitHub (https://github.com/qinnzou/). Details of the datasets used in the experiments are given in Table 2. There is no intersection between the training and test sets used in the experiments; the sample overlap rate refers to the overlap between samples within the training set and within the test set.
Table 2: Experimental dataset information
Figure PCTCN2021092775-appb-000005 (Table 2)
2) Experimental method
The network model uses early stopping to control the number of training iterations. Early stopping is a widely used model training strategy: if the network's performance on the validation set does not improve for N consecutive iterations, training is stopped. By monitoring whether a performance indicator (such as accuracy or mean error) improves, early stopping saves the model or model parameters that perform best on the validation set during training, which prevents overfitting and improves the generalization performance of the model. In the present invention, accuracy is used as the monitoring indicator and N is set to 50 to control the training of the network; that is, if the accuracy on the validation set does not improve for 50 consecutive iterations, training ends.
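As an illustration of the early-stopping control described above, the following sketch monitors validation accuracy with a patience of N = 50 and restores the best weights; the helper functions fit_one_epoch and evaluate are placeholders for the actual training and validation routines, not functions defined by the embodiment.

```python
def train_with_early_stopping(model, fit_one_epoch, evaluate, patience=50):
    """Early-stopping loop: training stops once validation accuracy has not
    improved for `patience` consecutive passes, and the best weights are kept."""
    best_acc, best_state, epochs_without_gain = -1.0, None, 0
    while epochs_without_gain < patience:
        fit_one_epoch(model)                  # one pass over the training set
        acc = evaluate(model)                 # accuracy on the validation set
        if acc > best_acc:
            best_acc, epochs_without_gain = acc, 0
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
        else:
            epochs_without_gain += 1
    model.load_state_dict(best_state)         # restore the best checkpoint
    return best_acc
```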
3) Evaluation metrics
To evaluate the performance of the model, accuracy, recall and F1-score are used as evaluation metrics; the larger the values of these three metrics, the better the performance of the model.
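For reference, the three metrics can be computed, for example, with scikit-learn as in the sketch below; the choice of macro averaging for recall and F1-score is an assumption, since the averaging scheme is not specified in the embodiment.

```python
from sklearn.metrics import accuracy_score, recall_score, f1_score

def evaluate_predictions(y_true, y_pred):
    """Computes the three metrics named above for predicted subject labels."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred, average="macro"),
        "f1_score": f1_score(y_true, y_pred, average="macro"),
    }

# e.g. evaluate_predictions([0, 1, 2, 2], [0, 1, 1, 2])
```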
4) Experimental results and analysis
On the whuGait and OU-ISIR datasets, the proposed method is mainly compared with the experimental results of the prior-art scheme "Deep Learning-Based Gait Recognition Using Smartphones in the Wild" (Zou, Q.; Wang, Y.; Wang, Q.; et al., IEEE Transactions on Information Forensics and Security, 2020, 15, 3197-3212).
The experimental comparison results are given in Table 3 below, from which it can be seen that:
(1) In terms of recognition accuracy, the proposed method (denoted L-CNN+CEDS-A) outperforms the existing CNN+LSTM results by 1.39% and 0.95% on dataset #1 and dataset #2, respectively, and by 25.16% on the OU-ISIR dataset.
(2) In terms of the number of model parameters, the proposed model has on average 87.8% fewer parameters than the existing CNN+LSTM model, meaning that it occupies fewer memory resources.
The above experimental results show that, compared with previously reported methods, the proposed method achieves higher recognition accuracy with a lighter model, which is important and meaningful for current wearable smart devices with limited resources.
Table 3: Comparison with existing research results
Figure PCTCN2021092775-appb-000006 and Figure PCTCN2021092775-appb-000007 (Table 3)
Correspondingly, the present invention further provides a gait recognition system based on a lightweight attention convolutional neural network, which is used to implement one or more aspects of the above method. For example, the system comprises: a lightweight convolutional neural network, which takes tri-axial acceleration and tri-axial angular velocity gait data as input, extracts gait features and obtains output feature maps, wherein the lightweight convolutional neural network performs one-dimensional convolution along the time axis to extract features from each single-axis acceleration signal and each single-axis angular velocity signal separately, and uses a two-dimensional convolution to fuse the extracted six-axis signal features; an attention module, which, for the feature maps output by the lightweight convolutional neural network, computes the attention weight parameter of each channel from the context encoding information of that channel, and, for the feature map of each channel output by the lightweight convolutional neural network, further extracts features using a depthwise separable convolution and multiplies the result by the corresponding channel attention weight parameter to obtain enhanced features, wherein the depthwise separable convolution performs the convolution operation only in the spatial dimension; and a prediction output module, which performs gait recognition according to the enhanced features.
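A hypothetical end-to-end sketch of such a system is given below. The layer widths, kernel sizes, the (1, k) and (6, k) kernel shapes used to realize the per-axis one-dimensional convolutions and the six-axis fusion, and the global average pooling before the classifier are illustrative assumptions rather than the exact configuration of the embodiment; CEDSAttention refers to the earlier sketch.

```python
import torch
import torch.nn as nn

class LCNN(nn.Module):
    """Hypothetical L-CNN backbone: the input is a (batch, 1, 6, T) window of
    tri-axial accelerometer and gyroscope samples, the first three convolutions
    use (1, k) kernels that slide only along the time axis of each of the six
    signal rows, and the last convolution uses a (6, k) kernel to fuse the six
    axes.  Channel widths and kernel sizes are illustrative choices."""

    def __init__(self, channels=32):
        super().__init__()
        def block(cin, cout, kernel):
            return nn.Sequential(
                nn.Conv2d(cin, cout, kernel, padding=(0, kernel[1] // 2)),
                nn.BatchNorm2d(cout), nn.ReLU(inplace=True))
        self.conv1 = block(1, channels, (1, 9))
        self.pool1 = nn.MaxPool2d((1, 2))                # pooling after the 1st conv layer
        self.conv2 = block(channels, channels, (1, 7))
        self.conv3 = block(channels, channels, (1, 5))
        self.pool2 = nn.MaxPool2d((1, 2))                # pooling after the 3rd conv layer
        self.conv4 = block(channels, channels, (6, 3))   # 2-D conv fusing the six axes

    def forward(self, x):                                # x: (B, 1, 6, T)
        x = self.pool1(self.conv1(x))
        x = self.pool2(self.conv3(self.conv2(x)))
        return self.conv4(x)                             # (B, channels, 1, T')


class GaitRecognizer(nn.Module):
    """End-to-end sketch: L-CNN features -> CEDS-A enhancement -> class scores."""

    def __init__(self, num_subjects, channels=32):
        super().__init__()
        self.backbone = LCNN(channels)
        self.attention = CEDSAttention(channels)   # from the earlier CEDS-A sketch
        self.classifier = nn.Linear(channels, num_subjects)

    def forward(self, x):
        feat = self.attention(self.backbone(x))    # enhanced feature maps
        feat = feat.mean(dim=(2, 3))               # global average pooling
        return self.classifier(feat)               # logits; softmax applied in the loss
```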
In summary, the present invention proposes a new channel attention weight calculation method that is simple and effective and adds almost no parameters to the model. Based on the proposed channel attention weight calculation method, the context encoding module and depthwise separable convolution, the present invention proposes a channel attention module that effectively improves model recognition performance while simplifying model complexity. The lightweight convolutional neural network and the channel attention module designed in the present invention are combined to form a complete gait recognition network, which achieves better performance while occupying fewer memory resources.
The present invention may be a system, a method and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to implement various aspects of the present invention.
The computer-readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer-readable storage medium may be, for example, but is not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (for example, a light pulse passing through a fiber-optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to respective computing/processing devices, or to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.
The computer program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++ or Python, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA) or a programmable logic array (PLA), may be personalized by utilizing state information of the computer-readable program instructions, and the electronic circuit may execute the computer-readable program instructions so as to implement various aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; these instructions cause a computer, a programmable data processing apparatus and/or other devices to function in a particular manner, such that the computer-readable medium having the instructions stored therein comprises an article of manufacture including instructions which implement aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus or other device, causing a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment or a portion of instructions, which comprises one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two successive blocks may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a special-purpose hardware-based system that performs the specified functions or acts, or by a combination of special-purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, implementation by software, and implementation by a combination of software and hardware are all equivalent.
Various embodiments of the present invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvements over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present invention is defined by the appended claims.

Claims (10)

  1. A gait recognition method based on a lightweight attention convolutional neural network, comprising the following steps:
    Step S1: inputting collected tri-axial acceleration and tri-axial angular velocity gait data into a lightweight convolutional neural network to extract gait features, wherein the lightweight convolutional neural network performs one-dimensional convolution along the time axis to extract features from each single-axis acceleration signal and each single-axis angular velocity signal separately, and uses a two-dimensional convolution to fuse the extracted six-axis signal features to obtain output feature maps;
    Step S2: for the feature maps output by the lightweight convolutional neural network, computing the attention weight parameter of each channel according to the context encoding information of each channel;
    Step S3: for the feature map of each channel output by the lightweight convolutional neural network, further extracting features using a depthwise separable convolution, multiplying the result by the corresponding channel attention weight parameter, and then performing gait recognition, wherein the depthwise separable convolution performs the convolution operation only in the spatial dimension.
  2. The method according to claim 1, wherein, in step S2, for a set of feature maps F∈R^(H×W×C), the weight of the i-th channel is computed as:
    Figure PCTCN2021092775-appb-100001 (formula (1))
    wherein H, W and C denote the height, width and number of channels of the feature maps, respectively, F_i denotes the context encoding information contained in the i-th channel, and Figure PCTCN2021092775-appb-100002 denotes the sum of the context encoding information over all channels.
  3. The method according to claim 1, wherein the context encoding information is obtained according to the following steps:
    a single feature map is processed by context encoding to obtain a new vector set E of fixed length K, and each element of E contains context encoding information.
  4. The method according to claim 1, wherein the lightweight convolutional neural network comprises four convolutional layers and two pooling layers, the two pooling layers being arranged after the first convolutional layer and the third convolutional layer, respectively, and a batch normalization layer and a ReLU activation layer being arranged after each convolutional layer or pooling layer; the first three convolutional layers of the lightweight convolutional neural network use one-dimensional convolution, performing the convolution computation along the time axis to extract features from the single-axis acceleration and angular velocity signals separately, and the last convolutional layer uses two-dimensional convolution to fuse the six-axis signal features extracted by the first three convolutional layers.
  5. The method according to claim 1, wherein the tri-axial acceleration and tri-axial angular velocity gait data are collected by a wearable smart device.
  6. A gait recognition system based on a lightweight attention convolutional neural network, comprising:
    a lightweight convolutional neural network, configured to take tri-axial acceleration and tri-axial angular velocity gait data as input, extract gait features and obtain output feature maps, wherein the lightweight convolutional neural network performs one-dimensional convolution along the time axis to extract features from each single-axis acceleration signal and each single-axis angular velocity signal separately, and uses a two-dimensional convolution to fuse the extracted six-axis signal features;
    an attention module, configured to, for the feature maps output by the lightweight convolutional neural network, compute the attention weight parameter of each channel according to the context encoding information of each channel, and, for the feature map of each channel output by the lightweight convolutional neural network, further extract features using a depthwise separable convolution and multiply the result by the corresponding channel attention weight parameter to obtain enhanced features, wherein the depthwise separable convolution performs the convolution operation only in the spatial dimension;
    a prediction output module, configured to perform gait recognition according to the enhanced features.
  7. The system according to claim 6, wherein the attention module comprises an input layer, a depthwise separable convolution layer, a context encoding module, a batch normalization layer and an activation layer, and the relationship between input and output is expressed as:
    Figure PCTCN2021092775-appb-100003
    wherein F_(H,W,C) denotes a set of feature maps, D_C denotes the depthwise separable convolution operation with the convolution kernel size set to 1x3, γ_(1,1,C) denotes the channel attention weights, Y_(H′,W′,C) is the new set of feature maps obtained, and δ_N denotes batch normalization and activation processing.
  8. The system according to claim 6, wherein the prediction output module is implemented with a softmax classifier.
  9. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 5.
  10. A computer device comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 5.
PCT/CN2021/092775 2021-05-10 2021-05-10 Gait recognition method and system based on lightweight attention convolutional neural network WO2022236579A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/092775 WO2022236579A1 (en) 2021-05-10 2021-05-10 Gait recognition method and system based on lightweight attention convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/092775 WO2022236579A1 (en) 2021-05-10 2021-05-10 Gait recognition method and system based on lightweight attention convolutional neural network

Publications (1)

Publication Number Publication Date
WO2022236579A1 true WO2022236579A1 (en) 2022-11-17

Family

ID=84027817

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/092775 WO2022236579A1 (en) 2021-05-10 2021-05-10 Gait recognition method and system based on lightweight attention convolutional neural network

Country Status (1)

Country Link
WO (1) WO2022236579A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107958221A (en) * 2017-12-08 2018-04-24 北京理工大学 A kind of human motion Approach for Gait Classification based on convolutional neural networks
CN111967326A (en) * 2020-07-16 2020-11-20 北京交通大学 Gait recognition method based on lightweight multi-scale feature extraction
US20200375501A1 (en) * 2019-05-31 2020-12-03 Georgetown University Assessing diseases by analyzing gait measurements

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107958221A (en) * 2017-12-08 2018-04-24 北京理工大学 A kind of human motion Approach for Gait Classification based on convolutional neural networks
US20200375501A1 (en) * 2019-05-31 2020-12-03 Georgetown University Assessing diseases by analyzing gait measurements
CN111967326A (en) * 2020-07-16 2020-11-20 北京交通大学 Gait recognition method based on lightweight multi-scale feature extraction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HUANG HAOHUA, ZHOU PAN, LI YE, SUN FANGMIN: "A Lightweight Attention-Based CNN Model for Efficient Gait Recognition with Wearable IMU Sensors", SENSORS, vol. 21, no. 8, 1 January 2020 (2020-01-01), pages 1 - 13, XP093003076, DOI: 10.3390/s21082866 *
WANG TAO;WANG HONGZHANG;XIA YI;ZHANG DEXIANG: "Human Gait Recognition Based on Convolutional Neural Network and Attention Model", CHINESE JOURNAL OF SENSORS AND ACTUATORS, vol. 32, no. 7, 15 July 2019 (2019-07-15), pages 1027 - 1033, XP093003092, ISSN: 1004-1699, DOI: 10.3969/j.issn.1004-1699.2019.07.012 *

Similar Documents

Publication Publication Date Title
Neverova et al. Learning human identity from motion patterns
CN113139499A (en) Gait recognition method and system based on light-weight attention convolutional neural network
KR102152120B1 (en) Automated Facial Expression Recognizing Systems on N frames, Methods, and Computer-Readable Mediums thereof
Imani et al. Neural computation for robust and holographic face detection
Peng et al. A face recognition software framework based on principal component analysis
He et al. Gait2Vec: continuous authentication of smartphone users based on gait behavior
Song et al. A brief survey of dimension reduction
Bekhet et al. A robust deep learning approach for glasses detection in non‐standard facial images
Li et al. A novel fingerprint recognition method based on a Siamese neural network
Adel et al. Inertial gait-based person authentication using siamese networks
CN113742669B (en) User authentication method based on twin network
Jadhav et al. HDL-PI: hybrid DeepLearning technique for person identification using multimodal finger print, iris and face biometric features
Li et al. Feature extraction based on deep‐convolutional neural network for face recognition
WO2022236579A1 (en) Gait recognition method and system based on lightweight attention convolutional neural network
Pathak et al. Deep learning model for facial emotion recognition
Li et al. [Retracted] Human Motion Representation and Motion Pattern Recognition Based on Complex Fuzzy Theory
Annamalai et al. Facial matching and reconstruction techniques in identification of missing person using deep learning
Kambala et al. A multi-task learning based hybrid prediction algorithm for privacy preserving human activity recognition framework
Dong et al. GIAD-ST: Detecting anomalies in human monitoring based on generative inpainting via self-supervised multi-task learning
Dhiman et al. An introduction to deep learning applications in biometric recognition
Gaurav et al. A hybrid deep learning model for human activity recognition using wearable sensors
Maddalena et al. Pattern recognition and beyond: Alfredo Petrosino’s scientific results
Kavita et al. Machine Learning Techniques for Real-Time Human Face Recognition
Hemanth et al. Improving Accuracy of Face Detection in ID Proofs using CNN and Comparing with DLNN
Athalla et al. Analysis of smart home security system design based on facial recognition with application of deep learning

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21941155

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21941155

Country of ref document: EP

Kind code of ref document: A1