CN111985332A - Gait recognition method for improving loss function based on deep learning - Google Patents
- Publication number
- CN111985332A (application CN202010696163.3A)
- Authority
- CN
- China
- Prior art keywords
- loss function
- gait
- training
- network
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
- G06V40/25—Recognition of walking or running movements, e.g. gait recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
Abstract
A gait recognition method with an improved loss function based on deep learning comprises the following steps: step 1, acquire a pedestrian gait data set; step 2, preprocess the training data obtained in step 1 and crop the images to 64 × 64 using the center-line principle; step 3, build a deep convolutional neural network; step 4, design the loss function; step 5, initialize the neural network parameters; step 6, train the constructed network by feeding the training samples from step 2 into the network in batches with the corresponding ground-truth identity labels as targets, computing the loss, and adjusting the network parameters and the loss-function weights by back-propagation; step 7, recognize unknown data with the trained network, in two stages: registration and recognition. The method retains motion information in both the temporal and spatial dimensions and achieves better recognition in complex scenes such as carrying a backpack or wearing a coat.
Description
Technical Field
The invention belongs to the technical field of computer vision, and relates to a gait recognition method for improving a loss function based on deep learning.
Background Art
Gait recognition identifies people by their walking posture. Compared with other biometric recognition technologies, it is contactless, works at long range, and is difficult to disguise, and it is widely used in crime prevention, forensic identification, and social security.
Currently, gait recognition methods fall into two families: image-based and video-sequence-based. The former compresses all gait contour maps into a single image and treats gait recognition as an image-matching problem; this obviously discards the temporal information in gait and cannot model fine spatial detail. The latter extracts features from the contour sequence and, using LSTM, 3D-CNN, or two-stream methods, can model both the temporal and spatial dimensions well, but the computational cost is high and training is difficult. At present, gait recognition is essentially performed on background-free binary images, and accuracy is affected by factors such as the target's clothing, carried items, and the camera angle.
Disclosure of Invention
To overcome the defects of the prior art, namely to keep training easy without losing temporal or spatial information while improving accuracy in complex scenes where the target wears a coat, carries a backpack, and so on, the invention provides a gait recognition method with an improved loss function based on deep learning: the gait images are treated as an image set and the loss function is improved.
To solve the above technical problems, the invention provides the following technical solution:
a gait recognition method of an improved loss function based on deep learning, the method comprising the steps of:
step 1, use an existing gait recognition data set (CASIA-B or OU-MVLP) or build one, then preprocess the data set as follows:
1.1) if an image acquisition device is used to capture pedestrian gait images, extract the human body contour from each captured image with DeepLabv3+ and convert it into a binary image;
1.2) cutting the image into 64 x 64 by using the center line principle;
1.3) dividing the data set into a training set and a testing set;
step 2, training stage, namely training the deep convolutional neural network on the training set, wherein the process is as follows:
2.1) construct a deep convolutional neural network: a CNN module extracts frame-level features from the images, an SP module aggregates the frame-level features into sequence-level features, an MGP module extracts sequence information at different levels, and an HPM module extracts local and global features simultaneously;
2.2) designing a loss function, and defining the loss function as follows:
where an denotes the anchor (original) sample, po denotes a sample of the same class as an, ne denotes a sample of a different class from an, d(x, y) denotes the Euclidean distance between x and y in the embedding space, margin is a positive number used to enlarge the distance between samples with different labels, N is the number of samples in a batch, M is the number of classes, P is the number of people in a batch, K is the number of pictures of each person in a batch, P(X) is the true sample distribution, Q(X) is the distribution predicted by the network, and L_BCE and L_BF are the improved loss functions;
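The legend above lists the symbols, but the equation images themselves did not survive extraction. Assuming the standard batch-all triplet loss (consistent with the margin and the L_BA+ referenced in the training settings) and a cross-entropy term between P(X) and Q(X), the definitions were presumably of the following form; this is a hedged reconstruction, not the patent's exact formulas, and L_BF in the legend may correspond to L_BA+:

```latex
% Assumed reconstruction -- the original equation images are missing.
% Batch-all triplet loss over a batch of N = P * K samples:
L_{BA+} = \frac{1}{N} \sum_{an} \sum_{po} \sum_{ne}
          \max\bigl( margin + d(an, po) - d(an, ne),\; 0 \bigr)

% Cross-entropy between the true distribution P(X) and the network
% prediction Q(X) over the M classes:
L_{BCE} = - \sum_{X} P(X) \log Q(X)

% Combined objective with trainable weights \sigma_1 and \sigma_2:
L = \sigma_1 L_{BA+} + \sigma_2 L_{BCE}
```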
2.3) treat the weights σ1 and σ2 of the loss function as parameters of the network;
2.4) initializing neural network parameters;
2.5) feed the training samples obtained in step 1 into the network in batches, with the corresponding ground-truth identity labels as targets; after computing the loss, adjust the network parameters and the loss-function weights by back-propagation;
2.6) repeating 2.5) until the training is finished;
step 3, in the testing stage, the testing data is a testing set or collected data, and the process is as follows:
3.1) registration: input a set G of gait image sequences; propagate each image sequence Gi in G forward through the network to compute its feature vector, obtaining a feature-vector set Fg that is stored in the gait database;
3.2) recognition: input a gait image sequence Q and traverse all sequences in the image sequence set G to find the matching identity label; compute the feature vector Fq by forward propagation through the network, calculate the Euclidean distance between Fq and each feature vector in the gait database Fg, and take the identity label of the nearest feature vector as the label of Q.
Further, in step 2 the training phase is set as follows: the optimizer is Adam with a learning rate of 1e-4, the total number of iterations is 80K, and the batch size is (8, 8), meaning each batch contains 8 people with 8 images per person; the margin of L_BA+ is set to 2, and the loss-function weights σ1 and σ2 are both initialized to 0.5.
The technical conception of the invention is as follows: a convolutional neural network extracts the spatial information of gait, and an attention mechanism extracts its temporal information; in addition, the loss function is improved by training its weights as parameters of the network, making the weights adaptive.
The invention has the following beneficial effects: the input gait images do not need to be ordered, and accuracy is improved in complex scenes where the target wears a coat, carries a backpack, and so on.
Drawings
Fig. 1 is a network architecture diagram of the method of the present invention.
Fig. 2 is a schematic diagram of center-line cropping.
Fig. 3 is a flow chart of the method of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 to 3, the gait recognition method based on a deep-learning improved loss function regards gait as an image sequence composed of independent frames, extracts spatial and temporal image features simultaneously, and is unaffected by frame order. The network first extracts frame-level features from multiple images with a CNN; Set Pooling (SP) then aggregates the frame-level features into sequence-level features; a multi-layer global pipeline (MGP) fuses sequence information at different levels; finally, horizontal pyramid mapping (HPM) extracts local and global features simultaneously.
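As a concrete illustration of the Set Pooling step described above, the sketch below aggregates frame-level features with an element-wise maximum, the usual permutation-invariant choice; the feature shapes and the choice of max as the pooling operator are assumptions, since the text does not specify them:

```python
import numpy as np

def set_pooling(frame_features: np.ndarray) -> np.ndarray:
    """Aggregate frame-level features of shape (T, C, H, W) into a
    single sequence-level feature of shape (C, H, W) by taking the
    element-wise maximum over the frame axis. The maximum is
    permutation-invariant, so the frames need no particular order."""
    return frame_features.max(axis=0)

# a sequence of 30 frames, each a 64-channel 16x11 feature map (illustrative sizes)
frames = np.random.rand(30, 64, 16, 11)
seq_feat = set_pooling(frames)
assert seq_feat.shape == (64, 16, 11)
```

Because the maximum over the frame axis ignores frame order, the input gait images indeed need no ordering, as the text states.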
The process of cropping the image to 64 × 64 by the center-line principle is shown in fig. 2.
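Fig. 2 is not reproduced here, so the following is a hedged sketch of one common reading of the center-line principle: align the silhouette on the vertical line through its horizontal center of mass, crop a square of the silhouette's height around that line, and resize to 64 × 64. The exact procedure in the patent may differ:

```python
import numpy as np

def centerline_crop(sil: np.ndarray, size: int = 64) -> np.ndarray:
    """Crop a binary silhouette (H, W) around the vertical line through
    its horizontal centre of mass, then resize to size x size with
    nearest-neighbour sampling (assumed interpretation of fig. 2)."""
    ys, xs = np.nonzero(sil)
    top, bottom = ys.min(), ys.max() + 1      # tight vertical bounds
    cx = int(round(xs.mean()))                # horizontal centre line
    h = bottom - top
    half = h // 2                             # square window around cx
    left, right = cx - half, cx + half
    # pad with background so the window never leaves the frame
    padded = np.pad(sil, ((0, 0), (h, h)))
    crop = padded[top:bottom, left + h:right + h]
    # nearest-neighbour resize to size x size
    ry = (np.arange(size) * crop.shape[0] / size).astype(int)
    rx = (np.arange(size) * crop.shape[1] / size).astype(int)
    return crop[np.ix_(ry, rx)]
```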
Referring to fig. 3, the gait recognition method based on the improved loss function of deep learning comprises the following steps:
step 1, use an existing gait recognition data set (CASIA-B or OU-MVLP) or build one, then preprocess the data set as follows:
1.1) if an image acquisition device is used to capture pedestrian gait images, extract the human body contour from each captured image with DeepLabv3+ and convert it into a binary image;
1.2) cutting the image into 64 x 64 by using the center line principle;
1.3) dividing the data set into a training set and a testing set;
step 2, training stage, namely training the deep convolutional neural network on the training set, wherein the process is as follows:
2.1) construct a deep convolutional neural network: a CNN module extracts frame-level features from the images, an SP module aggregates the frame-level features into sequence-level features, an MGP module extracts sequence information at different levels, and an HPM module extracts local and global features simultaneously;
2.2) designing a loss function, and defining the loss function as follows:
wherein an denotes the anchor (original) sample, po denotes a sample of the same class as an, ne denotes a sample of a different class from an, d(x, y) denotes the Euclidean distance between x and y in the embedding space, margin is a positive number used to enlarge the distance between samples with different labels, N is the number of samples in a batch, M is the number of classes, P is the number of people in a batch, K is the number of pictures of each person in a batch, P(X) is the true sample distribution, Q(X) is the distribution predicted by the network, and L_BCE and L_BF are the improved loss functions;
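The loss definitions themselves appear only as images in the original, but the legend fully determines a batch-all triplet term. A plain NumPy sketch of that term is given below; it assumes the standard batch-all formulation, and the function name is ours:

```python
import numpy as np

def batch_all_triplet_loss(features, labels, margin=2.0):
    """Batch-all triplet loss over an (N, D) feature batch.
    For every anchor `an`, positive `po` (same label) and negative
    `ne` (different label), accumulate max(margin + d(an, po)
    - d(an, ne), 0) and average over all valid triplets."""
    # pairwise Euclidean distance matrix d[i, j]
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    same = labels[:, None] == labels[None, :]
    n = len(labels)
    losses = []
    for a in range(n):
        for p in range(n):
            if p == a or not same[a, p]:
                continue
            for neg in range(n):
                if same[a, neg]:
                    continue
                losses.append(max(0.0, margin + d[a, p] - d[a, neg]))
    return float(np.mean(losses)) if losses else 0.0
```

With well-separated classes every triplet term clips to zero, so the loss vanishes; overlapping classes yield a positive loss that back-propagation can reduce.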
2.3) treat the weights σ1 and σ2 of the loss function as parameters of the network;
2.4) initializing neural network parameters;
2.5) feed the training samples obtained in step 1 into the network in batches, with the corresponding ground-truth identity labels as targets; after computing the loss, adjust the network parameters and the loss-function weights by back-propagation;
2.6) repeating 2.5) until the training is finished;
step 3, in the testing stage, the testing data is a testing set or collected data, and the process is as follows:
3.1) registration: input a set G of gait image sequences; propagate each image sequence Gi in G forward through the network to compute its feature vector, obtaining a feature-vector set Fg that is stored in the gait database;
3.2) recognition: input a gait image sequence Q and traverse all sequences in the image sequence set G to find the matching identity label; compute the feature vector Fq by forward propagation through the network, calculate the Euclidean distance between Fq and each feature vector in the gait database Fg, and take the identity label of the nearest feature vector as the label of Q.
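The registration and recognition stages reduce to nearest-neighbor search in the embedding space. A minimal sketch follows, with `net` standing in for the trained network's forward pass (a placeholder, not the patent's implementation):

```python
import numpy as np

def register(net, gallery_sequences, labels):
    """Registration: run each gait sequence through the network and
    store (feature vector, identity label) pairs in the gait database."""
    return [(net(seq), lab) for seq, lab in zip(gallery_sequences, labels)]

def identify(net, probe_sequence, database):
    """Recognition: return the label of the database entry whose
    feature vector is nearest (Euclidean distance) to the probe's."""
    fq = net(probe_sequence)
    dists = [np.linalg.norm(fq - fg) for fg, _ in database]
    return database[int(np.argmin(dists))][1]
```

Usage: `db = register(net, gallery, labels)` once per enrolled subject, then `identify(net, probe, db)` for each query sequence.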
Further, in step 2 the training phase is set as follows: the optimizer is Adam with a learning rate of 1e-4, the total number of iterations is 80K, and the batch size is (8, 8), meaning each batch contains 8 people with 8 images per person; the margin of L_BA+ is set to 2, and the loss-function weights σ1 and σ2 are both initialized to 0.5.
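The (8, 8) batch size describes a P × K sampler: each batch draws P identities and K images per identity, which is what the batch-all triplet loss needs. A sketch of such a sampler (the function name and iteration count are illustrative):

```python
import random
from collections import defaultdict

def pk_batches(labels, p=8, k=8, iterations=10):
    """Yield P x K batches of sample indices: P identities per batch,
    K samples per identity -- the (8, 8) batch size in the text."""
    by_id = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_id[lab].append(idx)
    # only identities with at least k samples can be drawn
    ids = [i for i, v in by_id.items() if len(v) >= k]
    for _ in range(iterations):
        batch = []
        for ident in random.sample(ids, p):
            batch.extend(random.sample(by_id[ident], k))
        yield batch
```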
According to the scheme of this embodiment, the improved loss function raises the network's accuracy on the CASIA-B data set in two complex scenarios: BG (carrying a bag) and CL (wearing a coat).
Claims (2)
1. A gait recognition method of an improved loss function based on deep learning is characterized by comprising the following steps:
step 1, using a gait recognition data set or self-establishing the data set, wherein the gait recognition data set comprises CASIA-B or OU-MVLP, and preprocessing the data set, wherein the process comprises the following steps:
1.1) if an image acquisition device is used to capture pedestrian gait images, extracting the human body contour from each captured image with DeepLabv3+ and converting it into a binary image;
1.2) cutting the image into 64 x 64 by using the center line principle;
1.3) dividing the data set into a training set and a testing set;
step 2, training stage, namely training the deep convolutional neural network on the training set, wherein the process is as follows:
2.1) constructing a deep convolutional neural network, in which a CNN module extracts frame-level features from the images, an SP module aggregates the frame-level features into sequence-level features, an MGP module extracts sequence information at different levels, and an HPM module extracts local and global features simultaneously;
2.2) designing a loss function, and defining the loss function as follows:
where an denotes the anchor (original) sample, po denotes a sample of the same class as an, ne denotes a sample of a different class from an, d(x, y) denotes the Euclidean distance between x and y in the embedding space, margin is a positive number used to enlarge the distance between samples with different labels, N is the number of samples in a batch, M is the number of classes, P is the number of people in a batch, K is the number of pictures of each person in a batch, P(X) is the true sample distribution, Q(X) is the distribution predicted by the network, and L_BCE and L_BF are the improved loss functions;
2.3) treating the weights σ1 and σ2 of the loss function as parameters of the network;
2.4) initializing neural network parameters;
2.5) feeding the training samples obtained in step 1 into the network in batches, with the corresponding ground-truth identity labels as targets, and, after computing the loss, adjusting the network parameters and the loss-function weights by back-propagation;
2.6) repeating 2.5) until the training is finished;
step 3, in the testing stage, the testing data is a testing set or collected data, and the process is as follows:
3.1) registration: inputting a set G of gait image sequences, propagating each image sequence Gi in G forward through the network to compute its feature vector, obtaining a feature-vector set Fg that is stored in the gait database;
3.2) recognition: inputting a gait image sequence Q and traversing all sequences in the image sequence set G to find the matching identity label; computing the feature vector Fq by forward propagation through the network, calculating the Euclidean distance between Fq and each feature vector in the gait database Fg, and taking the identity label of the nearest feature vector as the label of Q.
2. The gait recognition method with an improved loss function based on deep learning according to claim 1, wherein in step 2 the training phase is set as follows: the optimizer is Adam with a learning rate of 1e-4, the total number of iterations is 80K, and the batch size is (8, 8), meaning each batch contains 8 people with 8 images per person; the margin of L_BA+ is set to 2, and the loss-function weights σ1 and σ2 are both initialized to 0.5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010696163.3A CN111985332B (en) | 2020-07-20 | 2020-07-20 | Gait recognition method of improved loss function based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010696163.3A CN111985332B (en) | 2020-07-20 | 2020-07-20 | Gait recognition method of improved loss function based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111985332A true CN111985332A (en) | 2020-11-24 |
CN111985332B CN111985332B (en) | 2024-05-10 |
Family
ID=73439277
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010696163.3A Active CN111985332B (en) | 2020-07-20 | 2020-07-20 | Gait recognition method of improved loss function based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111985332B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108921019A (en) * | 2018-05-27 | 2018-11-30 | 北京工业大学 | A kind of gait recognition method based on GEI and TripletLoss-DenseNet |
CN110503053A (en) * | 2019-08-27 | 2019-11-26 | 电子科技大学 | Human motion recognition method based on cyclic convolution neural network |
CN111160294A (en) * | 2019-12-31 | 2020-05-15 | 西安理工大学 | Gait recognition method based on graph convolution network |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112818808A (en) * | 2021-01-27 | 2021-05-18 | 南京大学 | High-precision gait recognition method combining two vector embedding spaces |
CN112818808B (en) * | 2021-01-27 | 2024-01-19 | 南京大学 | High-precision gait recognition method combining two vector embedding spaces |
CN112801008A (en) * | 2021-02-05 | 2021-05-14 | 电子科技大学中山学院 | Pedestrian re-identification method and device, electronic equipment and readable storage medium |
CN112801008B (en) * | 2021-02-05 | 2024-05-31 | 电子科技大学中山学院 | Pedestrian re-recognition method and device, electronic equipment and readable storage medium |
CN112906673A (en) * | 2021-04-09 | 2021-06-04 | 河北工业大学 | Lower limb movement intention prediction method based on attention mechanism |
CN114140873A (en) * | 2021-11-09 | 2022-03-04 | 武汉众智数字技术有限公司 | Gait recognition method based on convolutional neural network multi-level features |
Also Published As
Publication number | Publication date |
---|---|
CN111985332B (en) | 2024-05-10 |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
 | GR01 | Patent grant |