CN114627424A - Gait recognition method and system based on visual angle transformation - Google Patents

Gait recognition method and system based on visual angle transformation

Info

Publication number
CN114627424A
CN114627424A CN202210305445.5A CN202210305445A
Authority
CN
China
Prior art keywords
gait
image
pedestrian
visual angle
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210305445.5A
Other languages
Chinese (zh)
Inventor
卫星
周芳
陈柏霖
杨烨
王明珠
陈逸康
何煦
李宝璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN202210305445.5A priority Critical patent/CN114627424A/en
Publication of CN114627424A publication Critical patent/CN114627424A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a gait recognition method and a gait recognition system based on visual angle transformation. Firstly, monitoring videos acquired by a plurality of pedestrian monitoring devices are acquired and processed to obtain a pedestrian gait data set, which is divided into a training set and a testing set. A GaitGAN network is used to train a visual angle conversion model that generates an image at a specific visual angle from any visual angle, together with discriminators that judge the correctness of the generated visual angle image; the test set is then input into the visual angle conversion model to obtain a gait energy map set under the target visual angle. The images generated by the visual angle conversion model are acquired and preprocessed into pixel images, which are fed into a reference gait feature extraction model to obtain feature vectors and pedestrian prediction vectors for calculating the total loss, and the model parameters are optimized with a gradient descent algorithm to obtain a trained gait feature extraction model. The invention realizes the automation of pedestrian tracking and, by adopting a generative adversarial network, converts the pedestrian gait energy map to the 90-degree visual angle at which the gait features are most obvious, so that the gait recognition accuracy is higher.

Description

Gait recognition method and system based on visual angle transformation
Technical Field
The invention relates to the field of computer vision and deep learning, in particular to a gait recognition method and system based on visual angle transformation.
Background
The gait recognition technology is a new biological characteristic recognition technology, which aims to perform identity recognition and analysis, as well as detection of physiological and pathological characteristics, through the walking characteristics of a person. Gait recognition has wide application prospects in fields such as access control systems, pedestrian monitoring and public security, and has been widely applied in the field of computer vision in recent years. Pedestrian detection and tracking by gait recognition has the following advantages: 1) compared with traditional biological characteristic identification technologies (such as fingerprint identification, palm print identification and the like), gait identification is non-contact and is suitable for identity authentication at long distances; 2) the gait characteristics of pedestrians are not easy to hide or disguise, because physiological conditions such as the bones, the center of gravity and the individual coordination ability of each person are different.
Although gait recognition techniques have been proposed for some time, a unified technical framework has not yet been formed. Compared with other biometric authentication technologies (such as face recognition, iris recognition, fingerprint recognition and the like), gait recognition is still immature, which is mainly reflected in the lack of a well-known effective database, effective algorithms and a high recognition rate.
The current gait recognition technology mainly involves the following techniques: background separation, target tracking, machine learning, machine vision and the like. However, some of these techniques are not yet mature, which brings certain difficulties to gait recognition. In addition, many difficulties are also faced in practical application, mainly because pedestrians are influenced by the external environment and by their own factors during walking, for example the combined influence of multiple factors such as different walking pavements, different resolutions, different viewing angles, different clothes and different carried objects, so that the recognition rate in a complex real environment combining several influencing factors is still low. Among the above-mentioned influencing factors, the change of view angle is one of the most important factors affecting the performance of a gait recognition system.
In conclusion, the gait recognition technology suffers from the problem that the recognition accuracy is hindered by various factors such as different view angles, different clothes, different carried objects and the like.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, the present invention aims to provide a gait recognition method and system based on view angle transformation, so as to address the problem in the prior art that the recognition accuracy is hindered by various factors such as different view angles, different clothes and different carried objects.
To achieve the above and other related objects, the present invention provides a gait recognition method based on perspective transformation, including:
step one S100: the method comprises the steps of data acquisition and processing, wherein monitoring videos acquired by a plurality of pedestrian monitoring devices are acquired and processed to obtain a pedestrian gait data set, and the pedestrian gait data set is divided into a gait training set and a gait testing set;
step two S200: a visual angle conversion step, namely training a visual angle conversion model for generating a specific visual angle image through any visual angle by using a GaitGAN network, training a discriminator for discriminating the correctness of the generated visual angle image, and then inputting the data of the gait test set into the visual angle conversion model to obtain a gait energy map set under a specific target visual angle; and
step three S300: and a gait recognition step, namely acquiring an image generated by the GaitGAN network, preprocessing the image to obtain a pixel image, inputting the pixel image into a reference gait feature extraction model to obtain a feature vector and a pedestrian prediction vector, calculating total loss according to the feature vector and the pedestrian prediction vector, optimizing parameters of the reference gait feature extraction model by using a gradient descent algorithm, and finally obtaining a trained gait feature extraction model.
In a preferred embodiment of the present invention, the step S100 includes:
step S101: acquiring the monitoring video, and performing video frame extraction processing on the monitoring video to obtain a frame image;
step S102: screening the frame images, and preprocessing the screened frame images to obtain pedestrian images; and
step S103: and processing the pedestrian image to obtain a gait energy map, wherein all the gait energy maps form the pedestrian gait data set and are divided into the gait training set and the gait testing set.
In a preferred embodiment of the present invention, the step two S200 includes:
step S201: marking a view angle label on the gait energy map in the gait training set obtained in the step one S100;
step S202: setting the target view angle β to 90 degrees, taking the gait energy map I_α obtained in step S201 and the gait energy map I_β' generated from the gait energy map I_α by the initial generator G as input, and training a true/false discriminator D_R to distinguish a real image from a generated image;
step S203: taking the gait energy map I_α and the gait energy map I_β' as input, and training an identity discriminator D_A to judge whether the real image and the generated image are the same person;
step S204: inputting the gait energy map I_α and a target view angle indicator v_β into a generator G, which is trained to generate a gait energy map I_β with the target view angle β;
Step S205: inputting all gait energy maps of the gait training set obtained in the step one S100 into a perspective conversion model, and repeating the steps S202 to S204 until the discrimination probabilities of the true and false discriminators and the identity discriminator tend to be 0.5 and stable; and
step S206: generating a new gait energy image set by using the gait energy map in the gait training set obtained in the first step S100 by using the perspective transformation model, and using the new gait energy image set as the training set of the reference gait feature extraction model in the third step S300; and generating a new gait energy image set by using the gait energy image in the gait test set obtained in the first step S100 by using the perspective transformation model, and using the new gait energy image set as the test set of the reference gait feature extraction model in the third step S300.
In a preferred embodiment of the present invention, after repeating the steps S202 to S204 until the discrimination probabilities of the true and false discriminators and the identity discriminator tend to be 0.5 and stable, the generator G obtained in the step S205 is used as the view angle conversion model.
In a preferred embodiment of the present invention, the real image in step S202 is the gait energy map I_α obtained from step S201, and the generated image is the gait energy map I_β' generated from the gait energy map I_α by the initial generator G.
In a preferred embodiment of the present invention, the step three S300 includes:
step 301: acquiring the gait energy images of the training set obtained after the step S205 as images for training to form a training batch, and processing the acquired images of the training batch to obtain a plurality of pixel images;
step 302: sending the pixel image into a convolutional neural network for feature extraction to obtain a feature vector, then sending the feature vector into a full connection layer, and obtaining a pedestrian prediction vector with dimensionality equal to the pedestrian category number through a softmax function; and
step 303: and calculating total loss according to the feature vector and the pedestrian prediction vector, optimizing the parameters of the reference gait feature extraction model by using a gradient descent algorithm, and finally obtaining a trained gait feature extraction model.
In a preferred embodiment of the present invention, the total loss in step S303 is the algebraic sum of the ternary loss and the ID loss, and the parameters of the reference gait feature extraction model include a weight parameter w_i and a bias parameter b_i.
In a preferred embodiment of the invention, the ternary loss is calculated from the obtained feature vectors (the number of groups m is 16):

L_T = \sum_{i=1}^{m} \max\big( d(a_i, p_i) - d(a_i, n_i) + \mathrm{margin},\ 0 \big)

wherein a_i represents the feature vector of the target picture, p_i represents the feature vector of a positive sample picture (belonging to the same category, i.e. the same person, as the target picture), and n_i represents the feature vector of a negative sample picture (not the same person as the target picture); all three are 1000 × 1 dimensional, and a_i, p_i, n_i form a triple for loss calculation. margin is a parameter, set here to 0.3, m represents the number of triples extracted from the training batch, and d(a_i, n_i) is the Euclidean distance, calculated as (where z is the feature vector dimension, here 1000):

d(a_i, n_i) = \sqrt{ \sum_{k=1}^{z} (a_{i,k} - n_{i,k})^2 }

The activation function calculation formula for converting the full connection layer output into an N-dimensional vector is:

\mathrm{softmax}(v)_i = \frac{ e^{v_i} }{ \sum_{j=1}^{N} e^{v_j} }

wherein N is the number of pedestrian categories, v is the output vector of the full connection layer, v_j is the j-th value in v, and i represents the pedestrian category currently being calculated; the result calculated by the activation function lies between 0 and 1, and the softmax values of all categories sum to 1.
In a preferred embodiment of the present invention, the ID loss is calculated from the resulting pedestrian prediction vector:
L_{ID} = - \frac{1}{N K} \sum_{n=1}^{N K} \log p_{n, y_n}

wherein p_{n,i} is the predicted probability value of the i-th pedestrian category in the n-th pedestrian prediction vector v'_n, y_n is the real ID value of the n-th picture, N is the number of pedestrian categories, and K is the number of pictures selected for each pedestrian category in a training batch.
In a preferred embodiment of the present invention, the total loss is calculated according to the feature vector and the pedestrian prediction vector by the following formula:
L_{total} = L_T + L_{ID}
In a preferred embodiment of the present invention, the step 303 of optimizing the reference gait feature extraction model parameters by using a gradient descent algorithm is as follows:
For all data in a training batch, the total loss L_total is calculated as one training step;
For each weight w_i and bias b_i in the reference gait feature extraction model (w_{i+1} and b_{i+1} are the updated parameters), the following formulas are executed to update the parameters:

w_{i+1} = w_i - \eta \frac{\partial L_{total}}{\partial w_i}

b_{i+1} = b_i - \eta \frac{\partial L_{total}}{\partial b_i}

where η is the learning rate.
the invention further provides a gait recognition system based on visual angle transformation, which comprises:
the data acquisition and processing device, which acquires monitoring videos from a plurality of pedestrian monitoring devices, processes the monitoring videos to obtain a pedestrian gait data set, and divides the pedestrian gait data set into a gait training set and a gait testing set;
the visual angle conversion device trains a visual angle conversion model for generating a specific visual angle image through any visual angle by using a GaitGAN network, trains a discriminator for discriminating the correctness of the generated visual angle image, and then inputs the data of the gait test set into the visual angle conversion model to obtain a gait energy map set under a specific target visual angle; and
and the gait recognition device is used for acquiring an image generated by the GaitGAN network, preprocessing the image to obtain a pixel image, inputting the pixel image into a reference gait feature extraction model to obtain a feature vector and a pedestrian prediction vector, calculating total loss according to the feature vector and the pedestrian prediction vector, optimizing parameters of the reference gait feature extraction model by using a gradient descent algorithm, and finally obtaining a trained gait feature extraction model.
The gait recognition method and system based on visual angle conversion provided by the invention realize the automation of pedestrian tracking, so that the process of multidirectional tracking of a target pedestrian no longer depends on manual work; the videos acquired by cameras at multiple visual angles are processed into gait energy maps and input into the pre-trained visual angle conversion model and the reference gait feature extraction model, so that the multidirectional, multi-visual-angle tracking result of the target pedestrian can be obtained. This process uses gait recognition technology to realize a high degree of automation of pedestrian tracking, and a generative adversarial network is adopted to convert the pedestrian gait energy map to the 90-degree visual angle at which gait features are most obvious, so that the gait recognition accuracy is higher.
Drawings
FIG. 1 is a schematic flow chart of a gait recognition method based on visual angle transformation according to the invention;
FIG. 2 is a schematic diagram of a pedestrian gait data acquisition and processing process in step S100 according to the invention;
fig. 3 is a schematic structural diagram of the generator G in step two S200 according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of the true/false discriminator in step S200 according to an embodiment of the invention;
fig. 5 is a schematic structural diagram of the identity identifier in step S200 according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of gait energy images of a new training set and test set generated by generator G in step two S200 of the present invention;
FIG. 7 is a schematic flow chart of step three S300 according to the present invention;
fig. 8 is a schematic diagram of the reference gait feature extraction model in step three S300 according to the present invention;
fig. 9 is a schematic structural diagram of a ResNet50 network used in step three S300 according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. It is also to be understood that the terminology used in the examples is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention. Test methods in which specific conditions are not specified in the following examples are generally carried out under conventional conditions or under conditions recommended by the respective manufacturers.
Please refer to fig. 1 to 7 for a technical solution of the present invention. It should be understood that the structures, ratios, sizes, and the like shown in the drawings and described in the specification are only used for matching with the disclosure of the specification, so as to be understood and read by those skilled in the art, and are not used to limit the conditions of the present invention, so that the present invention has no technical significance. In addition, the terms such as "upper", "lower", "left", "right", "middle" and "one" used in the present specification are used for clarity of description only, and are not used to limit the scope of the present invention, and the relative relationship between the terms may be changed or adjusted without substantial change in the technical content.
Fig. 1 is a schematic flow chart of a gait recognition method based on perspective transformation according to the present invention. In this embodiment, the gait recognition method based on perspective transformation of the present invention includes:
step one S100: the method comprises the steps of data acquisition and processing, wherein monitoring videos acquired by a plurality of pedestrian monitoring devices are acquired and processed to obtain a pedestrian gait data set, and the pedestrian gait data set is divided into a gait training set and a gait testing set;
step two S200: a visual angle conversion step, namely training a visual angle conversion model for generating a specific visual angle image through any visual angle by using a GaitGAN network, training a discriminator for discriminating the correctness of the generated visual angle image, and then inputting the data of the gait test set into the visual angle conversion model to obtain a gait energy map set under a specific target visual angle; and
step three S300: and a gait recognition step (simultaneously refer to fig. 8), acquiring an image generated by the GaitGAN network, preprocessing the image to obtain a pixel image, inputting the pixel image into a reference gait feature extraction model to obtain a feature vector and a pedestrian prediction vector, calculating total loss according to the feature vector and the pedestrian prediction vector, optimizing parameters of the reference gait feature extraction model by using a gradient descent algorithm, and finally obtaining a trained gait feature extraction model.
For the GaitGAN network, see the paper entitled "GaitGAN: Invariant Gait Feature Extraction Using Generative Adversarial Networks" by Shiqi Yu, Haifeng Chen, Edel B. Garcia Reyes and Norman Poh, published at the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
In an embodiment of the present invention (see fig. 2), the step S100 of data acquisition and processing specifically includes:
step S101: and acquiring the monitoring video, and performing video frame extraction processing on the monitoring video to obtain a frame image. Fig. 2 (a) shows an example of one extracted frame image;
step S102: screening frame images, and preprocessing the screened frame images to obtain pedestrian images, as shown in (b) and (c) of fig. 2, wherein (c) shows a pedestrian image in one gait cycle; and
step S103: the pedestrian images are processed to obtain gait energy maps (the map (d) in fig. 2 is a gait energy map synthesized by the pedestrian images in one gait cycle), and all the gait energy maps form a pedestrian gait data set and are divided into a training set and a testing set. The gait energy profile of the training set and the test set can be seen as an example in fig. 2 (e). All the gait energy images in the training set and the test set obtained in the step are gait energy images corresponding to the original observation visual angle, and the original observation visual angle refers to the original visual angle of the monitoring video shot by the monitoring equipment. The graph (e) in fig. 2 includes gait energy graphs corresponding to 11 original observation perspectives.
In an embodiment of the present invention, for example, a high-precision camera may be used to acquire several pieces of pedestrian walking video stream data from 11 viewing angles (from 0 ° to 180 ° and at an interval of 18 °), and perform video frame extraction processing on the several pieces of pedestrian walking video stream data to acquire a plurality of frame images (a).
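As a simple illustration of this frame-extraction step, the sketch below reads one surveillance video with OpenCV and saves every step-th frame; the file paths and the sampling interval are illustrative assumptions rather than values prescribed by this embodiment.

    import cv2

    def extract_frames(video_path, out_dir, step=5):
        """Save every `step`-th frame of a surveillance video as an image file."""
        cap = cv2.VideoCapture(video_path)
        idx, saved = 0, 0
        while True:
            ok, frame = cap.read()
            if not ok:                      # end of the video stream
                break
            if idx % step == 0:
                cv2.imwrite(f"{out_dir}/frame_{saved:06d}.png", frame)
                saved += 1
            idx += 1
        cap.release()
        return saved

    # Example call (hypothetical paths):
    # extract_frames("camera_090deg/person_001.avi", "frames/person_001", step=5)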
The plurality of frame images are then pre-processed: firstly, frame images which do not contain the detection object, namely the pedestrian, are removed; the remaining screened frame images are then subjected to noise removal, gray level conversion, binarization and other operations, so as to improve their definition and contrast and to enhance the image information, yielding the processed frame images. The processed frame images are then cropped so that their size meets the preset specification, and the cropped frame images are checked manually to remove abnormal data. Frame images regarded as abnormal data are, for example: extracted video key frames whose image quality is extremely poor and hard to distinguish, or frames whose main subject is the surrounding environment rather than the current detection object (namely the pedestrian), and the like.
The frame image of the pedestrian is cropped using half of the human body width as the positioning reference and scaled to the same size (256 × 256); the cropped frame images are combined into pedestrian images (b), and the pedestrian images (b) within one gait cycle are synthesized into a gait energy map (d). A pedestrian identity label is marked for each gait energy map. The data set consisting of all the gait energy maps is then divided into the gait training set and the gait testing set and stored.
Finally, through step S100, a data set containing the pedestrian images of a plurality of target pedestrians photographed at the 11 viewing angles is acquired and processed to obtain the pedestrian gait data set. The pedestrian gait data set consists of the gait energy maps of each pedestrian in the monitoring videos at the 11 viewing angles.
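The silhouette preprocessing and gait-energy-map synthesis described above can be sketched as follows. It assumes the frames of one gait cycle are already available as grayscale images; the Otsu thresholding and bounding-box cropping used here are illustrative choices standing in for the binarization and cropping operations of this embodiment.

    import cv2
    import numpy as np

    def to_silhouette(frame_gray):
        """Binarize one grayscale frame and crop the pedestrian to a 256x256 silhouette."""
        _, binary = cv2.threshold(frame_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        ys, xs = np.nonzero(binary)
        if len(xs) == 0:                    # no pedestrian found in this frame
            return None
        crop = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        return cv2.resize(crop, (256, 256), interpolation=cv2.INTER_NEAREST)

    def gait_energy_image(silhouettes):
        """Average the aligned binary silhouettes of one gait cycle into a gait energy map."""
        stack = np.stack([s.astype(np.float32) / 255.0 for s in silhouettes], axis=0)
        return (stack.mean(axis=0) * 255.0).astype(np.uint8)   # 256x256 gait energy map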
In an embodiment of the present invention, the step two S200 of converting the viewing angle specifically includes:
step S201: marking a view angle label on the gait energy map in the training set obtained in the step one S100;
step S202: the target view angle β is set to 90 degrees; the gait energy map I_α at the original observation angle and the gait energy map I_β' generated from I_α by the initial generator G are taken as input, and the true/false discriminator D_R is trained to distinguish a real image (namely a gait energy map in the training set obtained in step one S100) from a generated image (namely a gait energy map output by the generator G');
step S203: I_α and I_β' are taken as input, and the identity discriminator D_A is trained to judge whether the real image (namely the gait energy map in the training set obtained in step one S100) and the generated image (namely the gait energy map output by the generator G') are the same person;
step S204: I_α and the target view angle indicator v_β are input into the generator G, which is trained to generate a gait energy map I_β with the target view angle β;
step S205: all gait energy maps of the training set obtained in step one S100 are input into the view angle conversion model, and steps S202 to S204 are repeated until the discrimination probabilities of the true/false discriminator D_R and the identity discriminator D_A tend to 0.5 and become stable;
after steps S202 to S204 have been repeated until the discrimination probabilities of the true/false discriminator D_R and the identity discriminator D_A tend to 0.5 and become stable, the generator G obtained in step S205 is used as the view angle conversion model;
step S206: the gait energy maps in the training set obtained in step one S100 are converted by the view angle conversion model (namely the generator G) to obtain a generated gait energy image set under the target view angle β, which is taken as the training set of the reference gait feature extraction model in step three S300; the gait energy maps in the test set obtained in step one S100 are likewise converted by the view angle conversion model (namely the generator G), and the generated gait energy image set is taken as the test set of the reference gait feature extraction model in step three S300.
FIG. 3 is a schematic structural diagram of the generator G obtained in step S205, FIG. 4 is a schematic structural diagram of the true/false discriminator D_R in step S202, and FIG. 5 is a schematic structural diagram of the identity discriminator D_A in step S203. It should be understood that figs. 3-5 show only structural examples of the generator G, the true/false discriminator D_R and the identity discriminator D_A; in the present invention, the generator G, the true/false discriminator D_R and the identity discriminator D_A may also adopt other suitable structures.
The following is a description of a specific implementation process of step two S200 in an embodiment of the present invention:
step S201: in this embodiment, for example, for each gait energy image I_θ in the gait training set obtained in step one S100, a one-hot view angle indicator v_θ = (θ0, θ18, …, θ180) is constructed in vector form according to its shooting angle θ, following the division of the shooting view angle (namely the original observation view angle) from 0° to 180° at 18° intervals. For each picture, if the picture is obtained by post-processing video shot by monitoring equipment at a 0° view angle, the value of θ0 is set to 1 and the remaining values are set to 0;
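As an illustration of the indicator just described, the sketch below builds the one-hot vector v_θ for the 11 shooting angles (0° to 180° at 18° intervals) and expands each of its elements into an image-sized plane, as is done before feeding the generator; the tensor sizes and layout are assumptions, not values fixed by the patent.

    import torch

    ANGLES = list(range(0, 181, 18))           # 0, 18, ..., 180  ->  11 view classes

    def view_indicator(theta):
        """One-hot indicator v_theta for a shooting angle theta given in degrees."""
        v = torch.zeros(len(ANGLES))
        v[ANGLES.index(theta)] = 1.0
        return v

    def expand_indicator(v, height=64, width=64):
        """Copy each element of v into an HxW plane, giving an 11xHxW conditioning map."""
        return v.view(-1, 1, 1).expand(len(v), height, width).clone()

    # gei: 1xHxW gait energy map tensor; the generator input can then be
    # torch.cat([gei, expand_indicator(view_indicator(90), *gei.shape[-2:])], dim=0)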
step S202: gait energy maps I_α at n original observation angles are randomly extracted from the gait training set obtained in step one S100 as real samples (namely real images); the target view angle β is set to 90°, i.e., the value of θ90 in the target view angle indicator v_β is set to 1 and the remaining values are set to 0; for each gait energy map among the n real samples, the value of each element of v_β is copied and expanded into a two-dimensional matrix with the same size as the two-dimensional matrix corresponding to the input image, and the two are input together into the generator G' (the "input image" refers to the real image, i.e., a gait energy map in the training set obtained in step one S100), generating n generated samples (namely generated images, gait energy maps I_β') under the target view angle β in one-to-one correspondence with the real samples; the n generated samples are taken as false samples and, together with the real samples, are input as images x into the true/false discriminator D_R; binary cross entropy is taken as the loss function L_{D_R} of the true/false discriminator D_R, calculated as:

L_{D_R} = -\big[ t \log D_R(x) + (1 - t) \log(1 - D_R(x)) \big] - \big[ s \log D_R^{\beta}(x) + (1 - s) \log(1 - D_R^{\beta}(x)) \big]

wherein D_R(x) represents the probability that x is a real image: when D_R(x) is greater than 0.5, the true/false discriminator D_R judges that the input image x is a real sample and t is taken as 1; when D_R(x) is less than 0.5, the true/false discriminator D_R judges that the input image x is a false sample and t is taken as 0. D_R^β(x) represents the probability that x is at the 90° view angle: when D_R^β(x) is greater than 0.5, the true/false discriminator D_R judges that the input image x is at the 90° view angle and s is taken as 1; when D_R^β(x) is less than 0.5, the true/false discriminator D_R judges that the input image x is at another view angle and s is taken as 0. The true/false discriminator D_R is updated and optimized by using the back propagation algorithm to minimize L_{D_R}.
Step S203: using the n real samples selected in step S202 as a source gait energy map set { x }iN noise samples obtained in step S202 are used as a generated gait energy map set { x }jExtracting n images from the training set to form an irrelevant image set { x }k},xkMust have an and xiA different identity tag;
from { x, respectivelyiAnd { x }jSelecting an image to be input to an identity discriminator D togetherAIn, the identity discriminator DALoss function
Figure BDA0003564719200000121
Is defined as follows:
Figure BDA0003564719200000122
wherein D isA(x1,x2) Denotes x1,x2Is the probability of two gait energy maps of the same person. Updating the optimized identity discriminator D using a back propagation algorithmATo minimize
Figure BDA0003564719200000123
Step S204: set source gait energy map { xiImage I in (1)αAnd a target view indicator vβInput to a generator G, and combined with a true-false discriminator DRAnd an identity discriminator DADetermining an objective function of the generator G as
Figure BDA0003564719200000124
Updating the parameters of the optimized generator G to maximize L using a back propagation algorithmG(x)。
Step S205: each gait energy map of the training set (obtained in step S100) is combined with vθThe values of each element are copied and expanded to obtain a two-dimensional matrix with the same size as the two-dimensional matrix corresponding to the input image, the two-dimensional matrix is input into the model (the input image refers to a real image and is a gait energy image in the training set obtained in the step S100), and the steps 202-204 are repeated until the discrimination probabilities (D) of a true discriminator, a false discriminator and an identity discriminator are reachedR(x)、DA(x1,x2) Tends to 0.5 and stabilizes, at which point the ability to generate the model is considered to be sufficiently strong and the training is stopped.
The steps S202 to S204 are repeated until a true and false discriminator DRIdentity discriminator DAAfter the discrimination probability of (2) tends to 0.5 and stabilizes, the generator G obtained in step S205 is used as a view conversion model;
step S206: for each gait energy map in the gait training set obtained in step one S100, the value of each element of v_θ is copied and expanded to obtain a first two-dimensional matrix with the same size as the second two-dimensional matrix corresponding to the input image (namely the real image, a gait energy map in the training set obtained in step one S100); the first and second two-dimensional matrices are input together into the generator G for view angle conversion, and the image set generated by the generator G is taken as the training set of the reference gait feature extraction model in step three S300;
similarly, for each gait energy map in the gait test set obtained in step one S100, the value of each element of v_θ is copied and expanded to obtain a third two-dimensional matrix with the same size as the fourth two-dimensional matrix corresponding to the input image (namely the real image, a gait energy map in the test set obtained in step one S100); the third and fourth two-dimensional matrices are input together into the generator G, and the image set generated by the generator G is taken as the test set of the reference gait feature extraction model in step three S300.
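To make the adversarial objectives of steps S202 to S204 concrete, the following is a minimal PyTorch sketch of the three loss terms: the true/false discriminator loss L_{D_R}, the identity discriminator loss L_{D_A}, and the generator objective (implemented here as minimizing the equivalent binary cross entropy rather than maximizing L_G directly). The two-output form of D_R, the probability-valued outputs, and the label conventions t and s are assumptions carried over from the reconstructed formulas above; the discriminator and generator networks themselves are not shown.

    import torch
    import torch.nn.functional as F

    bce = F.binary_cross_entropy   # all discriminator outputs assumed to be probabilities in [0, 1]

    def d_r_loss(d_r, x, t, s):
        """L_{D_R}: BCE over D_R's real/fake output (labels t) and 90-degree-view output (labels s)."""
        p_real, p_view = d_r(x)                  # assumed: D_R returns two probabilities per image
        return bce(p_real, t) + bce(p_view, s)

    def d_a_loss(d_a, x1, x2, t):
        """L_{D_A}: BCE on the probability that x1 and x2 show the same person (labels t)."""
        p_same = d_a(x1, x2)
        return bce(p_same, t)

    def g_loss(d_r, d_a, i_alpha, i_beta_fake):
        """Generator objective: D_R should call the generated map real and at 90 degrees,
        and D_A should accept (I_alpha, G(I_alpha, v_beta)) as the same person."""
        p_fake, p_view = d_r(i_beta_fake)
        p_id = d_a(i_alpha, i_beta_fake)
        ones = torch.ones_like(p_fake)
        return bce(p_fake, ones) + bce(p_view, ones) + bce(p_id, torch.ones_like(p_id))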
Fig. 6 shows gait energy images of the new training set and test set generated by generator G in step two 200.
Step three S300: namely a gait recognition step, comprising: acquiring images generated by the GaitGAN network (i.e. gait energy images of the new training set and the new test set obtained in step S206, as illustrated in fig. 8), preprocessing the images to obtain pixel images, sending the pixel images to a convolutional neural network (i.e. a feature extraction network in a reference gait feature extraction model) to perform feature extraction to obtain feature vectors, then sending the feature vectors to a full-link layer, obtaining vectors with dimensions equal to pedestrian categories through a softmax function, wherein the vectors are pedestrian prediction vectors, calculating total loss according to the feature vectors and the pedestrian prediction vectors, optimizing parameters of the reference gait feature extraction model by using a gradient descent algorithm, and finally obtaining a trained gait feature extraction model.
Specifically, step three S300 of the present invention includes the following steps (see fig. 7 at the same time):
step 301: acquiring images (images generated by a GaitGAN network, namely gait energy images acquired after S205) used for training to form a training batch, and processing the acquired images of the training batch to acquire a plurality of pixel images;
step 302: sending the pixel image into a convolutional neural network for feature extraction to obtain a feature vector, then sending the feature vector into a full connection layer, and obtaining a pedestrian prediction vector with dimensionality equal to the pedestrian category number through a softmax function;
step 303: and calculating the total loss according to the feature vector and the pedestrian prediction vector, and optimizing the parameters of the model by using a gradient descent algorithm to finally obtain a trained gait feature extraction model.
The following is a description of a specific implementation process of step three S300 in an embodiment of the present invention:
step S301: in the GaitGAN network of the view angle conversion step (step two S200), from the gait energy maps of the various pedestrians at the 90-degree view angle generated after step S205, the gait energy maps of N × K pedestrian pictures are extracted as a training batch (N is the number of pedestrian categories, and K pictures are randomly extracted for each pedestrian category, where K is 16); each image is then resized to 224 × 224 pixels, and its original pixel values are decoded into 32-bit floating-point values in [0, 1].
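A hedged sketch of assembling such a training batch (K = 16 images for each sampled identity, resized to 224 × 224 and scaled to 32-bit floats in [0, 1]); the dictionary layout images_by_id and the sampling helper are illustrative assumptions.

    import random
    import cv2
    import numpy as np

    def make_batch(images_by_id, num_ids, k=16, size=224):
        """Sample k gait energy maps for each of num_ids identities and preprocess them."""
        batch, labels = [], []
        for pid in random.sample(sorted(images_by_id), num_ids):
            for path in random.sample(images_by_id[pid], k):
                img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)            # generated 90-degree GEI
                img = cv2.resize(img, (size, size)).astype(np.float32) / 255.0
                batch.append(img)
                labels.append(pid)
        return np.stack(batch), np.array(labels)

    # images_by_id: assumed dict mapping identity -> list of file paths of its generated GEIs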
S302: the gait energy images of a training batch are all input into a convolutional neural network (namely the feature extraction network in the reference gait feature extraction model) to extract multi-dimensional features (such as the pedestrian's contour features, step length, the proportions of the body parts, and the like); each image produces a feature vector, which is then input into a full connection layer whose output dimensionality equals the number N of pedestrian categories in the training set, and a pedestrian prediction vector with dimensionality equal to the number of categories N is generated after a softmax activation function.
In an embodiment of the present invention, the generation process of the pedestrian prediction vector is described taking one pedestrian picture (an image processed through step S301) as an example:
after features are extracted by the convolutional neural network, they are input into a full connection layer whose output dimensionality equals the number N of pedestrian categories in the training set; the full connection layer outputs a column vector v of dimensionality 1 × N (this column vector v is the output vector mentioned in the softmax formula); after v passes through the softmax function, a column vector v' is obtained in which every element lies between 0 and 1, and v' is taken as the pedestrian prediction vector obtained after the picture is processed by the reference gait feature extraction model.
In an embodiment of the present invention, the structure of the reference gait feature extraction model may be as shown in fig. 8. The convolutional network is the feature extraction network that forms part of the reference gait feature extraction model, and ResNet50 is a specific example of the convolutional neural network, i.e., of the feature extraction network, in the reference gait feature extraction model. The structure of the ResNet50 network may take the form shown in fig. 9.
For the ResNet50 network, see the paper entitled "Deep Residual Learning for Image Recognition" by Kaiming He, Xiangyu Zhang, Shaoqing Ren and Jian Sun, published at the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
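A minimal PyTorch sketch of the reference gait feature extraction model of fig. 8, using torchvision's ResNet50 as an illustrative stand-in for the backbone of fig. 9: its 1000-dimensional output serves as the feature vector, and a full connection layer over the N pedestrian categories followed by softmax produces the pedestrian prediction vector.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet50

    class GaitFeatureExtractor(nn.Module):
        """ResNet50 backbone -> 1000-d feature vector -> full connection layer -> prediction."""
        def __init__(self, num_ids):
            super().__init__()
            self.backbone = resnet50(weights=None)       # final fc gives 1000 values, matching the 1000-d feature vector in the text (torchvision >= 0.13)
            self.classifier = nn.Linear(1000, num_ids)   # full connection layer over the N pedestrian categories

        def forward(self, x):
            # x: (B, 3, 224, 224); single-channel gait energy maps can be repeated over 3 channels
            feat = self.backbone(x)                      # feature vectors a_i / p_i / n_i
            logits = self.classifier(feat)               # output vector v of the full connection layer
            pred = torch.softmax(logits, dim=1)          # pedestrian prediction vector v'
            return feat, logits, pred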
S303: the total loss is calculated according to the feature vectors and the pedestrian prediction vectors, the parameters of the reference gait feature extraction model (including the weight parameters w_i and the bias parameters b_i) are optimized by using a gradient descent algorithm, and a trained gait feature extraction model is finally obtained. This specifically comprises the following steps:
The ternary loss is calculated from the obtained feature vectors (the number of groups m is 16):

L_T = \sum_{i=1}^{m} \max\big( d(a_i, p_i) - d(a_i, n_i) + \mathrm{margin},\ 0 \big)

wherein a_i represents the feature vector of the target picture, p_i represents the feature vector of a positive sample picture (belonging to the same category, i.e. the same person, as the target picture), and n_i represents the feature vector of a negative sample picture (not the same person as the target picture); all three are 1000 × 1 dimensional, and a_i, p_i, n_i form a triple used to calculate the loss. The combinations of a_i, p_i, n_i in the triples are arranged by calling the torch.nn.TripletMarginLoss method in the mature, open-source PyTorch deep learning framework. margin is a parameter, set here to 0.3, m represents the number of triples extracted from the training batch, and d(a_i, n_i) is the Euclidean distance (the length of the straight line between two points in space), calculated as (where z is the feature vector dimension, here 1000):

d(a_i, n_i) = \sqrt{ \sum_{k=1}^{z} (a_{i,k} - n_{i,k})^2 }

The activation function (softmax) calculation formula for converting the full connection layer output into an N-dimensional vector is:

\mathrm{softmax}(v)_i = \frac{ e^{v_i} }{ \sum_{j=1}^{N} e^{v_j} }

wherein N is the number of pedestrian categories, v is the output vector of the full connection layer, v_j is the j-th value in v, and i represents the pedestrian category currently being calculated; the result calculated by the activation function lies between 0 and 1, and the softmax values of all categories sum to 1.
For the ternary loss, see the paper entitled "FaceNet: A Unified Embedding for Face Recognition and Clustering" by Florian Schroff, Dmitry Kalenichenko and James Philbin, published at the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
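The ternary loss above can be evaluated directly with the torch.nn.TripletMarginLoss method mentioned in the text; a short sketch with margin = 0.3 and Euclidean distance (p = 2), assuming the anchor, positive and negative feature vectors have already been grouped into the m = 16 triples. Note that TripletMarginLoss averages over the triples by default; reduction='sum' reproduces the summed form of the formula.

    import torch
    import torch.nn as nn

    triplet_loss = nn.TripletMarginLoss(margin=0.3, p=2)   # Euclidean distance, margin = 0.3

    # a, p, n: (m, 1000) anchor, positive and negative feature vectors (random stand-in values)
    a, p, n = torch.randn(16, 1000), torch.randn(16, 1000), torch.randn(16, 1000)
    l_t = triplet_loss(a, p, n)                             # mean over the m triples by default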
Calculating ID loss according to the obtained pedestrian prediction vector:
L_{ID} = - \frac{1}{N K} \sum_{n=1}^{N K} \log p_{n, y_n}

wherein p_{n,i} is the predicted probability value of the i-th pedestrian category in the n-th pedestrian prediction vector v'_n, y_n is the real ID value of the n-th picture, N is the number of pedestrian categories, and K is the number of pictures selected for each pedestrian category in a training batch.
With regard to the ID loss, see the paper entitled "A Discriminatively Learned CNN Embedding for Person Re-identification" by Zhedong Zheng, Liang Zheng and Yi Yang, published in ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 14, no. 1, article 13, 2018.
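The ID loss is the standard cross-entropy over the N pedestrian categories and can be computed directly from the full connection layer outputs; a sketch, with the batch size N·K and the identity labels assumed to be integer indices.

    import torch
    import torch.nn.functional as F

    # logits: (N*K, N) full connection layer outputs; labels: (N*K,) true identity indices
    logits = torch.randn(64, 4)                 # e.g. N = 4 categories, K = 16 pictures each
    labels = torch.randint(0, 4, (64,))
    l_id = F.cross_entropy(logits, labels)      # equals the mean over the batch of -log softmax(v)[y]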
According to the feature vector and the pedestrian prediction vector, the total loss formula is calculated as follows:
L_{total} = L_T + L_{ID}
inputting the total loss of a convolutional neural network (namely, the network corresponding to the reference gait feature extraction model) after a training batch into the algebraic addition of the ternary loss and the ID loss, and then utilizing a gradient descent algorithm to extract the weight parameters (w) of the network and the full-connection layeri) And updating to complete the optimization of the reference gait feature extraction model once. A total of three such optimizations are performed in one iteration. The initial learning rate was set to 0.00035, which was reduced by 0.1 at 40 and 70 th iterations, respectively. A total of 120 iterations were performed. And after 120 iterations, finishing the training of the reference gait feature extraction model.
The principle of the gradient descent algorithm is as follows:
For all data in a training batch, the total loss L_total is calculated as one training step;
For each weight parameter w_i and bias parameter b_i in the model (w_{i+1} and b_{i+1} are the updated parameters), the following formulas are executed to update the parameters:

w_{i+1} = w_i - \eta \frac{\partial L_{total}}{\partial w_i}

b_{i+1} = b_i - \eta \frac{\partial L_{total}}{\partial b_i}

where η is the learning rate.
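Putting the pieces together, the sketch below realizes the update rule and schedule described above with plain stochastic gradient descent (learning rate 0.00035, decayed by 0.1 at the 40th and 70th iterations, 120 iterations in total). It reuses GaitFeatureExtractor from the earlier sketch; loader, make_triplets and the identity count of 74 are assumed placeholders for batch loading, triple mining and the data set at hand.

    import torch
    import torch.nn.functional as F

    model = GaitFeatureExtractor(num_ids=74)                        # 74 identities is hypothetical
    optimizer = torch.optim.SGD(model.parameters(), lr=0.00035)     # w_{i+1} = w_i - lr * dL/dw_i
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[40, 70], gamma=0.1)
    criterion_t = torch.nn.TripletMarginLoss(margin=0.3, p=2)

    for epoch in range(120):                                        # 120 iterations in total
        for images, labels in loader:                               # loader: assumed iterable of N*K batches
            feats, logits, _ = model(images)
            a, p, n = make_triplets(feats, labels)                  # assumed helper forming the m triples
            loss = criterion_t(a, p, n) + F.cross_entropy(logits, labels)   # L_total = L_T + L_ID
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()                                            # lr multiplied by 0.1 after epochs 40 and 70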
the invention provides a gait recognition method and a gait recognition system based on visual angle conversion, which can realize the automation of pedestrian tracking, so that the process of multidirectional tracking of a target pedestrian is free from dependence on manpower, and the multidirectional and multi-visual-angle tracking result of the target pedestrian can be obtained only by processing videos acquired by cameras at multiple visual angles into a gait energy diagram and inputting the gait energy diagram into a pre-trained gait feature extraction model. The process utilizes a gait recognition technology to realize high automation of pedestrian tracking, and a generated confrontation network is adopted to convert a pedestrian gait energy map into a 90-degree visual angle with most obvious gait characteristics, so that the gait recognition accuracy is higher.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims (11)

1. A gait recognition method based on visual angle transformation comprises the following steps:
step one S100: the method comprises the steps of data acquisition and processing, wherein monitoring videos acquired by a plurality of pedestrian monitoring devices are acquired and processed to obtain a pedestrian gait data set, and the pedestrian gait data set is divided into a gait training set and a gait testing set;
step two S200: a visual angle conversion step, namely training a visual angle conversion model for generating a specific visual angle image through any visual angle by using a GaitGAN network, training a discriminator for discriminating the correctness of the generated visual angle image, and then inputting the data of the gait test set into the visual angle conversion model to obtain a gait energy map set under a specific target visual angle; and
step three S300: and a gait recognition step, namely acquiring an image generated by the GaitGAN network, preprocessing the image to obtain a pixel image, inputting the pixel image into a reference gait feature extraction model to obtain a feature vector and a pedestrian prediction vector, calculating total loss according to the feature vector and the pedestrian prediction vector, optimizing parameters of the reference gait feature extraction model by using a gradient descent algorithm, and finally obtaining a trained gait feature extraction model.
2. The gait recognition method according to claim 1, characterized in that: the step S100 includes:
step S101: acquiring the monitoring video, and performing video frame extraction processing on the monitoring video to obtain a frame image;
step S102: screening the frame images, and preprocessing the screened frame images to obtain pedestrian images; and
step S103: and processing the pedestrian image to obtain a gait energy map, wherein all the gait energy maps form the pedestrian gait data set and are divided into the gait training set and the gait testing set.
3. The gait recognition method according to claim 2, characterized in that: the second step S200 includes:
step S201: marking a view angle label on the gait energy map in the gait training set obtained in the step one S100;
step S202: setting the target view angle β to 90 degrees, taking the gait energy map I_α obtained in step S201 and the gait energy map I_β' generated from the gait energy map I_α by the initial generator G as input, and training a true/false discriminator D_R to distinguish a real image from a generated image;
step S203: taking the gait energy map I_α and the gait energy map I_β' as input, and training an identity discriminator D_A to judge whether the real image and the generated image are the same person;
step S204: inputting the gait energy map I_α and a target view angle indicator v_β into a generator G, which is trained to generate a gait energy map I_β with the target view angle β;
Step S205: inputting all gait energy maps of the gait training set obtained in the step one S100 into a perspective conversion model, and repeating the steps S202 to S204 until the discrimination probabilities of the true and false discriminators and the identity discriminator tend to be 0.5 and stable; and
step S206: generating a new gait energy image set by using the perspective conversion model for the gait energy images in the gait training set obtained in the first step S100, and using the new gait energy image set as the training set of the reference gait feature extraction model in the third step S300; and generating a new gait energy image set by using the gait energy image in the gait test set obtained in the first step S100 by using the perspective transformation model, and using the new gait energy image set as the test set of the reference gait feature extraction model in the third step S300.
4. The gait recognition method according to claim 3, characterized in that: after repeating the steps S202 to S204 until the discrimination probabilities of the true and false discriminators and the identity discriminator approach 0.5 and stabilize, the generator G obtained in the step S205 is used as the view angle conversion model.
5. The gait recognition method according to claim 3, characterized in that: the real image in step S202 is the gait energy map I_α obtained from step S201, and the generated image is the gait energy map I_β' generated from the gait energy map I_α by the initial generator G.
6. The gait recognition method according to claim 3, characterized in that: the third step S300 includes:
step 301: acquiring the gait energy images of the training set obtained after the step S205 as images for training to form a training batch, and processing the acquired images of the training batch to obtain a plurality of pixel images;
step 302: sending the pixel image into a convolutional neural network of the reference gait feature extraction model for feature extraction to obtain a feature vector, inputting the feature vector into a full connection layer of the reference gait feature extraction model, and obtaining the pedestrian prediction vector with dimensionality equal to the number of pedestrian categories through a softmax function; and
step 303: and calculating total loss according to the feature vector and the pedestrian prediction vector, optimizing parameters of the reference gait feature extraction model by using a gradient descent algorithm, and finally obtaining the trained gait feature extraction model.
7. The gait recognition method according to claim 6, characterized in that: in step S303, the total loss is the algebraic sum of the ternary loss and the ID loss, and the parameters of the reference gait feature extraction model include a weight parameter w_i and a bias parameter b_i.
8. The gait recognition method according to claim 7, characterized in that: the ternary loss is calculated from the obtained feature vectors (taking 16 groups as the group number m):

L_T = \sum_{i=1}^{m} \max\big( d(a_i, p_i) - d(a_i, n_i) + \mathrm{margin},\ 0 \big)

wherein a_i represents the feature vector of the target picture, p_i represents the feature vector of a positive sample picture (belonging to the same category, i.e. the same person, as the target picture), and n_i represents the feature vector of a negative sample picture (not the same person as the target picture); all three are 1000 × 1 dimensional, and a_i, p_i, n_i form a triple for loss calculation; margin is a parameter, set here to 0.3, m represents the number of triples extracted from the training batch, and d(a_i, n_i) is the Euclidean distance, calculated as (where z is the feature vector dimension, here 1000):

d(a_i, n_i) = \sqrt{ \sum_{k=1}^{z} (a_{i,k} - n_{i,k})^2 }

The activation function calculation formula for converting the full connection layer output into an N-dimensional vector is:

\mathrm{softmax}(v)_i = \frac{ e^{v_i} }{ \sum_{j=1}^{N} e^{v_j} }

wherein N is the number of pedestrian categories, v is the output vector of the full connection layer, v_j is the j-th value in v, and i represents the pedestrian category currently being calculated; the result calculated by the activation function lies between 0 and 1, and the softmax values of all categories sum to 1.
9. The gait recognition method according to claim 7, characterized in that: the ID loss is calculated from the obtained pedestrian prediction vectors:

L_{ID} = - \frac{1}{N K} \sum_{n=1}^{N K} \log p_{n, y_n}

wherein p_{n,i} is the predicted probability value of the i-th pedestrian category in the n-th pedestrian prediction vector v'_n, y_n is the real ID value of the n-th picture, N is the number of pedestrian categories, and K is the number of pictures selected for each pedestrian category in a training batch.
10. The gait recognition method according to claim 7, characterized in that: the step 303 of optimizing the parameters of the reference gait feature extraction model by using the gradient descent algorithm is as follows:
For all data in a training batch, the total loss L_total is calculated as one training step;
For each weight w_i and bias b_i in the reference gait feature extraction model (w_{i+1} and b_{i+1} are the updated parameters), the following formulas are executed to update the parameters:

w_{i+1} = w_i - \eta \frac{\partial L_{total}}{\partial w_i}

b_{i+1} = b_i - \eta \frac{\partial L_{total}}{\partial b_i}

where η is the learning rate.
11. a gait recognition system based on perspective transformation, comprising:
the data acquisition and processing device, which acquires monitoring videos from a plurality of pedestrian monitoring devices, processes the monitoring videos to obtain a pedestrian gait data set, and divides the pedestrian gait data set into a gait training set and a gait testing set;
the visual angle conversion device trains a visual angle conversion model for generating a specific visual angle image through any visual angle by using a GaitGAN network, trains a discriminator for discriminating the correctness of the generated visual angle image, and then inputs the data of the gait test set into the visual angle conversion model to obtain a gait energy map set under a specific target visual angle; and
and the gait recognition device is used for acquiring an image generated by the GaitGAN network, preprocessing the image to obtain a pixel image, inputting the pixel image into a reference gait feature extraction model to obtain a feature vector and a pedestrian prediction vector, calculating total loss according to the feature vector and the pedestrian prediction vector, optimizing parameters of the reference gait feature extraction model by using a gradient descent algorithm, and finally obtaining a trained gait feature extraction model.
CN202210305445.5A 2022-03-25 2022-03-25 Gait recognition method and system based on visual angle transformation Pending CN114627424A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210305445.5A CN114627424A (en) 2022-03-25 2022-03-25 Gait recognition method and system based on visual angle transformation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210305445.5A CN114627424A (en) 2022-03-25 2022-03-25 Gait recognition method and system based on visual angle transformation

Publications (1)

Publication Number Publication Date
CN114627424A true CN114627424A (en) 2022-06-14

Family

ID=81903934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210305445.5A Pending CN114627424A (en) 2022-03-25 2022-03-25 Gait recognition method and system based on visual angle transformation

Country Status (1)

Country Link
CN (1) CN114627424A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117253283A (en) * 2023-08-09 2023-12-19 三峡大学 Wheelchair following method based on fusion of image information and electromagnetic positioning information data
CN117893450A (en) * 2024-03-15 2024-04-16 西南石油大学 Digital pathological image enhancement method, device and equipment
CN117893450B (en) * 2024-03-15 2024-05-24 西南石油大学 Digital pathological image enhancement method, device and equipment

Similar Documents

Publication Publication Date Title
CN110443143B (en) Multi-branch convolutional neural network fused remote sensing image scene classification method
CN108108657B (en) Method for correcting locality sensitive Hash vehicle retrieval based on multitask deep learning
CN108537743B (en) Face image enhancement method based on generation countermeasure network
Xie et al. Multilevel cloud detection in remote sensing images based on deep learning
CN107633226B (en) Human body motion tracking feature processing method
CN113221641B (en) Video pedestrian re-identification method based on generation of antagonism network and attention mechanism
CN105138998B (en) Pedestrian based on the adaptive sub-space learning algorithm in visual angle recognition methods and system again
CN109255289B (en) Cross-aging face recognition method based on unified generation model
CN111709311A (en) Pedestrian re-identification method based on multi-scale convolution feature fusion
CN111709313B (en) Pedestrian re-identification method based on local and channel combination characteristics
CN109684913A (en) A kind of video human face mask method and system based on community discovery cluster
CN110852152B (en) Deep hash pedestrian re-identification method based on data enhancement
CN105654122B (en) Based on the matched spatial pyramid object identification method of kernel function
CN112580480B (en) Hyperspectral remote sensing image classification method and device
CN112084895B (en) Pedestrian re-identification method based on deep learning
CN113095158A (en) Handwriting generation method and device based on countermeasure generation network
Pratama et al. Face recognition for presence system by using residual networks-50 architecture
CN110826534B (en) Face key point detection method and system based on local principal component analysis
CN115311502A (en) Remote sensing image small sample scene classification method based on multi-scale double-flow architecture
CN114547365A (en) Image retrieval method and device
CN112613474B (en) Pedestrian re-identification method and device
CN114627424A (en) Gait recognition method and system based on visual angle transformation
CN112329662B (en) Multi-view saliency estimation method based on unsupervised learning
CN116543419B (en) Hotel health personnel wearing detection method and system based on embedded platform
CN105844299B (en) A kind of image classification method based on bag of words

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination