CN111914762A - Gait information-based identity recognition method and device

Gait information-based identity recognition method and device

Info

Publication number
CN111914762A
Authority
CN
China
Prior art keywords
gait
preset
training
network model
target
Prior art date
Legal status
Pending
Application number
CN202010773633.1A
Other languages
Chinese (zh)
Inventor
王震
Current Assignee
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN202010773633.1A
Publication of CN111914762A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 Movements or behaviour, e.g. gesture recognition
    • G06V 40/23 Recognition of whole body movements, e.g. for sport training
    • G06V 40/25 Recognition of walking or running movements, e.g. gait recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Abstract

The invention provides an identity recognition method and device based on gait information, which comprises the following steps: acquiring at least two frames of target images shot by image acquisition equipment, wherein the at least two frames of target images comprise gait information of a target object; taking the at least two frames of target images as the input of an integrated network model, and analyzing the gait information in the at least two frames of target images through the integrated network model to obtain the gait feature vector of the target object output by the integrated network model; and determining the identity information of the target object through the gait feature vector. By the method and the device, the problem of low gait recognition accuracy is solved, and the effect of improving the gait recognition accuracy is achieved.

Description

Gait information-based identity recognition method and device
Technical Field
The invention relates to the field of communication, in particular to an identity recognition method and device based on gait information.
Background
With the development of deep learning, computer vision technology based on deep learning has been widely applied in society. In real surveillance scenarios, gait recognition, a technology that identifies a person by the way a pedestrian walks, has attracted wide attention and been widely applied in the field of security monitoring, owing to advantages such as a long recognition distance, difficulty of disguise and ease of acquisition.
Conventional gait recognition methods treat gait segmentation and gait recognition as two tasks studied separately, which adversely affects the accuracy and robustness of gait recognition.
For the problem of low gait recognition accuracy in the related art, no effective solution has been proposed so far.
Disclosure of Invention
The embodiments of the invention provide an identity recognition method and device based on gait information, which at least solve the problem of low gait recognition accuracy in the related art.
According to an embodiment of the invention, there is provided an identity recognition method based on gait information, including: acquiring at least two frames of target images shot by image acquisition equipment, wherein the at least two frames of target images comprise gait information of a target object; taking the at least two frames of target images as input of an integrated network model, analyzing the gait information in the at least two frames of target images through the integrated network model to obtain gait feature vectors of the target object output by the integrated network model, wherein the integrated network model comprises a gait segmentation network model and a gait recognition model, the gait segmentation network model is used for segmenting a gait silhouette of the target object in the at least two frames of target images, the gait recognition model is used for analyzing the gait silhouette output by the gait segmentation network model to obtain gait feature vectors, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model by using a plurality of groups of training data; and determining the identity information of the target object through the gait feature vector.
Optionally, before the analyzing the gait information in the at least two frames of target images by the integrated network model, the method further includes: pre-training an original segmentation network model by using a first training data set to obtain a preset gait segmentation network model, wherein the first training data set comprises: at least two training images and corresponding known gait silhouettes; inputting a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences; and pre-training the original gait recognition model by using the plurality of groups of first pre-estimated gait silhouette sequences to obtain a preset gait recognition model.
Optionally, the pre-training the original segmentation network model using the first training data set includes: determining a known gait silhouette image of a first training object in the at least two frames of training images, wherein the at least two frames of training images comprise gait information of the first training object; and taking the at least two training images and the corresponding known gait silhouette as a first training data set, and pre-training the original segmentation network model by using the first training data set to obtain the preset gait segmentation network model, wherein a first loss function between an estimated gait silhouette output by the preset gait segmentation network model and the known gait silhouette of the first training object meets a first convergence condition, the first convergence condition is used for indicating that the output value of the first loss function is within a first preset range, the numerical value of each pixel point in the estimated gait silhouette is a floating point, and the floating point is used for indicating the probability that the pixel point is the first training object.
Optionally, inputting a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences, including: and inputting a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences which are output by the preset gait segmentation network model and correspond to the plurality of groups of first step sequences, wherein each group of first step sequences in the plurality of groups of first step sequences comprises at least two frames of images, and each group of first step sequences is used for representing gait information of a second training object.
Optionally, the pre-training the original gait recognition model by using the plurality of groups of first pre-estimated gait silhouette sequences to obtain a preset gait recognition model includes: respectively carrying out normalization processing and binarization processing on each gait silhouette image in the multiple groups of first estimated gait silhouette sequences to obtain a second training data set, wherein 0 represents a pixel point of a background image in a training image in the second training data set, and 1 represents a pixel point of a second training object in the training image; pre-training the original gait recognition model by using a second training data set to obtain a first preset gait recognition model, wherein a second loss function of a first estimated gait feature vector output by the first preset gait recognition model meets a second convergence condition, the second convergence condition is used for representing that an output value of the second loss function is within a second preset range, and/or a third loss function of a first estimated identity feature vector output by the first preset gait recognition model and a known identity feature vector of a second training object meets a third convergence condition, the third convergence condition is used for representing that an output value of the third loss function is within a third preset range, and the preset gait recognition model comprises the first preset gait recognition model.
Optionally, after obtaining the first preset gait recognition model, the method further comprises: determining a model formed by the preset gait segmentation network model and the first preset gait recognition model as a first preset integrated network model; loading a first training parameter of the preset gait segmentation network model and a second training parameter of the first preset gait recognition model; and performing fine tuning training on the first training parameter and the second training parameter in the first preset integrated network model by using multiple groups of first step sequences to obtain the integrated network model, wherein a first target loss function of a gait feature vector output by the integrated network model meets a first target convergence condition, the first target convergence condition is used for indicating that an output value of the first target loss function is within a fourth preset range, and/or a second target loss function of an estimated identity feature vector output by the integrated network model and a known identity feature vector of a second training object meets a second target convergence condition, and the second target convergence condition is used for indicating that an output value of the second target loss function is within a fifth preset range.
Optionally, the pre-training the original gait recognition model by using the plurality of groups of first pre-estimated gait silhouette sequences to obtain a preset gait recognition model includes: respectively carrying out normalization processing on each gait silhouette image in the multiple groups of first estimated gait silhouette sequences to obtain multiple groups of second estimated gait silhouette sequences; respectively superposing each group of second estimated gait silhouette sequences in the plurality of groups of second estimated gait silhouette sequences, and then calculating the average value to obtain a plurality of energy graphs corresponding to the plurality of groups of second estimated gait silhouette sequences, wherein the plurality of energy graphs are used as a third training data set, and one group of second estimated gait silhouette sequences corresponds to one energy graph; training an original gait recognition model by using the third training data set to obtain a second preset gait recognition model, wherein a fourth loss function of a second estimated gait feature vector output by the second preset gait recognition model meets a fourth convergence condition, the fourth convergence condition is used for indicating that an output value of the fourth loss function is within a sixth preset range, and/or a fifth estimated identity feature vector output by the second preset gait recognition model and a fifth loss function of a known identity feature vector of a third training object meet a fifth convergence condition, the fifth convergence condition is used for indicating that an output value of the fifth loss function is within a seventh preset range, and the preset gait recognition model comprises the second preset gait recognition model.
Optionally, after the obtaining of the second preset gait recognition model, the method comprises: determining a model formed by the preset gait segmentation network model and the second preset gait recognition model as a second preset integrated network model; loading a first training parameter of the preset gait segmentation network model and a third training parameter of a second preset gait recognition model; and performing fine tuning training on the first training parameter and the third training parameter in the second preset integrated network model by using multiple groups of second step sequences to obtain the integrated network model, wherein a second target loss function of a gait feature vector output by the integrated network model meets a second target convergence condition, the second target convergence condition is used for indicating that an output value of the second target loss function is within an eighth preset range, and/or a third target loss function of an estimated identity feature vector output by the integrated network model and a known identity feature vector of a third training object meets a third target convergence condition, and the third target convergence condition is used for indicating that an output value of the third target loss function is within a ninth preset range.
Optionally, the determining the identity information of the target object through the gait feature vector includes: determining the cosine distance between the gait feature vector and each preset gait feature vector stored in a preset gait search base; and determining the identity information corresponding to the preset gait feature vector of which the cosine distance is less than or equal to a preset threshold value as the estimated identity information of the target object, wherein at least two preset gait feature vectors are stored in the preset gait search base, and each preset gait feature vector is stored in association with its corresponding identity information.
According to another embodiment of the present invention, there is provided an identification apparatus based on gait information, including: the device comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring at least two frames of target images shot by image acquisition equipment, and the at least two frames of target images comprise gait information of a target object; the analysis module is used for analyzing the gait information in the at least two frames of target images by using the at least two frames of target images as the input of an integrated network model, and obtaining the gait feature vector of the target object output by the integrated network model, the integrated network model comprises a gait segmentation network model and a gait recognition model, the gait segmentation network model is used for segmenting a gait silhouette of the target object in the at least two frames of target images, the gait recognition model is used for analyzing the gait silhouette output by the gait segmentation network model to obtain the gait feature vector, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model by using a plurality of groups of training data; and the first determination module is used for determining the identity information of the target object through the gait feature vector.
According to a further embodiment of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the invention, at least two frames of target images obtained by shooting by the image acquisition equipment are obtained, and the at least two frames of target images comprise gait information of the target object; taking at least two frames of target images as input of an integrated network model, analyzing gait information in the at least two frames of target images through the integrated network model to obtain gait feature vectors of a target object output by the integrated network model, wherein the integrated network model comprises a gait segmentation network model and a gait recognition model, the gait segmentation network model is used for segmenting a gait silhouette of the target object in the at least two frames of target images, the gait recognition model is used for analyzing the gait silhouette output by the gait segmentation network model to obtain the gait feature vectors, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model by using a plurality of groups of training data; and determining the identity information of the target object through the gait feature vector. The gait segmentation and the gait recognition are used as a task, and the gait features are recognized through the integrated network model. Therefore, the problem of low gait recognition accuracy can be solved, and the effect of improving the gait recognition accuracy is achieved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
fig. 1 is a block diagram of a hardware structure of a mobile terminal of an identity recognition method based on gait information according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of identity recognition based on gait information according to an embodiment of the invention;
FIG. 3 is a functional diagram of a gait preprocessing function according to an alternative embodiment of the invention;
fig. 4 is a block diagram of an identification apparatus based on gait information according to an embodiment of the invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking the example of being operated on a mobile terminal, fig. 1 is a hardware structure block diagram of the mobile terminal of an identity recognition method based on gait information according to an embodiment of the present invention. As shown in fig. 1, the mobile terminal 10 may include one or more (only one shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 for storing data, and optionally may also include a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration, and does not limit the structure of the mobile terminal. For example, the mobile terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store a computer program, for example, a software program and a module of an application software, such as a computer program corresponding to the gait information-based identification method in the embodiment of the invention, and the processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, so as to implement the above-mentioned method. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a Network adapter (NIC), which can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In this embodiment, an identity recognition method based on gait information, which is operated in the mobile terminal described above, is provided, and fig. 2 is a flowchart of identity recognition based on gait information according to an embodiment of the present invention, as shown in fig. 2, the flowchart includes the following steps:
step S202, at least two frames of target images shot by image acquisition equipment are obtained, and the at least two frames of target images comprise gait information of a target object;
step S204, taking the at least two frames of target images as input of an integrated network model, analyzing the gait information in the at least two frames of target images through the integrated network model to obtain gait feature vectors of the target object output by the integrated network model, wherein the integrated network model comprises a gait segmentation network model and a gait recognition model, the gait segmentation network model is used for segmenting gait silhouettes of the target object in the at least two frames of target images, the gait recognition model is used for analyzing the gait silhouettes output by the gait segmentation network model to obtain gait feature vectors, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model by using a plurality of groups of training data;
step S206, determining the identity information of the target object through the gait feature vector.
As an optional implementation, the image acquisition device may be a camera, specifically a surveillance camera. For a video of the pedestrian to be detected, a sequence of pictures of the pedestrian is obtained by using a pedestrian detection and tracking algorithm; the sequence may contain at least two frames, and the at least two frames of sequence pictures include the gait information of the pedestrian.
As an optional implementation, to address the problem of end-to-end training of gait segmentation and gait recognition, the present application constructs an integrated network for gait segmentation and gait recognition, where the integrated network model comprises a segmentation network model and a gait recognition network model, both of which can be obtained by training convolutional neural networks. The segmentation network model is used to segment the gait silhouette of the pedestrian from the sequence pictures; the gait silhouette output by the segmentation network model is input into the gait recognition network model to obtain the gait feature vector of the pedestrian, and the identity of the pedestrian can then be determined based on the gait features of the pedestrian. This is illustrated below by a specific example:
For the video of the pedestrian to be detected, a sequence of pictures of the pedestrian is obtained by using a pedestrian detection and tracking algorithm, and the pedestrian segmentation model is first used to segment the pedestrian images to obtain the gait silhouettes of the pedestrian. The input of the gait segmentation network may be a pedestrian three-channel color image expressed as a three-dimensional feature vector (feature channel dimension, feature height dimension and feature width dimension respectively); the segmentation network model, represented by a multilayer convolutional neural network, outputs a single-channel gait probability pedestrian segmentation map expressed as a three-dimensional gait probability feature vector (feature channel dimension, feature height dimension and feature width dimension respectively). The gait silhouette sequence of the pedestrian can then be fed into the gait recognition model, which outputs the gait feature vector of the pedestrian; the cosine distances between this feature vector and each gait feature vector in a pre-established base library are calculated and sorted from small to large, and the base-library pedestrian with the smallest distance to the feature vector of the pedestrian to be detected can be taken as the recognized identity of the pedestrian to be detected. Alternatively, the identity of the pedestrian to be detected can be determined by combining the gait feature vector output by the gait recognition model with face features acquired by the image acquisition equipment and face recognition technology.
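A minimal sketch of this inference flow in Python (PyTorch-style pseudocode; the module names seg_model and gait_model are assumptions, since the patent does not tie the method to any particular framework):

```python
import torch

def extract_gait_feature(frames, seg_model, gait_model):
    """frames: (T, 3, H, W) tensor of pedestrian images from detection and tracking."""
    with torch.no_grad():
        silhouettes = seg_model(frames)    # (T, 1, H, W) gait probability segmentation maps
        feature = gait_model(silhouettes)  # (D,) gait feature vector of the pedestrian
    return feature
```

The resulting feature vector is then compared against the base library by cosine distance, as sketched later for the retrieval step.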
Through the steps, at least two frames of target images obtained by shooting through the image acquisition equipment are obtained, and the at least two frames of target images comprise gait information of the target object; taking at least two frames of target images as input of an integrated network model, analyzing gait information in the at least two frames of target images through the integrated network model to obtain gait feature vectors of a target object output by the integrated network model, wherein the integrated network model comprises a gait segmentation network model and a gait recognition model, the gait segmentation network model is used for segmenting a gait silhouette of the target object in the at least two frames of target images, the gait recognition model is used for analyzing the gait silhouette output by the gait segmentation network model to obtain the gait feature vectors, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model by using a plurality of groups of training data; and determining the identity information of the target object through the gait feature vector. The gait segmentation and the gait recognition are used as a task, and the gait features are recognized through the integrated network model. Therefore, the problem of low gait recognition accuracy can be solved, and the effect of improving the gait recognition accuracy is achieved. Alternatively, the execution subject of the above steps may be a terminal or the like, but is not limited thereto.
Optionally, before the analyzing the gait information in the at least two frames of target images by the integrated network model, the method further includes: pre-training an original segmentation network model by using a first training data set to obtain a preset gait segmentation network model, wherein the first training data set comprises: at least two training images and corresponding known gait silhouettes; inputting a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences; and pre-training the original gait recognition model by using the plurality of groups of first pre-estimated gait silhouette sequences to obtain a preset gait recognition model.
As an alternative embodiment, the gait segmentation network model is obtained by training an original convolutional neural network with training data; the gait segmentation network is composed of multiple layers of convolutional neural networks, and a network model such as U-Net, DeepLab v3 or PSPNet can be adopted. The gait segmentation network model is pre-trained to obtain a preset gait segmentation network model. After the preset gait segmentation network model is obtained, a plurality of groups of first step sequences are input into it to obtain a plurality of corresponding groups of first estimated gait silhouette sequences, and the original gait recognition model is trained using these first estimated gait silhouette sequences to obtain the preset gait recognition model. In this embodiment, the gait recognition model is trained on the output of the gait segmentation network model, so that the gait segmentation network model and the gait recognition model are used as a whole to recognize gait features; this increases the relevance between the two models and improves the accuracy of gait recognition.
Optionally, the pre-training the original segmentation network model using the first training data set includes: determining a known gait silhouette image of a first training object in the at least two frames of training images, wherein the at least two frames of training images comprise gait information of the first training object; and taking the at least two training images and the corresponding known gait silhouette as a first training data set, and pre-training the original segmentation network model by using the first training data set to obtain the preset gait segmentation network model, wherein a first loss function between an estimated gait silhouette output by the preset gait segmentation network model and the known gait silhouette of the first training object meets a first convergence condition, the first convergence condition is used for indicating that the output value of the first loss function is within a first preset range, the numerical value of each pixel point in the estimated gait silhouette is a floating point, and the floating point is used for indicating the probability that the pixel point is the first training object.
The training process of the gait segmentation network model in the embodiment can comprise the following steps:
step S1, data acquisition: collecting pedestrians in monitoring videos of public places by using a pedestrian detection and tracking method, wherein each video of each person forms a section of sequence pedestrian image, and the identity information of each section of sequence is marked; selecting N pictures at intervals from each section of serial pedestrian images of each pedestrian and marking the pedestrian gait silhouette image as a known gait silhouette image;
step S2, gait preprocessing: normalizing the pedestrian image and the corresponding pedestrian gait silhouette image to a preset pixel size, taking the pedestrian image and the corresponding pedestrian gait silhouette image as a pair of paired images, performing data enhancement on all the paired images, and taking the obtained preprocessed image as a first training data set of the gait segmentation network model.
Step S3, training the segmentation network model: the first training data set is input into the original segmentation network, and the original convolutional neural network model is trained to determine the parameters of the gait segmentation network. Forward computation and backward gradient propagation are carried out on the gait segmentation network; the pedestrian segmentation data set comprises pedestrian three-channel color images and the corresponding gait silhouette images, and training on these images yields a preset gait segmentation network model capable of segmenting the pedestrian contour. In this embodiment, the first loss function may be a cross-entropy loss function, which is used to calculate the error between the predicted segmentation result (corresponding to the estimated gait silhouette) and the known gait silhouette of the pedestrian; through multiple training iterations, a convolutional neural network model for segmenting the pedestrian is obtained such that the error between the estimated gait silhouette output by the trained model and the known gait silhouette of the training object lies within a first predetermined range, where the first predetermined range may be determined according to the actual situation, for example 0.1 or 0.01, thereby obtaining the preset gait segmentation network model. The value of each pixel point in the estimated gait silhouette output by the preset gait segmentation network is a floating-point number representing the probability that the pixel is a pedestrian pixel. Representing the gait contour with floating-point numbers improves the robustness of the gait recognition network.
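A minimal sketch of one pre-training iteration for the segmentation network with the cross-entropy loss described above (the two-class output layout and the optimizer are assumptions):

```python
import torch
import torch.nn.functional as F

def seg_train_step(seg_model, optimizer, images, known_silhouettes):
    """images: (B, 3, H, W); known_silhouettes: (B, H, W), 0 = background, 1 = pedestrian."""
    logits = seg_model(images)                        # (B, 2, H, W) per-pixel class scores
    loss = F.cross_entropy(logits, known_silhouettes.long())
    optimizer.zero_grad()
    loss.backward()                                   # backward gradient propagation
    optimizer.step()                                  # update segmentation parameters
    return loss.item()
```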
Optionally, inputting a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences, including: and inputting a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences which are output by the preset gait segmentation network model and correspond to the plurality of groups of first step sequences, wherein each group of first step sequences in the plurality of groups of first step sequences comprises at least two frames of images, and each group of first step sequences is used for representing gait information of a second training object.
As an alternative embodiment, the plurality of sets of first step sequence may be gait sequence diagrams of a plurality of pedestrians acquired by the image acquisition device, wherein each pedestrian includes a plurality of sets of gait sequences. Gait silhouette segmentation can be carried out on the multiple groups of first step sequence through the preset gait segmentation network model, multiple groups of first pre-estimated gait silhouette sequences output by the preset gait segmentation network model are obtained, and each group of first pre-estimated gait silhouette sequences corresponds to one group of first step sequences.
Optionally, the pre-training the original gait recognition model by using the plurality of groups of first pre-estimated gait silhouette sequences to obtain a preset gait recognition model includes: respectively carrying out normalization processing and binarization processing on each gait silhouette image in the multiple groups of first estimated gait silhouette sequences to obtain a second training data set, wherein 0 represents a pixel point of a background image in a training image in the second training data set, and 1 represents a pixel point of a second training object in the training image; pre-training the original gait recognition model by using a second training data set to obtain a first preset gait recognition model, wherein a second loss function of a first estimated gait feature vector output by the first preset gait recognition model meets a second convergence condition, the second convergence condition is used for representing that an output value of the second loss function is within a second preset range, and/or a third loss function of a first estimated identity feature vector output by the first preset gait recognition model and a known identity feature vector of a second training object meets a third convergence condition, the third convergence condition is used for representing that an output value of the third loss function is within a third preset range, and the preset gait recognition model comprises the first preset gait recognition model.
As an optional implementation manner, the data after normalization and binarization processing of the first pre-estimated gait silhouette sequence is used as a training data set of the gait recognition network model. In this embodiment, the integrated network may further include a gait preprocessing module, where the gait preprocessing module is configured to preprocess an image in an estimated gait silhouette sequence output by a preset gait segmentation network model, and use a result obtained after the preprocessing as an input of the gait recognition network model. The gait preprocessing module comprises a gait preprocessing function, and the gait preprocessing function is used for filtering non-gait information in an estimated gait silhouette image output by the preset gait segmentation network model.
The input of the gait preprocessing module is the gait probability pedestrian segmentation map output by the gait segmentation network, and the value of each pixel point in the segmentation map is a floating-point number representing the probability that the pixel is a pedestrian pixel. In order to realize end-to-end training of the gait segmentation network and the gait recognition network, the gait preprocessing module adopts a differentiable gait preprocessing function F(x) to process each pixel value.

[The expression for F(x) is given as a formula image in the original publication.]

Here x represents the probability that a pixel belongs to the foreground, and k is a fixed parameter of the preprocessing function that can be set to a coefficient larger than 4. Figure 3 is a functional diagram of the gait preprocessing function according to an alternative embodiment of the invention. F(x) has three properties: after mapping by the function F(x), the value of each pixel point lies in the range 0 to 1; the value of the parameter k is much greater than 1, e.g., k = 400; at the position x = 0.5, the slope (derivative) of F(x) is k/4, and since k is large, this slope is large.
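The exact expression of F(x) is only shown as an image in the original publication; a logistic sigmoid centred at 0.5 is one differentiable function that satisfies the three stated properties (output in 0-1, derivative k/4 at x = 0.5, near-binary output for large k). It is shown here purely as an illustrative stand-in, not as the patented formula:

```python
import torch

def gait_preprocess(prob_map, k=400.0):
    """prob_map: gait probability pedestrian segmentation map with values in [0, 1].
    Illustrative stand-in for F(x); its derivative at x = 0.5 equals k / 4."""
    return torch.sigmoid(k * (prob_map - 0.5))  # close to 0 or 1, yet still differentiable
```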
As an optional implementation, the gait silhouette sequences subjected to normalization and binarization are used as the second training data set, and the second training data are used to train the original gait recognition model to obtain the first preset gait recognition model, which is thus obtained by training on pedestrian gait sequences. The parameters of the gait recognition network are determined by performing forward computation and backward gradient propagation on the gait recognition network. In this embodiment, the recognition result of the gait recognition model may be a gait feature vector or an identity feature vector; the gait feature vector is used to represent the gait features of the person to be recognized, and the identity feature vector is used to represent the identity of the person to be recognized. When the gait feature vector is used as the output, the gait feature vector output by the gait recognition network is trained by using a triplet loss function, and the second loss function corresponds to the triplet loss function, so that the output value of the loss function of the gait feature vector output by the trained gait recognition model is within a second predetermined range, which can be determined according to actual conditions, for example 0.2 or 0.01. When the identity feature vector is used as the output, the identity feature vector output by the gait recognition network is trained by using a cross-entropy loss function, and the third loss function corresponds to the cross-entropy loss function, so that the output value of the loss function between the identity feature vector output by the trained gait recognition model and the known identity feature vector of the training object is within a third predetermined range, which can be determined according to actual conditions, for example 0.2 or 0.01. In this embodiment, the estimated gait silhouette output by the preset gait segmentation network model is preprocessed by the gait preprocessing module, so that the influence of non-gait information is reduced and the accuracy of model recognition is improved.
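A minimal sketch of one pre-training iteration for the gait recognition model on binarized silhouette sequences, combining the triplet loss on gait feature vectors with the cross-entropy loss on identity vectors; the assumption here is that the model returns both outputs (feature vector, identity logits):

```python
import torch
import torch.nn.functional as F

triplet_loss = torch.nn.TripletMarginLoss(margin=0.2)

def recog_train_step(gait_model, optimizer, anchor, positive, negative, labels):
    """anchor/positive/negative: (B, T, 1, H, W) binarized gait silhouette sequences."""
    feat_a, ident_a = gait_model(anchor)      # gait feature vector and identity logits
    feat_p, _ = gait_model(positive)
    feat_n, _ = gait_model(negative)
    loss = triplet_loss(feat_a, feat_p, feat_n) + F.cross_entropy(ident_a, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```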
Optionally, after obtaining the first preset gait recognition model, the method further comprises: determining a model formed by the preset gait segmentation network model and the first preset gait recognition model as a first preset integrated network model; loading a first training parameter of the preset gait segmentation network model and a second training parameter of the first preset gait recognition model; and performing fine tuning training on the first training parameter and the second training parameter in the first preset integrated network model by using multiple groups of first step sequences to obtain the integrated network model, wherein a first target loss function of a gait feature vector output by the integrated network model meets a first target convergence condition, the first target convergence condition is used for indicating that an output value of the first target loss function is within a fourth preset range, and/or a second target loss function of an estimated identity feature vector output by the integrated network model and a known identity feature vector of a second training object meets a second target convergence condition, and the second target convergence condition is used for indicating that an output value of the second target loss function is within a fifth preset range.
As an optional implementation, the trained preset gait segmentation network model and the first preset gait recognition model together serve as the first preset integrated network model, which is then fine-tuned. Fine-tuning training is performed on the preset integrated network model by using the known gait silhouettes of the training objects, specifically as follows:
The network parameters of the trained preset gait segmentation network model are loaded as the first training parameters, the network parameters of the first preset gait recognition model are taken as the second training parameters, and fine-tuning training is carried out on the first and second training parameters. The fine-tuning training is performed on the first training parameters and the second training parameters in the preset integrated network model by using three-channel color training image sequences and the corresponding known gait silhouette images, where a known gait silhouette image may be a binary pedestrian contour sequence in which 0 represents the background and 1 represents the pedestrian. The network training uses the L1 loss function of the gait segmentation network together with the triplet loss function and the cross-entropy loss function of the gait recognition network, and the loss function calculation, gradient calculation and parameter updating of the gait segmentation network and the gait recognition network are carried out synchronously. In this embodiment, the obtained integrated pedestrian segmentation and gait recognition network can extract the gait contour and the gait features simultaneously, realizes end-to-end training of the pedestrian segmentation network and the gait recognition network, and improves the recognition accuracy and robustness of the gait recognition network.
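A minimal sketch of one fine-tuning iteration of the first preset integrated network, summing the L1 segmentation loss with the triplet and cross-entropy recognition losses so that both parameter sets are updated synchronously; the module interfaces and unit loss weights are assumptions:

```python
import torch
import torch.nn.functional as F

triplet_loss = torch.nn.TripletMarginLoss(margin=0.2)

def finetune_step(seg_model, gait_model, optimizer, frames, known_silh, pos_frames, neg_frames, labels):
    """frames/pos_frames/neg_frames: color image sequences; known_silh: binary contour sequences."""
    pred_silh = seg_model(frames)                     # estimated gait probability silhouettes
    seg_loss = F.l1_loss(pred_silh, known_silh)       # L1 loss of the segmentation branch
    feat_a, ident_a = gait_model(pred_silh)           # recognition runs on the segmentation output
    feat_p, _ = gait_model(seg_model(pos_frames))
    feat_n, _ = gait_model(seg_model(neg_frames))
    loss = seg_loss + triplet_loss(feat_a, feat_p, feat_n) + F.cross_entropy(ident_a, labels)
    optimizer.zero_grad()
    loss.backward()                                   # gradients flow through both networks
    optimizer.step()                                  # synchronous parameter update
    return loss.item()
```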
Optionally, the pre-training the original gait recognition model by using the plurality of groups of first pre-estimated gait silhouette sequences to obtain a preset gait recognition model includes: respectively carrying out normalization processing on each gait silhouette image in the multiple groups of first estimated gait silhouette sequences to obtain multiple groups of second estimated gait silhouette sequences; respectively superposing each group of second estimated gait silhouette sequences in the plurality of groups of second estimated gait silhouette sequences, and then calculating the average value to obtain a plurality of energy graphs corresponding to the plurality of groups of second estimated gait silhouette sequences, wherein the plurality of energy graphs are used as a third training data set, and one group of second estimated gait silhouette sequences corresponds to one energy graph; training an original gait recognition model by using the third training data set to obtain a second preset gait recognition model, wherein a fourth loss function of a second estimated gait feature vector output by the second preset gait recognition model meets a fourth convergence condition, the fourth convergence condition is used for indicating that an output value of the fourth loss function is within a sixth preset range, and/or a fifth estimated identity feature vector output by the second preset gait recognition model and a fifth loss function of a known identity feature vector of a third training object meet a fifth convergence condition, the fifth convergence condition is used for indicating that an output value of the fifth loss function is within a seventh preset range, and the preset gait recognition model comprises the second preset gait recognition model.
As an alternative embodiment, the energy map may also be used as a training set for the gait recognition model. In this embodiment, the estimated gait silhouette image output by the preset gait segmentation network model is preprocessed through the gait preprocessing function F(x), and the output values are input to the feature shaping module so as to obtain a gait energy image corresponding to the estimated gait silhouette image.
In the present embodiment, the input is a pedestrian gait contour sequence {S_a1, S_a2, ..., S_ak}, {S_b1, S_b2, ..., S_bk}, {S_c1, S_c2, ..., S_ck}, where a, b, c denote pedestrians of different identities and the indices 1, 2, ..., k denote different gait contours of the same pedestrian. The input is expressed as a four-dimensional vector (batch dimension, feature channel dimension, feature height dimension and feature width dimension respectively, with the feature channel dimension equal to 1). When a network based on the pedestrian gait energy map is employed, the feature shaping module averages the pedestrian gait contour sequence of each identity over the batch dimension, e.g. S_a = (S_a1 + S_a2 + ... + S_ak) / k, obtaining a gait energy map S_a, S_b, S_c, ... for each pedestrian, where a, b, c denote pedestrians of different identities. The gait recognition network can be trained by adopting a network structure based on the pedestrian gait energy map: the obtained gait energy maps are used as the third training data set to train the multilayer convolutional neural network, yielding the second preset gait recognition model. When a network structure based on the pedestrian gait energy map is adopted, the input pedestrian gait energy map is expressed as a three-dimensional input vector (feature channel dimension, feature height dimension and feature width dimension respectively); the gait recognition network consists of a multilayer convolutional neural network and a fully connected layer, the multilayer convolutional neural network outputs the gait feature vector (used to calculate the similarity between the pedestrian to be recognized and the verification pedestrian), and the gait feature vector is passed through the fully connected layer to output the gait identity feature vector.
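A minimal sketch of the feature shaping step, assuming the tensor layout described above: the normalized silhouettes of one identity are averaged over the sequence dimension to form that pedestrian's gait energy map.

```python
import torch

def gait_energy_map(silhouette_seq):
    """silhouette_seq: (k, 1, H, W) normalized gait contour sequence of one pedestrian."""
    return silhouette_seq.mean(dim=0)  # (1, H, W) gait energy map, e.g. S_a for pedestrian a
```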
In this embodiment, the recognition result of the gait recognition model may be a gait feature vector or an identity feature vector; the gait feature vector is used to represent the gait features of the person to be recognized, and the identity feature vector is used to represent the identity of the person to be recognized. When the gait feature vector is used as the output, the gait feature vector output by the gait recognition network is trained by using a triplet loss function, and the fourth loss function corresponds to the triplet loss function, so that the output value of the loss function of the gait feature vector output by the trained gait recognition model is within a sixth predetermined range, which can be determined according to actual conditions, for example 0.2 or 0.01. When the identity feature vector is used as the output, the identity feature vector output by the gait recognition network is trained by using a cross-entropy loss function, and the fifth loss function corresponds to the cross-entropy loss function, so that the output value of the loss function between the identity feature vector output by the trained gait recognition model and the known identity feature vector of the training object is within a seventh predetermined range, which can be determined according to actual conditions, for example 0.2 or 0.01. In this embodiment, the pedestrian gait energy map obtained from the gait segmentation network is better suited to the gait recognition network for extracting gait features, and the recognition accuracy of the gait recognition network can be improved.
Optionally, after the obtaining of the second preset gait recognition model, the method comprises: determining a model formed by the preset gait segmentation network model and the second preset gait recognition model as a second preset integrated network model; loading a first training parameter of the preset gait segmentation network model and a third training parameter of a second preset gait recognition model; and performing fine tuning training on the first training parameter and the third training parameter in the second preset integrated network model by using multiple groups of second step sequences to obtain the integrated network model, wherein a second target loss function of a gait feature vector output by the integrated network model meets a second target convergence condition, the second target convergence condition is used for indicating that an output value of the second target loss function is within an eighth preset range, and/or a third target loss function of an estimated identity feature vector output by the integrated network model and a known identity feature vector of a third training object meets a third target convergence condition, and the third target convergence condition is used for indicating that an output value of the third target loss function is within a ninth preset range.
As an optional implementation, the trained preset gait segmentation network model and the second preset gait recognition model together serve as the second preset integrated network model, which is then fine-tuned. Fine-tuning training is performed on the second preset integrated network model by using the known gait silhouettes of the training objects, specifically as follows:
The network parameters of the trained preset gait segmentation network model are loaded as the first training parameters, the network parameters of the first preset gait recognition model or the second preset gait recognition model are taken as the third training parameters, and fine-tuning training is carried out on the first training parameters and the third training parameters.
And performing fine tuning training on the first training parameter and the third training parameter in the second preset integrated network model by using at least two three-channel color sequences of training images and corresponding known gait silhouette images, wherein the known gait silhouette images can be binary pedestrian contour sequences, 0 represents a background, and 1 represents a pedestrian. The network training uses the L1 loss function of the gait segmentation network and the triplet loss function and the cross entropy loss function of the gait recognition network, and the loss function calculation, the gradient calculation and the parameter updating of the gait segmentation network and the gait recognition network are synchronously carried out. In the embodiment, the obtained pedestrian segmentation and gait recognition integrated network can simultaneously realize the extraction of the gait outline and the gait feature, can realize the end-to-end training of the pedestrian segmentation network and the gait recognition network, and improves the recognition accuracy and the robustness of the gait recognition network.
Optionally, the determining the identity information of the target object through the gait feature vector includes: determining the cosine distance between the gait feature vector and each preset gait feature vector stored in a preset gait search base; and determining the identity information corresponding to the preset gait feature vector whose cosine distance is less than or equal to a preset threshold value as the estimated identity information of the target object, wherein at least two preset gait feature vectors are stored in the preset gait search base, and each preset gait feature vector is stored in association with its corresponding identity information.
As an optional implementation, the preset gait search base may be a pre-established gallery: pedestrian image sequences are collected in advance and each image sequence is labeled with identity information; a pedestrian segmentation model is used to segment the pedestrian images to obtain gait silhouettes; the gait silhouette sequences labeled with identity information form the gait recognition search base, and the gait feature vectors extracted from these pedestrian gait silhouette sequences are stored as the retrieval items used during gait recognition. Gait recognition is then completed by calculating the cosine distances between the gait feature vector output by the integrated network model and each preset gait feature vector stored in the search base, sorting the distances from small to large, and taking the identity of the gallery pedestrian whose feature vector has the smallest distance to the feature vector of the pedestrian to be detected as the estimated identity information of that pedestrian.
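A hedged sketch of this retrieval step is given below; the gallery contents, feature dimension and threshold are illustrative assumptions only:

```python
import numpy as np

def cosine_distance(a, b):
    # 1 - cosine similarity; smaller means more similar
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def identify(query_feat, gallery_feats, gallery_ids, threshold=0.3):
    """Return the identity whose stored feature vector is closest to the query,
    provided the cosine distance does not exceed the preset threshold."""
    dists = np.array([cosine_distance(query_feat, g) for g in gallery_feats])
    order = np.argsort(dists)            # sort distances from small to large
    best = order[0]
    if dists[best] <= threshold:
        return gallery_ids[best], dists[best]
    return None, dists[best]             # no identity within the threshold

# toy usage with a random gallery of five identities
gallery_feats = np.random.rand(5, 256)
gallery_ids = ["id_%d" % i for i in range(5)]
query = gallery_feats[2] + 0.01 * np.random.rand(256)
print(identify(query, gallery_feats, gallery_ids))
```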
The present application is illustrated below by a specific embodiment, which may include the following steps:
Step S1: an integrated network for gait segmentation and gait recognition is constructed, comprising a gait segmentation network, a gait preprocessing module and a gait recognition network. The gait segmentation network consists of a multilayer convolutional neural network and may adopt a typical segmentation network structure such as U-Net, DeepLab v3 or PSPNet. Its input is a pedestrian three-channel color image, expressed as a three-dimensional feature vector (feature channel dimension, feature height dimension and feature width dimension respectively); after passing through the multilayer convolutional neural network, it outputs a single-channel gait probability pedestrian segmentation map, expressed as a three-dimensional gait probability feature vector (feature channel dimension, feature height dimension and feature width dimension respectively), in which the value of each pixel point is a floating point number representing the probability that the pixel is a pedestrian pixel.
The gait preprocessing module preprocesses the output of the gait segmentation network and comprises a gait preprocessing function and a feature shaping module. Its input is the gait probability pedestrian segmentation map output by the gait segmentation network, in which the value of each pixel point is a floating point number representing the probability that the pixel is a pedestrian pixel. In order to enable end-to-end training of the gait segmentation network and the gait recognition network, the gait preprocessing module applies a differentiable gait preprocessing function F(x) to each pixel value. F(x) has three properties: it is differentiable, so gradients can propagate through it; after mapping by F(x), the value of each pixel point lies in the range 0 to 1; and the parameter k is much larger than 1, for example k = 400, so that at x = 0.5 the slope (derivative) of F(x) is k/4, which is large because k is large.
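The concrete expression of F(x) is given in the drawings rather than in this text; one function that satisfies the listed properties is a steep logistic sigmoid centred at 0.5, whose derivative at x = 0.5 equals k/4. A minimal sketch under that assumption:

```python
import torch

def gait_preprocess(x, k=400.0):
    """Assumed form of F(x): a steep sigmoid that pushes per-pixel pedestrian
    probabilities towards {0, 1} while remaining differentiable, so gradients
    can flow from the recognition network back into the segmentation network."""
    return torch.sigmoid(k * (x - 0.5))   # derivative at x = 0.5 is k / 4

probs = torch.tensor([0.1, 0.49, 0.5, 0.51, 0.9], requires_grad=True)
out = gait_preprocess(probs)
out.sum().backward()                        # the mapping stays differentiable
print(out)                                  # values pushed close to 0 or 1
```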
The gait preprocessing function is followed by a feature shaping module. Its input is the pedestrian gait contour sequence (S_a1, S_a2, ..., S_ak, S_b1, S_b2, ..., S_bk, S_c1, S_c2, ..., S_ck, ..., where a, b, c denote pedestrians of different identities and the indices 1, 2, ..., k denote contours of the same pedestrian at different moments), expressed as a four-dimensional input vector (batch dimension, feature channel dimension, feature height dimension and feature width dimension respectively, with the feature channel dimension equal to 1). When a network based on a pedestrian gait energy map is employed, the feature shaping module averages the pedestrian gait contour sequence by pedestrian identity over the batch dimension, for example S_a = (S_a1 + S_a2 + ... + S_ak) / k, to obtain a gait energy map for each pedestrian (S_a, S_b, S_c, where a, b, c denote pedestrians of different identities); when a network based on a pedestrian gait sequence is employed, the feature shaping module performs no operation.
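A minimal sketch of the feature shaping step under the gait-energy-map option, with illustrative shapes and identity labels, might look as follows:

```python
import torch

def shape_features(contours, ids):
    """contours: tensor of shape (batch, 1, H, W), one silhouette per frame.
    ids: list of identity labels, one per frame.
    Returns a dict mapping identity -> gait energy map of shape (1, H, W)."""
    energy_maps = {}
    for identity in set(ids):
        idx = [i for i, p in enumerate(ids) if p == identity]
        # S_a = mean of S_a1 ... S_ak over the batch (time) dimension
        energy_maps[identity] = contours[idx].mean(dim=0)
    return energy_maps

frames = torch.rand(6, 1, 64, 44)        # 6 probability-valued silhouettes
ids = ["a", "a", "a", "b", "b", "b"]     # two identities, three frames each
maps = shape_features(frames, ids)
print({k: v.shape for k, v in maps.items()})
```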
The gait recognition network may employ a network structure based on a pedestrian gait sequence or on a pedestrian gait energy map. When a structure based on a pedestrian gait energy map is adopted, the input is the pedestrian gait energy map, expressed as a three-dimensional input vector (feature channel dimension, feature height dimension and feature width dimension respectively); when a structure based on a pedestrian gait sequence is adopted, the input is the pedestrian gait sequence, expressed as a four-dimensional input vector (time sequence dimension, feature channel dimension, feature height dimension and feature width dimension respectively), and the network structure has the capability of fusing temporal and spatial features. The gait recognition network consists of a multilayer convolutional neural network and a fully connected layer: the multilayer convolutional neural network outputs a gait feature vector (used to calculate the similarity between the pedestrian to be recognized and the verification pedestrian), and the fully connected layer maps the gait feature vector to a gait identity feature vector.
Step S2: the gait segmentation network is trained using a pedestrian segmentation data set. The parameters of the gait recognition network are fixed, and forward computation and backward gradient propagation are performed only on the gait segmentation network. The pedestrian segmentation data set comprises pedestrian three-channel color images and their corresponding binary pedestrian contour maps, and a cross entropy loss function is used for training. The ground truth is the binary gait contour map of the pedestrian, in which the value of each pixel point is 1 or 0, with 1 indicating that the pixel belongs to the pedestrian and 0 indicating that it belongs to the background. After training, the gait segmentation network has the ability to segment the pedestrian contour, and its output can serve as the input of the gait recognition network for extracting the gait information of the pedestrian.
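For illustration, a minimal sketch of this step (assuming PyTorch; the stand-in segmentation network, shapes and learning rate are placeholders for U-Net/DeepLab/PSPNet-style models) is:

```python
import torch
import torch.nn as nn

seg_net = nn.Sequential(                   # stand-in for U-Net / DeepLab / PSPNet
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),        # one-channel pedestrian probability logits
)
criterion = nn.BCEWithLogitsLoss()         # per-pixel (binary) cross-entropy loss
optimizer = torch.optim.Adam(seg_net.parameters(), lr=1e-3)

images = torch.rand(4, 3, 64, 44)                      # three-channel colour frames
masks = torch.randint(0, 2, (4, 1, 64, 44)).float()    # 1 = pedestrian, 0 = background

logits = seg_net(images)
loss = criterion(logits, masks)            # only the segmentation network is updated
optimizer.zero_grad()
loss.backward()
optimizer.step()
```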
Step S3: the gait recognition network is trained using a pedestrian gait recognition data set. The parameters of the gait segmentation network are fixed, and forward computation and backward gradient propagation are performed only on the gait recognition network. The pedestrian gait recognition data set consists of binary pedestrian contour map sequences, and the training uses a triplet loss function and a cross entropy loss function: the gait feature vector output by the gait recognition network is trained with the triplet loss function, and the gait identity feature vector is trained with the cross entropy loss function. After training, the gait recognition network has the ability to extract gait feature vectors carrying pedestrian identity information.
Step S4: and performing fine tuning training on the gait segmentation and gait recognition integrated network by using the pedestrian gait segmentation recognition data set. And loading the gait segmentation network and the gait recognition network parameters obtained by the training of S2 and S3, and carrying out fine tuning training on the two networks. The pedestrian gait segmentation identification data set is a pedestrian gait three-channel color sequence and a corresponding binary pedestrian contour sequence. The network training uses the L1 loss function of the gait segmentation network and the triplet loss function and the cross entropy loss function of the gait recognition network, the gradient calculation mode is consistent with the corresponding gradient calculation modes of S2 and S3, and the difference is that the loss function calculation, the gradient calculation and the parameter updating of the gait segmentation network and the gait recognition network are synchronously carried out. After training, on one hand, a pedestrian gait energy image obtained by the gait segmentation network is more suitable for the gait recognition network to extract gait features, and the recognition accuracy of the gait recognition network can be improved; on the other hand, the input gait outline of the gait recognition network is represented by floating point numbers, so that the robustness of the gait recognition network is improved.
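A hedged sketch of this joint fine-tuning objective, with small stand-in modules and illustrative shapes (none of which are taken from this application), might be:

```python
import torch
import torch.nn as nn

# stand-in segmentation and recognition networks for illustration only
seg_net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid())
rec_backbone = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                             nn.AdaptiveAvgPool2d(1), nn.Flatten())
embed = nn.Linear(16, 64)          # gait feature vector
classifier = nn.Linear(64, 10)     # gait identity feature vector

def recognize(silhouette):
    feat = embed(rec_backbone(silhouette))
    return feat, classifier(feat)

l1, triplet, ce = nn.L1Loss(), nn.TripletMarginLoss(), nn.CrossEntropyLoss()
params = (list(seg_net.parameters()) + list(rec_backbone.parameters())
          + list(embed.parameters()) + list(classifier.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-4)   # small learning rate for fine tuning

# one toy batch: colour frames, ground-truth binary contours, identity labels
anchor_img, pos_img, neg_img = (torch.rand(2, 3, 32, 22) for _ in range(3))
anchor_gt = torch.randint(0, 2, (2, 1, 32, 22)).float()
labels = torch.randint(0, 10, (2,))

anchor_sil = seg_net(anchor_img)
fa, logits = recognize(anchor_sil)
fp, _ = recognize(seg_net(pos_img))
fn, _ = recognize(seg_net(neg_img))

# segmentation L1 loss + recognition triplet loss + cross entropy loss, summed
loss = l1(anchor_sil, anchor_gt) + triplet(fa, fp, fn) + ce(logits, labels)
optimizer.zero_grad()
loss.backward()     # gradients reach both networks synchronously
optimizer.step()
```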
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, an identity recognition device based on gait information is further provided, and the device is used to implement the above embodiments and preferred embodiments, which have already been described and will not be described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 4 is a block diagram of an identification apparatus based on gait information according to an embodiment of the invention, and as shown in fig. 4, the apparatus includes: the acquiring module 42 is configured to acquire at least two frames of target images captured by an image capturing device, where the at least two frames of target images include gait information of a target object; an analysis module 44, configured to analyze the gait information in the at least two frames of target images by using the at least two frames of target images as an input of an integrated network model, and obtain gait feature vectors of the target object output by the integrated network model, where the integrated network model includes a gait segmentation network model and a gait recognition model, the gait segmentation network model is used to segment gait silhouettes of the target object in the at least two frames of target images, the gait recognition model is used to analyze the gait silhouettes output by the gait segmentation network model to obtain gait feature vectors, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model using multiple sets of training data; a first determining module 46, configured to determine identity information of the target object through the gait feature vector.
Optionally, the apparatus is further configured to: pre-train an original segmentation network model using a first training data set before the gait information in the at least two frames of target images is analyzed by the integrated network model, so as to obtain a preset gait segmentation network model, where the first training data set includes at least two training images and corresponding known gait silhouettes; input a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences; and pre-train the original gait recognition model by using the plurality of groups of first estimated gait silhouette sequences to obtain a preset gait recognition model.
Optionally, the apparatus is further configured to perform the pre-training on the original segmentation network model by using the first training data set, as follows: determining a known gait silhouette image of a first training object in the at least two frames of training images, wherein the at least two frames of training images comprise gait information of the first training object; and taking the at least two training images and the corresponding known gait silhouette as a first training data set, and pre-training the original segmentation network model by using the first training data set to obtain the preset gait segmentation network model, wherein a first loss function between an estimated gait silhouette output by the preset gait segmentation network model and the known gait silhouette of the first training object meets a first convergence condition, the first convergence condition is used for indicating that the output value of the first loss function is within a first preset range, the numerical value of each pixel point in the estimated gait silhouette is a floating point, and the floating point is used for indicating the probability that the pixel point is the first training object.
Optionally, the apparatus is further configured to input a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first predicted gait silhouette sequences by: and inputting a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences which are output by the preset gait segmentation network model and correspond to the plurality of groups of first step sequences, wherein each group of first step sequences in the plurality of groups of first step sequences comprises at least two frames of images, and each group of first step sequences is used for representing gait information of a second training object.
Optionally, the apparatus is further configured to pre-train the original gait recognition model by using the plurality of groups of first pre-estimated gait silhouette sequences to obtain a pre-estimated gait recognition model by: respectively carrying out normalization processing and binarization processing on each gait silhouette image in the multiple groups of first estimated gait silhouette sequences to obtain a second training data set, wherein 0 represents a pixel point of a background image in a training image in the second training data set, and 1 represents a pixel point of a second training object in the training image; pre-training the original gait recognition model by using a second training data set to obtain a first preset gait recognition model, wherein a second loss function of a first estimated gait feature vector output by the first preset gait recognition model meets a second convergence condition, the second convergence condition is used for representing that an output value of the second loss function is within a second preset range, and/or a third loss function of a first estimated identity feature vector output by the first preset gait recognition model and a known identity feature vector of a second training object meets a third convergence condition, the third convergence condition is used for representing that an output value of the third loss function is within a third preset range, and the preset gait recognition model comprises the first preset gait recognition model.
Optionally, the device is further configured to determine, after obtaining the first preset gait recognition model, that a model formed by the preset gait segmentation network model and the first preset gait recognition model is a first preset integrated network model; loading a first training parameter of the preset gait segmentation network model and a second training parameter of the first preset gait recognition model; and performing fine tuning training on the first training parameter and the second training parameter in the first preset integrated network model by using multiple groups of first step sequences to obtain the integrated network model, wherein a first target loss function of a gait feature vector output by the integrated network model meets a first target convergence condition, the first target convergence condition is used for indicating that an output value of the first target loss function is within a fourth preset range, and/or a second target loss function of an estimated identity feature vector output by the integrated network model and a known identity feature vector of a second training object meets a second target convergence condition, and the second target convergence condition is used for indicating that an output value of the second target loss function is within a fifth preset range.
Optionally, the apparatus is further configured to pre-train the original gait recognition model by using the plurality of groups of first estimated gait silhouette sequences to obtain the preset gait recognition model by: respectively performing normalization processing on each gait silhouette image in the multiple groups of first estimated gait silhouette sequences to obtain multiple groups of second estimated gait silhouette sequences; respectively superposing each group of second estimated gait silhouette sequences in the plurality of groups of second estimated gait silhouette sequences and then calculating the average value to obtain a plurality of energy maps corresponding to the plurality of groups of second estimated gait silhouette sequences, wherein the plurality of energy maps are used as a third training data set and one group of second estimated gait silhouette sequences corresponds to one energy map; and training the original gait recognition model by using the third training data set to obtain a second preset gait recognition model, wherein a fourth loss function of a second estimated gait feature vector output by the second preset gait recognition model meets a fourth convergence condition, the fourth convergence condition being used for indicating that an output value of the fourth loss function is within a sixth preset range, and/or a fifth loss function between a second estimated identity feature vector output by the second preset gait recognition model and a known identity feature vector of a third training object meets a fifth convergence condition, the fifth convergence condition being used for indicating that an output value of the fifth loss function is within a seventh preset range, and the preset gait recognition model comprises the second preset gait recognition model.
Optionally, the device is further configured to determine, after the second preset gait recognition model is obtained, that a model formed by the preset gait segmentation network model and the second preset gait recognition model is a second preset integrated network model; loading a first training parameter of the preset gait segmentation network model and a third training parameter of a second preset gait recognition model; and performing fine tuning training on the first training parameter and the third training parameter in the second preset integrated network model by using multiple groups of second step sequences to obtain the integrated network model, wherein a second target loss function of a gait feature vector output by the integrated network model meets a second target convergence condition, the second target convergence condition is used for indicating that an output value of the second target loss function is within an eighth preset range, and/or a third target loss function of an estimated identity feature vector output by the integrated network model and a known identity feature vector of a third training object meets a third target convergence condition, and the third target convergence condition is used for indicating that an output value of the third target loss function is within a ninth preset range.
Optionally, the above apparatus is further configured to determine the identity information of the target object through the gait feature vector by: determining the cosine distance between the gait feature vector and each preset gait feature vector stored in a preset gait search base; and determining the identity information corresponding to the preset gait feature vector whose cosine distance is less than or equal to a preset threshold value as the estimated identity information of the target object, wherein at least two preset gait feature vectors are stored in the preset gait search base, and each preset gait feature vector is stored in association with its corresponding identity information.
It should be noted that, the above modules may be implemented by software or hardware, and for the latter, the following may be implemented, but not limited to: the modules are all positioned in the same processor; alternatively, the modules are respectively located in different processors in any combination.
Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
s1, acquiring at least two frames of target images shot by image acquisition equipment, wherein the at least two frames of target images comprise gait information of a target object;
s2, taking the at least two frames of target images as input of an integrated network model, analyzing the gait information in the at least two frames of target images through the integrated network model to obtain gait feature vectors of the target object output by the integrated network model, wherein the integrated network model comprises a gait segmentation network model and a gait recognition model, the gait segmentation network model is used for segmenting gait silhouettes of the target object in the at least two frames of target images, the gait recognition model is used for analyzing the gait silhouettes output by the gait segmentation network model to obtain gait feature vectors, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model by using multiple sets of training data;
and S3, determining the identity information of the target object through the gait feature vector.
Optionally, in this embodiment, the storage medium may include, but is not limited to: various media capable of storing computer programs, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, acquiring at least two frames of target images shot by image acquisition equipment, wherein the at least two frames of target images comprise gait information of a target object;
s2, taking the at least two frames of target images as input of an integrated network model, analyzing the gait information in the at least two frames of target images through the integrated network model to obtain gait feature vectors of the target object output by the integrated network model, wherein the integrated network model comprises a gait segmentation network model and a gait recognition model, the gait segmentation network model is used for segmenting gait silhouettes of the target object in the at least two frames of target images, the gait recognition model is used for analyzing the gait silhouettes output by the gait segmentation network model to obtain gait feature vectors, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model by using multiple sets of training data;
and S3, determining the identity information of the target object through the gait feature vector.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An identity recognition method based on gait information is characterized by comprising the following steps:
acquiring at least two frames of target images shot by image acquisition equipment, wherein the at least two frames of target images comprise gait information of a target object;
taking the at least two frames of target images as input of an integrated network model, analyzing the gait information in the at least two frames of target images through the integrated network model to obtain gait feature vectors of the target object output by the integrated network model, wherein the integrated network model comprises a gait segmentation network model and a gait recognition model, the gait segmentation network model is used for segmenting a gait silhouette of the target object in the at least two frames of target images, the gait recognition model is used for analyzing the gait silhouette output by the gait segmentation network model to obtain gait feature vectors, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model by using a plurality of groups of training data;
and determining the identity information of the target object through the gait feature vector.
2. The method of claim 1, wherein prior to said analyzing the gait information in the at least two frames of target images by the integrated network model, the method further comprises:
pre-training an original segmentation network model by using a first training data set to obtain a preset gait segmentation network model, wherein the first training data set comprises: at least two training images and corresponding known gait silhouettes;
inputting a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences;
and pre-training the original gait recognition model by using the plurality of groups of first pre-estimated gait silhouette sequences to obtain a preset gait recognition model.
3. The method of claim 2, wherein pre-training the original segmented network model using the first training data set comprises:
determining a known gait silhouette image of a first training object in the at least two frames of training images, wherein the at least two frames of training images comprise gait information of the first training object;
and taking the at least two training images and the corresponding known gait silhouette as a first training data set, and pre-training the original segmentation network model by using the first training data set to obtain the preset gait segmentation network model, wherein a first loss function between an estimated gait silhouette output by the preset gait segmentation network model and the known gait silhouette of the first training object meets a first convergence condition, the first convergence condition is used for indicating that the output value of the first loss function is within a first preset range, the numerical value of each pixel point in the estimated gait silhouette is a floating point, and the floating point is used for indicating the probability that the pixel point is the first training object.
4. The method of claim 2, wherein inputting the plurality of first step sequences into the predetermined gait segmentation network model to obtain a plurality of first estimated gait silhouette sequences comprises:
and inputting a plurality of groups of first step sequences into the preset gait segmentation network model to obtain a plurality of groups of first estimated gait silhouette sequences which are output by the preset gait segmentation network model and correspond to the plurality of groups of first step sequences, wherein each group of first step sequences in the plurality of groups of first step sequences comprises at least two frames of images, and each group of first step sequences is used for representing gait information of a second training object.
5. The method of claim 4, wherein the pre-training of the original gait recognition model using the plurality of sets of first estimated gait silhouette sequences to obtain a pre-defined gait recognition model comprises:
respectively carrying out normalization processing and binarization processing on each gait silhouette image in the multiple groups of first estimated gait silhouette sequences to obtain a second training data set, wherein 0 represents a pixel point of a background image in a training image in the second training data set, and 1 represents a pixel point of a second training object in the training image;
pre-training the original gait recognition model by using a second training data set to obtain a first preset gait recognition model, wherein a second loss function of a first estimated gait feature vector output by the first preset gait recognition model meets a second convergence condition, the second convergence condition is used for representing that an output value of the second loss function is within a second preset range, and/or a third loss function of a first estimated identity feature vector output by the first preset gait recognition model and a known identity feature vector of a second training object meets a third convergence condition, the third convergence condition is used for representing that an output value of the third loss function is within a third preset range, and the preset gait recognition model comprises the first preset gait recognition model.
6. The method according to claim 5, wherein after said deriving said first predetermined gait recognition model, the method comprises:
determining a model formed by the preset gait segmentation network model and the first preset gait recognition model as a first preset integrated network model;
loading a first training parameter of the preset gait segmentation network model and a second training parameter of the first preset gait recognition model;
and performing fine tuning training on the first training parameter and the second training parameter in the first preset integrated network model by using multiple groups of first step sequences to obtain the integrated network model, wherein a first target loss function of a gait feature vector output by the integrated network model meets a first target convergence condition, the first target convergence condition is used for indicating that an output value of the first target loss function is within a fourth preset range, and/or a second target loss function of an estimated identity feature vector output by the integrated network model and a known identity feature vector of a second training object meets a second target convergence condition, and the second target convergence condition is used for indicating that an output value of the second target loss function is within a fifth preset range.
7. The method of claim 4, wherein the pre-training of the original gait recognition model using the plurality of sets of first estimated gait silhouette sequences to obtain a pre-defined gait recognition model comprises:
respectively carrying out normalization processing on each gait silhouette image in the multiple groups of first estimated gait silhouette sequences to obtain multiple groups of second estimated gait silhouette sequences;
respectively superposing each group of second estimated gait silhouette sequences in the plurality of groups of second estimated gait silhouette sequences, and then calculating the average value to obtain a plurality of energy graphs corresponding to the plurality of groups of second estimated gait silhouette sequences, wherein the plurality of energy graphs are used as a third training data set, and one group of second estimated gait silhouette sequences corresponds to one energy graph;
training an original gait recognition model by using the third training data set to obtain a second preset gait recognition model, wherein a fourth loss function of a second estimated gait feature vector output by the second preset gait recognition model meets a fourth convergence condition, the fourth convergence condition is used for indicating that an output value of the fourth loss function is within a sixth preset range, and/or a fifth loss function between a second estimated identity feature vector output by the second preset gait recognition model and a known identity feature vector of a third training object meets a fifth convergence condition, the fifth convergence condition is used for indicating that an output value of the fifth loss function is within a seventh preset range, and the preset gait recognition model comprises the second preset gait recognition model.
8. The method according to claim 7, wherein after said deriving a second predetermined gait recognition model, the method comprises:
determining a model formed by the preset gait segmentation network model and the second preset gait recognition model as a second preset integrated network model;
loading a first training parameter of the preset gait segmentation network model and a third training parameter of a second preset gait recognition model;
and performing fine tuning training on the first training parameter and the third training parameter in the second preset integrated network model by using multiple groups of second step sequences to obtain the integrated network model, wherein a second target loss function of a gait feature vector output by the integrated network model meets a second target convergence condition, the second target convergence condition is used for indicating that an output value of the second target loss function is within an eighth preset range, and/or a third target loss function of an estimated identity feature vector output by the integrated network model and a known identity feature vector of a third training object meets a third target convergence condition, and the third target convergence condition is used for indicating that an output value of the third target loss function is within a ninth preset range.
9. The method of claim 1, wherein the determining the identity information of the target object through the gait feature vector comprises:
determining the cosine distance between the gait feature vector and each preset gait feature vector stored in a preset gait search base;
and determining the identity information corresponding to the preset gait feature vector whose cosine distance is less than or equal to a preset threshold value as the estimated identity information of the target object, wherein at least two preset gait feature vectors are stored in the preset gait search base, and each preset gait feature vector is stored in association with its corresponding identity information.
10. An identification device based on gait information, comprising:
the device comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring at least two frames of target images shot by image acquisition equipment, and the at least two frames of target images comprise gait information of a target object;
the analysis module is used for analyzing the gait information in the at least two frames of target images by using the at least two frames of target images as the input of an integrated network model, and obtaining the gait feature vector of the target object output by the integrated network model, the integrated network model comprises a gait segmentation network model and a gait recognition model, the gait segmentation network model is used for segmenting a gait silhouette of the target object in the at least two frames of target images, the gait recognition model is used for analyzing the gait silhouette output by the gait segmentation network model to obtain the gait feature vector, and the gait segmentation network model and the gait recognition model are obtained by training an original convolutional neural network model by using a plurality of groups of training data;
and the first determination module is used for determining the identity information of the target object through the gait feature vector.
CN202010773633.1A 2020-08-04 2020-08-04 Gait information-based identity recognition method and device Pending CN111914762A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010773633.1A CN111914762A (en) 2020-08-04 2020-08-04 Gait information-based identity recognition method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010773633.1A CN111914762A (en) 2020-08-04 2020-08-04 Gait information-based identity recognition method and device

Publications (1)

Publication Number Publication Date
CN111914762A true CN111914762A (en) 2020-11-10

Family

ID=73287834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010773633.1A Pending CN111914762A (en) 2020-08-04 2020-08-04 Gait information-based identity recognition method and device

Country Status (1)

Country Link
CN (1) CN111914762A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112434622A (en) * 2020-11-27 2021-03-02 浙江大华技术股份有限公司 Pedestrian segmentation and gait recognition integrated method based on convolutional neural network
CN113221809A (en) * 2021-05-26 2021-08-06 每日互动股份有限公司 Motion state identification method based on silhouette image, electronic device and medium
CN113469095A (en) * 2021-07-13 2021-10-01 浙江大华技术股份有限公司 Gait-based person secondary verification method and device
CN113537121A (en) * 2021-07-28 2021-10-22 浙江大华技术股份有限公司 Identity recognition method and device, storage medium and electronic equipment
CN113591724A (en) * 2021-08-02 2021-11-02 浙江大华技术股份有限公司 Identity recognition method and device based on gait and veins
CN114140883A (en) * 2021-12-10 2022-03-04 沈阳康泰电子科技股份有限公司 Gait recognition method and device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016065534A1 (en) * 2014-10-28 2016-05-06 中国科学院自动化研究所 Deep learning-based gait recognition method
CN105760835A (en) * 2016-02-17 2016-07-13 天津中科智能识别产业技术研究院有限公司 Gait segmentation and gait recognition integrated method based on deep learning
CN111144167A (en) * 2018-11-02 2020-05-12 银河水滴科技(北京)有限公司 Gait information identification optimization method, system and storage medium
CN109815874A (en) * 2019-01-17 2019-05-28 苏州科达科技股份有限公司 A kind of personnel identity recognition methods, device, equipment and readable storage medium storing program for executing
CN110084156A (en) * 2019-04-12 2019-08-02 中南大学 A kind of gait feature abstracting method and pedestrian's personal identification method based on gait feature
CN111368635A (en) * 2020-02-05 2020-07-03 北京邮电大学 Millimeter wave-based multi-person gait recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201110