CN116092186A - Gait recognition method, electronic device, and computer-readable storage medium - Google Patents

Info

Publication number
CN116092186A
CN116092186A
Authority
CN
China
Prior art keywords
gait
sampling
sequence
model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211699164.9A
Other languages
Chinese (zh)
Inventor
王昕
潘华东
殷俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202211699164.9A priority Critical patent/CN116092186A/en
Publication of CN116092186A publication Critical patent/CN116092186A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition
    • G06V40/23 - Recognition of whole body movements, e.g. for sport training
    • G06V40/25 - Recognition of walking or running movements, e.g. gait recognition
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 - Segmentation of patterns by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V10/28 - Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G06V10/40 - Extraction of image or video features
    • G06V10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a gait recognition method, an electronic device and a computer-readable storage medium. The method comprises: acquiring a gait sequence of at least one target object; inputting the gait sequence into a gait recognition model, and extracting gait features of the target object from the gait sequence with the gait recognition model, wherein the gait recognition model is obtained by contrastive training on equally spaced sampling gait sequences and random interval sampling gait sequences, and the equally spaced sampling gait sequences include consecutive-frame gait sequences; and matching the gait features against stored user gait features with the gait recognition model, and outputting a recognition result. The method and device solve both the reduced recognition efficiency of algorithms trained only on consecutive-frame gait sequences and the reduced recognition accuracy of algorithms trained only on equally spaced sampling sequences, thereby improving the gait recognition effect of the gait recognition model in actual scenes and meeting the requirement of accurate gait recognition.

Description

Gait recognition method, electronic device, and computer-readable storage medium
Technical Field
The present application relates to the field of image processing, and in particular, to a gait recognition method, an electronic device, and a computer-readable storage medium.
Background
With the continuous development of computer vision technology, gait recognition is becoming an important topic in the field of biometric recognition. Gait recognition identifies a pedestrian's identity from the body shape and posture of the human body while walking. It is unique to the individual, difficult to disguise, works at long range and requires no cooperation from the subject, so it can play an important role in difficult situations such as occluded faces, long-distance recognition and changes of clothing.
A gait contour map is a grayscale binary image obtained by semantically segmenting an original image of a walking pedestrian and then binarizing the humanoid region and the background region of the segmented image; a gait contour map sequence is formed by arranging multiple frames of gait contour maps in chronological order. In the prior art, a gait contour map sequence corresponding to consecutive multi-frame images or equally spaced sampled images is generally used to train a preset depth model to obtain a gait recognition model, and the gait recognition model is used to recognize and match a pedestrian's gait contour map sequence.
However, a large amount of similar and/or repeated information exists between consecutive frames. When a model trained on consecutive multi-frame images is embedded on a hardware device, it burdens the hardware's computing power and inference speed and cannot capture gait information over a larger time range, so recognition efficiency is reduced. Equally spaced sampled images, on the other hand, cannot cover scenes in which the walking speed changes, so a model trained on them cannot fit such scenes well and recognition accuracy is reduced. As a result, neither model achieves a good gait recognition effect in an actual scene, and the requirement of accurate gait recognition is difficult to meet.
Disclosure of Invention
The technical problem mainly solved by the present application is to provide a gait recognition method, an electronic device and a computer-readable storage medium, which can solve the problem that the prior art cannot meet the gait recognition requirements of actual scenes.
In order to solve the above technical problem, a first technical solution adopted by the present application is to provide a gait recognition method comprising: acquiring a gait sequence of at least one target object; inputting the gait sequence into a gait recognition model, and extracting gait features of the target object from the gait sequence with the gait recognition model, wherein the gait recognition model is obtained by contrastive training on equally spaced sampling gait sequences and random interval sampling gait sequences, and the equally spaced sampling gait sequences include consecutive-frame gait sequences; and matching the gait features against stored user gait features with the gait recognition model, and outputting a recognition result.
In order to solve the above technical problem, a second technical solution adopted by the present application is to provide an electronic device comprising: a memory for storing program data; and a processor for executing the program data stored in the memory to implement the steps of the gait recognition method described above.
In order to solve the above technical problem, a third technical solution adopted by the present application is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the gait recognition method described above.
The beneficial effects of the present application are as follows. In contrast to the prior art, the present application provides a gait recognition method, an electronic device and a computer-readable storage medium in which a gait recognition model extracts and recognizes gait features from an acquired gait sequence of a target object. Because the model is obtained by contrastive training on equally spaced sampling gait sequences (including consecutive-frame gait sequences) and random interval sampling gait sequences, it can combine the computational advantages of multiple gait sampling schemes, overcoming both the reduced recognition efficiency caused by training only on consecutive-frame gait sequences and the reduced recognition accuracy caused by training only on equally spaced sampling sequences. This improves the gait recognition effect of the gait recognition model in actual scenes and meets the requirement of accurate gait recognition.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of an embodiment of a training method of the gait recognition model of the present application;
FIG. 2 is a flow chart of an embodiment of S11;
FIG. 3 is a schematic flow chart of an embodiment of S13;
FIG. 4 is a flow chart of an embodiment of a gait recognition method of the present application;
FIG. 5 is a flow chart of an embodiment of S41;
FIG. 6 is a schematic diagram of an embodiment of a gait recognition device of the present application;
FIG. 7 is a schematic diagram of an embodiment of an electronic device of the present application;
FIG. 8 is a schematic diagram of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the present disclosure without inventive effort fall within the scope of protection of the present application.
The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in this application and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; "plurality" generally means at least two, without excluding the case of at least one.
It should be understood that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B both exist, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
It should also be understood that the terms "comprises", "comprising" and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or apparatus comprising a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element defined by the phrase "comprising a(n) …" does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises the element.
The application firstly provides a training method of a gait recognition model.
Referring to FIG. 1, FIG. 1 is a flow chart illustrating an embodiment of a training method of a gait recognition model of the present application. In this embodiment, the gait recognition model is obtained by contrastive training on equally spaced sampling gait sequences and random interval sampling gait sequences, and the training method specifically includes:
s11: acquiring a plurality of first training sets and a plurality of second training sets based on the plurality of marked image sequences; wherein each first training set comprises a plurality of equally spaced sampling gait sequences having the same sampling interval and each second training set comprises a plurality of randomly spaced sampling gait sequences having the same random frame skip interval; the sampling intervals corresponding to the first training sets are different, and the random frame skip intervals corresponding to the second training sets are different.
Specifically, referring to fig. 2, fig. 2 is a flowchart of an embodiment of S11. In this embodiment, the step of acquiring the plurality of first training sets and the plurality of second training sets based on the plurality of labeled image sequences specifically includes:
s111: and acquiring image sequences corresponding to a plurality of pedestrians and labeling information of each image sequence from the selected video image.
In this embodiment, video of pedestrians walking in a natural state is first captured by a monitoring device, and the pedestrians in the video are then preprocessed by detection and tracking. After an image sequence is obtained for each pedestrian, the sequence is filtered to keep clear, complete and unoccluded walking images, and each frame in the sequence is aligned so that the pedestrian's body contour is centered in every frame.
Specifically, a video frame generally contains multiple pedestrians, and multiple pedestrians are generally detected when the video is processed. When at least one pedestrian is detected and taken as a target object, that target object is tracked through the subsequent frames of the video, so that an image sequence of the target object over a continuous tracking period is acquired.
S112: sampling each image sequence at equal intervals with a plurality of preset sampling intervals, so as to obtain, from each image sequence, a plurality of equally spaced sampling image sequences with different sampling intervals, wherein the sampling intervals include 0.
In this embodiment, different values are drawn multiple times from a preset integer set to obtain multiple sampling intervals with different values; the maximum number of sampling intervals equals the number of integers in the preset integer set.
In this embodiment, the preset integer set is R ∈ [0, 4].
Specifically, at most 5 different values can be drawn from the preset integer set, giving sampling intervals of 0, 1, 2, 3 and 4, i.e. 0, 1, 2, 3 or 4 frames skipped between samples.
It will be appreciated that when the value is 0, the sampling interval is 0, i.e. no frames are skipped, and the acquired sequence is a consecutive-frame sequence.
It can also be appreciated that, within the value range of the preset integer set, a larger value means more frames are skipped, image information is gathered over a larger time range, and less redundant information is retained. However, if the maximum value exceeded 4, the acquired sequence would contain too little information and the model trained on it would have poor robustness.
In this embodiment, after each image sequence is sampled at equal intervals with each sampling interval, the sampled images are arranged in chronological order to obtain the corresponding equally spaced sampling image sequence.
Because the number of skipped frames in equally spaced sampling is fixed, equally spaced sampling is also called static frame rate sampling, and an equally spaced sampling image sequence is also called a static (stable) frame rate sequence.
In this embodiment, each image sequence is sampled at equal intervals with k sampling intervals, so that k equally spaced sampling image sequences with different sampling intervals can be obtained from each image sequence. The value of k can be any integer from 1 to 5, i.e. 1 to 5 equally spaced sampling image sequences are constructed; the application is not limited in this respect.
In this embodiment, the static frame rate sequence set formed by the multiple equally spaced sampling image sequences corresponding to each image sequence may be expressed as S = {s_1, …, s_k}, where k ≥ 1, k denotes the number of sampling intervals taken from the preset integer set, s_1 denotes the 1st equally spaced sampling image sequence in the static frame rate sequence set, and s_k denotes the k-th.
Each equally spaced sampling image sequence may be expressed as s_i = {p_1, p_{1+(1+r_i)}, p_{1+2(1+r_i)}, …, p_j, …}, where s_i denotes the i-th equally spaced sampling image sequence in the static frame rate sequence set, r_i denotes the sampling interval, n denotes the total number of frames in the sequence, p_1 denotes the 1st frame of the image sequence, p_{1+(1+r_i)} denotes the [1+(1+r_i)]-th frame, p_{1+2(1+r_i)} denotes the [1+2(1+r_i)]-th frame, and p_j denotes the j-th frame of the image sequence.
In one specific implementation scenario, the sampling interval r_i is 0, and s_i = {p_1, p_2, p_3, …, p_j, …, p_n}, i.e. the sequence is a consecutive-frame sequence.
In another specific implementation scenario, the sampling interval r_i is 1, and s_i = {p_1, p_3, p_5, …, p_j, …, p_n}, i.e. 1 frame is skipped between two adjacent frames of the sequence.
In yet another specific implementation scenario, the sampling interval r_i is 3, and s_i = {p_1, p_5, p_9, …, p_j, …, p_n}, i.e. 3 frames are skipped between two adjacent frames of the sequence.
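The equally spaced sampling rule above (step = 1 + r_i, so interval 0 yields consecutive frames) can be sketched as follows; the function name and frame labels are illustrative, not from the patent.

```python
def sample_equal_interval(frames, r):
    # Static frame rate sampling: keep every (1 + r)-th frame, so an
    # interval r of 0 yields the original consecutive-frame sequence.
    return frames[::1 + r]

# Hypothetical 10-frame sequence p1..p10
frames = ["p%d" % j for j in range(1, 11)]
print(sample_equal_interval(frames, 0))  # consecutive frames p1..p10
print(sample_equal_interval(frames, 1))  # p1, p3, p5, p7, p9
print(sample_equal_interval(frames, 3))  # p1, p5, p9
```

With r = 1 the sampled indices are 1, 3, 5, …, and with r = 3 they are 1, 5, 9, …, matching the two implementation scenarios above.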
S113: sampling each image sequence at random intervals with a plurality of preset random frame-skip interval sets, so as to obtain, from each image sequence, a plurality of random interval sampling image sequences with different random frame-skip intervals.
In this embodiment, because the number of skipped frames in random interval sampling is chosen randomly, i.e. the number of frames skipped between two adjacent sampled frames changes dynamically, random interval sampling is also called dynamic frame rate sampling, and a random interval sampling image sequence is also called a dynamic frame rate sequence.
In this embodiment, when the random interval sampling image sequence is required to have a preset number of image frames, that preset number of random values are drawn from the preset integer set to obtain a random frame-skip interval set containing that many random values.
Here, the preset integer set is R ∈ [0, 4].
Specifically, if the model requires an input sequence of H frames, then for each dynamic frame rate sequence H random values are drawn on R (the 1st frame is also subject to frame skipping) to obtain H sampling intervals; the image sequence is sampled according to these H intervals, and the sampled frames are arranged in chronological order to obtain a dynamic frame rate sequence of H frames.
It can be appreciated that drawing random values from the integers defined by the preset integer set keeps each dynamic sampling interval within a certain range, i.e. each sampling interval can only vary randomly within the preset range. This avoids the loss of temporal information that an overly large sampling interval would cause, and thereby avoids errors.
Further, the random value-drawing step is repeated to obtain multiple random frame-skip interval sets, and each image sequence is sampled at random intervals with each of the obtained sets, so as to obtain, from each image sequence, multiple random interval sampling image sequences with different random frame-skip intervals.
It can be appreciated that sampling each image sequence with preset random frame-skip interval sets ensures that the random interval sampling image sequences obtained by sampling different image sequences with the same random frame-skip interval set share the same random frame-skip intervals.
In this embodiment, each image sequence is sampled at random intervals with m random frame-skip interval sets, so that m random interval sampling image sequences with different random frame-skip intervals can be obtained from each image sequence. The value of m can be any integer greater than or equal to 1, i.e. at least one random interval sampling image sequence is constructed; the application is not limited in this respect.
In this embodiment, the dynamic frame rate sequence set formed by the multiple random interval sampling image sequences corresponding to each image sequence may be expressed as D = {d_1, …, d_m}, where m ≥ 1, m denotes the number of constructed random frame-skip interval sets, d_1 denotes the 1st random interval sampling image sequence in the dynamic frame rate sequence set, and d_m denotes the m-th.
In this embodiment, a random frame-skip interval set may be expressed as T = {i1, i2, …, iH}, where i1 denotes the first frame interval, i2 the second, and iH the H-th (i.e. the last) frame interval. The value of i1 can only be 0, so that the 1st frame of the image sequence is always taken.
In this embodiment, a random interval sampling image sequence obtained with one random frame-skip interval set may be expressed as d_i = {d_{1+i1}, d_{1+i1+i2}, …, d_{1+i1+i2+…+iH}}, where d_i denotes the i-th random interval sampling image sequence in the dynamic frame rate sequence set, d_{1+i1} denotes the (1+i1)-th frame of the image sequence (i.e. the 1st frame, since i1 = 0), d_{1+i1+i2} denotes the (1+i1+i2)-th frame, and d_{1+i1+i2+…+iH} denotes the (1+i1+i2+…+iH)-th frame.
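A minimal sketch of the dynamic frame rate sampling described above. It assumes, mirroring the equal-interval convention, that a drawn interval of 0 means the next consecutive frame is taken, so each step advances by 1 + interval; the first sampled frame is always frame 1 (i1 = 0). Function and parameter names are illustrative.

```python
import random

def sample_random_interval(frames, H, r_max=4, seed=None):
    # Dynamic frame rate sampling: the first sampled frame is frame 1
    # (i1 = 0); each subsequent frame advances by 1 + a random interval
    # drawn from {0, ..., r_max}, so every skip stays within a preset range.
    rng = random.Random(seed)
    positions = [1]  # 1-indexed positions, as in the notation above
    for _ in range(H - 1):
        positions.append(positions[-1] + 1 + rng.randint(0, r_max))
    return [frames[p - 1] for p in positions]  # convert to 0-indexed access

# Hypothetical 25-frame sequence sampled down to H = 5 frames
frames = list(range(1, 26))
dynamic_sequence = sample_random_interval(frames, H=5, seed=0)
```

The caller must supply enough frames for the worst case, 1 + (H - 1) * (1 + r_max) positions; a production version would clamp or resample when the source sequence is shorter.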
S114: a corresponding equally spaced sampling gait sequence is derived based on each equally spaced sampling image sequence and a corresponding randomly spaced sampling gait sequence is derived based on each randomly spaced sampling image sequence.
In this embodiment, after the k equally spaced sampling image sequences with different sampling intervals and the m random interval sampling image sequences with different random frame-skip intervals have been obtained for each image sequence, the frames of each sequence are semantically segmented in chronological order by a semantic segmentation algorithm. From each segmented image, the humanoid region mask and the background region corresponding to the pedestrian are obtained and binarized into a grayscale binary image, and the multi-frame grayscale binary images form the pedestrian's gait contour sequence, thereby yielding a plurality of equally spaced sampling gait sequences and a plurality of random interval sampling gait sequences.
Binarization sets the gray value of every pixel in an image to either 0 or 255, so that the whole image shows only black and white. Binarizing an image simplifies it and reduces the data volume while highlighting the contour of the object of interest. In this embodiment, binarizing the humanoid region mask makes the human body contour stand out more clearly.
Specifically, the gray value of the humanoid region mask is set to 255, and the gray value of the background region is set to 0.
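The binarization step can be sketched as follows; the mask encoding (nonzero = humanoid region) is an assumption for illustration, not specified by the patent.

```python
def binarize_mask(mask):
    # Set humanoid-region pixels (nonzero in the segmentation mask) to 255
    # and background pixels to 0, yielding a black-and-white gait silhouette.
    return [[255 if px > 0 else 0 for px in row] for row in mask]

# Toy 2x4 segmentation mask: nonzero values mark the humanoid region
silhouette = binarize_mask([[0, 3, 3, 0],
                            [0, 0, 7, 0]])
```

Applying this per frame, in chronological order, produces the gait contour grayscale binary image sequence described above.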
S115: extracting, from the multiple equally spaced sampling gait sequences corresponding to each image sequence, the sequences sharing the same sampling interval, and placing those sequences in the same set to obtain a plurality of first training sets, the number of which equals the number of preset sampling intervals.
In a specific implementation scenario, suppose 3 static frame rate sequence sets S are extracted from the image sequences corresponding to 3 pedestrians, each set S containing 5 equally spaced sampling image sequences with different sampling intervals (0, 1, 2, 3 and 4 respectively), each of H frames, i.e. S = {s_1, s_2, s_3, s_4, s_5}, where s_1 denotes the sequence with interval 0 (the consecutive-frame image sequence), s_2 the sequence with interval 1, s_3 the sequence with interval 2, s_4 the sequence with interval 3, and s_5 the sequence with interval 4.
Based on the 5 equally spaced sampling image sequences in each static frame rate sequence set S, the corresponding 5 equally spaced sampling gait sequences are obtained; the equally spaced sampling gait sequence set may be expressed as S_B = {s_B1, s_B2, s_B3, s_B4, s_B5}, where s_B1 denotes the gait sequence with interval 0 (the consecutive-frame gait sequence), s_B2 the gait sequence with interval 1, s_B3 the gait sequence with interval 2, s_B4 the gait sequence with interval 3, and s_B5 the gait sequence with interval 4.
Further, the equally spaced sampling gait sequence s_B1 with interval 0 is extracted from each of the 3 sets S_B, and the 3 sequences s_B1 are placed in the same set to obtain 1 first training set. This step is repeated until the 3 sequences s_B2, the 3 s_B3, the 3 s_B4 and the 3 s_B5 have each been placed in their own set, giving 5 first training sets in total.
It will be appreciated that the 5 first training sets obtained in this way each contain the same number of equally spaced sampling gait sequences (3 each), each first training set contains exactly one sequence derived from the image sequence of each pedestrian, and the equally spaced sampling gait sequences in each first training set all have the same length (H frames).
It will be appreciated that the number of first training sets is equal to the number of sampling intervals taken from the preset integer set.
S116: extracting, from the multiple random interval sampling gait sequences corresponding to each image sequence, the sequences sharing the same random frame-skip interval set, and placing those sequences in the same set to obtain a plurality of second training sets, the number of which equals the number of preset random frame-skip interval sets.
In the specific implementation scenario, in response to extracting 3 dynamic frame rate sequence sets D from the image sequences corresponding to 3 pedestrians, each dynamic frame rate sequence set D includes 4 random interval sampling image sequences with different random frame skip intervals (sampling is performed by using 4 random frame skip interval sets T1, T2, T3 and T4 respectively), and the number of frames of each random interval sampling image sequence is H,i.e. d= { D 1 ,d 2 ,d 3 ,d 4 And d is as follows 1 Representing a sequence of randomly spaced sampled images obtained by sampling with T1, d 2 Representing a sequence of randomly spaced sampled images obtained by sampling with T2, d 3 Representing a sequence of randomly spaced sampled images obtained by sampling with T3, d 4 Representing a sequence of randomly spaced sampled images sampled with T4.
Corresponding 4 random interval sampling gait sequences are obtained based on the 4 random interval sampling image sequences in each dynamic frame rate sequence set D, and the resulting random interval sampling gait sequence set can be expressed as D_B = {d_B1, d_B2, d_B3, d_B4}, where d_B1 represents the random interval sampling gait sequence obtained by sampling with T1, d_B2 the sequence obtained by sampling with T2, d_B3 the sequence obtained by sampling with T3, and d_B4 the sequence obtained by sampling with T4.
Further, the random interval sampling gait sequence d_B1 obtained by sampling with T1 is extracted from each of the 3 random interval sampling gait sequence sets D_B, and the 3 extracted d_B1 are divided into the same set to obtain 1 second training set. The above steps are repeated until the 3 d_B2, the 3 d_B3 and the 3 d_B4 have each been divided into their own set, obtaining a total of 4 second training sets.
It will be appreciated that the number of random-interval-sampling gait sequences included in the 4 second training sets obtained in the above manner is identical (3 each), and the number of random-interval-sampling gait sequences obtained based on the image sequence corresponding to each pedestrian in each second training set is also identical (1 each), and the lengths of the random-interval-sampling gait sequences in each second training set are also identical (all H frames).
It is understood that the number of second training sets is equal to the number of preset random frame skip interval sets.
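A corresponding sketch of the random interval ("dynamic frame rate") sampling is given below. How a random frame skip interval set T is used is an assumption here (the skip between consecutive picked frames is drawn at random from the set), intended only to illustrate the idea:

```python
import random

# Hypothetical sketch of random interval sampling: the frame skip between
# consecutive picked frames is drawn from a preset random frame skip interval
# set T, so the sampled sequence has a varying effective frame rate.

def random_interval_sample(frames, skip_set, length, rng):
    """Pick `length` frames, with a random skip (from `skip_set`) between picks."""
    out, idx = [], 0
    while len(out) < length and idx < len(frames):
        out.append(frames[idx])
        idx += rng.choice(skip_set) + 1
    return out

rng = random.Random(0)  # fixed seed so the sketch is reproducible
T1, T2 = [0, 1], [1, 2, 3]
frames = list(range(40))
d1 = random_interval_sample(frames, T1, 6, rng)
d2 = random_interval_sample(frames, T2, 6, rng)
print(len(d1), len(d2))  # both are H = 6 frames long
```

Both sampled sequences have the same length H, matching the requirement that the sequences in each second training set are the same length.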
S12: performing gait recognition training on the main model by using one of the first training sets, and performing gait recognition training on the plurality of auxiliary models by using the remaining plurality of first training sets and the plurality of second training sets; the input sequences of the main model and the auxiliary models are gait sequences acquired based on the same marked image sequence.
In this embodiment, a plurality of deep learning models with the same network structure are obtained first, and different initialization functions are set for each deep learning model to generate different initialization parameters based on the different initialization functions, so as to obtain a plurality of recognition models with different initialization parameters. Further, one of the recognition models is used as a main model, and the rest of the plurality of recognition models are used as a plurality of auxiliary models.
For example, in response to acquiring 5 first training sets and 4 second training sets based on the image sequences corresponding to the 3 pedestrians, 9 deep learning models with the same network structure are acquired, and a different initialization function is set for each deep learning model to generate different initialization parameters, so as to obtain 9 recognition models (N_1~N_9). Further, one of the recognition models, N_1, is taken as the main model, and the remaining 8 recognition models (N_2~N_9) are taken as auxiliary models.
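The idea of identical structure with different initialization parameters can be sketched as follows (a toy stand-in where a "model" is just a parameter vector and the seeded random generator plays the role of the initialization function; real models would be deep networks):

```python
import random

# Hypothetical sketch: create recognition models with the same "structure"
# (a parameter vector of fixed size) but different initialization parameters,
# by giving each model its own seeded initialization function.

def make_models(n_models=9, n_params=4):
    models = []
    for seed in range(n_models):
        rng = random.Random(seed)  # a different initialization function per model
        models.append([rng.uniform(-0.1, 0.1) for _ in range(n_params)])
    return models

models = make_models()
main_model, auxiliary_models = models[0], models[1:]
print(len(auxiliary_models))              # 8 auxiliary models, N_2..N_9
print(main_model != auxiliary_models[0])  # different initialization parameters
```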
In this embodiment, one of the equally spaced sampling gait sequences in one of the first training sets is input into the main model to obtain a first gait feature. Meanwhile, the remaining first training sets and the second training sets are input into the plurality of auxiliary models to obtain a plurality of second gait features, where the equally spaced sampling gait sequences and the random interval sampling gait sequences are all acquired based on the same labeled image sequence.
Specifically, the first gait feature is taken as main feature information, and the plurality of second gait features are taken as auxiliary feature information.
In this embodiment, the input sequences fed to the main model and the plurality of auxiliary models in each round are gait sequences acquired based on the same labeled image sequence; that is, the training data input to the main model and the auxiliary models in each round are gait sequences obtained by sampling the same scene and the same subject in the same labeled image sequence at different frame rates.
Taking the above specific implementation scenario as an example, in the first round the equally spaced sampling gait sequence s_B1 with an interval frame number of 0 in one of the first training sets is input to the main model N_1. Meanwhile, the equally spaced sampling gait sequences obtained based on the same labeled image sequence in the remaining first training sets, namely s_B2 with an interval frame number of 1, s_B3 with an interval frame number of 2, s_B4 with an interval frame number of 3, and s_B5 with an interval frame number of 4, are respectively input to the auxiliary models N_2~N_5; and the random interval sampling gait sequences obtained based on the same labeled image sequence in the plurality of second training sets, namely d_B1 sampled with T1, d_B2 sampled with T2, d_B3 sampled with T3, and d_B4 sampled with T4, are input to the auxiliary models N_6~N_9 for training.
S13: reverse training the model parameters of the main model by using gait loss functions generated during each iterative training of the main model; and updating the model parameters of each auxiliary model by using the model parameters of the main model after each update and the similarity loss function between the main model and each auxiliary model.
Specifically, referring to fig. 3, fig. 3 is a flowchart of an embodiment of S13. In the present embodiment, model parameters of the main model are reversely trained using gait loss functions generated during each iterative training of the main model; and updating the model parameters of each auxiliary model by using the model parameters of the main model after each update and the similarity loss function between the main model and each auxiliary model, wherein the method specifically comprises the following steps:
s131: and calculating to obtain a gait loss function between the first gait characteristic and the labeling information.
In this embodiment, the gait loss function between the first gait feature and the corresponding labeling information is calculated using a triplet loss function (triplet loss) and a cross entropy loss function (Cross-entropy loss function, CE loss).
The triplet loss function is calculated mainly using the metric characteristics of the main feature information, and the cross entropy loss function is calculated mainly using the classification characteristics of the main feature information.
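A numeric sketch of these two losses on toy feature vectors follows (hypothetical names; real implementations operate on batched embeddings and identity labels from the annotation):

```python
import math

# Hypothetical sketch of the triplet loss (metric) and cross entropy loss
# (classification) named above, evaluated on toy vectors.

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull same-identity features together, push different ones apart."""
    return max(l2(anchor, positive) - l2(anchor, negative) + margin, 0.0)

def cross_entropy_loss(logits, label):
    """Classification loss on the identity label, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

anchor, pos, neg = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
gait_loss = triplet_loss(anchor, pos, neg) + cross_entropy_loss([2.0, 0.5, 0.1], 0)
print(round(gait_loss, 4))
```

Here the positive pair is already much closer than the negative pair, so the triplet term is zero and the total loss is just the cross entropy term.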
S132: and updating the main model parameters of the main model based on the gait loss function by using a back propagation algorithm to obtain updated main model parameters.
In this embodiment, the main model parameters of the main model are updated by optimizing the sum of all losses using the back propagation algorithm.
S133: a similarity loss function between the first gait feature and each of the second gait features is calculated.
In this embodiment, a cosine similarity (Cosine Similarity) is used to calculate a similarity loss function between the first gait feature and each of the second gait features to obtain a set of similarity loss functions comprising a plurality of similarity loss functions.
Specifically, the set of similarity loss functions is expressed as L = {L_2, L_3, …, L_{k+m}}, where k represents the number of first training sets and m represents the number of second training sets; L_2 represents the similarity loss function between the first gait feature and the 1st second gait feature, L_3 represents the similarity loss function between the first gait feature and the 2nd second gait feature, and L_{k+m} represents the similarity loss function between the first gait feature and the (k+m-1)th (i.e. last) second gait feature.
In other embodiments, the similarity loss function between the first gait feature and each second gait feature may also be calculated using any one of the Pearson correlation coefficient (Pearson Correlation Coefficient), KL divergence (Kullback-Leibler Divergence), Jaccard similarity coefficient (Jaccard Coefficient), Tanimoto coefficient (generalized Jaccard similarity coefficient), and mutual information (Mutual Information), which is not limited in this application.
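The cosine-similarity-based loss can be sketched numerically as follows; the exact loss form (here 1 - cosine similarity) is an assumption for illustration, as the embodiment only names the similarity measure:

```python
import math

# Hypothetical sketch: similarity losses between the first gait feature
# (main model output) and each second gait feature (auxiliary model outputs).

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def similarity_losses(first_feature, second_features):
    """One loss per auxiliary model; identical features give a loss near 0."""
    return [1.0 - cosine_similarity(first_feature, f) for f in second_features]

first = [1.0, 2.0, 3.0]
seconds = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [-1.0, -2.0, -3.0]]
L = similarity_losses(first, seconds)
print([round(v, 3) for v in L])  # [0.0, 0.286, 2.0]
```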
S134: and respectively updating the auxiliary model parameters corresponding to each auxiliary model by using the updated main model parameters and each similarity loss function to obtain a plurality of updated auxiliary model parameters.
In this embodiment, after the main model parameters are updated, the auxiliary model parameters corresponding to the respective auxiliary models need to be updated.
Taking the above specific implementation scenario as an example, let the main model parameter of the main model N_1 be ω_1 and the auxiliary model parameter of the auxiliary model N_2 be ω_2; the formula for updating the auxiliary model parameters with the main model parameters is as follows:
ω_2* = (α×ω_1 + β×ω_2)/(α+β)
α = γ×softmax(L_2)
β = (1-γ)×(1-softmax(L_2))
softmax(L_2) = exp(L_2)/Σ_i exp(L_i)
where ω_2* is the updated auxiliary model parameter of the auxiliary model N_2, α is the weight coefficient of the main model N_1, β is the weight coefficient of the auxiliary model N_2, γ is used to control the proportion of α to β and takes a value in (0.8, 1), L_2 represents the similarity loss function between the first gait feature output by the main model N_1 and the second gait feature output by the auxiliary model N_2, L_i represents the similarity loss function between the first gait feature output by the main model N_1 and the second gait feature output by the auxiliary model N_i, and softmax(L_2) is the normalization function for the auxiliary model N_2.
The remaining auxiliary models N_3~N_9 are updated in the same manner as the auxiliary model N_2.
In this embodiment, the smaller the difference between the auxiliary model parameters of a given auxiliary model and the main model parameters of the main model, the more similar the second gait feature output by that auxiliary model is to the first gait feature output by the main model, the smaller the similarity loss function between the two, the smaller the α (and the larger the β) generated based on that similarity loss function, and therefore the smaller the update amplitude of the auxiliary model. Conversely, the larger the difference between the auxiliary model parameters of an auxiliary model and the main model parameters of the main model, the larger the update amplitude of that auxiliary model.
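A numeric sketch of the auxiliary-parameter update formula above, for scalar "parameters" (real parameters are tensors, updated elementwise; the softmax normalization over the similarity-loss set is an assumption consistent with the description):

```python
import math

# Hypothetical sketch of the update ω_2* = (α×ω_1 + β×ω_2)/(α+β) with
# α = γ×softmax(L_j) and β = (1-γ)×(1-softmax(L_j)).

def softmax_of(losses, j):
    """Normalized weight of the j-th similarity loss among all of them."""
    m = max(losses)
    exps = [math.exp(v - m) for v in losses]
    return exps[j] / sum(exps)

def update_auxiliary(w_main, w_aux, losses, j, gamma=0.9):
    s = softmax_of(losses, j)
    alpha = gamma * s
    beta = (1 - gamma) * (1 - s)
    return (alpha * w_main + beta * w_aux) / (alpha + beta)

w1, w2 = 1.0, 0.0
losses = [0.1, 0.5, 0.9]  # similarity losses for three auxiliary models
small = update_auxiliary(w1, w2, losses, 0)  # small loss -> small step toward w1
large = update_auxiliary(w1, w2, losses, 2)  # large loss -> large step toward w1
print(small < large)  # True: a larger similarity loss gives a larger update amplitude
```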
In the existing contrast learning field, a fixed weight coefficient is generally adopted to update network parameters corresponding to an auxiliary model, but the mode cannot be well adapted to the change in the training process.
Compared with the prior art, the above formula dynamically adjusts the network parameters corresponding to each auxiliary model, so that the update adapts well to changes during the training process, and the formula updates the network weight parameters well no matter how many auxiliary models are used.
S14: repeating the steps of inputting and updating reversely until the set iteration times are reached, and taking the trained main model as a gait recognition model.
In this embodiment, the equally spaced sampling gait sequences in the same first training set are used in turn as the input of the main model, the sequences in the remaining first training sets and the second training sets are used in turn as the inputs of the auxiliary models, and the iteration continues until the set number of iterations is reached and the training results of the main model and the auxiliary models meet the preset convergence condition, thereby obtaining a trained main model, which is taken as the gait recognition model.
As can be appreciated, in this embodiment, gait sequences corresponding to the same image sequence at different frame rates are used as input sequences of the main model and the auxiliary model, a similarity loss function between auxiliary feature information output by each auxiliary model and main feature information output by the main model is obtained, a plurality of similarity loss functions and gait loss functions of the main model are used as a total loss function to update main model parameters of the main model, and the updated main model parameters are used to update auxiliary model parameters of the auxiliary model, so that stability of the main model can be improved by using a comparison learning mode.
In addition, the training process of the embodiment is end-to-end, and the plurality of auxiliary models in the training process do not participate in actual gait recognition, only the trained main model is used as the gait recognition model to be deployed in hardware, the requirement on the hardware is low, and the method is applicable to more monitoring devices or playing devices.
Compared with the prior art, the method is different from the prior art in that the multiple recognition models are trained by adopting the gait sequences combined by the multiple frame rates, and the continuous frame sequences can be used for training when the selected gait sequences are short, so that the information of the original gait sequences is well reserved, and the recognition accuracy of the models is enhanced. Further, under the condition of limited hardware resources, compared with a mode of training by using a continuous frame sequence only, the method of training by using a static frame rate sequence with intervals can acquire gait information in a larger time range, thereby improving the recognition efficiency of a model. Further, when the walking speed of the pedestrian is high or low, the dynamic frame skipping sequence for frame skipping in the preset range is utilized for training, so that the scene with the changed speed can be fitted better, and the robustness of the model is enhanced.
It can be understood that the method and the device can combine the computing advantages of various gait recognition algorithms, not only solve the problem of reduced algorithm recognition efficiency caused by training only by means of continuous frame gait sequences, but also solve the problem of reduced algorithm recognition accuracy caused by training only by means of equidistant sampling sequences, so that the method and the device can be better suitable for practical scenes.
Referring to fig. 4, fig. 4 is a flow chart illustrating an embodiment of a gait recognition method according to the present application. In this embodiment, the gait recognition method is implemented by the gait recognition model described above, and includes:
s41: a gait sequence of at least one target object is acquired.
Specifically, referring to fig. 5, fig. 5 is a flowchart of an embodiment of S41. In this embodiment, the step of acquiring the gait sequence of the at least one target object specifically includes:
s411: video images acquired based on the monitored area are acquired.
In this embodiment, the monitoring area may be an indoor area or an outdoor area, for example, the indoor area may be a partial area inside an office place near a doorway, and the outdoor area may be a community doorway or a school doorway.
The monitoring device may be a network camera (IPC) or a remaining video monitoring camera.
S412: and detecting the video image to obtain an image sequence of at least one target object.
In this embodiment, first, a continuous multi-frame image is acquired based on a video image, and the continuous multi-frame image is detected, so as to obtain a human body detection frame of at least one target object.
Specifically, after at least one target object is detected from a video image by using a pedestrian detection algorithm, a human body detection frame is added to the target object in the continuous multi-frame images by using a human body segmentation model.
In a specific implementation scenario, the pedestrian detection algorithm may employ a target tracking algorithm based on motion detection, that is, a background modeling algorithm is used to extract a moving foreground target when the camera is stationary, and then a classifier is used to classify the moving target and determine whether the moving target includes a pedestrian, for example, a gaussian mixture model algorithm, a frame difference algorithm, or a sample consistency modeling algorithm.
In another specific implementation scenario, a pedestrian detection algorithm based on machine learning may be used to train a classifier and distinguish pedestrians from the background using appearance features of the human body itself (such as color, edge and texture features), for example an algorithm based on HOG (Histogram of Oriented Gradient) + SVM (support vector machine), an algorithm based on HOG + AdaBoost (Adaptive Boosting), or an algorithm based on DPM (Deformable Parts Model) + Latent SVM.
In yet another specific implementation scenario, a pedestrian detection algorithm based on deep learning, that is, one that learns human body features with a deep network to train a classifier and distinguish pedestrians from the background, may be adopted and has strong robustness; specific examples include algorithms based on Cascade CNN, algorithms based on JointDeep, and the like, which is not limited in this application.
Further, real-time tracking is performed on the human body detection frame corresponding to each target object, so that tracking Identity (ID) information is respectively established for each target object, and an image sequence of each target object in the continuous multi-frame images is determined based on the tracking identity information.
Specifically, when a plurality of target objects exist in the video image, the corresponding target objects are tracked in continuous frames based on different tracking identity information, so that an image sequence of each target object in a continuous tracking process is obtained.
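The grouping of per-frame detections into one image sequence per target object can be sketched as follows (hypothetical data layout; a real tracker would also handle lost and re-acquired targets):

```python
# Hypothetical sketch: group per-frame human body detection boxes by tracking
# identity (ID) information to obtain one image sequence per target object.

def sequences_by_track_id(detections):
    """detections: list of (frame_index, track_id, box) tuples, in frame order."""
    sequences = {}
    for frame_index, track_id, box in detections:
        sequences.setdefault(track_id, []).append((frame_index, box))
    return sequences

detections = [
    (0, "A", (10, 10, 50, 120)), (0, "B", (200, 15, 40, 110)),
    (1, "A", (12, 10, 50, 120)), (1, "B", (198, 15, 40, 110)),
    (2, "A", (14, 11, 50, 120)),
]
seqs = sequences_by_track_id(detections)
print(sorted(seqs))    # ['A', 'B'] - one sequence per tracked object
print(len(seqs["A"]))  # 3 tracked frames for object A
```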
S413: and obtaining a humanoid region of the target object based on the image sequence, and obtaining a gait sequence of the target object by utilizing the humanoid region.
In this embodiment, a human body detection frame in an image sequence is segmented by using a human body segmentation model to obtain a background region and a human body region, then a human body region mask of a target object is obtained by using the background region and the human body region, and binarization processing is performed on the human body region mask to obtain a gait sequence of the target object.
The matting corresponding to the human body detection frame comprises a larger background area, and the background area can influence the detection of the human body area, so that the mask is required to be used for stripping the human body area and the background area to obtain an effective mask of the human body area.
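The final binarization step can be sketched as follows (a toy mask of foreground probabilities; the 0.5 threshold is an illustrative assumption):

```python
# Hypothetical sketch: binarize a human-region mask (per-pixel foreground
# probability) into one binary silhouette frame of the gait sequence.

def binarize_mask(mask, threshold=0.5):
    return [[1 if p >= threshold else 0 for p in row] for row in mask]

mask = [
    [0.10, 0.80, 0.90, 0.20],
    [0.00, 0.70, 0.95, 0.10],
    [0.05, 0.60, 0.85, 0.00],
]
silhouette = binarize_mask(mask)
print(silhouette[0])  # [0, 1, 1, 0] - background stripped, human region kept
```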
S42: inputting the gait sequence into a gait recognition model, and extracting gait characteristics of a target object from the gait sequence by using the gait recognition model; the gait recognition model is obtained by comparing and training the equidistant sampling gait sequences and the random interval sampling gait sequences; wherein the equally spaced sampling gait sequence comprises a continuous frame gait sequence.
In the embodiment, the gait recognition model is a main model obtained by training in a contrast learning mode, has good recognition accuracy, recognition efficiency and robustness, has low requirements on hardware, and can be directly deployed at the front end or the rear end of the monitoring device.
In this embodiment, the gait features include static features and dynamic features, the static features refer to physiological features such as height, leg bones, joints and muscles of the target object obtained based on the human body detection frame, and the dynamic features refer to moving features such as arm swing, head swing, body swing and walking frequency of the target object, and reflected walking habits of the target object in the falling, rising and supporting swing phases.
It can be understood that, because physiological features and walking habits of different pedestrians are different, the gait features are extracted and identified through the gait identification model, so that the identity features of the target object can be obtained.
S43: and matching and identifying the gait characteristics with the stored gait characteristics of the user by using the gait identification model, and outputting an identification result.
In this embodiment, the gait recognition model is used to compare the extracted gait features with the gait features of the user stored in the feature library, and determine whether the target object is the user by determining whether the extracted gait features reach the set similarity threshold.
In a specific implementation scenario, if the matching degree of the extracted gait feature and the gait feature of a certain user in the feature library reaches a set similarity threshold, it indicates that the gait feature of the target object and the gait feature of the user are the gait feature of the same person, and the gait recognition model outputs a recognition result that the target object is the user.
In another specific implementation scenario, if the matching degree of the extracted gait feature and the gait feature of any user stored in the feature library does not reach the set similarity threshold, it indicates that the feature library does not have the gait feature of the user matched with the gait feature of the target object, and the gait recognition model outputs a recognition result that the target object is not the user.
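The two matching outcomes above can be sketched as follows; the feature-library layout, the use of cosine similarity as the matching degree, and the threshold value are illustrative assumptions:

```python
import math

# Hypothetical sketch: compare an extracted gait feature against a feature
# library and report the matched user, or no match if the set similarity
# threshold is not reached.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def match_user(feature, library, threshold=0.9):
    """Return the best-matching user ID, or None if no match reaches the threshold."""
    best_id, best_sim = None, threshold
    for user_id, stored in library.items():
        sim = cosine_similarity(feature, stored)
        if sim >= best_sim:
            best_id, best_sim = user_id, sim
    return best_id

library = {"staff_001": [0.9, 0.1, 0.4], "staff_002": [0.1, 0.9, 0.3]}
print(match_user([0.88, 0.12, 0.41], library))  # matches staff_001
print(match_user([0.5, 0.5, 0.5], library))     # no user reaches the threshold
```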
In this embodiment, if the monitoring area is an office location, the gait features of the user stored in the feature library may be gait features extracted based on staff, and the gait features of the pedestrian appearing in the monitoring area may be identified by the gait identification model, so that it may be possible to determine whether the pedestrian appearing is a staff. If the monitoring area is a district gate, the gait features of the user stored in the feature library can be gait features extracted based on the owner, and the gait features of the pedestrians appearing in the monitoring area can be identified through a gait identification model, so that whether the pedestrians appearing are owners can be determined. If the monitoring area is a school gate, the gait features of the user stored in the feature library can be gait features extracted based on students or teachers, and the gait features of pedestrians appearing in the monitoring area are identified through a gait identification model, so that whether the pedestrians appearing are students or teachers can be determined.
Compared with the prior art, the gait feature extraction and recognition are carried out on the obtained gait sequence of the target object through the gait recognition model, the gait recognition model is obtained by comparing and training the equidistant sampling gait sequence (comprising the continuous frame gait sequence) and the random interval sampling gait sequence, the calculation advantages of various gait recognition algorithms can be combined, the problem of algorithm recognition efficiency reduction caused by training only by means of the continuous frame gait sequence can be solved, the problem of algorithm recognition precision reduction caused by training only by means of the equidistant sampling sequence can be solved, accordingly, the gait recognition effect of the gait recognition model in an actual scene is improved, and the requirement of accurate recognition of the gait is met.
Correspondingly, the application provides a gait recognition device.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an embodiment of the gait recognition device of the present application. As shown in fig. 6, the gait recognition device 60 includes a gait sequence acquisition module 61, a gait feature extraction module 62, and a gait recognition module 63.
A gait sequence acquisition module 61 for acquiring a gait sequence of at least one target object.
A gait feature extraction module 62, configured to input a gait sequence into a gait recognition model, and extract gait features of the target object from the gait sequence using the gait recognition model; the gait recognition model is obtained by comparing and training the equidistant sampling gait sequences and the random interval sampling gait sequences; wherein the equally spaced sampling gait sequence comprises a continuous frame gait sequence.
The gait recognition module 63 is configured to perform matching recognition on the gait features and the stored gait features of the user by using the gait recognition model, and output a recognition result.
The specific process is described in the related text of S11 to S13, S111 to S116, S131 to S134, S41 to S43, and S411 to S413, and will not be described herein.
Compared with the prior art, the gait feature extraction module 62 and the gait recognition module 63 are used for extracting and recognizing the gait feature of the acquired target object, and the used gait recognition model is obtained by comparing and training the equidistant sampling gait sequence (comprising the continuous frame gait sequence) and the random interval sampling gait sequence, so that the calculation advantages of various gait recognition algorithms can be combined, the problem of the reduction of the algorithm recognition efficiency caused by training only by means of the continuous frame gait sequence can be solved, the problem of the reduction of the algorithm recognition precision caused by training only by means of the equidistant sampling sequence can be solved, the gait recognition effect of the gait recognition model in an actual scene is improved, and the requirement of accurate recognition of the gait is met.
Correspondingly, the application provides electronic equipment.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an embodiment of an electronic device according to the present application. As shown in fig. 7, the electronic device 70 includes a memory 71 and a processor 72.
In the present embodiment, the memory 71 is used to store program data, and the program data, when executed, implements the steps in the gait recognition method described above; the processor 72 is configured to execute program instructions stored in the memory 71 to implement the steps in the gait recognition method as described above.
Specifically, the processor 72 is configured to control itself and the memory 71 to implement the steps in the gait recognition method as described above. The processor 72 may also be referred to as a CPU (Central Processing Unit ). The processor 72 may be an integrated circuit chip having signal processing capabilities. The processor 72 may also be a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a Field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic device, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. In addition, the processor 72 may be commonly implemented by a plurality of integrated circuit chips.
Compared with the prior art, the gait feature extraction and recognition are performed on the obtained gait sequence of the target object through the processor 72, and the gait recognition model used by the processor 72 is obtained by comparing and training the equidistant sampling gait sequence (comprising the continuous frame gait sequence) and the random interval sampling gait sequence, so that the computational advantages of a plurality of gait recognition algorithms can be combined, the problem of the algorithm recognition efficiency reduction caused by training only by means of the continuous frame gait sequence can be solved, the problem of the algorithm recognition precision reduction caused by training only by means of the equidistant sampling sequence can be solved, the gait recognition effect of the gait recognition model in an actual scene is improved, and the requirement of accurate gait recognition is met.
Accordingly, the present application provides a computer-readable storage medium.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an embodiment of a computer readable storage medium according to the present invention.
The computer readable storage medium 80 comprises a computer program 801 stored on the computer readable storage medium 80, which computer program 801, when executed by the above-mentioned processor, implements the steps of the gait recognition method as described above. In particular, the integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium 80. Based on such understanding, the technical solution of the present application, or a part or all or part of the technical solution contributing to the prior art, may be embodied in the form of a software product stored in a computer-readable storage medium 80, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to perform all or part of the steps of the methods of the embodiments of the present application. And the aforementioned computer-readable storage medium 80 includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in part or all or part of the technical solution contributing to the prior art or in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to perform all or part of the steps of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing description covers only embodiments of the present application and is not intended to limit the scope of this patent; all equivalent structures or equivalent process transformations made using the description of the present application, whether applied directly or indirectly in other related technical fields, are likewise included within the scope of protection of this patent.
If the technical solution of the present application involves personal information, a product applying this technical solution clearly informs the individual of the personal-information processing rules and obtains the individual's separate consent before processing the information. If the technical solution involves sensitive personal information, a product applying it obtains the individual's separate consent before processing such information and additionally satisfies the requirement of "explicit consent". For example, a clear and prominent sign is placed at a personal-information collection device such as a camera to announce that the device's collection range is being entered and that personal information will be collected; if an individual voluntarily enters the collection range, this is treated as consent to collection. Alternatively, on a device that processes personal information, with the processing rules announced by conspicuous signs or notices, authorization may be obtained through a pop-up message or by asking the individual to upload personal information. The personal-information processing rules may include the identity of the personal-information processor, the purpose of processing, the processing method, and the types of personal information to be processed.

Claims (11)

1. A gait recognition method, comprising:
acquiring a gait sequence of at least one target object;
inputting the gait sequence into a gait recognition model, and extracting gait features of the target object from the gait sequence by using the gait recognition model; wherein the gait recognition model is obtained through contrastive training on equally spaced sampling gait sequences and random interval sampling gait sequences, and the equally spaced sampling gait sequences include continuous-frame gait sequences;
and matching the extracted gait features against stored user gait features by using the gait recognition model, and outputting a recognition result.
2. The gait recognition method according to claim 1, wherein,
the step of acquiring a gait sequence of at least one target object comprises:
acquiring a video image captured from a monitoring area;
detecting the video image to obtain an image sequence of at least one target object;
and obtaining a humanoid region of the target object based on the image sequence, and obtaining the gait sequence of the target object by using the humanoid region.
3. The gait recognition method according to claim 2, wherein,
the step of detecting the video image to obtain an image sequence of at least one target object comprises:
acquiring consecutive multi-frame images based on the video image, and detecting the consecutive multi-frame images to obtain a human body detection frame of at least one target object;
tracking the human body detection frame corresponding to each target object in real time, so as to establish tracking identity information for each target object;
and determining an image sequence of each target object in the consecutive multi-frame images based on the tracking identity information.
4. The gait recognition method according to claim 2, wherein,
the step of obtaining a humanoid region of the target object based on the image sequence and obtaining the gait sequence of the target object by using the humanoid region comprises:
segmenting the human body detection frame in the image sequence to obtain a background region and the humanoid region;
obtaining a humanoid-region mask of the target object by using the background region and the humanoid region;
and performing binarization processing on the humanoid-region mask to obtain the gait sequence of the target object.
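As an illustration of the mask-to-gait-sequence step in claim 4, the sketch below binarizes per-frame humanoid-region masks and crops them to fixed-size silhouettes. The probability-mask input format, the 0.5 threshold, and the 64x44 output size are assumptions for illustration (64x44 is a silhouette size common in gait datasets, not something the claim specifies).

```python
import numpy as np

def masks_to_gait_sequence(prob_masks, threshold=0.5, out_size=(64, 44)):
    """Binarize per-frame humanoid-region masks and crop each silhouette
    to its bounding box, yielding a gait (silhouette) sequence.

    prob_masks: (T, H, W) float array of foreground probabilities.
    Returns a list of (out_h, out_w) uint8 arrays with values in {0, 255}.
    """
    out_h, out_w = out_size
    sequence = []
    for mask in prob_masks:
        # Binarization step from claim 4: foreground vs. background.
        binary = (mask >= threshold).astype(np.uint8) * 255
        ys, xs = np.nonzero(binary)
        if len(ys) == 0:  # no person detected in this frame -> skip it
            continue
        crop = binary[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        # Nearest-neighbour resize to a fixed size (no external deps).
        ry = np.linspace(0, crop.shape[0] - 1, out_h).round().astype(int)
        rx = np.linspace(0, crop.shape[1] - 1, out_w).round().astype(int)
        sequence.append(crop[np.ix_(ry, rx)])
    return sequence
```

In practice the masks would come from the segmentation of the human body detection frame described in the claim; here any (T, H, W) probability array will do.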
5. The gait recognition method according to any one of claims 1 to 4, wherein,
the gait recognition model is obtained by training in the following manner:
acquiring a plurality of first training sets and a plurality of second training sets based on a plurality of annotated image sequences; wherein each first training set comprises a plurality of equally spaced sampling gait sequences sharing the same sampling interval, and each second training set comprises a plurality of random interval sampling gait sequences sharing the same random frame-skip interval; the sampling intervals of the first training sets differ from one another, and the random frame-skip intervals of the second training sets differ from one another;
performing gait recognition training on a main model by using one of the first training sets, while performing gait recognition training on a plurality of auxiliary models by using the remaining first training sets and the second training sets; wherein the input sequences of the main model and the auxiliary models are gait sequences acquired from the same annotated image sequence;
training the model parameters of the main model by back propagation using the gait loss function generated during each iterative training of the main model; and updating the model parameters of each auxiliary model by using the updated model parameters of the main model and the similarity loss function between the main model and that auxiliary model;
repeating the input and back-propagation update steps until a set number of iterations is reached, and taking the trained main model as the gait recognition model.
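The training scheme of claim 5 can be sketched with a toy linear "gait model" standing in for a deep network. The cross-entropy gait loss, the similarity-loss gradient, and the blending coefficient `tau` are all assumptions: the claim specifies only that the main model is updated by back propagation from a gait loss, and that each auxiliary model is updated from the updated main parameters together with a main-auxiliary similarity loss.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class LinearGaitModel:
    """Toy stand-in for the gait feature extractor: one linear layer."""
    def __init__(self, dim, classes, seed):
        self.W = np.random.default_rng(seed).normal(0.0, 0.01, (dim, classes))
    def features(self, X):
        return X @ self.W

def train_step(main, auxiliaries, main_batch, aux_batches, labels,
               lr=0.5, tau=0.9):
    # Gait loss (cross-entropy, an assumed form) drives the main model only.
    probs = softmax(main.features(main_batch))
    onehot = np.eye(probs.shape[1])[labels]
    main.W -= lr * main_batch.T @ (probs - onehot) / len(labels)  # back-prop
    gait_loss = -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()
    # Each auxiliary sees a differently sampled view of the same sequences.
    # It follows the similarity-loss gradient, then is blended with the
    # *updated* main parameters -- one plausible reading of claim 5.
    f_main = main.features(main_batch)
    for aux, X_aux in zip(auxiliaries, aux_batches):
        f_aux = aux.features(X_aux)
        g_sim = -2.0 * X_aux.T @ (f_main - f_aux) / len(X_aux)
        aux.W = tau * (aux.W - lr * g_sim) + (1.0 - tau) * main.W
    return gait_loss
```

Iterating `train_step` for a set number of iterations and keeping only the main model corresponds to the final step of the claim.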
6. The gait recognition method according to claim 5, wherein,
the step of acquiring a plurality of first training sets and a plurality of second training sets based on the plurality of annotated image sequences comprises:
acquiring, from a selected video image, image sequences corresponding to a plurality of pedestrians and annotation information for each image sequence;
sampling each image sequence at equal intervals using a plurality of preset sampling intervals, so as to obtain, from each image sequence, a plurality of equally spaced sampling image sequences with different sampling intervals, wherein the sampling intervals include 0; and
sampling each image sequence at random intervals using a plurality of preset random frame-skip interval sets, so as to obtain, from each image sequence, a plurality of random interval sampling image sequences with different random frame-skip intervals;
obtaining a corresponding equally spaced sampling gait sequence from each equally spaced sampling image sequence, and a corresponding random interval sampling gait sequence from each random interval sampling image sequence;
extracting, from the equally spaced sampling gait sequences corresponding to the image sequences, those with the same sampling interval, and grouping them into the same set to obtain the plurality of first training sets, the number of first training sets being equal to the number of preset sampling intervals; and
extracting, from the random interval sampling gait sequences corresponding to the image sequences, those with the same random frame-skip interval, and grouping them into the same set to obtain the plurality of second training sets, the number of second training sets being equal to the number of preset random frame-skip interval sets.
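The two sampling schemes of claim 6 can be sketched as follows; the convention that a sampling interval of k skips k frames (so interval 0 keeps the consecutive-frame sequence) is an assumption consistent with the claim's statement that the sampling interval includes 0.

```python
import random

def equal_interval_sample(seq, interval):
    """A sampling interval of k skips k frames between kept frames;
    interval 0 keeps the original consecutive-frame sequence."""
    return seq[::interval + 1]

def random_interval_sample(seq, skip_set, rng):
    """Keep a frame, then skip a number of frames drawn at random from the
    preset frame-skip interval set, repeating until the sequence ends."""
    out, i = [], 0
    while i < len(seq):
        out.append(seq[i])
        i += 1 + rng.choice(skip_set)
    return out

def build_training_sets(sequences, intervals, skip_sets, seed=0):
    """Group equal-interval samples by interval (first training sets) and
    random-interval samples by skip-interval set (second training sets)."""
    rng = random.Random(seed)
    first_sets = [[equal_interval_sample(s, k) for s in sequences]
                  for k in intervals]
    second_sets = [[random_interval_sample(s, ss, rng) for s in sequences]
                   for ss in skip_sets]
    return first_sets, second_sets
```

For example, with intervals [0, 1, 2] and two skip-interval sets, each input sequence contributes one member to each of three first training sets and two second training sets.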
7. The gait recognition method according to claim 6, wherein,
before the step of sampling each image sequence at equal intervals using a plurality of preset sampling intervals to obtain a plurality of image sequences with different sampling intervals from each image sequence, the method comprises:
taking different values a plurality of times from a preset integer set to obtain a plurality of sampling intervals with different values; wherein the maximum number of sampling intervals equals the number of integers in the preset integer set, and the preset integer set includes 0;
and the step of sampling each image sequence at random intervals using the plurality of preset random frame-skip interval sets to obtain a plurality of random interval sampling image sequences with different random frame-skip intervals from each image sequence comprises:
in response to the random interval sampling image sequence having a preset number of image frames, taking random values the preset number of times from the preset integer set to obtain a random frame-skip interval set containing the preset number of random values;
repeating the random value-taking step to obtain a plurality of random frame-skip interval sets;
and sampling each image sequence at random intervals using the acquired plurality of random frame-skip interval sets, so as to obtain a plurality of random interval sampling image sequences with different random frame-skip intervals from each image sequence.
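The interval-generation step of claim 7 can be sketched as drawing values from the preset integer set; the particular integer set and counts below are illustrative, not taken from the patent.

```python
import random

def make_skip_interval_sets(integer_set, frames_per_sequence, num_sets, seed=0):
    """For each random interval sampling sequence of a preset number of
    frames, draw that many values at random from the preset integer set
    (which includes 0); repeating yields several frame-skip interval sets."""
    rng = random.Random(seed)
    return [[rng.choice(integer_set) for _ in range(frames_per_sequence)]
            for _ in range(num_sets)]
```

Each generated set then parameterizes one round of random-interval sampling over every image sequence.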
8. The gait recognition method according to claim 6, wherein,
the step of performing gait recognition training on the main model by using one of the first training sets while performing gait recognition training on the plurality of auxiliary models by using the remaining first training sets and the second training sets comprises:
inputting one of the equally spaced sampling gait sequences in one of the first training sets into the main model to obtain a first gait feature; and
simultaneously inputting, into the plurality of auxiliary models, the equally spaced sampling gait sequences and random interval sampling gait sequences in the remaining first training sets and the second training sets that were acquired from the same annotated image sequence, to obtain a plurality of second gait features;
the step of training the model parameters of the main model by back propagation using the gait loss function generated during each iterative training of the main model comprises:
calculating the gait loss function between the first gait feature and the annotation information;
updating the main model parameters based on the gait loss function using a back-propagation algorithm, to obtain updated main model parameters;
the step of updating the model parameters of each auxiliary model by using the updated model parameters of the main model and the similarity loss function between the main model and each auxiliary model comprises:
calculating the similarity loss function between the first gait feature and each of the second gait features;
and updating the auxiliary model parameters of each auxiliary model by using the updated main model parameters and the corresponding similarity loss function, to obtain a plurality of updated auxiliary model parameters.
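The two loss functions named in claim 8 are not given concrete forms in the claim; the sketch below shows one plausible pair of choices: cross-entropy for the gait loss against the annotation labels, and one minus the mean cosine similarity for the main-auxiliary similarity loss.

```python
import numpy as np

def gait_loss(logits, labels):
    """Cross-entropy between the main model's output for the first gait
    feature and the annotation labels (an assumed choice of gait loss)."""
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def similarity_loss(first_feature, second_feature):
    """One candidate similarity loss between the first gait feature and a
    second gait feature: one minus the mean cosine similarity, which is
    zero when the two features are identical."""
    a = first_feature / np.linalg.norm(first_feature, axis=1, keepdims=True)
    b = second_feature / np.linalg.norm(second_feature, axis=1, keepdims=True)
    return 1.0 - (a * b).sum(axis=1).mean()
```

The gait loss is back-propagated through the main model only; each similarity loss pairs the first gait feature with one auxiliary model's second gait feature.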
9. The gait recognition method according to claim 5, wherein,
before the step of performing gait recognition training on the main model by using one of the first training sets while performing gait recognition training on the plurality of auxiliary models by using the remaining first training sets and the second training sets, the method comprises:
acquiring a plurality of deep learning models with the same network structure;
setting a different initialization function for each deep learning model;
generating different initialization parameters based on the initialization functions, to obtain a plurality of recognition models with different initialization parameters;
and taking one of the recognition models as the main model and the remaining recognition models as the plurality of auxiliary models.
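The initialization scheme of claim 9 can be sketched with a single weight matrix standing in for the shared network structure; the three initialization functions (Gaussian, uniform, He-style) are hypothetical examples, since the claim does not name specific functions.

```python
import numpy as np

def make_models(dim, classes, init_fns):
    """Build several models with the same structure (here a single weight
    matrix) but a different initialization function each; the first becomes
    the main model, the rest the auxiliary models."""
    models = []
    for seed, init in enumerate(init_fns):
        rng = np.random.default_rng(seed)
        models.append({"W": init(rng, (dim, classes))})
    return models[0], models[1:]

# Three hypothetical initialization functions.
init_fns = [
    lambda rng, shape: rng.normal(0.0, 0.01, shape),                 # Gaussian
    lambda rng, shape: rng.uniform(-0.05, 0.05, shape),              # uniform
    lambda rng, shape: rng.standard_normal(shape) * np.sqrt(2.0 / shape[0]),  # He-style
]
main_model, auxiliary_models = make_models(16, 4, init_fns)
```

Distinct initializations give the main and auxiliary models different starting points even though their architectures are identical.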
10. An electronic device, comprising:
a memory for storing program data which, when executed, implements the steps of the gait recognition method as claimed in any one of claims 1 to 9;
and a processor for executing the program data stored in the memory to implement the steps of the gait recognition method as claimed in any one of claims 1 to 9.
11. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the gait recognition method according to any one of claims 1 to 9.
CN202211699164.9A 2022-12-28 2022-12-28 Gait recognition method, electronic device, and computer-readable storage medium Pending CN116092186A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211699164.9A CN116092186A (en) 2022-12-28 2022-12-28 Gait recognition method, electronic device, and computer-readable storage medium


Publications (1)

Publication Number Publication Date
CN116092186A 2023-05-09

Family

ID=86205587




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination