CN111291640B - Method and apparatus for recognizing gait - Google Patents

Method and apparatus for recognizing gait

Info

Publication number
CN111291640B
CN111291640B (application CN202010064256.4A)
Authority
CN
China
Prior art keywords
statistical
gait
picture
branch
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010064256.4A
Other languages
Chinese (zh)
Other versions
CN111291640A (en)
Inventor
王平
李甫
何栋梁
龙翔
迟至真
赵翔
周志超
孙昊
文石磊
丁二锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010064256.4A priority Critical patent/CN111291640B/en
Publication of CN111291640A publication Critical patent/CN111291640A/en
Application granted granted Critical
Publication of CN111291640B publication Critical patent/CN111291640B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present disclosure disclose a method and apparatus for identifying gait. The method includes the following steps: acquiring features of different granularities corresponding to each picture in a gait sequence; acquiring a plurality of statistical features of the different-granularity features corresponding to each picture; fusing, in each statistical feature dimension, the statistical features of the different-granularity features corresponding to each picture to obtain multi-dimensional statistical features; and obtaining a classification result for the gait sequence based on the multi-dimensional statistical features. The method improves the accuracy of the features used for classification, thereby improving the accuracy of classifying the gait sequence and reducing the dependence on the number of frames in the input gait sequence.

Description

Method and apparatus for recognizing gait
Technical Field
The present disclosure relates to the field of computer technologies, in particular to the field of image recognition technologies, and specifically to a method and an apparatus for recognizing gait.
Background
Gait recognition refers to distinguishing the identities of different people according to differences in their walking postures. Compared with other biometric technologies, gait recognition has the advantages of being non-contact, long-range, and difficult to camouflage. Compared with face recognition, gait recognition is better suited to long distances and low camera image quality, and it is less affected by operations such as applying makeup or changing clothes. Because a person's walking posture is difficult to disguise, gait recognition is more robust in demanding security scenarios.
In recent years, with the development of artificial intelligence, many gait recognition methods based on deep learning have emerged. In terms of model representation, the mainstream approach represents the gait cycle as a gait energy image (a superposition of the pictures in the gait sequence) and inputs it into a Convolutional Neural Network (CNN); representing the gait cycle as an unordered set instead retains more information. In terms of model use, some methods are based on Long Short-Term Memory networks (LSTM) or 3D CNNs and can effectively exploit sequence information; other methods perform gait recognition with simple stacked CNNs.
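For intuition, the gait energy image mentioned above is simply the pixel-wise average of a stack of aligned silhouette frames. The following is a minimal sketch of that idea; the tiny 2×2 frames are illustrative only and not from the disclosure:

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Compute a gait energy image (GEI): the pixel-wise mean of a stack
    of aligned binary silhouette frames with shape (T, H, W)."""
    stack = np.asarray(silhouettes, dtype=np.float32)
    return stack.mean(axis=0)

# Two toy 2x2 binary silhouettes; a pixel seen in both frames stays at 1.0,
# a pixel seen in one of two frames averages to 0.5.
frames = [np.array([[1, 0], [1, 1]]), np.array([[1, 1], [0, 1]])]
gei = gait_energy_image(frames)
```

The set-based representation favored by the disclosure keeps the individual frames instead of collapsing them into this single average.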
Disclosure of Invention
The disclosed embodiments provide methods and apparatus for identifying gait.
In a first aspect, the disclosed embodiments provide a method for identifying gait, including: acquiring characteristics of different granularities corresponding to each picture in a gait sequence; acquiring a plurality of statistical characteristics of the characteristics with different granularities corresponding to each picture; fusing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain multi-dimensional statistical characteristics; and obtaining a classification result of the gait sequence based on the multi-dimensional statistical characteristics.
In some embodiments, the gait sequence is acquired based on the following steps: intercepting a plurality of image frames from the same gait cycle or adjacent gait cycles of an original video; segmenting human figures from the plurality of intercepted image frames to obtain a human body contour sequence; and normalizing the human body contour image in the human body contour sequence into a binary contour image with the same height to obtain a gait sequence.
In some embodiments, obtaining features of different granularities corresponding to each picture in the gait sequence includes: respectively inputting each picture of the gait sequence into the branch sharing network of a pre-trained MGN network to obtain the features of different granularities, corresponding to each picture, output by each branch network connected to the branch sharing network.
In some embodiments, the MGN network comprises: a branch sharing network formed by the first three layers, and three branch networks connected to the output of the branch sharing network; wherein the three branch networks include: a first branch network that extracts global features of the input picture; a second branch network that extracts bisected-granularity features after the input picture is divided into two parts; and a third branch network that extracts quartered-granularity features after the input picture is divided into four parts.
In some embodiments, in each statistical feature dimension, fusing the statistical features of the features with different granularities corresponding to each picture to obtain the multidimensional statistical features includes: and splicing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain the multi-dimensional statistical characteristics.
In some embodiments, the plurality of statistical features includes: the mean, the quartiles, and the second- and third-order moments.
In a second aspect, the disclosed embodiments provide an apparatus for identifying gait, comprising: the granularity characteristic acquisition unit is configured to acquire characteristics of different granularities corresponding to each picture in the gait sequence; the statistical characteristic obtaining unit is configured to obtain a plurality of statistical characteristics of the characteristics with different granularities corresponding to each picture; the statistical feature fusion unit is configured to fuse the statistical features of the features with different granularities corresponding to each picture in each statistical feature dimension to obtain multi-dimensional statistical features; and the statistical characteristic classification unit is configured to obtain a classification result of the gait sequence based on the multidimensional statistical characteristics.
In some embodiments, the gait sequence in the granular feature acquisition unit is acquired with the following units: an image frame intercepting unit configured to intercept a plurality of image frames from the same gait cycle or adjacent gait cycles of the original video; the human image segmentation unit is configured to segment human images from the plurality of intercepted image frames to obtain a human body contour sequence; and the gait sequence generation unit is configured to normalize the human body contour images in the human body contour sequence into binary contour images with the same height to obtain a gait sequence.
In some embodiments, the granular feature acquisition unit is further configured to: and respectively inputting each picture of the gait sequence into a branch sharing network of a pre-trained MGN network to obtain the characteristics of different granularities corresponding to each picture output by each branch network connected with the branch sharing network.
In some embodiments, the MGN network employed by the granular feature obtaining unit includes: a branch sharing network formed by the first three layers, and three branch networks connected to the output of the branch sharing network; wherein the three branch networks include: a first branch network that extracts global features of the input picture; a second branch network that extracts bisected-granularity features after the input picture is divided into two parts; and a third branch network that extracts quartered-granularity features after the input picture is divided into four parts.
In some embodiments, the statistical feature fusion unit is further configured to: and splicing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain the multi-dimensional statistical characteristics.
In some embodiments, the plurality of statistical features includes: the mean, the quartiles, and the second- and third-order moments.
In a third aspect, an embodiment of the present disclosure provides an electronic device/terminal/server, including: one or more processors; storage means for storing one or more programs; when executed by one or more processors, the one or more programs cause the one or more processors to implement a method for identifying gait as described above.
In a fourth aspect, embodiments of the present disclosure provide a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method for identifying gait as described in any one of the above.
The method and apparatus for identifying gait provided by the embodiments of the present disclosure proceed as follows: first, features of different granularities corresponding to each picture in a gait sequence are acquired; next, a plurality of statistical features of those different-granularity features are acquired; then, in each statistical feature dimension, the statistical features of the different-granularity features corresponding to each picture are fused to obtain multi-dimensional statistical features; finally, a classification result is obtained based on the multi-dimensional statistical features. In this process, the classification result for the person in the gait sequence can be determined using set-level statistical features of multiple dimensions. Because these set-level statistics fuse the features of different granularities corresponding to each picture, global and local information in the gait sequence is jointly associated and mined, which improves the accuracy of the features used for classification, in turn improves the accuracy of classifying the gait sequence, and reduces the dependence on the number of frames in the input gait sequence.
Drawings
Other features, objects, and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present disclosure may be applied;
fig. 2 is a schematic flow chart diagram of one embodiment of a method for identifying gait in accordance with an embodiment of the present disclosure;
fig. 3 is an exemplary schematic diagram of a gait sequence in a method for identifying gait according to an embodiment of the disclosure;
FIG. 4 is an exemplary application scenario of a method for identifying gait according to an embodiment of the disclosure;
fig. 5 is a schematic flow chart diagram of another embodiment of a method for identifying gait in accordance with an embodiment of the disclosure;
fig. 6 is a schematic structural diagram of an MGN network employed in another embodiment of a method for identifying gait according to an embodiment of the present disclosure;
FIG. 7 is an exemplary block diagram of one embodiment of an apparatus for identifying gait of the present disclosure;
FIG. 8 is a schematic block diagram of a computer system suitable for use with a server embodying embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the figures and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the disclosed method for identifying gait or apparatus for identifying gait can be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may use terminal devices 101, 102, 103 to interact with a server 105 over a network 104 to receive or send messages or the like. Various communication client applications, such as a translation application, a browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like, may be installed on the terminal devices 101, 102, 103.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices that support browser applications, including but not limited to tablet computers, laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. It may be implemented, for example, as multiple software or software modules to provide distributed services, or as a single software or software module. And is not particularly limited herein.
The server 105 may be a server providing various services, such as a background server providing support for browser applications running on the terminal devices 101, 102, 103. The background server can analyze and process the received data such as the request and feed back the processing result to the terminal equipment.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules, for example, to provide distributed services, or as a single piece of software or software module. And is not particularly limited herein.
In practice, the method for identifying gait provided by the embodiments of the present disclosure may be performed by the terminal devices 101, 102, 103 and/or the server 105, and the apparatus for identifying gait may likewise be disposed in the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, fig. 2 illustrates a flow 200 of one embodiment of a method for identifying gait according to the present disclosure. The method for recognizing gait includes the steps of:
step 201, obtaining features of different granularities corresponding to each picture in the gait sequence.
In this embodiment, an executing subject (e.g., a terminal or a server shown in fig. 1) of the method for recognizing gait may acquire a gait sequence, and then acquire image frame-level features of different granularities for each picture in the gait sequence.
The gait sequence is an image sequence composed of independent gait image frames, and can be acquired by a method for acquiring the gait sequence in the prior art or a future developed technology, which is not limited in the present application. For example, for a given video frame sequence containing one or more pedestrian walking processes, pedestrian detection, pedestrian segmentation, pedestrian tracking and pedestrian identification can be performed on the video frame sequence, and a mask region convolution neural network and foreground and background separation technology are adopted to separate the gait sequence of the pedestrian. The obtained characteristics of different granularities refer to the characteristics of different refining degrees or comprehensive degrees.
In some optional implementations of the present embodiment, the gait sequence may be acquired based on the following steps: intercepting a plurality of image frames from the same gait cycle or adjacent gait cycles of an original video; segmenting human figures from the plurality of intercepted image frames to obtain a human body contour sequence; and normalizing the human body contour images in the human body contour sequence into binary contour images with the same height to obtain the gait sequence shown in fig. 3.
In this implementation, a plurality of video frames may first be captured from the original video: the frames may come from a short walking clip covering the same gait cycle or adjacent gait cycles, from which a preset number of frames (for example, more than 20) are intercepted at equal intervals. The human body contour is then segmented from each intercepted frame (for example, by first generating candidate regions, classifying them with a CNN, and then segmenting the body; or by detecting a body bounding box with a YOLO model and then segmenting with DeepLab v3+). Finally, the body contours are normalized to the same height to obtain the gait sequence.
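The final normalization step can be sketched in a few lines of numpy. This is a simplified illustration under stated assumptions: the silhouette arrives as a binary mask, nearest-neighbour sampling stands in for a proper image resize, and the function name is hypothetical:

```python
import numpy as np

def normalize_silhouette(mask, out_h=64):
    """Crop a binary silhouette to its bounding box, then rescale it to a
    fixed height with nearest-neighbour sampling, keeping the aspect ratio."""
    ys, xs = np.nonzero(mask)
    crop = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    h, w = crop.shape
    out_w = max(1, round(w * out_h / h))
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source col for each output col
    return (crop[rows][:, cols] > 0).astype(np.uint8)

mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:6, 3:5] = 1                         # a toy 4x2 "body"
norm = normalize_silhouette(mask, out_h=64)
```

In a real pipeline an image library's resize with interpolation would replace the index-based sampling, but the crop-then-scale-to-common-height logic is the same.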
Specifically, when the features of different granularities of each picture in the gait sequence are obtained, the features of different granularities of the pictures can be obtained by a method of obtaining the features of different granularities of the pictures in the prior art or a future developed technology, which is not limited in the present application.
For example, a multi-Granularity Network (abbreviated as MGN) may be employed to extract features of different granularities. For another example, the CNN may be trained by using a plurality of image samples of different regions to obtain network models for extracting features of different regions, and then the network models for extracting features of different regions are used to extract features of images of different regions, so as to obtain features of different granularities.
Step 202, a plurality of statistical features of the features with different granularities corresponding to each picture are obtained.
In this embodiment, the statistical features reflect the distribution of a population across its individuals, including difference features and regularity features such as the range, the standard deviation, the coefficient of variation, and the mode.
For the features with different granularities corresponding to each picture, the statistical features of the features with each granularity can be counted.
In some optional implementations of this embodiment, the plurality of statistical features may include: the mean, the quartiles, and the second- and third-order moments.
In this implementation, the mean, also called the arithmetic mean, is a statistical quantity of a data set; the arithmetic mean of a sample is often used to represent the average level of the population. The quartiles are, in statistics, the values at the three cut points obtained after all values are sorted from smallest to largest and divided into four equal parts. The second- and third-order moments represent the variance and skewness of a data set, respectively.
The statistical features adopted in this implementation describe the different-granularity features corresponding to each picture in the data set in terms of the set's average level, quartiles, variance, and skewness. This enriches the statistical features obtained for each picture's different-granularity features and improves their comprehensiveness.
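The listed statistics can all be computed per feature dimension across the frames of a sequence. A minimal numpy sketch (the helper name and the (T, D) layout are assumptions for illustration):

```python
import numpy as np

def set_level_statistics(frame_feats):
    """Given frame-level features of shape (T, D) -- T frames, D dims --
    compute per-dimension set-level statistics over the T frames:
    mean, the three quartiles, and the second- and third-order central
    moments (variance and a skewness-related quantity)."""
    f = np.asarray(frame_feats, dtype=np.float64)
    mean = f.mean(axis=0)
    q1, q2, q3 = np.quantile(f, [0.25, 0.5, 0.75], axis=0)
    centered = f - mean
    m2 = (centered ** 2).mean(axis=0)   # second central moment (variance)
    m3 = (centered ** 3).mean(axis=0)   # third central moment (skewness)
    return {"mean": mean, "q1": q1, "q2": q2, "q3": q3, "m2": m2, "m3": m3}

feats = np.array([[1.0], [2.0], [3.0], [4.0]])  # 4 frames, 1-D features
s = set_level_statistics(feats)
```

Because every statistic is an order-independent reduction over the T frames, this realizes the "unordered set" view of the gait sequence: the result does not change if the frames are shuffled.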
And 203, fusing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain the multi-dimensional statistical characteristics.
In this embodiment, for each statistical feature dimension, a feature fusion method in the prior art or a technique developed in the future may be used to fuse the statistical features of the features with different granularities corresponding to each picture, which is not limited in this application.
For example, the statistical characteristics of the features with different granularities corresponding to each picture may be added to obtain the statistical characteristics of each picture, and then the statistical characteristics of each picture may be added to obtain the statistical characteristics of the set level.
For another example, the statistical features of the features with different granularities corresponding to each picture may be directly spliced or added, so as to obtain the statistical features at the set level.
For example, a feature fusion algorithm may be used to fuse the statistical features of the features with different granularities corresponding to the respective pictures. The feature fusion algorithm herein may include, but is not limited to: an algorithm based on Bayesian decision theory; an algorithm based on sparse representation theory; and deep learning theory based algorithms.
In some optional implementation manners of this embodiment, in each statistical feature dimension, fusing the statistical features of the features with different granularities corresponding to each picture, and obtaining the multi-dimensional statistical features includes: and splicing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain the multi-dimensional statistical characteristics.
In the implementation mode, the statistical characteristics of the characteristics with different granularities corresponding to each picture are spliced in each statistical characteristic dimension, so that the multi-dimensional statistical characteristics can be obtained, information loss caused by characteristic fusion by adopting other characteristic fusion methods is avoided, and the information of the statistical characteristics of the characteristics with different granularities is retained to the maximum extent.
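The splicing described above can be sketched as two levels of concatenation: within each statistic dimension across granularities, then across statistic dimensions. The data layout and function name below are assumptions made for illustration:

```python
import numpy as np

def fuse_statistics(stats_per_granularity):
    """stats_per_granularity: one dict per granularity branch, mapping a
    statistic name ('mean', 'm2', ...) to a 1-D feature vector.  Fusion
    concatenates, per statistic dimension, the vectors of all granularities,
    then joins all statistic dimensions into one multi-dimensional feature."""
    names = stats_per_granularity[0].keys()
    per_stat = [np.concatenate([g[n] for g in stats_per_granularity])
                for n in names]
    return np.concatenate(per_stat)

# Toy sizes: a 4-dim global branch and a 2-dim bisected branch,
# each with two statistic dimensions.
global_stats = {"mean": np.ones(4), "m2": np.zeros(4)}
half_stats = {"mean": np.ones(2), "m2": np.zeros(2)}
fused = fuse_statistics([global_stats, half_stats])
```

Unlike element-wise addition, concatenation keeps every component of every statistic, which is the "no information loss" property the implementation relies on.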
And step 204, obtaining a classification result of the gait sequence based on the multi-dimensional statistical characteristics.
In this embodiment, after obtaining the multidimensional statistical features, the multidimensional statistical features may be input into a pre-trained classification model, so as to obtain a classification result of a gait sequence output by the pre-trained classification model.
In the process, the input multi-dimensional statistical characteristics fully excavate the information contained in the gait sequence, so that the accuracy of the classification result can be improved.
Those skilled in the art will appreciate that the pre-trained classification model may be obtained by training an initial classification model with samples of multi-dimensional statistical features labeled with classification results, which is not described in detail here.
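The disclosure does not fix a particular classifier (its CPC classification even covers decision-surface approaches such as support vector machines). As an illustrative stand-in only, not the disclosed model, a nearest-centroid gallery match over the fused statistics can be sketched as:

```python
import numpy as np

def classify(fused_feat, gallery):
    """Match a fused multi-dimensional statistical feature against a gallery
    of enrolled identities by nearest Euclidean distance.  This is a toy
    stand-in for the pre-trained classification model described above."""
    best_id, best_dist = None, float("inf")
    for identity, ref in gallery.items():
        dist = float(np.linalg.norm(fused_feat - ref))
        if dist < best_dist:
            best_id, best_dist = identity, dist
    return best_id

# Hypothetical 2-dim gallery features for two enrolled identities.
gallery = {"person_a": np.array([0.0, 0.0]), "person_b": np.array([5.0, 5.0])}
pred = classify(np.array([0.4, 0.2]), gallery)
```

A trained classifier (e.g. a fully connected head or an SVM) would replace the raw distance comparison, but the interface — fused feature in, identity out — is the same.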
According to the method for identifying gait of this embodiment of the disclosure, the classification result for the person in the gait sequence can be determined using set-level statistical features of multiple dimensions. Because these set-level statistics fuse the features of different granularities corresponding to each picture, global and local information in the gait sequence is jointly associated and mined, which improves the accuracy of the features used for classification, in turn improves the accuracy of classifying the gait sequence, and reduces the dependence on the number of frames in the input gait sequence.
An exemplary application scenario of the method for recognizing gait of the present disclosure is described below in conjunction with fig. 4.
As shown in fig. 4, fig. 4 illustrates one exemplary application scenario of the method for recognizing gait according to the present disclosure.
As shown in fig. 4, a method 400 for identifying gait runs in an electronic device 420, the method 400 comprising:
firstly, acquiring features 402 with different granularities corresponding to each picture in a gait sequence 401;
then, acquiring a plurality of statistical features 403 of the features with different granularities corresponding to each picture;
then, at each statistical feature dimension, fusing the statistical features 403 of the features with different granularities corresponding to each picture to obtain a multi-dimensional statistical feature 404 of the gait sequence;
finally, based on the multi-dimensional statistical features 404, a classification result 405 of the gait sequence is obtained.
It should be understood that the application scenario of the method for recognizing gait shown in fig. 4 is only an exemplary description of the method for recognizing gait, and does not represent a limitation to the method. For example, the steps shown in fig. 4 above may be implemented in further detail. It is also possible to add further sub-steps to refine how a certain step is implemented on the basis of fig. 4 described above.
With further reference to fig. 5, fig. 5 shows a schematic flow chart of another embodiment of a method for identifying gait according to the present disclosure.
As shown in fig. 5, the method 500 for identifying gait of the present embodiment may include the following steps:
and step 501, acquiring characteristics with different granularities corresponding to each picture in the gait sequence.
In this embodiment, an executing subject (e.g., a terminal or a server shown in fig. 1) of the method for recognizing gait may acquire a gait sequence, and then acquire image frame-level features of different granularities for each picture in the gait sequence.
The gait sequence is an image sequence composed of independent gait image frames, and can be acquired by a method for acquiring the gait sequence in the prior art or a future developed technology, which is not limited in the present application. For example, for a given video frame sequence containing one or more pedestrian walking processes, pedestrian detection, pedestrian segmentation, pedestrian tracking and pedestrian identification can be performed on the video frame sequence, and a mask region convolution neural network and foreground and background separation technology are adopted to separate the gait sequence of the pedestrian. The obtained characteristics of different granularities refer to the characteristics of different refining degrees or comprehensive degrees.
Step 502, each picture of the gait sequence is respectively input into a pre-trained branch sharing network of the MGN network, and features of different granularities corresponding to each picture output by each branch network connected to the branch sharing network are obtained.
In this embodiment, the branch sharing network and the branch networks of the MGN network are shown in fig. 6 and include: a branch sharing network 601 formed by the first three layers, and three branch networks connected to the output of the branch sharing network. The three branch networks include: a first branch network 602, which extracts global features of the input picture; a second branch network 603, which extracts bisected-granularity features after the input picture is split into two parts; and a third branch network 604, which extracts quartered-granularity features after the input picture is split into four parts.
The branch sharing network and each branch network in the MGN network shown in fig. 6 may be a branch sharing network and each branch network in a pre-trained MGN network, and each image in the gait sequence may be respectively input into the branch sharing network in the MGN network shown in fig. 6, so as to obtain a plurality of frame-level features with different granularities output by each branch network.
Given the nature of gait recognition, a globally represented branch 602 and two locally represented branches 603, 604 are used here. The first branch 602 extracts global information of the whole picture; the second branch 603 divides the picture into an upper part and a lower part to extract medium-granularity semantic information; and the third branch 604 divides the picture into four parts from top to bottom to extract finer-granularity information. The three branches cooperate with a division of labor: the low-level weights of the first three layers are shared, while the later high-level weights are independent, which mirrors how human cognition takes in both the overall gait and multi-granularity local details.
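The 1/2/4 horizontal partitioning principle described above can be sketched as follows. This is not the trained MGN itself (no convolutional weights are involved); it only illustrates the stripe splitting and pooling idea, with max pooling, an (H, W, C) feature layout, and a height divisible by four chosen as assumptions.

```python
import numpy as np

def multi_granularity_features(feature_map):
    """Split an (H, W, C) feature map into 1, 2 and 4 horizontal
    stripes and max-pool each stripe, mimicking MGN's global, half
    and quarter branches. Returns 1 + 2 + 4 = 7 C-dim vectors."""
    h = feature_map.shape[0]
    feats = []
    for parts in (1, 2, 4):              # global, bisected, quartered
        stripe_h = h // parts            # assumes h is divisible by 4
        for p in range(parts):
            stripe = feature_map[p * stripe_h:(p + 1) * stripe_h]
            feats.append(stripe.max(axis=(0, 1)))  # pool H and W away
    return feats
```

In the patent's design, each of these per-stripe vectors is a frame-level feature of a particular granularity; the statistics of step 503 are then taken over frames for each such feature.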
Returning to fig. 5, in step 503, in each statistical feature dimension, the statistical features of the features with different granularities corresponding to each picture are fused to obtain a multi-dimensional statistical feature.
In this embodiment, for each statistical feature dimension, any existing or future feature fusion method may be adopted to fuse the statistical features of the features with different granularities corresponding to each picture, which is not limited in this application.
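One plausible fusion, assuming the statistics named elsewhere in this embodiment (mean, quartiles, and the second- and third-order moments) and simple concatenation as the fusion method, could look like the sketch below; the function name and concatenation order are assumptions.

```python
import numpy as np

def fuse_statistics(frame_feats):
    """frame_feats: (T, D) array of frame-level features for one gait
    sequence. For every feature dimension, compute the mean, the three
    quartiles, and the second and third central moments over the T
    frames, then concatenate them into one 6*D descriptor."""
    x = np.asarray(frame_feats, dtype=float)
    mean = x.mean(axis=0)
    q1, q2, q3 = np.percentile(x, [25, 50, 75], axis=0)
    centered = x - mean
    m2 = (centered ** 2).mean(axis=0)    # second central moment (variance)
    m3 = (centered ** 3).mean(axis=0)    # third central moment (unnormalized skew)
    return np.concatenate([mean, q1, q2, q3, m2, m3])
```

Because every statistic summarizes the whole sequence per dimension, the resulting descriptor length is independent of the number of frames T.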
Step 504: obtain a classification result of the gait sequence based on the multi-dimensional statistical features.
In this embodiment, after obtaining the multidimensional statistical features, the multidimensional statistical features may be input into a pre-trained classification model, so as to obtain a classification result of a gait sequence output by the pre-trained classification model.
In this process, the input multi-dimensional statistical features fully exploit the information contained in the gait sequence, so the accuracy of the classification result can be improved.
Those skilled in the art will appreciate that the pre-trained classification model may be obtained by training an initial classification model with samples of multi-dimensional statistical features that have been labeled with classification results, which is not described in detail herein.
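The patent leaves the form of the classification model open. Purely as a hedged stand-in, a nearest-neighbour matcher over fused descriptors illustrates how a classification result could be produced from the multi-dimensional statistical features; the cosine-similarity metric, function name, and gallery structure here are assumptions, not the patent's pre-trained model.

```python
import numpy as np

def classify_gait(descriptor, gallery, labels):
    """Match one fused descriptor (D,) against a gallery of (N, D)
    labelled descriptors by cosine similarity and return the label
    of the most similar gallery entry."""
    g = np.asarray(gallery, dtype=float)
    d = np.asarray(descriptor, dtype=float)
    sims = (g @ d) / (np.linalg.norm(g, axis=1) * np.linalg.norm(d) + 1e-12)
    return labels[int(np.argmax(sims))]
```

A trained classifier (e.g., a softmax head over identities) would replace this lookup in practice, but the input/output contract is the same: one fused descriptor in, one identity label out.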
It will be understood by those skilled in the art that steps 501, 503 and 504 in the method for identifying gait in the embodiment shown in fig. 5 correspond to steps 201, 203 and 204, respectively, in the method for identifying gait in the embodiment shown in fig. 2. Therefore, the operations and features described above for steps 201, 203 and 204 in fig. 2 are also applicable to steps 501, 503 and 504, and are not described again here.
On the basis of the method for identifying gait shown in fig. 2, the embodiment of fig. 5 refines how the features of different granularities corresponding to each picture in the gait sequence are acquired, using a pre-trained MGN network. This takes into account both the overall gait information and local information at multiple granularities, and improves the comprehensiveness of the extracted features of different granularities, thereby improving the comprehensiveness of the multi-dimensional statistical features obtained from them and the accuracy of the resulting classification of the gait sequence.
With further reference to fig. 7, as an implementation of the methods shown in the above-mentioned figures, an embodiment of the present disclosure provides an embodiment of an apparatus for identifying gait, where the embodiment of the apparatus corresponds to the embodiments of the methods shown in fig. 2 to fig. 6, and the apparatus may be specifically applied to the above-mentioned terminal device or server.
As shown in fig. 7, the apparatus 700 for recognizing gait of the present embodiment may include: a granularity feature acquiring unit 710 configured to acquire features of different granularities corresponding to each picture in the gait sequence; a statistical feature obtaining unit 720 configured to obtain a plurality of statistical features of the features of different granularities corresponding to each picture; a statistical feature fusion unit 730 configured to fuse, in each statistical feature dimension, the statistical features of the features of different granularities corresponding to each picture to obtain a multi-dimensional statistical feature; and a statistical feature classification unit 740 configured to obtain a classification result of the gait sequence based on the multi-dimensional statistical features.
In some optional implementations of the present embodiment, the gait sequence in the granularity feature acquisition unit 710 is acquired by the following units (not shown in the figure): an image frame intercepting unit configured to intercept a plurality of image frames from the same gait cycle or adjacent gait cycles of the original video; the human image segmentation unit is configured to segment human images from the intercepted image frames to obtain a human body contour sequence; and the gait sequence generation unit is configured to normalize the human body contour images in the human body contour sequence into binary contour images with the same height to obtain a gait sequence.
In some optional implementations of the present embodiment, the granular feature obtaining unit 710 is further configured to: and respectively inputting each picture of the gait sequence into a branch sharing network of a pre-trained MGN network to obtain the characteristics of different granularities corresponding to each picture output by each branch network connected with the branch sharing network.
In some optional implementations of this embodiment, the MGN network used in the granularity feature acquiring unit 710 includes: a branch sharing network formed by the first three layers, and three branch networks connected to the output of the branch sharing network; wherein the three branch networks include: a first branch network for extracting global features of the input picture; a second branch network for extracting bisected-granularity features after the input picture is bisected; and a third branch network for extracting quartered-granularity features after the input picture is quartered.
In some optional implementations of the present embodiment, the statistical feature fusion unit 730 is further configured to: and splicing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain the multi-dimensional statistical characteristics.
In some optional implementations of the present embodiment, the plurality of statistical features in the statistical feature obtaining unit 720 include: mean, quartiles, second-order moment and third-order moment.
It should be understood that the various elements recited in the apparatus 700 correspond to various steps recited in the methods described with reference to fig. 2-6. Thus, the operations and features described above for the method are equally applicable to the apparatus 700 and the various units included therein and will not be described again here.
Referring now to fig. 8, a schematic diagram of an electronic device (e.g., a server or terminal device of fig. 1) 800 suitable for use in implementing embodiments of the present disclosure is shown. Terminal devices in embodiments of the present disclosure may include, but are not limited to, devices such as notebook computers, desktop computers, and the like. The terminal device/server shown in fig. 8 is only an example, and should not bring any limitation to the functions and the use range of the embodiments of the present disclosure.
As shown in fig. 8, electronic device 800 may include a processing means (e.g., central processing unit, graphics processor, etc.) 801 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage means 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the electronic device 800 are also stored. The processing device 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.
Generally, the following devices may be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 807 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage 808 including, for example, magnetic tape, hard disk, etc.; and a communication device 809. The communication means 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While fig. 8 illustrates an electronic device 800 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be alternatively implemented or provided. Each block shown in fig. 8 may represent one device or may represent multiple devices as desired.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication means 809, or installed from the storage means 808, or installed from the ROM 802. The computer program, when executed by the processing apparatus 801, performs the above-described functions defined in the methods of the embodiments of the present disclosure. It should be noted that the computer readable medium described in the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In embodiments of the present disclosure, however, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring characteristics of different granularities corresponding to each picture in a gait sequence; acquiring a plurality of statistical characteristics of the characteristics with different granularities corresponding to each picture; fusing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain multi-dimensional statistical characteristics; and obtaining a classification result of the gait sequence based on the multi-dimensional statistical characteristics.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor includes a granularity feature acquiring unit, a statistical feature obtaining unit, a statistical feature fusion unit, and a statistical feature classification unit. The names of these units do not in some cases constitute a limitation on the units themselves; for example, the granularity feature acquiring unit may also be described as a "unit that acquires features of different granularities corresponding to the respective pictures in the gait sequence".
The foregoing description covers only preferred embodiments of the disclosure and illustrates the principles of the technology employed. Those skilled in the art will appreciate that the scope of the invention in the present disclosure is not limited to technical solutions formed by the specific combination of the above features; it also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, solutions in which the above features are replaced with technical features of similar functions disclosed (but not limited to) in the present disclosure.

Claims (14)

1. A method for identifying gait, comprising:
acquiring characteristics of different granularities corresponding to each picture in a gait sequence;
acquiring a plurality of statistical characteristics of the characteristics with different granularities corresponding to each picture;
fusing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain multi-dimensional statistical characteristics;
and obtaining a classification result of the gait sequence based on the multi-dimensional statistical characteristics.
2. The method of claim 1, wherein the gait sequence is acquired based on the steps of:
intercepting a plurality of image frames from the same gait cycle or adjacent gait cycles of an original video;
segmenting human figures from the plurality of intercepted image frames to obtain a human body contour sequence;
and normalizing the human body contour images in the human body contour sequence into binary contour images with the same height to obtain a gait sequence.
3. The method as claimed in claim 1, wherein the acquiring features of different granularities corresponding to each picture in the gait sequence comprises:
and respectively inputting each picture of the gait sequence into a pre-trained branch sharing network of the MGN network to obtain the characteristics of different granularities corresponding to each picture output by each branch network connected with the branch sharing network.
4. The method of claim 3, wherein the MGN network comprises: a branch sharing network located in the first three layers, and three branch networks connected to the outputs of the branch sharing network;
wherein the three branch networks include: a first branch network for extracting global features of the input picture; a second branch network for extracting bisected-granularity features after the input picture is bisected; and a third branch network for extracting quartered-granularity features after the input picture is quartered.
5. The method according to claim 1, wherein the step of fusing the statistical features of the features with different granularities corresponding to the respective pictures in each statistical feature dimension to obtain the multi-dimensional statistical features comprises:
and splicing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain the multi-dimensional statistical characteristics.
6. The method of any of claims 1-5, wherein the plurality of statistical features comprises: mean, quartiles, second-order moment and third-order moment.
7. An apparatus for identifying gait, comprising:
the granularity characteristic acquisition unit is configured to acquire characteristics of different granularities corresponding to each picture in the gait sequence;
the statistical characteristic obtaining unit is configured to obtain a plurality of statistical characteristics of the characteristics with different granularities corresponding to each picture;
the statistical feature fusion unit is configured to fuse the statistical features of the features with different granularities corresponding to each picture in each statistical feature dimension to obtain multi-dimensional statistical features;
and the statistical characteristic classification unit is configured to obtain a classification result of the gait sequence based on the multidimensional statistical characteristics.
8. The apparatus according to claim 7, wherein the gait sequence in the granular feature acquisition unit is acquired with the following units:
an image frame intercepting unit configured to intercept a plurality of image frames from the same gait cycle or adjacent gait cycles of the original video;
the human image segmentation unit is configured to segment human images from the plurality of intercepted image frames to obtain a human body contour sequence;
and the gait sequence generation unit is configured to normalize the human body contour images in the human body contour sequence into binary contour images with the same height to obtain a gait sequence.
9. The apparatus of claim 7, wherein the granular feature acquisition unit is further configured to:
and respectively inputting each picture of the gait sequence into a pre-trained branch sharing network of the MGN network to obtain the characteristics of different granularities corresponding to each picture output by each branch network connected with the branch sharing network.
10. The apparatus of claim 9, wherein the MGN network employed by the granular feature acquisition unit comprises: a branch sharing network located in the first three layers, and three branch networks connected to the outputs of the branch sharing network;
wherein the three branch networks include: a first branch network for extracting global features of the input picture; a second branch network for extracting bisected-granularity features after the input picture is bisected; and a third branch network for extracting quartered-granularity features after the input picture is quartered.
11. The apparatus of claim 7, wherein the statistical feature fusion unit is further configured to:
and splicing the statistical characteristics of the characteristics with different granularities corresponding to each picture in each statistical characteristic dimension to obtain the multi-dimensional statistical characteristics.
12. The apparatus of any of claims 7-11, wherein the plurality of statistical features comprises: mean, quartiles, second-order moment and third-order moment.
13. An electronic device/terminal/server comprising:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
14. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-6.
CN202010064256.4A 2020-01-20 2020-01-20 Method and apparatus for recognizing gait Active CN111291640B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010064256.4A CN111291640B (en) 2020-01-20 2020-01-20 Method and apparatus for recognizing gait

Publications (2)

Publication Number Publication Date
CN111291640A CN111291640A (en) 2020-06-16
CN111291640B true CN111291640B (en) 2023-02-17

Family

ID=71028398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010064256.4A Active CN111291640B (en) 2020-01-20 2020-01-20 Method and apparatus for recognizing gait

Country Status (1)

Country Link
CN (1) CN111291640B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561069B (en) * 2020-12-23 2021-09-21 北京百度网讯科技有限公司 Model processing method, device, equipment and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1994022363A1 (en) * 1993-04-02 1994-10-13 Osachi Co., Ltd. Electronic blood pressure measuring instrument
US5457680A (en) * 1993-05-18 1995-10-10 International Business Machines Corporation Data gateway for mobile data radio terminals in a data communication network
CA2364877A1 (en) * 1999-03-10 2000-09-14 Cold Spring Harbor Laboratory Gene chip technology for determining memory genes
CN109447128A (en) * 2018-09-29 2019-03-08 中国科学院自动化研究所 Walking based on micro- inertial technology and the classification of motions method and system that remains where one is
CN109583375A (en) * 2018-11-30 2019-04-05 中山大学 A kind of the facial image illumination recognition methods and system of multiple features fusion
CN110246518A (en) * 2019-06-10 2019-09-17 深圳航天科技创新研究院 Speech-emotion recognition method, device, system and storage medium based on more granularity sound state fusion features
CN110378301A (en) * 2019-07-24 2019-10-25 北京中星微电子有限公司 Pedestrian recognition methods and system again
CN110399835A (en) * 2019-07-26 2019-11-01 北京文安智能技术股份有限公司 A kind of analysis method of personnel's residence time, apparatus and system
CN110428006A (en) * 2019-08-01 2019-11-08 中国科学院自动化研究所 The detection method of computer generated image, system, device
WO2020006964A1 (en) * 2018-07-06 2020-01-09 北京字节跳动网络技术有限公司 Image detection method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8694449B2 (en) * 2009-05-29 2014-04-08 Board Of Trustees Of Michigan State University Neuromorphic spatiotemporal where-what machines
CN108460335B (en) * 2018-01-26 2022-05-27 百度在线网络技术(北京)有限公司 Video fine-granularity identification method and device, computer equipment and storage medium
JP7304223B2 (en) * 2018-07-09 2023-07-06 タタ コンサルタンシー サービシズ リミテッド Methods and systems for generating hybrid learning techniques


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Learning Discriminative Features with Multiple Granularities for Person Re-Identification;Guanshuo Wang等;《MM "18: Proceedings of the 26th ACM international conference on Multimedia》;20181030;第274-282页 *
End-to-end multi-granularity motor imagery EEG signal analysis method based on a C-LSTM model; Li Haifeng et al.; Journal of Signal Processing; 20180825 (No. 08); pp. 5-12 *
Medical image fusion based on regional similarity of wavelet coefficients; Huang Caixia et al.; Application Research of Computers; 20080115 (No. 01); pp. 280-282 *
Salient object detection model for visual attention based on random walks on a hybrid graph; Hu Zhengping et al.; Chinese Journal of Scientific Instrument; 20110715 (No. 07); pp. 175-182 *
Object/background segmentation method supported by multi-feature combination and graph cut; Deng Yu et al.; Journal of Computer Research and Development; 20081015 (No. 10); pp. 96-102 *

Also Published As

Publication number Publication date
CN111291640A (en) 2020-06-16

Similar Documents

Publication Publication Date Title
CN108520220B (en) Model generation method and device
CN109214343B (en) Method and device for generating face key point detection model
CN108427939B (en) Model generation method and device
CN108509915B (en) Method and device for generating face recognition model
CN108491823B (en) Method and device for generating human eye recognition model
CN108509994B (en) Method and device for clustering character images
CN111444744A (en) Living body detection method, living body detection device, and storage medium
CN113515942A (en) Text processing method and device, computer equipment and storage medium
CN111696176A (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN108491812B (en) Method and device for generating face recognition model
CN111523413A (en) Method and device for generating face image
CN111539903B (en) Method and device for training face image synthesis model
CN111930964B (en) Content processing method, device, equipment and storage medium
US20230036338A1 (en) Method and apparatus for generating image restoration model, medium and program product
CN108399401B (en) Method and device for detecting face image
CN113610034B (en) Method and device for identifying character entities in video, storage medium and electronic equipment
CN111539287A (en) Method and device for training face image generation model
CN111292333B (en) Method and apparatus for segmenting an image
CN113140012B (en) Image processing method, device, medium and electronic equipment
CN112037305B (en) Method, device and storage medium for reconstructing tree-like organization in image
CN108257081B (en) Method and device for generating pictures
CN111291640B (en) Method and apparatus for recognizing gait
CN113111684B (en) Training method and device for neural network model and image processing system
CN110619602B (en) Image generation method and device, electronic equipment and storage medium
CN113780239A (en) Iris recognition method, iris recognition device, electronic equipment and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant