US11335127B2 - Media processing method, related apparatus, and storage medium - Google Patents

Media processing method, related apparatus, and storage medium

Info

Publication number
US11335127B2
US11335127B2 (application US16/903,929)
Authority
US
United States
Prior art keywords
gait
energy diagram
gait energy
fused
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US16/903,929
Other languages
English (en)
Other versions
US20200320284A1 (en)
Inventor
Kaihao ZHANG
Wenhan LUO
Lin Ma
Wei Liu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LTD reassignment TENCENT TECHNOLOGY (SHENZHEN) COMPANY LTD ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ZHANG, Kaihao, LIU, WEI, LUO, Wenhan, MA, LIN
Publication of US20200320284A1 publication Critical patent/US20200320284A1/en
Application granted granted Critical
Publication of US11335127B2 publication Critical patent/US11335127B2/en
Active legal-status Critical Current
Adjusted expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • G06K9/6215
    • G06K9/6256
    • G06K9/629
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/23Recognition of whole body movements, e.g. for sport training
    • G06V40/25Recognition of walking or running movements, e.g. gait recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

Definitions

  • Example embodiments of the disclosure relate to the field of gait recognition technologies, and specifically, to a video processing method, a video processing apparatus, a video processing device, and a storage medium, as well as an image processing method, an image processing apparatus, an image processing device, and a storage medium.
  • There are demands for gait recognition in many fields. For example, outdoor cameras are widely used in public places, but they are normally located relatively far from people, and pedestrian recognition may not be properly performed based on faces included in a captured image or video. In gait recognition technologies, pedestrian recognition may be performed according to gait feature vectors of people, recognition does not need to be based on faces, and there is no need for high-definition image quality. Therefore, gait recognition technologies have become an important topic for research.
  • Example embodiments of the disclosure provide a media processing method, a media processing device, and a storage medium, to implement gait recognition. Further, the example embodiments of the disclosure provide an image processing method, an image processing device, and a storage medium, to implement posture recognition.
  • a media processing method performed by a media processing device, the method including: obtaining a to-be-processed video, the to-be-processed video including an object with a to-be-recognized identity; generating a first gait energy diagram based on the to-be-processed video; and obtaining a second gait energy diagram, the second gait energy diagram being generated based on a video including an object with a known identity.
  • a media processing device including: at least one memory configured to store program code; and at least one processor configured to read the program code and operate as instructed by the program code, the program code including: first obtaining code configured to cause at least one of the at least one processor to obtain a to-be-processed video, the to-be-processed video including an object with a to-be-recognized identity; generating code configured to cause at least one of the at least one processor to generate a first gait energy diagram based on the to-be-processed video; second obtaining code configured to cause at least one of the at least one processor to obtain a second gait energy diagram, the second gait energy diagram being generated based on a video including an object with a known identity; extracting code configured to cause at least one of the at least one processor to extract, by using a deep neural network, identity information of the first gait energy diagram and the second gait energy diagram, and determine a fused gait feature vector based on gait feature vectors of the first gait energy diagram, included in the identity information of the first gait energy diagram, and gait feature vectors of the second gait energy diagram, included in the identity information of the second gait energy diagram; and calculating code configured to cause at least one of the at least one processor to calculate a similarity between the first gait energy diagram and the second gait energy diagram based on at least the fused gait feature vector.
  • a non-transitory computer-readable storage medium storing a plurality of instructions executable by at least one processor to perform a media processing method, the method comprising: obtaining a to-be-processed video, the to-be-processed video comprising an object with a to-be-recognized identity; generating a first gait energy diagram based on the to-be-processed video; obtaining a second gait energy diagram, the second gait energy diagram being generated based on a video comprising an object with a known identity; by using a deep neural network, extracting identity information of the first gait energy diagram and the second gait energy diagram, and determining a fused gait feature vector based on gait feature vectors of the first gait energy diagram, included in the identity information of the first gait energy diagram, and gait feature vectors of the second gait energy diagram, included in the identity information of the second gait energy diagram; and calculating a similarity between the first gait energy diagram and the second gait energy diagram based on at least the fused gait feature vector.
  • the deep neural network used in an example embodiment not only extracts the fused gait feature vector of the object with a to-be-recognized identity and the object with a known identity, but also extracts the identity information (including identifiers (IDs) and gait feature vectors) of the object with a to-be-recognized identity and the object with a known identity.
  • the fused gait feature vector depends on the gait feature vectors.
  • the similarity (that is, a similarity between the object with a to-be-recognized identity and the object with a known identity) between the two gait energy diagrams is calculated based on at least the fused gait feature vector, thereby implementing gait recognition on the object with a to-be-recognized identity.
  • the deep neural network used in the example embodiments not only extracts a fused posture feature vector of an object with a to-be-recognized identity and an object with a known identity, but also extracts identity information (including IDs and posture feature vectors) of the object with a to-be-recognized identity and the object with a known identity.
  • the fused posture feature vector depends on the posture feature vectors.
  • a similarity (that is, a similarity between the object with a to-be-recognized identity and the object with a known identity) between the two posture diagrams is calculated based on at least the fused posture feature vector, thereby implementing posture recognition on the object with a to-be-recognized identity.
  • FIG. 1 a to FIG. 1 d are example structural diagrams of a video processing application scenario according to example embodiments.
  • FIG. 2 a and FIG. 2 b are example structural diagrams of a video processing apparatus according to an example embodiment.
  • FIG. 2 c is an example structural diagram of a video processing device according to an example embodiment.
  • FIG. 3 , FIG. 6 , and FIG. 9 are example flowcharts of a video processing method according to an example embodiment.
  • FIG. 4 is a schematic diagram of a gait energy diagram according to an example embodiment.
  • FIG. 5 a to FIG. 5 c are schematic diagrams of extracting a gait feature vector according to an example embodiment.
  • FIG. 7 and FIG. 8 are schematic diagrams of a training process according to an example embodiment.
  • FIG. 10 is an example structural diagram of an image processing application scenario according to an example embodiment.
  • FIG. 11 a and FIG. 11 b are example structural diagrams of an image processing apparatus according to an example embodiment.
  • Gait recognition is an emerging biometric feature recognition technology, and aims to perform identity recognition according to people's walking postures.
  • Gait recognition has advantages including being capable of recognizing a user's identity without contact with the user, even from a long distance; walking being difficult to disguise; and not requiring high-definition image quality. Therefore, gait recognition is widely applied in various scenarios such as security protection, public security, and public transportation, and has tremendous application potential.
  • Example embodiments of the disclosure provide a media processing method and related apparatuses (e.g., a media processing device, a storage medium, and the like).
  • the media processing method and the related apparatuses are applicable to various scenarios (for example, intelligent video surveillance) in which a real-time or offline gait recognition technology service is provided.
  • For example, gait recognition may be used for data retrieval: media of a to-be-queried-for person (the object with a to-be-recognized identity) is provided (for example, when the media is a video, the media may be referred to as a first video or a to-be-processed video, and video frames in the first video include the object with a to-be-recognized identity), and a database is queried for a video (which may be referred to as a second video) of a specific person whose identity information is known (an object with a known identity) that is similar or most similar to the object included in the media.
  • the media processing method and the related apparatuses provided in the example embodiments of the disclosure implement gait recognition based on a deep neural network.
  • a video is mainly used as an optional presentation form of media for description.
  • this is merely an example and the disclosure is not limited thereto.
  • the deep neural network performs gait recognition based on gait energy diagrams. Therefore, before gait recognition is performed by using the deep neural network, a first gait energy diagram is pre-extracted from the first video, and a second gait energy diagram is pre-extracted from the second video; then the first gait energy diagram and the second gait energy diagram are inputted into the deep neural network, and a similarity between the two gait energy diagrams is outputted by the deep neural network as a similarity between the foregoing to-be-processed video and the second video.
  • the deep neural network extracts respective identity information of the two inputted gait energy diagrams and a fused gait feature vector of the two gait energy diagrams.
  • Identity information of either of the gait energy diagrams may include: an identifier (ID) of the gait energy diagram, and gait feature vectors extracted based on the gait energy diagram.
  • the fused gait feature vector of the two gait energy diagrams depends on respective gait feature vectors of the two gait energy diagrams.
  • the deep neural network calculates the similarity between the two gait energy diagrams according to at least the extracted fused gait feature vector.
  • the deep neural network not only extracts the fused gait feature vector of the object with a to-be-recognized identity and the object with a known identity, but also extracts the identity information (including IDs and the gait feature vectors) of the object with a to-be-recognized identity and the object with a known identity.
  • the fused gait feature vector depends on the gait feature vectors.
  • the similarity (that is, a similarity between the object with a to-be-recognized identity and the object with a known identity) between the two gait energy diagrams is calculated according to at least the fused gait feature vector, thereby implementing gait recognition on the object with a to-be-recognized identity.
  • the deep neural network includes neurons in a layered structure.
  • Each neuron layer includes a plurality of filters, and weights and offsets (filter parameters) therebetween may be obtained through training.
  • the deep neural network may alternatively be trained in advance, and parameters thereof may be adjusted. Descriptions are provided subsequently in this specification.
  • a video processing apparatus and a video processing device included in the example embodiments for implementing gait recognition in the disclosure are described below.
  • the video processing apparatus may be applied in the video processing device in a software and/or hardware form.
  • the video processing device may be a server or personal computer (PC) providing a gait recognition service, or may be a terminal such as a digital camera, a mobile terminal (for example, a smartphone), or an iPad.
  • the video processing apparatus When being applied in the video processing device in a software form, the video processing apparatus may be independent software.
  • the video processing apparatus may alternatively be used as a subsystem (e.g., child component) of a large-scale system (for example, an operating system), to provide a gait recognition service.
  • the video processing apparatus may be, for example, a controller/processor of a terminal or server.
  • FIG. 1 a to FIG. 1 d are example structural diagrams of a video processing application scenario according to example embodiments.
  • a camera 101 shoots a video of a pedestrian in motion (e.g., an object with a to-be-recognized identity), and provides the video to a video processing device 102 .
  • the video processing device 102 performs gait recognition based on videos of each object with a known identity in a database 103 .
  • the video processing device 102 needs to be provided with a module or an apparatus capable of extracting a gait energy diagram.
  • a video processing device 102 shoots a video of a pedestrian in motion (e.g., an object with a to-be-recognized identity), and performs gait recognition based on videos of each object with a known identity in a database 103 .
  • the video processing device 102 needs to be provided with a photographing apparatus and a module or apparatus capable of extracting a gait energy diagram.
  • an external device 104 provides a gait energy diagram or a video of an object with a to-be-recognized identity to a video processing device 102 .
  • the video processing device 102 performs gait recognition based on gait energy diagrams of each object with a known identity stored in a database 103 .
  • the external device 104 provides a video to the video processing device 102
  • the video processing device 102 needs to be provided with a module or an apparatus capable of extracting a gait energy diagram from the video.
  • a training device 105 may be further included in the foregoing scenarios. Functions of the training device 105 may alternatively be implemented by the video processing device 102 .
  • the training device 105 may be configured to train the deep neural network, or provide samples used for training.
  • In this scenario, a web server 106 , a video processing server 107 (that is, a video processing device), and a database 103 may be included.
  • a training server 108 (or a training device) may be further included.
  • the web server 106 is a front end (foreground), and is responsible for communicating with a client browser (the foregoing external device).
  • the video processing server 107 , the database 103 , the training server 108 , and the like are back ends.
  • the video processing server 107 may provide a video processing (e.g., gait recognition) service to the client browser.
  • the training server 108 may be configured to train a video processing algorithm used by the video processing server 107 (that is, training the deep neural network), or provide samples used for training.
  • An internal structure of the video processing apparatus is described below with reference to FIG. 2 a .
  • An example structure of the video processing apparatus is shown in FIG. 2 a , and includes: a first obtaining unit 11 and a gait recognition unit 12 .
  • The first obtaining unit 11 is configured to: obtain a to-be-processed video, generate a first gait energy diagram according to the to-be-processed video, and obtain a second gait energy diagram, the second gait energy diagram being generated according to a video including an object with a known identity.
  • the gait recognition unit 12 includes a deep neural network.
  • the deep neural network may be configured to perform first gait recognition on the first gait energy diagram and the second gait energy diagram provided by the first obtaining unit 11 .
  • the video processing apparatus may further include: a training unit 13 , configured to perform a training process.
  • FIG. 2 c shows a possible schematic structural diagram of the video processing device in the foregoing embodiment.
  • the video processing device includes a bus, a processor 1 , a memory 2 , a communications interface 3 , an input device 4 , and an output device 5 .
  • the processor 1 , the memory 2 , the communications interface 3 , the input device 4 , and the output device 5 are connected to each other through the bus.
  • the bus may include a path for transferring information between components of a computer system.
  • the processor 1 may be a general-purpose processor, for example, a general-purpose central processing unit (CPU), a network processor (NP), or a microprocessor, or may be an application-specific integrated circuit (ASIC) or one or more integrated circuits configured to control program execution in the solution in the disclosure.
  • the processor 1 may alternatively be a digital signal processor (DSP), a field programmable gate array (FPGA), or another programmable logic device, discrete gate, or a transistor logic device, or a discrete hardware component.
  • the memory 2 stores a program or script for executing the technical solution in the disclosure, and may further store an operating system and another key service.
  • the program may include program code, and the program code includes a computer operation instruction.
  • the script is normally stored as text (such as ASCII), and is interpreted or compiled only when invoked.
  • the memory 2 may include a read-only memory (ROM), another type of static storage device that may store static information and an instruction, a random access memory (RAM), another type of dynamic storage device that may store information and an instruction, a magnetic disk memory, a flash, or the like.
  • the input device 4 may include an apparatus for receiving data and information inputted by a user, for example, a keyboard, a mouse, a camera, a voice input apparatus, or a touch screen.
  • the output device 5 may include an apparatus allowed to output information to a user, for example, a display screen, or a speaker.
  • the communications interface 3 may include an apparatus using any transceiver or the like, to communicate with another device or a communications network such as an Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • FIG. 2 c shows only a simplified design of the video processing device.
  • the video processing device may include any quantity of transmitters, receivers, processors, controllers, memories, communications interfaces, and the like, and all servers/intelligent terminals that may implement the disclosure shall fall within the protection scope of the disclosure.
  • the processor 1 may implement the video processing method provided in the following example embodiments by executing the program stored in the memory 2 and invoking another device.
  • functions of the units of the video processing apparatus shown in FIG. 2 a and FIG. 2 b may be implemented by the processor 1 by executing the program stored in the memory 2 and invoking another device.
  • FIG. 3 shows an example flowchart of a video processing method performed by the foregoing video processing apparatus/device.
  • the method may include at least the following operations 300 - 303 :
  • Operation 300 Obtain a to-be-processed video, generate a first gait energy diagram according to the to-be-processed video, and obtain a second gait energy diagram.
  • Video frames in the to-be-processed video include an object with a to-be-recognized identity, and the second gait energy diagram is generated according to a video (second video) of an object with a known identity.
  • the first gait energy diagram and the second gait energy diagram may each include a uniquely corresponding ID.
  • An ID corresponding to a gait energy diagram may identify an identity of an object corresponding to the gait energy diagram.
  • FIG. 4 is a schematic diagram of a gait energy diagram according to an example embodiment.
  • a plurality of frames of gait silhouettes may be obtained according to video frames, and the plurality of frames of gait silhouettes are combined and normalized to obtain the gait energy diagram.
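  • The following is a minimal Python sketch of this combine-and-normalize step, assuming binary silhouettes that have already been aligned and cropped upstream; the function name and the [0, 1] normalization choice are illustrative, not prescribed by the disclosure:

    import numpy as np

    def gait_energy_image(silhouettes):
        # Average a sequence of aligned H x W binary gait silhouettes
        # (values in {0, 1}) to combine them into one gait energy diagram.
        stack = np.stack([np.asarray(s, dtype=np.float32) for s in silhouettes])
        gei = stack.mean(axis=0)
        # Normalize intensities into [0, 1].
        gei /= max(float(gei.max()), 1e-8)
        return gei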
  • An application scenario shown in FIG. 1 a is used as an example.
  • the camera 101 may provide the to-be-processed video to the video processing device 102 .
  • the video processing device 102 extracts a first gait energy diagram from the to-be-processed video, obtains a second video from the database 103 , and obtains a second gait energy diagram from the second video through extraction (or obtains the second gait energy diagram from the database 103 ).
  • An application scenario shown in FIG. 1 b is used as an example. After a camera of the video processing device 102 shoots a to-be-processed video, the video processing device 102 extracts a first gait energy diagram from the to-be-processed video, obtains a second video from the database 103 , and obtains a second gait energy diagram from the second video through extraction (or obtains the second gait energy diagram from the database 103 ).
  • An application scenario shown in FIG. 1 c is used as an example.
  • the video processing device 102 extracts a first gait energy diagram from the to-be-processed video, obtains a second video from the database 103 , and obtains a second gait energy diagram from the second video through extraction; or the external device 104 provides the first gait energy diagram to the video processing device 102 , and the video processing device 102 obtains the second video from a database 103 , and obtains a second gait energy diagram from the second video through extraction; or the external device 104 provides a first gait energy diagram to the video processing device 102 , and the video processing device 102 obtains a second gait energy diagram from a database 103 .
  • An application scenario shown in FIG. 1 d is used as an example.
  • the video processing server 107 extracts a first gait energy diagram from the to-be-processed video, obtains a second video from the database 103 , and obtains a second gait energy diagram from the second video through extraction; or the client provides a first gait energy diagram to the video processing server 107 , and the video processing server 107 obtains a second video from the database 103 , and obtains a second gait energy diagram from the second video through extraction; or the client provides a first gait energy diagram to the video processing server 107 , and the video processing server 107 obtains a second gait energy diagram from the database 103 .
  • operation 300 may be performed by the first obtaining unit 11 of the foregoing video processing apparatus; or the to-be-processed video provided by the external device or client may be received by the communications interface 3 of the foregoing video processing device; or the input device 4 (for example, a camera) shoots the to-be-processed video; or the processor 1 obtains the to-be-processed video from a gallery of the memory 2 .
  • the ID may be allocated by the first obtaining unit 11 or the processor 1 described above.
  • Operation 301 Perform first gait recognition on the first gait energy diagram and the second gait energy diagram according to a deep neural network.
  • The inputting of the gait energy diagrams in operation 301 may be performed by the first obtaining unit 11 of the foregoing video processing apparatus, or be performed by the processor 1 .
  • Operation 302 A Extract respective identity information of the first gait energy diagram and the second gait energy diagram and a fused gait feature vector of the first gait energy diagram and the second gait energy diagram.
  • Identity information of any gait energy diagram may include: gait feature vectors of the gait energy diagram. The identity information may further include an ID of the gait energy diagram.
  • identity information of the first gait energy diagram may include: gait feature vectors corresponding to the first gait energy diagram; and identity information of the second gait energy diagram may include: gait feature vectors corresponding to the second gait energy diagram.
  • the fused gait feature vector depends on a combination of respective gait feature vectors of the first gait energy diagram and the second gait energy diagram. A method of obtaining the fused gait feature vector is further described below in this specification.
  • the deep neural network may include an identity information extraction layer and a fused gait feature vector extraction layer.
  • the identity information extraction layer may include at least a first extraction layer and a second extraction layer.
  • the first extraction layer may extract first-level gait feature vectors of the inputted gait energy diagrams and input the first-level gait feature vectors into the second extraction layer.
  • the second extraction layer may extract respective second-level gait feature vectors of two gait energy diagrams (for example, the first gait energy diagram and the second gait energy diagram).
  • the fused gait feature vector extraction layer may fuse the second-level gait feature vectors of the two gait energy diagrams, to obtain a second-level fused gait feature vector.
  • Alternatively, the fused gait feature vector extraction layer may fuse the first-level gait feature vectors of the two gait energy diagrams (for example, the first gait energy diagram and the second gait energy diagram) inputted into the deep neural network, to obtain a first-level fused gait feature vector, and obtain a second-level fused gait feature vector through extraction according to the first-level fused gait feature vector.
  • the fused gait feature vector extraction layer may further include a fusion layer (configured to fuse the first-level gait feature vectors to obtain the first-level fused gait feature vector) and an extraction layer (configured to obtain the second-level fused gait feature vector through extraction according to the first-level fused gait feature vector).
  • the first extraction layer and the second extraction layer may be logical layers, and may further include a plurality of feature vector extraction layers, to extract image feature vectors.
  • a feature vector extracted by a feature vector extraction layer closer to input has a lower level
  • a feature vector extracted by a feature vector extraction layer closer to output has a higher level.
  • The identity information extraction layer may include two channels (each of the channels includes a first extraction layer and a second extraction layer), configured to respectively extract gait feature vectors of the two gait energy diagrams.
  • low-level gait feature vectors may be first extracted, and gait feature vectors of a higher level are extracted by combining the low-level gait feature vectors. Because the same operation needs to be performed on the two gait energy diagrams, two channels in the first extraction layer may share a weight.
  • a gait feature vector extracted by a first feature vector extraction layer in a channel has a lowest level, and generally is an edge, an angle, a curve, or the like (corresponding to cov-16, where “cov” represents convolution, 16 represents a quantity of filters, and the quantity of the filters determines dimensionality of the extracted gait feature vectors).
  • a second feature vector extraction layer is configured to extract a combined feature vector (corresponding to cov-64) of the gait feature vectors outputted by the first extraction layer, and the remaining layers may be deduced by analogy. Therefore, levels of extracted gait feature vectors progress from a low level to a middle level to a high/abstract (semantic) level, where in FIG. 5 c , “FC” represents a fully connected layer, “FC-2048” represents a fully connected layer having 2048 neurons, and an extracted feature vector has 2048 dimensions.
  • the low level and the middle level may be collectively referred to as a first level.
  • the fused gait feature vector extraction layer may string low-level gait feature vectors together to obtain a low-level fused gait feature vector, and further extract a fused gait feature vector of a higher level (a middle-level fused gait feature vector) until a high-level fused gait feature vector is obtained.
  • the fused gait feature vector extraction layer may string middle-level gait feature vectors together to obtain a middle-level fused gait feature vector, and further extract a fused gait feature vector of a higher level.
  • the fused gait feature vector extraction layer may directly string high-level gait feature vectors together to obtain a high-level fused gait feature vector.
  • low-level gait feature vectors and middle-level gait feature vectors are collectively referred to as first-level gait feature vectors. It may alternatively be considered that the first-level gait feature vectors include a final middle-level gait feature vector.
  • the first-level gait feature vectors may merely include the low-level gait feature vectors.
  • the first six layers of the network structure each extract respective gait feature vectors of a pair of gait energy diagrams, and subsequently the process is divided into two independent parts of operations.
  • the first part of operation includes: in the seventh-layer network structure, fusing the respective gait feature vectors, to obtain a fused gait feature vector, and further performing extraction of a higher level on the fused gait feature vector.
  • the second part of operation includes: continuing to extract respective gait feature vectors of the pair of gait energy diagrams, to obtain second-level gait feature vectors or final high-level gait feature vectors.
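  • The following is a hedged PyTorch sketch of the two-channel structure with a fusion branch described above. The quantities cov-16, cov-64, and FC-2048 come from the text; the kernel sizes, pooling, 64×64 input resolution, branch depths, and class name are assumptions for illustration:

    import torch
    import torch.nn as nn

    class GaitSiameseSketch(nn.Module):
        # Two shared-weight channels extract per-diagram gait feature vectors;
        # a fusion branch strings the first-level feature maps together and
        # extracts a fused gait feature vector from the concatenation.
        def __init__(self):
            super().__init__()
            self.first = nn.Sequential(              # shared first extraction layer
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # "cov-16"
                nn.Conv2d(16, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # "cov-64"
            )
            self.second = nn.Sequential(             # per-diagram second extraction layer
                nn.Flatten(), nn.Linear(64 * 16 * 16, 2048),                  # "FC-2048"
            )
            self.fuse = nn.Sequential(               # fusion + higher-level extraction
                nn.Flatten(), nn.Linear(2 * 64 * 16 * 16, 2048),
            )

        def forward(self, gei_a, gei_b):             # inputs: N x 1 x 64 x 64 GEIs
            fa, fb = self.first(gei_a), self.first(gei_b)      # first-level features
            va, vb = self.second(fa), self.second(fb)          # gait feature vectors
            fused = self.fuse(torch.cat([fa, fb], dim=1))      # fused gait feature vector
            return va, vb, fused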
  • the deep neural network may adjust parameters thereof through training in advance.
  • When the deep neural network in an example embodiment performs parameter adjustment during a training process, not only a fused gait feature vector of different gait energy diagrams is considered, but also implicit identity information of the gait energy diagrams is considered.
  • The deep neural network trained in this way may more effectively extract gait feature vectors that are more distinctive.
  • Because a fused gait feature vector depends on a combination of gait feature vectors of two gait energy diagrams, the fused gait feature vector is more distinctive, so that a more accurate similarity between the two gait energy diagrams (or a similarity between the object with a to-be-recognized identity and the object with a known identity) may be obtained.
  • Operation 302 B Calculate a similarity according to at least the extracted fused gait feature vector.
  • the similarity may be specifically a percentage, and represents a probability that the object with a to-be-recognized identity and the object with a known identity correspond to a same object. For example, if the similarity is 60%, it represents that there is a probability of 60% that the object with a to-be-recognized identity and the object with a known identity are the same person.
  • the similarity may be calculated according to only the fused gait feature vector.
  • Alternatively, a first similarity may be calculated according to the fused gait feature vector, and a second similarity may be calculated according to the identity information of the two gait energy diagrams; then weighted summation is performed on the first similarity and the second similarity (e.g., the simplest weighted summation adds up the first similarity and the second similarity and divides the sum by 2, to obtain an average value), to obtain a final similarity, as in the sketch below.
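  • As a small illustration of this weighted summation (the weight values are illustrative; the disclosure only requires that the two similarities be weighted and summed):

    def final_similarity(first_similarity, second_similarity, w1=0.5, w2=0.5):
        # Equal weights of 0.5 reproduce the simple average described above.
        return w1 * first_similarity + w2 * second_similarity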
  • Operation 302 A and operation 302 B are the first gait recognition performed by the deep neural network.
  • operation 302 A and operation 302 B may be performed by the gait recognition unit 12 of the video processing apparatus, or be performed by the processor 1 of the video processing device.
  • the deep neural network may include a similarity calculation layer. Operation 302 B may be performed by the similarity calculation layer.
  • Operation 303 The deep neural network outputs a recognition result.
  • the recognition result includes the similarity, or the recognition result includes information indicating whether the object with a to-be-recognized identity and the object with a known identity are the same object.
  • the recognition result may include the similarity.
  • The recognition result may also include information identifying whether the two inputted gait energy diagrams belong to the same object. For example, a value “1” may be used for representing that the two gait energy diagrams belong to the same object, and “0” may be used for representing that the two gait energy diagrams belong to different objects.
  • the deep neural network may output a recognition result each time after performing first gait recognition on a set of (two) gait energy diagrams.
  • the deep neural network may output a recognition result after completing a batch of first gait recognitions.
  • For example, given 10 second gait energy diagrams, the deep neural network may calculate similarities between a first gait energy diagram of an object A and the 10 second gait energy diagrams one by one. Only after the calculation is completed does the deep neural network output a recognition result, which may therefore include 10 similarities.
  • the recognition result may also include information identifying whether two gait energy diagrams belong to the same object.
  • the recognition result includes information indicating whether the object with a to-be-recognized identity and the object with a known identity are the same object.
  • the recognition result includes a probability that the first gait energy diagram and the second gait energy diagram belong to different objects. The probability may be calculated by using “1-similarity”. For example, if the similarity between the first gait energy diagram and the second gait energy diagram is 80%, then the probability that the first gait energy diagram and the second gait energy diagram belong to different objects is 20%.
  • If the similarity meets a recognition condition, it is determined that the first gait energy diagram and the second gait energy diagram correspond to the same object. That is, the unique ID corresponding to the second gait energy diagram may identify the identity of the object with a to-be-recognized identity. Otherwise, it is determined that the first gait energy diagram and the second gait energy diagram correspond to different objects.
  • the recognition condition includes: a similarity is not less than a similarity threshold or a similarity is greater than the similarity threshold.
  • For example, assuming that the similarity threshold is 80%: if the similarity between the two gait energy diagrams is 70%, it is considered that the object with a to-be-recognized identity and the object with a known identity are not the same person; if the similarity between the two gait energy diagrams is greater than (or equal to) 80%, the unique ID corresponding to the second gait energy diagram may identify the identity of the object with a to-be-recognized identity.
  • The database stores videos or gait energy diagrams of each object with a known identity. Therefore, in another embodiment of the disclosure, similarities between the first gait energy diagram and the second gait energy diagrams of the objects with known identities in the database may be calculated one by one, until a second gait energy diagram whose similarity with the first gait energy diagram meets the recognition condition is found, or until similarities between the first gait energy diagram and all second gait energy diagrams in the database have been calculated.
  • For example, an identity of an object A is to be recognized, and there are 10 videos of objects with known identities in the database. In sequence, similarities between a first gait energy diagram of the object A and the 10 corresponding second gait energy diagrams are calculated one by one, until a similarity meeting the recognition condition is found or all 10 similarities are calculated, as shown in the sketch below.
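  • A minimal Python sketch of this one-by-one search, assuming a similarity function standing in for the deep neural network and a threshold-based recognition condition (the names and the 80% default are illustrative):

    def search_gallery(probe_gei, gallery, similarity_fn, threshold=0.8):
        # gallery: dict mapping a known identity's ID to its gait energy diagram.
        scores = {}
        for identity, gallery_gei in gallery.items():
            s = similarity_fn(probe_gei, gallery_gei)
            scores[identity] = s
            if s >= threshold:            # recognition condition met: stop early
                return identity, scores   # this ID identifies the probe object
        return None, scores               # condition never met: no match found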
  • the deep neural network may alternatively output respective gait feature vectors of two gait energy diagrams.
  • the deep neural network may output respective gait feature vectors of gait energy diagrams, to facilitate calculation of a loss value.
  • the deep neural network in an example embodiment performs filter parameter adjustment according to identity information and a similarity during the training process. That is, during the parameter adjustment, not only a fused gait feature vector of different gait energy diagrams is considered, but also implicit identity information of the gait energy diagrams is considered. In this way, a gait feature vector that is more distinctive may be more effectively extracted. Because a fused gait feature vector depends on gait feature vectors of two gait energy diagrams, the fused gait feature vector is more distinctive, so that a more accurate similarity calculation between gait energy diagrams may be possible.
  • a method of training the deep neural network is described below.
  • a training or optimization process of the deep neural network may also be understood as a process of adjusting the filter parameters to minimize a loss value of a loss function (a smaller loss value means that a corresponding prediction/output result is closer to an actual result).
  • Most existing loss functions reflect only a classification loss, that is, they determine categories of two gait energy diagrams (a category herein distinguishes different people); they cannot ensure that gait feature vectors extracted for the same person are as similar to each other as possible, and that gait feature vectors extracted for different people are as far away (or different) from each other as possible. Therefore, it cannot be ensured that the extracted gait feature vectors are sufficiently distinctive.
  • a training objective of the training process includes: making gait feature vectors extracted from different gait energy diagrams of a same object be similar, and making gait feature vectors extracted from gait energy diagrams of different objects be far away from each other.
  • an example embodiment of the disclosure further provides new loss functions, to achieve the training objective through training.
  • the new loss functions include an identity information loss function and a fused gait feature vector loss function.
  • FIG. 6 and FIG. 7 show an example training process based on the new loss functions.
  • the process may include at least the following operations S 600 -S 605 .
  • S 600 Obtain a training sample. Each of the training samples may include n training subsamples, and any one of the training subsamples may include two (a pair of) gait energy diagrams of objects with known identities.
  • n may be a positive integer
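  • A possible in-memory representation of such samples (the class and field names are illustrative only, not prescribed by the disclosure):

    from dataclasses import dataclass
    from typing import List, Tuple
    import numpy as np

    @dataclass
    class TrainingSubsample:
        geis: Tuple[np.ndarray, np.ndarray]  # a pair of gait energy diagrams
        ids: Tuple[str, str]                 # known identities of the two diagrams

    @dataclass
    class TrainingSample:
        subsamples: List[TrainingSubsample]  # n subsamples, e.g., n = 3 below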
  • S 600 may be performed by the foregoing first obtaining unit 11 , the training unit 13 , or the processor 1 .
  • S 601 A deep neural network performs second gait recognition on each of the training subsamples in the training sample.
  • the second gait recognition may include: extracting respective identity information of two gait energy diagrams in the training subsample and a fused gait feature vector of the two gait energy diagrams, and calculating a similarity of the two gait energy diagrams according to at least the extracted fused gait feature vector.
  • the second gait recognition is similar to the foregoing first gait recognition. For specific details, reference may be made to descriptions of operation 302 A and operation 302 B, and the details are not described herein again.
  • S 601 may be performed by the foregoing gait recognition unit 12 , the training unit 13 , or the processor 1 .
  • S 602 Calculate, according to the identity information extracted in the gait recognition, an identity loss value of the training sample by using an identity information loss function.
  • a smaller identity loss value represents that gait feature vectors extracted from different gait energy diagrams of a same object are more similar, and gait feature vectors extracted from gait energy diagrams of different objects are farther away from each other.
  • S 603 Calculate, according to the extracted fused gait feature vector, a fused loss value of the training sample by using a fused gait feature vector loss function. Sequences for performing S 602 and S 603 may be interchanged, and S 602 and S 603 may alternatively be performed in parallel.
  • the filter parameters may be jointly adjusted once. Therefore, after final loss values of the training samples are respectively calculated, the filter parameters may be adjusted according to the final loss values.
  • S 602 to S 605 may be performed by the foregoing training unit 13 or the processor 1 .
  • a training objective that gait feature vectors extracted from different gait energy diagrams of a same object are similar, and gait feature vectors extracted from gait energy diagrams of different objects are far away from each other is used for training the deep neural network, so that extracted gait feature vectors of the same person may be as similar as possible, and extracted gait feature vectors from different people may be as far away from each other as possible. Therefore, the extracted gait feature vectors are distinctive, so that more accurate similarity calculation between gait energy diagrams may be possible.
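  • A hedged sketch of one joint parameter adjustment combining the two loss terms follows; the network interface, the loss-function callables (assumed to return tensors), and the unweighted sum are assumptions for illustration:

    import torch

    def training_step(net, optimizer, gei_pairs, identity_loss_fn, fused_loss_fn):
        # gei_pairs: the pairs of gait energy diagrams in one training sample.
        optimizer.zero_grad()
        outputs = [net(a, b) for a, b in gei_pairs]   # second gait recognition per subsample
        loss = identity_loss_fn(outputs) + fused_loss_fn(outputs)  # final loss value
        loss.backward()                               # gradients w.r.t. filter parameters
        optimizer.step()                              # adjust the filter parameters
        return loss.item()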
  • a training sample may include n training subsamples.
  • An example in which one training sample includes 3 subsamples (that is, includes 3 pairs of gait energy diagrams) is used below for describing the training process more specifically.
  • FIG. 8 shows a training architecture.
  • FIG. 9 shows an example training process based on the new loss functions. The process may include at least the following operations S 900 -S 906 :
  • each training sample includes a first training subsample, a second training subsample, and a third training subsample (first, second, and third are only used for distinguishing, and do not represent a sequence of being inputted into a deep neural network).
  • a combination manner of the first training subsample, the second training subsample, and the third training subsample may include:
  • a first combination manner: two gait energy diagrams in the first training subsample corresponding to a same object; two gait energy diagrams in the second training subsample corresponding to different objects; and two gait energy diagrams in the third training subsample corresponding to different objects;
  • a second combination manner: two gait energy diagrams in the first training subsample corresponding to a same object; two gait energy diagrams in the second training subsample corresponding to the same object; and two gait energy diagrams in the third training subsample corresponding to different objects.
  • S 900 may be performed by the foregoing first obtaining unit 11 , the training unit 13 , or the processor 1 .
  • S 901 A deep neural network performs second gait recognition on each of the training subsamples in the training sample.
  • S 901 may be performed by the foregoing gait recognition unit 12 , the training unit 13 , or the processor 1 .
  • S 902 Calculate an identity loss value of the training sample by using a first identity loss function in a case that the combination manner of the first training subsample, the second training subsample, and the third training subsample in the training sample is the first combination manner.
  • In the first identity loss function, Lu represents the identity loss value; a coefficient whose value range is from 0 to 1 weights the terms; ∥·∥₂² represents a squared Euclidean distance; and p, g, p′, g′, p″, and g″ represent IDs of gait energy diagrams.
  • Xp and Xg represent a pair of gait energy diagrams in the first training subsample (Xp may also be referred to as a first gait energy diagram, and Xg may also be referred to as a second gait energy diagram);
  • Xp′ and Xg′ represent a pair of gait energy diagrams in the second training subsample (Xp′ may also be referred to as a third gait energy diagram, and Xg′ may also be referred to as a fourth gait energy diagram);
  • Xp″ and Xg″ represent a pair of gait energy diagrams in the third training subsample (Xp″ may also be referred to as a fifth gait energy diagram, and Xg″ may also be referred to as a sixth gait energy diagram).
  • ∥U(Xp) − U(Xg)∥₂² in the first identity loss function represents a Euclidean distance between the two gait feature vectors in the first training subsample. Because Xp and Xg correspond to a same object, to make gait feature vectors extracted from different gait energy diagrams of the same object be similar, ∥U(Xp) − U(Xg)∥₂² is made as small as possible (approaching 0) by adjusting filter parameters.
  • Because Xp′ and Xg′ correspond to different objects, to make gait feature vectors extracted from gait energy diagrams of different objects be far away from each other, ∥U(Xp′) − U(Xg′)∥₂² is made as large as possible (approaching 1) by adjusting the filter parameters.
  • the first identity loss function reflects the training objective: gait feature vectors extracted from different gait energy diagrams of a same object are similar, and gait feature vectors extracted from gait energy diagrams of different objects are far away from each other.
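  • A hedged sketch of this pull-together/push-apart behavior follows; the exact functional form and the coefficient value used in the disclosure may differ:

    import torch

    def first_identity_loss_sketch(u_p, u_g, u_p1, u_g1, coeff=0.5):
        # Xp and Xg share an object: minimize their squared Euclidean distance.
        d_same = torch.sum((u_p - u_g) ** 2)
        # Xp' and Xg' come from different objects: their distance should grow,
        # so it enters negatively, scaled by a coefficient in (0, 1).
        d_diff = torch.sum((u_p1 - u_g1) ** 2)
        return d_same - coeff * d_diff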
  • S 903 Calculate an identity loss value of the training sample by using a second identity loss function in a case that the combination manner of the first training subsample, the second training subsample, and the third training subsample in the training sample is the second combination manner.
  • Xp and Xg correspond to a same object
  • Xp′ and Xg′ correspond to a same object
  • Xp and Xp″ correspond to different objects, so that ∥U(Xp″) − U(Xg″)∥₂² is expected to be as large as possible, and is to be used as a minuend.
  • the second identity loss function also reflects the training objective: gait feature vectors extracted from different gait energy diagrams of a same object are similar, and gait feature vectors extracted from gait energy diagrams of different objects are far away from each other.
  • a fused loss subvalue corresponding to each of the training subsamples may be calculated, and then, the fused loss subvalues of the training subsamples are accumulated, to obtain the fused loss value.
  • the fused gait feature vector loss function may have a plurality of presentation forms.
  • If the two gait energy diagrams in a pair are from a same object, a true label distribution thereof is “1, 0”, where “1” represents that a probability that the two gait energy diagrams are from a same object is 100%, and “0” represents that a probability that the two gait energy diagrams are from different objects is 0%.
  • If the two gait energy diagrams in a pair are from different objects, a true label distribution thereof is “0, 1”, where “0” represents that a probability that the two gait energy diagrams are from a same object is 0%, and “1” represents that a probability that the two gait energy diagrams are from different objects is 100%.
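  • In code form, this rule is a direct restatement of the two cases above:

    def true_label_distribution(id_a, id_b):
        # "1, 0": the two gait energy diagrams come from the same object;
        # "0, 1": they come from different objects.
        return (1.0, 0.0) if id_a == id_b else (0.0, 1.0)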
  • the fused gait feature vector loss function may include a first fused gait feature vector loss function and a second fused gait feature vector loss function according to the foregoing different combination manners (e.g., the first combination manner and the second combination manner).
  • In the fused gait feature vector loss functions, Lc represents the fused loss value; two weight coefficients, each with a value range from 0 to 1, scale the terms; a relaxation factor with a value range from 0 to 1 is used; and max(·, 0) (also written ∥·∥₊) represents comparing a value with 0 and selecting the greater of the two. x_pg represents a pair of gait energy diagrams including a gait energy diagram p and a gait energy diagram g; by analogy, x_p′g′ and x_p″g″ also represent pairs of gait energy diagrams. C(·) is a probability calculation function, used for calculating a probability that two pairs of gait energy diagrams have a same label distribution; using C(x_pg, x_p″g″) as an example, a probability that the pair of gait energy diagrams x_pg and the pair of gait energy diagrams x_p″g″ have a same label distribution is calculated.
  • If the two gait energy diagrams in a pair correspond to a same object, a label distribution thereof is “1, 0”; otherwise, a label distribution thereof is “0, 1”.
  • For example, if the gait energy diagram p and the gait energy diagram g correspond to a same object, a true label distribution of x_pg is “1, 0”; and if the gait energy diagram p and the gait energy diagram g correspond to different objects, a label distribution of x_pg is “0, 1”. Similarly, a true label distribution of another pair of gait energy diagrams may be deduced.
  • D[·] represents a Euclidean distance. Using D[C(x_pg, x_p″g″), C(x_pg, x_pg″)] as an example, a distance between a probability a and a probability b is calculated.
  • The probability a is a probability that x_pg and x_p″g″ have a same label distribution.
  • The probability b is a probability that x_pg and x_pg″ have a same label distribution.
  • If x_pg and x_p″g″ have a same label distribution and x_pg and x_pg″ also have a same label distribution, D[C(x_pg, x_p″g″), C(x_pg, x_pg″)] shall be 0.
  • If x_pg and x_p″g″ have a same label distribution (for example, both being “1, 0” or both being “0, 1”) but x_pg and x_pg″ have different label distributions, or x_pg and x_p″g″ have different label distributions but x_pg and x_pg″ have a same label distribution, a greater D[C(x_pg, x_p″g″), C(x_pg, x_pg″)] is more desirable.
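  • A hedged sketch of one such D[·] term with the relaxation factor follows; the squared-difference distance, the hinge form via max(·, 0), and the margin value are assumptions about notation the extraction garbled, not the disclosure's exact formula:

    def fused_distance_term(c_ab, c_ac, relations_agree, margin=0.5):
        # c_ab = C(x_pg, x_p"g") and c_ac = C(x_pg, x_pg"): probabilities that
        # the two pairs in each comparison share a label distribution.
        d = (c_ab - c_ac) ** 2               # distance between the probabilities
        if relations_agree:
            return d                         # both comparisons agree: drive D to 0
        return max(margin - d, 0.0)          # they disagree: push D above the margin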
  • p, g, and p′ correspond to a same object; “p, g′, p″, and g″”, “g, g′, p″, and g″”, or “g′, p″, and g″” correspond to different objects.
  • Therefore, any two gait energy diagrams may be selected from p, g, and p′ for combination, to obtain a pair of gait energy diagrams whose label distribution is “1, 0”; and similarly, any two gait energy diagrams may be selected from “p, g′, p″, and g″”, “g, g′, p″, and g″”, or “g′, p″, and g″” for combination, to obtain a pair of gait energy diagrams whose label distribution is “0, 1”.
  • the pairs of gait energy diagrams are filled into different positions of a C function, so that another first fused gait feature vector loss function may be obtained.
  • D[·] calculates a distance between probabilities that two calculation samples have a same label distribution.
  • When the two calculation samples do not have a same label-distribution relation, a first probability corresponding to the first calculation sample is far away from a second probability corresponding to the second calculation sample; otherwise, the first probability is close to the second probability.
  • the first probability is a probability that the two pairs of gait energy diagrams in the first calculation sample have a same label distribution
  • the second probability is a probability that the two pairs of gait energy diagrams in the second calculation sample have a same label distribution.
  • any two gait energy diagrams may be selected from p, g, p′, and g′ for combination, to obtain a pair of gait energy diagrams whose label distribution is “1, 0”; and similarly, any two gait energy diagrams may be selected from “p, p′′, and g′′” or “g, p′′, and g′′” for combination, to obtain a pair of gait energy diagrams whose label distribution is “0, 1”.
  • the pairs of gait energy diagrams are filled into different positions of the C function, and another second fused gait feature vector loss function may be obtained.
  • On the one hand, the fused gait feature vector loss function may classify each set of gait energy diagrams; on the other hand, according to features of every two sets of gait energy diagrams, the fused gait feature vector loss function may make feature vectors as close as possible if two sets of gait energy diagrams are from a same category, and as far away as possible if the two sets of gait energy diagrams are from different categories.
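For illustration only, the following minimal NumPy sketch mirrors the structure described above: a head C(·) estimates the probability that two pairs of gait energy diagrams share a label distribution, D[*] is the Euclidean distance between two such probabilities, and {*}+ clips at zero. The feature dimension, the weights W, the margin, and all sample features are assumptions made for the sketch, not parameters disclosed in the embodiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each gait energy diagram is already encoded as a
# 64-dimensional feature vector by the deep neural network.
FEAT_DIM = 64
W = rng.normal(scale=0.1, size=(2, 4 * FEAT_DIM))  # toy weights for the C head

def pair_feature(feat_a, feat_b):
    """Fused feature for a pair of gait energy diagrams (x_pg and the like)."""
    return np.concatenate([feat_a, feat_b])

def C(pair_x, pair_y):
    """Probability that two pairs have a same label distribution, modeled here
    as a 2-way softmax head (an assumption, not the patent's exact form)."""
    logits = W @ np.concatenate([pair_x, pair_y])
    exp = np.exp(logits - logits.max())
    return exp[0] / exp.sum()

def relation_loss(prob_a, prob_b, relations_match, margin=1.0):
    """D[*] is the Euclidean distance between probabilities a and b; {*}+ = max(*, 0).
    Drive D toward 0 when both comparisons encode the same relation, and
    above the margin when they encode different relations."""
    d = abs(prob_a - prob_b)  # Euclidean distance in one dimension
    return d if relations_match else max(margin - d, 0.0)

# p, g, p2 come from one object; pp, gg come from other objects (toy features).
p, g, p2, pp, gg = (rng.normal(size=FEAT_DIM) for _ in range(5))
a = C(pair_feature(p, g), pair_feature(pp, gg))  # C(x_pg, x_p''g'')
b = C(pair_feature(p, g), pair_feature(p, gg))   # C(x_pg, x_pg'')
# Both x_p''g'' and x_pg'' mix different objects while x_pg matches, so the
# two comparisons encode the same relation and D[...] should approach 0:
print(relation_loss(a, b, relations_match=True))
```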
  • S 905 is similar to the foregoing S 604 , and details are not described herein again.
  • S 906 is similar to the foregoing S 605 , and details are not described herein again.
  • S 902 to S 905 may be performed by the foregoing training unit 13 or the processor 1 .
  • F1 to F3 represent pairs of gait energy diagrams
  • D(*) in D(C(F1), C(F2)) represents a distance
  • C represents a probability calculation function
  • ID(·) represents a probability of being from different objects.
  • gait feature vectors of gait energy diagrams may be extracted by using the deep neural network, and then are fused. Subsequently, the to-be-trained deep neural network is adjusted by using a loss function: On the one hand, each set of gait energy diagrams is classified; on the other hand, according to features of every two sets of gait energy diagrams, feature vectors are made as close as possible if two sets of gait energy diagrams are from a same category, and feature vectors are made as far away from each other as possible if two sets of gait energy diagrams are from different categories. After training of the network is completed, the trained deep neural network may be used for recognizing gaits.
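The sketch below illustrates how the two objectives just described can be combined in one PyTorch training step; the backbone, layer sizes, margin, and batch data are hypothetical stand-ins for the deep neural network, not the disclosed architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaitNet(nn.Module):
    """Hypothetical backbone: maps a 1x64x64 gait energy diagram to a feature
    vector and classifies a fused pair as same/different object."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fc = nn.Linear(32, feat_dim)
        self.pair_classifier = nn.Linear(2 * feat_dim, 2)

    def features(self, x):
        return self.fc(self.conv(x))

def training_step(net, diagrams_a, diagrams_b, same_object, margin=1.0):
    """One step over a batch of gait-energy-diagram pairs.
    same_object: 1 where both diagrams come from one object, else 0."""
    fa, fb = net.features(diagrams_a), net.features(diagrams_b)
    fused = torch.cat([fa, fb], dim=1)  # fused gait feature vector
    # Objective 1: classify each set of gait energy diagrams ("1, 0" vs "0, 1").
    cls_loss = F.cross_entropy(net.pair_classifier(fused), same_object)
    # Objective 2: pull feature vectors together for a same category,
    # push them apart (up to a margin) for different categories.
    d = F.pairwise_distance(fa, fb)
    pull_push = torch.where(same_object.bool(), d, (margin - d).clamp(min=0)).mean()
    return cls_loss + pull_push

# Toy usage with random batches of 64x64 diagrams:
net = GaitNet()
a, b = torch.randn(8, 1, 64, 64), torch.randn(8, 1, 64, 64)
labels = torch.randint(0, 2, (8,))
loss = training_step(net, a, b, labels)
loss.backward()
```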
  • the example embodiments of the disclosure further provide a video processing device.
  • the video processing device includes at least a processor and a memory, the processor performing the foregoing video processing method by executing a program stored in the memory and invoking another device.
  • the example embodiments of the disclosure further provide a storage medium.
  • the storage medium stores a plurality of instructions, the instructions being configured to be loaded by a processor to perform operations in the video processing method provided in any embodiment of the disclosure.
  • A gait is one type of posture. Therefore, the example embodiments of the disclosure further provide an image processing method, an image processing apparatus, an image processing device, and a storage medium, to implement posture recognition.
  • the image processing method includes: obtaining a first posture energy diagram or a first posture diagram of an object with a to-be-recognized identity; obtaining a second posture energy diagram or a second posture diagram of an object with a known identity; and performing the first posture recognition based on these diagrams.
  • The identity information and the fused posture feature vector in an example embodiment are similar to the foregoing identity information and the foregoing fused gait feature vector, and details are not described herein again.
  • The object with a to-be-recognized identity may be a human, an animal, or even a moving or stationary object that does not have life.
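For reference, an energy diagram of this kind is commonly constructed by averaging aligned binary silhouettes over a cycle. The sketch below assumes that common construction (the embodiments do not prescribe this exact recipe) and uses hypothetical silhouette data.

```python
import numpy as np

def energy_diagram(silhouettes):
    """A common construction (an assumption here, not the patent's exact
    definition): average aligned, binarized silhouettes over one cycle,
    giving a per-pixel 'energy' in [0, 1]."""
    stack = np.stack(silhouettes).astype(np.float32)  # shape (T, H, W)
    return stack.mean(axis=0)

# Toy example with 30 hypothetical 64x64 binary silhouettes:
rng = np.random.default_rng(1)
sils = [(rng.random((64, 64)) > 0.5).astype(np.float32) for _ in range(30)]
ged = energy_diagram(sils)
print(ged.shape, float(ged.min()), float(ged.max()))
```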
  • the image processing apparatus may be applied in the image processing device in a software or hardware form.
  • The image processing device may be a server or a PC providing a gait recognition service, or may be a terminal such as a digital camera, a mobile terminal (for example, a smartphone), or an iPad.
  • When being applied in the image processing device in a software form, the image processing apparatus may be independent software.
  • The image processing apparatus may also be used as a subsystem (child component) of a large-scale system (such as an operating system), to provide a gait recognition service.
  • the image processing apparatus may be, for example, a controller/processor of a terminal or a server.
  • FIG. 10 is an example structural diagram of an image processing application scenario according to an example embodiment.
  • An image processing device 1001 obtains a first posture energy diagram of an object with a to-be-recognized identity, and performs first posture recognition based on second posture energy diagrams of objects with known identities in a database 1002.
  • a training device 1003 may be further included in the foregoing scenarios. Functions of the training device 1003 may alternatively be implemented by the image processing device 1001 .
  • the training device 1003 may be configured to train the deep neural network, or to provide samples used for training.
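For illustration, the matching in this scenario can be reduced to a nearest-neighbor search over stored feature vectors. In the sketch below, the gallery dictionary, feature dimension, and acceptance threshold are all hypothetical.

```python
import numpy as np

def recognize(probe_feature, gallery, threshold=10.0):
    """Nearest-neighbor identity lookup over stored feature vectors.
    `gallery` maps identity -> feature vector; `threshold` is a hypothetical
    acceptance bound, not a value from the disclosure."""
    best_id, best_dist = None, float("inf")
    for identity, feature in gallery.items():
        dist = float(np.linalg.norm(probe_feature - feature))
        if dist < best_dist:
            best_id, best_dist = identity, dist
    return (best_id, best_dist) if best_dist <= threshold else (None, best_dist)

# Toy usage with hypothetical 128-dimensional features:
rng = np.random.default_rng(2)
gallery = {f"person_{i}": rng.normal(size=128) for i in range(5)}
probe = gallery["person_3"] + 0.01 * rng.normal(size=128)
print(recognize(probe, gallery))
```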
  • An example structure of the image processing apparatus is shown in FIG. 11 , and includes: a second obtaining unit 111 and a posture recognition unit 112.
  • the second obtaining unit 111 is configured to obtain the first posture energy diagram and the second posture energy diagram.
  • the posture recognition unit 112 includes a deep neural network.
  • the deep neural network may be configured to perform first posture recognition on the first posture energy diagram and the second posture energy diagram provided by the second obtaining unit 111 .
  • the image processing apparatus may further include: a training unit 113 , configured to perform a training process.
  • the training process may relate to second posture recognition.
  • the second posture recognition is similar to the first posture recognition, and details are not described herein again.
  • The training process in an example embodiment is similar to the training process of the foregoing embodiments; a training objective in an example embodiment is similar to the training objective of the foregoing embodiments; and the formulas are also similar. Details are not described herein again.
  • For another possible schematic structural diagram of the image processing device, reference may be made to FIG. 2 c , and details are not described herein again.
  • the example embodiments of the disclosure further provide an image processing device.
  • the image processing device includes at least a processor and a memory, the processor performing the foregoing image processing method by executing a program stored in the memory and invoking another device.
  • the example embodiments of the disclosure further provide a storage medium.
  • the storage medium stores a plurality of instructions, the instructions being configured to be loaded by a processor to perform operations in the image processing method provided in the example embodiments of the disclosure.
  • the operations of the method or algorithm described with reference to the disclosed embodiments in this specification may be implemented directly by using hardware, a software unit executed by a processor, or a combination thereof.
  • the software unit may be set in a random access memory (RAM), a memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable and programmable ROM, a register, a hard disk, a removable magnetic disk, a CD-ROM, or any storage medium in other forms well-known in the technical field.
  • At least one of the components, elements, modules or units described herein may be embodied as various numbers of hardware, software and/or firmware structures that execute respective functions described above, according to an example embodiment.
  • at least one of these components, elements or units may use a direct circuit structure, such as a memory, a processor, a logic circuit, a look-up table, etc. that may execute the respective functions through controls of one or more microprocessors or other control apparatuses.
  • at least one of these components, elements or units may be specifically embodied by a module, a program, or a part of code, which contains one or more executable instructions for performing specified logic functions, and which is executed by one or more microprocessors or other control apparatuses.
  • At least one of these components, elements or units may further include or be implemented by a processor such as a central processing unit (CPU) that performs the respective functions, a microprocessor, or the like.
  • Two or more of these components, elements or units may be combined into one single component, element or unit which performs all operations or functions of the combined two or more components, elements or units.
  • at least part of the functions of at least one of these components, elements or units may be performed by another of these components, elements or units.
  • Although a bus is not illustrated in the block diagrams, communication between the components, elements or units may be performed through the bus.
  • Functional aspects of the above example embodiments may be implemented in algorithms that execute on one or more processors.
  • the components, elements or units represented by a block or processing steps may employ any number of related art techniques for electronics configuration, signal processing and/or control, data processing and the like.

US16/903,929 2018-04-12 2020-06-17 Media processing method, related apparatus, and storage medium Active 2039-08-03 US11335127B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201810327638.4A CN110378170B (zh) 2018-04-12 2018-04-12 视频处理方法及相关装置,图像处理方法及相关装置
CN201810327638.4 2018-04-12
PCT/CN2019/079156 WO2019196626A1 (zh) 2018-04-12 2019-03-22 媒体处理方法及相关装置

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/079156 Continuation WO2019196626A1 (zh) 2018-04-12 2019-03-22 媒体处理方法及相关装置

Publications (2)

Publication Number Publication Date
US20200320284A1 US20200320284A1 (en) 2020-10-08
US11335127B2 true US11335127B2 (en) 2022-05-17

Family

ID=68163919

Family Applications (1)

Application Number Title Priority Date Filing Date
US16/903,929 Active 2039-08-03 US11335127B2 (en) 2018-04-12 2020-06-17 Media processing method, related apparatus, and storage medium

Country Status (5)

Country Link
US (1) US11335127B2 (de)
EP (1) EP3779775B1 (de)
JP (1) JP7089045B2 (de)
CN (3) CN110378170B (de)
WO (1) WO2019196626A1 (de)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378170B (zh) * 2018-04-12 2022-11-08 腾讯科技(深圳)有限公司 视频处理方法及相关装置,图像处理方法及相关装置
CN111310587B (zh) * 2020-01-19 2023-04-28 中国计量大学 一种基于渐弱运动轨迹图的步态特征表示和特征提取方法
CN111340090B (zh) * 2020-02-21 2023-08-01 每日互动股份有限公司 图像特征比对方法及装置、设备、计算机可读存储介质
CN113657169B (zh) * 2021-07-19 2023-06-20 浙江大华技术股份有限公司 一种步态识别方法、装置、系统和计算机可读存储介质
CN114140873A (zh) * 2021-11-09 2022-03-04 武汉众智数字技术有限公司 一种基于卷积神经网络多层次特征的步态识别方法
CN114627424A (zh) * 2022-03-25 2022-06-14 合肥工业大学 一种基于视角转化的步态识别方法和系统

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104299012A (zh) * 2014-10-28 2015-01-21 中国科学院自动化研究所 一种基于深度学习的步态识别方法
US20150205997A1 (en) * 2012-06-25 2015-07-23 Nokia Corporation Method, apparatus and computer program product for human-face features extraction
CN105574510A (zh) 2015-12-18 2016-05-11 北京邮电大学 一种步态识别方法及装置
CN106503687A (zh) 2016-11-09 2017-03-15 合肥工业大学 融合人脸多角度特征的监控视频人物身份识别系统及其方法
CN106951753A (zh) 2016-01-06 2017-07-14 北京三星通信技术研究有限公司 一种心电信号的认证方法和认证装置
CN107085716A (zh) 2017-05-24 2017-08-22 复旦大学 基于多任务生成对抗网络的跨视角步态识别方法
US20170243058A1 (en) 2014-10-28 2017-08-24 Watrix Technology Gait recognition method based on deep learning

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101571924B (zh) 2009-05-31 2011-04-06 北京航空航天大学 一种多区域特征融合的步态识别方法及系统
CN101630364A (zh) 2009-08-20 2010-01-20 天津大学 基于融合特征的步态信息处理与身份识别方法
CN101807245B (zh) 2010-03-02 2013-01-02 天津大学 基于人工神经网络的多源步态特征提取与身份识别方法
GB201113143D0 (en) * 2011-07-29 2011-09-14 Univ Ulster Gait recognition methods and systems
WO2013177060A2 (en) * 2012-05-20 2013-11-28 Trustees Of Boston University Methods and systems for monitoring, diagnosing, and treating chronic obstructive polmonary disease
CN103942577B (zh) * 2014-04-29 2018-08-28 上海复控华龙微系统技术有限公司 视频监控中基于自建立样本库及混合特征的身份识别方法
CN104794449B (zh) * 2015-04-27 2017-11-28 青岛科技大学 基于人体hog特征的步态能量图获取及身份识别方法
US10014967B2 (en) * 2015-11-23 2018-07-03 Huami Inc. System and method for authenticating a broadcast device using facial recognition
CN106529499A (zh) * 2016-11-24 2017-03-22 武汉理工大学 基于傅里叶描述子和步态能量图融合特征的步态识别方法
CN106919921B (zh) * 2017-03-06 2020-11-06 重庆邮电大学 结合子空间学习与张量神经网络的步态识别方法及系统
CN107122711A (zh) * 2017-03-20 2017-09-01 东华大学 一种基于角度径向变换和质心的夜视视频步态识别方法
CN107590452A (zh) * 2017-09-04 2018-01-16 武汉神目信息技术有限公司 一种基于步态与人脸融合的身份识别方法及装置
CN107423730B (zh) * 2017-09-20 2024-02-13 湖南师范大学 一种基于语义折叠的人体步态行为主动检测识别系统和方法
CN110378170B (zh) * 2018-04-12 2022-11-08 腾讯科技(深圳)有限公司 视频处理方法及相关装置,图像处理方法及相关装置

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150205997A1 (en) * 2012-06-25 2015-07-23 Nokia Corporation Method, apparatus and computer program product for human-face features extraction
CN104299012A (zh) * 2014-10-28 2015-01-21 中国科学院自动化研究所 一种基于深度学习的步态识别方法
US20170243058A1 (en) 2014-10-28 2017-08-24 Watrix Technology Gait recognition method based on deep learning
CN105574510A (zh) 2015-12-18 2016-05-11 北京邮电大学 一种步态识别方法及装置
US9633268B1 (en) * 2015-12-18 2017-04-25 Beijing University Of Posts And Telecommunications Method and device for gait recognition
CN106951753A (zh) 2016-01-06 2017-07-14 北京三星通信技术研究有限公司 一种心电信号的认证方法和认证装置
CN106503687A (zh) 2016-11-09 2017-03-15 合肥工业大学 融合人脸多角度特征的监控视频人物身份识别系统及其方法
CN107085716A (zh) 2017-05-24 2017-08-22 复旦大学 基于多任务生成对抗网络的跨视角步态识别方法

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Extended European Search Report dated Nov. 26, 2021, issued by the European Patent Office in application No. 19784939.1.
International Search Report for PCT/CN2019/079156 dated Jun. 20, 2019 (PCT/ISA/210).
Written Opinion of the International Searching Authority dated Jun. 20, 2019, in English, in International Application No. PCT/CN2019/079156.

Also Published As

Publication number Publication date
JP7089045B2 (ja) 2022-06-21
CN110472622B (zh) 2022-04-22
JP2021515321A (ja) 2021-06-17
EP3779775A4 (de) 2021-12-29
CN110378170A (zh) 2019-10-25
CN110443232B (zh) 2022-03-25
EP3779775B1 (de) 2024-02-28
CN110472622A (zh) 2019-11-19
US20200320284A1 (en) 2020-10-08
WO2019196626A1 (zh) 2019-10-17
EP3779775A1 (de) 2021-02-17
CN110378170B (zh) 2022-11-08
CN110443232A (zh) 2019-11-12

Similar Documents

Publication Publication Date Title
US11335127B2 (en) Media processing method, related apparatus, and storage medium
US11354901B2 (en) Activity recognition method and system
CN110188641B (zh) 图像识别和神经网络模型的训练方法、装置和系统
Zhang et al. MoWLD: a robust motion image descriptor for violence detection
Gao et al. Discriminative multiple canonical correlation analysis for information fusion
CN106415594B (zh) 用于面部验证的方法和系统
US20180261071A1 (en) Surveillance method and system based on human behavior recognition
EP3757873A1 (de) Gesichtserkennungsverfahren und -vorrichtung
CN110348362B (zh) 标签生成、视频处理方法、装置、电子设备及存储介质
CN109635643B (zh) 一种基于深度学习的快速人脸识别方法
Wang et al. Abnormal behavior detection in videos using deep learning
WO2023123923A1 (zh) 人体重识别方法、人体重识别装置、计算机设备及介质
Huang et al. A high-efficiency and high-accuracy fully automatic collaborative face annotation system for distributed online social networks
CN113128526B (zh) 图像识别方法、装置、电子设备和计算机可读存储介质
CN114677611B (zh) 数据识别方法、存储介质及设备
CN115705706A (zh) 视频处理方法、装置、计算机设备和存储介质
Zhao et al. Research on face recognition based on embedded system
Gao et al. Data-driven lightweight interest point selection for large-scale visual search
CN112380369B (zh) 图像检索模型的训练方法、装置、设备和存储介质
CN115578765A (zh) 目标识别方法、装置、系统及计算机可读存储介质
US11200407B2 (en) Smart badge, and method, system and computer program product for badge detection and compliance
CN116457795A (zh) 用于提供计算高效神经网络的设备和方法
KR102060110B1 (ko) 컨텐츠에 포함되는 객체를 분류하는 방법, 장치 및 컴퓨터 프로그램
He et al. Latent variable pictorial structure for human pose estimation on depth images
Hristov et al. Personal Identification Based Automatic Face Annotation in Multi-View Systems

Legal Events

Date Code Title Description
AS Assignment

Owner name: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LTD, CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZHANG, KAIHAO;LUO, WENHAN;MA, LIN;AND OTHERS;SIGNING DATES FROM 20200525 TO 20200527;REEL/FRAME:052965/0508

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE