CN111223127B - Human body joint point-based 2D video multi-person tracking method, system, medium and equipment - Google Patents

Human body joint point-based 2D video multi-person tracking method, system, medium and equipment

Info

Publication number
CN111223127B
Authority
CN
China
Prior art keywords
frame
similarity
similarity relation
joint point
identity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010045947.XA
Other languages
Chinese (zh)
Other versions
CN111223127A (en)
Inventor
金雪梅
彭琪钧
朱绘霖
曹伟
陈佳佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Original Assignee
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University
Priority to CN202010045947.XA
Publication of CN111223127A
Application granted
Publication of CN111223127B
Active legal status
Anticipated expiration legal status

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V20/47Detecting features for summarising video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/49Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a human body joint point-based 2D video multi-person tracking method, system, medium and equipment, wherein the method comprises the following steps: cutting an acquired video file to obtain a frame sequence set; extracting the human body joint point features of every person in each frame of the frame sequence set; training a similarity relation model with videos of known person trajectories, using the human body joint point features to calculate the similarity relations between the body features of the same person and between those of different persons in adjacent frames; selecting similarity relations as a training set, and learning with a neural network algorithm a similarity relation model for features belonging to the same person. During tracking, the identities of the persons in the first n frames are initialized; the similarity relation between each person of unknown identity in the current frame and the persons of known identity in the preceding frames is calculated frame by frame and input into the similarity relation model, which outputs the probability that the current person has the corresponding identity; the identity of the unknown person is thus determined and the trajectory information obtained. The invention has the advantage of robust tracking capability.

Description

Human body joint point-based 2D video multi-person tracking method, system, medium and equipment
Technical Field
The invention belongs to the field of computer vision and machine learning, and particularly relates to a human body joint point-based 2D video multi-person tracking method, system, medium and device.
Background
At present, multi-person tracking in 2D video mostly comprises two parts: person detection and data association. As the basis of data association, person detection usually extracts feature information of a person, such as color, shape, texture and position, and persons are then associated according to similarity relations to realize tracking. In some scenarios, such as sports matches, colors and textures are not unique, which greatly diminishes the ability to associate by color. In scenes where merging and collision are likely to occur, devices that rely on shape and position for tracking often suffer from person ID switching. Moreover, existing tracking methods can only track the trajectories of people in the video and cannot extract their behavior information.
Therefore, it is desirable to provide a device and method with robust tracking capability that can extract human behavior information while tracking trajectories in 2D video, and that can be applied to the many fields requiring behavior recognition.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a human body joint point-based 2D video multi-person tracking method and system that have robust tracking capability and can extract the behavior information of persons, preparing for further human behavior recognition.
The purpose of the invention is realized by the following technical scheme: a human body joint point-based 2D video multi-person tracking method comprises the following steps:
cutting the acquired video file to obtain a frame sequence set;
identifying and extracting human body joint point characteristics of all people in each frame of the frame sequence set;
performing similarity relation model training with videos of known person trajectories, and calculating, from the human body joint point features, the similarity relations of the body features of the same person in adjacent frames and the similarity relations of the body features of different persons in adjacent frames; selecting similarity relations as a training set, and learning, with a neural network algorithm, a similarity relation model belonging to the same person identity;
during tracking, the identities of the persons in the first n frames are initialized; the similarity relation between the current person of unknown identity and the persons of known identity in the preceding frames is calculated frame by frame and input into the similarity relation model, which outputs the probability that the current person has the corresponding identity; the identity of the unknown person is thereby determined and the joint point trajectory information of the person is obtained.
Preferably, the similarity relation comprises 3 parameters, namely the Pearson correlation coefficient $P_{corr}$, the mean distance between feature points $D_{mean}$, and the standard deviation of the distance between feature points $D_{std}$, calculated as follows:

let the similarity between the feature $k_i^m$ of the m-th person in the i-th frame and the feature $k_j^n$ of the n-th person in the j-th frame be $S(k_i^m, k_j^n)$;

select the joint point pixel position coordinates coexisting in the two arrays $k_i^m$ and $k_j^n$ to obtain the coexisting joint point pixel position features $\hat{k}_i^m$ and $\hat{k}_j^n$;

calculate the Pearson correlation coefficient of $\hat{k}_i^m$ and $\hat{k}_j^n$:

$$P_{corr} = \frac{\operatorname{cov}(\hat{k}_i^m, \hat{k}_j^n)}{\sigma_{\hat{k}_i^m}\,\sigma_{\hat{k}_j^n}},$$

wherein $\operatorname{cov}(\hat{k}_i^m, \hat{k}_j^n)$ represents the covariance between $\hat{k}_i^m$ and $\hat{k}_j^n$, and $\sigma_{\hat{k}_i^m}$ and $\sigma_{\hat{k}_j^n}$ respectively represent the standard deviations of $\hat{k}_i^m$ and $\hat{k}_j^n$;

calculate the mean distance between $\hat{k}_i^m$ and $\hat{k}_j^n$:

$$D_{mean} = \frac{1}{cn}\sum_{c=1}^{cn} d_c,$$

wherein $cn$ is the number of coexisting joint points and $d_c$ represents the absolute value of the pixel-coordinate distance of the corresponding joint points;

calculate the standard deviation of the distance between $\hat{k}_i^m$ and $\hat{k}_j^n$:

$$D_{std} = \sqrt{\frac{1}{cn}\sum_{c=1}^{cn}\left(d_c - \mu_{d_c}\right)^2},$$

wherein $\mu_{d_c}$ represents the mean value of $d_c$.
Preferably, the similarity relation model is constructed as follows:

take from the set K of human body joint points the joint point coordinate set $\{k_i^j \mid i = 1,\dots,n;\ j = 1,\dots,M_i\}$ of n consecutive frames with a known tracking strategy (i.e., the person ID in every frame is known);

take the similarity parameters between the two sets of joint point coordinates whose object ID is R in two adjacent frames as the positive training set $S_p$, with the sample label set to 1; take the similarity parameters between an object with ID R and objects whose ID is not R in two adjacent frames as the negative training set, with the label set to 0;

set the number of layers $l$ of the neural network and the number of neurons $n_l$ per layer, and input the training set into the configured neural network for repeated iterative training to obtain the model ω.
A human body joint point-based 2D video multi-person tracking system comprises:
the video cutting unit is used for cutting the acquired video file to obtain a frame sequence set;
the joint point feature extraction unit is used for identifying and extracting a human body joint point feature set of all people in each frame of the frame sequence set;
the similarity calculation unit is used for performing similarity relation model training with videos of known person trajectories, and for calculating, from the human body joint point features, the similarity relations of the body features of the same person in adjacent frames and the similarity relations of the body features of different persons in adjacent frames;
the model training unit is used for selecting similarity relations as a training set and learning, with a neural network algorithm, a similarity relation model belonging to the same person identity;
and the tracking unit is used for initializing the identities of the persons in the first n frames during tracking, calculating frame by frame the similarity relation between the current person of unknown identity and the persons of known identity in the preceding frames, inputting the similarity relations into the similarity relation model, outputting the probability that the current person has the corresponding identity, determining the identity of the unknown person, and obtaining the joint point trajectory information of the person.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention uses the human body joint point characteristics to calculate the similarity relation between two groups of characteristics, and uses the neural network to learn the relation model between the similarity and the person identity ID, and then inputs the similarity between the known object and the observed object between frames into the model, calculates the tracking probability, and confirms the identity ID of the observed object. The joint point information has the advantages of clear and simple characteristics and low possibility of being influenced by appearance factors, the system uses multi-parameter comparison similarity, the tracking precision is higher, the system is more stable, and even if character fusion and short-time shielding conditions occur, the character ID can be identified by comparing the similarity of adjacent frames, so that the track is tracked.
Drawings
FIG. 1 is a block diagram of the apparatus according to the present embodiment.
FIG. 2 is a flowchart of the method of the present embodiment.
Fig. 3 is a flow chart of feature similarity calculation in the method of the present embodiment.
Fig. 4 is a flowchart of a training set acquisition method in the method of the present embodiment.
FIG. 5 is a flowchart of model training in the method of the present embodiment.
FIG. 6 is a flowchart of the tracking process in the method of the present embodiment.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Example 1
As shown in fig. 2, the method for tracking multiple persons in 2D video based on human body joint points of the present embodiment includes the following steps:
(1) Collect the video;
(2) Cut the collected video into a frame sequence according to the frame rate to obtain the frame sequence set $P = \{p^{(i)} \mid i = 1, 2, \dots, N\}$, where $N$ is the total number of frames of the video and $p^{(i)}$ is the i-th image in the frame sequence set $P$.
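As a concrete illustration of step (2), the following is a minimal sketch using OpenCV's `cv2.VideoCapture`; the patent does not name a specific decoding tool, so the choice of OpenCV is an assumption:

```python
import cv2

def cut_video_to_frames(video_path):
    """Cut a video file into the frame sequence set P = {p^(i) | i = 1..N}."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()  # frames come out at the video's native frame rate
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames  # len(frames) == N
```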
(3) The coordinates of the human joint points in consecutive video frames can be obtained with a tool capable of extracting human joint points from 2D or 3D video, such as OpenPose, an open-source tool for non-commercial identification of human joint points in 2D video. This step identifies the human body joint points of all persons in the frame sequence set P and extracts the joint point position coordinates, obtaining the set of human body joint points of all persons in the video, $K = \{k_i^j \mid i = 1, \dots, N;\ j = 1, \dots, M_i\}$, where $M_i$ is the total number of persons identified in the i-th frame image and $k_i^j$ denotes the joint point coordinates of the j-th person in the i-th frame image.
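A sketch of how step (3) might assemble the set K, assuming a hypothetical `detect_joints(frame)` wrapper around a pose estimator such as OpenPose (the wrapper name and its output convention are assumptions, not the OpenPose API):

```python
import numpy as np

def build_joint_set(frames, detect_joints):
    """Build K: for each frame i, a list of M_i per-person joint arrays k_i^j.

    detect_joints(frame) is assumed to return one 1x134 array per detected
    person (67 joint points x 2 pixel coordinates), with np.nan marking
    joints that were not detected in that frame.
    """
    K = []
    for frame in frames:
        people = detect_joints(frame)  # M_i persons found in this frame
        K.append([np.asarray(p, dtype=float).ravel() for p in people])
    return K
```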
(4) Compute the similarity $S(k_i^m, k_j^n)$ between the feature $k_i^m$ of the m-th person in the i-th frame and the feature $k_j^n$ of the n-th person in the j-th frame; the function $S(\cdot, \cdot)$ measures the similarity relation between $k_i^m$ and $k_j^n$. The similarity relation S comprises 3 parameters, namely the Pearson correlation coefficient $P_{corr}$, the mean distance between feature points $D_{mean}$, and the standard deviation of the distance between feature points $D_{std}$. Fig. 3 is a flowchart of the similarity calculation unit; the details are as follows:
(4-1) The features $k_i^m$ and $k_j^n$ are 1 × 134 arrays containing the 2-dimensional pixel position coordinates of 67 human body joint points of the head, trunk, limbs and hands. Since it is uncertain which joint point coordinates are present in an array k, in order to guarantee the accuracy of the similarity function, this embodiment selects the joint point pixel position coordinates coexisting in the two arrays $k_i^m$ and $k_j^n$, obtaining the coexisting joint point pixel position features $\hat{k}_i^m$ and $\hat{k}_j^n$.
(4-2) Calculate the Pearson correlation coefficient of $\hat{k}_i^m$ and $\hat{k}_j^n$:

$$P_{corr} = \frac{\operatorname{cov}(\hat{k}_i^m, \hat{k}_j^n)}{\sigma_{\hat{k}_i^m}\,\sigma_{\hat{k}_j^n}}.$$

(4-3) Calculate the mean distance $D_{mean}$ between $\hat{k}_i^m$ and $\hat{k}_j^n$; $D_{mean}$ indicates how far apart, on average, the pixel coordinate positions of the joint points shared by the m-th person in the i-th frame and the n-th person in the j-th frame are:

$$D_{mean} = \frac{1}{cn}\sum_{c=1}^{cn} d_c,$$

where $cn$ is the number of common joint points and $d_c$ denotes the absolute value of the pixel-coordinate distance of the corresponding joint point.

(4-4) Calculate the standard deviation of the distance between $\hat{k}_i^m$ and $\hat{k}_j^n$:

$$D_{std} = \sqrt{\frac{1}{cn}\sum_{c=1}^{cn}\left(d_c - \mu_{d_c}\right)^2},$$

where $\mu_{d_c}$ denotes the mean value of $d_c$.
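The three similarity parameters of steps (4-1) to (4-4) reduce to a few lines of numpy; the sketch below assumes missing joints are encoded as `np.nan` and reads $d_c$ as the Euclidean pixel distance between corresponding joints:

```python
import numpy as np

def similarity(k_m, k_n):
    """Compute S = (P_corr, D_mean, D_std) for two 1x134 joint arrays."""
    # (4-1) keep only coordinates that coexist in both arrays
    mask = ~np.isnan(k_m) & ~np.isnan(k_n)
    km, kn = k_m[mask], k_n[mask]
    # (4-2) Pearson correlation coefficient of the coexisting coordinates
    p_corr = np.corrcoef(km, kn)[0, 1]
    # d_c over the cn common joints, assuming interleaved (x, y) layout
    dc = np.hypot(km[0::2] - kn[0::2], km[1::2] - kn[1::2])
    # (4-3) mean distance and (4-4) standard deviation of the distance
    return p_corr, dc.mean(), dc.std()
```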
(5) Train the similarity relation model ω with videos of known person trajectories; the specific structure is shown in fig. 5:

(5-1) Take from the set K of human joint points the joint point coordinate set $\{k_i^j \mid i = 1,\dots,n;\ j = 1,\dots,M_i\}$ of n consecutive frames with a known tracking strategy.

(5-2) Referring to fig. 4, take the similarity parameters between the two sets of joint point coordinates whose object ID is R in two adjacent frames as the positive training set $S_p$, with the sample label set to 1; take the similarity parameters between an object with ID R and objects whose ID is not R in two adjacent frames as the negative training set, with the label set to 0.

(5-3) Set the number of layers $l$ of the neural network and the number of neurons $n_l$ per layer, and input the training set into the configured neural network for repeated iterative training to obtain the model ω.
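A sketch of steps (5-1) to (5-3); the patent does not specify a network framework, so scikit-learn's `MLPClassifier` stands in here, and `ids[i][m]` (the known ID of person m in frame i) is an assumed bookkeeping structure:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def build_training_set(K, ids):
    """Label 1 for same-ID pairs in adjacent frames, 0 for different-ID pairs."""
    X, y = [], []
    for i in range(len(K) - 1):
        for m, k_m in enumerate(K[i]):
            for n, k_n in enumerate(K[i + 1]):
                X.append(similarity(k_m, k_n))
                y.append(1 if ids[i][m] == ids[i + 1][n] else 0)
    return np.array(X), np.array(y)

# Model omega: the layer count l and neurons n_l per layer are free choices.
omega = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000)
# X, y = build_training_set(K, ids); omega.fit(X, y)
```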
(6) The trajectory is tracked using the similarity relation model ω.
As shown in fig. 6, assume that frames 1 to i have a known strategy, i.e., the person ID in each frame is known. Taking the person with ID R as an example, the specific tracking steps are as follows:
(6-1) Calculate the similarity relation S between the joint point features of the person with ID R in the i-th frame and the joint point features of every person in the (i+1)-th frame.
(6-2) Input each similarity relation into the model ω, which outputs the probability $P_{pre}$ that the person's ID is R.
(6-3) Having computed, for every person identified in the (i+1)-th frame, the probability $P_{pre}$ of having ID R, select the largest of these probabilities and judge whether it is greater than or equal to a preset threshold G. If so, continue to detect the person with ID R in the (i+2)-th frame, and so on. If the probability $P_{pre}$ is smaller than the preset threshold G, judge that no person with ID R exists in the (i+1)-th frame, compute the similarity probabilities of ID R between the i-th frame and the (i+2)-th frame, and so on; if there is still no person with ID R by the (i+10)-th frame, judge that R has left the video and terminate the tracking of R.
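Putting steps (6-1) to (6-3) together, the loop below follows one identity R frame by frame, using the `similarity` helper and `omega` classifier sketched above; `max_gap = 10` mirrors the ten-frame rule for deciding that R has left the video:

```python
import numpy as np

def track_id(K, omega, ref_feat, start, threshold, max_gap=10):
    """Follow one identity R from frame `start`; return (frame, person) matches.

    ref_feat holds R's joint features from the last frame where R was found.
    """
    track, gap, i = [], 0, start + 1
    while i < len(K) and gap < max_gap:
        if K[i]:
            feats = np.array([similarity(ref_feat, k) for k in K[i]])
            probs = omega.predict_proba(feats)[:, 1]  # P_pre for each candidate
            best = int(np.argmax(probs))
            if probs[best] >= threshold:              # (6-3) accept the match
                track.append((i, best))
                ref_feat, gap = K[i][best], 0
                i += 1
                continue
        gap += 1  # no person with ID R here; keep comparing against later frames
        i += 1
    return track
```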
Example 2
As shown in fig. 1, the 2D video multi-person tracking system based on human body joints of the present embodiment can be divided into the following modules according to functions:
the video cutting unit is used for cutting the acquired video file to obtain a frame sequence set;
the joint point feature extraction unit is used for identifying and extracting a human body joint point feature set of all people in each frame of the frame sequence set;
the similarity calculation unit is used for performing similarity relation model training with videos of known person trajectories, and for calculating, from the human body joint point features, the similarity relations of the body features of the same person in adjacent frames and the similarity relations of the body features of different persons in adjacent frames;
the model training unit is used for selecting similarity relations as a training set and learning, with a neural network algorithm, a similarity relation model belonging to the same person identity;
and the tracking unit is used for initializing the identities of the persons in the first n frames during tracking, calculating frame by frame the similarity relation between the current person of unknown identity and the persons of known identity in the preceding frames, inputting the similarity relations into the similarity relation model, outputting the probability that the current person has the corresponding identity, determining the identity of the unknown person, and obtaining the joint point trajectory information of the person. In this module, the tracking form may vary: for example, the similarity relations between the features of a person of unknown ID and a person of known ID in the preceding 10 frames may be calculated to give the probabilities that the unknown person is that known person, and either the maximum probability or the mean of the maximum probabilities may be selected.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the system, the apparatus, the device, or the unit described above may refer to the corresponding process in the foregoing method embodiment 1, and details are not described herein again.
Those of ordinary skill in the art will appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the components and steps of the various examples have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only a logical division, and there may be other divisions in actual implementation, or units with the same function may be grouped into one unit, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electrical, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in a storage medium. Based on such understanding, the technical solution of the present invention essentially or partially contributes to the prior art, or all or part of the technical solution can be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (5)

1. A 2D video multi-person tracking method based on human body joint points, characterized by comprising the following steps:
cutting the acquired video file to obtain a frame sequence set;
identifying and extracting human body joint point characteristics of all people in each frame of the frame sequence set;
performing similarity relation model training with videos of known person trajectories, and calculating, from the human body joint point features, the similarity relations of the body features of the same person in adjacent frames and the similarity relations of the body features of different persons in adjacent frames; selecting similarity relations as a training set, and learning, with a neural network algorithm, a similarity relation model belonging to the same person identity; the similarity relation comprises 3 parameters, namely the Pearson correlation coefficient $P_{corr}$, the mean distance between feature points $D_{mean}$, and the standard deviation of the distance between feature points $D_{std}$, calculated as follows:

letting the similarity between the feature $k_i^m$ of the m-th person in the i-th frame and the feature $k_j^n$ of the n-th person in the j-th frame be $S(k_i^m, k_j^n)$;

selecting the joint point pixel position coordinates coexisting in the two arrays $k_i^m$ and $k_j^n$ to obtain the coexisting joint point pixel position features $\hat{k}_i^m$ and $\hat{k}_j^n$;

computing the Pearson correlation coefficient of $\hat{k}_i^m$ and $\hat{k}_j^n$, $P_{corr} = \operatorname{cov}(\hat{k}_i^m, \hat{k}_j^n) / (\sigma_{\hat{k}_i^m} \sigma_{\hat{k}_j^n})$, wherein $\operatorname{cov}(\hat{k}_i^m, \hat{k}_j^n)$ represents the covariance between $\hat{k}_i^m$ and $\hat{k}_j^n$, and $\sigma_{\hat{k}_i^m}$ and $\sigma_{\hat{k}_j^n}$ respectively represent the standard deviations of $\hat{k}_i^m$ and $\hat{k}_j^n$;

computing the mean distance between $\hat{k}_i^m$ and $\hat{k}_j^n$, $D_{mean} = \frac{1}{cn} \sum_{c=1}^{cn} d_c$, wherein $cn$ is the number of coexisting joint points and $d_c$ represents the absolute value of the pixel-coordinate distance of the corresponding joint points;

computing the standard deviation of the distance between $\hat{k}_i^m$ and $\hat{k}_j^n$, $D_{std} = \sqrt{\frac{1}{cn} \sum_{c=1}^{cn} (d_c - \mu_{d_c})^2}$, wherein $\mu_{d_c}$ represents the mean value of $d_c$;
during tracking, initializing the identities of the persons in the first n frames; calculating frame by frame the similarity relation between the current person of unknown identity and the persons of known identity in the preceding frames; inputting the similarity relations into the similarity relation model, which outputs the probability that the current person has the corresponding identity; and determining the identity of the unknown person to obtain the joint point trajectory information of the person.
2. The human body joint point-based 2D video multi-person tracking method of claim 1, wherein the similarity relation model is constructed as follows:

taking from the set K of human body joint points the joint point coordinate set $\{k_i^j \mid i = 1,\dots,n;\ j = 1,\dots,M_i\}$ of n consecutive frames with a known tracking strategy;

taking the similarity parameters between the two sets of joint point coordinates whose object ID is R in two adjacent frames as the positive training set $S_p$, with the sample label set to 1; taking the similarity parameters between an object with ID R and objects whose ID is not R in two adjacent frames as the negative training set, with the label set to 0;

setting the number of layers $l$ of the neural network and the number of neurons $n_l$ per layer, and inputting the training set into the configured neural network for repeated iterative training to obtain the model ω.
3. A human body joint point-based 2D video multi-person tracking system, characterized by comprising:
the video cutting unit is used for cutting the acquired video file to obtain a frame sequence set;
the joint point feature extraction unit is used for identifying and extracting a human body joint point feature set of all people in each frame of the frame sequence set;
the similarity calculation unit is used for performing similarity relation model training with videos of known person trajectories, and for calculating, from the human body joint point features, the similarity relations of the body features of the same person in adjacent frames and the similarity relations of the body features of different persons in adjacent frames;
the model training unit is used for selecting similarity relations as a training set and learning, with a neural network algorithm, a similarity relation model belonging to the same person identity;
and the tracking unit is used for initializing the identities of the persons in the first n frames during tracking, calculating frame by frame the similarity relation between the current person of unknown identity and the persons of known identity in the preceding frames, inputting the similarity relations into the similarity relation model, outputting the probability that the current person has the corresponding identity, determining the identity of the unknown person, and obtaining the joint point trajectory information of the person.
4. A storage medium storing a computer program which, when executed by a processor, causes the processor to perform the human body joint point-based 2D video multi-person tracking method according to any one of claims 1-2.
5. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the human body joint point-based 2D video multi-person tracking method according to any one of claims 1-2.
CN202010045947.XA 2020-01-16 2020-01-16 Human body joint point-based 2D video multi-person tracking method, system, medium and equipment Active CN111223127B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010045947.XA CN111223127B (en) 2020-01-16 2020-01-16 Human body joint point-based 2D video multi-person tracking method, system, medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010045947.XA CN111223127B (en) 2020-01-16 2020-01-16 Human body joint point-based 2D video multi-person tracking method, system, medium and equipment

Publications (2)

Publication Number Publication Date
CN111223127A CN111223127A (en) 2020-06-02
CN111223127B true CN111223127B (en) 2023-04-07

Family

ID=70826006

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010045947.XA Active CN111223127B (en) 2020-01-16 2020-01-16 Human body joint point-based 2D video multi-person tracking method, system, medium and equipment

Country Status (1)

Country Link
CN (1) CN111223127B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106559511A (en) * 2016-10-18 2017-04-05 上海优刻得信息科技有限公司 Cloud system, high in the clouds public service system and the exchanging visit method for cloud system
CN107392097A (en) * 2017-06-15 2017-11-24 中山大学 A 3D human body joint point localization method based on monocular color video
CN109086706A (en) * 2018-07-24 2018-12-25 西北工业大学 Applied to the action identification method based on segmentation manikin in man-machine collaboration
CN109919977A (en) * 2019-02-26 2019-06-21 鹍骐科技(北京)股份有限公司 A kind of video motion personage tracking and personal identification method based on temporal characteristics


Also Published As

Publication number Publication date
CN111223127A (en) 2020-06-02

Similar Documents

Publication Publication Date Title
Nadeem et al. Human actions tracking and recognition based on body parts detection via Artificial neural network
CN110472554B (en) Table tennis action recognition method and system based on attitude segmentation and key point features
Miao et al. Identifying visible parts via pose estimation for occluded person re-identification
CN110135375B (en) Multi-person attitude estimation method based on global information integration
CN108052896B (en) Human body behavior identification method based on convolutional neural network and support vector machine
CN111339990B (en) Face recognition system and method based on dynamic update of face features
CN108447080B (en) Target tracking method, system and storage medium based on hierarchical data association and convolutional neural network
CN109086706B (en) Motion recognition method based on segmentation human body model applied to human-computer cooperation
Xu et al. Online dynamic gesture recognition for human robot interaction
Zeng et al. Silhouette-based gait recognition via deterministic learning
CN102682302B (en) Human body posture identification method based on multi-characteristic fusion of key frame
CN108256421A (en) A kind of dynamic gesture sequence real-time identification method, system and device
CN110674785A (en) Multi-person posture analysis method based on human body key point tracking
KR100969298B1 (en) Method For Social Network Analysis Based On Face Recognition In An Image or Image Sequences
CN113326835B (en) Action detection method and device, terminal equipment and storage medium
CN109934195A (en) A kind of anti-spoofing three-dimensional face identification method based on information fusion
CN109902565B (en) Multi-feature fusion human behavior recognition method
CN107424161B (en) Coarse-to-fine indoor scene image layout estimation method
KR20130013122A (en) Apparatus and method for detecting object pose
CN114067358A (en) Human body posture recognition method and system based on key point detection technology
CN112989889B (en) Gait recognition method based on gesture guidance
CN111914643A (en) Human body action recognition method based on skeleton key point detection
CN111126515A (en) Model training method based on artificial intelligence and related device
Arif et al. Human pose estimation and object interaction for sports behaviour
Batool et al. Telemonitoring of daily activities based on multi-sensors data fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant