CN117557593A - Personnel track tracking method and system based on video analysis

Info

Publication number: CN117557593A
Application number: CN202311181347.6A
Authority: CN (China)
Prior art keywords: character, features, similarity, person, tracked
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 杜向华, 李叶帆, 张杨, 曲洋, 朱佳力
Current and original assignee: Hangzhou Jiepushi Technology Co., Ltd. (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Hangzhou Jiepushi Technology Co., Ltd.; priority claimed from application CN202311181347.6A (the priority date is an assumption and is not a legal conclusion)

Classifications

    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06V10/40 Extraction of image or video features
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06T2207/10016 Video; image sequence
    • G06T2207/30196 Human being; person
    • G06T2207/30232 Surveillance
    • G06T2207/30241 Trajectory


Abstract

The invention relates to the technical field of video analysis and provides a personnel track tracking method and system based on video analysis. The system comprises a person comparison module, and the method comprises the following steps: decomposing, frame by frame, the video from the cameras in the area to be tracked, and loading the monitoring image sequences; acquiring the features of the person to be tracked; activating the feature extraction nodes of the person comparison module, extracting features from the multiple monitoring image sequences, and generating multiple groups of features to be compared; performing similarity analysis to generate multiple groups of person similarities; traversing the similarities and extracting a first image set whose images have the maximum similarity and satisfy a similarity threshold; ordering the set by time sequence to generate a track image of the person to be tracked; and tracking the person according to that track image. This solves the technical problems in the prior art that joint tracking across multiple cameras is difficult to achieve during tracking and that person tracks are unclear.

Description

Personnel track tracking method and system based on video analysis
Technical Field
The invention relates to the technical field of video analysis, in particular to a personnel track tracking method and system based on video analysis.
Background
To ensure social order and protect lives and public property, personnel track analysis and tracking are required in many scenarios, including public place management and control, enterprise production safety, and the maintenance of urban traffic order.
At present, personnel track tracking is mainly achieved by analyzing video data with a target tracking algorithm. Because the data sets in the tracking field are numerous and the shooting scenes are varied and complex, involving factors such as target deformation, blurring, rotation, occlusion, departure from the field of view, and irregular movement, video analysis is difficult, the capability to integrate multiple video monitoring terminals is weak, joint tracking across multiple cameras is hard to achieve, and the resulting person tracks are unclear, so tracking is difficult and time-consuming.
In summary, in the prior art the analysis and processing of video information is difficult, so that joint tracking across multiple cameras is hard to achieve during tracking, and the technical problem of unclear person tracks exists.
Disclosure of Invention
The application aims to solve the technical problems in the prior art that joint tracking across multiple cameras is difficult to achieve during tracking and that person tracks are unclear.
In view of the above problems, the embodiments of the present application provide a method and a system for tracking a person track based on video analysis.
In a first aspect of the disclosure, a personnel track tracking method based on video analysis is provided, applied to a personnel track tracking system based on video analysis, wherein the system comprises a person comparison module, and the method comprises the following steps: decomposing, frame by frame, the video from a plurality of cameras in the area to be tracked, and loading a plurality of monitoring image sequences; acquiring person features to be tracked, wherein the person features to be tracked comprise person clothing features, person facial features and person body shape features; activating feature extraction nodes of the person comparison module according to the person clothing features, the person facial features and the person body shape features, performing person feature extraction on the plurality of monitoring image sequences, and generating multiple groups of person features to be compared; traversing the multiple groups of person features to be compared and the person features to be tracked for similarity analysis, and generating multiple groups of person similarities; traversing the multiple groups of person similarities, and extracting a first image set of images that have the maximum similarity and satisfy a similarity threshold; ordering the first image set by time sequence to generate a track image of the person to be tracked; and performing personnel tracking according to the track image of the person to be tracked.
In another aspect of the disclosure, a personnel track tracking system based on video analysis is provided, comprising: an information decomposition and loading module, used for decomposing, frame by frame, the video from a plurality of cameras in the area to be tracked and loading a plurality of monitoring image sequences; a tracking feature extraction module, used for acquiring person features to be tracked, wherein the person features to be tracked comprise person clothing features, person facial features and person body shape features; a monitoring feature extraction module, used for activating feature extraction nodes of the person comparison module according to the person clothing features, the person facial features and the person body shape features, performing feature extraction on the plurality of monitoring image sequences, and generating multiple groups of person features to be compared; a person similarity analysis module, used for traversing the multiple groups of person features to be compared and the person features to be tracked for similarity analysis and generating multiple groups of person similarities; a similarity judging module, used for traversing the multiple groups of person similarities and extracting a first image set of images that have the maximum similarity and satisfy a similarity threshold; a track adjustment generation module, used for ordering the first image set by time sequence to generate a track image of the person to be tracked; and a task execution module, used for performing personnel tracking according to the track image of the person to be tracked.
In a third aspect of the present disclosure, there is provided an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any one of the methods described above.
In a fourth aspect of the disclosure, there is also provided a computer readable storage medium storing computer instructions for causing a computer to perform any one of the methods described above.
One or more technical solutions provided in the present application have at least the following technical effects or advantages:
the video of the cameras in the area to be tracked is decomposed frame by frame, the monitoring image sequences are loaded, and the features of the person to be tracked are acquired. The feature extraction nodes of the person comparison module are activated, and feature extraction is performed on the multiple monitoring image sequences to obtain multiple groups of features to be compared. Similarity analysis is performed on the multiple groups of person features to be compared and the person features to be tracked to obtain multiple groups of person similarities. The image sets that have the maximum similarity and satisfy the similarity threshold are extracted by traversal and ordered by time sequence to obtain the track image of the person to be tracked, based on which personnel tracking is performed. In this technical solution, the data collected by the cameras are analyzed and processed jointly, and multi-dimensional analysis and multi-camera joint tracking are realized through the extraction and comparison of person clothing, facial and body shape features, thereby achieving the technical effects of improving tracking accuracy and consistency.
The foregoing description is only an overview of the technical solutions of the present application, and may be implemented according to the content of the specification in order to make the technical means of the present application more clearly understood, and in order to make the above-mentioned and other objects, features and advantages of the present application more clearly understood, the following detailed description of the present application will be given.
Drawings
Fig. 1 is a schematic diagram of a possible flow chart of a personnel trajectory tracking method based on video analysis according to an embodiment of the present application;
fig. 2 is a schematic flow chart of extracting character features from a plurality of monitored image sequences in a method for tracking a human track based on video analysis according to an embodiment of the present application;
FIG. 3 is a schematic block diagram of a person tracking system based on video analysis according to an embodiment of the present application;
fig. 4 is a schematic block diagram of an electronic device according to an embodiment of the present application.
Reference numerals: information decomposition and loading module 100, tracking feature extraction module 200, monitoring feature extraction module 300, person similarity analysis module 400, similarity judging module 500, track adjustment generation module 600, and task execution module 700.
Detailed Description
The technical scheme provided by the application has the following overall thought:
The embodiments of the application provide a personnel track tracking method and system based on video analysis. The video of the cameras in the area to be tracked is decomposed frame by frame, the monitoring image sequences are loaded, and the features of the person to be tracked are acquired. The feature extraction nodes of the person comparison module are activated, and feature extraction is performed on the multiple monitoring image sequences to obtain multiple groups of features to be compared. Similarity analysis is performed on the multiple groups of person features to be compared and the person features to be tracked to obtain multiple groups of person similarities. The image sets that have the maximum similarity and satisfy the similarity threshold are extracted by traversal and ordered by time sequence to obtain the track image of the person to be tracked, based on which personnel tracking is performed. The data collected by the cameras are thus analyzed and processed jointly, and multi-dimensional analysis and multi-camera joint tracking are realized through the extraction and comparison of person clothing, facial and body shape features, improving tracking accuracy and consistency.
Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.
Example 1
As shown in fig. 1, an embodiment of the present application provides a personnel track tracking method based on video analysis, applied to a personnel track tracking system based on video analysis, wherein the system includes a person comparison module; the method includes the following steps:
s10: carrying out video frame-by-frame decomposition from a plurality of cameras in the area to be tracked, and loading a plurality of monitoring image sequences;
specifically, with the development of positioning and communication technology, personnel track tracking systems have been widely applied in fields such as safety supervision and the intelligent petrochemical industry. A personnel track tracking system analyzes and compares the collected video information frame by frame to obtain the movement track of a target object, so that the track can be traced in real time. The system here comprises the person comparison module, which provides a feature extraction function and a feature comparison function: it analyzes and processes the input video information, extracts the required feature information, and compares it with the target feature information. The module includes, but is not limited to, a feature extraction model built by training a downsampling convolutional neural network model and a similarity analysis and evaluation model built from a plurality of preset similarity functions.
The video information collected by the plurality of cameras in the area to be tracked serves as the basic data to be analyzed. The collected information is decomposed frame by frame, camera by camera, through an image processing function library such as OpenCV, giving an image decomposition sequence for each camera; this set of sequences constitutes the plurality of monitoring image sequences, as sketched below. In this way the data collected by the cameras can be jointly analyzed and processed to realize joint tracking, improving tracking accuracy and consistency.
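As a concrete illustration of this step, the following minimal Python sketch decomposes each camera's video frame by frame with OpenCV, the function library named above; the camera source list, the sampling stride and the dictionary layout are illustrative assumptions, not part of the patent:

```python
# Sketch of step S10: decompose the video of each camera in the area to be
# tracked frame by frame and load one monitoring image sequence per camera.
import cv2

def load_monitoring_sequences(camera_sources, stride=1):
    """Return {camera_id: [frame, ...]}, one sequence per camera."""
    sequences = {}
    for cam_id, source in enumerate(camera_sources):
        cap = cv2.VideoCapture(source)
        frames = []
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:                 # end of this camera's stream
                break
            if index % stride == 0:    # keep every stride-th frame
                frames.append(frame)
            index += 1
        cap.release()
        sequences[cam_id] = frames
    return sequences

# e.g. sequences = load_monitoring_sequences(["cam1.mp4", "cam2.mp4"])
```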
S20: acquiring character features to be tracked, wherein the character features to be tracked comprise character clothing features, character facial features and character body shape features;
specifically, the person features to be tracked are the features of the person who is to be found and tracked, obtained in a preset manner: picture or video information of the target person is passed through the convolution layers of the person comparison module to extract the person clothing features, the person facial features and the person body shape features. The person clothing features include the color, shape and position distribution of the clothing; the person facial features include the positions and shapes of the eyes, ears, mouth, nose and eyebrows, the face shape, and the hairstyle; the person body shape features include height, build, and body proportion distribution.
S30: activating feature extraction nodes of the person comparison module according to the person clothing features, the person facial features and the person body shape features, extracting the person features of the plurality of monitoring image sequences, and generating a plurality of groups of person features to be compared;
further, as shown in fig. 2, the step S30 includes the steps of:
s31: the feature extraction nodes at least comprise clothing feature extraction nodes, facial feature extraction nodes and body shape feature extraction nodes;
s32: activating the clothing feature extraction node, traversing the plurality of monitoring image sequences to perform feature extraction, and generating a plurality of groups of clothing features of the people to be compared;
s33: activating the facial feature extraction node, traversing the plurality of monitoring image sequences to perform feature extraction, and generating a plurality of groups of facial features of the people to be compared;
s34: activating the body shape feature extraction nodes, traversing the plurality of monitoring image sequences to perform feature extraction, and generating a plurality of groups of body shape features of the people to be compared;
S35: and adding the clothing features of the multiple groups of people to be compared, the facial features of the multiple groups of people to be compared and the figure features of the multiple groups of people to be compared into the multiple groups of people to be compared.
Further, the clothing feature extraction node, the facial feature extraction node and the bodily form feature extraction node are obtained through downsampling convolutional neural network model training.
Specifically, the feature extraction nodes cover the several aspects of a person's appearance and can perform feature extraction for the person clothing features, the person facial features and the person body shape features. They comprise at least the clothing feature extraction node, the facial feature extraction node and the body shape feature extraction node.
The clothing feature extraction node, the facial feature extraction node and the body shape feature extraction node are convolutional neural network models with a feature extraction function, obtained by training a downsampling convolutional neural network model. The model training process is preferably as follows:
people flow data from multiple groups of monitoring videos are acquired based on the big data of the area to be tracked, including pedestrian clothing data, pedestrian face data and pedestrian body shape data. These data are divided at a ratio of 8:2, the eight parts being set as training data and the two parts as verification data.
During training, any group of pedestrian clothing data, pedestrian face data and pedestrian body shape data is input into the downsampling convolutional neural network model as training input data together with the target output information. The model propagates forward through the convolution layer, the downsampling layer and the fully connected layer to obtain an output result, i.e. a predicted feature extraction result in the form of an output feature vector. The deviation between the output feature vector and the target output information is calculated; when the deviation is smaller than a preset deviation, a verification counter is incremented by 1, and when the deviation is greater than or equal to the preset deviation, the verification counter is reset to 0. Pedestrian clothing data, pedestrian face data and pedestrian body shape data are continuously selected from the training data for training. When the verification counter reaches a preset number, the verification data are used for verification in exactly the same way as training; when the counter during the verification process also reaches the preset number, the convolutional neural network model is regarded as converged, and otherwise training on the training data continues.
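The counter-based convergence check just described can be sketched as follows; the model interface (hypothetical forward and forward_and_update methods), the deviation measure and both thresholds are stand-in assumptions, since the patent fixes neither a concrete network interface nor concrete values:

```python
# Sketch of the training protocol: increment a verification counter while
# the output/target deviation stays below max_dev, reset it otherwise, and
# verify with the held-out data once the counter reaches required_streak.
import numpy as np

def train_until_converged(model, train_set, val_set,
                          max_dev=0.1, required_streak=50):
    streak = 0
    while True:
        for x, target in train_set:
            out = model.forward_and_update(x, target)   # hypothetical API
            dev = np.linalg.norm(out - target)          # output/target gap
            streak = streak + 1 if dev < max_dev else 0
            if streak < required_streak:
                continue
            # verification proceeds in exactly the same way as training
            val_streak = 0
            for xv, tv in val_set:
                dv = np.linalg.norm(model.forward(xv) - tv)  # hypothetical API
                val_streak = val_streak + 1 if dv < max_dev else 0
                if val_streak >= required_streak:
                    return model    # regarded as converged
            streak = 0              # verification failed: keep training
```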
And when the feature extraction node is activated, the clothing feature extraction node activation, the facial feature extraction node activation and the body shape feature extraction node activation are respectively carried out.
After the clothing feature extraction node is activated, traversing the plurality of monitoring image sequences, taking the plurality of monitoring image sequences as input information, and carrying out feature extraction through a convolution layer, a downsampling layer and a full connection layer in the clothing feature extraction node to generate a plurality of groups of clothing features of the people to be compared.
After the facial feature extraction node is activated, traversing the plurality of monitoring image sequences, taking the plurality of monitoring image sequences as input information, and carrying out feature extraction through a convolution layer, a downsampling layer and a full connection layer in the clothing feature extraction node to generate a plurality of groups of facial features of the people to be compared.
After the body shape feature extraction node is activated, traversing the plurality of monitoring image sequences, taking the plurality of monitoring image sequences as input information, and carrying out feature extraction through a convolution layer, a downsampling layer and a full connection layer in the clothing feature extraction node to generate a plurality of groups of body shape features of the people to be compared.
The obtained multiple groups of character features to be compared comprise the multiple groups of character clothing features to be compared, the multiple groups of character facial features to be compared and the multiple groups of character body shape features to be compared.
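As an illustration, one such feature extraction node could be sketched in PyTorch as below, with convolution, downsampling and fully connected layers as in steps S31 to S35; the layer sizes, the 64x64 input crops and the 128-dimensional feature vector are assumptions for the sketch, not values fixed by the patent:

```python
# Sketch of a downsampling CNN feature extraction node: convolution layers,
# downsampling (max pooling) layers and a fully connected output layer.
import torch
import torch.nn as nn

class FeatureExtractionNode(nn.Module):
    def __init__(self, feature_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # downsampling layer
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # downsampling layer
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, feature_dim),  # fully connected layer
        )

    def forward(self, x):                          # x: (N, 3, 64, 64) crops
        return self.net(x)

# Separate nodes for clothing, face and body shape, activated per S32-S34:
clothing_node, face_node, body_node = (FeatureExtractionNode() for _ in range(3))
features = clothing_node(torch.rand(1, 3, 64, 64))   # -> (1, 128) vector
```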
S40: traversing the multiple groups of character features to be compared and the character features to be tracked to perform similarity analysis, and generating multiple groups of character similarity;
Specifically, the multiple groups of person similarities are obtained by performing distance deviation analysis, through similarity analysis functions, between the feature vectors of the person features to be tracked and the feature vectors of the multiple groups of person features to be compared. The multiple groups of person features to be compared comprise the multiple groups of person clothing features to be compared, the multiple groups of person facial features to be compared and the multiple groups of person body shape features to be compared; the person features to be tracked comprise the person clothing features, the person facial features and the person body shape features. Similarity analysis covering clothing, face and body shape, i.e. both static and dynamic characteristics, allows the person to be tracked to be analyzed and matched in a multi-dimensional, all-round manner, providing comprehensive data for tracking and improving the accuracy of track tracking.
S50: traversing the multiple groups of person similarities, and extracting a first image set of images that have the maximum similarity and satisfy a similarity threshold;
specifically, the similarity threshold is a similarity range set on the basis of historical tracking data, including but not limited to a mapping between historical similarity data and the tracking track coincidence rate. The multiple groups of person similarities are traversed and sorted from high to low, and the image information of the person features to be compared that rank highest in similarity and satisfy the similarity threshold is extracted to form the first image set, as sketched below.
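A minimal sketch of this selection step follows; the per-frame record layout with similarity, image and time fields is an illustrative assumption:

```python
# Sketch of step S50: sort compared frames by similarity from high to low
# and keep those satisfying the similarity threshold as the first image set.
def select_first_image_set(scored_frames, threshold):
    """scored_frames: [{"image": ..., "time": ..., "similarity": float}, ...]"""
    ranked = sorted(scored_frames, key=lambda r: r["similarity"], reverse=True)
    return [r for r in ranked if r["similarity"] >= threshold]
```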
S60: adjusting the first image set according to time sequence to generate a character track image to be tracked;
s70: and carrying out personnel tracking according to the person track image to be tracked.
Specifically, the first image set contains a certain amount of image information and may include several pieces of person information with high similarity to the person to be tracked. After the images are ordered according to their capture time, an image set of the person is obtained, forming the track image of the person to be tracked. The track image is transmitted to the display interface of the personnel track tracking system, where a human judgment can be made on its basis, so that the person can be tracked and located. In this way the features of the target person are decomposed, compared and extracted, and the feature information with the highest similarity to the extracted feature information is adjusted in time sequence, yielding a more accurate and smoother track image.
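Continuing the same sketch, steps S60 and S70 reduce to ordering the first image set by capture time before handing the track to the display interface; the time field is the assumed per-frame timestamp from the selection sketch above:

```python
# Sketch of steps S60-S70: order the first image set by capture time to
# obtain the track image sequence of the person to be tracked.
def to_track_image(first_image_set):
    return sorted(first_image_set, key=lambda r: r["time"])

# track = to_track_image(select_first_image_set(scored_frames, threshold))
# The ordered track is then shown on the system's display interface.
```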
Further, the step S40 of traversing the plurality of groups of character features to be compared and the character features to be tracked to perform similarity analysis, and generating a plurality of groups of character similarity includes the steps of:
s41: constructing a clothing similarity evaluation function, traversing the multiple groups of character features to be compared and the character features to be tracked for similarity analysis, and generating multiple groups of character clothing similarity;
S42: constructing a face similarity evaluation function, traversing the multiple groups of character features to be compared and the character features to be tracked for similarity analysis, and generating multiple groups of character face similarity;
s43: constructing a figure similarity evaluation function, traversing the multiple groups of figure features to be compared and the figure features to be tracked for similarity analysis, and generating multiple groups of figure similarity;
s44: and storing the multiple groups of character clothing similarity, the multiple groups of character face similarity and the multiple groups of character body shape similarity according to the extraction source association to generate the multiple groups of character similarity.
Further, the step S41 of constructing a clothing similarity evaluation function, traversing the multiple groups of to-be-compared person features and the to-be-tracked person features to perform similarity analysis, and generating multiple groups of person clothing similarity includes:
the clothing similarity evaluation function being:

$$\mathrm{SIM}_1 = \frac{1}{k}\sum_{i=1}^{k}\operatorname{count}\big(d(x_i, x_{i0}) \le a\big)$$

wherein SIM_1 characterizes the clothing similarity, x_i characterizes the pixel value of the i-th point of the person to be compared, x_{i0} characterizes the pixel value of the pixel whose value is closest to that of the i-th point after the pose of the person to be tracked is adjusted according to the person to be compared, k characterizes the total number of pixels to be compared, and a characterizes the pixel value deviation threshold within which pixels are regarded as similar.
Specifically, the similarity analysis is preferably performed by constructing a similarity evaluation function. The clothing similarity evaluation function is:

$$\mathrm{SIM}_1 = \frac{1}{k}\sum_{i=1}^{k}\operatorname{count}\big(d(x_i, x_{i0}) \le a\big)$$

with SIM_1, x_i, x_{i0}, k and a as defined above.
And according to the similarity evaluation function, carrying out similarity analysis on the feature information in the multiple groups of to-be-compared person features and the to-be-tracked person features in sequence to obtain a person clothing similarity analysis result, namely the multiple groups of person clothing similarities.
In the similarity evaluation function, k denotes the total number of pixels to be compared. When the pixel value of the i-th point is analyzed, the pose of the person to be tracked is first adjusted according to the extracted person to be compared, and the pixel whose value is closest to that of the i-th point is located, giving the pixel value x_{i0}. The Euclidean distance between x_i and x_{i0} is taken as the pixel deviation; whenever the deviation is less than or equal to the minimum similar-pixel deviation a, the count function increments. After all pixels are analyzed and compared, the accumulated result is divided by the total pixel number k, giving the clothing similarity, i.e. the proportion of pixels whose deviation satisfies the threshold among all pixels compared. If, after the pose adjustment, a position of the person to be tracked carries no clothing image information, the pixel deviation at that position is regarded as greater than a and is not counted. A minimal implementation sketch follows.
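Under the assumption that pose adjustment and nearest-pixel matching have already produced aligned pixel arrays, the clothing similarity SIM_1 can be sketched as:

```python
# Sketch of SIM_1: the fraction of the k compared pixels whose deviation
# from the pose-adjusted tracked-person pixel is at most a. Positions where
# the tracked person has no clothing pixel are treated as deviation > a.
import numpy as np

def clothing_similarity(x, x0, a, missing=None):
    """x, x0: (k, channels) aligned pixel values; missing: (k,) bool mask."""
    deviation = np.linalg.norm(x.astype(float) - x0.astype(float), axis=1)
    similar = deviation <= a
    if missing is not None:
        similar &= ~missing       # uncovered positions are never counted
    return float(similar.mean())  # accumulated count divided by k
```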
Further, the step S42 of constructing a face similarity evaluation function, traversing the plurality of groups of to-be-compared character features and the to-be-tracked character features to perform similarity analysis, and generating a plurality of groups of character face similarity includes:
the face similarity evaluation function being:

$$\mathrm{SIM}_2 = \frac{1}{q}\sum_{j=1}^{q}\operatorname{count}\big(d(y_j, y_{j0}) \le b\big)$$

wherein SIM_2 characterizes the face similarity, d(y_j, y_{j0}) characterizes the Euclidean distance of the j-th facial positioning point after the face images of the person to be compared and the person to be tracked are aligned, y_j characterizes the coordinates of the j-th facial positioning point of the face image of the person to be compared, y_{j0} characterizes the coordinates of the j-th facial positioning point of the face image of the person to be tracked, q characterizes the total number of facial positioning points, and b characterizes the positioning distance threshold within which points are regarded as facially similar.
Specifically, a plurality of groups of character features to be compared and the character features to be tracked are input into the face similarity evaluation function, and the similarity of each group of character features to be compared and the similarity of the character features to be tracked are calculated in sequence to obtain a plurality of groups of face similarity.
The face similarity function first sets a plurality of positioning points on the face for analysis and comparison, such as the center of the brow, the start, middle and end points of the eyebrows, the nose tip, the inner and outer eye corners, and the upper, middle and lower positions of the ears, as preset positioning points. During the similarity analysis, the face image of the person to be compared is first aligned with that of the person to be tracked, and the distances between corresponding positioning points are analyzed, with d(y_j, y_{j0}) denoting the Euclidean distance and b the minimum positioning distance deviation for facial similarity. Whenever the distance of a positioning point is less than or equal to b, the count function increments. After all positioning points are compared, the accumulated result is divided by the total number q of facial positioning points, yielding the proportion of positioning points whose deviation satisfies the threshold, i.e. the face similarity between one group of person features to be compared and the person features to be tracked, as sketched below.
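The positioning-point comparison can be sketched as below; because the body shape similarity SIM_3 of step S43 has exactly the same indicator-count form, one helper covers both, with the landmark sets and thresholds as assumed inputs:

```python
# Sketch of SIM_2 (and, with other inputs, SIM_3): the fraction of aligned
# positioning points whose Euclidean distance is at most the threshold.
import numpy as np

def landmark_similarity(points, points0, threshold):
    """points, points0: (n, 2) aligned positioning-point coordinates."""
    distances = np.linalg.norm(points - points0, axis=1)
    return float((distances <= threshold).mean())

# face: sim2 = landmark_similarity(y, y0, b)  over the q facial points
# body: sim3 = landmark_similarity(z, z0, c)  over the H points after
#        pose superposition of the two persons
```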
Further, the step S43 of constructing a body shape similarity evaluation function, traversing the multiple groups of person features to be compared and the person features to be tracked for similarity analysis, and generating multiple groups of person body shape similarities includes:
the body shape similarity evaluation function being:

$$\mathrm{SIM}_3 = \frac{1}{H}\sum_{l=1}^{H}\operatorname{count}\big(d(z_l, z_{l0}) \le c\big)$$

wherein SIM_3 characterizes the body shape similarity, H characterizes the total number of preset body shape similarity evaluation positioning points after the poses of the person to be compared and the person to be tracked are superposed and adjusted, d(z_l, z_{l0}) characterizes the Euclidean distance of the l-th positioning point after the poses are superposed, z_l characterizes the coordinates of the l-th positioning point of the person to be compared, z_{l0} characterizes the coordinates of the l-th positioning point of the person to be tracked, and c characterizes the positioning distance threshold within which points are regarded as similar in body shape.
Specifically, a plurality of groups of character features to be compared and the character features to be tracked are input into the body shape similarity evaluation function, and the similarity of each group of character features to be compared and the similarity of the character features to be tracked are calculated in sequence to obtain a plurality of groups of character body shape similarity.
Firstly, positioning points are set on the body, such as the head, neck, shoulders, elbows, hands, waist, hips, knee joints and ankle joints, H being the total number of preset positioning points. Because the postures of the persons captured on video differ, the postures must be corrected and adjusted so that they can be superposed. After the superposition, the Euclidean distance of each positioning point is analyzed, characterized as d(z_l, z_{l0}), where z_l characterizes the coordinates of the l-th positioning point of the person to be compared and z_{l0} characterizes the coordinates of the l-th positioning point of the person to be tracked. When d(z_l, z_{l0}) is less than or equal to the minimum positioning distance deviation c for body shape similarity, the count function increments. The accumulated result after all positioning points are analyzed and compared is divided by the total number H of preset body shape similarity evaluation positioning points, giving the proportion of positioning points whose deviation satisfies the threshold, i.e. the body shape similarity between one group of person features to be compared and the person features to be tracked; the landmark_similarity sketch above applies here as well, with the body positioning points and the threshold c.
The extraction source refers to the corresponding person to be compared from whom the features were extracted; for example, the extraction source of Zhang San's clothing similarity, facial similarity and body shape similarity is Zhang San. Storing by extraction source association means storing all similarity information of one person to be compared at the same address, generating that person's similarity; storing multiple groups of data in turn then yields the multiple groups of person similarities, as sketched below. Through the construction of several similarity functions and the corresponding similarity calculations, an accurate analysis of similarity is achieved and the accuracy of feature matching is improved.
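A sketch of the extraction-source association of step S44, grouping the three similarities of each compared person under one key; the dictionary layout and person identifiers are illustrative assumptions:

```python
# Sketch of step S44: store clothing, face and body shape similarities of
# each compared person together, keyed by their extraction source.
def group_by_source(clothing_sims, face_sims, body_sims):
    """Each argument maps extraction source (person id) -> similarity."""
    return {
        source: {"clothing": clothing_sims[source],
                 "face": face_sims[source],
                 "body": body_sims[source]}
        for source in clothing_sims
    }

# e.g. group_by_source({"Zhang San": 0.91}, {"Zhang San": 0.84},
#                      {"Zhang San": 0.88})
```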
In summary, the method and the system for tracking the personnel track based on the video analysis provided by the embodiment of the application have the following technical effects:
1. By jointly analyzing and processing the data collected by the cameras and by extracting and comparing the person clothing, facial and body shape features, multi-dimensional analysis and multi-camera joint tracking are realized, achieving the technical effects of improving tracking accuracy and consistency.
2. Through the calculation of several similarity functions, the feature information with the highest similarity that satisfies a preset similarity threshold is obtained and adjusted in time sequence, realizing an accurate analysis of similarity and a coherent arrangement of images; a more accurate and smoother track image is obtained, thereby providing high-quality track information for personnel tracking.
Example two
Based on the same inventive concept as the method for tracking a person's trajectory based on video analysis in the foregoing embodiments, as shown in fig. 3, an embodiment of the present application provides a system for tracking a person's trajectory based on video analysis, including:
the information decomposition and loading module 100 is used for decomposing videos from a plurality of cameras in the area to be tracked frame by frame and loading a plurality of monitoring image sequences;
the tracking feature extraction module 200 is configured to obtain a character feature to be tracked, where the character feature to be tracked includes a character clothing feature, a character face feature, and a character figure feature;
the monitoring feature extraction module 300 is configured to activate feature extraction nodes of the person comparison module according to the clothing features, the facial features and the body shape features of the person, and perform feature extraction on the plurality of monitoring image sequences to generate a plurality of groups of to-be-compared person features;
the figure similarity analysis module 400 is configured to traverse the multiple groups of to-be-compared figure features and the to-be-tracked figure features to perform similarity analysis, so as to generate multiple groups of figure similarities;
the similarity judging module 500 is configured to traverse the multiple groups of person similarities and extract a first image set of images that have the maximum similarity and satisfy a similarity threshold;
The track adjustment generation module 600 is configured to adjust the first image set according to a time sequence, and generate a person track image to be tracked;
the task execution module 700 is configured to perform personnel tracking according to the image of the person track to be tracked.
Further, the monitoring feature extraction module 300 performs the steps of:
the feature extraction nodes at least comprise clothing feature extraction nodes, facial feature extraction nodes and body shape feature extraction nodes;
activating the clothing feature extraction node, traversing the plurality of monitoring image sequences to perform feature extraction, and generating a plurality of groups of clothing features of the people to be compared;
activating the facial feature extraction node, traversing the plurality of monitoring image sequences to perform feature extraction, and generating a plurality of groups of facial features of the people to be compared;
activating the body shape feature extraction nodes, traversing the plurality of monitoring image sequences to perform feature extraction, and generating a plurality of groups of body shape features of the people to be compared;
and adding the clothing features of the multiple groups of people to be compared, the facial features of the multiple groups of people to be compared and the figure features of the multiple groups of people to be compared into the multiple groups of people to be compared.
Further, the monitoring feature extraction module 300 performs the steps of:
The clothing feature extraction node, the facial feature extraction node and the bodily form feature extraction node are obtained through downsampling convolutional neural network model training.
Further, the person similarity analysis module 400 performs the steps of:
constructing a clothing similarity evaluation function, traversing the multiple groups of character features to be compared and the character features to be tracked for similarity analysis, and generating multiple groups of character clothing similarity;
constructing a face similarity evaluation function, traversing the multiple groups of character features to be compared and the character features to be tracked for similarity analysis, and generating multiple groups of character face similarity;
constructing a figure similarity evaluation function, traversing the multiple groups of figure features to be compared and the figure features to be tracked for similarity analysis, and generating multiple groups of figure similarity;
and storing the multiple groups of character clothing similarity, the multiple groups of character face similarity and the multiple groups of character body shape similarity according to the extraction source association to generate the multiple groups of character similarity.
Further, the person similarity analysis module 400 performs the steps of:
the clothing similarity evaluation function is:

$$\mathrm{SIM}_1 = \frac{1}{k}\sum_{i=1}^{k}\operatorname{count}\big(d(x_i, x_{i0}) \le a\big)$$

wherein SIM_1 characterizes the clothing similarity, x_i characterizes the pixel value of the i-th point of the person to be compared, x_{i0} characterizes the pixel value of the pixel whose value is closest to that of the i-th point after the pose of the person to be tracked is adjusted according to the person to be compared, k characterizes the total number of pixels to be compared, and a characterizes the pixel value deviation threshold within which pixels are regarded as similar.
Further, the person similarity analysis module 400 performs the steps of:
the face similarity evaluation function is:

$$\mathrm{SIM}_2 = \frac{1}{q}\sum_{j=1}^{q}\operatorname{count}\big(d(y_j, y_{j0}) \le b\big)$$

wherein SIM_2 characterizes the face similarity, d(y_j, y_{j0}) characterizes the Euclidean distance of the j-th facial positioning point after the face images of the person to be compared and the person to be tracked are aligned, y_j characterizes the coordinates of the j-th facial positioning point of the face image of the person to be compared, y_{j0} characterizes the coordinates of the j-th facial positioning point of the face image of the person to be tracked, q characterizes the total number of facial positioning points, and b characterizes the positioning distance threshold within which points are regarded as facially similar.
Further, the person similarity analysis module 400 performs the steps of:
the body shape similarity evaluation function is:

$$\mathrm{SIM}_3 = \frac{1}{H}\sum_{l=1}^{H}\operatorname{count}\big(d(z_l, z_{l0}) \le c\big)$$

wherein SIM_3 characterizes the body shape similarity, H characterizes the total number of preset body shape similarity evaluation positioning points after the poses of the person to be compared and the person to be tracked are superposed and adjusted, d(z_l, z_{l0}) characterizes the Euclidean distance of the l-th positioning point after the poses are superposed, z_l characterizes the coordinates of the l-th positioning point of the person to be compared, z_{l0} characterizes the coordinates of the l-th positioning point of the person to be tracked, and c characterizes the positioning distance threshold within which points are regarded as similar in body shape.
Example III
An electronic device according to an embodiment of the present application includes a memory and a processor. The memory is for storing non-transitory computer readable instructions. In particular, the memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random Access Memory (RAM) and/or cache memory (cache), and the like. The non-volatile memory may include, for example, read Only Memory (ROM), hard disk, flash memory, and the like.
The processor may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device to perform the desired functions. In one embodiment of the present application, the processor is configured to execute the computer readable instructions stored in the memory, so that the electronic device performs all or part of the steps of a video analysis-based person trajectory tracking method of the embodiments of the present application described above.
It should be understood by those skilled in the art that this embodiment may also include well-known structures such as a communication bus and interfaces, and these well-known structures also fall within the protection scope of the present application.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. A schematic diagram of an electronic device suitable for use in implementing embodiments of the present application is shown. The electronic device shown in fig. 4 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments herein.
As shown in fig. 4, the electronic device may include a processing means (e.g., a central processing unit, a graphic processor, etc.), which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) or a program loaded from the storage means into a Random Access Memory (RAM). In the RAM, various programs and data required for the operation of the electronic device are also stored. The processing device, ROM and RAM are connected to each other via a bus. An input/output (I/O) interface is also connected to the bus.
In general, the following devices may be connected to the I/O interface: input means including, for example, sensors or visual information gathering devices; output devices including, for example, display screens and the like; storage devices including, for example, magnetic tape, hard disk, etc.; a communication device. The communication means may allow the electronic device to communicate wirelessly or by wire with other devices, such as edge computing devices, to exchange data. While fig. 4 shows an electronic device having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present application, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via a communication device, or installed from a storage device, or installed from ROM. All or part of the steps of a person trajectory tracking method based on video analysis are performed when the computer program is executed by a processing device.
The detailed description of the present embodiment may refer to the corresponding description in the foregoing embodiments, and will not be repeated herein.
A computer-readable storage medium according to an embodiment of the present application has stored thereon non-transitory computer-readable instructions. When executed by a processor, the non-transitory computer readable instructions perform all or part of the steps of a video analysis-based personnel trajectory tracking method described above.
The computer-readable storage medium described above includes, but is not limited to: optical storage media (e.g., CD-ROM and DVD), magneto-optical storage media (e.g., MO), magnetic storage media (e.g., magnetic tape or removable hard disk), media with built-in rewritable non-volatile memory (e.g., memory card), and media with built-in ROM (e.g., ROM cartridge).
The detailed description of the present embodiment may refer to the corresponding description in the foregoing embodiments, and will not be repeated herein.
The basic principles of the present application have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present application are merely examples and not limiting, and these advantages, benefits, effects, etc. are not to be considered as necessarily possessed by the various embodiments of the present application. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the application is not intended to be limited to the details disclosed herein as such.
In this application, relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. The block diagrams of devices, apparatuses and systems referred to in this application are merely illustrative examples and are not intended to require or imply that the connection, arrangement or configuration must be made in the manner shown; as will be appreciated by those skilled in the art, such devices, apparatuses and systems may be connected, arranged or configured in any manner. Words such as "including", "comprising" and "having" are open-ended, mean "including but not limited to", and are used interchangeably. The term "or" as used herein refers to, and is used interchangeably with, the term "and/or", unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to".
In addition, as used herein, the use of "or" in the recitation of items beginning with "at least one" indicates a separate recitation, such that recitation of "at least one of A, B or C" for example means a or B or C, or AB or AC or BC, or ABC (i.e., a and B and C). Furthermore, the term "exemplary" does not mean that the described example is preferred or better than other examples.
It is also noted that in the systems and methods of the present application, components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent to the present application.
Various changes, substitutions, and alterations are possible to the techniques described herein without departing from the teachings of the techniques defined by the appended claims. Furthermore, the scope of the claims hereof is not to be limited to the exact aspects of the process, machine, manufacture, composition of matter, means, methods and acts described above. The processes, machines, manufacture, compositions of matter, means, methods, or acts, presently existing or later to be developed that perform substantially the same function or achieve substantially the same result as the corresponding aspects described herein may be utilized. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or acts.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims (10)

1. A personnel track tracking method based on video analysis, characterized in that the method is applied to a personnel track tracking system based on video analysis, wherein the system comprises a person comparison module, and the method comprises the following steps:
decomposing, frame by frame, the videos from a plurality of cameras in an area to be tracked, and loading a plurality of monitoring image sequences;
acquiring person features to be tracked, wherein the person features to be tracked comprise person clothing features, person facial features, and person body shape features;
activating feature extraction nodes of the person comparison module according to the person clothing features, the person facial features, and the person body shape features, performing person feature extraction on the plurality of monitoring image sequences, and generating multiple groups of person features to be compared;
traversing the multiple groups of person features to be compared and the person features to be tracked to perform similarity analysis, and generating multiple groups of person similarity;
traversing the multiple groups of person similarity, extracting a plurality of maximum similarities, and obtaining a first image set that satisfies a similarity threshold;
adjusting the first image set according to time sequence to generate a track image of the person to be tracked;
and performing personnel tracking according to the track image of the person to be tracked.
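Illustration (not part of the claims): a minimal sketch of the claimed pipeline, assuming OpenCV for the frame-by-frame decomposition. The feature extractor, the similarity function, the threshold value, and every name below are hypothetical placeholders, not the patented implementation.

```python
# Hypothetical end-to-end sketch of the claimed steps (names are illustrative only).
import cv2

def decompose_videos(video_paths):
    """Step 1: decompose the video from each camera frame by frame."""
    sequences = []
    for path in video_paths:
        cap = cv2.VideoCapture(path)
        frames = []
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frames.append(frame)
        cap.release()
        sequences.append(frames)  # one monitoring image sequence per camera
    return sequences

def track_person(video_paths, features_to_track, extract_features, similarity,
                 threshold=0.8):
    """Steps 2-6: extract features, score similarity, threshold, order by time."""
    candidates = []
    for cam_id, frames in enumerate(decompose_videos(video_paths)):
        for t, frame in enumerate(frames):
            feats = extract_features(frame)             # features to be compared
            sim = similarity(feats, features_to_track)  # person similarity
            candidates.append((sim, t, cam_id, frame))
    # keep the frames whose similarity satisfies the threshold ...
    first_image_set = [c for c in candidates if c[0] >= threshold]
    # ... and adjust them according to time sequence to form the track images
    first_image_set.sort(key=lambda c: (c[1], c[2]))
    return [frame for _, _, _, frame in first_image_set]
```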
2. The method of claim 1, wherein activating the feature extraction nodes of the person comparison module according to the person clothing features, the person facial features, and the person body shape features, performing person feature extraction on the plurality of monitoring image sequences, and generating the multiple groups of person features to be compared comprises:
the feature extraction nodes at least comprise a clothing feature extraction node, a facial feature extraction node, and a body shape feature extraction node;
activating the clothing feature extraction node, traversing the plurality of monitoring image sequences to perform feature extraction, and generating multiple groups of clothing features of persons to be compared;
activating the facial feature extraction node, traversing the plurality of monitoring image sequences to perform feature extraction, and generating multiple groups of facial features of persons to be compared;
activating the body shape feature extraction node, traversing the plurality of monitoring image sequences to perform feature extraction, and generating multiple groups of body shape features of persons to be compared;
and adding the multiple groups of clothing features, the multiple groups of facial features, and the multiple groups of body shape features of the persons to be compared into the multiple groups of person features to be compared.
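Illustration (not part of the claims): a hedged sketch of how the three extraction nodes of claim 2 might be composed into one group of features to be compared; the node callables and dictionary keys are assumptions.

```python
# Hypothetical composition of the three feature-extraction nodes (claim 2):
# each node maps an image to one feature group; the groups are kept together.
def extract_person_features(frame, clothing_node, facial_node, body_shape_node):
    return {
        "clothing": clothing_node(frame),      # clothing features to be compared
        "face": facial_node(frame),            # facial features to be compared
        "body_shape": body_shape_node(frame),  # body shape features to be compared
    }
```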
3. The method of claim 2, wherein the clothing feature extraction node, the facial feature extraction node, and the body shape feature extraction node are obtained by training downsampling convolutional neural network models.
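Illustration (not part of the claims): the claim does not disclose the network architecture; as one assumption, a downsampling convolutional neural network of the kind named here could look like the following PyTorch sketch (all layer widths are invented):

```python
import torch.nn as nn

class DownsamplingFeatureNode(nn.Module):
    """Toy downsampling CNN: each stride-2 convolution halves the spatial size."""
    def __init__(self, feature_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool to 1x1
            nn.Flatten(),
        )
        self.head = nn.Linear(128, feature_dim)

    def forward(self, x):  # x: (N, 3, H, W) person image crops
        return self.head(self.backbone(x))
```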
4. The method of claim 1, wherein traversing the multiple groups of person features to be compared and the person features to be tracked to perform similarity analysis and generating the multiple groups of person similarity comprises:
constructing a clothing similarity evaluation function, traversing the multiple groups of person features to be compared and the person features to be tracked to perform similarity analysis, and generating multiple groups of person clothing similarity;
constructing a facial similarity evaluation function, traversing the multiple groups of person features to be compared and the person features to be tracked to perform similarity analysis, and generating multiple groups of person facial similarity;
constructing a body shape similarity evaluation function, traversing the multiple groups of person features to be compared and the person features to be tracked to perform similarity analysis, and generating multiple groups of person body shape similarity;
and storing the multiple groups of person clothing similarity, the multiple groups of person facial similarity, and the multiple groups of person body shape similarity in association with their extraction sources to generate the multiple groups of person similarity.
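Illustration (not part of the claims): one hypothetical way to store the three similarity groups in association with their extraction source, as claim 4 describes; the (camera, frame) keying is an assumption.

```python
# Hypothetical association of the three similarity groups by extraction source.
def associate_similarities(clothing_sims, facial_sims, body_shape_sims):
    """Each argument maps an extraction source, e.g. (camera_id, frame_index),
    to one similarity value; the result groups the three values per source."""
    person_similarities = {}
    for source in clothing_sims:
        person_similarities[source] = {
            "clothing": clothing_sims[source],
            "face": facial_sims[source],
            "body_shape": body_shape_sims[source],
        }
    return person_similarities
```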
5. The method of claim 4, wherein the clothing similarity evaluation function is:
wherein $\mathrm{SIM}_1$ denotes the clothing similarity; $x_i$ denotes the pixel value of the $i$-th point of the person to be compared; $x_{i0}$ denotes the pixel value of the pixel point whose value is closest to that of the $i$-th point after the posture of the person to be tracked has been adjusted according to the person to be compared; $k$ denotes the total number of pixels to be compared; and $a$ denotes the minimum pixel-value deviation regarded as a similar pixel point.
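The formula itself appears in the published application only as an image and is not reproduced in this text. A form consistent with the definitions above, namely the fraction of compared pixels whose value deviation stays within $a$, would be, for example:

$$\mathrm{SIM}_1 = \frac{1}{k}\sum_{i=1}^{k}\mathbf{1}\left[\,\lvert x_i - x_{i0}\rvert \le a\,\right],$$

where $\mathbf{1}[\cdot]$ is the indicator function; this reconstruction is an assumption, not the published formula.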
6. The method of claim 4, wherein the facial similarity evaluation function is:
wherein $\mathrm{SIM}_2$ denotes the facial similarity; $d(y_j, y_{j0})$ denotes the Euclidean distance of the $j$-th facial positioning point after the face images of the person to be compared and the person to be tracked are aligned; $y_j$ denotes the coordinates of the $j$-th facial positioning point of the face image of the person to be compared; $y_{j0}$ denotes the coordinates of the $j$-th facial positioning point of the face image of the person to be tracked; $q$ denotes the total number of facial positioning points; and $b$ denotes the minimum positioning-distance deviation regarded as facial similarity.
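As with claim 5, the formula is published only as an image. A form consistent with the definitions above, counting the fraction of aligned facial positioning points whose distance stays within $b$, would be, for example:

$$\mathrm{SIM}_2 = \frac{1}{q}\sum_{j=1}^{q}\mathbf{1}\left[\,d(y_j, y_{j0}) \le b\,\right],$$

again an assumed reconstruction rather than the published formula.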
7. The method of claim 4, wherein the body shape similarity evaluation function is:
wherein $\mathrm{SIM}_3$ denotes the body shape similarity; $H$ denotes the total number of preset body-shape-similarity evaluation positioning points after the postures of the person to be compared and the person to be tracked are superimposed and adjusted; $d(z_l, z_{l0})$ denotes the Euclidean distance of the $l$-th positioning point after posture superposition of the person to be compared and the person to be tracked; $z_l$ denotes the coordinates of the $l$-th positioning point of the person to be compared; $z_{l0}$ denotes the coordinates of the $l$-th positioning point of the person to be tracked; and $c$ denotes the minimum positioning-distance deviation regarded as body shape similarity.
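The same caveat applies; a form consistent with the definitions above, counting the fraction of superimposed positioning points whose distance stays within $c$, would be, for example:

$$\mathrm{SIM}_3 = \frac{1}{H}\sum_{l=1}^{H}\mathbf{1}\left[\,d(z_l, z_{l0}) \le c\,\right],$$

an assumed reconstruction, not the published formula.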
8. A personnel track tracking system based on video analysis, comprising:
an information decomposition and loading module, used for decomposing, frame by frame, the videos from a plurality of cameras in an area to be tracked, and loading a plurality of monitoring image sequences;
a tracking feature extraction module, used for acquiring person features to be tracked, wherein the person features to be tracked comprise person clothing features, person facial features, and person body shape features;
a monitoring feature extraction module, used for activating feature extraction nodes of the person comparison module according to the person clothing features, the person facial features, and the person body shape features, performing person feature extraction on the plurality of monitoring image sequences, and generating multiple groups of person features to be compared;
a person similarity analysis module, used for traversing the multiple groups of person features to be compared and the person features to be tracked to perform similarity analysis and generate multiple groups of person similarity;
a similarity judging module, used for traversing the multiple groups of person similarity, extracting a plurality of maximum similarities, and obtaining a first image set that satisfies a similarity threshold;
a track adjustment generation module, used for adjusting the first image set according to time sequence to generate a track image of the person to be tracked;
and a task execution module, used for performing personnel tracking according to the track image of the person to be tracked.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the personnel track tracking method based on video analysis of any one of claims 1 to 7.
10. A computer-readable storage medium storing computer instructions for causing a computer to perform the personnel track tracking method based on video analysis of any one of claims 1 to 7.
CN202311181347.6A 2023-09-13 2023-09-13 Personnel track tracking method and system based on video analysis Pending CN117557593A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311181347.6A CN117557593A (en) 2023-09-13 2023-09-13 Personnel track tracking method and system based on video analysis

Publications (1)

Publication Number Publication Date
CN117557593A 2024-02-13

Family ID=89811719

Legal Events

Date Code Title Description
PB01 Publication