CN112132873A - Multi-lens pedestrian recognition and tracking based on computer vision - Google Patents
Multi-lens pedestrian recognition and tracking based on computer vision
- Publication number
- CN112132873A (application CN202011013830.XA)
- Authority
- CN
- China
- Prior art keywords
- tracking
- pedestrian
- picture
- camera
- gait
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/292—Multi-camera tracking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
- G06V40/25—Recognition of walking or running movements, e.g. gait recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30232—Surveillance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
Abstract
The invention relates to the technical field of video monitoring, and in particular to multi-lens pedestrian recognition and tracking based on computer vision, comprising the following steps: step 1: deployment of camera equipment and video stream acquisition; step 2: a tracking module under a single camera; step 4: cross-camera tracking; step 5: a re_label module that resolves pedestrian ID-SWITCH. For the problem of pedestrian IDs being exchanged when people cross paths, the invention provides an ID error-correction module that distinguishes trajectories by analysing the gait features of the pedestrians at the moment of the exchange, thereby achieving ID correction; the method can be used for community security, early warning of children going missing or of dangerous behaviour, crowd-trajectory analysis in supermarkets, and similar applications. Because the computer-vision pipeline can run uninterrupted in the background, efficiency and accuracy are markedly improved.
Description
Technical Field
The invention relates to the technical field of video monitoring, in particular to multi-lens pedestrian identification and tracking based on computer vision.
Background
At present, road monitoring systems around the world are entering a stage of rapid expansion and change, and under these shifting demands a security monitoring system needs more integrated solutions incorporating artificial intelligence (AI). Modern public security is no longer limited to endlessly expanding the density and breadth of image-monitoring coverage or pursuing ultra-high definition; rather, the traditional era of security monitoring is being advanced by AI tools and methods, turning into an AI security-monitoring era focused on data acquisition, application, and management. As the number of monitoring devices grows and image resolution continues to improve, the volume of image data collected for public security increases geometrically, and higher resolution also raises the processing load and utilisation of servers. Security image monitoring therefore faces great challenges in image retrieval, access control, data storage, data computation, and related technologies.
Cross-camera multi-target tracking, hereinafter referred to as MTMC, is a very important research topic in the field of surveillance video. At present there are some good solutions for single-target tracking and for multi-target tracking under a single camera, but the MTMC field has not yet formed a settled set of solutions and leaves very large research space.
Therefore, the invention discloses a cross-camera pedestrian tracking method. Behaviour trajectories of persons are generated from the data collected by different cameras; behaviour analysis is performed on persons of special concern and early warnings are issued; behaviour habits can be derived from the trajectory analysis; sequences of a target person can be retrieved; and manual work is reduced.
Disclosure of Invention
The invention aims to provide multi-lens pedestrian recognition and tracking based on computer vision.
In order to achieve the purpose, the invention adopts the following technical scheme:
Provided is a method for multi-lens pedestrian recognition and tracking based on computer vision, comprising the following steps:
step 1: deployment of camera devices and video stream capture
Cameras are arranged at the monitored area's important entrances, along its paths, at fork junctions, and at other key locations, and are used for tracking and identifying pedestrians; when a given camera or video stream is accessed, streams are pulled over the intranet's internal switching network using the RTSP access protocol, with a limit set on the number of access connections;
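As a concrete illustration of this deployment step, the sketch below registers cameras by RTSP URL and enforces a cap on access connections. All names here (CameraHub, max_connections, the URL path) are illustrative assumptions, not taken from the patent; a real system would open each registered URL with a video-capture API such as OpenCV's cv2.VideoCapture.

```python
class CameraHub:
    """Toy sketch of step 1: registering RTSP cameras with a connection cap.

    The class and parameter names are hypothetical; the patent only states
    that RTSP is the access protocol and that access connections are limited.
    """

    def __init__(self, max_connections=4):
        self.max_connections = max_connections  # limit on access connections
        self.streams = {}                       # camera id -> RTSP URL

    def rtsp_url(self, host, port=554, path="stream1"):
        # RTSP is the access protocol specified for pulling video streams
        return f"rtsp://{host}:{port}/{path}"

    def register(self, cam_id, host):
        if len(self.streams) >= self.max_connections:
            raise RuntimeError("connection limit reached")
        self.streams[cam_id] = self.rtsp_url(host)
        return self.streams[cam_id]

hub = CameraHub(max_connections=2)
url = hub.register("entrance", "192.168.1.10")
```

In a real deployment each stored URL would then be opened on the intranet's internal switching network and decoded frame by frame.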
step 2: tracking module under single camera
Given a segment of video, the JDE model processes each frame and outputs bounding boxes together with their corresponding appearance embeddings; an association matrix is computed between the embeddings of the observations and the embeddings in the pool of pre-existing tracklets; the observations are assigned to the tracklets using the Hungarian algorithm; a Kalman filter smooths the trajectories and predicts the position of each previous tracklet in the current frame; if an assigned observation is spatially too far from the predicted location, the assignment is rejected; the embedding of each matched tracklet is then updated, and a tracklet with no assigned observation is flagged as missing; if a tracklet stays missing for longer than a given threshold, it is deleted from the current tracklet pool, otherwise it may be found again in a later assignment step. JDE is an early attempt to jointly learn the detector and the embedding model in a single deep network: the network simultaneously outputs the detection results and the corresponding appearance embeddings of the detection boxes. By contrast, the SDE method and the two-stage method resample pixels (bounding boxes) and feature maps, respectively, and feed them into a separate re-ID model to extract appearance features. The loss function of JDE adopts the trihard loss, which builds on the triplet loss by accounting for the contribution of hard samples to the final loss. Suppose two input pictures I1 and I2 yield features f1 and f2 through forward propagation of the network; the Euclidean distance between the feature vectors of the two pictures is:
d_{I1,I2} = ||f1 - f2||_2
Picture a and picture p form a positive sample pair, while picture a and picture n form a negative sample pair; the triplet loss is:
L_t = (d_{a,p} - d_{a,n} + α)_+
where (x)_+ denotes max(x, 0), and α is a manually set margin (threshold) parameter.
Trihard loss:
For each training batch, P pedestrian IDs are selected at random, and K different pictures are selected at random for each of these pedestrians, so one batch contains P×K pictures; A is the set of pictures with the same ID as anchor a, and B is the set of pictures with the remaining IDs;
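The trihard loss described above can be sketched numerically. The NumPy implementation below is an interpretation of the patent's description rather than its code: for each anchor a it takes the hardest positive from set A and the hardest negative from set B, then applies L_t = (d_{a,p} - d_{a,n} + α)_+ and averages over the batch.

```python
import numpy as np

def trihard_loss(features, ids, alpha=0.3):
    """Batch-hard ("trihard") triplet loss sketch for a P*K batch.

    features: (N, D) embeddings; ids: (N,) pedestrian IDs.
    For each anchor a, the hardest positive is the largest distance within
    A (same ID), the hardest negative the smallest distance within B
    (other IDs); alpha is the manually set margin from the formula above.
    """
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)        # Euclidean d_{Ii,Ij}
    same = ids[:, None] == ids[None, :]
    losses = []
    for a in range(len(ids)):
        pos = same[a].copy()
        pos[a] = False                                  # A: same ID, not anchor
        neg = ~same[a]                                  # B: different IDs
        d_ap = dist[a][pos].max()                       # hardest positive
        d_an = dist[a][neg].min()                       # hardest negative
        losses.append(max(d_ap - d_an + alpha, 0.0))    # (x)_+ = max(x, 0)
    return float(np.mean(losses))
```

The alpha value of 0.3 is an illustrative default, not a value specified by the patent.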
and step 3: re _ ID combined single-lens pedestrian with same ID
Global features are combined with multi-granularity local features: the global feature is responsible for extracting the common, macro-level characteristics of the whole image, and the image is then cut into different blocks, each with a different granularity, responsible for extracting features at a different level or scale; combining the global and local features provides rich information and detail to represent the complete condition of an input picture;
step 4: Cross-camera tracking
The mean of the pedestrian features over a whole track is taken as that track's feature, and re-ranking is then adopted to further optimise the ordering; for the cross-camera tracking problem, a binary-tree search traversal is adopted, in which the feature matching uses the extracted re-ID features, whose quality depends on the robustness of the model;
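The track-level feature and the initial ranking can be sketched as below. The re-ranking refinement (e.g. k-reciprocal re-ranking) is deliberately omitted, and plain Euclidean distance stands in for whatever metric the trained model would induce; both simplifications are assumptions of this sketch.

```python
import numpy as np

def track_feature(frame_feats):
    """Average the per-frame re-ID features over a whole track (step 4)."""
    return np.mean(frame_feats, axis=0)

def rank_gallery(query, gallery):
    """Rank gallery track features by Euclidean distance to the query.

    A real system would refine this initial ordering with re-ranking;
    that refinement is omitted here for brevity.
    """
    d = np.linalg.norm(gallery - query, axis=1)
    return np.argsort(d)
```

Tracks whose features rank closest across cameras are candidates for the same pedestrian ID.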
step 5: Re_label module for resolving pedestrian ID-SWITCH
Pedestrians at different positions monitored by multiple surveillance cameras are linked; when visual appearance features are unreliable, for example because a pedestrian wears a mask or changes make-up, analysis based on behavioural features such as gait becomes an alternative solution to the person re-identification problem. Analysis of the single-shot tracking results shows that, after tracks with the same ID are merged through the re-ID features, a portion of ID-SWITCH cases remain in which the IDs of two people are interchanged when they meet; these IDs must be resolved by a correction module. The module comprises two sub-modules, target-object gait-feature extraction and object re-identification: the gait-feature extraction sub-module detects the object by foreground detection and extracts the target object's silhouette, gait cycle, and walking angle, and finally the fused gait features are sent to the object re-identification sub-module; the re-identification sub-module finds, within the candidate object set, the top three re-identification results that most closely match the target object.
Further, the cameras deployed in step 1 comprise a plurality of horizontally arranged cameras that can rotate through a wide angle without interfering with one another.
Further, in step 3 the image is cut into different blocks, each with a different granularity; each block is assigned a different value n (n = 1, 2, 3, …), with different values of n corresponding to features at different levels, and the level of detail captured by the model increasing as n increases.
Further, in step 4, in addition to the three overlapping camera pairs, a merging operation over some single-camera tracks is also performed during cross-camera tracking.
Further, the implementation method of gait feature extraction in step 5 is as follows:
1) preprocess the picture, normalising the pedestrian's ROI and scale;
2) extract the GEI (gait energy image) of the picture;
3) extract HOG features from the GEI;
4) construct a training set and a test set and train an SVM classifier;
5) label the pedestrian with the trained model.
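Step 2) of this pipeline, the gait energy image, is simply the per-pixel average of the binary silhouettes over one gait cycle, and can be sketched as follows. Silhouette extraction, the HOG descriptor, and the SVM classifier of the other steps are assumed to come from standard tools (e.g. scikit-image and scikit-learn) and are not reproduced here.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """GEI sketch for step 5's gait pipeline.

    silhouettes: sequence of (H, W) binary masks covering one gait cycle.
    Returns the per-pixel mean, a float image with values in [0, 1] where
    bright pixels mark body regions that stay occupied across the cycle.
    """
    stack = np.asarray(silhouettes, dtype=float)  # (T, H, W)
    return stack.mean(axis=0)
```

HOG features would then be computed on this averaged image before training the SVM.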
Further, in the ID-correction implementation of step 5, the target's spatial position in each frame is predicted by Kalman filtering; when a target is occluded, this brings a large prediction error. A reasonable-region search method is adopted to correct part of the IDs: assume the centre of the search region is (x_c, y_c) with radius r, and that (x_d, y_d) is the centre position of the rectangular region of any detection r_d in the current frame; the detection is considered if the condition is satisfied:
where N_r is the number of frames for which the target has been in the unassociated state; the similarity is then computed from the gait features of the pedestrians within the region, and the subsequent re-matching is performed.
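Since the condition formula itself is not reproduced in the text above, the gating sketch below encodes one natural reading, stated explicitly as an assumption: a detection r_d is a candidate when the Euclidean distance between the search centre (x_c, y_c) and the detection's rectangle centre is at most the radius r.

```python
import numpy as np

def in_search_region(center, det_center, radius):
    """Hypothetical gating test for the ID-correction region search.

    This distance-within-radius condition is an assumed reading of the
    patent's (unreproduced) condition formula, not the formula itself.
    """
    xc, yc = center
    xd, yd = det_center
    return float(np.hypot(xd - xc, yd - yc)) <= radius
```

Detections passing the gate would then be compared by gait-feature similarity for re-matching.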
The invention has the beneficial effects that:
1. The method can be used for community security: tracking the elderly enables early warning of falls, and tracking children enables early warning of their going missing or of dangerous behaviour; it can likewise be used for crowd-trajectory analysis in supermarkets, so that a supermarket can analyse the data to achieve intelligent shelf management.
2. When video data volumes are large, near-manual video retrieval is abandoned while high efficiency and correctness are ensured; by means of computer vision, processing can run uninterrupted in the background, markedly improving efficiency and accuracy.
3. Pedestrian ID-SWITCH is resolved through the re_label module: when two people meet and their IDs are exchanged, a series of features such as the objects' gait features is used to identify them and re-associate the earlier tracks, avoiding interference and errors, so the overall accuracy is higher.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required by the embodiments are briefly described below.
FIG. 1 is an overall framework flow of the present invention;
fig. 2 is a re _ label module provided by the present invention.
Detailed Description
The technical scheme of the invention is further explained below through specific embodiments in combination with the drawings.
The drawings are for illustration only, show the schematic rather than the actual form, and are not to be construed as limiting the patent; to better illustrate the embodiments of the invention, some components in the drawings may be omitted, enlarged, or reduced, and do not represent the size of an actual product.
Referring to fig. 1 and 2, a computer vision based multi-lens pedestrian recognition and tracking includes the following steps:
step 1: deployment of camera devices and video stream capture
Cameras are arranged at the monitored area's important entrances, along its paths, at fork junctions, and at other key locations, and are used for tracking and identifying pedestrians; when a given camera or video stream is accessed, streams are pulled over the intranet's internal switching network using the RTSP access protocol, with a limit set on the number of access connections;
step 2: tracking module under single camera
Given a segment of video, the JDE model processes each frame and outputs bounding boxes together with their corresponding appearance embeddings; an association matrix is computed between the embeddings of the observations and the embeddings in the pool of pre-existing tracklets; the observations are assigned to the tracklets using the Hungarian algorithm; a Kalman filter smooths the trajectories and predicts the position of each previous tracklet in the current frame; if an assigned observation is spatially too far from the predicted location, the assignment is rejected; the embedding of each matched tracklet is then updated, and a tracklet with no assigned observation is flagged as missing; if a tracklet stays missing for longer than a given threshold, it is deleted from the current tracklet pool, otherwise it may be found again in a later assignment step. JDE is an early attempt to jointly learn the detector and the embedding model in a single deep network: the network simultaneously outputs the detection results and the corresponding appearance embeddings of the detection boxes. By contrast, the SDE method and the two-stage method resample pixels (bounding boxes) and feature maps, respectively, and feed them into a separate re-ID model to extract appearance features. The loss function of JDE adopts the trihard loss, which builds on the triplet loss by accounting for the contribution of hard samples to the final loss. Suppose two input pictures I1 and I2 yield features f1 and f2 through forward propagation of the network; the Euclidean distance between the feature vectors of the two pictures is:
d_{I1,I2} = ||f1 - f2||_2
Picture a and picture p form a positive sample pair, while picture a and picture n form a negative sample pair; the triplet loss is:
L_t = (d_{a,p} - d_{a,n} + α)_+
where (x)_+ denotes max(x, 0), and α is a manually set margin (threshold) parameter.
Trihard loss:
For each training batch, P pedestrian IDs are selected at random, and K different pictures are selected at random for each of these pedestrians, so one batch contains P×K pictures; A is the set of pictures with the same ID as anchor a, and B is the set of pictures with the remaining IDs;
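The association loop of this step can be sketched end to end. The toy version below substitutes a greedy assignment for the Hungarian algorithm and a plain Euclidean gate for the Kalman-predicted gating, purely to show the data flow; every parameter name and the gate value are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def associate(track_embs, det_embs, track_pred, det_pos, gate=50.0):
    """Simplified sketch of the step-2 association between tracklets and
    detections: embedding distance builds the cost matrix, and a match is
    rejected when the detection lies too far from the predicted position.
    """
    # cost[t, d] = Euclidean distance between tracklet and detection embeddings
    cost = np.linalg.norm(track_embs[:, None] - det_embs[None, :], axis=2)
    matches, used = {}, set()
    for t in np.argsort(cost.min(axis=1)):            # greedy, cheapest first
        for d in np.argsort(cost[t]):
            too_far = np.linalg.norm(track_pred[t] - det_pos[d]) > gate
            if d in used or too_far:                   # spatial gating
                continue
            matches[int(t)] = int(d)
            used.add(d)
            break
    return matches  # tracklets absent from matches would be flagged missing
```

A production tracker would replace the greedy loop with an optimal assignment solver and update each matched tracklet's embedding and Kalman state afterwards.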
and step 3: re _ ID combined single-lens pedestrian with same ID
Global features are combined with multi-granularity local features: the global feature is responsible for extracting the common, macro-level characteristics of the whole image, and the image is then cut into different blocks, each with a different granularity, responsible for extracting features at a different level or scale; combining the global and local features provides rich information and detail to represent the complete condition of an input picture;
step 4: Cross-camera tracking
The mean of the pedestrian features over a whole track is taken as that track's feature, and re-ranking is then adopted to further optimise the ordering; for the cross-camera tracking problem, a binary-tree search traversal is adopted, in which the feature matching uses the extracted re-ID features, whose quality depends on the robustness of the model;
step 5: Re_label module for resolving pedestrian ID-SWITCH
Pedestrians at different positions monitored by multiple surveillance cameras are linked; when visual appearance features are unreliable, for example because a pedestrian wears a mask or changes make-up, analysis based on behavioural features such as gait becomes an alternative solution to the person re-identification problem. Analysis of the single-shot tracking results shows that, after tracks with the same ID are merged through the re-ID features, a portion of ID-SWITCH cases remain in which the IDs of two people are interchanged when they meet; these IDs must be resolved by a correction module. The module comprises two sub-modules, target-object gait-feature extraction and object re-identification: the gait-feature extraction sub-module detects the object by foreground detection and extracts the target object's silhouette, gait cycle, and walking angle, and finally the fused gait features are sent to the object re-identification sub-module; the re-identification sub-module finds, within the candidate object set, the top three re-identification results that most closely match the target object.
The cameras deployed in step 1 comprise a plurality of horizontally arranged cameras that can rotate through a wide angle without interfering with one another.
In step 3, the image is cut into different blocks, each with a different granularity; each block is assigned a different value n (n = 1, 2, 3, …), with different values of n corresponding to features at different levels, and the level of detail captured by the model increasing as n increases.
In step 4, in addition to the three overlapping camera pairs, a merging operation over some single-camera tracks is also performed during cross-camera tracking.
The gait feature extraction implementation method in the step 5 comprises the following steps:
1) preprocess the picture, normalising the pedestrian's ROI and scale;
2) extract the GEI (gait energy image) of the picture;
3) extract HOG features from the GEI;
4) construct a training set and a test set and train an SVM classifier;
5) label the pedestrian with the trained model.
In the ID-correction implementation of step 5, the target's spatial position in each frame is predicted by Kalman filtering; when a target is occluded, this brings a large prediction error. A reasonable-region search method is adopted to correct part of the IDs: assume the centre of the search region is (x_c, y_c) with radius r, and that (x_d, y_d) is the centre position of the rectangular region of any detection r_d in the current frame; the detection is considered if the condition is satisfied:
where N_r is the number of frames for which the target has been in the unassociated state; the similarity is then computed from the gait features of the pedestrians within the region, and the subsequent re-matching is performed.
The foregoing is merely exemplary and illustrative of the present invention, and various modifications, additions, and substitutions may be made by those skilled in the art to the specific embodiments described without departing from the scope of the invention as defined in the appended claims.
Claims (6)
1. A method of multi-lens pedestrian recognition and tracking based on computer vision, comprising the following steps:
step 1: deployment of camera devices and video stream capture
Cameras are arranged at the monitored area's important entrances, along its paths, at fork junctions, and at other key locations, and are used for tracking and identifying pedestrians; when a given camera or video stream is accessed, streams are pulled over the intranet's internal switching network using the RTSP access protocol, with a limit set on the number of access connections;
step 2: tracking module under single camera
Given a segment of video, the JDE model processes each frame and outputs bounding boxes together with their corresponding appearance embeddings; an association matrix is computed between the embeddings of the observations and the embeddings in the pool of pre-existing tracklets; the observations are assigned to the tracklets using the Hungarian algorithm; a Kalman filter smooths the trajectories and predicts the position of each previous tracklet in the current frame; if an assigned observation is spatially too far from the predicted location, the assignment is rejected; the embedding of each matched tracklet is then updated, and a tracklet with no assigned observation is flagged as missing; if a tracklet stays missing for longer than a given threshold, it is deleted from the current tracklet pool, otherwise it may be found again in a later assignment step. JDE is an early attempt to jointly learn the detector and the embedding model in a single deep network: the network simultaneously outputs the detection results and the corresponding appearance embeddings of the detection boxes. By contrast, the SDE method and the two-stage method resample pixels (bounding boxes) and feature maps, respectively, and feed them into a separate re-ID model to extract appearance features. The loss function of JDE adopts the trihard loss, which builds on the triplet loss by accounting for the contribution of hard samples to the final loss. Suppose two input pictures I1 and I2 yield features f1 and f2 through forward propagation of the network; the Euclidean distance between the feature vectors of the two pictures is:
d_{I1,I2} = ||f1 - f2||_2
Picture a and picture p form a positive sample pair, while picture a and picture n form a negative sample pair; the triplet loss is:
L_t = (d_{a,p} - d_{a,n} + α)_+
where (x)_+ denotes max(x, 0), and α is a manually set margin (threshold) parameter.
Trihard loss:
For each training batch, P pedestrian IDs are selected at random, and K different pictures are selected at random for each of these pedestrians, so one batch contains P×K pictures; A is the set of pictures with the same ID as anchor a, and B is the set of pictures with the remaining IDs;
and step 3: re _ ID combined single-lens pedestrian with same ID
Global features are combined with multi-granularity local features: the global feature is responsible for extracting the common, macro-level characteristics of the whole image, and the image is then cut into different blocks, each with a different granularity, responsible for extracting features at a different level or scale; combining the global and local features provides rich information and detail to represent the complete condition of an input picture;
step 4: Cross-camera tracking
The mean of the pedestrian features over a whole track is taken as that track's feature, and re-ranking is then adopted to further optimise the ordering; for the cross-camera tracking problem, a binary-tree search traversal is adopted, in which the feature matching uses the extracted re-ID features, whose quality depends on the robustness of the model;
step 5: Re_label module for resolving pedestrian ID-SWITCH
Pedestrians at different positions monitored by multiple surveillance cameras are linked; when visual appearance features are unreliable, for example because a pedestrian wears a mask or changes make-up, analysis based on behavioural features such as gait becomes an alternative solution to the person re-identification problem. Analysis of the single-shot tracking results shows that, after tracks with the same ID are merged through the re-ID features, a portion of ID-SWITCH cases remain in which the IDs of two people are interchanged when they meet; these IDs must be resolved by a correction module. The module comprises two sub-modules, target-object gait-feature extraction and object re-identification: the gait-feature extraction sub-module detects the object by foreground detection and extracts the target object's silhouette, gait cycle, and walking angle, and finally the fused gait features are sent to the object re-identification sub-module; the re-identification sub-module finds, within the candidate object set, the top three re-identification results that most closely match the target object.
2. The computer vision based multi-lens pedestrian recognition and tracking method as claimed in claim 1, wherein the cameras deployed in step 1 comprise a plurality of horizontally arranged cameras, and the cameras can rotate at a wide angle and do not interfere with each other.
3. The computer-vision-based multi-lens pedestrian recognition and tracking method according to claim 1, wherein in step 3 the image is cut into different blocks, each with a different granularity, each block being assigned a different value n (n = 1, 2, 3, …), with different values of n corresponding to features at different levels, and the level of detail increasing as n increases.
4. A computer vision based multi-lens pedestrian recognition and tracking method in accordance with claim 1, wherein in step 4, in addition to three overlapping camera pairs, a plurality of single camera tracks are merged during cross-camera tracking.
5. The computer vision-based multi-lens pedestrian recognition and tracking method according to claim 1, wherein the gait feature extraction in step 5 is implemented as follows:
1) preprocess the picture, normalizing the pedestrian ROI and scale;
2) extract the GEI (gait energy image) of the picture;
3) extract HOG features from the GEI;
4) construct training and test sets and train an SVM classifier;
5) label pedestrians using the trained model.
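Step 2) above, computing the GEI, can be sketched as the per-pixel mean of aligned binary silhouettes over one gait cycle; representing silhouettes as nested 0/1 lists is an illustrative simplification (steps 3)-5) would typically use library routines such as `skimage.feature.hog` and `sklearn.svm.SVC`, which are assumptions here, not named in the claim):

```python
def gait_energy_image(silhouettes):
    """GEI: the per-pixel mean of size-normalised binary silhouettes
    (equal-sized 2-D 0/1 grids) accumulated over one gait cycle."""
    n = len(silhouettes)
    h, w = len(silhouettes[0]), len(silhouettes[0][0])
    gei = [[0.0] * w for _ in range(h)]
    for s in silhouettes:
        for y in range(h):
            for x in range(w):
                gei[y][x] += s[y][x] / n
    return gei
```

Pixels that stay foreground through the whole cycle approach 1.0, while pixels swept only briefly by the limbs take intermediate values, which is what makes the GEI a compact motion descriptor.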
6. The computer vision based multi-lens pedestrian recognition and tracking method according to claim 1, wherein the ID correction in step 5 is implemented by predicting the spatial position of each target in every frame through Kalman filtering; when the target is occluded, the prediction error grows large, so partial ID correction is performed by searching a reasonable area. Assume the center of the search area is (x_c, y_c) with radius r, and (x_d, y_d) is the center of the rectangular region of any detection r_d in the current frame; if the corresponding condition is satisfied (N_r being the number of frames for which the target has been in an unassociated state), the similarity is then calculated from the gait features of the pedestrians in the area and the subsequent re-matching is performed.
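The search-area test described above can be sketched as follows; since the claim does not publish the exact condition, both the Euclidean-distance form and the idea of widening the radius with the number of unassociated frames N_r are assumptions for illustration:

```python
import math

def in_search_region(xc, yc, xd, yd, base_radius, n_r, growth=1.1):
    """Decide whether a detection centre (xd, yd) lies inside the search
    region around the Kalman-predicted position (xc, yc).  Growing the
    radius with n_r (frames spent unassociated) is an assumed heuristic:
    the longer a track has been lost, the wider the plausible region."""
    r = base_radius * growth ** n_r  # widen the region as the track ages
    return math.hypot(xd - xc, yd - yc) <= r
```

Detections passing this gate would then be compared by gait-feature similarity to decide the re-match.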
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011013830.XA CN112132873A (en) | 2020-09-24 | 2020-09-24 | Multi-lens pedestrian recognition and tracking based on computer vision |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112132873A true CN112132873A (en) | 2020-12-25 |
Family
ID=73839188
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011013830.XA Pending CN112132873A (en) | 2020-09-24 | 2020-09-24 | Multi-lens pedestrian recognition and tracking based on computer vision |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112132873A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113674321A (en) * | 2021-08-25 | 2021-11-19 | Yanshan University | Cloud-based multi-target tracking method under surveillance video
CN114120373A (en) * | 2022-01-24 | 2022-03-01 | Suzhou Inspur Intelligent Technology Co., Ltd. | Model training method, device, equipment and storage medium
CN114642863A (en) * | 2022-03-16 | 2022-06-21 | Wenzhou University | Outdoor sports game system for kindergarten
CN115100591A (en) * | 2022-06-17 | 2022-09-23 | Harbin Institute of Technology | Multi-target tracking and target re-identification system and method based on joint learning
CN117253283A (en) * | 2023-08-09 | 2023-12-19 | China Three Gorges University | Wheelchair following method based on fusion of image information and electromagnetic positioning information data
Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101488185A (en) * | 2009-01-16 | 2009-07-22 | Harbin Engineering University | Partitioned matrix-based gait recognition method
CN104794449A (en) * | 2015-04-27 | 2015-07-22 | Qingdao University of Science and Technology | Gait energy image acquisition method based on human body HOG (histogram of oriented gradients) features and identity recognition method
CN204706039U (en) * | 2015-05-29 | 2015-10-14 | Hangzhou Shengyuan Chip Technology Co., Ltd. | Bar code recognition device based on multiple lenses
CN108509859A (en) * | 2018-03-09 | 2018-09-07 | Nanjing University of Posts and Telecommunications | Non-overlapping-region pedestrian tracking method based on a deep neural network
CN108875588A (en) * | 2018-05-25 | 2018-11-23 | Wuhan University | Cross-camera pedestrian detection and tracking based on deep learning
CN108921019A (en) * | 2018-05-27 | 2018-11-30 | Beijing University of Technology | Gait recognition method based on GEI and TripletLoss-DenseNet
CN109271888A (en) * | 2018-08-29 | 2019-01-25 | Hanwang Technology Co., Ltd. | Gait-based identity recognition method, apparatus and electronic device
CN109389017A (en) * | 2017-08-11 | 2019-02-26 | Suzhou Institute of Trade and Commerce | Pedestrian re-identification method
CN109472191A (en) * | 2018-09-17 | 2019-03-15 | Xidian University | Pedestrian re-identification and tracking method based on spatio-temporal context
CN109635695A (en) * | 2018-11-28 | 2019-04-16 | Xi'an University of Technology | Pedestrian re-identification method based on triplet convolutional neural networks
CN109903312A (en) * | 2019-01-25 | 2019-06-18 | Beijing University of Technology | Method for counting the running distance of football players based on video multi-target tracking
AU2017279658A1 (en) * | 2017-12-20 | 2019-07-04 | Canon Kabushiki Kaisha | Pose-aligned descriptor for person re-id with geometric and orientation information
CN109993116A (en) * | 2019-03-29 | 2019-07-09 | Shanghai University of Engineering Science | Pedestrian re-identification method based on skeleton mutual learning
CN110223329A (en) * | 2019-05-10 | 2019-09-10 | Huazhong University of Science and Technology | Multi-camera multi-object tracking method
CN110619268A (en) * | 2019-08-07 | 2019-12-27 | Beijing Institute of New Technology Applications | Pedestrian re-identification method and device based on spatio-temporal analysis and depth features
WO2020098158A1 (en) * | 2018-11-14 | 2020-05-22 | Ping An Technology (Shenzhen) Co., Ltd. | Pedestrian re-identification method and apparatus, and computer-readable storage medium
CN111259786A (en) * | 2020-01-14 | 2020-06-09 | Zhejiang University | Pedestrian re-identification method based on synchronous enhancement of video appearance and motion information
Non-Patent Citations (10)
Title |
---|
ALEXANDER HERMANS et al.: "In Defense of the Triplet Loss for Person Re-Identification", ARXIV *
GUANSHUO WANG et al.: "Learning Discriminative Features with Multiple Granularities", ARXIV *
WANG T Q et al.: "Person Re-identification by Video Ranking", EUROPEAN CONFERENCE ON COMPUTER VISION (ECCV 2014), 31 December 2014, pages 688-703 *
XINLIANG TANG et al.: "Research on the pedestrian re-identification method based on local features and gait energy images", COMPUTERS, MATERIALS & CONTINUA, vol. 64, no. 2 *
ZHIMENG ZHANG et al.: "Multi-Target, Multi-Camera Tracking by Hierarchical Clustering", ARXIV:1712.09531V1, pages 1-4 *
ZHONGDAO WANG, LIANG ZHENG: "Towards Real-Time Multi-Object Tracking", ARXIV, pages 1-9 *
ZHANG LIANG et al.: "Research on person re-identification with multi-granularity feature fusion", Chinese Journal of Liquid Crystals and Displays, vol. 35, no. 6 *
ZHAO LINGZHEN et al.: "A survey of person re-identification technology", Journal of Guizhou Normal University (Natural Science Edition), vol. 37, no. 6, 31 December 2019, pages 114-122 *
SHAO JIAYAO, SONG CHUNLIN: "Research on a cross-camera person re-identification algorithm based on gait recognition", Information Technology and Informatization, vol. 2019, no. 12, pages 85-89 *
CHEN LIANGYU, LI WEIJIANG: "Person re-identification with multi-shape local-region neural network structures", Journal of Image and Graphics, vol. 24, no. 11, pages 1932-1941 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112132873A (en) | Multi-lens pedestrian recognition and tracking based on computer vision | |
Zhang et al. | Learning semantic scene models by object classification and trajectory clustering | |
CN103824070B (en) | A kind of rapid pedestrian detection method based on computer vision | |
Nascimento et al. | Trajectory classification using switched dynamical hidden Markov models | |
CN103246896B (en) | A kind of real-time detection and tracking method of robustness vehicle | |
CN102243765A (en) | Multi-camera-based multi-objective positioning tracking method and system | |
Nguyen et al. | Lmgp: Lifted multicut meets geometry projections for multi-camera multi-object tracking | |
Amosa et al. | Multi-camera multi-object tracking: a review of current trends and future advances | |
Choe et al. | Traffic analysis with low frame rate camera networks | |
Bhola et al. | Real-time pedestrian tracking based on deep features | |
Xu et al. | Smart video surveillance system | |
Vora et al. | Bringing generalization to deep multi-view pedestrian detection | |
Sio et al. | Multiple fisheye camera tracking via real-time feature clustering | |
CN106023252A (en) | Multi-camera human body tracking method based on OAB algorithm | |
CN113361392B (en) | Unsupervised multi-mode pedestrian re-identification method based on camera and wireless positioning | |
CN115188081A (en) | Complex scene-oriented detection and tracking integrated method | |
Jiang et al. | Vehicle tracking with non-overlapping views for multi-camera surveillance system | |
Peng et al. | Continuous vehicle detection and tracking for non-overlapping multi-camera surveillance system | |
Streib et al. | Extracting Pathlets FromWeak Tracking Data | |
Xiang et al. | Action recognition for videos by long-term point trajectory analysis with background removal | |
Moustafa et al. | Gate and common pathway detection in crowd scenes using motion units and meta-tracking | |
Prezioso et al. | Integrating Object Detection and Advanced Analytics for Smart City Crowd Management | |
Li et al. | Robust Construction of Spatial-Temporal Scene Graph Considering Perception Failures for Autonomous Driving | |
Zou et al. | A moving vehicle segmentation method based on clustering of feature points for tracking at urban intersection | |
Khel et al. | Realtime Crowd Monitoring—Estimating Count, Speed and Direction of People Using Hybridized YOLOv4 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||