LU102028B1 - Multiple view multiple target tracking method and system based on distributed camera network - Google Patents
- Publication number
- LU102028B1 (application LU102028A)
- Authority
- LU
- Luxembourg
- Prior art keywords
- detected target
- information
- current frame
- camera
- coordinate
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/292—Multi-camera tracking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
- G06V20/53—Recognition of crowd images, e.g. recognition of crowd congestion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
- H04N7/181—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
- H04N7/188—Capturing isolated or intermittent images triggered by the occurrence of a predetermined event, e.g. an object reaching a predetermined position
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/255—Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/94—Hardware or software architectures specially adapted for image or video understanding
- G06V10/95—Hardware or software architectures specially adapted for image or video understanding structured as a network, e.g. client-server architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
Abstract
The present disclosure discloses a multiple view multiple target tracking method and system based on a distributed camera network. The method includes: obtaining a current frame view collected by each camera in the distributed camera network; extracting a rectangular boundary box of a detected target from the current frame view; extracting visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; converting an image coordinate of the detected target in the current frame view into a ground coordinate; and output step: constructing a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and processing the data incidence matrix by using the Hungarian algorithm, and outputting a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
Description
MULTIPLE VIEW MULTIPLE TARGET TRACKING METHOD AND SYSTEM BASED ON DISTRIBUTED CAMERA NETWORK
Field of the Invention
The present disclosure relates to the technical field of multiple target tracking, in particular to a multiple view multiple target tracking method and system based on a distributed camera network.
Background of the Invention
The statements in this section merely describe background art related to the present disclosure, and do not necessarily constitute the prior art.
Multiple object tracking (MOT) technology has many applications in today's society, such as surveillance, monitoring, and crowd behavior analysis.
In the process of implementing the present disclosure, the inventor found the following technical problems in the prior art:
Multiple object tracking is still a challenging task, because it needs to solve problems such as target detection, trajectory estimation, data association and re-identification at the same time. In order to detect targets, various sensors such as radar, laser, sonar, or cameras can be used according to the needs of specific tasks, and corresponding detection algorithms are also required. Target detection is one of the difficulties in multiple target tracking. Another challenging problem of multiple target tracking is occlusion. A target can be occluded by other objects, or it can leave the current field of view (Field of View); frequent occlusion can easily cause loss of the target, which affects the tracking accuracy.
Summary of the Invention
In order to overcome the shortcomings of the prior art, the present disclosure provides a multiple view multiple target tracking method and system based on a distributed camera network.
In a first aspect, the present disclosure provides a multiple view multiple target tracking method based on a distributed camera network;
The multiple view multiple target tracking method based on the distributed camera network includes:
obtaining a current frame view collected by each camera in the distributed camera network;
extracting a rectangular boundary box of a detected target from the current frame view;
extracting visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; and converting an image coordinate of the detected target in the current frame view into a ground coordinate; and
output step: constructing a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and processing the data incidence matrix by using the Hungarian algorithm, and outputting a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
In a second aspect, the present disclosure further provides a multiple view multiple target tracking system based on a distributed camera network;
The multiple view multiple target tracking system based on the distributed camera network includes:
an obtaining module, configured to obtain a current frame view collected by each camera in the distributed camera network;
a preprocessing module, configured to extract a rectangular boundary box of a detected target from the current frame view;
an extraction module, configured to extract visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; and convert an image coordinate of the detected target in the current frame view into a ground coordinate; and
an output module, configured to construct a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and process the data incidence matrix by using the Hungarian algorithm, and output a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
In a third aspect, the present disclosure further provides an electronic device, including a memory, a processor, and computer instructions stored in the memory and running on the processor, and the computer instructions complete the steps of the method in the first aspect when executed by the processor.
In a fourth aspect, the present disclosure further provides a computer-readable storage medium for storing computer instructions, and the computer instructions complete the steps of the method in the first aspect when executed by a processor.
Compared with the prior art, the present disclosure has the following beneficial effects:
In the method, the data incidence matrix generated by combining the visual appearance information with the ground coordinate is adopted, and the matching between the detected target and the known trajectory is implemented by using the data incidence matrix, so that the accuracy of matching can be improved; and
The method benefits from both deep appearance visual features and distributed trajectory estimation. Compared with the original Deep SORT method, the method is more robust in dealing with target re-identification and occlusion problems.
Brief Description of the Drawings
The drawings constituting a part of the present application are used for providing a further understanding of the present application. The exemplary embodiments of the present application and descriptions thereof are used for explaining the present application, but do not constitute an improper limitation to the present application.
Fig. 1 is an overall structure of a distributed multiple view multiple target tracking system in the first embodiment.
Detailed Description of the Embodiments
It should be pointed out that the following detailed descriptions are all exemplary and are intended to provide further descriptions of the present application. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the technical field of the present application.
It should be noted that the terms used here are only for describing specific embodiments, and are not intended to limit the exemplary embodiments according to the present application. As used herein, unless the context clearly indicates otherwise, the singular form is also intended to include the plural form. In addition, it should also be understood that when the terms "comprising" and/or "including" are used in the present specification, they indicate the presence of features, steps, operations, devices, components and/or combinations thereof.
First embodiment
The present embodiment provides a multiple view multiple target tracking method based on a distributed camera network;
The multiple view multiple target tracking method based on the distributed camera network includes:
S1: obtaining a current frame view collected by each camera in the distributed camera network;
S2: extracting a rectangular boundary box of a detected target from the current frame view;
S3: extracting visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; and converting an image coordinate of the detected target in the current frame view into a ground coordinate; and
S4: output step: constructing a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and processing the data incidence matrix by using the Hungarian algorithm, and outputting a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
As one or more embodiments, in S4, the specific steps of the output step include:
calculating the Mahalanobis distance between the ground coordinate of the detected target and the destination coordinate of each stored trajectory in the current frame view;
calculating M cosine distances between the visual appearance information of the detected target and the visual appearance information of the previous M frames adjacent to the current frame view; and storing the minimum value of the M cosine distances as the final cosine distance;
when the Mahalanobis distance and the final cosine distance are both less than a set threshold, performing weighted summation on the Mahalanobis distance and the final cosine distance to obtain the data incidence matrix; and
inputting the data incidence matrix into the Hungarian algorithm, and outputting, by the Hungarian algorithm, the result of successful matching or failed matching between the detected target in the current frame view and the known trajectory.
As one or more embodiments, the method further includes:
S5: if image coordinate information of the successfully matched detected target and a corresponding trajectory serial number ID are stored in the current camera, then performing repeated iteration on the successfully matched information in the current camera and the successfully associated information in the adjacent cameras for exchange, and calculating an average consensus to obtain a convergent information vector and a convergent information matrix; and
calculating posterior pose information based on the convergent information vector and the convergent information matrix; at this point, multiple view multiple target tracking is achieved; and then, predicting the position information of the detected target in the next frame of view.
As one or more embodiments, the method further includes:
S6: if the ground coordinate of the detected target that fails to match and a corresponding trajectory serial number ID are stored in the current camera, then calculating the Euclidean distance between the coordinate information of the detected target that fails to match in the current camera and the destination coordinate information of each stored trajectory in the views captured by the other remaining cameras; and
if the Euclidean distance is less than a set threshold, matching the ground coordinate of the detected target in the current camera with the corresponding trajectories in the views captured by the other remaining cameras.
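S6's cross-camera recovery by Euclidean distance can be sketched as follows (illustrative; the threshold value, function name and data layout are assumptions):

```python
import numpy as np

def recover_across_cameras(unmatched_pt, other_traj_ends, dist_thresh=1.0):
    """unmatched_pt: ground coordinate (x, y) of a detection that failed to match
    in the current camera. other_traj_ends: dict {traj_id: (x, y)} of the latest
    ground coordinates of trajectories stored by the remaining cameras.
    Returns the ID of the closest trajectory within the threshold, else None."""
    best_id, best_d = None, dist_thresh
    for tid, pt in other_traj_ends.items():
        d = float(np.hypot(unmatched_pt[0] - pt[0], unmatched_pt[1] - pt[1]))
        if d < best_d:
            best_id, best_d = tid, d
    return best_id
```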
As one or more embodiments, the camera network is of a distributed type or a centralized type; the centralized type is to collect the data in a central processing unit for processing, and the distributed type is to process the data in the respective cameras, which communicate with each other.
As one or more embodiments, in S2, the extracting a rectangular boundary box of a detected target from the current frame view includes: extracting the rectangular boundary box of the detected target from the current frame view by using a YOLOv3 network.
As one or more embodiments, in S3, the specific training steps of the pre-trained convolutional neural network include:
constructing the convolutional neural network; and constructing a training set, wherein the training set is an image of known visual appearance information;
inputting the training set into the convolutional neural network to train the convolutional neural network; and
obtaining a trained convolutional neural network.
For example, the training set is a large-scale pedestrian re-identification data set containing more than 1.1 million images of 1,261 pedestrians.
As one or more embodiments, in S3, the visual appearance information specifically refers to 128-dimensional normalized features output by the convolutional neural network.
For example, it may be a contour feature.
As one or more embodiments, in S3, the specific step of converting the image coordinate of the detected target in the current frame view into the ground coordinate includes:
using a pixel coordinate of the midpoint of the bottom edge of the boundary box of a person in the image as the position information of the person, and converting the pixel coordinate into the ground coordinate through a homography matrix, wherein the homography matrix is obtained by camera calibration.
As one or more embodiments, in S5, the specific step of calculating the average consensus includes:
based on the ICF algorithm, performing information exchange through repeated iteration of adjacent cameras, v_i^(k+1) = v_i^k + ε Σ_{j∈N_i} (v_j^k − v_i^k) (and analogously for V_i^k), so as to obtain the convergent information vector and the convergent information matrix;
Wherein, ε represents a constant, v_i^k represents the information vector, and V_i^k represents the information matrix.
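The pixel-to-ground conversion described above can be sketched as follows (illustrative; the (x, y, w, h) bounding-box layout and function name are assumptions):

```python
import numpy as np

def bbox_to_ground(bbox, H):
    """bbox = (x, y, w, h) in pixels; use the midpoint of the bottom edge of the
    boundary box as the person's position and map it to the ground plane with
    the 3x3 homography H (assumed to come from camera calibration)."""
    x, y, w, h = bbox
    p = np.array([x + w / 2.0, y + h, 1.0])   # bottom-centre, homogeneous coords
    g = H @ p
    return g[:2] / g[2]                        # dehomogenise -> ground (X, Y)
```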
As one or more embodiments, in S5, the specific step of calculating the posterior pose information includes:
calculating a posterior state vector x_i^+(t) and a posterior information matrix W_i^+(t) in the current frame;
Wherein, N represents the number of cameras.
As one or more embodiments, in S5, the specific step of predicting the position information of the detected target in the next frame of view includes:
predicting a next state variable x_i(t) and a next information matrix W_i(t) of the target;
Wherein, i represents the i-th node, t represents the t-th frame, Φ represents a linear state transition matrix, and Q represents a process noise covariance.
The overall structure of the distributed multiple target tracking method proposed in
the present disclosure is shown in Fig. 1. First, target detection is performed on each camera by using YOLOv3, and the algorithm can extract the rectangular boundary box of the detected target at a high frame rate. Then, the visual appearance information of the target is obtained through a pre-trained convolutional neural network. The Hungarian algorithm is used to combine the visual appearance information and the position information of the target for data association. The position information of the target can be fused from multiple view information by using an Information Weighted Consensus Filter (ICF). The specific steps are as follows:
1. Target detection
Target detection refers to obtaining different targets in an image and determining their types and positions. The target detection method based on deep learning has strong robustness to illumination changes, occlusion problems and complex environments. There are two main research directions: a two-stage method and a one-stage method. The two-stage method first predicts a number of candidate boxes that may contain the target, and then adjusts the sizes of the boxes and classifies them to obtain the precise position, size and category of the target; Faster R-CNN is an example. In the one-stage method, the first step is omitted, and the position and the category of the target are directly predicted; YOLOv3 is an example. Compared with the two-stage method, the one-stage method is usually faster and has comparable performance. Therefore, we choose YOLOv3 as the target detector.
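Decoding detector output into rectangular boundary boxes can be sketched as follows (a generic YOLO-style (cx, cy, w, h, confidence) layout is assumed here for illustration; this is not the actual YOLOv3 tensor format):

```python
import numpy as np

def decode_boxes(preds, img_w, img_h, conf_thresh=0.5):
    """preds: (N, 5) rows of (cx, cy, w, h, confidence) with coordinates
    normalised to [0, 1]. Returns pixel-space rectangular boundary boxes
    (x1, y1, x2, y2) for rows above the confidence threshold."""
    keep = preds[:, 4] >= conf_thresh
    cx, cy, w, h = (preds[keep, i] for i in range(4))
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return np.stack([x1, y1, x2, y2], axis=1)
```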
2. Data association
A simple Hungarian algorithm is used for data association. The visual appearance information of the target is a 128-dimensional feature vector obtained by a trained convolutional neural network, and the position information of the target is obtained by converting an image coordinate onto the ground by using a calibrated homography matrix. The final correlation matrix is obtained by weighting the visual appearance information and the position information of the target:
c(i, j) = λ d^(1)(i, j) + (1 − λ) d^(2)(i, j)   (1)
Wherein, i and j respectively represent the i-th trajectory and the j-th measured value, λ represents an adjustable weight parameter, d^(1) represents the Mahalanobis distance between a measurement position and the last frame of each stored trajectory in the current frame, and d^(2) represents the minimum cosine distance between the measured appearance information and the appearance information stored in each trajectory.
It should be understood that the homography matrix refers to a homography matrix obtained by camera calibration, and the homography matrix can convert the pixel coordinate into the ground coordinate.
In addition, a threshold function is used to ignore irrelevant candidates:
b(i, j) = 1, if d^(k)(i, j) < t^(k)   (2)
Wherein, k is equal to 1 or 2 and indexes the position information or the appearance information respectively; only when both the Mahalanobis distance and the cosine distance are less than the corresponding thresholds is the association between the i-th trajectory and the j-th tracked target allowed, that is, b(i, j) is set as 1.
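The two distances and the threshold function of Eq. (2) can be sketched as follows (illustrative names; the innovation covariance S is whatever the track's filter supplies):

```python
import numpy as np

def mahalanobis_sq(z, x_pred, S):
    """Squared Mahalanobis distance between measurement z and predicted
    position x_pred, with innovation covariance S from the track's filter."""
    d = z - x_pred
    return float(d @ np.linalg.solve(S, d))

def gate(d1, d2, t1, t2):
    """Threshold function b(i, j) of Eq. (2): admit the pair only when both
    the Mahalanobis distance d1 and the cosine distance d2 pass their gates."""
    return 1 if (d1 < t1 and d2 < t2) else 0
```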
3. Trajectory processing using multiple view information
The trajectory processing step is used for ID management, including restoring an old trajectory or creating a new trajectory. Restoring the old trajectory means that after a person walks out of the field of view for 30 frames, his trajectory will be deleted, and when he comes back again, he will be given the original ID; creating the new trajectory means that a new person enters the field of view and is assigned a new ID. When a detection value that fails to match is found in the current view, its position information is first matched, by using the Euclidean distance, with the position information of the last frame of each trajectory so as to restore the old trajectory, and if a matching candidate is found, the detection value is assigned the ID of that trajectory.
If the match fails again, the algorithm will check whether there is a detection that also fails to match in the other views, for the generation of a new trajectory, and if the distance meets the threshold requirements, a new ID is initialized for them. In addition, the algorithm removes trajectories that have disappeared for more than 30 seconds in the current view to reduce interference.
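The ID management described above can be sketched as follows (illustrative; a single `max_missed` horizon stands in for the 30-frame deletion rule, and the class and attribute names are assumptions):

```python
from dataclasses import dataclass

@dataclass
class Track:
    tid: int
    last_pos: tuple
    missed: int = 0          # frames since the last successful match

class TrackManager:
    """Minimal ID bookkeeping: tracks unseen for more than `max_missed`
    frames are deleted; new detections receive fresh IDs."""
    def __init__(self, max_missed=30):
        self.tracks, self.next_id, self.max_missed = {}, 0, max_missed

    def create(self, pos):
        t = Track(self.next_id, pos)
        self.tracks[t.tid] = t
        self.next_id += 1
        return t.tid

    def step(self, matched_ids):
        """Call once per frame with the set of track IDs that matched."""
        for t in list(self.tracks.values()):
            t.missed = 0 if t.tid in matched_ids else t.missed + 1
            if t.missed > self.max_missed:
                del self.tracks[t.tid]   # trajectory deleted; its ID may be restored later
```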
4. Information weighted consensus filter for multiple view information fusion
The Information Weighted Consensus Filter (ICF) is an effective distributed state estimation method. Here, the ICF is used to perform multiple view information fusion to estimate the position of the target. The ICF mainly includes three steps: state prediction, measurement update and weighted consensus. In terms of state prediction, in S5, a linear constant velocity model is used to predict the next state variable x and the next information matrix W of the target:
W_i(t) = (Φ (W_i^+(t − 1))^(−1) Φ^T + Q)^(−1)   (3)
Wherein, i represents the i-th node, t represents the t-th frame, Φ represents a linear state transition matrix, and Q represents a process noise covariance. The predicted position information is sent to the data association module to match the measured value. During the measurement update, the current measured value z_i is used to calculate the information vector v_i and the information matrix V_i; x_i^−, W_i, H_i, R_i and N respectively represent the a priori state vector, the information matrix, the observation matrix, the measured noise covariance and the number of cameras. With respect to the weighted consensus, each camera sends and receives the information vector v_i and the information matrix V_i to and from the adjacent cameras, and k iterations are performed until the filter converges, wherein ε represents a constant.
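The weighted-consensus step can be sketched with a generic average-consensus iteration (illustrative, not the full ICF; ε must be small relative to the maximum node degree for convergence):

```python
import numpy as np

def average_consensus(values, neighbors, eps=0.2, iters=100):
    """values: list of per-camera quantities (information vectors or matrices).
    neighbors: adjacency list over camera indices. Each camera repeatedly mixes
    in its neighbours' values, v_i <- v_i + eps * sum_j (v_j - v_i), which
    converges to the network-wide average for small enough eps."""
    v = [np.array(x, dtype=float) for x in values]
    for _ in range(iters):
        v = [v[i] + eps * sum(v[j] - v[i] for j in neighbors[i])
             for i in range(len(v))]
    return v
```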
The posterior state vector x_i^+(t) and the posterior information matrix W_i^+(t) in the current frame are obtained at last:
W_i^+(t) = N V_i   (10)
Wherein, N represents the number of cameras.
Second embodiment
The present embodiment further provides a multiple view multiple target tracking system based on a distributed camera network;
The multiple view multiple target tracking system based on the distributed camera network includes:
an obtaining module, configured to obtain a current frame view collected by each camera in the distributed camera network;
a preprocessing module, configured to extract a rectangular boundary box of a detected target from the current frame view;
an extraction module, configured to extract visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; and convert an image coordinate of the detected target in the current frame view into a ground coordinate; and
an output module, configured to construct a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and process the data incidence matrix by using the Hungarian algorithm, and output a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
Third embodiment
The present embodiment further provides an electronic device, including a memory, a processor, and computer instructions stored in the memory and running on the processor, and the computer instructions complete the steps of the method in the first aspect when executed by the processor.
Fourth embodiment
The present embodiment further provides a computer-readable storage medium for storing computer instructions, and the computer instructions complete the steps of the method in the first aspect when executed by a processor.
The above descriptions are only preferred embodiments of the present application, and are not used to limit the present application. For those skilled in the art, the present application can have various modifications and changes. Any modifications, equivalent replacements, improvements and the like, made within the spirit and principle of the present application, shall all be included in the protection scope of the present application.
Claims (10)
1. A multiple view multiple target tracking method based on a distributed camera network, comprising:
obtaining a current frame view collected by each camera in the distributed camera network;
extracting a rectangular boundary box of a detected target from the current frame view;
extracting visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; and converting an image coordinate of the detected target in the current frame view into a ground coordinate; and
output step: constructing a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and processing the data incidence matrix by using the Hungarian algorithm, and outputting a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
2. The method of claim 1, wherein the specific steps of the output step comprise:
calculating the Mahalanobis distance between the ground coordinate of the detected target and the destination coordinate of each stored trajectory in the current frame view;
calculating M cosine distances between the visual appearance information of the detected target and the visual appearance information of previous M frames adjacent to the current frame view; and storing the minimum value of the M cosine distances as the final cosine distance;
when the Mahalanobis distance and the final cosine distance are both less than a set threshold, performing weighted summation on the Mahalanobis distance and the final cosine distance to obtain the data incidence matrix; and
inputting the data incidence matrix into the Hungarian algorithm, and outputting, by
the Hungarian algorithm, the result of successful matching or failed matching between the detected target in the current frame view and the known trajectory.
3. The method of claim 1, further comprising: if image coordinate information of the successfully matched detected target and a corresponding trajectory serial number ID are stored in the current camera, then performing repeated iteration on the successfully matched information in the current camera and the successfully associated information in the adjacent camera for exchange, and calculating average consensus to obtain a convergent information vector and a convergent information matrix; and calculating posterior pose information based on the converged information vector and the converged information matrix, at which point multiple view multiple target tracking is achieved; and then, predicting the position information of the detected target in the next frame of view.
4. The method of claim 3, further comprising: if the ground coordinate of the detected target that fails to match and a corresponding trajectory serial number ID are stored in the current camera, then calculating the Euclidean distance between the coordinate information of the detected target that fails to match in the current camera and the destination coordinate information of each stored trajectory in the views captured by the other remaining cameras; and if the Euclidean distance is less than a set threshold, matching the ground coordinate of the detected target in the current camera with the corresponding trajectories in the views captured by the other remaining cameras.
5. The method of claim 3, wherein the extracting a rectangular boundary box of a detected target from the current frame view comprises: extracting the rectangular boundary box of the detected target from the current frame view by using a YOLOv3 network.
6. The method of claim 3, wherein the specific training steps of the pre-trained convolutional neural network comprise: constructing the convolutional neural network; and constructing a training set, wherein the training set is an image of known visual appearance information;
inputting the training set into the convolutional neural network to train the convolutional neural network; and obtaining a trained convolutional neural network.
7. The method of claim 1, wherein the specific step of converting the image coordinate of the detected target in the current frame view into the ground coordinate comprises: using a pixel coordinate of the midpoint of a bottom margin of the boundary box of a person in the image as the position information of the person, and converting the pixel coordinate into the ground coordinate through a homography matrix, wherein the homography matrix is obtained by camera calibration.
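The coordinate conversion in claim 7 can be sketched directly: take the midpoint of the bounding box's bottom edge as the foot position and push it through the calibrated homography. The box format and the example matrices below are illustrative.

```python
import numpy as np

def pixel_to_ground(box, H):
    """Project a detection onto the ground plane via a homography.

    box: (x1, y1, x2, y2) pixel bounding box; the midpoint of its bottom
    edge is taken as the person's position, per claim 7. H: 3x3 homography
    from image to ground coordinates, obtained by camera calibration.
    """
    x1, y1, x2, y2 = box
    foot = np.array([(x1 + x2) / 2.0, y2, 1.0])  # homogeneous pixel coordinate
    g = H @ foot
    return g[:2] / g[2]                           # dehomogenize
```

Using the bottom-edge midpoint rather than the box centre matters because the homography is only valid for points that actually lie on the ground plane.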
8. A multiple view multiple target tracking system based on a distributed camera network, comprising: an obtaining module, configured to obtain a current frame view collected by each camera in the distributed camera network; a preprocessing module, configured to extract a rectangular boundary box of a detected target from the current frame view; an extraction module, configured to extract visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network, and convert an image coordinate of the detected target in the current frame view into a ground coordinate; and an output module, configured to construct a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target, process the data incidence matrix by using the Hungarian algorithm, and output a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
9. An electronic device, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the computer instructions, when executed by the processor, perform the steps of the method according to any one of claims 1-7.
10. A computer-readable storage medium for storing computer instructions, wherein the computer instructions, when executed by a processor, perform the steps of the method according to any one of claims 1-7.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911012807.6A CN110782483B (en) | 2019-10-23 | 2019-10-23 | Multi-view multi-target tracking method and system based on distributed camera network |
Publications (1)
Publication Number | Publication Date |
---|---|
LU102028B1 true LU102028B1 (en) | 2021-03-03 |
Family
ID=69386547
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
LU102028A LU102028B1 (en) | 2019-10-23 | 2020-09-03 | Multiple view multiple target tracking method and system based on distributed camera network |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110782483B (en) |
LU (1) | LU102028B1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111738075A (en) * | 2020-05-18 | 2020-10-02 | 深圳奥比中光科技有限公司 | Joint point tracking method and system based on pedestrian detection |
CN111626194B (en) * | 2020-05-26 | 2024-02-02 | 佛山市南海区广工大数控装备协同创新研究院 | Pedestrian multi-target tracking method using depth correlation measurement |
CN112215873A (en) * | 2020-08-27 | 2021-01-12 | 国网浙江省电力有限公司电力科学研究院 | Method for tracking and positioning multiple targets in transformer substation |
CN112070807B (en) * | 2020-11-11 | 2021-02-05 | 湖北亿咖通科技有限公司 | Multi-target tracking method and electronic device |
CN112633205A (en) * | 2020-12-28 | 2021-04-09 | 北京眼神智能科技有限公司 | Pedestrian tracking method and device based on head and shoulder detection, electronic equipment and storage medium |
CN113674317B (en) * | 2021-08-10 | 2024-04-26 | 深圳市捷顺科技实业股份有限公司 | Vehicle tracking method and device for high-level video |
CN114089675B (en) * | 2021-11-23 | 2023-06-09 | 长春工业大学 | Machine control method and system based on man-machine distance |
CN114299128A (en) * | 2021-12-30 | 2022-04-08 | 咪咕视讯科技有限公司 | Multi-view positioning detection method and device |
CN114596337B (en) * | 2022-03-03 | 2022-11-25 | 捻果科技(深圳)有限公司 | Self-recognition target tracking method and system based on linkage of multiple camera positions |
CN116758119B (en) * | 2023-06-27 | 2024-04-19 | 重庆比特数图科技有限公司 | Multi-target circulation detection tracking method and system based on motion compensation and linkage |
CN117853759B (en) * | 2024-03-08 | 2024-05-10 | 山东海润数聚科技有限公司 | Multi-target tracking method, system, equipment and storage medium |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11144761B2 (en) * | 2016-04-04 | 2021-10-12 | Xerox Corporation | Deep data association for online multi-class multi-object tracking |
CN107292911B (en) * | 2017-05-23 | 2021-03-30 | 南京邮电大学 | Multi-target tracking method based on multi-model fusion and data association |
CN109447121B (en) * | 2018-09-27 | 2020-11-06 | 清华大学 | Multi-target tracking method, device and system for visual sensor network |
CN109816690A (en) * | 2018-12-25 | 2019-05-28 | 北京飞搜科技有限公司 | Multi-target tracking method and system based on depth characteristic |
CN109934844A (en) * | 2019-01-28 | 2019-06-25 | 中国人民解放军战略支援部队信息工程大学 | A kind of multi-object tracking method and system merging geospatial information |
CN109919981B (en) * | 2019-03-11 | 2022-08-02 | 南京邮电大学 | Multi-feature fusion multi-target tracking method based on Kalman filtering assistance |
- 2019-10-23 CN CN201911012807.6A patent/CN110782483B/en active Active
- 2020-09-03 LU LU102028A patent/LU102028B1/en active IP Right Grant
Also Published As
Publication number | Publication date |
---|---|
CN110782483A (en) | 2020-02-11 |
CN110782483B (en) | 2022-03-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
LU102028B1 (en) | Multiple view multiple target tracking method and system based on distributed camera network | |
CN113269098A (en) | Multi-target tracking positioning and motion state estimation method based on unmanned aerial vehicle | |
CN109325456B (en) | Target identification method, target identification device, target identification equipment and storage medium | |
WO2022227761A1 (en) | Target tracking method and apparatus, electronic device, and storage medium | |
Denman et al. | Multi-spectral fusion for surveillance systems | |
CN112184757A (en) | Method and device for determining motion trail, storage medium and electronic device | |
CN106846367B (en) | A kind of Mobile object detection method of the complicated dynamic scene based on kinematic constraint optical flow method | |
CN116645396A (en) | Track determination method, track determination device, computer-readable storage medium and electronic device | |
KR101438377B1 (en) | Apparatus and method for detecting position of moving unit | |
Wang et al. | Effective multiple pedestrian tracking system in video surveillance with monocular stationary camera | |
Choe et al. | Traffic analysis with low frame rate camera networks | |
Bazzani et al. | A comparison of multi hypothesis kalman filter and particle filter for multi-target tracking | |
Saisan et al. | Multi-view classifier swarms for pedestrian detection and tracking | |
Mittal et al. | Pedestrian detection and tracking using deformable part models and Kalman filtering | |
KR100994722B1 (en) | Method for tracking moving object on multiple cameras using probabilistic camera hand-off | |
Bardas et al. | 3D tracking and classification system using a monocular camera | |
CN112446355B (en) | Pedestrian recognition method and people stream statistics system in public place | |
Vu et al. | Real-time robust human tracking based on lucas-kanade optical flow and deep detection for embedded surveillance | |
US20230076241A1 (en) | Object detection systems and methods including an object detection model using a tailored training dataset | |
CN110276233A (en) | A kind of polyphaser collaboration tracking system based on deep learning | |
CN114782496A (en) | Object tracking method and device, storage medium and electronic device | |
Zhang et al. | Video Surveillance Using a Multi-Camera Tracking and Fusion System. | |
Kogut et al. | A wide area tracking system for vision sensor networks | |
Klinger et al. | A dynamic bayes network for visual pedestrian tracking | |
CN117670939B (en) | Multi-camera multi-target tracking method and device, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FG | Patent granted |
Effective date: 20210303 |