LU102028B1 - Multiple view multiple target tracking method and system based on distributed camera network - Google Patents

Multiple view multiple target tracking method and system based on distributed camera network

Info

Publication number
LU102028B1
LU102028B1 (application LU102028A)
Authority
LU
Luxembourg
Prior art keywords
detected target
information
current frame
camera
coordinate
Prior art date
Application number
LU102028A
Other languages
French (fr)
Inventor
Guoliang Liu
Original Assignee
Univ Shandong
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Shandong
Application granted
Publication of LU102028B1

Links

Classifications

    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/292: Multi-camera tracking
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 20/53: Recognition of crowd images, e.g. recognition of crowd congestion
    • H04N 7/181: Closed-circuit television [CCTV] systems for receiving images from a plurality of remote sources
    • H04N 7/188: Capturing isolated or intermittent images triggered by the occurrence of a predetermined event, e.g. an object reaching a predetermined position
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/30241: Trajectory
    • G06V 10/255: Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G06V 10/95: Hardware or software architectures specially adapted for image or video understanding, structured as a network, e.g. client-server architectures
    • G06V 40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a multiple view multiple target tracking method and system based on a distributed camera network. The method includes: obtaining a current frame view collected by each camera in the distributed camera network; extracting a rectangular boundary box of a detected target from the current frame view; extracting visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; converting an image coordinate of the detected target in the current frame view into a ground coordinate; and an output step: constructing a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target, processing the data incidence matrix by using the Hungarian algorithm, and outputting a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.

Description

MULTIPLE VIEW MULTIPLE TARGET TRACKING METHOD AND SYSTEM BASED ON DISTRIBUTED CAMERA NETWORK
Field of the Invention
The present disclosure relates to the technical field of multiple target tracking, in particular to a multiple view multiple target tracking method and system based on a distributed camera network.

Background of the Invention
The statements in this section merely mention the background art related to the present disclosure, and do not necessarily constitute the prior art.
Multiple object tracking technology has many applications in today's society, such as surveillance, monitoring, and crowd behavior analysis.
In the process of implementing the present disclosure, the inventor found the following technical problems in the prior art:
Multiple object tracking is still a challenging task, because it needs to solve problems such as target detection, trajectory estimation, data association and re-identification at the same time.
In order to detect targets, various sensors such as radar, laser, sonar, cameras or the like can be used according to the needs of specific tasks, and corresponding detection algorithms are also required.
Target detection is one of the difficulties in multiple target tracking.
Another challenging problem of multiple target tracking is occlusion.
The target can be occluded by other objects, or it can move outside the current field of view, and frequent occlusion can easily cause loss of the target, which affects the tracking accuracy.

Summary of the Invention
In order to overcome the shortcomings of the prior art, the present disclosure provides a multiple view multiple target tracking method and system based on a distributed camera network.
In a first aspect, the present disclosure provides a multiple view multiple target tracking method based on a distributed camera network.
The multiple view multiple target tracking method based on the distributed camera network includes:
obtaining a current frame view collected by each camera in the distributed camera network;
extracting a rectangular boundary box of a detected target from the current frame view;
extracting visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; and converting an image coordinate of the detected target in the current frame view into a ground coordinate; and
output step: constructing a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and processing the data incidence matrix by using the Hungarian algorithm, and outputting a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
In a second aspect, the present disclosure further provides a multiple view multiple target tracking system based on a distributed camera network.
The multiple view multiple target tracking system based on the distributed camera network includes:
an obtaining module, configured to obtain a current frame view collected by each camera in the distributed camera network;
a preprocessing module, configured to extract a rectangular boundary box of a detected target from the current frame view;
an extraction module, configured to extract visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network, and convert an image coordinate of the detected target in the current frame view into a ground coordinate; and
an output module, configured to construct a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target, process the data incidence matrix by using the Hungarian algorithm, and output a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
In a third aspect, the present disclosure further provides an electronic device, including a memory, a processor, and computer instructions stored in the memory and running on the processor, and the computer instructions complete the steps of the method in the first aspect when executed by the processor.
In a fourth aspect, the present disclosure further provides a computer-readable storage medium for storing computer instructions, and the computer instructions complete the steps of the method in the first aspect when executed by a processor.
Compared with the prior art, the present disclosure has the following beneficial effects:
In the method, the data incidence matrix generated by combining the visual appearance information with the ground coordinate is adopted, and the matching between the detected target and the known trajectory is implemented by using the data incidence matrix, so that the accuracy of matching can be improved.
The method benefits from both deep appearance visual features and distributed trajectory estimation. Compared with the original Deep SORT method, the method is
more robust in dealing with target re-identification and occlusion problems.

Brief Description of the Drawings

The drawings constituting a part of the present application are used for providing a further understanding of the present application. The exemplary embodiments of the present application and descriptions thereof are used for explaining the present application, but do not constitute an improper limitation to the present application.
Fig. 1 shows the overall structure of the distributed multiple view multiple target tracking system in the first embodiment.

Detailed Description of the Embodiments

It should be pointed out that the following detailed descriptions are all exemplary and are intended to provide further descriptions of the present application. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the technical field of the present application.
It should be noted that the terms used here are only for describing specific embodiments, and are not intended to limit the exemplary embodiments according to the present application. As used herein, unless the context clearly indicates otherwise, the singular form is also intended to include the plural form. In addition, it should also be understood that when the terms "comprising" and/or "including" are used in the present specification, they indicate the presence of features, steps, operations, devices, components and/or combinations thereof.

First embodiment

The present embodiment provides a multiple view multiple target tracking method based on a distributed camera network.
The multiple view multiple target tracking method based on the distributed camera network includes:
S1: obtaining a current frame view collected by each camera in the distributed camera network;
S2: extracting a rectangular boundary box of a detected target from the current frame view;
S3: extracting visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; and converting an image coordinate of the detected target in the current frame view into a ground coordinate; and
S4: output step: constructing a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and processing the data incidence matrix by using the Hungarian algorithm, and outputting a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
As one or more embodiments, in S4, the specific steps of the output step include:
calculating the Mahalanobis distance between the ground coordinate of the detected target and the destination coordinate of each stored trajectory in the current frame view;
calculating M cosine distances between the visual appearance information of the detected target and the visual appearance information of previous M frames adjacent to the current frame view, and storing the minimum value of the M cosine distances as the final cosine distance;
when the Mahalanobis distance and the final cosine distance are both less than a set threshold, performing weighted summation on the Mahalanobis distance and the final cosine distance to obtain the data incidence matrix; and
inputting the data incidence matrix into the Hungarian algorithm, and outputting, by the Hungarian algorithm, the result of successful matching or failed matching between the detected target in the current frame view and the known trajectory.
As one or more embodiments, the method further includes:
S5: if image coordinate information of the successfully matched detected target and a corresponding trajectory serial number ID are stored in the current camera, then performing repeated iteration on the successfully matched information in the current camera and the successfully associated information in the adjacent camera for exchange, and calculating average consensus to obtain a convergent information vector and a convergent information matrix; and
calculating posterior pose information based on the convergent information vector and the convergent information matrix, whereby multiple view multiple target tracking
is achieved; and then predicting the position information of the detected target in the next frame of view.
As one or more embodiments, the method further includes:
S6: if the ground coordinate of the detected target that fails to match and a corresponding trajectory serial number ID are stored in the current camera, then calculating the Euclidean distance between the coordinate information of the detected target that fails to match in the current camera and the destination coordinate information of each stored trajectory in the views captured by the other remaining cameras; and
if the Euclidean distance is less than a set threshold, matching the ground coordinate of the detected target in the current camera with the corresponding trajectories in the views captured by the other remaining cameras.
As one or more embodiments, the camera network is of a distributed type rather than a centralized type; the centralized type collects the data in a central processing unit for processing, while the distributed type processes the data in the respective cameras, which communicate with each other.
As one or more embodiments, the extracting a rectangular boundary box of a detected target from the current frame view includes: extracting the rectangular boundary box of the detected target from the current frame view by using a YOLOv3 network.
As one or more embodiments, in S3, the specific training steps of the pre-trained convolutional neural network include:
constructing the convolutional neural network;
constructing a training set, wherein the training set is an image of known visual appearance information;
inputting the training set into the convolutional neural network to train the convolutional neural network; and
obtaining a trained convolutional neural network.
For example, the training set is a large-scale pedestrian re-identification data set containing more than 1.1 million images of 1261 pedestrians.
As one or more embodiments, in S3, the visual appearance information specifically refers to 128-dimensional normalized features output by the convolutional neural network, for example, a contour feature.
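As an illustration only, the following sketch shows the kind of appearance network described here: a small convolutional encoder mapping a person crop to a 128-dimensional, L2-normalised feature vector, so that cosine distance reduces to 1 minus a dot product. The layer sizes and the 128x64 input resolution are illustrative assumptions, not the architecture or weights actually trained in the patent.

```python
# A minimal sketch of a 128-d appearance-embedding network (assumed architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AppearanceNet(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),            # global average pooling
        )
        self.fc = nn.Linear(128, embed_dim)

    def forward(self, x):                       # x: (N, 3, 128, 64) person crops
        z = self.backbone(x).flatten(1)
        z = self.fc(z)
        return F.normalize(z, dim=1)            # unit-norm 128-d appearance feature

# crops = torch.rand(8, 3, 128, 64)
# features = AppearanceNet()(crops)             # shape (8, 128), rows have unit length
```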
As one or more embodiments, in S3, the specific step of converting the image coordinate of the detected target in the current frame view into the ground coordinate includes:
using a pixel coordinate of the midpoint of a bottom margin of the boundary box of a person in the image as the position information of the person, and converting the pixel coordinate into the ground coordinate through a homography matrix, wherein the homography matrix is obtained by camera calibration.
As one or more embodiments, in S5, the specific step of calculating the average consensus includes:
based on the ICF algorithm, performing information exchange through repeated iteration between adjacent cameras so as to obtain the convergent information vector and the convergent information matrix,
wherein ε represents a constant, v_i represents the information vector, and V_i represents the information matrix.
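A standard information-weighted consensus iteration consistent with the quantities named here (a sketch, with N_i denoting the set of cameras adjacent to camera i and k the iteration index) takes the form:

v_i^{k+1} = v_i^k + ε Σ_{j∈N_i} (v_j^k − v_i^k),    V_i^{k+1} = V_i^k + ε Σ_{j∈N_i} (V_j^k − V_i^k)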
As one or more embodiments, in S5, the specific step of calculating the posterior pose information includes:
calculating a posterior state vector x_i^+(t) and a posterior information matrix W_i^+(t)
in the current frame, wherein N represents the number of cameras.
As one or more embodiments, in S5, the specific step of predicting the position information of the detected target in the next frame of view includes:
predicting a next state variable x_i^-(t) and a next information matrix W_i^-(t) of the target,
wherein i represents the i-th node, t represents the t-th frame, Φ represents a linear state transition matrix, and Q represents a process noise covariance.
The overall structure of the distributed multiple target tracking method proposed in
the present disclosure is shown in Fig. 1. First, target detection is performed on each camera by using YOLOv3, and the algorithm can extract the rectangular boundary box of the detected target at a high frame rate. Then, the visual appearance information of the target is obtained through a pre-trained convolutional neural network. The Hungarian algorithm is used to combine the visual appearance information and the position information of the target for data association. The position information of the target is fused from multiple view information by using an information weighted consensus filter (ICF). The specific steps are as follows:
1. Target detection
Target detection refers to obtaining different targets in an image and determining their types and positions. Target detection methods based on deep learning have strong robustness to illumination changes, occlusion problems and complex environments. There are two main research directions: two-stage methods and one-stage methods. A two-stage method first predicts a number of candidate frames that may contain the target, and then adjusts the sizes of the frames and classifies them to obtain the precise position, size and category of the target; Faster R-CNN is an example. In a one-stage method, the first step is omitted, and the position and the category of the target are directly predicted; YOLOv3 is an example. Compared with the two-stage methods, the one-stage methods are usually faster and have comparable performance. Therefore, YOLOv3 is chosen as the target detector.
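As an illustration of this detection step, the sketch below loads a Darknet YOLOv3 model through OpenCV's DNN module and returns person bounding boxes; the file names "yolov3.cfg"/"yolov3.weights", the 416x416 input size, the thresholds and the COCO person class index 0 are assumptions, not values specified in the patent.

```python
# A minimal YOLOv3 detection sketch (illustrative, not the patent's code).
import cv2
import numpy as np

def detect_people(frame, net, conf_thresh=0.5, nms_thresh=0.4, person_class_id=0):
    """Return (x, y, w, h) pixel bounding boxes of detected persons in one frame."""
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())   # one array per YOLO scale

    boxes, scores = [], []
    for out in outputs:
        for det in out:                       # det = [cx, cy, bw, bh, objectness, class scores...]
            class_scores = det[5:]
            class_id = int(np.argmax(class_scores))
            conf = float(det[4] * class_scores[class_id])
            if class_id == person_class_id and conf >= conf_thresh:
                cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
                scores.append(conf)

    keep = cv2.dnn.NMSBoxes(boxes, scores, conf_thresh, nms_thresh)  # non-maximum suppression
    return [boxes[i] for i in np.array(keep).flatten()]

# net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
# rectangular_boxes = detect_people(frame, net)
```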
2. Data association
A simple Hungarian algorithm is used for data association. The visual appearance information of the target is a 128-dimensional feature vector obtained by a trained convolutional neural network, and the position information of the target is obtained by converting an image coordinate onto the ground by using a calibrated homography matrix. The final correlation matrix is obtained by weighting the visual appearance information and the position information of the target:

c_{i,j} = λ d^{(1)}(i, j) + (1 − λ) d^{(2)}(i, j)   (1)
wherein i and j respectively index the i-th stored trajectory and the j-th measured value, λ represents an adjustable weight parameter, d^{(1)} represents the Mahalanobis distance between a measurement position and the last frame of each stored trajectory in the current frame, and d^{(2)} represents the minimum cosine distance between the measured appearance information and the appearance information stored in each trajectory.
It should be understood that the homography matrix refers to a homography matrix obtained by camera calibration, and the homography matrix can convert the pixel coordinate into the ground coordinate.
In addition, a threshold function is used to ignore irrelevant candidates:

b_{i,j}^{(k)} = 1, if d^{(k)}(i, j) ≤ t^{(k)}   (2)

wherein k is equal to 1 or 2 and indexes the position term and the appearance term respectively; only when both the Mahalanobis distance and the cosine distance are less than the corresponding thresholds is the association between the i-th trajectory and the j-th tracked target allowed, that is, b_{i,j} is set to 1.
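The association step can be sketched as follows, assuming each detection and each stored trajectory are plain records carrying a ground-plane coordinate, a unit-norm appearance feature, and (for trajectories) the inverse covariance used in the Mahalanobis term; the field names, the weight λ and the gating thresholds are illustrative placeholders rather than values fixed by the patent.

```python
# A minimal data-association sketch: homography projection, weighted cost matrix,
# gating, and Hungarian matching. Helper field names are assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment

def pixel_to_ground(box, H):
    """Project the midpoint of a box's bottom edge onto the ground plane with homography H."""
    x, y, w, h = box
    p = np.array([x + w / 2.0, y + h, 1.0])     # midpoint of the bottom margin, homogeneous
    g = H @ p
    return g[:2] / g[2]

def associate(tracks, detections, lam=0.5, t_pos=5.991, t_app=0.3):
    """Build the weighted cost matrix of eq. (1), gate it as in eq. (2), solve it with the
    Hungarian algorithm, and return matched pairs plus indices of unmatched detections.
    t_pos defaults to the 95% chi-square quantile for a 2-D position (placeholder)."""
    n_t, n_d = len(tracks), len(detections)
    INF = 1e6                                   # sentinel cost for gated-out pairs
    cost = np.full((n_t, n_d), INF)
    for i, trk in enumerate(tracks):
        for j, det in enumerate(detections):
            diff = det["ground_xy"] - trk["ground_xy"]
            d1 = float(diff @ trk["inv_cov"] @ diff)             # squared Mahalanobis distance
            d2 = min(1.0 - float(det["feature"] @ f)             # minimum cosine distance over
                     for f in trk["features"])                   # the last M stored features
            if d1 < t_pos and d2 < t_app:                        # gating, eq. (2)
                cost[i, j] = lam * d1 + (1.0 - lam) * d2         # weighted sum, eq. (1)
    rows, cols = linear_sum_assignment(cost)                     # Hungarian algorithm
    matches = [(i, j) for i, j in zip(rows, cols) if cost[i, j] < INF]
    matched = {j for _, j in matches}
    unmatched_detections = [j for j in range(n_d) if j not in matched]
    return matches, unmatched_detections
```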
3. Trajectory processing using multiple view information
The trajectory processing step is used for ID management, including restoring an old trajectory or creating a new trajectory. Restoring the old trajectory means that after a person walks out of the field of view for 30 frames, his trajectory will be deleted, and when he comes back again, he will be given the original ID; creating the new trajectory means that a new person enters the field of view and is assigned a new ID. When a detection value that fails to match is found in the current view, its position information is first matched, using the Euclidean distance, with the position information of the last frame of each stored trajectory to restore the old trajectory, and if a matching candidate is found, the detection value is assigned the ID of that trajectory.
If the match fails again, the algorithm checks whether there is a detection that also fails to match in other views, for the generation of a new trajectory; if the distance meets the threshold requirement,
a new ID is initialized for them. In addition, the algorithm removes the trajectories that have disappeared for more than 30 seconds in the current view to reduce interference.
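A minimal sketch of this ID management, assuming dictionary-based track records that keep an "id", the last "ground_xy" coordinate and an "age" counter (frames since last seen), is shown below; MAX_AGE and the ground-plane gate DIST_GATE are illustrative values, not figures taken from the patent.

```python
# Restore old trajectory IDs by Euclidean distance, otherwise create new IDs, and
# prune stale trajectories (illustrative sketch).
import numpy as np

MAX_AGE = 30          # frames a lost trajectory is kept before it is deleted
DIST_GATE = 1.0       # maximum ground-plane distance for restoring an old ID (placeholder)

def recover_or_create(unmatched_detections, lost_tracks, next_id):
    """Give each unmatched detection the ID of the nearest lost trajectory if one is
    close enough (restore the old trajectory); otherwise start a new trajectory."""
    assignments = []
    for det in unmatched_detections:
        best, best_d = None, DIST_GATE
        for trk in lost_tracks:
            d = float(np.linalg.norm(det["ground_xy"] - trk["ground_xy"]))  # Euclidean distance
            if d < best_d:
                best, best_d = trk, d
        if best is not None:
            assignments.append((det, best["id"]))   # restore the old ID
        else:
            assignments.append((det, next_id))      # create a new trajectory
            next_id += 1
    return assignments, next_id

def prune(tracks):
    """Remove trajectories that have not been observed for more than MAX_AGE frames."""
    return [t for t in tracks if t["age"] <= MAX_AGE]
```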
4. Information weighted consensus filter for multiple view information fusion
The information weighted consensus filter (ICF) is an effective distributed state estimation method. Here, the ICF is used to perform multiple view information fusion to estimate the position of the target. The ICF mainly includes three steps: state prediction, measurement update and weighted consensus. In terms of state prediction, in S6, a linear constant velocity model is used to predict the next state variable x and the next information matrix W of the target:

W_i^-(t) = (Φ (W_i^+(t − 1))^{-1} Φ^T + Q)^{-1}   (3)

wherein i represents the i-th node, t represents the t-th frame, Φ represents a linear state transition matrix, and Q represents a process noise covariance. The predicted position information is sent to the data association module to match the measured value.
During the measurement update, the current measured value z_i is used to calculate the information vector v_i and the information matrix V_i, where x_i^-, W_i^-, H_i, R_i and N respectively represent the a priori state vector, the information matrix, an observation matrix, a measured noise covariance and the number of cameras.
With respect to the weighted consensus, each camera sends and receives the information vector v_i and the information matrix V_i to and from the adjacent cameras, and k iterations are performed until the filter converges, wherein ε represents a constant.
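In the standard ICF formulation, which matches the quantities named here, the measurement update at camera i takes the form (a sketch, not a verbatim reproduction of the patent's equations):

v_i = (1/N) W_i^- x_i^- + H_i^T R_i^{-1} z_i,    V_i = (1/N) W_i^- + H_i^T R_i^{-1} H_i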
The posterior state vector x_i^+(t) and the posterior information matrix W_i^+(t) in the current frame are obtained at last:

W_i^+(t) = N V_i(t)   (10)
wherein N represents the number of cameras.
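A compact numerical sketch of the three ICF steps for one target, run jointly over all N camera nodes, is given below; the measurement-update form follows the standard ICF literature, and the adjacency matrix, the model matrices Phi, Q, H, R, the rate eps and the iteration count are assumptions supplied by the caller, not values fixed by the patent.

```python
# Illustrative ICF cycle: measurement update, weighted consensus, posterior, prediction.
import numpy as np

def icf_update(x_prior, W_prior, z, H, R, A, eps=0.01, iters=50):
    """x_prior/W_prior: per-node prior state vectors and information matrices;
    z: per-node measurements; A: NxN adjacency matrix of the camera network."""
    N = len(x_prior)
    R_inv = np.linalg.inv(R)
    # measurement update: per-node information vector v_i and information matrix V_i
    v = [W_prior[i] @ x_prior[i] / N + H.T @ R_inv @ z[i] for i in range(N)]
    V = [W_prior[i] / N + H.T @ R_inv @ H for i in range(N)]
    # weighted consensus: repeatedly exchange (v_i, V_i) with neighbouring cameras
    for _ in range(iters):
        v = [v[i] + eps * sum(A[i, j] * (v[j] - v[i]) for j in range(N)) for i in range(N)]
        V = [V[i] + eps * sum(A[i, j] * (V[j] - V[i]) for j in range(N)) for i in range(N)]
    # posterior estimate in the current frame: W_i^+ = N * V_i, x_i^+ = V_i^{-1} v_i
    x_post = [np.linalg.solve(V[i], v[i]) for i in range(N)]
    W_post = [N * V[i] for i in range(N)]
    return x_post, W_post

def predict(x_post, W_post, Phi, Q):
    """Constant-velocity prediction of eq. (3): propagate each node's estimate to the next frame."""
    x_pred = [Phi @ x for x in x_post]
    W_pred = [np.linalg.inv(Phi @ np.linalg.inv(W) @ Phi.T + Q) for W in W_post]
    return x_pred, W_pred
```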
Second embodiment

The present embodiment further provides a multiple view multiple target tracking system based on a distributed camera network.
The multiple view multiple target tracking system based on the distributed camera network includes:
an obtaining module, configured to obtain a current frame view collected by each camera in the distributed camera network;
a preprocessing module, configured to extract a rectangular boundary box of a detected target from the current frame view;
an extraction module, configured to extract visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network, and convert an image coordinate of the detected target in the current frame view into a ground coordinate; and
an output module, configured to construct a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target, process the data incidence matrix by using the Hungarian algorithm, and output a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.

Third embodiment
The present embodiment further provides an electronic device, including a memory, a processor, and computer instructions stored in the memory and running on the processor, and the computer instructions complete the steps of the method in the first aspect when executed by the processor.

Fourth embodiment
The present embodiment further provides a computer-readable storage medium for storing computer instructions, and the computer instructions complete the steps of the method in the first aspect when executed by a processor.
The above descriptions are only preferred embodiments of the present application, and are not used to limit the present application. For those skilled in the art, the present application can have various modifications and changes. Any modifications, equivalent replacements, improvements and the like, made within the spirit and
principle of the present application, shall all be included in the protection scope of the present application.

Claims (10)

1. A multiple view multiple target tracking method based on a distributed camera network, comprising:
obtaining a current frame view collected by each camera in the distributed camera network;
extracting a rectangular boundary box of a detected target from the current frame view;
extracting visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; and converting an image coordinate of the detected target in the current frame view into a ground coordinate; and
output step: constructing a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and processing the data incidence matrix by using the Hungarian algorithm, and outputting a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
2. The method of claim 1, wherein the specific steps of the output step comprise:
calculating the Mahalanobis distance between the ground coordinate of the detected target and the destination coordinate of each stored trajectory in the current frame view;
calculating M cosine distances between the visual appearance information of the detected target and the visual appearance information of previous M frames adjacent to the current frame view; and storing the minimum value of the M cosine distances as the final cosine distance;
when the Mahalanobis distance and the final cosine distance are both less than a set threshold, performing weighted summation on the Mahalanobis distance and the final cosine distance to obtain the data incidence matrix; and
inputting the data incidence matrix into the Hungarian algorithm, and outputting, by
the Hungarian algorithm, the result of successful matching or failed matching between the detected target in the current frame view and the known trajectory.
3. The method of claim 1, further comprising:
if image coordinate information of the successfully matched detected target and a corresponding trajectory serial number ID are stored in the current camera, then performing repeated iteration on the successfully matched information in the current camera and the successfully associated information in the adjacent camera for exchange, and calculating average consensus to obtain a convergent information vector and a convergent information matrix; and
calculating posterior pose information based on the convergent information vector and the convergent information matrix, whereby multiple view multiple target tracking is achieved; and then predicting the position information of the detected target in the next frame of view.
4. The method of claim 3, further comprising: if the ground coordinate of the detected target that fails to match and a corresponding trajectory serial number ID are stored in the current camera, then calculating the Euclidean distance between the coordinate information of the detected target that fails to match in the current camera and the destination coordinate information of each stored trajectory in the views captured by the other remaining cameras; and if the Euclidean distance is less than a set threshold, matching the ground coordinate of the detected target in the current camera with the corresponding trajectories in the views captured by the other remaining cameras.
5. The method of claim 3, wherein the extracting a rectangular boundary box of a detected target from the current frame view comprises: extracting the rectangular boundary box of the detected target from the current frame view by using a YOLOv3 network.
6. The method of claim 3, wherein the specific training steps of the pre-trained convolutional neural network comprise: constructing the convolutional neural network; and constructing a training set, wherein the training set is an image of known visual appearance information;
inputting the training set into the convolutional neural network to train the convolutional neural network; and obtaining a trained convolutional neural network.
7. The method of claim 1, wherein the specific step of converting the image coordinate of the detected target in the current frame view into the ground coordinate comprises: using a pixel coordinate of the midpoint of a bottom margin of the boundary box of a person in the image as the position information of the person, and converting the pixel coordinate into the ground coordinate through a homography matrix, wherein the homography matrix is obtained by camera calibration.
8. A multiple view multiple target tracking system based on a distributed camera network, comprising: an obtaining module, configured to obtain a current frame view collected by each camera in the distributed camera network; a preprocessing module, configured to extract a rectangular boundary box of a detected target from the current frame view; an extraction module, configured to extract visual appearance information of the detected target from an image in the rectangular boundary box by using a pre-trained convolutional neural network; and convert an image coordinate of the detected target in the current frame view into a ground coordinate; and an output module, configured to construct a data incidence matrix based on the visual appearance information and the ground coordinate of the detected target; and process the data incidence matrix by using the Hungarian algorithm, and output a result of successful matching or failed matching between the detected target in the current frame view and a known trajectory.
9. An electronic device, comprising a memory, a processor, and computer instructions stored in the memory and running on the processor, wherein the computer instructions complete the steps of the method according to any one of claims 1-7 when executed by the processor.
10. A computer-readable storage medium for storing computer instructions, wherein
the computer instructions complete the steps of the method according to any one of claims 1-7 when executed by a processor.
LU102028A 2019-10-23 2020-09-03 Multiple view multiple target tracking method and system based on distributed camera network LU102028B1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911012807.6A CN110782483B (en) 2019-10-23 2019-10-23 Multi-view multi-target tracking method and system based on distributed camera network

Publications (1)

Publication Number Publication Date
LU102028B1 true LU102028B1 (en) 2021-03-03

Family

ID=69386547

Family Applications (1)

Application Number Title Priority Date Filing Date
LU102028A LU102028B1 (en) 2019-10-23 2020-09-03 Multiple view multiple target tracking method and system based on distributed camera network

Country Status (2)

Country Link
CN (1) CN110782483B (en)
LU (1) LU102028B1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738075A (en) * 2020-05-18 2020-10-02 深圳奥比中光科技有限公司 Joint point tracking method and system based on pedestrian detection
CN111626194B (en) * 2020-05-26 2024-02-02 佛山市南海区广工大数控装备协同创新研究院 Pedestrian multi-target tracking method using depth correlation measurement
CN112215873A (en) * 2020-08-27 2021-01-12 国网浙江省电力有限公司电力科学研究院 Method for tracking and positioning multiple targets in transformer substation
CN112070807B (en) * 2020-11-11 2021-02-05 湖北亿咖通科技有限公司 Multi-target tracking method and electronic device
CN112633205A (en) * 2020-12-28 2021-04-09 北京眼神智能科技有限公司 Pedestrian tracking method and device based on head and shoulder detection, electronic equipment and storage medium
CN113674317B (en) * 2021-08-10 2024-04-26 深圳市捷顺科技实业股份有限公司 Vehicle tracking method and device for high-level video
CN114089675B (en) * 2021-11-23 2023-06-09 长春工业大学 Machine control method and system based on man-machine distance
CN114299128A (en) * 2021-12-30 2022-04-08 咪咕视讯科技有限公司 Multi-view positioning detection method and device
CN114596337B (en) * 2022-03-03 2022-11-25 捻果科技(深圳)有限公司 Self-recognition target tracking method and system based on linkage of multiple camera positions
CN116758119B (en) * 2023-06-27 2024-04-19 重庆比特数图科技有限公司 Multi-target circulation detection tracking method and system based on motion compensation and linkage
CN117853759B (en) * 2024-03-08 2024-05-10 山东海润数聚科技有限公司 Multi-target tracking method, system, equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11144761B2 (en) * 2016-04-04 2021-10-12 Xerox Corporation Deep data association for online multi-class multi-object tracking
CN107292911B (en) * 2017-05-23 2021-03-30 南京邮电大学 Multi-target tracking method based on multi-model fusion and data association
CN109447121B (en) * 2018-09-27 2020-11-06 清华大学 Multi-target tracking method, device and system for visual sensor network
CN109816690A (en) * 2018-12-25 2019-05-28 北京飞搜科技有限公司 Multi-target tracking method and system based on depth characteristic
CN109934844A (en) * 2019-01-28 2019-06-25 中国人民解放军战略支援部队信息工程大学 A kind of multi-object tracking method and system merging geospatial information
CN109919981B (en) * 2019-03-11 2022-08-02 南京邮电大学 Multi-feature fusion multi-target tracking method based on Kalman filtering assistance

Also Published As

Publication number Publication date
CN110782483A (en) 2020-02-11
CN110782483B (en) 2022-03-15

Similar Documents

Publication Publication Date Title
LU102028B1 (en) Multiple view multiple target tracking method and system based on distributed camera network
CN113269098A (en) Multi-target tracking positioning and motion state estimation method based on unmanned aerial vehicle
CN109325456B (en) Target identification method, target identification device, target identification equipment and storage medium
WO2022227761A1 (en) Target tracking method and apparatus, electronic device, and storage medium
Denman et al. Multi-spectral fusion for surveillance systems
CN112184757A (en) Method and device for determining motion trail, storage medium and electronic device
CN106846367B (en) A kind of Mobile object detection method of the complicated dynamic scene based on kinematic constraint optical flow method
CN116645396A (en) Track determination method, track determination device, computer-readable storage medium and electronic device
KR101438377B1 (en) Apparatus and method for detecting position of moving unit
Wang et al. Effective multiple pedestrian tracking system in video surveillance with monocular stationary camera
Choe et al. Traffic analysis with low frame rate camera networks
Bazzani et al. A comparison of multi hypothesis kalman filter and particle filter for multi-target tracking
Saisan et al. Multi-view classifier swarms for pedestrian detection and tracking
Mittal et al. Pedestrian detection and tracking using deformable part models and Kalman filtering
KR100994722B1 (en) Method for tracking moving object on multiple cameras using probabilistic camera hand-off
Bardas et al. 3D tracking and classification system using a monocular camera
CN112446355B (en) Pedestrian recognition method and people stream statistics system in public place
Vu et al. Real-time robust human tracking based on lucas-kanade optical flow and deep detection for embedded surveillance
US20230076241A1 (en) Object detection systems and methods including an object detection model using a tailored training dataset
CN110276233A (en) A kind of polyphaser collaboration tracking system based on deep learning
CN114782496A (en) Object tracking method and device, storage medium and electronic device
Zhang et al. Video Surveillance Using a Multi-Camera Tracking and Fusion System.
Kogut et al. A wide area tracking system for vision sensor networks
Klinger et al. A dynamic bayes network for visual pedestrian tracking
CN117670939B (en) Multi-camera multi-target tracking method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
FG Patent granted

Effective date: 20210303