CN110276233A - A multi-camera collaborative tracking system based on deep learning - Google Patents
A multi-camera collaborative tracking system based on deep learning
- Publication number
- CN110276233A (application CN201810232732.1A)
- Authority
- CN
- China
- Prior art keywords
- pedestrian
- tracking
- network
- picture
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The present invention is an algorithm for tracking pedestrian trajectories in video and belongs to the field of computer vision. The problem it addresses is that pedestrian tracking is not robust in the presence of occlusion and similar disturbances. The invention proposes an algorithm for pedestrian tracking in a multi-camera environment. Its core is an end-to-end multi-camera collaborative tracking algorithm: multiple cameras acquire data from the target area; pedestrians are detected and identified in every frame of each video stream; pedestrians are tracked according to the detection results; the trajectories of the same pedestrian under different cameras are then matched and fused; and finally the position of each pedestrian in every frame is obtained, which constitutes the pedestrian's trajectory. To handle the occlusion that is common in pedestrian tracking, a generative adversarial network is applied: when occlusion occurs during tracking, the generative adversarial network generates an unoccluded picture of the next frame, and pedestrian detection and tracking are carried out on the generated picture.
Description
Technical field
The present invention addresses the conflict between the demand for mining the value of pedestrian trajectories in indoor environments and the high difficulty of pedestrian trajectory tracking, and proposes a multi-camera collaborative tracking system based on deep learning.
Background art
In the era of big data, pedestrian trajectories carry significant value. In environments such as shopping malls, effectively extracting pedestrian trajectories makes it possible to further optimize the layout of sales counters and generate considerable commercial value. Cheap and widely deployed cameras make acquiring pictures of pedestrians very simple, but extracting pedestrian trajectories from those pictures remains a difficult problem.
Most existing pedestrian tracking algorithms are traditional ones such as Kalman filtering. They work well when pedestrian density is low and motion trajectories are simple, but still struggle when pedestrians are dense and trajectories are complex. The tracking-by-detection approach handles dense pedestrians better: all pedestrians are first detected in every frame, and the detections are then matched and merged across the frames of the video. However, the detection module and the tracking module still lack real-time, accurate algorithms.
To solve these problems in the prior art, a multi-camera collaborative tracking system based on deep learning is constructed here, which performs end-to-end video acquisition, detection and tracking of pedestrians.
Summary of the invention
Purpose of the invention: in an indoor environment, the target area is filmed by multiple cameras; pedestrians are detected and identified in every frame of each video stream; pedestrians are tracked according to the detection results; the trajectories of the same pedestrian under different cameras are fused; and finally the position of each pedestrian in every frame is obtained, which constitutes the pedestrian's trajectory.
To address the problems in the prior art, the invention proposes an online multi-camera collaborative tracking algorithm that mainly comprises the following steps:
Step 1: each camera acquires video of pedestrians, and the pedestrians in every frame are detected.
Step 2: the pedestrians detected by each single camera are tracked online, giving the trajectories of multiple pedestrians in the single-camera environment.
Step 3: the pedestrian trajectories from the multiple cameras are mapped onto a common ground plane, and feature matching and fusion of the trajectories are performed.
For step 1, pedestrians in a picture are detected with a convolutional neural network. In the training stage, we manually label the bounding box of each pedestrian in the picture, recording the coordinates of its top-left corner together with its width and height: label = (x, y, w, h), where x and y are the coordinates of the top-left corner of the pedestrian's bounding box and w and h are its width and height. The picture is fed into the neural network; after several convolutional layers produce a feature map, the result enters the object discrimination network, which decides whether a region is a candidate object, and the candidate objects are then passed to a classification network, which decides whether each one is a pedestrian.
In the object discrimination network, a selected box is labelled as a positive example if its overlap with the ground-truth box exceeds a certain threshold, and as a negative example otherwise. The loss function of this network combines two terms [formula not reproduced]: a cross-entropy loss, which measures the judgment of whether a box contains an object, and a squared loss, which measures the difference between the ground-truth box and the predicted box.
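The overlap criterion described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes that "degree of overlap" means intersection over union (IoU), and the function names and the threshold of 0.5 are hypothetical, since the text only says "a certain threshold". Boxes use the (x, y, w, h) label format defined above.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the intersection rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def label_candidate(candidate, ground_truth_boxes, threshold=0.5):
    """Return 1 (positive example) if the candidate overlaps any labelled box
    by more than the threshold, else 0 (negative example).
    The default threshold is an assumed value, not taken from the patent."""
    best = max((iou(candidate, gt) for gt in ground_truth_boxes), default=0.0)
    return 1 if best > threshold else 0
```

For example, `label_candidate((0, 0, 10, 10), [(1, 1, 10, 10)])` labels the candidate as positive because the two boxes overlap heavily, while a candidate far from every labelled box is labelled negative.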
Step 2 tracks pedestrians in the single-camera environment. Traditional tracking methods such as Kalman filtering are ill-suited to complex multi-person scenes. Mainstream algorithms of recent years are mostly based on the tracking-by-detection approach, but most of them are offline algorithms that need context from both preceding and following frames and therefore cannot be used in practical projects. We propose a real-time tracking algorithm based on tracking-by-detection, with good tracking results. It uses the detection results obtained earlier, extracts and matches features across adjacent frames, and can handle abnormal situations such as detection failure. The algorithm has two parts. The first part is feature extraction: traditional feature extraction methods are used to extract features of the target such as RGB, HSV and LBP, which are combined into a feature vector. The second part is pedestrian matching, which models the state of each pedestrian; the states include the initial state, the tracked state, the lost state and the dead state. After the first few frames initialize the tracked objects, every subsequent frame computes the feature similarity between tracked objects and detected objects, matches them with a greedy algorithm based on that similarity, and finally updates the state of each tracked object according to the matching result. An object that is matched successfully stays in the tracked state. An object that fails to match switches to the lost state. If a lost object finds a correctly matching detection within the next several frames, it returns to the tracked state; otherwise it switches to the dead state. Finally, by collecting the states of all tracked objects, the trajectory information of all pedestrians in the video is obtained.
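The matching and state-update logic above might look roughly like the sketch below. Several details are assumptions made only for illustration: the text does not name the similarity measure (cosine similarity is used here), the minimum similarity `min_sim`, or the number of frames `max_lost` after which a lost object is declared dead. The initialization state is omitted, and callers are expected to pass only live tracks to the matcher.

```python
import numpy as np

TRACKED, LOST, DEAD = "tracked", "lost", "dead"


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def greedy_match(track_feats, det_feats, min_sim=0.7):
    """Pair tracks with detections greedily, most similar pair first.
    Returns a dict mapping track index to detection index."""
    pairs = [
        (cosine_similarity(t, d), ti, di)
        for ti, t in enumerate(track_feats)
        for di, d in enumerate(det_feats)
    ]
    pairs.sort(reverse=True)
    used_tracks, used_dets, matches = set(), set(), {}
    for sim, ti, di in pairs:
        if sim < min_sim:
            break  # all remaining pairs are even less similar
        if ti in used_tracks or di in used_dets:
            continue
        matches[ti] = di
        used_tracks.add(ti)
        used_dets.add(di)
    return matches


def update_states(tracks, matches, max_lost=5):
    """Update each track dict ('state', 'lost_frames') from the match result."""
    for ti, track in enumerate(tracks):
        if track["state"] == DEAD:
            continue
        if ti in matches:
            # Matched: stay in, or recover to, the tracked state.
            track["state"] = TRACKED
            track["lost_frames"] = 0
        else:
            # Unmatched: lost at first, dead after too many consecutive misses.
            track["lost_frames"] += 1
            track["state"] = DEAD if track["lost_frames"] > max_lost else LOST
    return tracks
```

Greedy matching is simple and fast but not globally optimal; it is used here only because the text names a greedy algorithm.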
To handle the occlusion that is common in pedestrian tracking, we apply a generative adversarial network. Occlusion means that a pedestrian is blocked by an external object while moving, so that the pedestrian detection module cannot detect the pedestrian and the tracking module loses robustness. With a generative adversarial network, the picture of the next frame can be generated from the preceding frames in which the pedestrian is not occluded. The generative adversarial network consists of a network G and a network D. Network G is the generator. Its input is $X = (X_1, \ldots, X_m)$, the preceding m frames; after several convolutional layers it outputs $Y_{gen}$, the generated picture of the next frame. Network D is the discriminator. Its positive input sample is $(X_1, \ldots, X_m, Y)$, that is, m+1 consecutive frames taken from the original data, so $Y = X_{m+1}$. Its negative sample is $(X_1, \ldots, X_m, Y_{gen})$, where $Y_{gen}$ is the next frame produced by the generator from the preceding m frames. The goal of network D is to judge whether the input sequence of consecutive frames is real or produced by the generator, so its loss function is defined as:
$$L_D(X, Y) = L_{cls}(D(X, Y), 1) + L_{cls}(D(X, Y_{gen}), 0)$$
where $L_{cls}$ is the cross-entropy cost function, defined as:
$$L_{cls}(\hat{p}, p) = -\sum_i \left[ p_i \log \hat{p}_i + (1 - p_i) \log(1 - \hat{p}_i) \right]$$
Here i indexes the i-th sample, $\hat{p}_i$ is the output of network D for that sample and $p_i$ is its target label (1 or 0). The parameters of network G are kept fixed while the parameters of network D are updated by stochastic gradient descent.
The goal of network G is to make the generated picture as realistic as possible, so its loss function is defined as:
$$L_G(X, Y) = L_{cls}(D(X, Y_{gen}), 1)$$
The parameters of network D are kept fixed while the parameters of network G are updated by stochastic gradient descent.
When occlusion occurs during tracking, we use the generative adversarial network to generate the picture of the next frame, and pedestrian detection and tracking are carried out on the generated picture.
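The two loss functions can be evaluated numerically as in the sketch below. It assumes that the outputs of network D are probabilities in (0, 1) and that the cross-entropy is summed rather than averaged over samples; neither detail is confirmed by the text. The networks themselves are not modelled: `d_real` and `d_fake` are hypothetical names standing for D(X, Y) and D(X, Y_gen).

```python
import numpy as np


def cross_entropy(pred, target, eps=1e-12):
    """Binary cross-entropy summed over samples; pred holds probabilities."""
    pred = np.clip(np.asarray(pred, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    return float(-np.sum(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred)))


def discriminator_loss(d_real, d_fake):
    """L_D = L_cls(D(X, Y), 1) + L_cls(D(X, Y_gen), 0)."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return cross_entropy(d_real, np.ones_like(d_real)) + cross_entropy(d_fake, np.zeros_like(d_fake))


def generator_loss(d_fake):
    """L_G = L_cls(D(X, Y_gen), 1): low when D is fooled by generated frames."""
    d_fake = np.asarray(d_fake, dtype=float)
    return cross_entropy(d_fake, np.ones_like(d_fake))
```

In training, the two losses would be minimized alternately: one update of D with G frozen, then one update of G with D frozen, matching the alternating scheme described above.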
Step 3 is the matching and fusion of pedestrian IDs across multiple cameras. We propose an algorithm for pedestrian matching and fusion under multiple cameras, which performs well on public datasets. The algorithm maps the multiple camera planes onto a common plane (the ground plane) and then matches the different pedestrians on that common plane. Let the set $C = \{C_1, \ldots, C_i, \ldots, C_N\}$ denote the N cameras. The feature of the i-th target observed in the view of a camera in C [symbol not reproduced] can be represented by the coordinates of its position. With N cameras in total, we obtain the trajectories under the N camera views, project them onto the common plane, and fuse the trajectories that belong to the same person, which gives the trajectories of M people on the common plane; the trajectories of the M people are then back-projected onto the N camera planes. By tracking with multiple cameras, the problems that occlusion causes in pedestrian tracking can be handled effectively.
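One possible way to fuse trajectories on the common plane is sketched below for two cameras. The text does not specify the matching criterion, so this sketch makes its own assumptions: trajectories are already projected onto the common plane, they are time-aligned arrays of (x, y) positions of equal length, two trajectories belong to the same person when their mean point-to-point distance is below a hypothetical `max_dist`, and matched trajectories are fused by averaging.

```python
import numpy as np


def mean_distance(track_a, track_b):
    """Mean Euclidean distance between two time-aligned (T, 2) trajectories."""
    a, b = np.asarray(track_a, dtype=float), np.asarray(track_b, dtype=float)
    return float(np.mean(np.linalg.norm(a - b, axis=1)))


def fuse_two_cameras(tracks_a, tracks_b, max_dist=0.5):
    """Match trajectories from two cameras on the common plane and fuse them.
    Returns one trajectory per person: averaged where both cameras saw the
    person, unchanged where only one camera did."""
    candidates = sorted(
        (mean_distance(ta, tb), i, j)
        for i, ta in enumerate(tracks_a)
        for j, tb in enumerate(tracks_b)
    )
    used_a, used_b, fused = set(), set(), []
    for dist, i, j in candidates:
        if dist > max_dist:
            break  # remaining pairs are farther apart
        if i in used_a or j in used_b:
            continue
        pair_mean = (np.asarray(tracks_a[i], dtype=float) + np.asarray(tracks_b[j], dtype=float)) / 2.0
        fused.append(pair_mean)
        used_a.add(i)
        used_b.add(j)
    # Keep trajectories seen by only one camera (for example, occluded in the other).
    fused += [np.asarray(t, dtype=float) for i, t in enumerate(tracks_a) if i not in used_a]
    fused += [np.asarray(t, dtype=float) for j, t in enumerate(tracks_b) if j not in used_b]
    return fused
```

With N cameras, the same pairwise step could be applied repeatedly, merging each new camera's trajectories into the set fused so far.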
To project a pedestrian trajectory from a camera plane onto the common ground plane, we need the projection matrix between the camera plane and the common ground plane; this matrix is obtained by calibration. It follows from the formula:
$$\mathbf{x}' = H\mathbf{x}$$
where $\mathbf{x} = (x, y, 1)^T$ is the homogeneous coordinate in the original plane and $\mathbf{x}' = (x', y', 1)^T$ is the homogeneous coordinate in the space after projection. The projection matrix has the form
$$H = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}$$
By manually labelling multiple points and substituting their coordinates in the camera plane and in the common ground plane into the formula above, the values of the projection matrix can be computed.
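The calibration step can be illustrated with the standard direct linear transform (DLT), which solves for the 3x3 projection matrix from four or more labelled point pairs. The patent does not say which solver is used, so DLT is an assumption here, as are the function names. The sketch treats x' = Hx as holding up to a scale factor, divides by the third homogeneous coordinate after projection, and assumes the bottom-right entry of the matrix is non-zero so that it can be normalized to 1.

```python
import numpy as np


def estimate_homography(src_pts, dst_pts):
    """Estimate H (3x3) such that dst ~ H @ src, from at least 4 point pairs."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each point pair gives two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def project(h, pts):
    """Map (x, y) points from the camera plane to the common plane with H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    # Divide by the third coordinate to return to (x', y', 1) form.
    return mapped[:, :2] / mapped[:, 2:3]
```

In practice the labelled points should not be collinear, and using more than four pairs makes the estimate less sensitive to labelling error.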
Brief description of the drawings
The accompanying drawings provide a further understanding of the technical solution of the invention and form part of the specification. Together with the embodiments, they serve to explain the technical solution and do not limit it. The drawings are described as follows:
Fig. 1 is the architecture diagram of whole system.
Specific embodiment
Embodiments of the invention are described in detail below with reference to the drawings, so that how the invention applies technical means to solve the technical problem and achieve the technical effect can be fully understood and implemented. The steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
The execution of the algorithm is described in detail below.
Step 1: pedestrian detection. The computer obtains, over a wireless connection, the data captured by the cameras in real time and sends each frame to the pedestrian detection neural network, which outputs the bounding-box coordinates of the pedestrians in the image.
Step 2: pedestrian tracking. Once the pedestrian coordinates of every frame are available, the first few frames initialize the tracked objects; each subsequent frame is sent to the matching module, which updates the tracked objects, and finally the trajectory information of multiple pedestrians is obtained.
Step 3: multi-camera matching. Each camera yields pedestrian trajectories; the trajectory results of the multiple cameras are then matched and fused to obtain the trajectory information of pedestrians on the common ground plane.
Those skilled in the art should understand that the system structure and the steps of the invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; alternatively, they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The invention is therefore not limited to any specific combination of hardware and software.
Although the embodiments are shown and described above, their content is only intended to aid understanding of the invention and does not limit it. Any person skilled in the art to which the invention pertains may make modifications and changes in the form and details of the implementation without departing from the spirit and scope disclosed herein, but the scope of patent protection of the invention remains subject to the scope defined by the appended claims.
Claims (2)
1. A multi-camera collaborative tracking system based on deep learning, characterized in that it comprises the following main steps:
Step 1: each camera acquires video of pedestrians, and the pedestrians in every frame are detected.
Step 2: the pedestrians detected by each single camera are tracked online, giving the trajectories of multiple pedestrians in the single-camera environment.
Step 3: the pedestrian trajectories from the multiple cameras are mapped onto a common ground plane, and feature matching and fusion of the trajectories are performed.
2. Step 2 of the method of claim 1, characterized in that the detection results obtained earlier are used to extract and match features across adjacent frames, and abnormal situations such as detection failure can be handled. The algorithm has two parts. The first part is feature extraction: traditional feature extraction methods are used to extract features of the target such as RGB, HSV and LBP, which are combined into a feature vector. The second part is pedestrian matching, which models the state of each pedestrian; the states include the initial state, the tracked state, the lost state and the dead state. After the first few frames initialize the tracked objects, every subsequent frame computes the feature similarity between tracked objects and detected objects, matches them with a greedy algorithm based on that similarity, and finally updates the state of each tracked object according to the matching result. An object that is matched successfully stays in the tracked state. An object that fails to match switches to the lost state. If a lost object finds a correctly matching detection within the next several frames, it returns to the tracked state; otherwise it switches to the dead state. Finally, by collecting the states of all tracked objects, the trajectory information of all pedestrians in the video is obtained.
To handle the occlusion that is common in pedestrian tracking, we apply a generative adversarial network. Occlusion means that a pedestrian is blocked by an external object while moving, so that the pedestrian detection module cannot detect the pedestrian and the tracking module loses robustness. With a generative adversarial network, the picture of the next frame can be generated from the preceding frames in which the pedestrian is not occluded. The generative adversarial network consists of a network G and a network D. Network G is the generator. Its input is $X = (X_1, \ldots, X_m)$, the preceding m frames; after several convolutional layers it outputs $Y_{gen}$, the generated picture of the next frame. Network D is the discriminator. Its positive input sample is $(X_1, \ldots, X_m, Y)$, that is, m+1 consecutive frames taken from the original data, so $Y = X_{m+1}$. Its negative sample is $(X_1, \ldots, X_m, Y_{gen})$, where $Y_{gen}$ is the next frame produced by the generator from the preceding m frames. The goal of network D is to judge whether the input sequence of consecutive frames is real or produced by the generator, so its loss function is defined as:
$$L_D(X, Y) = L_{cls}(D(X, Y), 1) + L_{cls}(D(X, Y_{gen}), 0)$$
where $L_{cls}$ is the cross-entropy cost function, defined as:
$$L_{cls}(\hat{p}, p) = -\sum_i \left[ p_i \log \hat{p}_i + (1 - p_i) \log(1 - \hat{p}_i) \right]$$
Here i indexes the i-th sample, $\hat{p}_i$ is the output of network D for that sample and $p_i$ is its target label (1 or 0). The parameters of network G are kept fixed while the parameters of network D are updated by stochastic gradient descent.
The goal of network G is to make the generated picture as realistic as possible, so its loss function is defined as:
$$L_G(X, Y) = L_{cls}(D(X, Y_{gen}), 1)$$
The parameters of network D are kept fixed while the parameters of network G are updated by stochastic gradient descent.
When occlusion occurs during tracking, we use the generative adversarial network to generate an unoccluded picture of the next frame, and pedestrian detection and tracking are carried out on the generated picture.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810232732.1A CN110276233A (en) | 2018-03-15 | 2018-03-15 | A multi-camera collaborative tracking system based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810232732.1A CN110276233A (en) | 2018-03-15 | 2018-03-15 | A multi-camera collaborative tracking system based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110276233A true CN110276233A (en) | 2019-09-24 |
Family
ID=67957973
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810232732.1A Pending CN110276233A (en) | 2018-03-15 | 2018-03-15 | A multi-camera collaborative tracking system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110276233A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734805A (en) * | 2021-01-11 | 2021-04-30 | 北京深睿博联科技有限责任公司 | Pedestrian motion trajectory prediction method and device based on deep learning |
CN113361360A (en) * | 2021-05-31 | 2021-09-07 | 山东大学 | Multi-person tracking method and system based on deep learning |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1941850A (en) * | 2005-09-29 | 2007-04-04 | 中国科学院自动化研究所 | Pedestrian tracting method based on principal axis marriage under multiple vedio cameras |
CN104093001A (en) * | 2014-07-23 | 2014-10-08 | 山东建筑大学 | Online dynamic video compression method |
KR20160132731A (en) * | 2015-05-11 | 2016-11-21 | 계명대학교 산학협력단 | Device and method for tracking pedestrian in thermal image using an online random fern learning |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734805A (en) * | 2021-01-11 | 2021-04-30 | 北京深睿博联科技有限责任公司 | Pedestrian motion trajectory prediction method and device based on deep learning |
CN112734805B (en) * | 2021-01-11 | 2022-04-15 | 北京深睿博联科技有限责任公司 | Pedestrian motion trajectory prediction method and device based on deep learning |
CN113361360A (en) * | 2021-05-31 | 2021-09-07 | 山东大学 | Multi-person tracking method and system based on deep learning |
CN113361360B (en) * | 2021-05-31 | 2023-07-25 | 山东大学 | Multi-person tracking method and system based on deep learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113674416B (en) | Three-dimensional map construction method and device, electronic equipment and storage medium | |
Tang et al. | ESTHER: Joint camera self-calibration and automatic radial distortion correction from tracking of walking humans | |
CN110555901A (en) | Method, device, equipment and storage medium for positioning and mapping dynamic and static scenes | |
CN111862296A (en) | Three-dimensional reconstruction method, three-dimensional reconstruction device, three-dimensional reconstruction system, model training method and storage medium | |
CN111402294A (en) | Target tracking method, target tracking device, computer-readable storage medium and computer equipment | |
CN111382613B (en) | Image processing method, device, equipment and medium | |
CN104601964A (en) | Non-overlap vision field trans-camera indoor pedestrian target tracking method and non-overlap vision field trans-camera indoor pedestrian target tracking system | |
CN108197604A (en) | Fast face positioning and tracing method based on embedded device | |
CN104517095B (en) | A kind of number of people dividing method based on depth image | |
CN112949508A (en) | Model training method, pedestrian detection method, electronic device and readable storage medium | |
CN112465855B (en) | Passenger flow statistical method, device, storage medium and equipment | |
CN112446882A (en) | Robust visual SLAM method based on deep learning in dynamic scene | |
CN103729620B (en) | A kind of multi-view pedestrian detection method based on multi-view Bayesian network | |
CN112184757A (en) | Method and device for determining motion trail, storage medium and electronic device | |
CN106530407A (en) | Three-dimensional panoramic splicing method, device and system for virtual reality | |
CN112132873A (en) | Multi-lens pedestrian recognition and tracking based on computer vision | |
Jiang et al. | Data fusion-based multi-object tracking for unconstrained visual sensor networks | |
CN112053391A (en) | Monitoring and early warning method and system based on dynamic three-dimensional model and storage medium | |
Zheng et al. | Steps: Joint self-supervised nighttime image enhancement and depth estimation | |
CN107948586A (en) | Trans-regional moving target detecting method and device based on video-splicing | |
CN115620090A (en) | Model training method, low-illumination target re-recognition method and device and terminal equipment | |
CN117197388A (en) | Live-action three-dimensional virtual reality scene construction method and system based on generation of antagonistic neural network and oblique photography | |
CN110276233A (en) | A multi-camera collaborative tracking system based on deep learning | |
Zhu et al. | PairCon-SLAM: Distributed, online, and real-time RGBD-SLAM in large scenarios | |
CN108765326A (en) | A kind of synchronous superposition method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20190924 |