CN115330838A - Cross-scale target tracking method and device - Google Patents
- Publication number: CN115330838A (application number CN202210999023.2A)
- Authority: CN (China)
- Prior art keywords: perception, target, images, global, image
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/251: Analysis of motion using feature-based methods, involving models
- G06T3/4038: Image mosaicing, e.g. composing plane images from plane sub-images
- G06T5/50: Image enhancement or restoration using two or more images
- G06T7/277: Analysis of motion involving stochastic approaches, e.g. using Kalman filters
- G06T7/75: Determining position or orientation of objects or cameras using feature-based methods involving models
- G06T7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters (camera calibration)
- G06V10/225: Image preprocessing by selection of a specific region based on a marking or identifier
- G06V10/74: Image or video pattern matching; proximity measures in feature spaces
- G06V10/761: Proximity, similarity or dissimilarity measures
- G06V10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06T2200/32: Indexing scheme involving image mosaicing
- G06T2207/20081: Training; learning
- G06T2207/20221: Image fusion; image merging
Abstract
The invention discloses a cross-scale target tracking method and device. The method comprises the following steps: building a training data acquisition device and training to obtain a cross-scale target tracking model; building a cross-scale panoramic sensing system, collecting images, and calculating the transformation matrices of adjacent global perception images; splicing and fusing a plurality of global perception images with these transformation matrices to obtain a 360-degree panoramic spliced image I_G; calculating the coordinate positioning result of the target in the local perception images; obtaining the tracking track of the target in the corresponding global perception image from its tracking track in the local perception image; and transforming these tracks to obtain the specific trajectory of the target in the 360-degree panoramic spliced image I_G. The panoramic sensing system used by the invention solves the problems of low target perception resolution and poor edge imaging quality, and provides data support for training a high-precision target tracking method, so that the target tracking method has high precision and good applicability.
Description
Technical Field
The application relates to the technical field of deep learning, in particular to a cross-scale target tracking method and device.
Background
Existing large-scene panoramic imaging generally either stitches the outputs of multiple cameras or uses several fisheye lenses. Both approaches suffer from low perception resolution for small-scale targets and poor edge imaging quality, so cross-scale target tracking under motion blur, similar-background interference, occlusion, or illumination change has low precision and poor applicability; in addition, such deployments are expensive and costly to install.
Therefore, the inventor recognizes that, to meet the requirement of large-scene panoramic target tracking, a large-scene panoramic imaging method that is simple in structure, convenient to deploy, and capable of cross-scale high-precision target tracking is urgently needed.
Disclosure of Invention
Based on the above, a cross-scale target tracking method and device are provided to solve the above technical problems.
In a first aspect, a method for tracking a cross-scale target includes:
step one, building a training data acquisition device, calibrating the positions of the perception imaging cameras of the built training data acquisition device, acquiring data with the device, and then training to obtain a cross-scale target tracking model;
step two, building a cross-scale panoramic sensing system, collecting images with it to obtain a plurality of global perception images and a plurality of local perception images, and calculating the transformation matrices of adjacent global perception images; splicing and fusing the plurality of global perception images with these transformation matrices to obtain a 360-degree panoramic spliced image I_G;
Step three, processing the plurality of local perception images acquired in step two with the cross-scale target tracking model trained in step one to obtain the coordinate positioning result of the target in each local perception image; calculating the tracking track of the target in the local perception image based on a Kalman filtering algorithm, and obtaining the tracking track of the target in the corresponding global perception image from the tracking track in the local perception image; and transforming the target tracks in the plurality of global perception images with the transformation matrices of adjacent global perception images to obtain the specific trajectory of the target in the 360-degree panoramic spliced image I_G.
Optionally, the cross-scale panoramic sensing system comprises twelve groups of fixing devices arranged in a regular dodecagon shape, each group of fixing devices is fixed with one camera array fixing support, and each camera array fixing support is fixed with one group of camera arrays; the twelve groups of camera arrays respectively and correspondingly cover twelve peripheral areas in the circumferential direction of the regular dodecagon;
each group of camera arrays comprises three perception imaging cameras: two local perception imaging cameras and one global perception imaging camera, the global perception imaging camera being fixed between the two local perception imaging cameras;
the imaging areas of the global perception imaging cameras of two adjacent camera arrays overlap; for each camera array, the field angle of the global perception imaging camera is larger than the sum of the field angles of the two local perception imaging cameras, its vertical field angle is more than 2 times the vertical field angle of a local perception imaging camera, and the field-of-view optical axes of the local perception imaging cameras can be adjusted to any angle.
Further optionally, the training data collecting device includes one camera array fixing bracket and a group of camera arrays fixed on the camera array fixing bracket.
Further optionally, in the cross-scale panoramic sensing system, the field-of-view optical axes of all perception imaging cameras are coplanar with the regular dodecagon, and the reverse extension lines of the field-of-view optical axes of all global perception imaging cameras pass through the center of the regular dodecagon; the horizontal field angle of every global perception imaging camera is at least 45 degrees, and the imaging resolution of every global and local perception imaging camera is at least 9 million pixels.
Further optionally, in step one, calibrating the perception imaging camera positions of the built training data acquisition device, acquiring data with the device, and then training to obtain the cross-scale target tracking model specifically comprises:
position calibration is carried out on two local perception imaging cameras and a global perception imaging camera in a training data acquisition device, and a mapping matrix of the two local perception imaging cameras relative to the global perception imaging camera is obtained by utilizing a characteristic point matching methodAnd
acquiring image data at specific sites with the built training data acquisition device to obtain a local perception image data set and a global perception image data set, and annotating positioning frames for the targets captured in both data sets; dividing the labeled local perception image data set into a training set, a test set, and a verification set according to a preset proportion;
training on the training set with an existing target tracking algorithm and its corresponding loss function, iteratively training the model according to a preset training strategy until the loss function converges, to obtain the cross-scale target tracking model.
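The dataset division described above can be sketched as follows; the 8:1:1 train/test/verification ratio, the fixed seed, and the function name are illustrative assumptions rather than values fixed by the patent:

```python
import random

def split_dataset(samples, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle labeled local-perception frames and split them into
    train / test / verification subsets by the given ratios."""
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * ratios[0])
    n_test = int(n * ratios[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])

train, test, val = split_dataset(range(1000))
```

In practice `samples` would be the list of annotated frame paths; integers stand in here so the split proportions are easy to verify.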
Further optionally, the feature point matching method is the SURF feature point matching algorithm; the specific sites include pedestrian streets and intersections.
Further optionally, the preset proportion is 8:1:1; the target tracking algorithm is the ByteTrack algorithm.
Further optionally, the second step specifically includes:
constructing the cross-scale panoramic sensing system, synchronously acquiring image data from the 12 camera arrays by a square-wave synchronous triggering method, and denoting the global perception image captured by the i-th camera array at time t as I_G^{t,i} and the corresponding local perception images as I_L1^{t,i} and I_L2^{t,i}, where i is the index of the camera array, i ∈ {1, 2, …, 12};
for the global perception image sequence {I_G^{t,1}, …, I_G^{t,12}} at time t, extracting and matching features between adjacent images I_G^{t,i} and I_G^{t,j} in the sequence with an image-feature-based matching algorithm to obtain the transformation matrix H_{i,j}^t of adjacent global perception images, i ∈ {1, 2, …, 11}, j = i + 1;
splicing and fusing the 12 global perception images with the transformation matrices of adjacent global perception images to obtain a 360-degree panoramic spliced image I_G.
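The transformation matrix between adjacent global perception images is a homography estimated from matched feature points. A minimal NumPy sketch of the direct linear transform (DLT) step is shown below; a production stitcher would wrap this in RANSAC to reject bad feature matches, which is omitted here:

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the 3x3 homography H mapping src points to dst points.
    src, dst: (N, 2) arrays of matched feature points, N >= 4."""
    rows = []
    for (x, y), (u, v) in zip(np.asarray(src, float), np.asarray(dst, float)):
        # each correspondence contributes two linear constraints on H
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # the homography is the right null vector of the stacked system
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]   # normalise so the bottom-right entry is 1
```

With noise-free correspondences the true matrix is recovered exactly (up to scale); with real SURF matches, least-squares over many inliers smooths the noise.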
Further optionally, the step three specifically includes:
respectively processing the 24 local perception images captured by the 12 camera arrays at time t with the cross-scale target tracking model trained in step one to obtain the coordinate positioning result of the target in each local perception image;
predicting, based on a Kalman filtering algorithm, the position at time t of each target track stored at time t-1, computing a similarity matrix based on an IoU strategy or appearance characterization vectors, and then performing matching with the Hungarian algorithm;
for matched tracks, combining the matching result with the target positioning result at time t, updating the track at time t with the Kalman filtering algorithm, and taking the updated track as the target tracking track at time t; placing unmatched tracks into the unmatched track set for matching at the next time step; and adding unmatched positioning results to the current tracking track set as new tracks, updating them according to the Kalman filtering algorithm;
mapping the tracking track of the target in each local perception image back to the corresponding global perception image according to the mapping matrices M_1 and M_2 obtained during calibration, to obtain the tracking track of the target in the global perception image;
transforming the acquired target tracks of the 12 global perception images with the transformation matrices of adjacent global perception images to obtain the specific trajectory of the target in the 360-degree panoramic spliced image I_G, realizing cross-scale target tracking in a hundred-megapixel-level panoramic image.
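The association step above (predict, IoU similarity, Hungarian matching) can be sketched with SciPy's `linear_sum_assignment`; the 0.3 IoU gate and the (x1, y1, x2, y2) box format are illustrative assumptions, not values from the patent:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def match_tracks(predicted_boxes, detected_boxes, iou_gate=0.3):
    """Hungarian matching on an IoU-based cost matrix. Returns matched
    (track, detection) index pairs plus unmatched track / detection indices."""
    cost = np.array([[1.0 - iou(p, d) for d in detected_boxes]
                     for p in predicted_boxes])
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols)
               if 1.0 - cost[r, c] >= iou_gate]          # gate weak overlaps
    un_tracks = sorted(set(range(len(predicted_boxes))) - {r for r, _ in matches})
    un_dets = sorted(set(range(len(detected_boxes))) - {c for _, c in matches})
    return matches, un_tracks, un_dets
```

Gated-out pairs fall into the unmatched sets, which mirrors the claim: unmatched tracks wait for the next time step and unmatched detections start new tracks.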
In a second aspect, a cross-scale target tracking device includes:
the cross-scale target tracking model training module is used for building a training data acquisition device, calibrating the positions of its perception imaging cameras, acquiring data with it, and then training to obtain a cross-scale target tracking model;
the panoramic image fusion module is used for building a cross-scale panoramic sensing system, acquiring images with it to obtain a plurality of global perception images and a plurality of local perception images, calculating the transformation matrices of adjacent global perception images, and splicing and fusing the plurality of global perception images with these transformation matrices to obtain a 360-degree panoramic spliced image I_G;
the cross-scale target tracking module is used for processing the plurality of local perception images collected by the panoramic image fusion module with the cross-scale target tracking model trained by the cross-scale target tracking model training module to obtain the coordinate positioning result of the target in each local perception image; calculating the tracking track of the target in the local perception image based on a Kalman filtering algorithm and obtaining from it the tracking track of the target in the corresponding global perception image; and transforming the target tracks in the plurality of global perception images with the transformation matrices of adjacent global perception images to obtain the specific trajectory of the target in the 360-degree panoramic spliced image I_G.
The invention has at least the following beneficial effects:
the embodiment of the invention provides a large-scene-oriented cross-scale target tracking method that uses a polygonal heterogeneous panoramic sensing system to perform multi-scale high-resolution panoramic imaging of a target, tracks the target at large scale in the local perception images, and finally maps the result to the global perception images to realize cross-scale high-precision target trajectory tracking. The panoramic sensing system used by the invention is simple in structure and convenient to deploy, solves the problems of low target perception resolution and poor edge imaging quality, satisfies the data acquisition and training conditions of large-scene panoramic high-precision target tracking, and provides data support for training a high-precision target tracking method, thereby realizing ultra-high-resolution panoramic target tracking; the target tracking method of this embodiment therefore has high precision and good applicability.
Drawings
Fig. 1 is a schematic flowchart of a cross-scale target tracking method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a cross-scale panoramic sensing system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a training data acquisition device according to an embodiment of the present invention;
fig. 4 is a block diagram of a module architecture of a cross-scale target tracking apparatus according to an embodiment of the present invention.
Description of reference numerals:
1. a fixing device;
2. a camera array fixing bracket;
3. an array of cameras; 31. a first partially-aware imaging camera; 32. a second local perception imaging camera; 33. a global perception imaging camera.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, a cross-scale target Tracking method is provided, the main process includes large-scene cross-scale target Tracking model training, ultra-high resolution panoramic image fusion calculation, and ultra-high resolution panoramic cross-scale target Tracking result fusion, and the method adopts a Tracking-by-Detection strategy to realize target Tracking, including the following steps:
s1, building a training data acquisition device, calibrating the position of a sensing imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a cross-scale target tracking model;
s2, building a cross-scale panoramic sensing system and collecting images with it to obtain a plurality of global perception images and a plurality of local perception images, and calculating the transformation matrices of adjacent global perception images; splicing and fusing the plurality of global perception images with these transformation matrices to obtain a 360-degree panoramic spliced image I_G;
S3, processing the plurality of local perception images acquired in S2 with the cross-scale target tracking model trained in S1 to obtain the coordinate positioning result of the target in each local perception image; calculating the tracking track of the target in the local perception image based on a Kalman filtering algorithm, and obtaining from it the tracking track of the target in the corresponding global perception image; and transforming the target tracks in the plurality of global perception images with the transformation matrices of adjacent global perception images to obtain the specific trajectory of the target in the 360-degree panoramic spliced image I_G.
As shown in fig. 2, the cross-scale panoramic sensing system adopts a polygonal heterogeneous design to realize 360-degree panoramic perception imaging. It includes twelve groups of fixing devices 1 arranged in a regular dodecagon; each group of fixing devices 1 carries one camera array fixing support 2, and each camera array fixing support 2 carries one camera array 3. That is, the 12 camera arrays 3 are arranged in a polygonal ring on the 12 groups of fixing devices 1, and the 12 camera arrays 3 respectively cover the 12 peripheral regions of the regular dodecagon in the circumferential direction.
In fig. 2 (a), reference numeral 3 marks one camera array; as shown in fig. 2 (b), each camera array 3 includes three perception imaging cameras: two local perception imaging cameras (a first local perception imaging camera 31 and a second local perception imaging camera 32) and one global perception imaging camera 33, with the global perception imaging camera 33 fixed between the first local perception imaging camera 31 and the second local perception imaging camera 32.
The imaging areas of the global perception imaging cameras 33 of two adjacent camera arrays 3 overlap; for each camera array 3, the field angle of the global perception imaging camera 33 is larger than the sum of the field angles of the two local perception imaging cameras, its vertical field angle is more than 2 times the vertical field angle of a local perception imaging camera, and the field-of-view optical axes of the two local perception cameras can be adjusted to any angle to achieve heterogeneous deployment.
The cross-scale panoramic sensing system built in this embodiment comprises 12 global perception imaging cameras 33 and 24 local perception imaging cameras; the field-of-view optical axes of all perception imaging cameras are coplanar with the regular dodecagon, and the reverse extension lines of the field-of-view optical axes of all global perception imaging cameras 33 pass through the center of the regular dodecagon; the imaging resolution of every perception imaging camera (global and local) is no lower than 9 million pixels.
The system is characterized as follows: to avoid blind spots in the panoramic image and to support the subsequent algorithms, the imaging areas of adjacent camera arrays 3 must overlap to a certain extent. For the above system, the horizontal field angle of a single global perception imaging camera 33 theoretically needs to exceed 30 degrees, and in practice should exceed 45 degrees to guarantee a sufficient overlapping region; for each camera array 3, the vertical field angle of the global perception camera should be more than 2 times the vertical field angle of the local perception cameras; and the local perception devices can be adjusted to any angle for heterogeneous deployment, as long as the vertical field angle of the global perception imaging camera 33 remains more than 2 times that of the corresponding local perception imaging cameras.
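The 30-degree and 45-degree figures above follow from simple geometry: 12 global cameras must cover 360 degrees, and any field angle beyond the 30-degree minimum becomes view shared with the neighbouring cameras. A quick check:

```python
N_ARRAYS = 12
min_hfov = 360 / N_ARRAYS      # zero-overlap minimum horizontal field angle
deployed_hfov = 45             # practical value recommended in the text

# adjacent optical axes are min_hfov degrees apart, so each adjacent pair
# of global cameras shares (deployed_hfov - min_hfov) degrees of view
overlap_per_pair = deployed_hfov - min_hfov
```

The 15 degrees of shared view per adjacent pair is what gives the feature matcher enough common content to estimate the adjacent-image transformation matrices reliably.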
In addition, the structure of the training data acquisition device is shown in fig. 3: it comprises one camera array fixing support 2 of the cross-scale panoramic sensing system and one camera array 3 fixed on it, the camera array 3 consisting of two local perception imaging cameras (a first local perception imaging camera 31 and a second local perception imaging camera 32) and one global perception imaging camera 33; the configuration requirements of these cameras are the same as those of a camera array 3 in the cross-scale panoramic sensing system.
In fig. 3, A is an assumed tracked target in the scene, B is an imaging schematic of the global perception imaging camera 33 capturing the target, C and D are imaging schematics of the first local perception imaging camera 31 and the second local perception imaging camera 32 capturing the target, and E marks the regions in B corresponding to the local imaging of the C and D cameras. As fig. 3 shows, the scale of the target differs greatly between the global and the local perception imaging cameras, and the target is small under global perception, which degrades target tracking accuracy in the global perception image.
Further, in S1, calibrating the perception imaging camera positions of the built training data acquisition device, acquiring data with the device, and then training to obtain the cross-scale target tracking model specifically comprises the following steps:
(1) Position calibration is performed on the two local perception imaging cameras (the first local perception imaging camera 31 and the second local perception imaging camera 32) and the global perception imaging camera 33 of the training data acquisition device shown in fig. 3, and the mapping matrices M_1 and M_2 of the two local perception imaging cameras relative to the global perception imaging camera 33 are obtained by a feature point matching method;
(2) Diverse large-scene data are collected with the built training data acquisition device: image data are acquired at specific sites to obtain a local perception image data set and a global perception image data set, and positioning frames are annotated for the targets captured in both data sets; the labeled local perception image data set is then divided into a training set, a test set, and a verification set according to a preset proportion (8:1:1);
(3) The model is trained on the training set with an existing target tracking algorithm and its corresponding loss function, iterating according to a preset training strategy (learning rate, batch size, number of epochs, and optimizer) until the loss function converges, which yields the large-scene cross-scale target tracking model and completes the training of the global cross-scale target tracking model.
The feature point matching method includes but is not limited to the SURF feature point matching algorithm; the specific sites include but are not limited to wide, target-dense places such as pedestrian street squares and intersections; and the target tracking algorithm may be, but is not limited to, the ByteTrack algorithm.
Further, S2 specifically includes:
S21, the cross-scale panoramic sensing system shown in fig. 2 is built, and image data of the 12 groups of camera arrays are synchronously acquired by using a square-wave synchronous triggering method, wherein the global perception image perceived by the i-th group of camera arrays at a certain time t is denoted G_i and the perceived local perception images are denoted L_i^1 and L_i^2, i representing the index of a group of camera arrays, i ∈ {1,2,…,12};
S22, for the global perception image sequence {G_i} at time t, an image-feature-based matching algorithm is used to extract and match features between each pair of adjacent images G_i and G_j in the sequence, obtaining the transformation matrix T_{i,j} of adjacent global perception images, i ∈ {1,2,…,11}, j = i + 1, with which adjacent images can be stitched;
S23, the 12 global perception images are spliced and fused by using the transformation matrices T_{i,j} of adjacent global perception images to obtain a 360-degree panoramic spliced image I_G, thus completing the fusion calculation of the ultra-high-resolution panoramic image.
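Splicing all 12 global perception images into one panorama requires expressing every image in a single reference frame, which is done by chaining the adjacent-pair transformation matrices. A small sketch under the assumption that each adjacent transform maps the coordinates of image i+1 into image i (the patent does not fix the direction, so this is one possible convention):

```python
import numpy as np

def cumulative_transforms(adjacent_T):
    """Given adjacent-pair homographies, where adjacent_T[k] maps
    image k+2 into image k+1 (1-based T_{i,i+1}), return the list of
    transforms placing every image into the frame of image 1."""
    to_ref = [np.eye(3)]  # image 1 is its own reference frame
    for T in adjacent_T:
        # compose: frame_1 <- frame_i <- frame_{i+1}
        to_ref.append(to_ref[-1] @ T)
    # re-normalise so each matrix has H[2, 2] == 1
    return [H / H[2, 2] for H in to_ref]
```

Each returned matrix can then be used to warp the corresponding global perception image (or a target track inside it) into the common panorama coordinates before blending the overlaps.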
Further, S3 specifically includes:
S31, the 24 local perception images perceived at time t by the 12 groups of camera arrays of the cross-scale panoramic perception system are respectively processed by using the cross-scale target tracking model obtained by training in S1, obtaining the coordinate position positioning result of the target in the local perception images;
S32, the position at time t of each target track stored at time t-1 is predicted based on a Kalman filtering algorithm, a similarity matrix is calculated based on an IoU strategy or appearance characterization vectors, and matching is then performed by using the Hungarian algorithm; that is, the target tracks stored at time t-1 are matched, via their predicted positions at time t, with the target positioning results at time t;
S33, for the matched tracks, the track at time t is updated by using the Kalman filtering algorithm according to the matching result and the target positioning result at time t, and the updated track is then taken as the target tracking track at time t; unmatched tracks are put into the unmatched track set to be matched at the next moment; each unmatched positioning result is added as a new track into the current tracking track set and updated according to the Kalman filtering algorithm;
S34, according to the mapping matrices H_1 and H_2, the tracking track of the target in each local perception image is mapped back to the corresponding global perception image, obtaining the tracking track of the target in the global perception image;
S35, for the acquired target tracks of the global perception images, the transformation matrices T_{i,j} of adjacent global perception images are used to transform the target tracks of the 12 global perception images into the 360-degree panoramic spliced image I_G, obtaining the specific track of the target therein and realizing cross-scale target tracking on the hundred-megapixel-level panoramic image.
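The data-association part of S32 can be sketched as follows: build an IoU similarity matrix between the boxes predicted for the stored tracks and the boxes located at time t, then solve the assignment with the Hungarian algorithm (here via `scipy.optimize.linear_sum_assignment`). Boxes are assumed to be `(x1, y1, x2, y2)` tuples and the threshold value is illustrative, not from the patent.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter + 1e-9)

def match_tracks(predicted_boxes, detected_boxes, iou_threshold=0.3):
    """Return (matches, unmatched_track_ids, unmatched_detection_ids)."""
    if not predicted_boxes or not detected_boxes:
        return ([], list(range(len(predicted_boxes))),
                list(range(len(detected_boxes))))
    sim = np.array([[iou(p, d) for d in detected_boxes]
                    for p in predicted_boxes])
    rows, cols = linear_sum_assignment(-sim)  # maximise total IoU
    matches = [(r, c) for r, c in zip(rows, cols)
               if sim[r, c] >= iou_threshold]
    matched_r = {r for r, _ in matches}
    matched_c = {c for _, c in matches}
    unmatched_tracks = [r for r in range(len(predicted_boxes))
                        if r not in matched_r]
    unmatched_dets = [c for c in range(len(detected_boxes))
                      if c not in matched_c]
    return matches, unmatched_tracks, unmatched_dets
```

The three returned lists correspond directly to the three cases of S33: matched tracks are updated, unmatched tracks wait for the next moment, and unmatched detections start new tracks.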
The traditional target tracking method performs target positioning and track tracking on a local scene perceived by a monocular camera, and under panoramic imaging conditions it is difficult for such a method to cope with the large scale changes a target undergoes during panoramic imaging. The panoramic target tracking requirements of large scenes, namely global perception of the scene and high-precision tracking of multi-scale targets, therefore place higher demands on both the panoramic imaging hardware and the target tracking method.
Therefore, the embodiment of the invention provides a large-scene-oriented cross-scale target tracking method, in which a multi-camera heterogeneous panoramic sensing system is used to perform multi-scale high-resolution panoramic imaging of targets, the targets are then tracked at large scale through the local perception images, and the tracks are finally mapped to the global perception images to realize cross-scale high-precision target track tracking. The panoramic sensing system used by the invention has a simple structure and is convenient to deploy; it solves the problems of low target perception resolution and poor edge imaging quality, satisfies the data acquisition and training conditions of large-scene panoramic high-precision target tracking, and provides data support for training a high-precision target tracking method; ultra-high-resolution panoramic target tracking is then realized based on the designed cross-scale target tracking method, so the target tracking method provided by the embodiment of the invention has high precision and good applicability.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not limited to a strict order of execution and may be performed in other orders. Moreover, at least a portion of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 4, there is provided a cross-scale target tracking apparatus, comprising:
the cross-scale target tracking model training module 401 is used for building a training data acquisition device, calibrating the position of a perception imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a cross-scale target tracking model;
the panoramic image fusion module 402 is used for building a cross-scale panoramic sensing system, collecting images by using the cross-scale panoramic sensing system to obtain a plurality of global perception images and a plurality of local perception images, and calculating the transformation matrix of adjacent images of the global perception images; and for splicing and fusing the plurality of global perception images by using the transformation matrix of the adjacent images of the global perception images to obtain a 360-degree panoramic spliced image I_G;
the cross-scale target tracking module 403 is configured to respectively process, by using the cross-scale target tracking model obtained through training by the cross-scale target tracking model training module 401, the plurality of local perception images acquired by the panoramic image fusion module 402, to obtain the coordinate position positioning result of the target in the local perception images; to calculate the tracking track of the target in the local perception image based on a Kalman filtering algorithm, and obtain the tracking track of the target in the corresponding global perception image from it; and to transform the target tracks in the plurality of global perception images by using the transformation matrices of adjacent global perception images to obtain the specific track of the target in the 360-degree panoramic spliced image I_G.
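The Kalman-filter-based prediction and update used by the tracking module can be sketched as a constant-velocity filter over a target's box centre. This is a minimal illustration only; practical trackers such as ByteTrack also track box scale and aspect ratio, and the noise parameters here are arbitrary assumptions.

```python
import numpy as np

class ConstantVelocityKalman:
    """Minimal constant-velocity Kalman filter over a position (x, y)
    with unit time step; state vector is [x, y, vx, vy]."""

    def __init__(self, x, y, q=1.0, r=1.0):
        self.state = np.array([x, y, 0.0, 0.0])
        self.P = np.eye(4) * 10.0                 # state covariance
        self.F = np.eye(4)                        # transition: x += vx, y += vy
        self.F[0, 2] = self.F[1, 3] = 1.0
        self.H = np.zeros((2, 4))                 # we only observe position
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = np.eye(4) * q                    # process noise
        self.R = np.eye(2) * r                    # measurement noise

    def predict(self):
        """Predict the state at the next moment (used to match tracks)."""
        self.state = self.F @ self.state
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.state[:2]

    def update(self, zx, zy):
        """Correct the track with a matched positioning result."""
        innovation = np.array([zx, zy]) - self.H @ self.state
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)  # Kalman gain
        self.state = self.state + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.state[:2]
```

Per tracked target, `predict()` supplies the position used for the similarity matrix at time t, and `update()` refines the track once a detection has been matched to it.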
For specific definition of the cross-scale target tracking device, refer to the above definition of a cross-scale target tracking method, which is not described herein again. The modules in the cross-scale target tracking device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which includes a memory and a processor, the memory storing a computer program which, when executed by the processor, implements all or part of the flows of the methods of the above embodiments.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed, implements all or part of the flows of the methods of the above embodiments.
It will be understood by those skilled in the art that all or part of the flows of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the flows of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several implementations of the present application, and their description is relatively specific and detailed, but should not therefore be construed as limiting the scope of the invention patent. It should be noted that, for a person skilled in the art, several variations and improvements can be made without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A cross-scale target tracking method is characterized by comprising the following steps:
building a training data acquisition device, calibrating the position of a perception imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a cross-scale target tracking model;
step two, a cross-scale panoramic sensing system is built, the cross-scale panoramic sensing system is used for collecting images to obtain a plurality of global perception images and a plurality of local perception images, and a transformation matrix of adjacent images of the global perception images is obtained through calculation; the plurality of global perception images are spliced and fused by using the transformation matrix of the adjacent images of the global perception images to obtain a 360-degree panoramic spliced image I_G;
step three, processing the plurality of local perception images acquired in step two by respectively using the cross-scale target tracking model obtained by training in step one, to obtain a coordinate position positioning result of the target in the local perception images; calculating a tracking track of the target in the local perception image based on a Kalman filtering algorithm, and obtaining the tracking track of the target in the corresponding global perception image according to the tracking track of the target in the local perception image; and transforming the target tracks in the plurality of global perception images by using the transformation matrices of adjacent images of the global perception images to obtain the specific track of the target in the 360-degree panoramic spliced image I_G.
2. The cross-scale target tracking method according to claim 1, wherein the cross-scale panoramic sensing system comprises twelve groups of fixing devices arranged in a regular dodecagon, each group of fixing devices is fixed with one camera array fixing bracket, and each camera array fixing bracket is fixed with one group of camera arrays; the twelve groups of camera arrays respectively and correspondingly cover twelve peripheral areas in the circumferential direction of the regular dodecagon;
each group of camera array comprises three perception imaging cameras, each three perception imaging camera comprises two local perception imaging cameras and a global perception imaging camera, and the global perception imaging camera is fixed between the two local perception imaging cameras;
the imaging areas of the global perception imaging cameras of the two adjacent groups of camera arrays have overlapping parts; for each group of camera array, the field angle of the global perception imaging camera is larger than the sum of the field angles of the two local perception imaging cameras, the vertical field angle of the global perception imaging camera is larger than 2 times of the vertical field angle of the local perception imaging camera, and the angle of the field-of-view optical axis of the local perception imaging camera can be adjusted at will.
3. The cross-scale target tracking method according to claim 2, wherein the training data acquisition device comprises one of the camera array fixing brackets and a group of the camera arrays fixed on the camera array fixing bracket.
4. The cross-scale target tracking method according to claim 3, wherein in the cross-scale panoramic sensing system, the optical axes of the fields of view of all the perception imaging cameras are coplanar with the regular dodecagon, and the reverse extension lines of the optical axes of the fields of view of all the global perception imaging cameras pass through the center of the regular dodecagon; the horizontal field angles of all the global perception imaging cameras are greater than or equal to 45 degrees, and the imaging resolutions of all the global perception imaging cameras and all the local perception imaging cameras are greater than or equal to 9 million pixels.
5. The cross-scale target tracking method according to claim 4, wherein in the first step, the position of the perception imaging camera is calibrated for the built training data acquisition device, the built training data acquisition device is used for data acquisition, and then the training is carried out to obtain the cross-scale target tracking model specifically comprises the following steps:
position calibration is carried out on the two local perception imaging cameras and the global perception imaging camera in the training data acquisition device, and the mapping matrices H_1 and H_2 of the two local perception imaging cameras relative to the global perception imaging camera are obtained by using a feature point matching method;
acquiring image data at a specific place by using a built training data acquisition device to obtain a local perception image data set and a global perception image data set, and marking a positioning frame on a target acquired in the local perception image data set and the global perception image data set; dividing a local perception image data set with labels into a training set, a test set and a verification set according to a preset proportion;
training the training set by using the existing target tracking algorithm and the loss function corresponding to the existing target tracking algorithm, and iterating the training model according to a preset training strategy until the loss function is converged to obtain the cross-scale target tracking model.
6. The cross-scale target tracking method according to claim 5, wherein the feature point matching method is the SURF feature point matching algorithm, and the specific places include pedestrian streets and intersections.
7. The cross-scale target tracking method according to claim 5, wherein the preset ratio is 8:1:1, and the target tracking algorithm is the ByteTrack algorithm.
8. The cross-scale target tracking method according to claim 5, wherein the second step specifically comprises:
building a cross-scale panoramic sensing system, synchronously acquiring image data of the 12 groups of camera arrays by using a square-wave synchronous triggering method, denoting the global perception image perceived by the i-th group of camera arrays at time t as G_i and the perceived local perception images as L_i^1 and L_i^2, i representing the index of a group of camera arrays, i ∈ {1,2,…,12};
for the global perception image sequence {G_i} at time t, using an image-feature-based matching algorithm to extract and match features between each pair of adjacent images G_i and G_j in the sequence, obtaining the transformation matrix T_{i,j} of adjacent global perception images;
splicing and fusing the 12 global perception images by using the transformation matrices T_{i,j} of adjacent global perception images to obtain a 360-degree panoramic spliced image I_G.
9. The cross-scale target tracking method according to claim 8, wherein the third step specifically comprises:
respectively processing the 24 local perception images perceived by the 12 groups of camera arrays at time t by using the cross-scale target tracking model obtained by training in step one, to obtain a coordinate position positioning result of the target in the local perception images;
predicting, based on a Kalman filtering algorithm, the position at time t of each target track stored at time t-1, performing similarity matrix calculation based on an IoU strategy or appearance characterization vectors, and then performing matching by using the Hungarian algorithm;
for the matched track, combining the matching result and the target positioning result at the time t, updating the track at the time t by using a Kalman filtering algorithm, and then taking the updated track as the target tracking track at the time t; putting the unmatched tracks into the unmatched track set, and matching at the next moment; adding the unmatched positioning result into the current tracking track set as a new track, and updating the track according to a Kalman filtering algorithm;
according to the mapping matrices H_1 and H_2, mapping the tracking track of the target in each local perception image back to the corresponding global perception image to obtain the tracking track of the target in the global perception image;
for the acquired target tracks of the global perception images, transforming the target tracks of the 12 global perception images by using the transformation matrices T_{i,j} of adjacent global perception images to obtain the specific track of the target in the 360-degree panoramic spliced image I_G, realizing cross-scale target tracking on the hundred-megapixel-level panoramic image.
10. A cross-scale target tracking device, comprising:
the trans-scale target tracking model training module is used for building a training data acquisition device, calibrating the position of a perception imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a trans-scale target tracking model;
the panoramic image fusion module is used for building a cross-scale panoramic sensing system, acquiring images by using the cross-scale panoramic sensing system to obtain a plurality of global perception images and a plurality of local perception images, and calculating a transformation matrix of adjacent images of the global perception images; and for splicing and fusing the plurality of global perception images by using the transformation matrix of the adjacent images of the global perception images to obtain a 360-degree panoramic spliced image I_G;
the cross-scale target tracking module is used for respectively processing, by using the cross-scale target tracking model obtained by training of the cross-scale target tracking model training module, the plurality of local perception images collected by the panoramic image fusion module, to obtain a coordinate position positioning result of the target in the local perception images; calculating a tracking track of the target in the local perception image based on a Kalman filtering algorithm, and obtaining the tracking track of the target in the corresponding global perception image according to the tracking track of the target in the local perception image; and transforming the target tracks in the plurality of global perception images by using the transformation matrices of adjacent global perception images to obtain the specific track of the target in the 360-degree panoramic spliced image I_G.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210999023.2A CN115330838A (en) | 2022-08-19 | 2022-08-19 | Cross-scale target tracking method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115330838A true CN115330838A (en) | 2022-11-11 |
Family
ID=83925796
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210999023.2A Pending CN115330838A (en) | 2022-08-19 | 2022-08-19 | Cross-scale target tracking method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115330838A (en) |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | ||
Effective date of registration: 20240204 Address after: 314001 9F, No.705, Asia Pacific Road, Nanhu District, Jiaxing City, Zhejiang Province Applicant after: ZHEJIANG YANGTZE DELTA REGION INSTITUTE OF TSINGHUA University Country or region after: China Address before: No.152 Huixin Road, Nanhu District, Jiaxing City, Zhejiang Province 314000 Applicant before: ZHEJIANG FUTURE TECHNOLOGY INSTITUTE (JIAXING) Country or region before: China |