CN115330838A - Cross-scale target tracking method and device - Google Patents

Cross-scale target tracking method and device

Info

Publication number
CN115330838A
Authority
CN
China
Prior art keywords
perception
target
images
global
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210999023.2A
Other languages
Chinese (zh)
Inventor
高坤
邵航
夏晗
刘威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangtze Delta Region Institute of Tsinghua University Zhejiang
Original Assignee
Zhejiang Future Technology Institute (jiaxing)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Future Technology Institute (jiaxing) filed Critical Zhejiang Future Technology Institute (jiaxing)
Priority to CN202210999023.2A priority Critical patent/CN115330838A/en
Publication of CN115330838A publication Critical patent/CN115330838A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038 Image mosaicing, e.g. composing plane images from plane sub-images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/225 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on a marking or identifier characterising the area
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/32 Indexing scheme for image data processing or generation, in general involving image mosaicing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a cross-scale target tracking method and device, wherein the method comprises the following steps: building a training data acquisition device and training to obtain a cross-scale target tracking model; constructing a cross-scale panoramic sensing system, collecting images, and calculating the transformation matrices of adjacent global perception images; stitching and fusing a plurality of global perception images using the transformation matrices of adjacent images to obtain a 360-degree panoramic stitched image I_G; calculating the coordinate positioning result of the target in the local perception images; obtaining the tracking track of the target in the corresponding global perception image from the tracking track of the target in the local perception image; and obtaining, through the stitching, the specific trajectory of the target in the 360-degree panoramic stitched image I_G. The panoramic sensing system used by the invention solves the problems of low target perception resolution and poor edge imaging quality and provides data support for training a high-precision target tracking method, so that the target tracking method has high precision and good applicability.

Description

Cross-scale target tracking method and device
Technical Field
The application relates to the technical field of deep learning, in particular to a cross-scale target tracking method and device.
Background
Existing large-scene panoramic imaging approaches generally either stitch the images of multiple cameras into a panorama or use multiple fisheye lenses for panoramic imaging. Both approaches suffer from low perception resolution for small-scale targets and poor edge imaging quality, so that cross-scale target tracking under motion blur, similar-background interference, occlusion, or illumination change has low precision and poor applicability; in addition, these deployments suffer from high equipment prices and high deployment costs.
Therefore, the inventor recognizes that, to meet the requirement of large-scene panoramic target tracking, a large-scene panoramic imaging solution that is simple in structure, convenient to deploy, and capable of cross-scale high-precision target tracking is urgently needed.
Disclosure of Invention
Based on the above, a cross-scale target tracking method and device are provided to solve the above technical problems.
In a first aspect, a method for tracking a cross-scale target includes:
step one, building a training data acquisition device, calibrating the position of the perception imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a cross-scale target tracking model;
step two, a cross-scale panoramic sensing system is built, and the cross-scale panoramic sensing system is used to collect images to obtain a plurality of global perception images and a plurality of local perception images; the transformation matrices of adjacent global perception images are obtained by calculation; and the plurality of global perception images are stitched and fused using the transformation matrices of adjacent global perception images to obtain a 360-degree panoramic stitched image I_G;
step three, the plurality of local perception images collected in step two are processed respectively using the cross-scale target tracking model trained in step one to obtain the coordinate position positioning result of the target in the local perception images; the tracking track of the target in the local perception image is calculated based on a Kalman filtering algorithm, and the tracking track of the target in the corresponding global perception image is obtained from the tracking track of the target in the local perception image; and the target tracks in the plurality of global perception images are transformed using the transformation matrices of adjacent global perception images to obtain the specific trajectory of the target in the 360-degree panoramic stitched image I_G.
Optionally, the cross-scale panoramic sensing system comprises twelve groups of fixing devices arranged in a regular dodecagon shape, each group of fixing devices is fixed with one camera array fixing support, and each camera array fixing support is fixed with one group of camera arrays; the twelve groups of camera arrays respectively and correspondingly cover twelve peripheral areas in the circumferential direction of the regular dodecagon;
each group of camera arrays comprises three perception imaging cameras; the three perception imaging cameras comprise two local perception imaging cameras and one global perception imaging camera, and the global perception imaging camera is fixed between the two local perception imaging cameras;
the imaging areas of the global perception imaging cameras of the two adjacent groups of camera arrays have overlapping parts; for each group of camera array, the field angle of the global perception imaging camera is larger than the sum of the field angles of the two local perception imaging cameras, the vertical field angle of the global perception imaging camera is larger than 2 times of the vertical field angle of the local perception imaging camera, and the angle of the field-of-view optical axis of the local perception imaging camera can be adjusted at will.
Further optionally, the training data collecting device includes one camera array fixing bracket and a group of camera arrays fixed on the camera array fixing bracket.
Further optionally, in the cross-scale panoramic sensing system, the optical axes of the fields of view of all the perception imaging cameras are coplanar with the regular dodecagon, and the reverse extension lines of the optical axes of the fields of view of all the global perception imaging cameras pass through the center of the regular dodecagon; the horizontal field angles of all the global perception imaging cameras are larger than or equal to 45 degrees, and the imaging resolutions of all the global perception imaging cameras and all the local perception imaging cameras are no less than 9 million pixels (9 MP).
Further optionally, in step one, the calibrating of the position of the perception imaging camera of the built training data acquisition device, the acquiring of data by using the built training data acquisition device, and the subsequent training to obtain the cross-scale target tracking model specifically include:
position calibration is performed on the two local perception imaging cameras and the global perception imaging camera in the training data acquisition device, and the mapping matrices H_L1→G and H_L2→G of the two local perception imaging cameras relative to the global perception imaging camera are obtained using a feature point matching method;
acquiring image data at a specific place by using a built training data acquisition device to obtain a local perception image data set and a global perception image data set, and marking a positioning frame on a target acquired in the local perception image data set and the global perception image data set; dividing a local perception image data set with labels into a training set, a testing set and a verification set according to a preset proportion;
training on the training set by using an existing target tracking algorithm and the loss function corresponding to the existing target tracking algorithm, and iteratively training the model according to a preset training strategy until the loss function converges, to obtain the cross-scale target tracking model.
Further optionally, the feature point matching method is the SURF feature point matching algorithm; the specific sites include pedestrian streets and intersections.
Further optionally, the preset proportion is 8:1:1; the target tracking algorithm is the ByteTrack algorithm.
Further optionally, the second step specifically includes:
constructing a cross-scale panoramic sensing system, synchronously acquiring image data from the 12 groups of camera arrays by using a square-wave synchronous triggering method, and recording the global perception image captured by the i-th camera array at time t as I_Gi^t and the corresponding local perception images as I_Li^t, where i denotes the index of a group of camera arrays, i ∈ {1, 2, …, 12};
for the global perception image sequence {I_Gi^t} at time t, separately extracting and matching features between adjacent images I_Gi^t and I_Gj^t in the sequence using an image-feature-based matching algorithm to obtain the transformation matrix H_ij^t of adjacent global perception images, i ∈ {1, 2, …, 11}, j = i + 1;
stitching and fusing the 12 global perception images using the transformation matrices H_ij^t of adjacent global perception images to obtain a 360-degree panoramic stitched image I_G.
Further optionally, the step three specifically includes:
respectively processing the 24 local perception images captured by the 12 groups of camera arrays at time t by using the cross-scale target tracking model trained in step one, to obtain the coordinate position positioning results of the targets in the local perception images;
predicting, based on a Kalman filtering algorithm, the position at time t of each target track stored at time t-1, performing similarity matrix calculation based on an IoU strategy or appearance feature vectors, and then performing matching using the Hungarian algorithm;
for the matched track, combining the matching result and the target positioning result at the time t, updating the track at the time t by using a Kalman filtering algorithm, and then taking the updated track as the target tracking track at the time t; putting the unmatched tracks into the unmatched track set, and matching at the next moment; adding the unmatched positioning result into the current tracking track set as a new track, and updating the track according to a Kalman filtering algorithm;
according to the mapping matrices H_L1→G and H_L2→G, mapping the tracking track of the target in the local perception image back to the corresponding global perception image to obtain the tracking track of the target in the global perception image;
for the acquired target tracks of the global perception images, transforming the target tracks of the 12 global perception images using the transformation matrices H_ij^t of adjacent global perception images to obtain the specific track of the target in the 360-degree panoramic stitched image I_G, thereby realizing cross-scale target tracking on a hundred-megapixel-level panoramic image.
In a second aspect, a cross-scale target tracking device includes:
the trans-scale target tracking model training module is used for building a training data acquisition device, calibrating the position of a perception imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a trans-scale target tracking model;
the panoramic image fusion module is used for building a trans-scale panoramic sensing system, acquiring images by using the trans-scale panoramic sensing system to obtain a plurality of global sensing images and a plurality of local sensing images, and calculating to obtain a transformation matrix of adjacent images of the global sensing images; splicing and fusing a plurality of global perception images by using the transformation matrix of the adjacent images of the global perception images to obtain a 360-degree panoramic spliced image I G
The cross-scale target tracking module is used for processing a plurality of local perception images collected in the panoramic image fusion module by respectively utilizing a cross-scale target tracking model obtained by training of the cross-scale target tracking model training module to obtain a coordinate position positioning result of the target in the local perception images; calculating a tracking track of the target in the local perception image based on a Kalman filtering algorithm, and obtaining the tracking track of the target in the corresponding global perception image according to the tracking track of the target in the local perception image; transforming target tracks in a plurality of global perception images by using transformation matrixes of adjacent images of the global perception images to obtain a 360-degree panoramic mosaic image I G The specific trajectory of the target.
The invention has at least the following beneficial effects:
the embodiment of the invention provides a large-scene-oriented trans-scale target tracking method, which comprises the steps of utilizing a multilateral heterogeneous panoramic sensing system to carry out multi-scale high-resolution panoramic imaging on a target, then carrying out large-scale tracking on the target through a local sensing image, and finally mapping the target to a global sensing image to realize trans-scale high-precision target track tracking; the panoramic sensing system used by the invention has simple structure and convenient deployment, solves the problems of low target sensing resolution and poor edge imaging quality, meets the data acquisition and training conditions of large-scene panoramic high-precision target tracking, provides data support for the training of a high-precision target tracking method, and then realizes ultrahigh-resolution panoramic target tracking, so that the target tracking method provided by the embodiment of the invention has high precision and good applicability.
Drawings
Fig. 1 is a schematic flowchart of a cross-scale target tracking method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a cross-scale panoramic sensing system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a training data acquisition device according to an embodiment of the present invention;
fig. 4 is a block diagram of a module architecture of a cross-scale target tracking apparatus according to an embodiment of the present invention.
Description of reference numerals:
1. a fixing device;
2. a camera array fixing bracket;
3. an array of cameras; 31. a first partially-aware imaging camera; 32. a second local perception imaging camera; 33. a global perception imaging camera.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, as shown in fig. 1, a cross-scale target tracking method is provided. Its main process includes large-scene cross-scale target tracking model training, ultra-high-resolution panoramic image fusion calculation, and ultra-high-resolution panoramic cross-scale target tracking result fusion, and the method adopts a tracking-by-detection strategy to realize target tracking, including the following steps:
S1, building a training data acquisition device, calibrating the position of the perception imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a cross-scale target tracking model;
S2, building a cross-scale panoramic sensing system, and collecting images by using the cross-scale panoramic sensing system to obtain a plurality of global perception images and a plurality of local perception images, and calculating the transformation matrices of adjacent global perception images; stitching and fusing the plurality of global perception images using the transformation matrices of adjacent global perception images to obtain a 360-degree panoramic stitched image I_G;

S3, processing the plurality of local perception images acquired in S2 respectively by using the cross-scale target tracking model trained in S1 to obtain the coordinate position positioning result of the target in the local perception images; calculating the tracking track of the target in the local perception image based on a Kalman filtering algorithm, and obtaining the tracking track of the target in the corresponding global perception image from the tracking track of the target in the local perception image; transforming the target tracks in the plurality of global perception images using the transformation matrices of adjacent global perception images to obtain the specific trajectory of the target in the 360-degree panoramic stitched image I_G.
As shown in fig. 2, the cross-scale panoramic sensing system adopts a polygonal heterogeneous design to realize 360-degree panoramic perception imaging; it includes twelve groups of fixing devices 1 arranged in a regular dodecagon, each group of fixing devices 1 carrying one camera array fixing support 2, and each camera array fixing support 2 carrying one group of camera arrays 3. That is, the 12 groups of camera arrays 3 are arranged in a polygonal array on the 12 groups of fixing devices 1, and the 12 groups of camera arrays 3 respectively cover the 12 peripheral regions of the regular dodecagon in the circumferential direction.
In fig. 2 (a), reference numeral 3 denotes a group of camera arrays. Specifically, as shown in fig. 2 (b), each group of camera arrays 3 includes three perception imaging cameras: two local perception imaging cameras (a first local perception imaging camera 31 and a second local perception imaging camera 32) and one global perception imaging camera 33, with the global perception imaging camera 33 fixed between the first local perception imaging camera 31 and the second local perception imaging camera 32.
The imaging areas of the global perception imaging cameras 33 of two adjacent groups of camera arrays 3 have overlapping portions; for each group of camera arrays 3, the field angle of the global perception imaging camera 33 is greater than the sum of the field angles of the two local perception imaging cameras, the vertical field angle of the global perception imaging camera 33 is greater than 2 times the vertical field angle of a local perception imaging camera, and the field-of-view optical axes of the two local perception cameras can be adjusted to any angle to achieve heterogeneous deployment.
The cross-scale panoramic sensing system built in this embodiment comprises 12 global perception imaging cameras 33 and 24 local perception imaging cameras; the field-of-view optical axes of all the perception imaging cameras are coplanar with the regular dodecagon, and the reverse extension lines of the field-of-view optical axes of all the global perception imaging cameras 33 pass through the center of the regular dodecagon; the imaging resolution of all perception imaging cameras (including the global perception imaging cameras and the local perception imaging cameras) is no lower than 9 million pixels (9 MP).
The system is characterized in that, to avoid dead angles in panoramic imaging and to accommodate the design of subsequent algorithms, the imaging areas of adjacent groups of camera arrays 3 must have a certain overlap. Taking the above system as an example, since the 12 arrays together cover 360 degrees, the horizontal field angle of a single global perception imaging camera 33 theoretically needs to exceed 360/12 = 30 degrees, and in practice should exceed 45 degrees to guarantee a sufficient imaging overlap area; for a group of camera arrays 3, the vertical field angle of the global perception camera should be greater than 2 times the vertical field angle of the local perception cameras; and the local perception devices of the system can be adjusted to any angle for heterogeneous deployment, as long as the vertical field angle of the global perception imaging camera 33 remains at least 2 times that of the corresponding local perception imaging camera. The field-of-view arithmetic behind these numbers is sketched below.
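In code, the overlap condition reduces to simple angular arithmetic. Below is a minimal sketch of that check in Python; the field-of-view values are illustrative assumptions, not measured parameters of the described system:

```python
# Minimal sketch of the field-of-view arithmetic behind the 12-array layout.
# The HFOV values below are illustrative assumptions, not system parameters.
NUM_ARRAYS = 12
SECTOR_DEG = 360.0 / NUM_ARRAYS  # each array must cover a 30-degree sector

def neighbour_overlap_deg(global_hfov_deg: float) -> float:
    """Angular overlap shared with each adjacent global camera.

    Cameras sit every SECTOR_DEG degrees around the dodecagon, so a camera
    covering global_hfov_deg overlaps each neighbour by
    (global_hfov_deg - SECTOR_DEG) degrees; a negative value is a blind gap.
    """
    return global_hfov_deg - SECTOR_DEG

for hfov in (30.0, 45.0, 60.0):
    print(f"HFOV {hfov:4.0f} deg -> overlap per neighbour: "
          f"{neighbour_overlap_deg(hfov):5.1f} deg")
# 30 deg is the theoretical minimum (zero overlap); the >= 45 deg requirement
# leaves >= 15 deg of shared imagery for feature matching and seam fusion.
```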
In addition, the structure of the training data acquisition device is shown in fig. 3; it includes one camera array fixing support 2 from the cross-scale panoramic sensing system and a group of camera arrays 3 fixed on the camera array fixing support 2, where the camera array 3 specifically includes two local perception imaging cameras (a first local perception imaging camera 31 and a second local perception imaging camera 32) and a global perception imaging camera 33, and its configuration requirements are the same as those of a group of camera arrays 3 in the cross-scale panoramic sensing system.
In fig. 3, a is an assumed scene tracking target, B is an imaging schematic diagram of the global perception imaging camera 33 capturing a perception target, C, D respectively represent imaging schematic diagrams of the first local perception imaging camera 31 and the second local perception imaging camera 32 capturing a perception target, and E is a schematic diagram of a region corresponding to C, D camera local imaging in B. As can be seen from fig. 3, the target scales of the sensing target imaged on the global sensing imaging camera and the local sensing imaging camera have a large difference, and the target has a small size under global sensing, which results in poor target tracking accuracy of the global sensing image.
Further, in S1, the calibrating of the position of the perception imaging camera of the built training data acquisition device, the data acquisition using the built training data acquisition device, and the subsequent training to obtain the cross-scale target tracking model specifically include the following steps:
(1) Position calibration is performed on the two local perception imaging cameras (the first local perception imaging camera 31 and the second local perception imaging camera 32) and the global perception imaging camera 33 in the training data acquisition device shown in fig. 3, and the mapping matrices H_L1→G and H_L2→G of the two local perception imaging cameras relative to the global perception imaging camera 33 are obtained using a feature point matching method;
(2) Diverse large-scene data acquisition is carried out using the built training data acquisition device: image data are collected at specific places to obtain a local perception image data set and a global perception image data set, and positioning frames are labeled on the targets captured in the local perception image data set and the global perception image data set; the labeled local perception image data set is divided into a training set, a test set, and a verification set according to a preset ratio (8:1:1);
(3) The training set is trained using an existing target tracking algorithm and the loss function corresponding to that algorithm, and the model is iteratively trained according to a preset training strategy (learning rate, training batch size, training epochs, and optimizer) until the loss function converges, yielding the large-scene cross-scale target tracking model and thereby completing the training of the global cross-scale target tracking model. A schematic sketch of such a training loop follows.
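For illustration only, the iterate-until-convergence strategy can be sketched as follows; the model, data set, and loss function are placeholders standing in for the chosen tracker (e.g. a ByteTrack-style detector), and every hyperparameter value is an assumption rather than the patent's preset strategy:

```python
# Schematic train-until-convergence loop; model, dataset, and loss_fn are
# placeholders for the detector-tracker named in the text. All hyperparameter
# values are illustrative assumptions.
import torch
from torch.utils.data import DataLoader

def train(model, train_set, loss_fn, lr=1e-3, batch_size=16,
          max_epochs=100, tol=1e-4):
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # preset optimizer
    prev_loss = float("inf")
    for epoch in range(max_epochs):                  # preset training epochs
        epoch_loss = 0.0
        for images, targets in loader:               # preset training batches
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        epoch_loss /= len(loader)
        if abs(prev_loss - epoch_loss) < tol:        # loss has converged
            break
        prev_loss = epoch_loss
    return model
```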
The feature point matching method includes, but is not limited to, the SURF feature point matching algorithm; the specific places include, but are not limited to, wide, target-dense places such as pedestrian streets, squares, and crossroads; and the target tracking algorithm may be, but is not limited to, the ByteTrack algorithm.
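A minimal sketch of the mapping-matrix calibration via SURF feature matching might look as follows. It assumes an OpenCV build that ships the non-free xfeatures2d module (falling back to SIFT otherwise); the function name and thresholds are illustrative, and the returned matrix plays the role of H_L1→G or H_L2→G:

```python
# Sketch: estimate the mapping matrix of a local camera relative to the global
# camera by SURF feature matching. Requires opencv-contrib with non-free
# modules enabled; SIFT is used as a stand-in when SURF is unavailable.
import cv2
import numpy as np

def calibrate_local_to_global(local_img, global_img, ratio=0.75):
    try:
        detector = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    except AttributeError:
        detector = cv2.SIFT_create()  # fallback without the non-free build
    kp_l, des_l = detector.detectAndCompute(local_img, None)
    kp_g, des_g = detector.detectAndCompute(global_img, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des_l, des_g, k=2):
        # Lowe ratio test keeps only distinctive matches
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    if len(good) < 4:
        raise ValueError("not enough matches to estimate a homography")
    src = np.float32([kp_l[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_g[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # RANSAC homography: maps local-image coordinates into the global image
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H  # plays the role of H_L1->G (or H_L2->G)
```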
Further, S2 specifically includes:
S21, constructing the cross-scale panoramic sensing system shown in fig. 2, and synchronously acquiring image data from the 12 groups of camera arrays by using a square-wave synchronous triggering method; the global perception image captured by the i-th camera array at a certain time t is denoted as I_Gi^t and the corresponding local perception images as I_Li^t, where i represents the index of a group of camera arrays, i ∈ {1, 2, …, 12};
S22, for the global perception image sequence {I_Gi^t} at time t, separately extracting and matching features between adjacent images I_Gi^t and I_Gj^t in the sequence using an image-feature-based matching algorithm to obtain the transformation matrix H_ij^t of adjacent global perception images, i ∈ {1, 2, …, 11}, j = i + 1, with which neighboring images can be stitched;
S23, stitching and fusing the 12 global perception images using the transformation matrices H_ij^t of adjacent global perception images to obtain a 360-degree panoramic stitched image I_G, thereby completing the ultra-high-resolution panoramic image fusion calculation; a stitching sketch follows.
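A minimal stitching sketch follows, under the assumption that each H_ij^t maps image j's coordinates into image i's frame (the text does not fix this direction; it is a convention choice). The sketch chains the pairwise matrices to a common reference frame and fuses by simple overwrite rather than seam-aware blending:

```python
# Sketch of S23: chain the 11 pairwise transforms (assumed to map image j
# into image i's frame) to one reference frame and warp all 12 global images
# onto a single canvas. The canvas size and offset are illustrative.
import cv2
import numpy as np

def stitch_panorama(images, H_adj, canvas_size=(12000, 3000)):
    """images: list of 12 global images (0-based);
    H_adj[k]: 3x3 transform mapping images[k+1] into images[k]'s frame."""
    w, h = canvas_size
    # translation so warped content stays inside the canvas
    offset = np.array([[1, 0, w // 4], [0, 1, h // 4], [0, 0, 1]], np.float64)
    pano = np.zeros((h, w, 3), np.uint8)
    T = np.eye(3)
    for idx, img in enumerate(images):
        if idx > 0:
            T = T @ H_adj[idx - 1]        # accumulate into the reference frame
        warped = cv2.warpPerspective(img, offset @ T, (w, h))
        mask = warped.sum(axis=2) > 0
        pano[mask] = warped[mask]          # naive fusion: later image wins
    return pano
```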
Further, S3 specifically includes:
S31, respectively processing the 24 local perception images captured by the 12 groups of camera arrays of the cross-scale panoramic sensing system at time t by using the cross-scale target tracking model trained in S1, to obtain the coordinate position positioning results of the targets in the local perception images;
S32, predicting, based on a Kalman filtering algorithm, the position at time t of each target track stored at time t-1, performing similarity matrix calculation based on an IoU strategy or appearance feature vectors, and then performing matching using the Hungarian algorithm; that is, each target track stored at time t-1 is matched, through its predicted position at time t, against the target positions detected at time t (a minimal sketch of the prediction model follows);
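The prediction step can be illustrated with a constant-velocity Kalman filter over the bounding-box state; the state layout and noise values below are assumptions for illustration, not the patent's tuned parameters:

```python
# Sketch of a constant-velocity Kalman model for track prediction:
# state (cx, cy, w, h, vx, vy, vw, vh); noise values are illustrative.
import numpy as np

class BoxKalman:
    def __init__(self, box):  # box = (cx, cy, w, h)
        self.x = np.r_[box, np.zeros(4)].astype(float)   # state vector
        self.P = np.eye(8) * 10.0                        # state covariance
        self.F = np.eye(8); self.F[:4, 4:] = np.eye(4)   # constant velocity
        self.H = np.eye(4, 8)                            # observe box only
        self.Q = np.eye(8) * 1e-2                        # process noise
        self.R = np.eye(4) * 1e-1                        # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:4]                                # predicted box at t

    def update(self, z):                                 # z: matched detection
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        self.x += K @ (np.asarray(z, float) - self.H @ self.x)
        self.P = (np.eye(8) - K @ self.H) @ self.P
```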
S33, for each matched track, updating the track at time t with a Kalman filtering algorithm by combining the matching result and the target positioning result at time t, and then taking the updated track as the target tracking track at time t; putting unmatched tracks into the unmatched track set for matching at the next moment; and adding each unmatched positioning result to the current tracking track set as a new track, updating it according to the Kalman filtering algorithm (the association and bookkeeping are sketched below);
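The association step (similarity matrix plus Hungarian matching) and the three-way track bookkeeping can be sketched as follows, assuming axis-aligned boxes and the IoU-based similarity variant; the threshold is illustrative:

```python
# Sketch of S32/S33 association: IoU cost matrix plus Hungarian assignment.
# Track bookkeeping is reduced to its essentials.
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def match(tracks, detections, iou_thresh=0.3):
    """tracks: boxes predicted at time t by the Kalman filter;
    detections: localization results at time t. Returns matched index pairs,
    unmatched track indices, and unmatched detection indices."""
    if not tracks or not detections:
        return [], list(range(len(tracks))), list(range(len(detections)))
    cost = np.array([[1.0 - iou(t, d) for d in detections] for t in tracks])
    rows, cols = linear_sum_assignment(cost)          # Hungarian algorithm
    matches = [(r, c) for r, c in zip(rows, cols)
               if cost[r, c] < 1.0 - iou_thresh]
    um_t = [r for r in range(len(tracks)) if r not in {m[0] for m in matches}]
    um_d = [c for c in range(len(detections)) if c not in {m[1] for m in matches}]
    return matches, um_t, um_d
# Matched pairs update their Kalman state with the detection; unmatched tracks
# are kept for re-matching at t+1; unmatched detections start new tracks.
```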
S34, according to the mapping matrices H_L1→G and H_L2→G, mapping the tracking track of the target in the local perception image back to the corresponding global perception image to obtain the tracking track of the target in the global perception image;
S35, for the acquired target tracks of the global perception images, transforming the target tracks of the 12 global perception images using the obtained transformation matrices H_ij^t of adjacent global perception images to obtain the specific track of the target in the 360-degree panoramic stitched image I_G, thereby realizing cross-scale target tracking on a hundred-megapixel-level panoramic image; the two coordinate hops are sketched below.
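A minimal sketch of the two coordinate hops in S34 and S35, using cv2.perspectiveTransform; the argument names are hypothetical, and the chaining convention matches the stitching sketch above:

```python
# Sketch of S34/S35: carry a track's points from a local perception image to
# the global image (via H_L->G) and on into the panorama frame (via the
# chained adjacent-image transforms used for stitching).
import cv2
import numpy as np

def map_points(points_xy, H):
    """Apply a 3x3 homography to an (N, 2) array of pixel coordinates."""
    pts = np.asarray(points_xy, np.float32).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, H.astype(np.float64)).reshape(-1, 2)

def track_to_panorama(track_xy, H_local_to_global, T_global_to_pano):
    """track_xy: (N, 2) track points in a local perception image.
    H_local_to_global: H_L1->G or H_L2->G from calibration.
    T_global_to_pano: accumulated transform of that global image into I_G."""
    in_global = map_points(track_xy, H_local_to_global)   # S34
    return map_points(in_global, T_global_to_pano)        # S35
```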
Traditional target tracking methods perform target positioning and track tracking on a local scene perceived by a monocular camera, and under panoramic imaging conditions they struggle to cope with the large scale changes that a target undergoes during panoramic imaging. The large-scene panoramic target tracking requirement, namely global perception of a large scene combined with high-precision tracking of multi-scale targets, places higher demands on both panoramic imaging hardware and the target tracking method.
Therefore, the embodiment of the invention provides a large-scene-oriented cross-scale target tracking method, which comprises the steps of utilizing a multilateral heterogeneous panoramic sensing system to carry out multi-scale high-resolution panoramic imaging on a target, then carrying out large-scale tracking on the target through a local sensing image, and finally mapping the target to a global sensing image to realize cross-scale high-precision target track tracking; the panoramic sensing system utilized by the invention has a simple structure and is convenient to deploy, the problems of low target sensing resolution and poor edge imaging quality are solved, the data acquisition and training conditions of large-scene panoramic high-precision target tracking are met, data support is provided for the training of a high-precision target tracking method, and then the ultrahigh-resolution panoramic target tracking is realized based on the designed cross-scale target tracking method, so that the target tracking method provided by the embodiment of the invention has high precision and good applicability.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in order as indicated by the arrows, the steps are not necessarily performed strictly in the order indicated. Unless explicitly stated otherwise herein, the steps are not bound to a strict order and may be performed in other orders. Moreover, at least a portion of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 4, there is provided a cross-scale target tracking apparatus, comprising:
the cross-scale target tracking model training module 401 is used for building a training data acquisition device, calibrating the position of a perception imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a cross-scale target tracking model;
the panoramic image fusion module 402 is used for building a trans-scale panoramic sensing system, collecting images by using the trans-scale panoramic sensing system to obtain a plurality of global sensing images and a plurality of local sensing images, and calculating a transformation matrix of adjacent images of the global sensing images; splicing and fusing a plurality of global perception images by utilizing the transformation matrix of the adjacent images of the global perception image to obtain a 360-degree panoramic spliced image I G
The cross-scale target tracking module 403 is configured to process, by using a cross-scale target tracking model obtained through training by the cross-scale target tracking model training module 401, multiple local perception images acquired by the panoramic image fusion module 402, respectively, to obtain a coordinate position positioning result of a target in the local perception images; calculating a tracking track of the target in the local perception image based on a Kalman filtering algorithm, and obtaining the tracking track of the target in the corresponding global perception image according to the tracking track of the target in the local perception image; transforming target tracks in a plurality of global perception images by using transformation matrixes of adjacent images of the global perception images to obtain a 360-degree panoramic mosaic image I G The specific trajectory of the target.
For specific definition of the cross-scale target tracking device, refer to the above definition of a cross-scale target tracking method, which is not described herein again. The modules in the cross-scale target tracking device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which includes a memory and a processor, the memory storing a computer program that, when executed by the processor, implements all or part of the flow of the method of the above embodiment.

In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored that, when executed, implements all or part of the flow of the method of the above embodiment.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A cross-scale target tracking method is characterized by comprising the following steps:
step one, building a training data acquisition device, calibrating the position of a perception imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a cross-scale target tracking model;
step two, a cross-scale panoramic sensing system is built, the cross-scale panoramic sensing system is used for collecting images to obtain a plurality of global perception images and a plurality of local perception images, and the transformation matrices of adjacent global perception images are obtained through calculation; the plurality of global perception images are stitched and fused using the transformation matrices of adjacent global perception images to obtain a 360-degree panoramic stitched image I_G;

step three, the plurality of local perception images collected in step two are processed respectively using the cross-scale target tracking model trained in step one to obtain the coordinate position positioning result of the target in the local perception images; the tracking track of the target in the local perception image is calculated based on a Kalman filtering algorithm, and the tracking track of the target in the corresponding global perception image is obtained from the tracking track of the target in the local perception image; the target tracks in the plurality of global perception images are transformed using the transformation matrices of adjacent global perception images to obtain the specific trajectory of the target in the 360-degree panoramic stitched image I_G.
2. The cross-scale target tracking method according to claim 1, wherein the cross-scale panoramic sensing system comprises twelve groups of fixing devices arranged in a regular dodecagon, each group of fixing devices is fixed with one camera array fixing bracket, and each camera array fixing bracket is fixed with one group of camera arrays; the twelve groups of camera arrays respectively and correspondingly cover twelve peripheral areas in the circumferential direction of the regular dodecagon;
each group of camera arrays comprises three perception imaging cameras; the three perception imaging cameras comprise two local perception imaging cameras and one global perception imaging camera, and the global perception imaging camera is fixed between the two local perception imaging cameras;
the imaging areas of the global perception imaging cameras of the two adjacent groups of camera arrays have overlapping parts; for each group of camera array, the field angle of the global perception imaging camera is larger than the sum of the field angles of the two local perception imaging cameras, the vertical field angle of the global perception imaging camera is larger than 2 times of the vertical field angle of the local perception imaging camera, and the angle of the field-of-view optical axis of the local perception imaging camera can be adjusted at will.
3. The cross-scale target tracking method according to claim 2, wherein the training data acquisition device comprises one of the camera array fixing brackets and a group of the camera arrays fixed on the camera array fixing bracket.
4. The cross-scale target tracking method according to claim 3, wherein in the cross-scale panoramic sensing system, the optical axes of the fields of view of all the perception imaging cameras are coplanar with the regular dodecagon, and the reverse extension lines of the optical axes of the fields of view of all the global perception imaging cameras pass through the center of the regular dodecagon; the horizontal field angles of all the global perception imaging cameras are larger than or equal to 45 degrees, and the imaging resolutions of all the global perception imaging cameras and all the local perception imaging cameras are no less than 9 million pixels (9 MP).
5. The cross-scale target tracking method according to claim 4, wherein in the first step, the position of the perception imaging camera is calibrated for the built training data acquisition device, the built training data acquisition device is used for data acquisition, and then the training is carried out to obtain the cross-scale target tracking model specifically comprises the following steps:
position calibration is carried out on two local perception imaging cameras and a global perception imaging camera in a training data acquisition device, and a mapping matrix of the two local perception imaging cameras relative to the global perception imaging camera is obtained by utilizing a characteristic point matching method
Figure FDA0003806808350000021
And
Figure FDA0003806808350000022
acquiring image data at a specific place by using a built training data acquisition device to obtain a local perception image data set and a global perception image data set, and marking a positioning frame on a target acquired in the local perception image data set and the global perception image data set; dividing a local perception image data set with labels into a training set, a test set and a verification set according to a preset proportion;
training on the training set by using an existing target tracking algorithm and the loss function corresponding to the existing target tracking algorithm, and iteratively training the model according to a preset training strategy until the loss function converges, to obtain the cross-scale target tracking model.
6. The cross-scale target tracking method according to claim 5, wherein the feature point matching method is the SURF feature point matching algorithm; the specific sites include pedestrian streets and intersections.
7. The cross-scale target tracking method according to claim 5, wherein the preset proportion is 8:1:1; and the target tracking algorithm is the ByteTrack algorithm.
8. The cross-scale target tracking method according to claim 5, wherein the second step specifically comprises:
constructing a cross-scale panoramic sensing system, synchronously acquiring image data from the 12 groups of camera arrays by using a square-wave synchronous triggering method, and recording the global perception image captured by the i-th camera array at time t as I_Gi^t and the corresponding local perception images as I_Li^t, where i represents the index of a group of camera arrays, i ∈ {1, 2, …, 12};

for the global perception image sequence {I_Gi^t} at time t, separately extracting and matching features between adjacent images I_Gi^t and I_Gj^t in the sequence using an image-feature-based matching algorithm to obtain the transformation matrix H_ij^t of adjacent global perception images, i ∈ {1, 2, …, 11}, j = i + 1;

stitching and fusing the 12 global perception images using the transformation matrices H_ij^t of adjacent global perception images to obtain a 360-degree panoramic stitched image I_G.
9. The cross-scale target tracking method according to claim 8, wherein the third step specifically comprises:
respectively processing the 24 local perception images captured by the 12 groups of camera arrays at time t by using the cross-scale target tracking model trained in step one, to obtain the coordinate position positioning results of the targets in the local perception images;

predicting, based on a Kalman filtering algorithm, the position at time t of each target track stored at time t-1, performing similarity matrix calculation based on an IoU strategy or appearance feature vectors, and then performing matching using the Hungarian algorithm;

for each matched track, combining the matching result and the target positioning result at time t, updating the track at time t with a Kalman filtering algorithm, and then taking the updated track as the target tracking track at time t; putting unmatched tracks into the unmatched track set for matching at the next moment; and adding each unmatched positioning result to the current tracking track set as a new track, updating it according to the Kalman filtering algorithm;
according to the mapping matrices H_L1→G and H_L2→G, mapping the tracking track of the target in the local perception image back to the corresponding global perception image to obtain the tracking track of the target in the global perception image;

for the acquired target tracks of the global perception images, transforming the target tracks of the 12 global perception images using the transformation matrices H_ij^t of adjacent global perception images to obtain the specific track of the target in the 360-degree panoramic stitched image I_G, realizing cross-scale target tracking on a hundred-megapixel-level panoramic image.
10. A cross-scale target tracking device, comprising:
the trans-scale target tracking model training module is used for building a training data acquisition device, calibrating the position of a perception imaging camera of the built training data acquisition device, acquiring data by using the built training data acquisition device, and then training to obtain a trans-scale target tracking model;
the panoramic image fusion module is used for building a trans-scale panoramic sensing system, acquiring images by using the trans-scale panoramic sensing system to obtain a plurality of global sensing images and a plurality of local sensing images, and calculating to obtain a transformation matrix of adjacent images of the global sensing images; splicing and fusing a plurality of global perception images by using the transformation matrix of the adjacent images of the global perception images to obtain a 360-degree panoramic spliced image I G
The cross-scale target tracking module is used for processing a plurality of local perception images collected in the panoramic image fusion module by respectively utilizing a cross-scale target tracking model obtained by training of the cross-scale target tracking model training module to obtain a coordinate position positioning result of the target in the local perception images; calculating a tracking track of the target in the local perception image based on a Kalman filtering algorithm, and obtaining the tracking track of the target in the corresponding global perception image according to the tracking track of the target in the local perception image; transforming target tracks in a plurality of global perception images by using a transformation matrix of adjacent images of the global perception images to obtain a 360-degree panoramic mosaic image I G The specific trajectory of the target.
CN202210999023.2A 2022-08-19 2022-08-19 Cross-scale target tracking method and device Pending CN115330838A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210999023.2A CN115330838A (en) 2022-08-19 2022-08-19 Cross-scale target tracking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210999023.2A CN115330838A (en) 2022-08-19 2022-08-19 Cross-scale target tracking method and device

Publications (1)

Publication Number Publication Date
CN115330838A (en) 2022-11-11

Family

ID=83925796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210999023.2A Pending CN115330838A (en) 2022-08-19 2022-08-19 Cross-scale target tracking method and device

Country Status (1)

Country Link
CN (1) CN115330838A (en)

Similar Documents

Publication Publication Date Title
CN107659774B (en) Video imaging system and video processing method based on multi-scale camera array
CN110211043B (en) Registration method based on grid optimization for panoramic image stitching
CN111750820B (en) Image positioning method and system
CN108921781B (en) Depth-based optical field splicing method
CN106023086B (en) A kind of aerial images and geodata joining method based on ORB characteristic matching
CN109903227B (en) Panoramic image splicing method based on camera geometric position relation
Sinha et al. Pan–tilt–zoom camera calibration and high-resolution mosaic generation
US8755624B2 (en) Image registration device and method thereof
JP5872818B2 (en) Positioning processing device, positioning processing method, and image processing device
CN106529538A (en) Method and device for positioning aircraft
CN112101361A (en) Target detection method, device and equipment for fisheye image and storage medium
CN109118429B (en) Method for rapidly generating intermediate wave infrared-visible light multispectral image
CN106373088A (en) Quick mosaic method for aviation images with high tilt rate and low overlapping rate
CN111242987B (en) Target tracking method and device, electronic equipment and storage medium
CN112348775B (en) Vehicle-mounted looking-around-based pavement pit detection system and method
CN110782498B (en) Rapid universal calibration method for visual sensing network
CN108734657A (en) A kind of image split-joint method with parallax processing capacity
CN113436130B (en) Intelligent sensing system and device for unstructured light field
CN105335977B (en) The localization method of camera system and target object
CN110260857A (en) Calibration method, device and the storage medium of vision map
CN109341720A (en) A kind of remote sensing camera geometric calibration method based on fixed star track
CN111553939A (en) Image registration algorithm of multi-view camera
Aeschliman et al. Tracking vehicles through shadows and occlusions in wide-area aerial video
CN114897676A (en) Unmanned aerial vehicle remote sensing multispectral image splicing method, device and medium
CN115619623A (en) Parallel fisheye camera image splicing method based on moving least square transformation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20240204

Address after: 314001 9F, No.705, Asia Pacific Road, Nanhu District, Jiaxing City, Zhejiang Province

Applicant after: ZHEJIANG YANGTZE DELTA REGION INSTITUTE OF TSINGHUA University

Country or region after: China

Address before: No.152 Huixin Road, Nanhu District, Jiaxing City, Zhejiang Province 314000

Applicant before: ZHEJIANG FUTURE TECHNOLOGY INSTITUTE (JIAXING)

Country or region before: China