CN116489516A - Specific object tracking shooting method and system - Google Patents
Specific object tracking shooting method and system
- Publication number
- CN116489516A (application number CN202310420072.0A)
- Authority
- CN
- China
- Prior art keywords
- image
- tracking
- target recognition
- recognition model
- position information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/695—Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/62—Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/61—Control of cameras or camera modules based on recognised objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention discloses a specific object tracking shooting method and system, belongs to the technical field of artificial intelligence, and aims to solve the technical problem of how to effectively recognize and track objects. The method comprises the following steps: for an object to be tracked, acquiring object images and taking the preprocessed images as sample images; constructing a target recognition model, performing parameter optimization on the model, and calculating the accuracy of the trained model; slicing the acquired video stream into frames, performing target recognition on each frame through the trained target recognition model, and comparing the position information of the object in the current frame with that in the previous frame to obtain the movement direction of the object; and calculating the change in the relative pose of the object from the motion vectors of feature points in two consecutive frames, deriving a camera control strategy from that change, and driving the camera according to the control strategy to track and shoot the object to be tracked.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a specific object tracking shooting method and system.
Background
Tracking shooting of a specific object is currently achieved mainly through computer vision and image processing technologies; common approaches include tracking based on template matching and tracking based on feature point matching.
The template matching approach prepares a template image of the specific object in advance and, while the camera is shooting, matches the template pixel by pixel against the current frame to locate and track the object. It is simple to implement but adapts poorly to illumination changes, object occlusion, and similar conditions.
The feature point matching approach tracks the specific object by extracting its feature points in the current frame and matching them against the feature points in the previous frame. It adapts well to illumination changes and occlusion but is sensitive to changes in the object's shape and appearance.
In the prior art, improving tracking precision and robustness is a key problem, so a more effective image recognition algorithm and tracking controller are needed to achieve more accurate and stable tracking shooting of a specific object.
How to effectively recognize and track objects is therefore a technical problem to be solved.
Disclosure of Invention
In view of the above defects, the technical task of the invention is to provide a specific object tracking shooting method and system that solve the technical problem of how to effectively recognize and track objects.
In a first aspect, the present invention provides a specific object tracking shooting method, including the steps of:
sample processing: for an object to be tracked, acquiring object images shot from various distances and angles, preprocessing the images, and taking the preprocessed images as sample images;
dividing the sample images into training images and test images, and labeling the training images to obtain label information comprising category and position information, wherein the position information is the bounding box coordinates of the object in the sample image; constructing a training set from the training images and their corresponding label information, and a test set from the test images;
model training: constructing a target recognition model, performing parameter optimization on the model based on the training set to obtain a trained target recognition model, and calculating the accuracy of the trained model on the test set;
video stream processing: slicing the acquired video stream to obtain multiple frames of object images, performing target recognition on the object images through the trained target recognition model to obtain category and position information, and comparing the position information of the current frame with that of the previous frame to obtain the movement direction of the object;
object tracking: for consecutive frames, calculating the change in the relative pose of the object from the motion vectors of feature points in two frames, deriving a camera control strategy from that change, and driving the camera based on the control strategy to track and shoot the object to be tracked.
Preferably, the target recognition model is constructed based on the YOLOv5 algorithm;
for the target recognition model, training images are taken as input, target recognition is performed through the model, predicted category and position information are output, and the parameters are optimized against the labeled position information through backpropagation, yielding the trained target recognition model;
for the trained target recognition model, test images are taken as input, predicted category and position information are output, the predicted position information is manually checked, and the accuracy of the trained model is calculated.
Preferably, for the multiple frames of object images, each frame is preprocessed through OpenCV and target recognition is performed through the trained target recognition model to obtain category and position information;
for each object, the position information of the current frame is compared with that of the previous frame to obtain the object's movement direction.
Preferably, for object tracking, the change in the relative pose of the object is calculated by:
for each frame of object image, performing target recognition with the trained target recognition model and extracting feature points of the object image with the KLT algorithm;
for each frame, estimating the position of each current feature point in the next frame by calculating the gray-level differences of pixels in the neighborhood of the feature point;
for each frame, calculating the motion vector of each feature point between the current frame and the next frame from the difference of pixel coordinates, thereby obtaining the change in relative pose;
for object tracking, the change in relative pose is input into an MPC (model predictive control) model, which predicts the camera control strategy for a preset future period; the control strategy is configured into the camera's driver, and the camera operations, including pan-tilt and focusing, are controlled through the strategy to realize tracking shooting of the object to be tracked.
Preferably, for object tracking, the end of tracking is judged as follows:
a tracking error threshold and a maximum tracking time are set, and whether tracking has ended is judged from the change in object motion; if the end condition is met, tracking exits, otherwise object tracking continues.
In a second aspect, the present invention provides a specific object tracking shooting system for tracking and shooting an object to be tracked by performing the specific object tracking shooting method according to any one of the first aspect, the system comprising:
a sample processing module, used for acquiring, for an object to be tracked, object images shot from various distances and angles, preprocessing the images, and taking the preprocessed images as sample images; for dividing the sample images into training images and test images; for labeling the training images to obtain label information comprising category and position information, wherein the position information is the bounding box coordinates of the object in the sample image; and for constructing a training set from the training images and their corresponding label information and a test set from the test images;
a model training module, used for constructing a target recognition model, performing parameter optimization on the model based on the training set to obtain a trained target recognition model, and calculating the accuracy of the trained model on the test set;
a video stream processing module, used for slicing the acquired video stream to obtain multiple frames of object images, performing target recognition on the object images through the trained target recognition model to obtain category and position information, and comparing the position information of the current frame with that of the previous frame to obtain the movement direction of the object;
and an object tracking module, used for calculating the change in the relative pose of the object from the motion vectors of feature points in two frames, deriving a camera control strategy from that change, and driving the camera based on the control strategy to track and shoot the object to be tracked.
Preferably, the model training module is configured to:
construct the target recognition model based on the YOLOv5 algorithm;
for the target recognition model, take training images as input, perform target recognition through the model, output predicted category and position information, and optimize the parameters against the labeled position information through backpropagation, yielding the trained target recognition model;
and, for the trained target recognition model, take test images as input, output predicted category and position information, manually check the predicted position information, and calculate the accuracy of the trained model.
Preferably, for the multiple frames of object images, the video stream processing module is configured to preprocess each frame through OpenCV and perform target recognition through the trained target recognition model to obtain category and position information;
and, for each object, to compare the position information of the current frame with that of the previous frame to obtain the object's movement direction.
Preferably, for object tracking, the object tracking module is configured to calculate the change in the relative pose of the object by:
for each frame of object image, performing target recognition with the trained target recognition model and extracting feature points of the object image with the KLT algorithm;
for each frame, estimating the position of each current feature point in the next frame by calculating the gray-level differences of pixels in the neighborhood of the feature point;
for each frame, calculating the motion vector of each feature point between the current frame and the next frame from the difference of pixel coordinates, thereby obtaining the change in relative pose;
the object tracking module is further configured to input the change in relative pose into an MPC (model predictive control) model, predict the camera control strategy for a preset future period through the MPC model, configure the control strategy into the camera's driver, and control the camera operations, including pan-tilt and focusing, through the strategy to realize tracking shooting of the object to be tracked.
Preferably, the object tracking module is configured to judge the end of tracking as follows:
a tracking error threshold and a maximum tracking time are set, and whether tracking has ended is judged from the change in object motion; if the end condition is met, tracking exits, otherwise object tracking continues.
The specific object tracking shooting method and system have the following advantages:
1. object recognition is performed through the constructed target recognition model, so the method adapts to different conditions such as illumination changes and object occlusion, improving tracking robustness;
2. an adaptive control algorithm is adopted for tracking, and the tracking parameters are adjusted in real time according to the tracking error and changes in object motion, ensuring stable and accurate tracking;
3. the method adapts to the shapes, sizes, colors, and other characteristics of different objects, and suits various tracking scenarios such as sports photography and intelligent surveillance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
The invention is further described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a specific object tracking shooting method according to embodiment 1.
Detailed Description
The invention will be further described with reference to the accompanying drawings and specific examples, so that those skilled in the art can better understand the invention and implement it, but the examples are not meant to limit the invention, and the technical features of the embodiments of the invention and the examples can be combined with each other without conflict.
The embodiment of the invention provides a specific object tracking shooting method and system, which are used for solving the technical problem of how to effectively recognize and track objects.
Example 1:
the invention discloses a specific object tracking shooting method, which comprises the following steps:
S100, sample processing: for an object to be tracked, acquiring object images shot from various distances and angles, preprocessing the images, and taking the preprocessed images as sample images;
dividing the sample images into training images and test images, and labeling the training images to obtain label information comprising category and position information, wherein the position information is the bounding box coordinates of the object in the sample image; constructing a training set from the training images and their corresponding label information, and a test set from the test images;
S200, model training: constructing a target recognition model, performing parameter optimization on the model based on the training set to obtain a trained target recognition model, and calculating the accuracy of the trained model on the test set;
S300, video stream processing: slicing the acquired video stream to obtain multiple frames of object images, performing target recognition on the object images through the trained target recognition model to obtain category and position information, and comparing the position information of the current frame with that of the previous frame to obtain the movement direction of the object;
S400, object tracking: for consecutive frames, calculating the change in the relative pose of the object from the motion vectors of feature points in two frames, deriving a camera control strategy from that change, and driving the camera based on the control strategy to track and shoot the object to be tracked.
In this embodiment, step S100 collects a large number of images of the object to be tracked, including images taken from various distances and angles, by field photographing or online downloading, and preprocesses the acquired images (for example, denoising) through OpenCV. The preprocessed images are divided into training images and test images at a ratio of 8:2. The training images are labeled through labelimg to obtain label information comprising category and position information, where the position information is the bounding box coordinates of the object in the sample image. A training set is constructed from the training images and their label information and used for model training; a test set is constructed from the unlabeled test images and used for evaluating model performance.
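For illustration only, the following is a minimal Python/OpenCV sketch of this sample-processing step. The folder names (`raw_images/`, `dataset/`) and the denoising parameters are assumptions, not specified by the patent:

```python
import glob
import os
import random

import cv2

# Collect raw images of the object to be tracked (folder name assumed).
paths = sorted(glob.glob("raw_images/*.jpg"))
random.seed(0)
random.shuffle(paths)

# 8:2 split into training and test images, as in the embodiment.
split = int(0.8 * len(paths))
subsets = {"train": paths[:split], "test": paths[split:]}

for subset, subset_paths in subsets.items():
    out_dir = os.path.join("dataset", subset)
    os.makedirs(out_dir, exist_ok=True)
    for p in subset_paths:
        img = cv2.imread(p)
        if img is None:
            continue
        # Denoising preprocessing via OpenCV, as the embodiment suggests.
        img = cv2.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 21)
        cv2.imwrite(os.path.join(out_dir, os.path.basename(p)), img)

print({k: len(v) for k, v in subsets.items()})
```

The training images produced here would then be annotated in labelimg to obtain the category and bounding box labels.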
Step S200 constructs the target recognition model based on the YOLOv5 algorithm.
For the target recognition model, training images are taken as input, target recognition is performed through the model, predicted category and position information are output, and the parameters are optimized against the labeled position information through backpropagation, yielding the trained target recognition model.
For the trained target recognition model, test images are taken as input, predicted category and position information are output, the predicted position information is manually checked, and the accuracy of the trained model is calculated.
In this embodiment, training parameters including the learning rate, batch size, and number of epochs are set by modifying the YOLOv5 configuration file; training is started with the prepared training set and the configured parameters; after training, model performance is evaluated on the test set, and the parameters are adjusted according to the result to optimize the model.
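As a sketch only, such a training run might be launched as follows, assuming a local clone of the ultralytics/yolov5 repository next to the script and a single-class dataset; the dataset layout, class name, and hyperparameter values are assumptions:

```python
import subprocess

# Hypothetical single-class dataset config for YOLOv5 (paths assumed).
data_yaml = """\
train: ../dataset/train
val: ../dataset/test
nc: 1
names: ['tracked_object']
"""
with open("tracked_object.yaml", "w") as f:
    f.write(data_yaml)

# Launch training from the cloned ultralytics/yolov5 repository; image
# size, batch size, and epochs stand in for the parameters the embodiment
# sets in the configuration file.
subprocess.run(
    ["python", "train.py",
     "--img", "640",
     "--batch", "16",
     "--epochs", "100",
     "--data", "../tracked_object.yaml",
     "--weights", "yolov5s.pt"],
    cwd="yolov5",  # assumption: repository cloned alongside this script
    check=True,
)
```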
Step S300 processes the video stream: each frame of object image is preprocessed through OpenCV, and target recognition is performed through the trained target recognition model to obtain category and position information; for each object, the position information of the current frame is compared with that of the previous frame to obtain the object's movement direction.
As a specific implementation, this step slices the video stream file shot by the camera frame by frame. Each sliced frame is preprocessed with OpenCV, target detection is performed with the model obtained in step S200 to recognize and locate the object to be tracked, and the position of the object's bounding box in the image is returned. The current recognition result is compared with that of the previous frame to estimate the direction in which the object is moving.
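A minimal sketch of this frame-by-frame loop follows, assuming the trained weights were saved as `best.pt` and are loaded through torch.hub's yolov5 "custom" entry point; the file names and the single-object simplification are illustrative:

```python
import cv2
import torch

# Load the trained weights via torch.hub (the 'custom' entry point is
# provided by the ultralytics/yolov5 repo; the weights path is assumed).
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")

cap = cv2.VideoCapture("tracked.mp4")  # assumption: recorded video stream
prev_center = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Convert BGR to RGB for the model; take the highest-confidence box.
    det = model(frame[:, :, ::-1]).xyxy[0]
    if len(det) == 0:
        continue
    x1, y1, x2, y2, conf, _ = det[0].tolist()
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
    if prev_center is not None:
        # Movement direction from the displacement of the box center
        # between the previous frame and the current frame.
        dx, dy = center[0] - prev_center[0], center[1] - prev_center[1]
        print(f"direction: dx={dx:+.1f}px, dy={dy:+.1f}px")
    prev_center = center

cap.release()
```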
Step S400 calculates the change in the relative pose of the object for object tracking as follows (a minimal sketch of this computation is given after the list):
(1) for each frame of object image, target recognition is performed with the trained target recognition model, and feature points of the object image are extracted with the KLT algorithm;
(2) for each frame, the position of each current feature point in the next frame is estimated by calculating the gray-level differences of pixels in the neighborhood of the feature point;
(3) for each frame, the motion vector of each feature point between the current frame and the next frame is calculated from the difference of pixel coordinates, thereby obtaining the change in relative pose.
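As a sketch of steps (1) to (3), OpenCV's pyramidal Lucas-Kanade optical flow can stand in for the KLT step; the corner and window parameters are assumptions, and the mean flow vector is used here as a simple proxy for the relative pose change:

```python
import cv2
import numpy as np

def relative_pose_change(prev_gray, cur_gray, bbox):
    """Estimate the mean feature-point motion vector inside the detected
    bounding box between two consecutive grayscale frames."""
    x1, y1, x2, y2 = (int(v) for v in bbox)
    mask = np.zeros_like(prev_gray)
    mask[y1:y2, x1:x2] = 255
    # Feature points on the object in the previous frame.
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                 qualityLevel=0.01, minDistance=7, mask=mask)
    if p0 is None:
        return None
    # KLT: track each point into the current frame by minimizing the
    # gray-level difference in a window (neighborhood) around it.
    p1, st, err = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, p0, None,
                                           winSize=(21, 21), maxLevel=3)
    good = st.reshape(-1) == 1
    if not good.any():
        return None
    # Motion vectors = difference of pixel coordinates; their mean serves
    # as a simple measure of the object's relative pose change.
    vectors = (p1 - p0).reshape(-1, 2)[good]
    return vectors.mean(axis=0)  # (dx, dy) in pixels
```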
For object tracking, the change in relative pose is input into an MPC (model predictive control) model, which predicts the camera control strategy for a preset future period; the control strategy is configured into the camera's driver, and the camera operations, including pan-tilt and focusing, are controlled through the strategy to realize tracking shooting of the object to be tracked.
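The patent does not specify the internal MPC formulation. Purely as an illustration, the following toy predictive controller searches candidate pan rates over a short horizon under an assumed constant-velocity target model; all constants, including the pixels-per-degree scale, are assumptions:

```python
import numpy as np

def mpc_pan_rate(offset_px, velocity_px, horizon=10, dt=0.033,
                 px_per_deg=20.0, candidates=np.linspace(-30, 30, 61)):
    """Pick the pan rate (deg/s) minimizing the predicted tracking error
    over a short horizon, assuming the target keeps its current pixel
    velocity (a deliberately simple prediction model)."""
    best_rate, best_cost = 0.0, float("inf")
    for rate in candidates:
        err, cost = offset_px, 0.0
        for _ in range(horizon):
            # Target drifts by its velocity; the camera pan cancels part of it.
            err = err + velocity_px * dt - rate * px_per_deg * dt
            cost += err ** 2
        cost += 0.01 * rate ** 2  # small control-effort penalty
        if cost < best_cost:
            best_rate, best_cost = rate, cost
    return best_rate
```

Each control cycle, the returned rate would be written to the pan-tilt driver; focusing could be handled analogously, for example from the change in bounding box size.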
For object tracking, the end of tracking is judged as follows: a tracking error threshold and a maximum tracking time are set, and whether tracking has ended is judged from the change in object motion; if the end condition is met, tracking exits, otherwise object tracking continues.
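A minimal sketch of this end-of-tracking check; the threshold and time-limit values are assumptions:

```python
import time

TRACKING_ERROR_THRESHOLD = 50.0   # pixels; assumed value
MAX_TRACKING_TIME = 60.0          # seconds; assumed value

def tracking_finished(error_px, start_time):
    """End tracking when the error exceeds the threshold or the maximum
    tracking time elapses, as the embodiment describes."""
    if error_px > TRACKING_ERROR_THRESHOLD:
        return True
    return (time.time() - start_time) > MAX_TRACKING_TIME
```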
As a specific implementation, this step includes the following operations:
(1) tracking initialization: once the object to be tracked is detected in a frame shot by the camera, feature points of that frame are extracted with the KLT (Kanade-Lucas-Tomasi) algorithm; the position of each current feature point in the next frame is estimated by calculating the gray-level differences of pixels in its neighborhood, and the motion vector of each feature point across the two frames is calculated from the difference of pixel coordinates. From this, the change in relative pose is obtained, and in turn the movement the camera must make;
(2) adaptive tracking: in the adaptive tracking stage, the output of the preceding step is input into the MPC (model predictive control) model, the camera control strategy for a future period is calculated, and the strategy is passed to the execution module;
(3) tracking control: according to the output of the tracking controller, commands are issued through the camera's driver to control the pan-tilt, focusing, and other operations, realizing tracking shooting of the object to be tracked;
(4) tracking end judgment: whether tracking has ended is judged according to the tracking error and the change in object motion; if ended, tracking exits, otherwise the adaptive tracking stage is returned to and tracking continues. The judgment uses a preset tracking error threshold and a maximum tracking time;
(5) tracking result output: the tracking results may be output to a display, a storage device, a network, and the like.
In actual implementation, the method can be adjusted and optimized according to actual requirements. For example, suitable target detection, tracking, and control algorithms can be selected according to the characteristics of the object to be tracked, and corresponding adjustments can be made according to the camera's hardware parameters, so as to achieve a more accurate and stable tracking shooting effect.
Example 2:
the invention further discloses a specific object tracking shooting system, which comprises a sample processing module, a model training module, a video stream processing module, and an object tracking module; the system can perform the method disclosed in Embodiment 1 to realize specific object tracking shooting.
For an object to be tracked, the sample processing module is used for acquiring object images shot from various distances and angles, preprocessing the images, and taking the preprocessed images as sample images; for dividing the sample images into training images and test images; for labeling the training images to obtain label information comprising category and position information, wherein the position information is the bounding box coordinates of the object in the sample image; and for constructing a training set from the training images and their corresponding label information and a test set from the test images.
The sample processing module of this embodiment collects a large number of images of the object to be tracked, including images taken from various distances and angles, by field photographing or online downloading, and preprocesses the acquired images (for example, denoising) through OpenCV. The preprocessed images are divided into training images and test images at a ratio of 8:2. The training images are labeled through labelimg to obtain label information comprising category and position information, where the position information is the bounding box coordinates of the object in the sample image. A training set is constructed from the training images and their label information and used for model training; a test set is constructed from the unlabeled test images and used for evaluating model performance.
The model training module is used for constructing a target recognition model, performing parameter optimization on the model based on the training set to obtain a trained target recognition model, and calculating the accuracy of the trained model on the test set.
The model training module of this embodiment performs the following:
(1) constructing the target recognition model based on the YOLOv5 algorithm;
(2) for the target recognition model, taking training images as input, performing target recognition through the model, outputting predicted category and position information, and optimizing the parameters against the labeled position information through backpropagation, yielding the trained target recognition model;
(3) for the trained target recognition model, taking test images as input, outputting predicted category and position information, manually checking the predicted position information, and calculating the accuracy of the trained model.
In this embodiment, training parameters including the learning rate, batch size, and number of epochs are set by modifying the YOLOv5 configuration file; training is started with the prepared training set and the configured parameters; after training, model performance is evaluated on the test set, and the parameters are adjusted according to the result to optimize the model.
The video stream processing module is used for slicing the acquired video stream to obtain multiple frames of object images, performing target recognition on the object images through the trained target recognition model to obtain category and position information, and comparing the position information of the current frame with that of the previous frame to obtain the movement direction of the object.
The module preprocesses each frame of object image through OpenCV and performs target recognition through the trained target recognition model to obtain category and position information; for each object, the position information of the current frame is compared with that of the previous frame to obtain the object's movement direction.
As a specific implementation, the module slices the video stream file shot by the camera frame by frame, preprocesses each sliced frame with OpenCV, performs target detection with the trained model to recognize and locate the object to be tracked, and returns the position of the object's bounding box in the image. The current recognition result is compared with that of the previous frame to estimate the direction in which the object is moving.
For consecutive frames, the object tracking module is used for calculating the change in the relative pose of the object from the motion vectors of feature points in two frames, deriving a camera control strategy from that change, and driving the camera based on the control strategy to track and shoot the object to be tracked.
The object tracking module of this embodiment calculates the change in the relative pose of the object through the following steps:
(1) for each frame of object image, performing target recognition with the trained target recognition model and extracting feature points of the object image with the KLT algorithm;
(2) for each frame, estimating the position of each current feature point in the next frame by calculating the gray-level differences of pixels in the neighborhood of the feature point;
(3) for each frame, calculating the motion vector of each feature point between the current frame and the next frame from the difference of pixel coordinates, thereby obtaining the change in relative pose.
For object tracking, the change in relative pose is input into an MPC (model predictive control) model, which predicts the camera control strategy for a preset future period; the control strategy is configured into the camera's driver, and the camera operations, including pan-tilt and focusing, are controlled through the strategy to realize tracking shooting of the object to be tracked.
For object tracking, the end of tracking is judged as follows: a tracking error threshold and a maximum tracking time are set, and whether tracking has ended is judged from the change in object motion; if the end condition is met, tracking exits, otherwise object tracking continues.
As a specific implementation, the module performs the following operations:
(1) tracking initialization: once the object to be tracked is detected in a frame shot by the camera, feature points of that frame are extracted with the KLT (Kanade-Lucas-Tomasi) algorithm; the position of each current feature point in the next frame is estimated by calculating the gray-level differences of pixels in its neighborhood, and the motion vector of each feature point across the two frames is calculated from the difference of pixel coordinates. From this, the change in relative pose is obtained, and in turn the movement the camera must make;
(2) adaptive tracking: in the adaptive tracking stage, the output of the preceding step is input into the MPC (model predictive control) model, the camera control strategy for a future period is calculated, and the strategy is passed to the execution module;
(3) tracking control: according to the output of the tracking controller, commands are issued through the camera's driver to control the pan-tilt, focusing, and other operations, realizing tracking shooting of the object to be tracked;
(4) tracking end judgment: whether tracking has ended is judged according to the tracking error and the change in object motion; if ended, tracking exits, otherwise the adaptive tracking stage is returned to and tracking continues. The judgment uses a preset tracking error threshold and a maximum tracking time;
(5) tracking result output: the tracking results may be output to a display, a storage device, a network, and the like.
While the invention has been illustrated and described in detail in the drawings and in the preferred embodiments, the invention is not limited to the disclosed embodiments, and it will be appreciated by those skilled in the art that the technical features of the various embodiments described above may be combined to produce further embodiments of the invention, which are also within the scope of the invention.
Claims (10)
1. A specific object tracking shooting method, characterized by comprising the following steps:
sample processing: for an object to be tracked, acquiring object images shot from various distances and angles, preprocessing the images, and taking the preprocessed images as sample images;
dividing the sample images into training images and test images, and labeling the training images to obtain label information comprising category and position information, wherein the position information is the bounding box coordinates of the object in the sample image; constructing a training set from the training images and their corresponding label information, and a test set from the test images;
model training: constructing a target recognition model, performing parameter optimization on the model based on the training set to obtain a trained target recognition model, and calculating the accuracy of the trained model on the test set;
video stream processing: slicing the acquired video stream to obtain multiple frames of object images, performing target recognition on the object images through the trained target recognition model to obtain category and position information, and comparing the position information of the current frame with that of the previous frame to obtain the movement direction of the object;
object tracking: for consecutive frames, calculating the change in the relative pose of the object from the motion vectors of feature points in two frames, deriving a camera control strategy from that change, and driving the camera based on the control strategy to track and shoot the object to be tracked.
2. The specific object tracking shooting method according to claim 1, wherein the target recognition model is constructed based on the YOLOv5 algorithm;
for the target recognition model, training images are taken as input, target recognition is performed through the model, predicted category and position information are output, and the parameters are optimized against the labeled position information through backpropagation, yielding the trained target recognition model;
and, for the trained target recognition model, test images are taken as input, predicted category and position information are output, the predicted position information is manually checked, and the accuracy of the trained model is calculated.
3. The specific object tracking shooting method according to claim 1, wherein,
for the multiple frames of object images, each frame is preprocessed through OpenCV and target recognition is performed through the trained target recognition model to obtain category and position information;
and, for each object, the position information of the current frame is compared with that of the previous frame to obtain the object's movement direction.
4. The specific object tracking shooting method according to claim 1, wherein, for object tracking, the change in the relative pose of the object is calculated by:
for each frame of object image, performing target recognition with the trained target recognition model and extracting feature points of the object image with the KLT algorithm;
for each frame, estimating the position of each current feature point in the next frame by calculating the gray-level differences of pixels in the neighborhood of the feature point;
for each frame, calculating the motion vector of each feature point between the current frame and the next frame from the difference of pixel coordinates, thereby obtaining the change in relative pose;
and, for object tracking, the change in relative pose is input into an MPC (model predictive control) model, which predicts the camera control strategy for a preset future period; the control strategy is configured into the camera's driver, and the camera operations, including pan-tilt and focusing, are controlled through the strategy to realize tracking shooting of the object to be tracked.
5. The specific object tracking shooting method according to claim 1, wherein, for object tracking, the end of tracking is judged as follows:
a tracking error threshold and a maximum tracking time are set, and whether tracking has ended is judged from the change in object motion; if the end condition is met, tracking exits, otherwise object tracking continues.
6. A specific object tracking shooting system for tracking and shooting an object to be tracked by performing the specific object tracking shooting method according to any one of claims 1 to 5, the system comprising:
a sample processing module, used for acquiring, for an object to be tracked, object images shot from various distances and angles, preprocessing the images, and taking the preprocessed images as sample images; for dividing the sample images into training images and test images; for labeling the training images to obtain label information comprising category and position information, wherein the position information is the bounding box coordinates of the object in the sample image; and for constructing a training set from the training images and their corresponding label information and a test set from the test images;
a model training module, used for constructing a target recognition model, performing parameter optimization on the model based on the training set to obtain a trained target recognition model, and calculating the accuracy of the trained model on the test set;
a video stream processing module, used for slicing the acquired video stream to obtain multiple frames of object images, performing target recognition on the object images through the trained target recognition model to obtain category and position information, and comparing the position information of the current frame with that of the previous frame to obtain the movement direction of the object;
and an object tracking module, used for calculating the change in the relative pose of the object from the motion vectors of feature points in two frames, deriving a camera control strategy from that change, and driving the camera based on the control strategy to track and shoot the object to be tracked.
7. The specific object tracking shooting system according to claim 6, wherein the model training module is configured to:
construct the target recognition model based on the YOLOv5 algorithm;
for the target recognition model, take training images as input, perform target recognition through the model, output predicted category and position information, and optimize the parameters against the labeled position information through backpropagation, yielding the trained target recognition model;
and, for the trained target recognition model, take test images as input, output predicted category and position information, manually check the predicted position information, and calculate the accuracy of the trained model.
8. The specific object tracking shooting system according to claim 6, wherein, for the multiple frames of object images, the video stream processing module is configured to preprocess each frame through OpenCV and perform target recognition through the trained target recognition model to obtain category and position information;
and, for each object, to compare the position information of the current frame with that of the previous frame to obtain the object's movement direction.
9. The specific object tracking shooting system according to claim 6, wherein, for object tracking, the object tracking module is configured to calculate the change in the relative pose of the object by:
for each frame of object image, performing target recognition with the trained target recognition model and extracting feature points of the object image with the KLT algorithm;
for each frame, estimating the position of each current feature point in the next frame by calculating the gray-level differences of pixels in the neighborhood of the feature point;
for each frame, calculating the motion vector of each feature point between the current frame and the next frame from the difference of pixel coordinates, thereby obtaining the change in relative pose;
and the object tracking module is further configured to input the change in relative pose into an MPC (model predictive control) model, predict the camera control strategy for a preset future period through the MPC model, configure the control strategy into the camera's driver, and control the camera operations, including pan-tilt and focusing, through the strategy to realize tracking shooting of the object to be tracked.
10. The specific object tracking shooting system according to claim 6, wherein the object tracking module is configured to judge the end of tracking as follows:
a tracking error threshold and a maximum tracking time are set, and whether tracking has ended is judged from the change in object motion; if the end condition is met, tracking exits, otherwise object tracking continues.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310420072.0A CN116489516A (en) | 2023-04-14 | 2023-04-14 | Specific object tracking shooting method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310420072.0A CN116489516A (en) | 2023-04-14 | 2023-04-14 | Specific object tracking shooting method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116489516A true CN116489516A (en) | 2023-07-25 |
Family
ID=87220744
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310420072.0A Pending CN116489516A (en) | 2023-04-14 | 2023-04-14 | Specific object tracking shooting method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116489516A (en) |
- 2023-04-14: application CN202310420072.0A filed in China; published as CN116489516A (status: pending)
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117197695A (en) * | 2023-09-14 | 2023-12-08 | 思翼科技(深圳)有限公司 | Unmanned aerial vehicle holder camera target identification tracking method and system based on artificial intelligence |
CN117197695B (en) * | 2023-09-14 | 2024-05-24 | 思翼科技(深圳)有限公司 | Unmanned aerial vehicle holder camera target identification tracking method and system based on artificial intelligence |
CN118279734A (en) * | 2024-04-09 | 2024-07-02 | 青岛道万科技有限公司 | Underwater particulate matter and biological image in-situ acquisition method, medium and system |
CN118590612A (en) * | 2024-05-23 | 2024-09-03 | 北京艾普智城网络科技有限公司 | Object movement monitoring system, method and equipment based on AI visual recognition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116489516A (en) | Specific object tracking shooting method and system | |
CN111862296B (en) | Three-dimensional reconstruction method, three-dimensional reconstruction device, three-dimensional reconstruction system, model training method and storage medium | |
US8913792B2 (en) | Method of determining reference features for use in an optical object initialization tracking process and object initialization tracking method | |
CN113038018B (en) | Method and device for assisting user in shooting vehicle video | |
CN108229475B (en) | Vehicle tracking method, system, computer device and readable storage medium | |
KR20120138627A (en) | A face tracking method and device | |
JP2009087090A (en) | Object tracking device and object tracing method | |
CN109934847A (en) | The method and apparatus of weak texture three-dimension object Attitude estimation | |
CN112465871B (en) | Evaluation method and system for accuracy of visual tracking algorithm | |
CN111931654A (en) | Intelligent monitoring method, system and device for personnel tracking | |
CN113379789B (en) | Moving target tracking method in complex environment | |
WO2003098922A1 (en) | An imaging system and method for tracking the motion of an object | |
CN110827321A (en) | Multi-camera cooperative active target tracking method based on three-dimensional information | |
CN106791353B (en) | The methods, devices and systems of auto-focusing | |
CN110472608A (en) | Image recognition tracking processing method and system | |
CN113409353B (en) | Motion prospect detection method, motion prospect detection device, terminal equipment and storage medium | |
Han et al. | Camera attributes control for visual odometry with motion blur awareness | |
CN101567088B (en) | Method and device for detecting moving object | |
CN114945071A (en) | Photographing control method, device and system for built-in camera of recycling machine | |
CN114358131A (en) | Digital photo frame intelligent photo optimization processing system | |
CN113888628A (en) | Accurate positioning method and accurate positioning system after target tracking loss | |
CN109919999B (en) | Target position detection method and device | |
AU2009230796A1 (en) | Location-based brightness transfer function | |
CN113259580A (en) | Method, device, equipment and medium for self-correcting shooting parameters of photoelectric monitoring system | |
CN113992845B (en) | Image shooting control method and device and computing equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |