CN117036410A - Multi-lens tracking method, system and device - Google Patents

Multi-lens tracking method, system and device

Info

Publication number
CN117036410A
Authority
CN
China
Prior art keywords
feature vector
target object
information
vector
monitoring video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311058392.2A
Other languages
Chinese (zh)
Inventor
胡鹏杰
乔辉
赵祯
刘文
关俊涛
游冰
杨建光
李刚
贺提胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sinomach Internet Research Institute Henan Co ltd
Original Assignee
Sinomach Internet Research Institute Henan Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sinomach Internet Research Institute Henan Co ltd filed Critical Sinomach Internet Research Institute Henan Co ltd
Priority to CN202311058392.2A priority Critical patent/CN117036410A/en
Publication of CN117036410A publication Critical patent/CN117036410A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/292 Multi-camera tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30232 Surveillance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a multi-lens tracking method, system and device, relates to the field of monitoring, and solves the problem that existing cross-camera multi-target tracking technology operates only in an offline mode. According to the application, the ID information and feature vector of a target object are transmitted between different monitoring videos and are matched and optimized, so that continuous tracking of the target object is realized: when the target disappears, the next target monitoring video is determined and tracking continues.

Description

Multi-lens tracking method, system and device
Technical Field
The present application relates to the field of monitoring, and in particular, to a method, system, and apparatus for multi-lens tracking.
Background
Currently, target tracking systems can be divided into two sub-categories: single-target tracking and multi-target tracking. Single-target tracking mainly deals with tracking a specific target or tracking in simple scenes, while multi-target tracking focuses on tracking multiple targets in common scenes. In terms of real-time tracking technology, single-lens multi-target real-time tracking is mature, but cross-camera multi-target real-time tracking is complex because multiple lenses must be used for target analysis and feature comparison, and no mature technology for it exists at present. In addition, only offline cross-camera multi-target tracking technology currently exists, so target position information cannot be obtained in time as tracking feedback.
Disclosure of Invention
The application aims to provide a multi-lens tracking method, system and device that transmit the ID information and feature vector of a target object between different monitoring videos and perform matching and optimization, thereby realizing continuous tracking of the target object: when the target disappears, the next target monitoring video is determined and tracking continues.
In order to solve the above technical problems, the present application provides a multi-lens tracking method, including:
s1: when a target object disappears from a current monitoring video corresponding to a current monitoring device, acquiring first ID information and a first feature vector of the target object sent by the current monitoring device, and determining a next monitoring video acquired by a next monitoring device according to the disappearing position of the target object in the current monitoring video, wherein the occurrence probability of the target object in the next monitoring video is larger than a preset probability;
s2: transmitting the first ID information and the first feature vector to the next monitoring device, and triggering the next monitoring device to extract a second feature vector of each object in the next monitoring video;
s3: matching the first feature vector with each second feature vector, and judging whether the second feature vector is successfully matched with the first feature vector or not;
s4: if so, determining the ID information of the object corresponding to the second feature vector which is successfully matched as the first ID information, obtaining an updated feature vector of the target object according to the first feature vector and the second feature vector, and re-entering S1 by taking the updated feature vector as the first feature vector;
s5: if not, ending the tracking of the target object.
In one embodiment, before step S1, the method further includes:
extracting the characteristics of the target object in each frame of the current monitoring video to obtain a characteristic vector matrix;
and when the target object disappears from the current monitoring video, carrying out averaging processing on the feature vector to obtain the first feature vector.
In one embodiment, extracting features of the target object in each frame of the current surveillance video to obtain a feature vector matrix includes:
determining an image frame of each frame of the current monitoring video where the target object is located;
removing background areas except the target object in the image frame to obtain a pure target object image;
and extracting the characteristics of the pure target object image to obtain the characteristic vector matrix.
In one embodiment, before step S1, the method further includes:
acquiring identification information corresponding to the target object sent by the detection device;
and determining first ID information corresponding to the identification information according to the identification information and the corresponding relation of the identification-ID.
In one embodiment, after extracting the identification information corresponding to the target object, the method further includes:
recording a time stamp when the identification information is detected by the detection device;
and determining the current monitoring device and the current monitoring video corresponding to the target object according to the time stamp and the installation position of the detection device.
In one embodiment, when the ID information corresponding to the identification information does not exist in the correspondence relationship of the identification-ID, the method further includes:
and assigning an ID to the target object to obtain first ID information corresponding to the target object.
In one embodiment, when it is determined that the second feature vector successfully matched with the first feature vector does not exist, the method further includes:
and assigning second ID information for the object corresponding to the second feature vector which fails to be matched.
In one embodiment, step S3 includes:
similarity comparison is carried out on the first characteristic vector and the second characteristic vector in a cosine distance mode;
if the similarity is larger than a preset value, the second feature vector with the similarity larger than the preset value is successfully matched with the first feature vector, an object corresponding to the second feature vector with the similarity larger than the preset value is the target object, and the ID information of the object corresponding to the second feature vector is determined to be the first ID information.
In order to solve the above technical problem, the present application further provides a multi-lens tracking system, including:
the video determining unit is used for acquiring first ID information and a first feature vector of a target object in the current monitoring video sent by the current monitoring device when the target object disappears from the current monitoring video corresponding to the current monitoring device, and determining a next monitoring video acquired by a next monitoring device according to the disappearing position of the target object in the current monitoring video, wherein the occurrence probability of the target object in the next monitoring video is larger than a preset probability;
the vector processing unit is used for transmitting the first ID information and the first feature vector to the next monitoring device and triggering the next monitoring device to extract a second feature vector of each object in the next monitoring video;
the vector matching unit is used for matching the first characteristic vector with the second characteristic vector and judging whether the second characteristic vector is successfully matched with the first characteristic vector or not;
a vector updating unit, configured to determine, when there is successful matching between the second feature vector and the first feature vector, ID information of an object corresponding to the second feature vector that is successfully matched as the first ID information, obtain, according to the first feature vector and the second feature vector, an updated feature vector of the target object, and use the updated feature vector as the first feature vector;
and the ending unit is used for ending the tracking of the target object when the second characteristic vector is not successfully matched with the first characteristic vector.
In order to solve the above technical problems, the present application further provides a multi-lens tracking device, including:
a memory for storing a computer program;
a processor for implementing the steps of the multi-lens tracking method as described above when executing the computer program.
The application provides a multi-lens tracking method, system and device, relates to the field of monitoring, and solves the problem that existing cross-camera multi-target tracking technology operates only in an offline mode. According to the application, the ID information and feature vector of a target object are transmitted between different monitoring videos and are matched and optimized, so that continuous tracking of the target object is realized: when the target disappears, the next target monitoring video is determined and tracking continues.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for describing the embodiments and the prior art are briefly introduced below. Apparently, the drawings in the following description show only some embodiments of the present application, and a person skilled in the art may derive other drawings from them without inventive effort.
FIG. 1 is a flowchart of a multi-lens tracking method according to the present application;
fig. 2 is a schematic diagram of a surveillance video provided by the present application;
FIG. 3 is a flowchart of tracking a target object according to the present application;
FIG. 4 is a flowchart of another object tracking method according to the present application;
FIG. 5 is a schematic diagram of data matching according to the present application;
FIG. 6 is a block diagram of a monitoring system according to the present application;
FIG. 7 is a schematic diagram of feature extraction provided by the present application;
FIG. 8 is a block diagram of a multi-lens tracking system according to the present application;
fig. 9 is a block diagram of a multi-lens tracking device according to the present application.
Detailed Description
The application provides a multi-lens tracking method, system and device that transmit the ID information and feature vector of a target object between different monitoring videos and perform matching and optimization, thereby realizing continuous tracking of the target object: when the target disappears, the next target monitoring video is determined and tracking continues.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In order to solve the above technical problems, the present application provides a multi-lens tracking method, as shown in fig. 1, including:
s1: when a target object disappears from a current monitoring video corresponding to a current monitoring device, acquiring first ID information and a first feature vector of the target object sent by the current monitoring device, and determining a next monitoring video acquired by a next monitoring device according to the disappearing position of the target object in the current monitoring video, wherein the occurrence probability of the target object in the next monitoring video is larger than a preset probability;
specifically, in this step, when the target object disappears from the current surveillance video, the method processes by:
firstly, acquiring first ID information and a first feature vector of a target object sent by a current monitoring device: the current monitoring device sends the first ID information and the first feature vector of the target object to the system. The first ID information refers to a unique identifier of the target object in the system for identifying the identity of the target object. The first feature vector refers to a vector extracted from the current monitoring video and used for describing the features of the target object, and the first feature vector contains some key feature information of the target object.
Secondly, determining a next monitoring video according to the vanishing position of the target object in the current monitoring video: by analyzing the vanishing position (or vanishing direction) of the target object in the current monitoring video, the next video to be monitored can be determined (wherein the occurrence probability of the target object in the next monitoring video is greater than the preset probability). The purpose of this step is to ensure that the system is able to track the target object continuously even if the target object disappears in the current surveillance video.
According to the method, the first ID information and the first feature vector of the target object can be obtained according to the disappearance condition of the target object in the current monitoring video, and the next video to be monitored is determined according to the disappearance position, so that the continuous tracking of the target object is realized, the accuracy and the efficiency of target tracking are improved, and the real-time tracking function is realized.
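To make the camera-handoff decision concrete, the following is a minimal Python sketch; the topology table, the edge labels and the 0.5 threshold are illustrative assumptions, since the application only requires that the occurrence probability of the target object in the next monitoring video be greater than a preset probability.

```python
# Hypothetical camera-topology table: (current camera, exit edge) ->
# candidate next cameras with assumed appearance probabilities.
CAMERA_TOPOLOGY = {
    ("cam_1", "right"): [("cam_2", 0.80), ("cam_3", 0.15)],
    ("cam_1", "top"):   [("cam_4", 0.70)],
}

PRESET_PROBABILITY = 0.5  # assumed value for the "preset probability"

def next_cameras(current_cam: str, exit_edge: str) -> list[str]:
    """Return the cameras whose monitoring videos the target is likely to
    enter next, i.e. those whose appearance probability exceeds the preset."""
    candidates = CAMERA_TOPOLOGY.get((current_cam, exit_edge), [])
    return [cam for cam, prob in candidates if prob > PRESET_PROBABILITY]

print(next_cameras("cam_1", "right"))  # ['cam_2']
```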
S2: transmitting the first ID information and the first feature vector to a next monitoring device, and triggering the next monitoring device to extract a second feature vector of each object in the next monitoring video;
specifically, when the target object disappears in the current monitoring video, the current monitoring apparatus transmits the first ID information and the first feature vector of the target object. The information is transmitted to the next monitoring device corresponding to the next monitoring video through a network or medium or other communication means.
Once the next monitoring device receives the first ID information and the first feature vector, it extracts a second feature vector for each object in the next monitoring video based on the information. The second feature vector is obtained by extracting features of the next monitoring video and is used for describing unique attributes and characteristics of each object.
By transmitting the first ID information and the first feature vector to the next monitoring device and triggering the next monitoring device to extract the second feature vector of each object in the next monitoring video, the target object can be continuously tracked and the feature vector of the target object can be updated. Therefore, continuous tracking of the target object in different monitoring videos can be ensured, and the accuracy and efficiency of tracking are improved.
S3: matching the first feature vector with each second feature vector, and judging whether the second feature vector is successfully matched with the first feature vector;
s4: if so, determining the ID information of the object corresponding to the successfully matched second feature vector as first ID information, obtaining an updated feature vector of the target object according to the first feature vector and the second feature vector, and re-entering S1 by taking the updated feature vector as the first feature vector;
s5: if not, ending the tracking of the target object.
Specifically, in step S2, the first feature vector is already transmitted to the next monitoring device corresponding to the next monitoring video, and the next monitoring device is triggered to extract the second feature vector of each object in the next monitoring video. In this step, the second feature vectors need to be matched with the first feature vectors. The purpose of the matching is to find the same target object in different surveillance videos and to determine its ID information.
In order to achieve the matching, various feature vector matching algorithms, such as cosine similarity calculation or Euclidean distance calculation, may be used. These algorithms measure the degree of similarity between the first feature vector and each second feature vector. By comparing these similarities, the successfully matched second feature vector, i.e., the feature vector most similar to the first feature vector, can be found. The ID information of the object corresponding to that second feature vector may then be determined as the first ID information.
Thus, continuous tracking and recognition of the target object can be realized among different lenses. Through continuous matching operation, the ID information of the target object can be ensured to be accurately associated in each monitoring video, so that the real-time tracking of the target object is realized, and the tracking accuracy and efficiency are improved.
Specifically, when it is determined that there is a second feature vector successfully matched with the first feature vector, the steps include:
acquiring a first feature vector: first, the ID information of the object corresponding to the successfully matched second feature vector, namely the first ID information, is obtained. Then, a first feature vector of the target object is acquired from the current monitoring apparatus based on the first ID information.
Extracting a second feature vector: next, the next monitoring device needs to be triggered to extract the second feature vector of each object in the next monitoring video. These second feature vectors will match the first feature vectors.
Feature vector matching and optimization: in S3, the first feature vector and each second feature vector have been matched, and ID information of the object corresponding to the second feature vector that has been successfully matched, that is, the first ID information, has been determined. Now, the first feature vector and the successfully matched second feature vector are used for optimization to generate the updated feature vector of the target object.
Updating the feature vector: the updated feature vector of the target object is calculated by combining the first feature vector with the successfully matched second feature vectors, for example by averaging them as described in the embodiments below. This updated feature vector reflects the state and position changes of the target object across the different surveillance videos.
Reenter S1: and finally, taking the updated characteristic vector as a new first characteristic vector, re-entering the S1 stage so as to determine the next target monitoring video when the target object disappears, and continuing to track.
And (3) ending tracking: if the second feature vector matched with the first feature vector does not exist in the next monitoring video, the target object cannot be tracked, and at the moment, the tracking is ended.
Through the steps, the characteristic vector of the target object can be ensured to be updated and optimized along with the time, so that the tracking accuracy and efficiency of the target object are improved. Thus, the system can track the target object in real time, accurately determine the next monitoring video when the target disappears, and continuously track the target object.
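One simple reading of this update step, consistent with the averaging described in the embodiments below, is sketched here in Python; the equal weighting of the carried-over vector against the newly collected ones is an assumption, since the application does not fix the exact combination rule.

```python
import numpy as np

def update_feature_vector(first_vec: np.ndarray,
                          second_vecs: np.ndarray) -> np.ndarray:
    """Combine the carried-over first feature vector with the k x 128
    matrix of feature vectors the matched object accumulated in the new
    camera, averaging them into the updated feature vector that re-enters
    S1 as the new first feature vector."""
    stacked = np.vstack([first_vec[None, :], second_vecs])
    return stacked.mean(axis=0)

updated = update_feature_vector(np.zeros(128), np.ones((5, 128)))
print(updated.shape)  # (128,)
```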
In one embodiment, before step S1, the method further includes:
extracting characteristics of a target object in each frame of the current monitoring video to obtain a characteristic vector matrix;
when the target object disappears from the current monitoring video, the feature vector is subjected to averaging processing to obtain a first feature vector.
Specifically, the embodiment describes that, before step S1, feature extraction is performed on the current surveillance video to obtain a feature vector matrix, and the feature vector is subjected to a averaging process to obtain a first feature vector. This feature extraction process may be used for tracking and identification of target objects.
In an embodiment, for each frame of the current surveillance video, the features of the target object are extracted and recorded to form a feature vector matrix. These feature vectors may contain information of the shape, color, texture, etc. of the target object for distinguishing between different target objects. When the target object disappears from the current monitoring video, i.e. the target object is no longer within the field of view of the current monitoring device, the feature vector will be averaged. In particular, all recorded feature vectors will be averaged to obtain one average feature vector. This average feature vector is determined as the first feature vector as a reference for matching and updating in the subsequent step.
By the feature extraction and averaging process, the feature representation of the target object can be efficiently extracted and converted into a vector with average features. The vector can be used for matching with the feature vector extracted from the subsequent monitoring video to determine the identity of the target object and track the target object, so that the accuracy and efficiency of a target object tracking system can be improved.
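A loose illustration of this per-camera collection and averaging, assuming NumPy and the 128-dimensional vectors mentioned in the later embodiments (the buffer class itself is hypothetical):

```python
import numpy as np

class TrackFeatureBuffer:
    """Collect one feature vector per frame for a tracked target, then
    average them when the target leaves the camera's field of view."""

    def __init__(self, track_id: str):
        self.track_id = track_id
        self.frames: list[np.ndarray] = []

    def add_frame_feature(self, vec: np.ndarray) -> None:
        self.frames.append(vec)

    def first_feature_vector(self) -> np.ndarray:
        # Called when the target disappears: column-wise mean of the
        # k x 128 feature vector matrix.
        return np.vstack(self.frames).mean(axis=0)
```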
In one embodiment, extracting features of a target object in each frame of a current surveillance video to obtain a feature vector matrix includes:
determining an image frame where each frame of target object in the current monitoring video is located;
removing background areas except for the target object in the image frame to obtain a pure target object image;
and extracting the characteristics of the pure target object image to obtain a characteristic vector matrix.
The embodiment describes a method for extracting characteristics of a target object in a current monitoring video. The method comprises the following steps:
determining an image frame where each frame of target object in the current monitoring video is located: the position of the target object is determined in each frame of the current monitoring video through a target detection or target tracking algorithm, and the position of the target object is marked by a rectangular frame.
Removing background areas except for the target object in the image frame: and removing the areas except the target object in the image according to the determined target object image frame, and only reserving the area where the target object is located.
Obtaining a pure target object image: after the background is removed, the obtained image only contains the target object and has no background interference.
Extracting features of the pure target object image: and (3) performing feature extraction on the pure target object image by adopting image processing and a computer vision algorithm, and converting the image into a feature vector form.
Through the steps, the feature vector matrix of the target object can be extracted from the current monitoring video, and the feature vector matrix can be used for target matching and feature updating in the subsequent multi-lens tracking process. The method can improve the identification accuracy and tracking effect of the target object and enhance the performance and stability of the monitoring system.
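The pipeline might be sketched as follows, with detector, segmenter and encoder as assumed callables standing in for the target detection, background removal and feature extraction stages; none of these names come from the application.

```python
import numpy as np

def extract_feature_matrix(frames, detector, segmenter, encoder) -> np.ndarray:
    """Build the k x 128 feature vector matrix for one target:
    locate the target in each frame, strip the background, encode."""
    rows = []
    for frame in frames:
        box = detector(frame)              # bounding box (x, y, w, h) or None
        if box is None:
            continue                       # target not present in this frame
        x, y, w, h = box
        crop = frame[y:y + h, x:x + w]     # image frame of the target
        clean = segmenter(crop)            # pure target image, background removed
        rows.append(encoder(clean))        # 128-d feature vector
    return np.vstack(rows) if rows else np.empty((0, 128))
```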
In one embodiment, before step S1, the method further includes:
acquiring identification information corresponding to a target object sent by a detection device;
and determining first ID information corresponding to the identification information according to the identification information and the corresponding relation of the identification-ID.
Specifically, in the monitoring system, each target object may be identified by a piece of identification information, such as a number, letters, or another symbol. In this step, the detection device transmits the identification information of the target object to the system executing the multi-lens tracking method. There is a correspondence between identification information and ID information; for example, a database or other means may be used to store the identification information and the corresponding ID information of target objects. In this step, the multi-lens tracking method determines the first ID information matching the identification information by querying this correspondence.
Through the steps in this embodiment, after the identification information of the target object is acquired, the first ID information corresponding to the identification information may be determined through the correspondence relationship. Thus, in step S1, the identification information and the first ID information of the target object can be acquired at the same time, and the subsequent multi-lens tracking operation can be continued. The embodiment can improve the accuracy and the reliability of the multi-lens tracking method and ensure that the target object is accurately tracked and identified.
In one embodiment, after extracting the identification information corresponding to the target object, the method further includes:
recording a time stamp when the identification information is detected by the detection device;
and determining the current monitoring device and the current monitoring video corresponding to the target object according to the time stamp and the installation position of the detection device.
In this embodiment, in addition to obtaining the identification information corresponding to the target object sent by the detection device, a time stamp when the detection device detects the identification information needs to be recorded, and the current monitoring device and the current monitoring video corresponding to the target object are determined according to the time stamp and the installation position of the detection device.
This means that the identification information corresponding to the target object is first acquired by the detection device. Then, the time stamp at which the identification information was detected by the detection device is recorded. Meanwhile, the current monitoring device corresponding to the target object, and the current monitoring video corresponding to that device, can be determined from the installation position of the detection device.
The steps of this embodiment may further specify how each monitoring device and its corresponding monitoring video are associated with the identification information of the target object. This provides the necessary preparation for the subsequent implementation steps and ensures that the target object and its trajectory are correctly identified when it jumps from one monitoring device to the next.
In one embodiment, when the ID information corresponding to the identification information does not exist in the correspondence relationship of the identification-ID, the method further includes:
and assigning an ID to the target object to obtain first ID information corresponding to the target object.
The present embodiment describes a case where, when ID information corresponding to identification information does not exist in the correspondence relationship between identification information and ID, it is necessary to assign an ID to a target object to obtain first ID information corresponding to the target object. This can be understood as that in step S1, when the identification information of the target object has no corresponding ID in the system, a unique ID needs to be assigned to the target object. This ID may be a number, letter, or any valid identification number for identifying the target object during subsequent feature vector matching and target object tracking.
By assigning an ID to the target object, it is ensured that in a subsequent step the target object can be correctly identified and the corresponding first ID information is transferred to the next monitoring device. This ensures that the target object can be tracked in different surveillance videos and updates its feature vector to achieve more accurate identification and tracking.
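A minimal sketch of this identification-ID correspondence with get-or-assign behaviour follows; the in-memory dictionary and the ID format are assumptions, as the application leaves the storage open (a database could equally serve as the correspondence store).

```python
import itertools

_id_counter = itertools.count(1)
identification_to_id: dict[str, str] = {}  # the identification-ID correspondence

def resolve_first_id(identification: str) -> str:
    """Look up the first ID information for a piece of identification
    information (e.g. a badge or plate number); assign a fresh unique ID
    when the correspondence does not yet contain it."""
    if identification not in identification_to_id:
        identification_to_id[identification] = f"ID-{next(_id_counter):06d}"
    return identification_to_id[identification]

print(resolve_first_id("badge-042"))  # ID-000001
print(resolve_first_id("badge-042"))  # ID-000001 (same target, same ID)
```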
In one embodiment, when it is determined that there is no second feature vector successfully matched with the first feature vector, the method further includes:
and assigning second ID information for the object corresponding to the second feature vector which fails to match.
Further, in the above-mentioned multi-shot tracking method, if the first feature vector and the second feature vector cannot be successfully matched, that is, it cannot be determined that they correspond to the same object, or it is determined that they correspond to different objects, the following operations are performed: and assigning second ID information for the object corresponding to the second feature vector which fails to match.
In other words, when tracking the target object between different surveillance videos, it is determined whether the target object is the same object by comparing the feature vectors thereof, and if the two feature vectors cannot be successfully matched, a new ID information is allocated to the object corresponding to the second feature vector. The purpose of this is to ensure that the different objects are accurately distinguished during tracking and to preserve their uniqueness.
Through the steps in this embodiment, the system is ensured to correctly handle the situation where feature vector matching fails, and new ID information can be provided for an object that is not successfully matched. This avoids confusion between different objects and provides a correct basis for subsequent tracking and identification.
In one embodiment, step S3 includes:
similarity comparison is carried out on the first characteristic vector and the second characteristic vector in a cosine distance mode;
if the similarity is larger than a preset value, the second feature vector with the similarity larger than the preset value is successfully matched with the first feature vector, an object corresponding to the second feature vector with the similarity larger than the preset value is judged to be a target object, and the ID information of the object corresponding to the second feature vector is determined to be first ID information.
The embodiment describes a specific implementation manner of step S3, specifically, the similarity between the first feature vector and the second feature vector is compared, so as to determine whether the object corresponding to the second feature vector is a target object, and the ID information of the object is determined as the first ID information.
For similarity comparison, a cosine distance approach may be used to calculate the similarity between two feature vectors. Cosine distance measures vector similarity by the included angle between two vectors, i.e., their similarity in direction. Specifically, step S3 first performs similarity comparison between the first feature vector and each second feature vector. If the calculated similarity is larger than the preset value, the object corresponding to that second feature vector is judged to be the target object: the two feature vectors are so close in character that they can be considered to represent the same object. Meanwhile, the ID information of the object corresponding to the second feature vector is determined to be the first ID information, so that tracking of the target object in different monitoring videos can be associated and the consistency of the target object information among different monitoring devices is ensured.
In summary, in this embodiment, the step of determining the target object and determining the ID information thereof by using the cosine distance similarity comparison method can accurately track the target object and ensure the consistency of the target object information among different surveillance videos.
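As a rough illustration of step S3, the sketch below compares the first feature vector against every second feature vector by cosine similarity; the best match above the threshold inherits the first ID, and every other object receives new second ID information, per the earlier embodiment. The 0.85 threshold and the function names are assumptions.

```python
import numpy as np

PRESET_SIMILARITY = 0.85  # the application only speaks of a "preset value"

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_target(first_vec, second_vecs, first_id, assign_new_id):
    """Return (index of the matched object or -1, per-object ID list)."""
    best_idx, best_sim = -1, PRESET_SIMILARITY
    for i, vec in enumerate(second_vecs):
        sim = cosine_similarity(first_vec, vec)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    ids = [first_id if i == best_idx else assign_new_id()
           for i in range(len(second_vecs))]
    return best_idx, ids  # best_idx == -1 means no match: tracking ends
```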
In one embodiment, the steps of multi-lens tracking are as follows:
(1) Since each target object appears in and disappears from a single lens (i.e., the monitoring device) in temporal order, when the target object appears in the lens for the first time, its characteristic information is recorded and collected. This characteristic information is taken from the frame-selected region of the target object during detection; a feature extraction network extracts it as a 128-dimensional feature vector. At the same time, ID information is assigned to the target object, and the ID information of each target is unique. When the target object disappears from the lens, the average of all its features in the lens is computed, and the feature and the ID information are stored in a memory in list form, forming a feature+ID list.
(2) As shown in fig. 2, the operation of step (1) is looped. According to the direction in which the target object disappears (which can be determined from the boundary point where the target object leaves the imaging lens), the position where the target object is about to appear is determined, and the monitoring camera of that area (i.e., the next monitoring device) is called. When each target object appears in that area for the first time, feature matching is performed against the feature vector set generated in step (1); the ID information of the successfully matched feature pair is used as the ID information of the target object in this lens, and tracking then continues in this lens.
(3) Sometimes information identification must be performed on the target object and its identification information displayed in the monitoring video at that moment. For example, a monitoring camera is installed at a gate; when the target object passes through the gate, a code scan or card swipe is required. At that moment, the time (time stamp) is recorded, the camera is notified, and the target object is locked. The target object information database is queried, the identification information and ID information of the target object are extracted and stored, and through steps (1) and (2) the information of the target object can be displayed in the video.
(4) When the feature vectors of each target object are transmitted among multiple lenses, the image of the target object contains background information of the space where it is located. If the whole image frame is subjected to feature extraction, the background information around the target object is included, and since this background changes as the target object moves, the extracted feature information would be disturbed by the background. The background is therefore removed before feature extraction, as described below.
Further, the specific implementation steps of the above embodiment include:
(1) For a single camera, a piece of ID information is assigned when each target object appears; the ID information is unique and is displayed back in the video. Meanwhile, feature extraction is performed on each frame of each target object to form a feature vector, and the feature vectors of all frames of each target object finally form a feature matrix. When the target object disappears from the camera, an averaging operation is performed on the feature vector matrix of the target object, finally forming the first feature vector.
(2) When the target object disappears from the monitoring video corresponding to the first camera, the data (specifically, the first feature vector and the first ID information) is transferred to the central server. The central server judges the monitoring area where the target object is about to appear according to the last appearance position of the target object, and transfers the data containing the first feature vector and the first ID information to the camera in that area (the next monitoring device), as shown in fig. 3.
(3) When the next monitoring device receives the transmitted first feature vector and first ID information, feature extraction is performed on each new target object appearing in the area, and similarity comparison is carried out against the transmitted average feature vector. The successfully matched target object is assigned the corresponding ID, and feature vectors of the target object continue to be collected. When the target object disappears from this surveillance video, its average feature vector is recalculated in combination with the average feature vector from the previous surveillance video (the first feature vector), step (2) is repeated, and the updated feature vector together with the same first ID information is transmitted to the next surveillance camera, as shown in fig. 4.
(4) When a pedestrian or a vehicle is tracked, specific identification information of the target object, such as the pedestrian's name or the vehicle owner's name, is required. When the target passes through the gate, the gate collects the identification information and records the time stamp of that moment; data is queried from the background server, and the target information is returned and sent to the monitoring device corresponding to the target object. According to the time stamp, the monitoring device compares the target objects appearing at that moment, locks the target object, and synchronizes the target information into the video detector, completing the synchronization of the target information, as shown in fig. 5. A block diagram is shown in fig. 6.
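The payload a camera pushes to the central server in steps (2) and (3) might look like the following; the field names and the grouping into one message are illustrative, as the application only specifies that the first ID information, the first feature vector, the disappearance position and the time stamp are involved.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class HandoffMessage:
    """Sketch of the data handed to the central server when a target
    disappears from a camera's monitoring video."""
    first_id: str                     # unique ID information of the target
    first_feature: np.ndarray         # 128-d average feature vector
    vanish_position: tuple[int, int]  # last (x, y) position in the frame
    timestamp: float                  # moment the target left the field of view
```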
In summary, one embodiment of the present application is as follows: (1) When the target object passes through the gate, the gate verifies the passing target object, records the verification passing time, queries the specific information of the target object, and transmits the specific information and the verification passing time together to the central server. (2) After the central server receives the verification passing time, camera 1 is called to lock the target object at the gate at that moment, and the information of the target object is collated and loaded into the monitoring video display window. (3) When the target object appears on camera 1, target detection and tracking are performed on all video frames of the target object in the monitoring video, the feature vector of the target object is extracted, and unique ID information is assigned to it. (4) When the target object disappears from camera 1, the central server is notified; the central server judges the monitoring area where the target is about to appear according to the disappearing position of the target object in camera 1. Meanwhile, the set of all frame feature vectors of the target in camera 1 is converted into a feature matrix, and the average value of each dimension of the feature matrix is calculated to obtain the average feature vector. The averaging process is as follows: assume the current camera is the $i$-th camera and that target object $n$ appears in $k$ frames under this camera; for the $m$-th frame, the feature vector of target object $n$ is

$$x_{i,n}^{m} = \left( x_{i,n}^{m,1},\; x_{i,n}^{m,2},\; \ldots,\; x_{i,n}^{m,128} \right),$$

where $x_{i,n}^{m}$ denotes the feature vector corresponding to the $m$-th frame of target object $n$ under the $i$-th camera. The feature matrix $X_{i,n}$ of target object $n$ at the $i$-th camera is then

$$X_{i,n} = \begin{bmatrix} x_{i,n}^{1} \\ x_{i,n}^{2} \\ \vdots \\ x_{i,n}^{k} \end{bmatrix},$$

and averaging each column of the feature matrix $X_{i,n}$ yields the average feature vector

$$\bar{x}_{i,n} = \frac{1}{k} \sum_{m=1}^{k} x_{i,n}^{m}.$$

(5) When the target object appears on camera 2, the monitoring device immediately locks the object after detecting it, extracts its feature vector (the second feature vector) using the feature extraction network, and transmits the result to the central server. (6) The central server compares the extracted second feature vector with the first feature vector transmitted by camera 1 by means of cosine distance; if the comparison succeeds, it is judged to be the same target object, and the ID of the target object is assigned the first ID information transmitted by camera 1. (7) Features of all frames of the target object in camera 2 are extracted, combined with the average feature vector obtained in camera 1, converted into a feature matrix, the average of each dimension is calculated, and the result is converted back into a feature vector. (8) When the target disappears from the monitoring area, steps (4), (5), (6) and (7) are repeated.
The extraction method of the feature vector is shown in fig. 7 and includes: (1) Modifying DeepLabV3+ (a semantic segmentation network model that can segment a picture along object edges to obtain the required target object) to construct the feature extraction network: on the basis of the DeepLabV3+ model, the result of the 3×3 convolution in the DeepLabV3+ decoding layer is extracted and passed through a 1×1 convolution, a 3×3 convolution and a flattening layer to obtain the feature vector, as shown in the feature extraction module in the figure; the output part of DeepLabV3+ can be removed. (2) The image frame of the target object is input into the DeepLabV3+ network, which yields the target with the background removed. (3) The background-removed target feature vector is then obtained through the feature extraction network part.
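A hedged PyTorch sketch of such a feature extraction head is given below. It assumes a 256-channel decoder feature map and inserts an average-pooling step so that flattening yields a fixed 128-dimensional vector; the channel sizes and the pooling are assumptions, since the application names only the 1×1 convolution, the 3×3 convolution and the flattening layer.

```python
import torch
import torch.nn as nn

class FeatureHead(nn.Module):
    """Feature extraction head grafted onto the DeepLabV3+ decoder:
    1x1 conv -> 3x3 conv -> (assumed) global pooling -> flatten."""

    def __init__(self, in_channels: int = 256, embed_dim: int = 128):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, 64, kernel_size=1)         # 1x1 conv
        self.conv = nn.Conv2d(64, embed_dim, kernel_size=3, padding=1)  # 3x3 conv
        self.pool = nn.AdaptiveAvgPool2d(1)   # collapse spatial dims (assumption)
        self.flatten = nn.Flatten()           # flattening layer

    def forward(self, decoder_feat: torch.Tensor) -> torch.Tensor:
        x = self.conv(self.reduce(decoder_feat))
        return self.flatten(self.pool(x))     # (batch, 128) feature vectors

feat = FeatureHead()(torch.randn(1, 256, 64, 64))
print(feat.shape)  # torch.Size([1, 128])
```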
In summary, the multi-lens tracking method provided by the application ensures the accuracy and timeliness of data transmission among multiple lenses, so that the next monitoring device can accurately identify a target object that appeared in the previous monitoring device and assign it the same ID. Moreover, the monitoring devices can synchronize tracking information, which helps to position and track the target object in real time and grasp its real-time state.
In order to solve the above technical problem, the present application further provides a multi-lens tracking system, as shown in fig. 8, including:
the video determining unit 81 is configured to obtain, when a target object disappears from a current monitoring video corresponding to a current monitoring device, first ID information and a first feature vector of the target object in the current monitoring video sent by the current monitoring device, and determine, according to an disappearance position of the target object in the current monitoring video, a next monitoring video acquired by a next monitoring device, where an occurrence probability of the target object in the next monitoring video is greater than a preset probability;
the vector processing unit 82 is configured to transmit the first ID information and the first feature vector to a next monitoring device, and trigger the next monitoring device to extract a second feature vector of each object in the next monitoring video;
a vector matching unit 83, configured to match a first feature vector with a second feature vector, and determine whether there is a successful match between the second feature vector and the first feature vector;
a vector updating unit 84, configured to determine, when there is a successful match between the second feature vector and the first feature vector, ID information of an object corresponding to the second feature vector that is successfully matched as first ID information, obtain an updated feature vector of the target object according to the first feature vector and the second feature vector, and use the updated feature vector as the first feature vector;
and an ending unit 85, configured to end tracking of the target object when there is no successful matching between the second feature vector and the first feature vector.
For the description of the multi-lens tracking system, please refer to the above embodiments, and the description of the present application is omitted herein.
In order to solve the above technical problem, the present application further provides a multi-lens tracking device, as shown in fig. 9, including:
a memory 91 for storing a computer program;
a processor 92 for implementing the steps of the multi-lens tracking method as described above when executing the computer program.
For the description of the multi-lens tracking device, please refer to the above embodiments, and the description of the present application is omitted herein.
It should also be noted that in this specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A multi-lens tracking method, comprising:
s1: when a target object disappears from a current monitoring video corresponding to a current monitoring device, acquiring first ID information and a first feature vector of the target object sent by the current monitoring device, and determining a next monitoring video acquired by a next monitoring device according to the disappearing position of the target object in the current monitoring video, wherein the occurrence probability of the target object in the next monitoring video is larger than a preset probability;
s2: transmitting the first ID information and the first feature vector to the next monitoring device, and triggering the next monitoring device to extract a second feature vector of each object in the next monitoring video;
s3: matching the first feature vector with each second feature vector, and judging whether the second feature vector is successfully matched with the first feature vector or not;
s4: if so, determining the ID information of the object corresponding to the second feature vector which is successfully matched as the first ID information, obtaining an updated feature vector of the target object according to the first feature vector and the second feature vector, and re-entering S1 by taking the updated feature vector as the first feature vector;
s5: if not, ending the tracking of the target object.
2. The multi-lens tracking method of claim 1, further comprising, prior to step S1:
extracting the characteristics of the target object in each frame of the current monitoring video to obtain a characteristic vector matrix;
and when the target object disappears from the current monitoring video, carrying out averaging processing on the feature vector to obtain the first feature vector.
3. The multi-lens tracking method according to claim 2, wherein extracting features of the target object in each frame of the current surveillance video to obtain a feature vector matrix comprises:
determining an image frame of each frame of the current monitoring video where the target object is located;
removing background areas except the target object in the image frame to obtain a pure target object image;
and extracting the characteristics of the pure target object image to obtain the characteristic vector matrix.
4. The multi-lens tracking method of claim 1, further comprising, prior to step S1:
acquiring identification information corresponding to the target object sent by a detection device;
and determining first ID information corresponding to the identification information according to the identification information and the corresponding relation of the identification-ID.
5. The multi-lens tracking method according to claim 4, further comprising, after extracting the identification information corresponding to the target object:
recording a time stamp when the identification information is detected by the detection device;
and determining the current monitoring device and the current monitoring video corresponding to the target object according to the time stamp and the installation position of the detection device.
6. The multi-lens tracking method according to claim 4, wherein when no ID information corresponding to the identification information exists in the correspondence of the identification-IDs, further comprising:
and assigning an ID to the target object to obtain first ID information corresponding to the target object.
7. The multi-lens tracking method according to any one of claims 1 to 6, characterized by further comprising, when it is determined that the second feature vector successfully matched with the first feature vector does not exist:
and assigning second ID information for the object corresponding to the second feature vector which fails to be matched.
8. The multi-lens tracking method according to any one of claims 1 to 6, wherein step S3 includes:
similarity comparison is carried out on the first characteristic vector and each second characteristic vector in a cosine distance mode;
if the similarity is larger than a preset value, the second feature vector with the similarity larger than the preset value is successfully matched with the first feature vector, an object corresponding to the second feature vector with the similarity larger than the preset value is judged to be the target object, and the ID information of the object corresponding to the second feature vector is determined to be the first ID information.
9. A multiple lens tracking system, comprising:
the video determining unit is used for acquiring first ID information and a first feature vector of a target object in the current monitoring video sent by the current monitoring device when the target object disappears from the current monitoring video corresponding to the current monitoring device, and determining a next monitoring video acquired by a next monitoring device according to the disappearing position of the target object in the current monitoring video, wherein the occurrence probability of the target object in the next monitoring video is larger than a preset probability;
the vector processing unit is used for transmitting the first ID information and the first feature vector to the next monitoring device and triggering the next monitoring device to extract a second feature vector of each object in the next monitoring video;
the vector matching unit is used for matching the first characteristic vector with the second characteristic vector and judging whether the second characteristic vector is successfully matched with the first characteristic vector or not;
a vector updating unit, configured to determine, when there is successful matching between the second feature vector and the first feature vector, ID information of an object corresponding to the second feature vector that is successfully matched as the first ID information, obtain, according to the first feature vector and the second feature vector, an updated feature vector of the target object, and use the updated feature vector as the first feature vector;
and the ending unit is used for ending the tracking of the target object when the second characteristic vector is not successfully matched with the first characteristic vector.
10. A multiple lens tracking device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the multi-lens tracking method according to any one of claims 1-8 when executing the computer program.
CN202311058392.2A 2023-08-22 2023-08-22 Multi-lens tracking method, system and device Pending CN117036410A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311058392.2A CN117036410A (en) 2023-08-22 2023-08-22 Multi-lens tracking method, system and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311058392.2A CN117036410A (en) 2023-08-22 2023-08-22 Multi-lens tracking method, system and device

Publications (1)

Publication Number Publication Date
CN117036410A true CN117036410A (en) 2023-11-10

Family

ID=88622506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311058392.2A Pending CN117036410A (en) 2023-08-22 2023-08-22 Multi-lens tracking method, system and device

Country Status (1)

Country Link
CN (1) CN117036410A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117528035A (en) * 2024-01-05 2024-02-06 四川省寰宇众恒科技有限公司 Object cross-camera tracking method and system based on active notification
CN117528035B (en) * 2024-01-05 2024-03-22 四川省寰宇众恒科技有限公司 Object cross-camera tracking method and system based on active notification

Similar Documents

Publication Publication Date Title
CN109344787B (en) Specific target tracking method based on face recognition and pedestrian re-recognition
CN108470332B (en) Multi-target tracking method and device
Kumar et al. The p-destre: A fully annotated dataset for pedestrian detection, tracking, and short/long-term re-identification from aerial devices
CN108229297B (en) Face recognition method and device, electronic equipment and computer storage medium
US7693310B2 (en) Moving object recognition apparatus for tracking a moving object based on photographed image
US8130285B2 (en) Automated searching for probable matches in a video surveillance system
US8320617B2 (en) System, method and program product for camera-based discovery of social networks
CN110853033B (en) Video detection method and device based on inter-frame similarity
CN105335726B (en) Face recognition confidence coefficient acquisition method and system
US8266174B2 (en) Behavior history retrieval apparatus and behavior history retrieval method
EP1199648A1 (en) Image shape descriptor extraction and searching
CN111144366A (en) Strange face clustering method based on joint face quality assessment
JPH10177650A (en) Device for extracting picture characteristic, device for analyzing picture characteristic, and system for collating picture
CN111126122B (en) Face recognition algorithm evaluation method and device
CN112016353A (en) Method and device for carrying out identity recognition on face image based on video
CN111145223A (en) Multi-camera personnel behavior track identification analysis method
CN111241932A (en) Automobile exhibition room passenger flow detection and analysis system, method and storage medium
CN117036410A (en) Multi-lens tracking method, system and device
CN112633255B (en) Target detection method, device and equipment
KR101957677B1 (en) System for learning based real time guidance through face recognition and the method thereof
CN111242077A (en) Figure tracking method, system and server
CN111932582A (en) Target tracking method and device in video image
CN112215156A (en) Face snapshot method and system in video monitoring
CN112116635A (en) Visual tracking method and device based on rapid human body movement
CN116311063A (en) Personnel fine granularity tracking method and system based on face recognition under monitoring video

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication