CN117037057A

CN117037057A - Tracking system based on pedestrian re-identification and hierarchical search strategy

Info

Publication number: CN117037057A
Application number: CN202310892122.5A
Authority: CN
Inventors: 黄兆孟; 陈鹏; 王妍妍; 吴帆; 王玉坤
Original assignee: China Electric Rice Information System Co ltd
Current assignee: China Electric Rice Information System Co ltd
Priority date: 2023-07-19
Filing date: 2023-07-19
Publication date: 2023-11-10

Abstract

The application discloses a tracking system based on pedestrian re-identification and a hierarchical search strategy, comprising: a data acquisition and processing module, which uses a tracking algorithm to obtain continuously cropped single-target video clips or single-target pictures from the original surveillance video; a hierarchical search and identification module, comprising a basic pedestrian re-identification algorithm and a hierarchical search strategy, wherein the pedestrian re-identification algorithm is used for target tracking and the hierarchical search strategy optimizes the allocation of computing resources by calculating the probability that each camera captures the search target's track; and a result display and confirmation module, which displays the targets obtained by the hierarchical search and identification module and constructs a spatio-temporal data chain of the search target. The application provides an automatic tracking scheme that can identify a target's track automatically with little manual intervention and limited computing resources, saving labor and time and improving efficiency.

Description

Tracking system based on pedestrian re-identification and hierarchical search strategy

Technical Field

The application relates to a tracking system, in particular to a tracking system based on pedestrian re-identification and hierarchical search strategies.

Background

In recent years, artificial intelligence technologies such as machine learning and deep learning have developed rapidly, and computer vision has gradually matured, so that technologies such as face recognition, attribute recognition and vehicle detection are now widely applied in the security field. Pedestrian re-identification aims to re-confirm the identity of a detected person who has disappeared from one imaging environment when that person enters another imaging environment. Deep-learning-based pedestrian re-identification methods have gradually surpassed traditional machine learning methods in recognition performance, have become mainstream, and are beginning to be usable in open environments. Complete video-based pedestrian re-identification comprises pedestrian detection, single-camera pedestrian tracking, pedestrian retrieval across multiple cameras and other stages, each of which requires computing resources. In particular, the growing number of cameras and the deployment of higher-definition cameras make computing resources increasingly scarce. How to track a target's track well with limited computing resources is therefore a problem to be solved.

Disclosure of Invention

Object of the application: in view of the shortcomings of the prior art, the application provides a tracking system based on pedestrian re-identification and a hierarchical search strategy.

To solve the above technical problem, the application discloses a tracking system based on pedestrian re-identification and a hierarchical search strategy, comprising a data acquisition and processing module, a hierarchical search and identification module, and a result display and confirmation module;

the data acquisition and processing module uses a tracking algorithm to obtain a group of continuously cropped single-target video clips or single-target pictures, namely single-target acquisition data, from the original video data captured by surveillance cameras at different positions and in different environments; the single-target acquisition data include the geographic position information of the surveillance camera;

the hierarchical search and identification module comprises a pedestrian re-identification algorithm and a hierarchical search strategy; the pedestrian re-identification algorithm extracts features from historically collected single-target acquisition data to obtain historical single-target features, and stores them to form a feature query library for target re-identification; the hierarchical search strategy comprises non-real-time hierarchical search and real-time search, wherein:

non-real-time hierarchical search: when searching historical single targets, the feature query library is queried with the features of the given search target for all historical single targets that meet the requirements; the results are sorted by time or position and await further confirmation, yielding the historical search targets to be confirmed;

real-time search: a given specific target to be tracked is taken as the determined search target; the system processes real-time video data containing various targets, compares the extracted features with the features of the determined search target, and screens out the targets with high confidence for further confirmation, yielding the real-time search targets to be confirmed; the historical search targets to be confirmed and the real-time search targets to be confirmed together constitute the search targets to be confirmed;

the result display and confirmation module displays the search targets to be confirmed from the hierarchical search and identification module and constructs a spatio-temporal data chain of the search target; the search targets to be confirmed are confirmed manually to obtain confirmed search targets, and the features of the manually confirmed search targets are used as enhanced query features.

Further, the workflow of the tracking system is as follows:

step 1: the data acquisition and processing module acquires raw data from the surveillance system and preprocesses it;

further, the data acquisition and processing specifically comprises the following steps:

step 1-1: data acquisition:

the data acquisition and processing module processes and organizes the data collected by the existing video surveillance system, namely the raw data, and obtains from it surveillance video with timestamps and camera position information;

step 1-2: target detection and cropping;

the target detection and cropping process comprises: cropping the multiple targets in the surveillance video obtained in step 1-1 out of the video to form multiple groups of single-target videos; during cropping, a target detection algorithm detects the targets in the surveillance video; the surveillance video is processed by a differentiable ROI transformation layer, which crops a single target out of the original video frame by computing the affine transformation between the coordinates of the video frame and the coordinates of the target; target tracking is then applied to form a single-target video, yielding a group of continuously cropped single-target video clips or single-target pictures, namely the single-target acquisition data.

Step 2: constructing the pedestrian re-identification model, computing the extracted features and performing identification, specifically as follows:

a pedestrian re-identification model is constructed, and the target under investigation is re-identified using a CNNs-RNNs re-identification model and the single-target acquisition data from step 1-2, specifically as follows:

step 2-1: feature extraction, namely extracting the spatio-temporal features from multi-frame images, specifically as follows:

the multi-frame images containing a single target under investigation are first fed as a sequence into an image transformation network and transformed to the same size; they are then input into the convolutional neural network (CNNs) to obtain the spatial features of each frame; the group of spatial features of the single target is input into the recurrent neural network (RNNs) in temporal order to obtain the spatio-temporal features of the group, which are pooled by a temporal pooling layer to obtain the spatio-temporal feature representation of the single target under investigation;

in the pedestrian re-identification model, a pre-trained ResNet50 network is used as the CNNs network and a GRU network is used as the RNNs network; the triplet loss computed on the resulting pedestrian spatio-temporal feature representation is used as a loss function of the model to be optimized, and the representation is transformed by a fully connected layer into a Z-dimensional vector to compute the cross-entropy loss over pedestrian IDs, where Z is the number of labeled pedestrian IDs;

features are extracted from the historically collected single-target acquisition data to obtain historical single-target features, which are stored in the feature query library;

step 2-2: similarity calculation and feature query;

the features of the search target are computed in the same way as the pedestrian spatio-temporal feature representation in step 2-1; feature query means using the features of the determined search target to search the feature library for matching historical single-target features: the cosine distances between the features of the search target and the historical single-target features in the library are computed, and the k historical targets with the smallest distances are selected; the selected k historical targets are judged manually; the historical single-target features of the confirmed historical targets are combined with the features of the given determined search target to form new features for further queries, finally forming the complete travel track of the determined search target.

Step 3: constructing the hierarchical search model and dividing probability levels, specifically as follows: according to the geographic information and the historical track of the search target, the hierarchical search model calculates the probability that each camera captures the search target next; these probability values are expressed as the probability level of each camera; the step specifically comprises:

step 3-1: acquiring the historical track features of the search target, as follows:

the track L of the search target is expressed as a sequence of the position coordinates of the cameras that captured it, $[M_1, M_2, M_3, \ldots, M_i, \ldots, M_t]$, where $M_i = (long_i, lat_i)$ is the longitude and latitude of the i-th camera; the local geographic features within radius r of the i-th camera $C_i$, the position information $M_j$ of each j-th camera $C_j$ within that range, and the motion vector information $a_i$ of the i-th camera together form the geographic position information of the current camera, namely:

[formula not reproduced in the source]

where $W^{map}$ is the weight matrix for the geographic features; $W^{C}$ is the weight matrix for the geographic position information of the other cameras, the distance between $M_i$ and $M_j$ being no greater than r, i.e. $\|M_i - M_j\|_2 \le r$; $W^{track}$ is the weight matrix for the motion vector, with motion vector $a_t = M_t - M_{t-1}$, where $M_t$ is the position of the current, i.e. t-th, camera and $M_{t-1}$ is the position of the previous camera; the local geographic features within radius r are obtained from the local geographic information by a convolutional neural network; the local geographic information comprises traffic information on passable roads and paths and is a context map derived from the geographic traffic map; the local geographic features are expressed as:

[formula not reproduced in the source]

where CNN is a convolutional neural network;

the historical track features of the search target are expressed as:

[formula not reproduced in the source]

step 3-2: predicting the probability region of the future track from the historical track features of the search target, specifically as follows:

a basic prediction network W2 is constructed to process the historical track information of the search target, using a GRU to model the temporal relationship; the structure of the prediction network W2 is as follows:

[formula not reproduced in the source]

where $F^{loc}$ is the historical track feature representation from step 3-1, GRU is the gated recurrent unit used to process time-series data, and the output is the set of probabilities that the target is captured by each of a series of cameras; specifically, it contains the prediction probabilities for all N cameras $C_1, C_2, C_3, \ldots, C_N$ in the area, where N is the number of cameras.

Step 4: allocating computing resources with the hierarchical search strategy and performing identification and search with the pedestrian re-identification method of step 2, specifically as follows: different search scenarios are constructed, including:

first application scenario: without any preliminary judgment, historical information is retrieved according to the features of the search target;

second application scenario: a target is captured under a single camera and, after judgment and confirmation, the target's track is queried;

third application scenario: the track of the search target is quickly searched and judged under limited computing resources;

the hierarchical search strategy is applied according to the search scenario, and specifically comprises the following steps:

step 4-1: the global search strategy, comprising:

the global search strategy is selected in the first and second application scenarios; the probability of capturing the search target is the same for all cameras, that is, the probability that the i-th camera $C_i$ captures the search target is equal for every i; the computing resources are formalized as Cptsoc, and the same share of Cptsoc is assigned to the data collected by each camera; this resource allocation is equivalent to the feature query method described in step 2-2, i.e. the feature query method of step 2-2 is the global search strategy.

Step 4-2: the hierarchical search strategy, namely hierarchical search using the probability level of each camera determined in step 3, specifically comprising:

in the second application scenario, after one or more initial positions of the target under investigation have been obtained with the global search strategy of step 4-1, or in the third scenario, when the target must be searched in real time, identification and search are performed with the hierarchical search model constructed in step 3 and the pedestrian re-identification model of step 2; the computing resources Cptsoc are allocated according to the probability level of each camera.

Step 4-3: the feature-distribution-assisted enhanced search strategy, comprising:

when the hierarchical search strategy of step 4-2 is adopted, computing resources are moved forward to the camera terminals. The system center distributes the features of the confirmed search target to each camera terminal; each camera terminal extracts the features of the targets under its camera in the manner of steps 1 and 2 and computes their similarity with the features of the confirmed search target. Because the computing capacity of a camera terminal is limited, this strategy performs only coarse (fuzzy) screening of the targets captured by the terminal: targets whose similarity to the search target is below a certain threshold are filtered out, and the remaining targets are uploaded to the system center for further processing. After the system center has computed and confirmed the result, the new features of the confirmed target are again distributed to each camera terminal for subsequent identification.

Step 5: result display and confirmation, specifically comprising:

step 5-1: sorting and displaying results: all target identification results are sorted by time, displayed in the tracking system, and updated as new results arrive. Specifically, the system combines a geographic view with the matching video images: it displays an abstract map in which the connecting lines are drawn in temporal order, enabling further analysis of the historical track of the search target. The complete video of the most recent matching location is displayed and awaits further confirmation.
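The time ordering in step 5-1 can be sketched as follows. This is a minimal illustration, not the application's implementation: the `Match` record, its fields, and `build_spacetime_chain` are hypothetical names introduced here, and the coordinates in any usage are made up.

```python
from typing import NamedTuple, Tuple


class Match(NamedTuple):
    """One identification result (hypothetical record layout)."""
    timestamp: float               # capture time, e.g. seconds since epoch
    camera_id: str                 # camera that captured the target
    position: Tuple[float, float]  # (longitude, latitude) of that camera


def build_spacetime_chain(matches):
    """Sort identification results by time to form the spatio-temporal chain."""
    return sorted(matches, key=lambda m: m.timestamp)


def chain_segments(chain):
    """Return consecutive position pairs, i.e. the lines to draw on the map."""
    return [(a.position, b.position) for a, b in zip(chain, chain[1:])]
```

Drawing the segments returned by `chain_segments` in order gives the time-ordered connecting lines on the abstract map.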

Step 5-2: result confirmation: the matching results are judged manually and incorrect matches are removed; the remaining correct results are the manually confirmed image features of the search target, which are fused in as new matching features.
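The source does not specify how confirmed features are fused. As one possible sketch, and purely as an assumption, the query feature and the manually confirmed features can be averaged and L2-normalized; `fuse_features` is a hypothetical name.

```python
import math


def fuse_features(query_feature, confirmed_features):
    """Fuse the query feature with manually confirmed features.

    Assumption: fusion is a plain average followed by L2 normalization;
    the application does not state the fusion rule.
    """
    vectors = [list(query_feature)] + [list(f) for f in confirmed_features]
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all features must have the same dimension")
    mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    # Leave an all-zero vector unchanged to avoid division by zero.
    return mean if norm == 0.0 else [x / norm for x in mean]
```

Normalizing the fused vector keeps it directly comparable under the cosine distance used in step 2-2.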

Beneficial effects:

(1) The application provides an automatic tracking scheme based on pedestrian re-identification technology that can identify a target's track automatically with little manual intervention, saving considerable labor and time and improving efficiency.

(2) The application provides a pedestrian re-identification method with feature enhancement from confirmed targets, together with two search and identification strategies, static global search and hierarchical search, so that the system can handle more complex tracking environments.

(3) With the hierarchical search strategy that incorporates geographic information, the probability region in which nearby cameras will next capture the tracked object is calculated from the target tracks captured by the cameras and from the geographic information, and computing resources are allocated according to the calculated probabilities, so that the target's track is tracked better under limited computing resources.

Drawings

The foregoing and/or other advantages of the application will become more apparent from the following detailed description taken in conjunction with the accompanying drawings.

FIG. 1 is a block diagram of the tracking system of the present application;

FIG. 2 is a flowchart of the pedestrian re-identification algorithm used in the present application;

FIG. 3 is a schematic diagram of the hierarchical search strategy incorporating geographic information in accordance with the present application;

FIG. 4 is a schematic diagram of the tracking algorithm based on pedestrian re-identification and the hierarchical search strategy used in the present application;

FIG. 5 is a schematic diagram of the allocation of computing resources in different situations when using the hierarchical search strategy in accordance with the present application.

Detailed Description

FIG. 1 shows the tracking system based on pedestrian re-identification and a hierarchical search strategy according to the present application, which comprises a data acquisition and processing module, a hierarchical search and identification module, and a result display and confirmation module, wherein:

The data acquisition and processing module uses a tracking algorithm to obtain a group of continuously cropped single-target video clips or single-target pictures from the original video data captured by surveillance cameras at different positions and in different environments. The data also include the geographic position information of the camera.

The hierarchical search and identification module comprises a basic pedestrian re-identification algorithm and a hierarchical search strategy. The pedestrian re-identification algorithm processes the single-target video clips with geographic position information produced by the data acquisition and processing module to obtain a feature query library for subsequent target re-identification. When searching historical targets, the query library is used to find all targets that meet the requirements; these are sorted by time and position and await further confirmation. During real-time target tracking, the hierarchical search strategy is used to process the video data in real time, compare its features with those of the determined target, and screen out the targets with high confidence for further confirmation.

The result display and confirmation module displays the targets obtained by the hierarchical search and identification module and constructs a spatio-temporal data chain of the search target. Manually confirmed data are treated as confirmed targets, and the target features obtained from the video frames after each confirmation are used as enhanced query features, increasing the reliability of target tracking.

The implementation flow of the tracking system based on pedestrian re-identification and a hierarchical search strategy is as follows:

Step 1: data acquisition and processing. Raw data are acquired from the surveillance system and preprocessed. This specifically comprises:

Step 1-1: data acquisition.

The data acquisition module further processes and organizes the data collected by existing video surveillance systems such as the Skynet Project and the Xueliang (Sharp Eyes) Project, whose cameras are distributed at different positions in the environment. Surveillance videos with timestamps and camera position information are collected from these systems. A tracking algorithm is then used to obtain a group of continuously cropped single-target video clips or single-target pictures from the original video data captured by the surveillance cameras at different positions and in different environments.

Step 1-2: target detection and cropping. The video picture of a single camera often contains several pedestrian targets, which must be cropped out of the video separately to form multiple groups of single-target videos. The background of the video recorded by a single camera is usually fixed, so pedestrian targets can be detected well with a target detection algorithm (reference: Luo, Hui-lan, and Hong-kun Chen. "Survey of object detection based on deep learning." Acta Electronica Sinica 48.6 (2020): 1230.). However, the background and image quality differ from camera to camera, and there is no explicit quantitative criterion for determining the most accurate bounding box position. The application therefore uses an existing differentiable ROI transformation layer (reference: Ding, Jian, Nan Xue, Yang Long, Gui-Song Xia, and Qikai Lu. "Learning RoI Transformer for oriented object detection in aerial images." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2849-2858. 2019.). It crops the single target out of the original image by computing the affine transformation between the coordinates of the original video frame and the coordinates of the target. Target tracking is then applied to form a single-target video.

During model training, the cropped video images are input into the pedestrian re-identification model, the corresponding loss is computed, and the transformation layer of step 1-2 is adjusted and corrected through gradient back-propagation. During model use, the cropped video images serve as the targets for feature extraction and identification.
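The cropping by affine transformation in step 1-2 can be illustrated with a minimal sketch. It assumes an axis-aligned bounding box and nearest-neighbor sampling on a frame stored as a list of pixel rows; the names `crop_affine` and `crop_target` are hypothetical, and the learnable, gradient-carrying sampling of the actual transformation layer is not reproduced here.

```python
def crop_affine(box, out_w, out_h):
    """Affine map from output pixel (u, v) to frame coordinates.

    For box = (x0, y0, x1, y1) the map is x = sx * u + tx, y = sy * v + ty.
    Returns (sx, sy, tx, ty).
    """
    x0, y0, x1, y1 = box
    sx = (x1 - x0) / out_w
    sy = (y1 - y0) / out_h
    return sx, sy, x0, y0


def crop_target(frame, box, out_w, out_h):
    """Cut one target out of `frame` by sampling at transformed coordinates."""
    sx, sy, tx, ty = crop_affine(box, out_w, out_h)
    height, width = len(frame), len(frame[0])
    out = []
    for v in range(out_h):
        # Sample at the pixel center and clamp to the frame borders.
        y = min(height - 1, max(0, int(sy * (v + 0.5) + ty)))
        row = []
        for u in range(out_w):
            x = min(width - 1, max(0, int(sx * (u + 0.5) + tx)))
            row.append(frame[y][x])
        out.append(row)
    return out
```

Because the output size is fixed by `out_w` and `out_h`, every crop of the same target has the same shape regardless of the box size, which is what the later feature extraction expects.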

Step 2: constructing the pedestrian re-identification model, computing the extracted features and performing identification. In the past, pedestrian re-identification on single-frame images was the dominant research approach. However, the information in a single picture is limited; in particular, during actual pedestrian tracking and detection, pedestrians may be occluded or distorted, which can degrade recognition performance. Studies have shown that multi-frame sequences can compensate for the limited information in a single video frame. In this embodiment, therefore, a simple CNNs-RNNs re-identification model is used: a ResNet50 network (reference: He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.) pre-trained on the Market1501 dataset (reference: Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Bu, J., Tian, Q.: Scalable person re-identification: a benchmark. In: IEEE International Conference on Computer Vision (2015)) serves as the CNNs network, and a basic GRU network (reference: Dey, Rahul, and Fathi M. Salem. "Gate-variants of Gated Recurrent Unit (GRU) neural networks." 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS). IEEE, 2017.) serves as the multi-frame RNNs network, to re-identify the detected images provided by step 1. This specifically comprises:

Step 2-1: feature extraction. As shown in the pedestrian re-identification flowchart of FIG. 2, all target comparison and similarity calculation is based on the features of the multi-frame video images. Extracting the spatio-temporal features in the video data is the key to pedestrian re-identification. The application constructs a CNNs-RNNs model to extract the spatio-temporal features of the detected target. The feature extraction flow is as follows:

The multi-frame images of a single target under investigation, which may differ in size, are first fed as a sequence into an image transformation network and transformed to the same size. They are then input into the convolutional neural network (CNNs) to obtain the spatial features of each frame. The group of spatial features of the target is input into the recurrent neural network (RNNs) in temporal order to obtain the spatio-temporal features of the group, which are pooled by a temporal pooling layer to obtain the spatio-temporal feature representation of the detected target.
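The order of operations in this flow can be sketched as below. The sketch only shows the data flow: `resize`, `cnn` and `rnn` are caller-supplied stand-ins for the image transformation network, the ResNet50 and the GRU, and mean pooling over time is an assumption, since the source does not state which temporal pooling is used. All function names are hypothetical.

```python
def temporal_mean_pool(step_features):
    """Average a sequence of equal-length feature vectors over time.

    Assumption: the temporal pooling layer is mean pooling.
    """
    if not step_features:
        raise ValueError("at least one time step is required")
    steps = len(step_features)
    dim = len(step_features[0])
    return [sum(f[d] for f in step_features) / steps for d in range(dim)]


def extract_spatiotemporal_feature(frames, resize, cnn, rnn):
    """Resize frames, extract per-frame features, model time, then pool."""
    resized = [resize(frame) for frame in frames]  # same size for all frames
    spatial = [cnn(frame) for frame in resized]    # one vector per frame
    temporal = rnn(spatial)                        # one vector per time step
    return temporal_mean_pool(temporal)            # single representation
```

Keeping the three networks as parameters makes it explicit that historical, real-time and single-frame inputs all pass through the same pipeline with the same parameters.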

After the above processing, the data from historical surveillance are placed into the feature library to await feature queries. Real-time monitored targets and single-frame identification targets are processed by CNNs and RNNs with the same parameters to obtain the corresponding features.

In this embodiment, the triplet loss computed on the resulting pedestrian spatio-temporal feature representation is used as a loss function of the model to be optimized, and the representation is transformed by a fully connected layer into a Z-dimensional vector to compute the cross-entropy loss over IDs, where Z is the number of labeled pedestrian IDs.
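The two losses named above can be written out for a single sample as a minimal sketch. The Euclidean distance inside the triplet loss, the margin value 0.3, and the function names are assumptions made for illustration; the source specifies neither the distance nor the margin, nor how the two losses are weighted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Pull same-ID features together and push different-ID features apart.

    Assumption: Euclidean distance and margin 0.3 (not given in the source).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def cross_entropy(logits, target_id):
    """Cross-entropy of a Z-dimensional logit vector against the true ID index."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target_id]
```

Here `logits` plays the role of the Z-dimensional output of the fully connected layer, with one entry per labeled pedestrian ID.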

Step 2-2: similarity calculation and feature query. As shown in FIG. 2, the features computed in step 2-1 are used for similarity calculation and feature query. Feature query is the process of searching the feature library with the features of a picture or multi-frame video of the determined target: the cosine distances between the target's features and the features in the library are computed, and the k targets with the smallest distances are selected. Because the system operates in an open environment, the trained model cannot rely on labeled ID information for its decision, so the selected k targets require further manual judgment. The features of the confirmed historical video of the target are combined with the target's own features to form new features for further queries, finally forming the complete travel track of the target. Real-time tracking requires the tracked target's track to be determined more quickly; features are therefore extracted from the real-time surveillance data in the same way, their similarity to the features of the confirmed target, expressed by cosine distance, is computed directly, and targets whose similarity exceeds the threshold are presented to staff for confirmation.
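The feature query described above amounts to a top-k search under cosine distance. The following is a minimal sketch with an in-memory dictionary standing in for the feature library; `cosine_distance`, `query_top_k` and the dictionary layout are hypothetical choices for illustration.

```python
import math


def cosine_distance(a, b):
    """1 minus cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def query_top_k(query, library, k):
    """Return the k (target_id, distance) pairs closest to `query`.

    `library` maps a historical target id to its stored feature vector.
    """
    scored = [(tid, cosine_distance(query, feat)) for tid, feat in library.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]
```

The k candidates returned here are the ones handed to the operator for manual judgment; the function itself makes no identity decision.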

Step 3: constructing the hierarchical search model and dividing probability levels. According to the geographic information and the historical track of the search target, the probability that each camera captures the search target next is calculated; these probability values are expressed as the probability level of each camera, and videos from cameras with different probability levels are treated differently. This specifically comprises:

Step 3-1: acquiring the historical track features of the search target. As shown in FIG. 3, the dashed lines represent simplified geographic traffic routes, and the tracked person moves along a complex road network. A solid black dot indicates that the camera at that position has captured the target and the identification has been confirmed. The target's track L can thus be expressed as a sequence of the position coordinates of the cameras that captured it, $[M_1, M_2, M_3, \ldots, M_i, \ldots, M_t]$, where $M_i = (long_i, lat_i)$ is a specific longitude and latitude. For each camera $C_i$, the local geographic features within radius r, the position information $M_j$ of the other cameras $C_j$ within that range, and the motion vector information $a_i$ together form the geographic position information of the current camera, namely:

[formula not reproduced in the source]

where $W^{map}$ is the weight matrix for the geographic features; $W^{C}$ is the weight matrix for the geographic position information of the other cameras, the distance between $M_i$ and $M_j$ being no greater than r, i.e. $\|M_i - M_j\|_2 \le r$; and $W^{track}$ is the weight matrix for the motion vector, with motion vector $a_t = M_t - M_{t-1}$, where $M_t$ is the current camera position and $M_{t-1}$ is the previous camera position. The local geographic features within radius r are obtained from the local geographic information by a convolutional neural network. The local geographic information mainly comprises important traffic information such as passable roads and paths, and is a context map derived from the geographic traffic map. The local geographic features can be expressed as:

[formula not reproduced in the source]

where CNN is a convolutional neural network; in this embodiment, a 3-layer Faster R-CNN network is used.
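Two of the quantities defined above, the motion vector $a_t = M_t - M_{t-1}$ and the set of cameras $C_j$ with $\|M_i - M_j\|_2 \le r$, can be computed directly. The sketch below treats coordinates as planar points, which is an assumption; real longitude and latitude would need a geodesic distance. The function names are hypothetical.

```python
import math


def motion_vectors(track):
    """Return a_t = M_t - M_{t-1} for each consecutive pair of capture points."""
    return [
        (track[t][0] - track[t - 1][0], track[t][1] - track[t - 1][1])
        for t in range(1, len(track))
    ]


def neighbors_within_radius(cameras, i, r):
    """Indices j != i of cameras whose distance to camera i is at most r.

    Assumption: planar Euclidean distance on the coordinate pairs.
    """
    xi, yi = cameras[i]
    return [
        j
        for j, (xj, yj) in enumerate(cameras)
        if j != i and math.hypot(xi - xj, yi - yj) <= r
    ]
```

These are only the geometric inputs; the weighting by $W^{map}$, $W^{C}$ and $W^{track}$ is learned and is not sketched here.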

The historical track features of the search target are expressed as:

[formula not reproduced in the source]

Step 3-2: predicting the probability region of the future track from the historical track features of the search target. The application constructs a basic prediction network W2 to process the historical track information of the search target; because this information is sequential in time, a GRU is used to model the temporal relationship. The structure of the prediction network W2 is as follows:

[formula not reproduced in the source]

where $F^{loc}$ is the historical track feature representation from step 3-1, GRU is the gated recurrent unit used to process time-series data, and the output is the set of probabilities that the target may be captured by each of a series of cameras. Specifically, it contains the prediction probabilities for all cameras $C_1, C_2, C_3, \ldots, C_N$ in the area, where N is the number of cameras.
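The source does not state how the outputs of the prediction network W2 are normalized into probabilities. As a hedged illustration only, the sketch below assumes that the network yields one raw score per camera and that a softmax turns the N scores into probabilities summing to 1; `softmax` and the score list are not taken from the application.

```python
import math


def softmax(scores):
    """Turn per-camera scores into probabilities that sum to 1.

    Assumption: softmax normalization; the source does not specify it.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Under this assumption, equal scores for all cameras reduce to the uniform case used by the global search strategy.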

Step 4: as shown in fig. 4, computing resources are allocated using a hierarchical search strategy and an identification search is performed using the pedestrian re-identification method of step 2. The application constructs different search scenes, including the following application scenes: 1) Under the condition of no preliminary judgment, carrying out historical information retrieval according to a single photo of the reconnaissance target; 2) Capturing a target under a certain single camera and inquiring a target track after judging and confirming; 3) And quickly searching and judging the track of the search target under the limited computing resources in the emergency capture action. The method specifically comprises the following steps:

Step 4-1: Global search. The global search strategy is selected under the conditions of scenario 1) and scenario 2) described above. In such an initial scenario, no definite position information about the search target can be collected, so the probability of capturing the search target is the same for all cameras. Formalizing the computing resources as Cptsoc, the data collected by each camera is assigned the same share of Cptsoc, as shown in the left-hand diagram of FIG. 5. This resource allocation is equivalent to the feature query of step 2-2, in which the features of the target image are used to query the feature library extracted from all historical video data. Therefore, in actual operation, the search is performed directly with the feature query method, which saves the resources that would otherwise be needed to compute a probability for each camera.

Step 4-2: Hierarchical search. A hierarchical search is carried out using the probability level of each camera obtained in step 3. In scenario 2), once one or more initial positions of the reconnaissance target have been obtained and confirmed using step 4-1 and all track information must be completed as quickly as possible under limited resources, or in scenario 3), where the search target must be searched for in real time, the recognition search is performed using the hierarchical search model constructed in step 3 and the pedestrian re-identification method of step 2. In this case, matching under each camera is allocated a share of Cptsoc corresponding to that camera's probability level, as shown in the right-hand diagram of FIG. 5.
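The difference between the global allocation of step 4-1 and the hierarchical allocation of step 4-2 can be illustrated with a short sketch. It assumes that the whole budget Cptsoc is distributed and that, in the hierarchical case, each camera's share is proportional to its predicted probability; the function names, the budget of 100 units and the probabilities are hypothetical.

```python
def allocate_uniform(camera_ids, cptsoc):
    """Step 4-1 style: every camera receives the same share of the budget."""
    share = cptsoc / len(camera_ids)
    return {cid: share for cid in camera_ids}

def allocate_by_probability(probs, cptsoc):
    """Step 4-2 style: each share is proportional to the camera's probability."""
    total = sum(probs.values())               # normalise in case probs do not sum to 1
    return {cid: cptsoc * p / total for cid, p in probs.items()}

# Hypothetical per-camera probabilities from the hierarchical search model.
probs = {"C1": 0.5, "C2": 0.3, "C3": 0.2}
uniform = allocate_uniform(list(probs), cptsoc=100.0)
tiered = allocate_by_probability(probs, cptsoc=100.0)
```

With these inputs, the uniform split gives every camera about one third of the budget, while the tiered split gives C1 half of it, so the cameras most likely to capture the target are processed first or more thoroughly.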

Step 4-3: Distribution feature-assisted enhanced search strategy. In step 4-2, to further increase search efficiency, computation is moved forward to the camera terminals where conditions permit. The system center distributes the features of the confirmed search target to each camera terminal; each camera terminal extracts the features of the targets under its own camera in the manner of steps 1 and 2 and computes their similarity to the features of the confirmed search target. Because the computing capacity of a camera terminal is limited, this strategy performs only coarse screening of the targets captured by the terminal: targets whose similarity to the search target is below a certain threshold are filtered out, and the remaining targets are uploaded to the system center for further processing. After the system center computes and confirms a match, the new features of the confirmed target continue to be distributed to each camera terminal for subsequent identification.
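The terminal-side coarse screening described above can be sketched as follows. Cosine similarity is used here only as an assumed similarity measure, and the function names, feature vectors and the threshold of 0.8 are hypothetical.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def screen_on_terminal(target_feature, detections, threshold):
    """Keep only detections similar enough to the confirmed target for upload."""
    kept = []
    for det_id, feature in detections.items():
        score = cosine_similarity(target_feature, feature)
        if score >= threshold:                # below-threshold targets are filtered out
            kept.append((det_id, score))
    return kept

# Feature of the confirmed search target, as distributed by the system center.
target = [1.0, 0.0, 1.0]
# Features extracted on the camera terminal for the targets under this camera.
detections = {
    "d1": [0.9, 0.1, 1.0],
    "d2": [0.0, 1.0, 0.0],
    "d3": [1.0, 0.9, 0.2],
}
uploaded = screen_on_terminal(target, detections, threshold=0.8)
```

Only detection d1 passes the assumed threshold and would be uploaded to the system center, which then performs the more expensive confirmation.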

Step 5: Result display and confirmation. The query and matching results from the above steps are sorted by time and manually confirmed, specifically as follows:

Step 5-1: Sorting and displaying results. The application sorts all matching results by time, displays them in the system, and updates the display as new matching results arrive. Specifically, the system combines the geographic view with the matched video images: it displays abstract map markers and connects them in chronological order, so that staff can further analyze the historical track of the search target. The complete video of the most recent matching location is displayed and awaits further confirmation.
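Sorting the matches by time and connecting the map markers in chronological order can be sketched as below. The record fields (`camera`, `time`, `position`), the function name and the sample values are hypothetical.

```python
from datetime import datetime

def build_track(matches):
    """Sort matches by capture time and pair consecutive markers into line segments."""
    ordered = sorted(matches, key=lambda m: m["time"])
    points = [(m["camera"], m["position"]) for m in ordered]
    segments = list(zip(points, points[1:]))  # each segment links two consecutive markers
    return ordered, segments

# Hypothetical matching results, deliberately out of chronological order.
matches = [
    {"camera": "C2", "time": datetime.fromisoformat("2023-07-19T10:05:00"),
     "position": (118.782, 32.041)},
    {"camera": "C1", "time": datetime.fromisoformat("2023-07-19T10:00:00"),
     "position": (118.780, 32.040)},
    {"camera": "C3", "time": datetime.fromisoformat("2023-07-19T10:20:00"),
     "position": (118.900, 32.200)},
]
ordered, segments = build_track(matches)
latest = ordered[-1]                          # the match whose video awaits confirmation
```

The ordered list gives the marker sequence C1, C2, C3, the two segments form the displayed connecting line, and `latest` identifies the most recent matching location.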

Step 5-2: Result confirmation. The matching results are judged manually; incorrect matches are removed, and correct results are entered into the system to increase its matching capability. In the similarity calculation using pedestrian re-identification in step 4 above, the comparison is initially performed using the originally provided image. Because the tracking system based on pedestrian re-identification and a hierarchical search strategy is applied in an open, non-laboratory environment, all confirmed target images are fused and used as a new matching feature. This feature carries more information about the target and reinforces the search, so manual judgment of the results improves the robustness of the system.
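The text does not specify how the confirmed target images are fused. The sketch below assumes one simple possibility, an element-wise mean of the confirmed feature vectors followed by L2 normalization; the function name and the sample vectors are hypothetical.

```python
import math

def fuse_confirmed_features(features):
    """Average the confirmed feature vectors and L2-normalise the result (assumed fusion)."""
    dim = len(features[0])
    mean = [sum(f[k] for f in features) / len(features) for k in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean] if norm else mean

# Feature of the originally provided image plus one manually confirmed match.
confirmed = [[1.0, 0.0], [0.0, 1.0]]
fused = fuse_confirmed_features(confirmed)    # new matching feature for later searches
```

Each newly confirmed match would be appended to `confirmed` and the fused feature recomputed before it is used for subsequent similarity calculations.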

In a specific implementation, the present application provides a computer storage medium and a corresponding data processing unit. The computer storage medium can store a computer program which, when executed by the data processing unit, can perform some or all of the steps in the summary and embodiments of the tracking system based on pedestrian re-identification and a hierarchical search strategy provided by the present application. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random-access memory (RAM), or the like.

It will be apparent to those skilled in the art that the technical solutions in the embodiments of the present application may be implemented by means of a computer program and its corresponding general hardware platform. Based on such understanding, the technical solutions in the embodiments of the present application may be embodied essentially in the form of a computer program, i.e. a software product, which may be stored in a storage medium and includes several instructions that cause a device comprising a data processing unit (which may be a personal computer, a server, a single-chip microcomputer, MUU or a network device, etc.) to perform the methods described in the embodiments, or in some parts of the embodiments, of the present application.

The application provides an approach and a method for a tracking system based on pedestrian re-identification and a hierarchical search strategy, and there are many methods and ways to implement this technical solution. The above description is only a preferred embodiment of the application. It should be pointed out that those skilled in the art can make a number of improvements and modifications without departing from the principle of the application, and these improvements and modifications are also considered to fall within the protection scope of the application. Components not explicitly described in this embodiment can be implemented using the prior art.

Claims

1. A tracking system based on pedestrian re-recognition and hierarchical search strategies, comprising: the device comprises a data acquisition processing module, a hierarchical search identification module and a result display confirmation module;

2. The tracking system based on pedestrian re-recognition and hierarchical search strategy of claim 1, wherein the specific workflow of the tracking system is as follows:

step 2: constructing a pedestrian re-identification model, calculating and extracting characteristics and identifying;

step 3: constructing a hierarchical search model, and dividing probability grades;

step 4: distributing computing resources by using a hierarchical search strategy and performing recognition search by using the pedestrian re-recognition method in the step 2;

step 5: the results show confirmation.

3. The tracking system based on pedestrian re-recognition and hierarchical search strategy according to claim 2, wherein the data acquisition process in step 1 specifically comprises the following steps:

step 1-1: data acquisition;

step 1-2: detecting and cutting a target;

4. The tracking system based on pedestrian re-recognition and hierarchical search strategy as set forth in claim 3, wherein in step 2 the pedestrian re-recognition model is constructed, and features are computed, extracted and recognized, by the following specific method:

constructing a pedestrian re-recognition model, and re-recognizing the target to be detected by using the CNN-RNN re-recognition model and the single target acquisition data in the step 1-2, wherein the specific method comprises the following steps:

multiple frames containing a single target to be detected are first fed, as a sequence, into an image transformation network and transformed to the same size; each frame is then input into a convolutional neural network (CNN) to obtain its single-frame spatial features; the group of spatial features of the single target is input in temporal order into a recurrent neural network (RNN) to obtain the spatio-temporal features of the group, which are pooled by a temporal pooling layer to obtain the spatio-temporal feature representation of the single target to be detected;

step 2-2: similarity calculation and characteristic query;

5. The tracking system based on pedestrian re-recognition and hierarchical search strategy as set forth in claim 4, wherein in step 3 the hierarchical search model is constructed and the probability levels are divided as follows: the hierarchical search model calculates, from geographic information and the historical track of the search target, the probability that each camera will next capture the search target, the probability values being expressed as the probability level of each camera, specifically comprising:

the track L of the search target is expressed as a combination of the position coordinates of a series of camera capture points $[M_1, M_2, M_3, \ldots, M_i, \ldots, M_t]$, where $M_i = (long_i, lat_i)$ is the longitude and latitude coordinates of the i-th camera; the local geographic features within radius $r$ of the i-th camera $C_i$, the position information $M_j$ of each j-th camera $C_j$ within that range, and the motion vector information $a_i$ of the i-th camera together serve as the geographic location information of the current camera, namely:

where CNN is a convolutional neural network;

the historical track characteristics of the search target are expressed as:

where $F^{loc}$ is the historical track feature obtained in step 3-1, GRU is the gated recurrent unit used to process time-series data, and the output is the probability that the target may be captured by each of a series of cameras; specifically, the output contains the prediction probabilities for all $N$ cameras $C_1, C_2, C_3, \ldots, C_N$ in the area, where $N$ is the number of cameras.

6. The tracking system based on pedestrian re-recognition and hierarchical search strategy as set forth in claim 5, wherein the method of assigning computing resources using the hierarchical search strategy and performing the recognition search using the pedestrian re-recognition method of step 2 in step 4 includes: constructing different search scenarios, including:

step 4-1: a global search strategy;

step 4-2: a hierarchical search strategy;

step 4-3: the distribution feature assists in enforcing the search strategy.

7. The pedestrian re-identification and tiered search strategy-based tracking system of claim 6 wherein the global search strategy of step 4-1 includes:

selecting the global search strategy under the conditions of the first application scenario and the second application scenario; the probability that the i-th camera $C_i$ captures the search target is the same for all cameras; formalizing the computing resources as Cptsoc, the data collected by each camera is assigned the same share of Cptsoc; this resource allocation is equivalent to the feature query method described in step 2-2, that is, the feature query method described in step 2-2 serves as the global search strategy.

8. The tracking system based on the pedestrian re-recognition and hierarchical search strategy according to claim 7, wherein the hierarchical search strategy described in step 4-2, i.e. performing the hierarchical search using the probability level of each camera divided in step 3, specifically comprises:

under the condition of the second application scenario, after one or more initial positions of the reconnaissance target have been obtained using the global search strategy of step 4-1, or under the condition of the third application scenario, in which the target is searched for in real time, performing the recognition search using the hierarchical search model constructed in step 3 and the pedestrian re-recognition model of step 2; computing resources are allocated according to the probability level of each camera.

9. The pedestrian re-identification and tiered search strategy-based tracking system of claim 8 wherein the distribution feature-assisted enhanced search strategy of step 4-3 includes:

when the hierarchical search strategy described in step 4-2 is adopted, computation is moved forward to the camera terminals; the system center distributes the features of the confirmed search target to each camera terminal, and each camera terminal extracts the features of the targets under its own camera in the manner of steps 1 and 2 and computes their similarity to the features of the confirmed search target; the distribution feature-assisted enhanced search strategy performs only coarse screening of the targets captured by the camera terminal, filters out targets whose similarity to the search target is below a certain threshold, and uploads the remaining targets to the system center for further processing; after the system center computes and confirms a match, the new features of the confirmed target continue to be distributed to each camera terminal for subsequent identification.

10. The tracking system based on pedestrian re-recognition and hierarchical search strategy of claim 9, wherein the result presentation confirmation of step 5 comprises the steps of:

step 5-1: sorting and displaying results; all target identification results are sorted by time, displayed in the tracking system, and the display is updated as new results arrive; the system combines the geographic view with the matched video images for display, that is, it displays abstract map markers and connects them in chronological order for further analysis of the historical track of the search target; the complete video of the most recent matching location is displayed and awaits manual confirmation;