WO2023123916A1 - Target tracking method and apparatus, and electronic device and storage medium - Google Patents

Target tracking method and apparatus, and electronic device and storage medium

Info

Publication number
WO2023123916A1
Authority
WO
WIPO (PCT)
Prior art keywords
path
detection
point
human body
edge
Prior art date
Application number
PCT/CN2022/100164
Other languages
French (fr)
Chinese (zh)
Inventor
王京
王孝宇
肖嵘
黄哲
Original Assignee
深圳云天励飞技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳云天励飞技术股份有限公司 filed Critical 深圳云天励飞技术股份有限公司
Publication of WO2023123916A1 publication Critical patent/WO2023123916A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10016: Video; Image sequence

Definitions

  • the present application relates to the technical field of image processing, and in particular to a target tracking method, device, electronic equipment and storage medium.
  • human body tracking technology has been widely applied across social life, for example face capture in public security and face archiving in business scenarios.
  • most existing tracking technologies perform tracking based on human body detection under a single camera.
  • under a single camera, once the human body is occluded, human body detection fails, which greatly degrades tracking accuracy; when multiple cameras are used to capture the human body instead, fusing information between the cameras is extremely difficult, so tracking accuracy remains insufficient.
  • the main purpose of this application is to provide a target tracking method, including:
  • the obtained result path is determined as a tracking result.
  • the determining the human body detection images corresponding to the human body regions in the multiple pictures includes:
  • performing feature fusion on the plurality of human body detection images according to the association information between them, to construct the first path image, includes:
  • the 3D hypothetical point and the first weighted edge are calculated, and the calculation result is combined with the detection point, the 3D hypothetical point, and the first weighted edge to construct the first path image.
  • the method includes:
  • the epipolar distance and the feature similarity are calculated based on the Mahalanobis distance to determine a corresponding second weighted edge.
  • the calculation performed on the three-dimensional hypothetical point and the first weighted edge, combining the calculation result with the detection point, the three-dimensional hypothetical point and the first weighted edge to construct the first path image, includes:
  • the detection point, the three-dimensional hypothetical point, the first weight edge, the second weight edge and the third weight edge are constructed to obtain the first path image.
  • adding mutually exclusive information to the first path image to construct the second path image includes:
  • the correct 3D real point, mutually exclusive edge, detection point, 3D hypothetical point, first weighted edge, second weighted edge, and third weighted edge are constructed to obtain the second path image.
  • performing path calculation on the second path image according to a preset algorithm to obtain a resultant path includes:
  • the shortest path is determined as the resulting path.
  • the embodiment of the present application provides a target tracking device, including:
  • a first determining module configured to determine human body detection images corresponding to human body regions in multiple pictures
  • a fusion module configured to perform feature fusion of the plurality of human detection images according to the association information between the plurality of human detection images, so as to construct a first path image
  • a building module configured to add mutually exclusive information to the first path image to construct a second path image
  • a calculation module configured to perform path calculation on the second path image according to a preset algorithm to obtain a resultant path
  • the second determining module is configured to determine the obtained result path as a tracking result.
  • an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor.
  • when the processor executes the computer program, the steps of the above-mentioned target tracking method are implemented.
  • the embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the above-mentioned target tracking method are implemented.
  • the target tracking method provided in the present application first determines the human body detection images corresponding to the human body regions in multiple pictures; performs feature fusion on the multiple human body detection images according to the association information between them, to construct a first path image; then adds mutually exclusive information to the first path image to construct a second path image; performs path calculation on the second path image according to a preset algorithm to obtain a result path; and finally determines the obtained result path as the tracking result. In this way, information fusion between multiple cameras becomes simpler, and tracking accuracy improves.
  • FIG. 1 is a schematic diagram of the overall flow of a target tracking method provided in an embodiment of the present application
  • FIG. 2 is a schematic flowchart of step S10 provided in the embodiment of the present application.
  • FIG. 3 is an example diagram of a target tracking method provided in an embodiment of the present application.
  • FIG. 4 is another example diagram of the target tracking method provided by the embodiment of the present application.
  • FIG. 5 is another example diagram of the target tracking method provided by the embodiment of the present application.
  • FIG. 6 is a schematic flowchart of step S20 provided in the embodiment of the present application.
  • FIG. 7 is another schematic flowchart of step S21 provided by the embodiment of the present application.
  • FIG. 8 is a schematic flowchart of step S25 provided in the embodiment of the present application.
  • FIG. 9 is a schematic flowchart of step S30 provided in the embodiment of the present application.
  • FIG. 10 is a structural block diagram of a target tracking device provided in an embodiment of the present application.
  • FIG. 11 is a structural block diagram of an electronic device provided by an embodiment of the present application.
  • a specific embodiment of the present application provides a target tracking method, including:
  • the multiple pictures can be multiple consecutive frames of the same video stream, which can be collected in real time by multiple cameras at different angles; the pictures collected by each camera include the person and the person's surroundings, and each camera captures a different angle of the human body. Therefore, by performing human body detection on the pictures corresponding to each camera's angle of the human body, and then determining the human body detection image corresponding to the human body region, the impact of occlusion on tracking can be reduced.
  • step S10 includes:
  • the Mahalanobis distance represents the distance between a point and a distribution, and the problem of inconsistent scales of each dimension can be eliminated through the Mahalanobis distance;
  • Intersection-over-Union (IoU) is the area of the overlapping part of two detection boxes divided by the area of their union; when the ratio is 1, the two detection boxes overlap completely.
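The IoU described above can be computed directly from the box coordinates. The sketch below assumes axis-aligned boxes in (x1, y1, x2, y2) form; the text does not specify a box representation, so this is only an illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Overlapping rectangle; width/height clamp to zero when boxes are disjoint.
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Identical boxes give a ratio of 1 and disjoint boxes give 0, matching the definition above.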
  • the human body regions in a single frame can be detected and corresponding detection boxes generated; for example, if there are two people in a single frame, two detection boxes can be generated, and matching the two boxes allows the IoU between them to be computed.
  • the corresponding detection point can be determined from the center point of each detection box, and a feature extraction model can extract the features inside the two detection boxes, so that the feature similarity between the two boxes can be obtained by computing the cosine distance between the features of each box.
  • because the IoU and the feature similarity use different measurement units, the Mahalanobis distance can be used to eliminate the influence of their different scales before computing the first weighted edge, and the corresponding human body detection image can then be constructed from the detection points and the first weighted edges.
  • each detection point can be denoted Vd (the concentric circle points in the figure) and the first weighted edge Ed (the thin lines in the figure), where the position of Vd is determined by the center point of the detection box; Ed can be obtained by the following formula:
  • the feature similarity can be obtained by extracting features from the detection boxes with a feature extraction model and then computing the cosine distance between the features; DM denotes the Mahalanobis distance.
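The formula for Ed appears as an image in the original publication and is not reproduced in this text. As an illustrative stand-in only, the sketch below fuses the two cues in the spirit described above: each cue is turned into a cost and scaled by an assumed per-cue variance, a diagonal special case of the Mahalanobis distance. The variances and the cost mapping are assumptions, not the patent's formula.

```python
from math import sqrt

def cosine_similarity(f1, f2):
    """Feature similarity between two detection-box feature vectors."""
    dot = sum(a * b for a, b in zip(f1, f2))
    norm1 = sqrt(sum(a * a for a in f1))
    norm2 = sqrt(sum(b * b for b in f2))
    return dot / (norm1 * norm2)

def first_weighted_edge(iou_val, feat_sim, var_iou=0.1, var_sim=0.1):
    """Illustrative Ed: diagonal-covariance Mahalanobis distance over the two
    cues, making them comparable despite different measurement units.
    var_iou and var_sim are assumed variances (e.g. from training pairs)."""
    cost_iou = 1.0 - iou_val      # high IoU -> low cost
    cost_sim = 1.0 - feat_sim     # high similarity -> low cost
    return sqrt(cost_iou ** 2 / var_iou + cost_sim ** 2 / var_sim)
```

Two boxes that fully overlap and carry identical features get edge weight 0; the weight grows as either cue weakens.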
  • step S20 includes:
  • different cameras correspond to the same frame during synchronized acquisition; for the two detection points corresponding to the same frame, a calculation per camera pair yields the epipolar distance between the two detection points, and the threshold can be preset. After the human body detection image has been constructed from the detection points and the first weighted edges, the detection point corresponding to each camera can be used to determine the corresponding epipolar line.
  • the epipolar constraint means that the two cameras have imaging planes in three-dimensional space; the same scene point projects onto the two imaging planes as two projection points, the plane through the two projection points and the scene point is the epipolar plane, and the two lines where this plane intersects the two imaging planes are the epipolar lines.
  • the nearest points between the two epipolar lines can be determined, their average value calculated, and the three-dimensional hypothetical point determined from that average.
  • the three-dimensional hypothetical point may not correspond to a real point when the epipolar matching is wrong or the camera's intrinsic and extrinsic parameters contain errors; it can be understood that when one camera contains two detection points, two three-dimensional hypothetical points can be determined, and when both cameras contain two detection points, four three-dimensional hypothetical points can be generated.
  • it can be understood that a corresponding three-dimensional hypothetical point can be generated for each frame of pictures, so that a connection between multiple cameras is established through the three-dimensional hypothetical points.
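The "average of the nearest points" step can be sketched as plain midpoint triangulation: each detection back-projects to a ray in 3D, and the three-dimensional hypothetical point is the midpoint of the shortest segment between the two rays. Obtaining the rays from the camera's intrinsic and extrinsic parameters is outside this snippet, so the ray origins and directions are assumed inputs.

```python
def hypothetical_point(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + t*d1 and o2 + s*d2.
    Returns None for (near-)parallel rays, where no unique closest pair exists."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    w0 = [x - y for x, y in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None
    t = (b * e - c * d) / denom    # parameter of closest point on ray 1
    s = (a * e - b * d) / denom    # parameter of closest point on ray 2
    p1 = [o + t * v for o, v in zip(o1, d1)]
    p2 = [o + s * v for o, v in zip(o2, d2)]
    return [(x + y) / 2 for x, y in zip(p1, p2)]
```

For two skew rays along the x and y axes separated by one unit in z, the hypothetical point lands halfway between them.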
  • step S21 includes:
  • the second weighted edge (dotted lines in the figure) is a weighted edge between a detection point Vd (concentric circle points in the figure) and a three-dimensional hypothetical point Vh (circle points in the figure). Using the feature similarity described above, the feature similarity corresponding to each camera can be determined for the same frame of the two cameras, together with the epipolar distance between the two detection points obtained from the calculation above; the Mahalanobis distance can then be used to eliminate the influence of the different scales of the epipolar distance and the feature similarity before computing the second weighted edge.
  • the second weight edge can be calculated using the following formula:
  • DM represents the Mahalanobis distance; since the distance and the similarity have different measurement units, the Mahalanobis distance is used to eliminate the influence of the different units, and the second weighted edge is thus calculated.
  • step S25 includes:
  • the Euclidean distance between the three-dimensional hypothetical points can be determined;
  • the Euclidean distance, together with the first weighted edge and the second weighted edge described above, is calculated with the Mahalanobis distance to eliminate the influence of their different measurement units, and the third weighted edge can thus be determined.
  • each detection point corresponds to a detection box, and each detection box represents a person. Each camera pair can generate corresponding three-dimensional hypothetical points, but when the cameras' intrinsic and extrinsic parameters contain errors, the generated hypothetical points will also contain errors, so a mutual exclusion relationship must be added to the first path image to determine which generated hypothetical points are wrong. For example, if camera 1 contains detection point A and camera 2 contains detection points A and B, then detection point A in camera 1 and detection point A in camera 2 correspondingly generate a three-dimensional hypothetical point AA, while detection point A in camera 1 and detection point B in camera 2 correspondingly generate a three-dimensional hypothetical point AB; if hypothetical point AA is determined to be correct, hypothetical point AB can simultaneously be determined to be wrong, and AB can then be removed through the mutually exclusive information.
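The AA/AB example above can be sketched as a greedy conflict resolution: hypothetical points that share a detection point are mutually exclusive, so within each conflict group only the best-supported hypothesis survives. The tuple layout and the score-based greedy choice are illustrative assumptions; the patent resolves these conflicts through mutually exclusive edges in the path image rather than a standalone routine.

```python
def resolve_exclusions(hypotheses):
    """hypotheses: list of (detection_cam1, detection_cam2, score) tuples,
    one per 3D hypothetical point. Hypotheses sharing a detection point
    cannot all be real; keep the best-scoring one per conflict group."""
    kept, removed = [], []
    used = set()
    for h in sorted(hypotheses, key=lambda h: h[2], reverse=True):
        det1, det2, _ = h
        if ("cam1", det1) in used or ("cam2", det2) in used:
            removed.append(h)   # mutually exclusive with a kept hypothesis
        else:
            used.update([("cam1", det1), ("cam2", det2)])
            kept.append(h)
    return kept, removed
```

With camera 1 holding detection A and camera 2 holding detections A and B, the well-supported hypothesis AA is kept and AB is removed, matching the example above.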
  • step S30 includes:
  • when the three-dimensional hypothetical points are generated, mutually exclusive edges are added correspondingly to generate the corresponding three-dimensional real points; at the same time, it is judged whether each hypothetical point is correct, and wrong hypothetical points are removed, leaving the correct three-dimensional hypothetical points, three-dimensional real points and mutually exclusive edges. It can be understood that whether a hypothetical point generated by two cameras is correct can be judged through the second weighted edge, or by directly performing feature matching on the detection points.
  • since the second weighted edge contains the feature similarity between the two detection points, the correctness of an obtained hypothetical point can be determined from the feature similarity of the two detection points, or from whether the two detection points match; this avoids errors in camera tracking and ensures more accurate information fusion between multiple cameras.
  • the preset algorithm is the minimum cost maximum flow algorithm
  • the minimum cost maximum flow algorithm means that by selecting the path and allocating the flow passing through the path, the minimum cost can be achieved under the premise of the maximum flow
  • the three-dimensional real point determined in the starting frame can be used as the start of the path, and the three-dimensional real point determined in the ending frame as its end; since the second path image contains multiple paths, computing over it with the preset algorithm determines the result path as the tracking result, making tracking less difficult and more accurate.
  • the above path calculation on the second path image according to the preset algorithm to obtain the result path includes: performing calculations in the second path image according to the minimum-cost maximum-flow algorithm to determine the shortest path, and determining the shortest path as the result path.
  • the second path image Gr includes the detection points Vd, the three-dimensional hypothetical points Vh, the three-dimensional real points Vr, the first weighted edges Ed, the second weighted edges Eh1, the third weighted edges Eh2, and the mutually exclusive edges Er; by taking the three-dimensional real point in the starting frame as the start of the path and the three-dimensional real point in the ending frame as its end, the shortest path between the three-dimensional real points is determined, which determines the result path in the second path image and makes camera tracking less difficult and more accurate.
  • the shortest path can be determined from the figure: three-dimensional real point (upper black dot in the figure), mutually exclusive edge (upper dotted line), three-dimensional hypothetical point (upper circle), third weighted edge, three-dimensional hypothetical point (lower circle), mutually exclusive edge (lower dotted line), three-dimensional real point (lower black dot); since the shortest path must be computed with the minimum-cost maximum-flow algorithm, this path is given only as an example and imposes no limitation.
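The minimum-cost maximum-flow computation itself is standard and can be sketched with successive shortest augmenting paths. How the patent encodes detection points, hypothetical/real points, and the various weighted and mutually exclusive edges into capacities and costs is not detailed here, so the sketch takes a generic graph of (u, v, capacity, cost) arcs.

```python
from collections import deque

def min_cost_max_flow(n, edges, s, t):
    """Successive shortest paths: repeatedly push flow along the cheapest
    residual s-t path (found with SPFA/Bellman-Ford, since residual arcs
    can carry negative cost) until no augmenting path remains.
    Returns (max_flow, min_cost)."""
    graph = [[] for _ in range(n)]   # adjacency: [to, capacity, cost, rev_idx]
    for u, v, cap, cost in edges:
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    flow = total_cost = 0
    while True:
        dist = [float("inf")] * n
        dist[s] = 0
        prev = [None] * n
        in_queue = [False] * n
        queue = deque([s])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for i, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, i)
                    if not in_queue[v]:
                        in_queue[v] = True
                        queue.append(v)
        if dist[t] == float("inf"):
            break
        # Bottleneck capacity along the cheapest path, then augment.
        push, v = float("inf"), t
        while v != s:
            u, i = prev[v]
            push = min(push, graph[u][i][1])
            v = u
        v = t
        while v != s:
            u, i = prev[v]
            graph[u][i][1] -= push
            graph[v][graph[u][i][3]][1] += push
            v = u
        flow += push
        total_cost += push * dist[t]
    return flow, total_cost
```

On a toy graph with two unit-capacity disjoint routes from source 0 to sink 3, the algorithm saturates both, as a path image with two track candidates would be.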
  • the person is tracked according to the shortest path, which improves tracking accuracy, reduces the impact of occlusion when multiple cameras track and capture the person, and makes information fusion among the cameras easier.
  • the target tracking method provided by this application first determines the human body detection images corresponding to the human body regions in multiple pictures; then, according to the association information between the multiple human body detection images, performs feature fusion on them to construct the first path image; then adds mutually exclusive information to the first path image to construct the second path image; performs path calculation on the second path image according to a preset algorithm to obtain the result path; and finally determines the obtained result path as the tracking result.
  • in this way, information fusion between multiple cameras becomes simpler, and tracking accuracy improves.
  • the embodiment of the present application provides a target tracking device 10, including:
  • the first determination module 11 is used to determine the human body detection image corresponding to the human body area in multiple pictures;
  • the fusion module 12 is used to perform feature fusion of multiple human detection images according to the association information between the multiple human detection images, so as to construct the first path image;
  • a construction module 13 configured to add mutually exclusive information to the first path image to construct a second path image
  • a calculation module 14 configured to perform path calculation on the second path image according to a preset algorithm to obtain a resultant path
  • the second determining module 15 is configured to determine the obtained result path as the tracking result.
  • the target tracking device 10 provided by the present application first determines the human body detection images corresponding to the human body regions in multiple pictures; performs feature fusion on the multiple human body detection images according to the association information between them, to construct the first path image; then adds mutually exclusive information to the first path image to construct the second path image; performs path calculation on the second path image according to a preset algorithm to obtain a result path; and finally determines the obtained result path as the tracking result. This makes information fusion between multiple cameras easier and improves tracking accuracy.
  • the target tracking device 10 provided in the specific embodiment of the present application corresponds to the above-mentioned target tracking method, and all embodiments of that method are applicable to the target tracking device 10.
  • the above device embodiment has modules corresponding to the steps of the above target tracking method and can achieve the same or similar beneficial effects; to avoid excessive repetition, the modules of the target tracking device 10 are not described in detail here.
  • the specific embodiment of the present application also provides an electronic device 20, including a memory 202, a processor 201, and a computer program stored in the memory 202 and operable on the processor 201.
  • when the processor 201 executes the computer program, the steps of the above-mentioned target tracking method are implemented.
  • the processor 201 is used to call the computer program stored in the memory 202, and perform the following steps:
  • feature fusion is performed on the multiple human detection images to construct the first path image
  • the determination of the human body detection images corresponding to the human body regions in the multiple pictures performed by the processor 201 includes:
  • the detection points and the first weighted edge are constructed to obtain a human body detection image.
  • the process performed by the processor 201 to perform feature fusion of multiple human detection images according to the association information between the multiple human detection images, so as to construct the first path image includes:
  • the three-dimensional hypothetical point and the first weight edge are calculated, and the calculation result is combined with the detection point, the three-dimensional hypothetical point and the first weight edge to construct the first path image.
  • the processor 201 determines the epipolar distance between the detection points corresponding to the same frame of at least two cameras according to the human body detection image, it includes:
  • the human body detection image determine the feature similarity corresponding to each camera
  • the epipolar distance and feature similarity are calculated based on the Mahalanobis distance to determine the corresponding second weight edge.
  • the processor 201 executes the calculation of the three-dimensional hypothetical point and the first weighted edge, and constructs the first path image by combining the calculation result with the detection point, the three-dimensional hypothetical point, and the first weighted edge, including:
  • the detection point, the three-dimensional hypothetical point, the first weight edge, the second weight edge and the third weight edge are constructed to obtain the first path image.
  • adding mutual exclusion information to the first path image to construct the second path image performed by the processor 201 includes:
  • the correct 3D real point, mutually exclusive edge, detection point, 3D hypothetical point, first weighted edge, second weighted edge, and third weighted edge are constructed to obtain a second path image.
  • the path calculation performed by the processor 201 on the second path image according to a preset algorithm to obtain a resultant path includes:
  • the steps of the above-mentioned target tracking method can be implemented, thereby making information fusion between multiple cameras simpler and improving tracking accuracy.
  • since the processor 201 of the electronic device 20 executes the computer program to implement the steps of the above target tracking method, all embodiments of that method are applicable to the electronic device 20 and can achieve the same or similar beneficial effects.
  • the computer-readable storage medium provided in the embodiment of the present application stores a computer program.
  • when the computer program is executed by a processor, each process of the target tracking method provided in the embodiments of the present application is implemented and the same technical effect can be achieved; to avoid repetition, details are not repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Studio Devices (AREA)

Abstract

Disclosed in the present application are a target tracking method and apparatus, and an electronic device and a storage medium. The method comprises: determining human body detection images corresponding to human body regions in a plurality of pictures; according to association information between the plurality of human body detection images, performing feature fusion on the plurality of human body detection images so as to construct a first path image; adding mutual exclusion information to the first path image so as to construct a second path image; performing path calculation on the second path image according to a preset algorithm so as to obtain a result path; and determining the obtained result path as a tracking result. The present application may make information fusion between a plurality of cameras simpler, thereby improving the accuracy of tracking.

Description

Target tracking method, device, electronic device and storage medium
This application claims priority to the Chinese patent application with application number 202111674242.5, entitled "Target tracking method, device, electronic equipment and storage medium", filed with the China Patent Office on December 31, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of image processing, and in particular to a target tracking method, device, electronic device and storage medium.
Background
With the development and progress of artificial intelligence technology, human body tracking technology has been widely applied across social life, for example face capture in public security and face archiving in business scenarios. However, most existing tracking technologies perform tracking based on human body detection under a single camera; once the human body is occluded, detection fails, which greatly degrades tracking accuracy. When multiple cameras are used to capture the human body instead, fusing information between the cameras is extremely difficult, so tracking accuracy remains insufficient.
Content of the Application
In the first aspect, the main purpose of this application is to provide a target tracking method, including:
determining human body detection images corresponding to human body regions in multiple pictures;
performing feature fusion on the multiple human body detection images according to the association information between them, to construct a first path image;
adding mutually exclusive information to the first path image to construct a second path image;
performing path calculation on the second path image according to a preset algorithm to obtain a result path; and
determining the obtained result path as the tracking result.
Optionally, determining the human body detection images corresponding to the human body regions in the multiple pictures includes:
acquiring pictures collected by multiple cameras;
performing human body detection on the human body region in each picture to generate a corresponding detection box;
determining corresponding detection points according to the detection boxes, and determining the intersection-over-union and the feature similarity between at least two detection boxes in each picture;
calculating the intersection-over-union and the feature similarity based on the Mahalanobis distance to determine a corresponding first weighted edge; and
constructing the human body detection image from the detection points and the first weighted edges.
Optionally, performing feature fusion on the multiple human body detection images according to the association information between them, to construct the first path image, includes:
determining, according to the human body detection images, the epipolar distance between detection points corresponding to the same frame of at least two cameras;
judging whether the epipolar distance is less than a preset threshold;
when the epipolar distance is less than the preset threshold, calculating the average value of the nearest points between the epipolar lines;
determining a three-dimensional hypothetical point according to the average value; and
performing a calculation on the three-dimensional hypothetical points and the first weighted edges, and combining the calculation result with the detection points, the three-dimensional hypothetical points and the first weighted edges to construct the first path image.
Optionally, after determining the epipolar distance between the detection points corresponding to the same frame of at least two cameras according to the human body detection images, the method includes:
determining, according to the human body detection images, the feature similarity corresponding to each camera; and
calculating the epipolar distance and the feature similarity based on the Mahalanobis distance to determine a corresponding second weighted edge.
可选地，所述将所述三维假设点及所述第一权重边进行计算，并将计算结果与所述检测点、三维假设点及所述第一权重边构建得到所述第一路径图像，包括：Optionally, the calculating the three-dimensional hypothetical point and the first weighted edge, and constructing the calculation result with the detection point, the three-dimensional hypothetical point and the first weighted edge to obtain the first path image, includes:
确定至少两个相机中相同帧对应的三维假设点之间的欧式距离;determining a Euclidean distance between three-dimensional hypothetical points corresponding to the same frame in at least two cameras;
基于马氏距离对所述欧式距离、第一权重边、所述第二权重边进行计算,以确定出所述三维假设点对应的第三权重边;Calculate the Euclidean distance, the first weight edge, and the second weight edge based on the Mahalanobis distance to determine a third weight edge corresponding to the three-dimensional hypothetical point;
将所述检测点、三维假设点、第一权重边、第二权重边及所述第三权重边进行构建以得到所述第一路径图像。The detection point, the three-dimensional hypothetical point, the first weight edge, the second weight edge and the third weight edge are constructed to obtain the first path image.
可选地,所述在所述第一路径图像中添加互斥信息以构建出第二路径图像包括:Optionally, adding mutually exclusive information to the first path image to construct the second path image includes:
将所述三维假设点添加互斥边以生成对应的三维真实点;其中,所述三维假设点与所述三维真实点之间的三维信息相同且类别信息不同;Adding a mutually exclusive edge to the three-dimensional hypothetical point to generate a corresponding three-dimensional real point; wherein, the three-dimensional information between the three-dimensional hypothetical point and the three-dimensional real point is the same and the category information is different;
对所述三维假设点进行判断以确定出正确的三维真实点及互斥边;Judging the three-dimensional hypothetical point to determine the correct three-dimensional real point and mutually exclusive edge;
将所述正确的三维真实点、互斥边以及所述检测点、三维假设点、第一权重边、第二权重边、第三权重边进行构建以得到所述第二路径图像。The correct 3D real point, mutually exclusive edge, detection point, 3D hypothetical point, first weighted edge, second weighted edge, and third weighted edge are constructed to obtain the second path image.
可选地,所述根据预设算法对所述第二路径图像进行路径计算,以得到结果路径包括:Optionally, performing path calculation on the second path image according to a preset algorithm to obtain a resultant path includes:
对所述第二路径图像按最小成本最大流算法进行计算，以确定出最短路径；Calculating the second path image according to the minimum-cost maximum-flow algorithm to determine the shortest path;
将所述最短路径确定为结果路径。The shortest path is determined as the resulting path.
第二方面，本申请实施例提供了一种目标跟踪装置，包括：In a second aspect, an embodiment of the present application provides a target tracking apparatus, including:
第一确定模块,用于确定多张图片中的人体区域对应的人体检测图像;A first determining module, configured to determine human body detection images corresponding to human body regions in multiple pictures;
融合模块,用于根据所述多个人体检测图像之间的关联信息,将所述多个人体检测图像进行特征融合,以构建出第一路径图像;A fusion module, configured to perform feature fusion of the plurality of human detection images according to the association information between the plurality of human detection images, so as to construct a first path image;
构建模块,用于在所述第一路径图像中添加互斥信息以构建出第二路径图像;A building module, configured to add mutually exclusive information to the first path image to construct a second path image;
计算模块,用于根据预设算法对所述第二路径图像进行路径计算,以得到结果路径;A calculation module, configured to perform path calculation on the second path image according to a preset algorithm to obtain a resultant path;
第二确定模块,用于将得到的所述结果路径确定为跟踪结果。The second determining module is configured to determine the obtained result path as a tracking result.
第三方面,本申请实施例提供了一种电子设备,包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时实现如上述的目标跟踪方法的步骤。In a third aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor. When the processor executes the computer program, Steps for realizing the above-mentioned target tracking method.
第四方面,本申请实施例提供了一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时实现如上述的目标跟踪方法的步骤。In a fourth aspect, the embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of the above-mentioned target tracking method are implemented.
本申请的上述方案至少包括以下有益效果:The above-mentioned scheme of the present application at least includes the following beneficial effects:
本申请提供的目标跟踪方法，首先确定多张图片中的人体区域对应的人体检测图像；并根据所述多个人体检测图像之间的关联信息，将所述多个人体检测图像进行特征融合，以构建出第一路径图像；然后在所述第一路径图像中添加互斥信息以构建出第二路径图像；根据预设算法对所述第二路径图像进行路径计算，以得到结果路径；最后将得到的所述结果路径确定为跟踪结果；由此可以使得多个相机之间的信息融合更简单，提升了跟踪的准确度。The target tracking method provided in the present application firstly determines the human body detection images corresponding to the human body regions in multiple pictures; performs feature fusion on the multiple human body detection images according to the correlation information between the multiple human body detection images, to construct a first path image; then adds mutually exclusive information to the first path image to construct a second path image; performs path calculation on the second path image according to a preset algorithm to obtain a resultant path; and finally determines the obtained result path as a tracking result. Thus, the information fusion between multiple cameras can be made simpler, and the tracking accuracy can be improved.
附图说明Description of drawings
为了更清楚地说明本申请实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图示出的结构获得其他的附图。In order to more clearly illustrate the technical solutions in the embodiments of the present application or the prior art, the following will briefly introduce the drawings that need to be used in the description of the embodiments or the prior art. Obviously, the accompanying drawings in the following description are only These are some embodiments of the present application, and those skilled in the art can also obtain other drawings according to the structures shown in these drawings without creative effort.
图1为本申请实施例提供的目标跟踪方法的整体流程示意图;FIG. 1 is a schematic diagram of the overall flow of a target tracking method provided in an embodiment of the present application;
图2为本申请实施例提供的步骤S10的具体流程示意图;FIG. 2 is a schematic flowchart of step S10 provided in the embodiment of the present application;
图3为本申请实施例提供的目标跟踪方法的示例图;FIG. 3 is an example diagram of a target tracking method provided in an embodiment of the present application;
图4为本申请实施例提供的目标跟踪方法的另一示例图;FIG. 4 is another example diagram of the target tracking method provided by the embodiment of the present application;
图5为本申请实施例提供的目标跟踪方法的又一示例图;FIG. 5 is another example diagram of the target tracking method provided by the embodiment of the present application;
图6为本申请实施例提供的步骤S20的具体流程示意图;FIG. 6 is a schematic flowchart of step S20 provided in the embodiment of the present application;
图7为本申请实施例提供的步骤S21的另一流程示意图;FIG. 7 is another schematic flowchart of step S21 provided by the embodiment of the present application;
图8为本申请实施例提供的步骤S25的具体流程示意图;FIG. 8 is a schematic flowchart of step S25 provided in the embodiment of the present application;
图9为本申请实施例提供的步骤S30的具体流程示意图;FIG. 9 is a schematic flowchart of step S30 provided in the embodiment of the present application;
图10为本申请实施例提供的目标跟踪装置的结构框图;FIG. 10 is a structural block diagram of a target tracking device provided in an embodiment of the present application;
图11为本申请实施例提供的电子设备的结构框图。FIG. 11 is a structural block diagram of an electronic device provided by an embodiment of the present application.
具体实施方式Detailed ways
如图1所示,本申请的具体实施例提供了一种目标跟踪方法,包括:As shown in Figure 1, a specific embodiment of the present application provides a target tracking method, including:
S10、确定多张图片中的人体区域对应的人体检测图像。S10. Determine human body detection images corresponding to human body regions in the multiple pictures.
其中，多张图片可以是同一视频流中的多张连续帧图片，视频流可以是由多个相机在不同角度实时采集得到的，每个相机所采集的图片中包含人物以及人物周边的环境，并且每个相机对应有人物的不同人体角度，因此，通过将每个相机对应不同人体角度的图片进行人体检测，进而确定出与人体区域对应的人体检测图像，可以减少人体遮挡对跟踪的影响。The multiple pictures may be multiple consecutive frames from the same video stream, and the video stream may be collected in real time by multiple cameras at different angles. The picture collected by each camera contains the person as well as the person's surroundings, and each camera corresponds to a different angle of the human body. Therefore, by performing human body detection on the pictures of each camera at its angle and then determining the human body detection image corresponding to the human body region, the influence of human body occlusion on tracking can be reduced.
如图2所示,上述步骤S10的具体实现方式包括:As shown in Figure 2, the specific implementation of the above step S10 includes:
S11、获取多个相机采集的图片;S11. Obtain pictures collected by multiple cameras;
S12、对每张图片中的人体区域进行人体检测,以生成对应的检测框;S12. Perform human body detection on the human body area in each picture to generate a corresponding detection frame;
S13、根据检测框确定出对应的检测点,并确定每张图片中至少两个检测框之间的交并比及特征相似度;S13. Determine the corresponding detection point according to the detection frame, and determine the intersection ratio and feature similarity between at least two detection frames in each picture;
S14、基于马氏距离对交并比和特征相似度进行计算,以确定出对应的第一权重边;S14. Calculate the intersection-union ratio and the feature similarity based on the Mahalanobis distance, so as to determine the corresponding first weight edge;
S15、将检测点和第一权重边进行构建以得到人体检测图像。S15. Construct the detection points and the first weighted edge to obtain a human detection image.
在本实施例中，马氏距离表示点与一个分布之间的距离，通过马氏距离可以消除各个维度尺度不一致的问题；交并比（Intersection-over-Union，IoU）表示两个检测框重叠部分的面积除以两个检测框并集部分的面积得出的结果，在比值为1的情况下，则表示两个检测框完全重叠。在对每个相机中对应的图片进行人体检测时，可以将单帧图片中的人体区域进行检测，并分别生成对应的检测框，例如单帧图片中有两个人物，则可以对应生成两个检测框，通过将两个检测框进行匹配，进而计算出两个检测框之间的交并比；并且，可以通过检测框的中心点确定出对应的检测点，以及采用特征提取模型分别提取出两个检测框中对应的特征，由此可以通过计算两个检测框中特征的余弦距离，以得到两个检测框之间的特征相似度；在计算出两个检测框之间的交并比和特征相似度后，由于交并比和特征相似度之间的度量单位不同，因此可以通过马氏距离消除交并比和特征相似度之间尺度不同的影响，进而计算出第一权重边，通过检测点和第一权重边则可以构建出对应的人体检测图像。In this embodiment, the Mahalanobis distance represents the distance between a point and a distribution, and eliminates the problem of inconsistent scales across dimensions. The intersection-over-union (IoU) is the area of overlap between two detection frames divided by the area of their union; a ratio of 1 indicates that the two detection frames completely overlap. When performing human body detection on the pictures of each camera, the human body regions in a single frame are detected and corresponding detection frames are generated; for example, if a single frame contains two people, two detection frames are generated. By matching the two detection frames, the IoU between them can be calculated. Moreover, the corresponding detection point can be determined from the center point of each detection frame, and a feature extraction model can extract the features of the two detection frames, so that the feature similarity between them can be obtained by computing the cosine distance of the features. After the IoU and feature similarity between the two detection frames are calculated, since they are expressed in different measurement units, the Mahalanobis distance can be used to eliminate the effect of their different scales, thereby calculating the first weighted edge; the corresponding human body detection image can then be constructed from the detection points and the first weighted edges.
如图3所示，可以将每个检测点设定为V_d（图中同心圆点），将第一权重边设定为E_d（图中细线），其中V_d的位置可以由检测框的中心点得到，E_d可以由下面公式得到：As shown in FIG. 3, each detection point can be set as V_d (the concentric dots in the figure), and the first weighted edge can be set as E_d (the thin lines in the figure), where the position of V_d is obtained from the center point of the detection frame, and E_d can be obtained by the following formula:

E_d = D_M(IoU(b_i, b_j), sim(f_i, f_j))

可以理解的是，IoU(b_i, b_j)代表两个检测框的交并比大小；sim(f_i, f_j)代表两个检测框之间的特征相似度，特征相似度可以通过用特征提取模型对检测框提取特征，然后计算特征的余弦距离得到；D_M代表马氏距离。通过马氏距离计算出所有检测点和第一权重边后，即可以通过检测点和第一权重边构建出人体检测图像G_d=(V_d, E_d)；并且，每个相机可以单独构建其对应的人体检测图像。It can be understood that IoU(b_i, b_j) represents the intersection-over-union of the two detection frames; sim(f_i, f_j) represents the feature similarity between the two detection frames, which can be obtained by extracting features from the detection frames with a feature extraction model and then computing the cosine distance of the features; and D_M denotes the Mahalanobis distance. After all detection points and first weighted edges are calculated via the Mahalanobis distance, the human body detection image G_d = (V_d, E_d) can be constructed from the detection points and first weighted edges; moreover, each camera can construct its own corresponding human body detection image.
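As a non-limiting sketch of the computation described above, the IoU, the cosine feature similarity, and their fusion into the first weighted edge may be prototyped as follows. The diagonal mean/variance parameters of the Mahalanobis distance are illustrative assumptions; the patent does not specify how the distribution parameters are estimated.

```python
import math

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def cosine_similarity(f1, f2):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(f1, f2))
    n1 = math.sqrt(sum(x * x for x in f1))
    n2 = math.sqrt(sum(x * x for x in f2))
    return dot / (n1 * n2) if n1 > 0 and n2 > 0 else 0.0

def mahalanobis_2d(x, mean, var):
    """Mahalanobis distance of a 2-d vector under an assumed diagonal covariance."""
    return math.sqrt(sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var)))

def first_weight_edge(box_i, box_j, feat_i, feat_j, mean=(0.5, 0.5), var=(0.1, 0.1)):
    """E_d: place IoU and feature similarity on a common (Mahalanobis) scale."""
    return mahalanobis_2d((iou(box_i, box_j), cosine_similarity(feat_i, feat_j)),
                          mean, var)
```

For identical boxes and identical features the fused vector is (1.0, 1.0), so with the assumed parameters the edge weight is sqrt(5).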
S20、根据多个人体检测图像之间的关联信息,将多个人体检测图像进行特征融合,以构建出第一路径图像。S20. According to the association information between the multiple human detection images, perform feature fusion on the multiple human detection images to construct a first path image.
如图6所示,上述步骤S20具体实现方式包括:As shown in Figure 6, the specific implementation of the above step S20 includes:
S21、根据人体检测图像,确定至少两个相机的相同帧对应的检测点之间的极线距离;S21. Determine the epipolar distance between the detection points corresponding to the same frame of at least two cameras according to the human body detection image;
S22、判断极线距离是否小于预设阈值;S22. Determine whether the epipolar distance is smaller than a preset threshold;
S23、当极线距离小于预设阈值时,计算极线之间最近点的平均值;S23. When the epipolar line distance is less than the preset threshold, calculate the average value of the closest point between the epipolar lines;
S24、根据平均值确定出三维假设点;S24. Determine the three-dimensional hypothetical point according to the average value;
S25、将三维假设点及第一权重边进行计算,并将计算结果与检测点、三维假设点及第一权重边构建得到第一路径图像。S25. Calculate the three-dimensional hypothetical point and the first weighted edge, and construct the calculation result with the detection point, the three-dimensional hypothetical point, and the first weighted edge to obtain a first path image.
在本实施例中，不同相机同步采集时对应有相同帧，分别将每个相机中相同帧对应的两个检测点进行计算，即可得到两个检测点之间的极线距离，并且，预设阈值可以是预先设定的；在通过检测点和第一权重边构建出人体检测图像后，可以将每个相机对应的检测点采用对极约束确定出对应的极线；对极约束表示两个相机在三维空间中各具有成像平面，同一检测点可以投影至两个成像平面上形成两个投影点，将两个投影点和检测点连接则可以确定出极平面，极平面和两个成像平面之间的两条相交线即为极线，由于两条极线为假定相交状态，因此可以在两条极线上确定出距离最近的点，通过计算最近点之间的平均值，进而可以确定出三维假设点；当然，在极线误匹配或相机的内外参存在误差的情况下，该三维假设点则不存在。可以理解的是，当某一相机中的检测点有两个时，则三维假设点也可以确定出两个；当两个相机中的检测点均有两个时，则三维假设点可以生成四个，由于部分三维假设点可能不存在，因此三维假设点也可能只生成两个；可以理解的是，可以对应每一帧图片生成对应的三维假设点，以通过三维假设点建立多个相机之间的联系。In this embodiment, different cameras acquire the same frames when capturing synchronously. By computing, for the same frame, the two corresponding detection points of the cameras, the epipolar distance between them can be obtained; the preset threshold may be set in advance. After the human body detection image is constructed from the detection points and the first weighted edges, the epipolar line corresponding to each camera's detection point can be determined using the epipolar constraint. The epipolar constraint means that the two cameras each have an imaging plane in three-dimensional space; the same detection point can be projected onto the two imaging planes to form two projection points, and connecting the two projection points with the detection point determines the epipolar plane. The two lines where the epipolar plane intersects the two imaging planes are the epipolar lines. Since the two epipolar lines are assumed to intersect, the closest points on the two epipolar lines can be determined, and the average of these closest points then yields the three-dimensional hypothetical point. Of course, in the case of epipolar mismatching or errors in the intrinsic and extrinsic parameters of the cameras, the three-dimensional hypothetical point does not exist. It can be understood that when a certain camera has two detection points, two three-dimensional hypothetical points can be determined; when both cameras have two detection points each, four three-dimensional hypothetical points can be generated, and since some of them may not exist, only two may actually be generated. It can also be understood that a corresponding three-dimensional hypothetical point can be generated for each frame, so as to establish the association between multiple cameras through the three-dimensional hypothetical points.
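The averaging of the closest points between the two (assumed intersecting) lines can be sketched as below. The back-projection of each detection point into a 3-D ray (origin `o`, direction `u`) is assumed to come from the camera calibration, which the patent does not detail; the closest-point formula is the standard one for two lines in space.

```python
import math

def closest_points_between_lines(o1, u, o2, v):
    """Closest points on the two 3-D lines o1 + t*u and o2 + s*v."""
    w0 = [a - b for a, b in zip(o1, o2)]
    dot = lambda x, y: sum(p * q for p, q in zip(x, y))
    a, b, c = dot(u, u), dot(u, v), dot(v, v)
    d, e = dot(u, w0), dot(v, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:           # parallel lines: fix t = 0 on line 1
        t, s = 0.0, e / c
    else:
        t = (b * e - c * d) / denom
        s = (a * e - b * d) / denom
    p1 = [o + t * ui for o, ui in zip(o1, u)]
    p2 = [o + s * vi for o, vi in zip(o2, v)]
    return p1, p2

def hypothesis_point(o1, u, o2, v, threshold):
    """3-D hypothetical point: midpoint of the closest points, if close enough."""
    p1, p2 = closest_points_between_lines(o1, u, o2, v)
    gap = math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))
    if gap >= threshold:
        return None                   # mismatch or calibration error: no point exists
    return [(a + b) / 2 for a, b in zip(p1, p2)]
```

Returning `None` mirrors the case above in which the hypothetical point does not exist because the two lines pass too far apart.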
如图7所示,上述步骤S21之后包括:As shown in Figure 7, after the above step S21 includes:
S211、根据人体检测图像，确定每个相机对应的特征相似度；S211. Determine the feature similarity corresponding to each camera according to the human body detection image;
S212、基于马氏距离对极线距离和特征相似度进行计算,以确定出对应的第二权重边。S212. Calculate the epipolar distance and feature similarity based on the Mahalanobis distance, so as to determine the corresponding second weight edge.
在本实施例中,第二权重边(图中虚线)表示为检测点V d(图中同心圆点)和三维假设点V h(图中圆点)之间的权重边,可以通过上述的特征相似度,确定出每个相机对应的特征相似度,并且该特征相似度可以与两个相机的同一帧相对应,通过上述计算得到的每个相机中两个检测点之间的极线距离,进而可以通过马氏距离消除极线距离和特征相似度之间尺度不同的影响,进而计算出第二权重边。 In this embodiment, the second weight edge (dotted line in the figure) is represented as a weight edge between the detection point V d (concentric circle point in the figure) and the three-dimensional hypothetical point V h (circle point in the figure), which can be obtained through the above-mentioned Feature similarity, determine the feature similarity corresponding to each camera, and the feature similarity can correspond to the same frame of the two cameras, the epipolar distance between the two detection points in each camera obtained through the above calculation , and then the Mahalanobis distance can be used to eliminate the influence of different scales between the epipolar distance and the feature similarity, and then calculate the second weight edge.
可选地，第二权重边可以采用以下公式计算得到：Optionally, the second weighted edge can be calculated using the following formula:

E_h1 = D_M(d_epi(p_i, p_j), sim(f_i, f_j))

其中，d_epi(p_i, p_j)代表不同相机同一帧的两个检测点的极线距离；sim(f_i, f_j)代表每个相机同一帧的两个检测框的特征相似度；D_M表示马氏距离，由于距离和相似度属于不同度量单位，所以使用马氏距离来消除度量单位不同的影响，由此计算出第二权重边。Here, d_epi(p_i, p_j) represents the epipolar distance between two detection points of the same frame from different cameras; sim(f_i, f_j) represents the feature similarity of the two detection frames of the same frame of each camera; and D_M denotes the Mahalanobis distance. Since distance and similarity are expressed in different measurement units, the Mahalanobis distance is used to eliminate the influence of the differing units, thereby calculating the second weighted edge.
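The epipolar-distance term entering the second weighted edge can be sketched as the point-to-epipolar-line distance, assuming a known fundamental matrix F between the two cameras; the patent derives the epipolar lines from the epipolar constraint but does not fix a parameterization, so F here is an illustrative assumption.

```python
import math

def epipolar_distance(F, x1, x2):
    """Distance from point x2 (image 2) to the epipolar line F @ x1 of x1 (image 1).

    F is a 3x3 fundamental matrix given as nested lists; x1 and x2 are
    homogeneous image points (x, y, 1)."""
    # Epipolar line l = F @ x1 in image 2, with line coefficients (a, b, c).
    l = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    num = abs(sum(l[i] * x2[i] for i in range(3)))   # |a*x + b*y + c|
    den = math.hypot(l[0], l[1])                     # sqrt(a^2 + b^2)
    return num / den if den > 0 else float("inf")
```

This scalar, together with the feature similarity, would then be fused with the Mahalanobis distance in the same way as the first weighted edge.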
如图8所示,上述步骤S25的具体实现方式包括:As shown in Figure 8, the specific implementation of the above step S25 includes:
S251、确定至少两个相机中相同帧对应的三维假设点之间的欧式距离;S251. Determine the Euclidean distance between the three-dimensional hypothetical points corresponding to the same frame in at least two cameras;
S252、基于马氏距离对欧式距离、第一权重边、第二权重边进行计算,以确定出三维假设点对应的第三权重边;S252. Calculate the Euclidean distance, the first weight edge, and the second weight edge based on the Mahalanobis distance, so as to determine the third weight edge corresponding to the three-dimensional hypothetical point;
S253、将检测点、三维假设点、第一权重边、第二权重边及第三权重边进行构建以得到第一路径图像。S253. Construct the detection points, three-dimensional hypothetical points, first weighted edges, second weighted edges, and third weighted edges to obtain the first path image.
在本实施例中，在两个相机确定出对应的三维假设点后，通过计算两个相机同一帧中对应的三维假设点之间的欧式距离，由此可以通过马氏距离对三维假设点之间的欧式距离以及上述的第一权重边、第二权重边进行计算，以消除欧式距离、第一权重边及第二权重边之间的度量单位不同的影响，由此可以确定出第三权重边。In this embodiment, after the three-dimensional hypothetical points of the two cameras are determined, the Euclidean distance between the corresponding three-dimensional hypothetical points in the same frame of the two cameras is calculated; the Mahalanobis distance can then be applied to this Euclidean distance together with the above-mentioned first weighted edge and second weighted edge, so as to eliminate the influence of the differing measurement units among them, thereby determining the third weighted edge.
如图4所示，第三权重边E_h2（图中粗线）的生成方式如下：As shown in FIG. 4, the third weighted edge E_h2 (the thick line in the figure) is generated as follows:

E_h2 = D_M(d_euc(V_h^1, V_h^2), E_h1^(1), E_h1^(2), E_d^(1,2))

其中，d_euc(V_h^1, V_h^2)代表每帧图片中两个圆点（三维假设点）之间的欧氏距离；E_h1^(1)表示与圆点1相连的两条E_h1虚线；E_h1^(2)表示与圆点2相连的两条E_h1虚线；E_d^(1,2)表示与圆点1和圆点2间接相连的两条E_d细线。可以理解的是，通过马氏距离消除所有度量单位不同的影响，因此在计算第三权重边E_h2时，需要通过第一权重边E_d和第二权重边E_h1综合计算，以得到第三权重边E_h2。Here, d_euc(V_h^1, V_h^2) represents the Euclidean distance between the two dots (three-dimensional hypothetical points) in each frame; E_h1^(1) denotes the two dashed E_h1 edges connected to dot 1; E_h1^(2) denotes the two dashed E_h1 edges connected to dot 2; and E_d^(1,2) denotes the two thin E_d edges indirectly connected to dot 1 and dot 2. It can be understood that the Mahalanobis distance eliminates the influence of all differing measurement units; therefore, when calculating the third weighted edge E_h2, the first weighted edges E_d and the second weighted edges E_h1 must be taken into account comprehensively to obtain E_h2.
S30、在第一路径图像中添加互斥信息以构建出第二路径图像。S30. Add mutually exclusive information to the first path image to construct a second path image.
在本实施例中，每个检测点与每个检测框对应，并且每个检测框代表一个人；每个相机可以生成与其对应的三维假设点，但由于相机的内外参存在误差时，生成的三维假设点会存在错误，因此需要在第一路径图像中添加互斥关系，以确定出哪个三维假设点为错误的三维假设点；例如，相机1中存在检测点A，相机2中存在检测点A和B，相机1中的检测点A和相机2中的检测点A可以对应生成一个三维假设点AA，相机1中的检测点A和相机2中的检测点B可以对应生成一个三维假设点AB，当可以确定出三维假设点AA为正确的，同时可以确定出三维假设点AB为错误的，如此，则可以通过互斥信息将三维假设点AB去除。In this embodiment, each detection point corresponds to a detection frame, and each detection frame represents one person. Each camera can generate corresponding three-dimensional hypothetical points, but when there are errors in the intrinsic and extrinsic camera parameters, some generated hypothetical points will be wrong; therefore, a mutual exclusion relationship needs to be added to the first path image to determine which generated three-dimensional hypothetical point is wrong. For example, camera 1 has a detection point A, and camera 2 has detection points A and B. Detection point A in camera 1 and detection point A in camera 2 can generate a three-dimensional hypothetical point AA, while detection point A in camera 1 and detection point B in camera 2 can generate a three-dimensional hypothetical point AB. When the three-dimensional hypothetical point AA is determined to be correct, the three-dimensional hypothetical point AB can simultaneously be determined to be wrong; in this way, the three-dimensional hypothetical point AB can be removed through the mutually exclusive information.
如图9所示,上述步骤S30的具体实现方式包括:As shown in Figure 9, the specific implementation of the above step S30 includes:
S31、将三维假设点添加互斥边以生成对应的三维真实点;其中,三维假设点与三维真实点之间的三维信息相同且类别信息不同;S31. Add mutually exclusive edges to the three-dimensional hypothetical points to generate corresponding three-dimensional real points; wherein, the three-dimensional information between the three-dimensional hypothetical points and the three-dimensional real points is the same and the category information is different;
S32、对三维假设点进行判断以确定出正确的三维真实点及互斥边;S32. Judging the three-dimensional hypothetical points to determine the correct three-dimensional real points and mutually exclusive edges;
S33、将正确的三维真实点、互斥边以及检测点、三维假设点、第一权重边、第二权重边、第三权重边进行构建以得到第二路径图像。S33. Construct correct 3D real points, mutually exclusive edges, detection points, 3D hypothetical points, first weighted edges, second weighted edges, and third weighted edges to obtain a second path image.
其中，在确定出多个相机之间的三维假设点后，由于三维假设点为可能存在或可能不存在的关系，因此，为确定所生成的三维假设点是否正确，可以在生成三维假设点的同时对应添加互斥边以生成对应的三维真实点，同时判断该三维假设点是否正确；当确定出正确的三维假设点后，则可以将错误的三维假设点及其对应的互斥边、三维真实点去除，从而得到正确的三维假设点、三维真实点及互斥边。可以理解的是，在判断两个相机中所生成的三维假设点是否正确时，可以通过第二权重边进行判断，也可以直接将检测点进行特征匹配；由于第二权重边中包含有两个检测点之间的特征相似度，由此，可以由两个检测点的特征相似度或两个检测点之间是否匹配以确定出得到的三维假设点是否正确，从而避免了相机跟踪出现错误，确保多个相机之间的信息融合更为准确。After the three-dimensional hypothetical points between the multiple cameras are determined, since a hypothetical point may or may not exist, in order to determine whether a generated hypothetical point is correct, a mutually exclusive edge can be added at the same time the hypothetical point is generated, so as to generate a corresponding three-dimensional real point, and the hypothetical point is then judged for correctness. Once the correct hypothetical points are determined, the wrong hypothetical points and their corresponding mutually exclusive edges and real points can be removed, thereby obtaining the correct three-dimensional hypothetical points, real points, and mutually exclusive edges. It can be understood that when judging whether a hypothetical point generated by two cameras is correct, the judgment can be made through the second weighted edge, or the detection points can be directly matched by features. Since the second weighted edge contains the feature similarity between the two detection points, whether an obtained hypothetical point is correct can be determined from the feature similarity of the two detection points or from whether the two detection points match, thereby avoiding camera tracking errors and ensuring more accurate information fusion between the multiple cameras.
如图5所示，可以在假设人体图G_h的基础上，加入互斥边以生成第二路径图像；其中，三维真实点V_r（图中黑点）与三维假设点V_h（图中圆点）具有相同的三维位置。当第一路径图像中的一个检测点V_d生成两个假设人体点V_h^1、V_h^2时，通过加入互斥边E_r^1、E_r^2（图中点划线），则可以同时生成真实人体点V_r^1、V_r^2；由于每个检测点只能代表一个人，所以E_r^1、E_r^2为一组互斥边，当其中一条互斥边对应的三维假设点正确时，则表示该互斥边存在，同理另外一条互斥边则必然不存在。由此，在确定出三维真实点V_r和互斥边E_r后，则可以构建出第二路径图像G_r=(V_d, V_h, V_r, E_d, E_h1, E_h2, E_r)。As shown in FIG. 5, on the basis of the hypothetical human body graph G_h, mutually exclusive edges can be added to generate the second path image; here, a three-dimensional real point V_r (a black dot in the figure) has the same three-dimensional position as the corresponding three-dimensional hypothetical point V_h (a circle in the figure). When one detection point V_d in the first path image generates two hypothetical human body points V_h^1 and V_h^2, adding mutually exclusive edges E_r^1 and E_r^2 (the dash-dot lines in the figure) simultaneously generates real human body points V_r^1 and V_r^2. Since each detection point can only represent one person, E_r^1 and E_r^2 form a group of mutually exclusive edges: when the hypothetical point corresponding to one of them is correct, that edge exists, and by the same token the other edge necessarily does not. Thus, after the three-dimensional real points V_r and the mutually exclusive edges E_r are determined, the second path image G_r = (V_d, V_h, V_r, E_d, E_h1, E_h2, E_r) can be constructed.
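The choice between a pair of mutually exclusive hypothetical points can be sketched greedily: since the judgment may use the feature similarity carried by the second weighted edge, each hypothesis below carries an assumed support score, and per detection point only the best-supported hypothesis survives as a real point. The `hypotheses` data layout is illustrative, not prescribed by the patent.

```python
def resolve_mutex(hypotheses):
    """Keep, per detection point, the hypothesis with the highest support.

    `hypotheses` maps a hypothesis id to (detection_id, support score).
    Two hypotheses sharing a detection point are linked by mutually
    exclusive edges; only the better-supported one becomes a real 3-D point."""
    best = {}
    for hyp_id, (det_id, score) in hypotheses.items():
        if det_id not in best or score > best[det_id][1]:
            best[det_id] = (hyp_id, score)
    return {det_id: hyp_id for det_id, (hyp_id, _) in best.items()}
```

In the example from the description above, hypothesis AA (camera-1 A with camera-2 A) would win over AB when its feature-similarity support is higher.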
S40、根据预设算法对第二路径图像进行路径计算,以得到结果路径。S40. Perform path calculation on the second path image according to a preset algorithm to obtain a resultant path.
在本实施例中,预设算法为最小成本最大流算法,最小成本最大流算法表示通过选择路径、分配经过路径的流量,可以在流量最大的前提下,达到所用的费用最小的要求;在对第二路径图像中的路径进行计算时,可以由起始帧中确定得到的三维真实点作为起始路径以开始计算,并从结束帧中确定得到的三维真实点作为结束路径以结束计算;可以理解的是,由于在第二路径图像中存在多个路径,通过预设算法对第二路径图像进行计算,从而可以确定出结果路径以作为跟踪结果,使得跟踪难度更低且准确度更高。In this embodiment, the preset algorithm is the minimum cost maximum flow algorithm, and the minimum cost maximum flow algorithm means that by selecting the path and allocating the flow passing through the path, the minimum cost can be achieved under the premise of the maximum flow; When the path in the second path image is calculated, the three-dimensional real point determined in the starting frame can be used as the starting path to start the calculation, and the three-dimensional real point determined from the ending frame can be used as the ending path to end the calculation; It is understood that since there are multiple paths in the second path image, the second path image is calculated through a preset algorithm, so that the resulting path can be determined as the tracking result, making the tracking less difficult and more accurate.
具体地，上述根据预设算法对第二路径图像进行路径计算以得到结果路径，包括：对第二路径图像按最小成本最大流算法进行计算，以确定出最短路径；将最短路径确定为结果路径。Specifically, performing path calculation on the second path image according to the preset algorithm to obtain the result path includes: calculating the second path image according to the minimum-cost maximum-flow algorithm to determine the shortest path, and determining the shortest path as the result path.
其中，第二路径图像G_r中包括有检测点V_d、三维假设点V_h、三维真实点V_r、第一权重边E_d、第二权重边E_h1、第三权重边E_h2及互斥边E_r；通过将起始帧中的三维真实点V_r^start作为路径的起始点计算，并将结束帧中的三维真实点V_r^end作为路径的结束点计算，以确定出V_r^start与V_r^end之间的最短路径，从而确定出第二路径图像中的结果路径，使得相机的跟踪难度更低且准确率更高。Here, the second path image G_r includes the detection points V_d, the three-dimensional hypothetical points V_h, the three-dimensional real points V_r, the first weighted edges E_d, the second weighted edges E_h1, the third weighted edges E_h2, and the mutually exclusive edges E_r. By taking the three-dimensional real point V_r^start in the starting frame as the starting point of the path and the three-dimensional real point V_r^end in the ending frame as the end point of the path, the shortest path between V_r^start and V_r^end is determined, thereby determining the result path in the second path image, which makes camera tracking less difficult and more accurate.
例如，如图5中所示，从第一帧所得到的三维真实点（图中上黑点）至第二帧对应的三维真实点（图中下黑点），从图中可以确定出最短路径为：三维真实点（图中上黑点）、互斥边（图中上点划线）、三维假设点（图中上圆点）、第三权重边、三维假设点（图中下圆点）、互斥边（图中下点划线）、三维真实点（图中下黑点）；由于在计算最短路径时需要采用最小成本最大流算法进行计算，因此该最短路径仅仅作为一个示例进行说明，在此不作任何限定。For example, as shown in FIG. 5, from the three-dimensional real point obtained in the first frame (the upper black dot in the figure) to the corresponding three-dimensional real point in the second frame (the lower black dot), the shortest path can be determined from the figure as: the three-dimensional real point (upper black dot), a mutually exclusive edge (upper dash-dot line), a three-dimensional hypothetical point (upper circle), a third weighted edge, a three-dimensional hypothetical point (lower circle), a mutually exclusive edge (lower dash-dot line), and the three-dimensional real point (lower black dot). Since the shortest path needs to be calculated with the minimum-cost maximum-flow algorithm, this shortest path is described only as an example and does not constitute any limitation.
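As a self-contained sketch of the minimum-cost maximum-flow computation used to extract the result path (successive shortest augmenting paths found with Bellman-Ford), the node indices, capacities, and costs below are illustrative; in the patent's setting the nodes would be the detection, hypothetical, and real points of G_r, and the edge costs the corresponding weighted edges.

```python
def min_cost_max_flow(n, edges, s, t):
    """Successive shortest-path min-cost max-flow.

    `edges` is a list of (u, v, capacity, cost) with nodes 0..n-1.
    Returns (max_flow, total_cost)."""
    graph = [[] for _ in range(n)]
    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])       # forward edge
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])    # residual edge
    for u, v, cap, cost in edges:
        add_edge(u, v, cap, cost)
    flow = total_cost = 0
    while True:
        # Bellman-Ford (SPFA) shortest path on the residual graph.
        dist = [float("inf")] * n
        dist[s] = 0
        in_queue, prev = [False] * n, [None] * n
        queue = [s]
        while queue:
            u = queue.pop(0)
            in_queue[u] = False
            for i, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, i)
                    if not in_queue[v]:
                        in_queue[v] = True
                        queue.append(v)
        if dist[t] == float("inf"):
            break                     # no augmenting path remains
        # Find the bottleneck capacity along the path, then augment.
        bottleneck, v = float("inf"), t
        while v != s:
            u, i = prev[v]
            bottleneck = min(bottleneck, graph[u][i][1])
            v = u
        v = t
        while v != s:
            u, i = prev[v]
            graph[u][i][1] -= bottleneck
            graph[graph[u][i][0]][graph[u][i][3]][1] += bottleneck
            v = u
        flow += bottleneck
        total_cost += bottleneck * dist[t]
    return flow, total_cost
```

Called with the start-frame real point as source and the end-frame real point as sink, the augmenting paths of minimum total edge cost correspond to the shortest (result) paths described above.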
S50. Determine the obtained result path as the tracking result.
After the shortest path is determined, the person is tracked along it, which improves tracking accuracy, reduces the impact of occlusion when multiple cameras track and photograph a subject, and makes information fusion between the cameras simpler.
In the target tracking method provided by this application, the human body detection images corresponding to the human body regions in multiple pictures are first determined; feature fusion is performed on the multiple human body detection images according to the association information between them to construct a first path image; mutually exclusive information is then added to the first path image to construct a second path image; path calculation is performed on the second path image according to a preset algorithm to obtain a result path; and finally the obtained result path is determined as the tracking result. This makes information fusion between multiple cameras simpler and improves tracking accuracy.
As shown in Figure 10, an embodiment of the present application provides a target tracking device 10, including:
a first determination module 11, configured to determine the human body detection images corresponding to the human body regions in multiple pictures;
a fusion module 12, configured to perform feature fusion on the multiple human body detection images according to the association information between them, so as to construct a first path image;
a construction module 13, configured to add mutually exclusive information to the first path image to construct a second path image;
a calculation module 14, configured to perform path calculation on the second path image according to a preset algorithm to obtain a result path;
a second determination module 15, configured to determine the obtained result path as the tracking result.
The target tracking device 10 provided by this application first determines the human body detection images corresponding to the human body regions in multiple pictures; performs feature fusion on the multiple human body detection images according to the association information between them to construct a first path image; adds mutually exclusive information to the first path image to construct a second path image; performs path calculation on the second path image according to a preset algorithm to obtain a result path; and finally determines the obtained result path as the tracking result. This makes information fusion between multiple cameras simpler and improves tracking accuracy.
It should be noted that the target tracking device 10 provided in the specific embodiments of the present application corresponds to the above target tracking method, so all embodiments of the target tracking method are applicable to the device 10. Each embodiment of the device 10 has modules corresponding to the steps of the method and achieves the same or similar beneficial effects; to avoid excessive repetition, the individual modules of the target tracking device 10 are not described in further detail here.
As shown in Figure 11, a specific embodiment of the present application further provides an electronic device 20, including a memory 202, a processor 201, and a computer program stored in the memory 202 and executable on the processor 201; the processor 201 implements the steps of the above target tracking method when executing the computer program.
Specifically, the processor 201 is configured to call the computer program stored in the memory 202 and perform the following steps:
determining the human body detection images corresponding to the human body regions in multiple pictures;
performing feature fusion on the multiple human body detection images according to the association information between them, so as to construct a first path image;
adding mutually exclusive information to the first path image to construct a second path image;
performing path calculation on the second path image according to a preset algorithm to obtain a result path;
determining the obtained result path as the tracking result.
Optionally, determining the human body detection images corresponding to the human body regions in multiple pictures, as performed by the processor 201, includes:
acquiring pictures collected by multiple cameras;
performing human body detection on the human body region in each picture to generate a corresponding detection frame;
determining corresponding detection points according to the detection frames, and determining the intersection-over-union ratio and feature similarity between at least two detection frames in each picture;
calculating the intersection-over-union ratio and the feature similarity based on the Mahalanobis distance to determine a corresponding first weight edge;
constructing the detection points and the first weight edge to obtain the human body detection image.
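The IoU and feature-similarity fusion above can be sketched as follows. The patent does not disclose the exact fusion formula, so the cosine similarity, the dissimilarity vector, and the inverse covariance matrix `inv_cov` below are illustrative assumptions rather than the claimed computation:

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def edge_weight(box_a, box_b, feat_a, feat_b, inv_cov):
    """Fuse IoU and appearance similarity into one edge weight via a
    Mahalanobis-style distance; inv_cov (2x2 inverse covariance) is an
    assumed, illustrative parameter."""
    sim = float(np.dot(feat_a, feat_b) /
                (np.linalg.norm(feat_a) * np.linalg.norm(feat_b)))
    d = np.array([1.0 - iou(box_a, box_b), 1.0 - sim])  # dissimilarity vector
    return float(np.sqrt(d @ inv_cov @ d))
```

With this convention, identical detections in appearance and position receive weight 0, and the edge weight grows as the boxes or features diverge.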
Optionally, performing feature fusion on the multiple human body detection images according to the association information between them to construct the first path image, as performed by the processor 201, includes:
determining, according to the human body detection images, the epipolar distance between detection points corresponding to the same frame of at least two cameras;
judging whether the epipolar distance is less than a preset threshold;
when the epipolar distance is less than the preset threshold, calculating the average value of the closest points between the epipolar lines;
determining a three-dimensional hypothesis point according to the average value;
performing calculation on the three-dimensional hypothesis point and the first weight edge, and constructing the calculation result together with the detection points, the three-dimensional hypothesis point and the first weight edge to obtain the first path image.
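Computing the average (midpoint) of the closest points between two back-projected rays, one per camera, can be sketched as below. The ray parametrization `p + t*d` and the near-parallel guard of 1e-9 are illustrative assumptions; the patent only states that the average of the closest points yields the 3D hypothesis point:

```python
import numpy as np

def closest_point_midpoint(p1, d1, p2, d2):
    """Midpoint of the closest points between two rays p + t*d,
    a common way to form a 3D hypothesis point from two camera views."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    r = p1 - p2
    a = d1 @ d1; b = d1 @ d2; c = d2 @ d2
    e = d1 @ r;  f = d2 @ r
    denom = a * c - b * b
    if abs(denom) < 1e-9:          # near-parallel rays: no stable midpoint
        return None
    t = (b * f - c * e) / denom    # parameter of closest point on ray 1
    s = (a * f - b * e) / denom    # parameter of closest point on ray 2
    return 0.5 * ((p1 + t * d1) + (p2 + s * d2))
```

For two perpendicular rays passing near each other, the function returns the point halfway along their common perpendicular, which is then treated as the candidate 3D hypothesis point.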
Optionally, after determining the epipolar distance between the detection points corresponding to the same frame of at least two cameras according to the human body detection images, the processor 201 further performs:
determining, according to the human body detection images, the feature similarity corresponding to each camera;
calculating the epipolar distance and the feature similarity based on the Mahalanobis distance to determine a corresponding second weight edge.
Optionally, performing calculation on the three-dimensional hypothesis point and the first weight edge, and constructing the calculation result together with the detection points, the three-dimensional hypothesis point and the first weight edge to obtain the first path image, as performed by the processor 201, includes:
determining the Euclidean distance between three-dimensional hypothesis points corresponding to the same frame of at least two cameras;
calculating the Euclidean distance, the first weight edge and the second weight edge based on the Mahalanobis distance to determine the third weight edge corresponding to the three-dimensional hypothesis points;
constructing the detection points, the three-dimensional hypothesis points, the first weight edge, the second weight edge and the third weight edge to obtain the first path image.
Optionally, adding mutually exclusive information to the first path image to construct the second path image, as performed by the processor 201, includes:
adding mutually exclusive edges to the three-dimensional hypothesis points to generate corresponding three-dimensional real points, where the three-dimensional information of a hypothesis point and its real point is the same while their category information differs;
judging the three-dimensional hypothesis points to determine the correct three-dimensional real points and mutually exclusive edges;
constructing the correct three-dimensional real points and mutually exclusive edges together with the detection points, the three-dimensional hypothesis points, the first weight edge, the second weight edge and the third weight edge to obtain the second path image.
Optionally, performing path calculation on the second path image according to the preset algorithm to obtain the result path, as performed by the processor 201, includes:
performing calculation in the second path image according to the minimum-cost maximum-flow algorithm to determine the shortest path;
determining the shortest path as the result path.
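The minimum-cost maximum-flow step can be sketched with a small successive-shortest-paths implementation. The toy graph below mirrors the spirit of Figure 5 (real point, mutually exclusive edge, hypothesis points, third weight edge, real point); the node names, capacities, and costs are illustrative assumptions, not the actual second path image:

```python
from collections import defaultdict

def min_cost_max_flow(edges, source, sink):
    """Minimal successive-shortest-paths min-cost max-flow.
    edges: list of (u, v, capacity, cost); assumes no antiparallel edge pairs.
    Returns (total_flow, total_cost, flow_per_edge)."""
    graph = defaultdict(dict)            # u -> v -> [residual capacity, cost]
    for u, v, cap, cost in edges:
        graph[u][v] = [cap, cost]
        graph[v][u] = [0, -cost]         # residual (reverse) edge
    total_flow = total_cost = 0
    used = defaultdict(int)
    while True:
        # Bellman-Ford: cheapest augmenting path in the residual graph
        dist, parent = {source: 0}, {}
        for _ in range(len(graph)):
            for u in list(graph):
                if u not in dist:
                    continue
                for v, (cap, cost) in graph[u].items():
                    if cap > 0 and dist[u] + cost < dist.get(v, float("inf")):
                        dist[v] = dist[u] + cost
                        parent[v] = u
        if sink not in dist:
            break
        path, v = [], sink
        while v != source:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(graph[u][v][0] for u, v in path)
        for u, v in path:
            graph[u][v][0] -= bottleneck
            graph[v][u][0] += bottleneck
            total_cost += bottleneck * graph[u][v][1]
            used[(u, v)] += bottleneck
        total_flow += bottleneck
    return total_flow, total_cost, dict(used)

# Toy graph: one unit of flow must pick the cheaper chain of weighted edges.
edges = [
    ("S", "real_f1", 1, 0),
    ("real_f1", "hyp_f1", 1, 2),   # mutually exclusive edge
    ("hyp_f1", "hyp_f2", 1, 3),    # third weight edge
    ("hyp_f2", "real_f2", 1, 2),   # mutually exclusive edge
    ("real_f2", "T", 1, 0),
    ("real_f1", "hyp_alt", 1, 9),  # costlier alternative the solver avoids
    ("hyp_alt", "real_f2", 1, 9),
]
flow, cost, per_edge = min_cost_max_flow(edges, "S", "T")
print(flow, cost)  # 1 7
```

The unit of flow is routed through the low-cost chain, which plays the role of the shortest (result) path; a production system would typically use an optimized library solver instead of this sketch.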
That is, in the specific embodiments of the present application, the processor 201 of the electronic device 20 implements the steps of the above target tracking method when executing the computer program, which makes information fusion between multiple cameras simpler and improves tracking accuracy.
It should be noted that, since the processor 201 of the electronic device 20 implements the steps of the above target tracking method when executing the computer program, all embodiments of the target tracking method are applicable to the electronic device 20 and achieve the same or similar beneficial effects.
The computer-readable storage medium provided in the embodiments of the present application stores a computer program; when the computer program is executed by a processor, each process of the target tracking method or the application-side target tracking method provided in the embodiments of the present application is implemented, and the same technical effects can be achieved. To avoid repetition, details are not repeated here.

Claims (10)

  1. A target tracking method, comprising:
    determining human body detection images corresponding to human body regions in a plurality of pictures;
    performing feature fusion on the plurality of human body detection images according to association information between the plurality of human body detection images, so as to construct a first path image;
    adding mutually exclusive information to the first path image to construct a second path image;
    performing path calculation on the second path image according to a preset algorithm to obtain a result path; and
    determining the obtained result path as a tracking result.
  2. The target tracking method according to claim 1, wherein determining the human body detection images corresponding to the human body regions in the plurality of pictures comprises:
    acquiring pictures collected by a plurality of cameras;
    performing human body detection on the human body region in each picture to generate a corresponding detection frame;
    determining corresponding detection points according to the detection frames, and determining an intersection-over-union ratio and a feature similarity between at least two detection frames in each picture;
    calculating the intersection-over-union ratio and the feature similarity based on a Mahalanobis distance to determine a corresponding first weight edge; and
    constructing the detection points and the first weight edge to obtain the human body detection image.
  3. The target tracking method according to claim 2, wherein performing feature fusion on the plurality of human body detection images according to the association information between the plurality of human body detection images to construct the first path image comprises:
    determining, according to the human body detection images, an epipolar distance between detection points corresponding to the same frame of at least two cameras;
    judging whether the epipolar distance is less than a preset threshold;
    when the epipolar distance is less than the preset threshold, calculating an average value of the closest points between the epipolar lines;
    determining a three-dimensional hypothesis point according to the average value; and
    performing calculation on the three-dimensional hypothesis point and the first weight edge, and constructing the calculation result together with the detection points, the three-dimensional hypothesis point and the first weight edge to obtain the first path image.
  4. The target tracking method according to claim 3, wherein, after determining, according to the human body detection images, the epipolar distance between the detection points corresponding to the same frame of at least two cameras, the method comprises:
    determining, according to the human body detection images, the feature similarity corresponding to each camera; and
    calculating the epipolar distance and the feature similarity based on the Mahalanobis distance to determine a corresponding second weight edge.
  5. The target tracking method according to claim 4, wherein performing calculation on the three-dimensional hypothesis point and the first weight edge, and constructing the calculation result together with the detection points, the three-dimensional hypothesis point and the first weight edge to obtain the first path image comprises:
    determining a Euclidean distance between three-dimensional hypothesis points corresponding to the same frame of at least two cameras;
    calculating the Euclidean distance, the first weight edge and the second weight edge based on the Mahalanobis distance to determine a third weight edge corresponding to the three-dimensional hypothesis points; and
    constructing the detection points, the three-dimensional hypothesis points, the first weight edge, the second weight edge and the third weight edge to obtain the first path image.
  6. The target tracking method according to claim 5, wherein adding mutually exclusive information to the first path image to construct the second path image comprises:
    adding mutually exclusive edges to the three-dimensional hypothesis points to generate corresponding three-dimensional real points, wherein the three-dimensional information of a three-dimensional hypothesis point and its three-dimensional real point is the same and their category information is different;
    judging the three-dimensional hypothesis points to determine the correct three-dimensional real points and mutually exclusive edges; and
    constructing the correct three-dimensional real points and mutually exclusive edges together with the detection points, the three-dimensional hypothesis points, the first weight edge, the second weight edge and the third weight edge to obtain the second path image.
  7. The target tracking method according to claim 6, wherein performing path calculation on the second path image according to the preset algorithm to obtain the result path comprises:
    performing calculation in the second path image according to a minimum-cost maximum-flow algorithm to determine a shortest path; and
    determining the shortest path as the result path.
  8. A target tracking device, comprising:
    a first determination module, configured to determine human body detection images corresponding to human body regions in a plurality of pictures;
    a fusion module, configured to perform feature fusion on the plurality of human body detection images according to association information between the plurality of human body detection images, so as to construct a first path image;
    a construction module, configured to add mutually exclusive information to the first path image to construct a second path image;
    a calculation module, configured to perform path calculation on the second path image according to a preset algorithm to obtain a result path; and
    a second determination module, configured to determine the obtained result path as a tracking result.
  9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the target tracking method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the target tracking method according to any one of claims 1 to 7.
PCT/CN2022/100164 2021-12-31 2022-06-21 Target tracking method and apparatus, and electronic device and storage medium WO2023123916A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111674242.5A CN114445458B (en) 2021-12-31 2021-12-31 Target tracking method, device, electronic equipment and storage medium
CN202111674242.5 2021-12-31

Publications (1)

Publication Number Publication Date
WO2023123916A1 true WO2023123916A1 (en) 2023-07-06

Family

ID=81365528

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/100164 WO2023123916A1 (en) 2021-12-31 2022-06-21 Target tracking method and apparatus, and electronic device and storage medium

Country Status (2)

Country Link
CN (1) CN114445458B (en)
WO (1) WO2023123916A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445458B (en) * 2021-12-31 2024-08-02 深圳云天励飞技术股份有限公司 Target tracking method, device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9582718B1 (en) * 2015-06-30 2017-02-28 Disney Enterprises, Inc. Method and device for multi-target tracking by coupling multiple detection sources
CN107169989A (en) * 2017-04-17 2017-09-15 南京邮电大学 A kind of multi-object tracking method assessed based on data correlation and track
US10679362B1 (en) * 2018-05-14 2020-06-09 Vulcan Inc. Multi-camera homogeneous object trajectory alignment
CN114445458A (en) * 2021-12-31 2022-05-06 深圳云天励飞技术股份有限公司 Target tracking method and device, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110268444A (en) * 2019-02-26 2019-09-20 武汉资联虹康科技股份有限公司 A kind of number of people posture tracing system for transcranial magnetic stimulation diagnosis and treatment
CN112861575A (en) * 2019-11-27 2021-05-28 中兴通讯股份有限公司 Pedestrian structuring method, device, equipment and storage medium
CN113177968A (en) * 2021-04-27 2021-07-27 北京百度网讯科技有限公司 Target tracking method and device, electronic equipment and storage medium
CN113610895A (en) * 2021-08-06 2021-11-05 烟台艾睿光电科技有限公司 Target tracking method and device, electronic equipment and readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9582718B1 (en) * 2015-06-30 2017-02-28 Disney Enterprises, Inc. Method and device for multi-target tracking by coupling multiple detection sources
CN107169989A (en) * 2017-04-17 2017-09-15 南京邮电大学 A kind of multi-object tracking method assessed based on data correlation and track
US10679362B1 (en) * 2018-05-14 2020-06-09 Vulcan Inc. Multi-camera homogeneous object trajectory alignment
CN114445458A (en) * 2021-12-31 2022-05-06 深圳云天励飞技术股份有限公司 Target tracking method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN114445458B (en) 2024-08-02
CN114445458A (en) 2022-05-06


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913205

Country of ref document: EP

Kind code of ref document: A1