CN111047626B - Target tracking method, device, electronic equipment and storage medium

Target tracking method, device, electronic equipment and storage medium

Info

Publication number
CN111047626B
Authority
CN
China
Prior art keywords
target
optical flow
tracking point
frame
flow tracking
Prior art date
Legal status
Active
Application number
CN201911374132.XA
Other languages
Chinese (zh)
Other versions
CN111047626A (en)
Inventor
曾佐祺
Current Assignee
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd filed Critical Shenzhen Intellifusion Technologies Co Ltd
Priority to CN201911374132.XA priority Critical patent/CN111047626B/en
Publication of CN111047626A publication Critical patent/CN111047626A/en
Application granted granted Critical
Publication of CN111047626B publication Critical patent/CN111047626B/en


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion
    • G06T 7/269 - Analysis of motion using gradient-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10016 - Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application disclose a target tracking method, a target tracking apparatus, an electronic device and a storage medium. The method comprises: acquiring a first target frame of at least one target object in a current image frame of a target video; marking overlapping regions of the first target frame to obtain the occlusion status of the first target frame; acquiring an optical flow tracking point set of the current image frame according to the occlusion status of the first target frame and the category of the target object; performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame after the current image frame; and acquiring a second target frame of the target object in the next image frame according to the target tracking points. The embodiments of the present application help improve the efficiency and effect of tracking multiple targets in a video.

Description

Target tracking method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of video target tracking technologies, and in particular to a target tracking method, a target tracking apparatus, an electronic device and a storage medium.
Background
With the development of machine vision theory and technology, recognition and understanding of video content have become a research hotspot, and since the first video-based single-target tracking products reached the market, the demand for tracking multiple targets in video has grown ever stronger. Many video-based target tracking methods exist, for example methods that associate targets across adjacent video frames by physical measures such as the distance between target center points or the intersection-over-union of target areas; however, these methods fail in certain scenes. Optical flow tracking is also widely applied, but it is mostly used to track a single target in a video image and selects tracking points uniformly, so background pixels are inevitably taken as tracking points. As a result, existing optical flow tracking methods show poor efficiency and unremarkable results when tracking multiple targets.
Disclosure of Invention
To solve the above problems, the present application provides a target tracking method, a target tracking apparatus, an electronic device and a storage medium, which help improve the efficiency and effect of tracking multiple targets in a video.
A first aspect of the embodiments of the present application provides a target tracking method, the method comprising:
acquiring a first target frame of at least one target object in a current image frame of a target video;
marking overlapping regions of the first target frame to obtain the occlusion status of the first target frame;
acquiring an optical flow tracking point set of the current image frame according to the occlusion status of the first target frame and the category of the target object;
performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame after the current image frame;
and acquiring a second target frame of the target object in the next image frame according to the target tracking points.
With reference to the first aspect, in one possible example, acquiring the optical flow tracking point set of the current image frame according to the occlusion status of the first target frame and the category of the target object comprises:
selecting optical flow tracking points from a preset area of each first target frame according to the occlusion status of each first target frame and the category of the target object corresponding to each first target frame;
and adding the optical flow tracking points selected for each first target frame to the same set to obtain the optical flow tracking point set.
With reference to the first aspect, in one possible example, selecting the optical flow tracking points from the preset area of each first target frame according to the occlusion status of each first target frame and the category of the corresponding target object comprises:
for a first target frame belonging to a first category of target object: if the first target frame is not occluded, shrinking it about its center by a first preset ratio to obtain a first selection window, and selecting the optical flow tracking points within the first selection window; if the first selection window is occluded, selecting the optical flow tracking points in the unoccluded region of the first selection window;
for a first target frame belonging to a second category of target object: if the first target frame is not occluded, selecting the optical flow tracking points at a first preset height and a second preset height of the first target frame; if the first or second preset height is occluded, selecting the optical flow tracking points in the unoccluded regions at those heights;
for a first target frame belonging to a third category of target object: if the first target frame is not occluded, shrinking it about its center by a second preset ratio to obtain a second selection window, and selecting the optical flow tracking points within the second selection window; if the second selection window is occluded, selecting the optical flow tracking points in the unoccluded region of the second selection window.
With reference to the first aspect, in one possible example, performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame comprises:
converting the current image frame into a grayscale image and constructing an image pyramid from the grayscale image, the bottom layer of the image pyramid being the current image frame;
acquiring the coordinates of each optical flow tracking point in the set on each layer of the image pyramid;
calculating the optical flow of each optical flow tracking point in the set;
and obtaining the target tracking points from the coordinates of each optical flow tracking point on the current image frame and its optical flow.
With reference to the first aspect, in one possible example, acquiring the second target frame of the target object in the next image frame according to the target tracking points comprises:
calculating the displacement of each optical flow tracking point in the X and Y directions from the coordinates of each optical flow tracking point in the set and the coordinates of its corresponding target tracking point;
calculating the absolute first distance between every two optical flow tracking points and the absolute second distance between the two corresponding target tracking points, obtaining the distance ratio of the absolute second distance to the absolute first distance, and determining the median of the distance ratios as the scale factor;
selecting the first median of the displacements of the optical flow tracking points in the X direction and the second median of the displacements in the Y direction;
and calculating the second target frame from the center-point coordinates, width and height of the first target frame, the scale factor, the first median and the second median.
A second aspect of the embodiments of the present application provides a multi-target tracking apparatus, comprising:
a target detection module, configured to acquire a first target frame of at least one target object in a current image frame of a target video;
an occlusion detection module, configured to mark overlapping regions of the first target frame to obtain the occlusion status of the first target frame;
a tracking point set acquisition module, configured to acquire an optical flow tracking point set of the current image frame according to the occlusion status of the first target frame and the category of the target object;
an optical flow calculation module, configured to perform sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame after the current image frame;
and a target position determination module, configured to acquire a second target frame of the target object in the next image frame according to the target tracking points.
A third aspect of the embodiments of the present application provides an electronic device, comprising an input device, an output device, a processor adapted to implement one or more instructions, and a computer storage medium storing one or more instructions adapted to be loaded by the processor to perform the steps of the method of the first aspect.
A fourth aspect of the embodiments of the present application provides a computer storage medium storing one or more instructions adapted to be loaded by a processor to perform the steps of the method of the first aspect.
It can be seen that, in the technical solution provided by the embodiments of the present application, a first target frame of at least one target object in a current image frame of a target video is acquired; overlapping regions of the first target frame are marked to obtain its occlusion status; an optical flow tracking point set of the current image frame is acquired according to the occlusion status of the first target frame and the category of the target object; sparse optical flow calculation is performed on the set to obtain, for each optical flow tracking point, a corresponding target tracking point in the next image frame; and a second target frame of the target object in the next image frame is acquired according to the target tracking points. In this way, the optical flow tracking points of each object are selected according to the occlusion status of each target object in the current image frame, and all selected points are then placed in one set for a single sparse optical flow calculation, which improves the efficiency of multi-target tracking; meanwhile, because the points are selected according to the occlusion status of the target object, background pixels are prevented from being selected as tracking points, which improves the effect of multi-target tracking.
Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and a person skilled in the art may derive other drawings from them without inventive effort.
Fig. 1 is an application architecture diagram provided in an embodiment of the present application;
fig. 2 is a schematic flow chart of a target tracking method according to an embodiment of the present application;
FIG. 3 is an exemplary diagram of an occlusion processing unit according to an embodiment of the present application;
fig. 4 is a schematic flow chart of obtaining a second target frame according to an embodiment of the present application;
FIG. 5 is a flowchart of another target tracking method according to an embodiment of the present application;
FIG. 6a is an exemplary diagram of selecting an optical flow tracking point when a first target frame is not occluded according to an embodiment of the present application;
FIG. 6b is a schematic diagram of selecting an optical flow tracking point when a first target frame is occluded according to an embodiment of the present application;
FIG. 7a is a schematic diagram of selecting an optical flow tracking point when another first target frame provided in an embodiment of the present application is not occluded;
FIG. 7b is a diagram illustrating an example of selecting an optical flow tracking point when another first target frame provided in an embodiment of the present application is occluded;
FIG. 8a is a schematic diagram of selecting an optical flow tracking point when another first target frame provided in an embodiment of the present application is not occluded;
FIG. 8b is a diagram illustrating an example of selecting an optical flow tracking point when another first target frame provided in an embodiment of the present application is occluded;
fig. 9 is a schematic structural diagram of a target tracking apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the solution of the present application better understood by those skilled in the art, the technical solutions in the embodiments of the present application are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of protection of the present application.
The terms "comprising" and "having" and any variations thereof, as used in the specification, claims and drawings, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article or apparatus that comprises a list of steps or elements is not limited to the listed steps or elements, but may include other steps or elements not listed or inherent to such process, method, article or apparatus. Furthermore, the terms "first", "second", "third" and the like are used to distinguish between different objects and not to describe a particular order.
The application architecture to which the solutions of the embodiments of the present application may be applied is first described with reference to the accompanying drawings. Referring to fig. 1, fig. 1 is an application architecture diagram provided in an embodiment of the present application. As shown in fig. 1, the architecture comprises a user terminal, a server, a database and an image acquisition device, all connected through a network, providing a reliable system architecture for the target tracking method of the present application. The user terminal provides a human-machine interface for sending user instructions or requests to the server, for example a target tracking request, a convolutional neural network training request or a video image acquisition request; it receives the results the server returns after processing those instructions or requests and shows them in a display window, for example displaying the target frame of a pedestrian obtained during target tracking. The server is the execution body of the whole target tracking method and performs a series of target tracking operations on the objects in the acquired video images according to the instructions or requests sent by the user terminal, for example target detection, tracking point selection and algorithm execution; the server includes, but is not limited to, a local server, a cloud server or a server cluster. The database may belong to the server or be independent of it, for example a cloud database or an open-source database; it stores data sets usable for target tracking experiments, such as complete video sequences and adjacent image frames, and also stores video acquired by the image acquisition device, for example video collected by the monitoring equipment of a residential district or by a camera on an expressway. The image acquisition device may be any device capable of acquiring video images; the acquired video may be displayed at the user terminal, sent to the server in real time so the server can perform target tracking, or stored in the database for later retrieval, which is not limited here.
Based on the application architecture shown in fig. 1, an embodiment of the present application proposes a target tracking method that may be executed by an electronic device. It is applicable not only to single-target tracking scenes in video images, but also to multi-target tracking scenes and scenes where targets are occluded. Referring to fig. 2, the target tracking method may include the following steps:
S21: acquiring a first target frame of at least one target object in a current image frame of a target video.
In this embodiment of the application, the target video is a video acquired by an image acquisition device. It may be acquired in real time, for example the live monitoring video of a camera on a street or in an industrial park, or it may be a historical video previously acquired by the image acquisition device and stored in the database; for example, when testing the effect of the target tracking method there is no need for real-time acquisition, and any segment of historical video serves the purpose. The current image frame is the image frame of the target video appearing in the display window of the user terminal at the current moment; in some specific scenarios it may also be a frame selected by the user, for example, during a criminal investigation, the image frame in which a suspect first appears in the target video may be selected as the current image frame.
Specifically, the target object may be a pedestrian, a human face, a vehicle or the like in any image frame of the target video. After the current image frame is acquired, it is input into a pretrained neural network for feature extraction and target detection, which outputs the detection frames O_i of all target objects in the current image frame, i.e. the first target frames, where i denotes the unique tracking identity of the i-th target object in the current image frame. The pretrained neural network may be Fast R-CNN (Fast Region-based Convolutional Network), MTCNN (Multi-task Cascaded Convolutional Networks), OR-CNN (Occlusion-aware Region-based Convolutional Network) or the like. A hypothetical sketch of this step follows.
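As a concrete illustration only, the Python sketch below shows how step S21 might be wired up. The detector object, its detect() method and the (box, category) output format are assumptions made for illustration, not an API prescribed by this application:

```python
# Hypothetical sketch of step S21. `detector` stands in for a pretrained
# network such as Fast R-CNN, MTCNN or OR-CNN; its detect() method and the
# (box, category) output format are assumptions, not a real library API.
def detect_targets(detector, current_frame):
    """Return the first target frames O_i with their tracking identities i."""
    detections = detector.detect(current_frame)  # assumed: [(box, category), ...]
    targets = []
    for i, (box, category) in enumerate(detections):
        targets.append({
            'id': i,               # unique tracking identity i
            'box': box,            # (cx, cy, w, h): center point, width, height
            'category': category,  # e.g. 'face', 'pedestrian', 'vehicle'
        })
    return targets
```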
S22: marking overlapping regions of the first target frame to obtain the occlusion status of the first target frame.
In this embodiment of the application, the overlapping regions may be marked according to the center-point coordinates and the width and height of each first target frame. For example, the intersection-over-union of every two first target frames, i.e. their overlapping region, is calculated to mark the overlapping region, and the front/back relationship of the two first target frames is judged; this yields the occlusion status of each first target frame, namely whether it is occluded, its occluded region and its unoccluded region. Optionally, the overlapping regions may be marked with the occlusion processing unit provided in OR-CNN: after the first target frame is obtained with OR-CNN in step S21, it is divided into a preset number of target regions (for example 5), the features of these target regions are extracted separately, and the extracted features are input into the occlusion processing unit shown in fig. 3, where they pass through three 3×3 convolutions and binary classification by a softmax classifier, the output being an occlusion score for each of the preset number of target regions. When the occlusion score of a target region is smaller than a threshold (for example 0.9 or 0.8), that target region is marked as an overlapping region, indicating that the first target frame is occluded and that the occluded region is the marked region.
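A minimal sketch of the intersection-over-union variant of this marking step is given below, reusing the target-dict layout from the sketch above. The (cx, cy, w, h) box layout and the conservative marking of both frames are assumptions; the OR-CNN occlusion-score variant and the front/back depth judgment are omitted:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def mark_overlaps(targets, iou_threshold=0.0):
    """Mark each pairwise overlap; which frame is in front is application-
    specific, so here the overlap is conservatively recorded on both frames."""
    for t in targets:
        t.setdefault('occluded_regions', [])
    for i, a in enumerate(targets):
        for b in targets[i + 1:]:
            if iou(a['box'], b['box']) > iou_threshold:
                a['occluded_regions'].append(b['box'])
                b['occluded_regions'].append(a['box'])
    return targets
```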
S23: acquiring an optical flow tracking point set of the current image frame according to the occlusion status of the first target frame and the category of the target object.
In this embodiment of the application, when the optical flow tracking points are selected in the first target frame, both the occlusion status of the first target frame and the category of the target object are considered; for example, the target object may be a face, a human body or another object such as a vehicle. The category of the target object is obtained during target detection in step S21, and the region from which the optical flow tracking points are selected differs per category, but the selection mode is the same: if the first target frame is not occluded, m × n feature points are selected as optical flow tracking points within a preset area, where m and n may be equal; if the preset area of the first target frame is occluded, the unoccluded feature points within the preset area are selected as optical flow tracking points. Finally, all selected optical flow tracking points form the optical flow tracking point set {p_ij} of the current image frame, where j denotes the j-th optical flow tracking point of the i-th target object.
S24: performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame after the current image frame.
In this embodiment of the application, a target tracking point is the point in the next image frame after the current image frame that corresponds to a selected optical flow tracking point. A sparse optical flow algorithm is applied to the obtained optical flow tracking point set {p_ij}, computing all optical flow tracking points p_ij in the set in a single pass to obtain, for each optical flow tracking point p_ij, its target tracking point p'_ij in the next image frame.
Performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame comprises:
converting the current image frame into a grayscale image and constructing an image pyramid from the grayscale image, the bottom layer of the image pyramid being the current image frame;
acquiring the coordinates of each optical flow tracking point in the set on each layer of the image pyramid;
calculating the optical flow of each optical flow tracking point in the set;
and obtaining the target tracking points from the coordinates of each optical flow tracking point on the current image frame and its optical flow.
Specifically, scaling the grayscale image of the current image frame yields the layers of the image pyramid, L = 0, 1, 2, ..., L_m, where the layer L_m with the lowest resolution is the top layer and the original image of the current image frame is the bottom layer of the pyramid. The optical flow algorithm starts at the top of the pyramid: it calculates the optical flow on the top-layer image, passes that result as the initial value to the next layer (layer L_m−1), calculates the optical flow of layer L_m−1 based on that initial value, and again passes the result of layer L_m−1 down as the initial value, continuing until the bottom layer is reached; the optical flow calculated at the bottom layer is the final optical flow result. Because each layer's optical flow calculation uses the coordinates of each optical flow tracking point p_ij on that layer, the coordinates of each p_ij on every layer of the pyramid, including on the original image of the current image frame, must be located. From the coordinates of each p_ij on each layer, the optical flow of p_ij on that layer is calculated; for example, from the optical flow d^(L_m−1) of layer L_m−1 and its initial value g^(L_m−1), the initial value of layer L_m−2 can be calculated as g^(L_m−2) = 2(g^(L_m−1) + d^(L_m−1)). Iterating these steps yields the optical flow d of each p_ij on the bottom-layer image of the pyramid (i.e. the original image of the current image frame). Adding the optical flow d to the coordinates of each optical flow tracking point p_ij on the original image of the current image frame gives its target tracking point p'_ij in the next image frame.
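For reference, OpenCV's pyramidal Lucas-Kanade implementation performs exactly this pyramid construction and top-down flow propagation. The minimal sketch below computes the target tracking points for the pooled point set in a single call; the pyramid depth and window size are arbitrary choices, not values prescribed by this application:

```python
import cv2
import numpy as np

# Minimal sketch of step S24 using cv2.calcOpticalFlowPyrLK. `points` is the
# pooled optical flow tracking point set {p_ij} as an (N, 2) float array of
# coordinates on the current image frame.
def track_points(current_frame, next_frame, points, pyramid_levels=3):
    prev_gray = cv2.cvtColor(current_frame, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_frame, cv2.COLOR_BGR2GRAY)
    prev_pts = points.astype(np.float32).reshape(-1, 1, 2)
    next_pts, status, err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, prev_pts, None,
        winSize=(21, 21), maxLevel=pyramid_levels)
    # status[k] == 1 where the flow for point k was successfully found
    return next_pts.reshape(-1, 2), status.reshape(-1)
```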
S25: acquiring a second target frame of the target object in the next image frame according to the target tracking points.
In this embodiment of the application, as shown in fig. 4, acquiring the second target frame of the target object in the next image frame according to the target tracking points includes steps S2501 to S2504:
S2501: calculating the displacement of each optical flow tracking point in the X and Y directions from the coordinates of each optical flow tracking point in the set and the coordinates of its corresponding target tracking point;
S2502: calculating the absolute first distance between every two optical flow tracking points and the absolute second distance between the two corresponding target tracking points, obtaining the distance ratio of the absolute second distance to the absolute first distance, and determining the median of the distance ratios as the scale factor;
S2503: selecting the first median of the displacements of the optical flow tracking points in the X direction and the second median of the displacements in the Y direction;
S2504: calculating the second target frame from the center-point coordinates, width and height of the first target frame, the scale factor, the first median and the second median.
This can be understood as follows: after the target tracking point p'_ij corresponding to each optical flow tracking point p_ij is obtained, the coordinates of each p_ij and of its corresponding p'_ij are used to calculate the displacement dx_ij of each optical flow tracking point in the X direction and its displacement dy_ij in the Y direction; the median of the dx_ij is taken as the first median Δx and the median of the dy_ij as the second median Δy. Then the distance between every two optical flow tracking points p_ij is calculated and defined as the first distance a, and the distance between the two corresponding target tracking points p'_ij is defined as the second distance b; the median of the distance ratios |b|/|a| between the absolute value |b| of b and the absolute value |a| of a is taken as the scale factor, denoted scale. The center-point coordinates (x', y') of the second target frame are then calculated as x' = x + Δx + width × (1 − scale)/2 and y' = y + Δy + height × (1 − scale)/2, where (x, y) are the center-point coordinates of the first target frame and width and height are its width and height; correspondingly, width × scale is the width of the second target frame and height × scale its height. The position of the second target frame in the next image frame after the current image frame is determined from the center-point coordinates, the width and the height of the second target frame.
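A minimal Python sketch of steps S2501-S2504, following the formulas above; it assumes the point arrays are corresponding (N, 2) NumPy arrays of p_ij and p'_ij with at least two points:

```python
import numpy as np
from itertools import combinations

# Sketch of steps S2501-S2504: estimate the displacement medians and the
# scale factor from the tracked point pairs, then update the target frame.
# `box` is the first target frame as (x, y, width, height) with (x, y) the
# center point, matching the formulas in the text above.
def update_box(points, tracked, box):
    x, y, width, height = box
    d = tracked - points                       # per-point displacement
    dx = float(np.median(d[:, 0]))             # first median Δx (X direction)
    dy = float(np.median(d[:, 1]))             # second median Δy (Y direction)
    ratios = []
    for i, j in combinations(range(len(points)), 2):
        a = np.linalg.norm(points[i] - points[j])    # first distance
        b = np.linalg.norm(tracked[i] - tracked[j])  # second distance
        if a > 0:
            ratios.append(b / a)               # distance ratio |b|/|a|
    scale = float(np.median(ratios))           # scale factor
    x2 = x + dx + width * (1 - scale) / 2
    y2 = y + dy + height * (1 - scale) / 2
    return (x2, y2, width * scale, height * scale)
```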
It can be seen that, in this embodiment of the application, a first target frame of at least one target object in a current image frame of a target video is acquired; overlapping regions of the first target frame are marked to obtain its occlusion status; an optical flow tracking point set of the current image frame is acquired according to the occlusion status of the first target frame and the category of the target object; sparse optical flow calculation is performed on the set to obtain, for each optical flow tracking point, a corresponding target tracking point in the next image frame; and a second target frame of the target object in the next image frame is acquired according to the target tracking points. In this way, the optical flow tracking points of each object are selected according to the occlusion status of each target object in the current image frame, and all selected points are then placed in one set for a single sparse optical flow calculation, which improves the efficiency of multi-target tracking; meanwhile, because the points are selected according to the occlusion status of the target object, background pixels are prevented from being selected as tracking points, which improves the effect of multi-target tracking.
Referring to fig. 5, fig. 5 is a flowchart of another target tracking method according to an embodiment of the present application. As shown in fig. 5, the method includes steps S51-S56:
S51: acquiring a first target frame of at least one target object in a current image frame of a target video;
S52: marking overlapping regions of the first target frame to obtain the occlusion status of the first target frame;
S53: selecting optical flow tracking points from a preset area of each first target frame according to the occlusion status of each first target frame and the category of the target object corresponding to each first target frame;
in a specific embodiment of the present application, different optical flow tracking point selection areas are preset mainly for three kinds of target objects, and if the first target frame belongs to a first kind of target object, the first target frame is narrowed with a first preset ratio by taking the center of the first target frame as the center if the first target frame is not shielded, so as to obtain a first selection window, and the optical flow tracking point is selected in the first selection window; and if the first selected window is shielded, selecting the optical flow tracking point in the area where the first selected window is not shielded. If the first target frame is a face detection frame and the face detection frame is not shielded, the face detection frame is reduced by taking the center of the face detection frame as the center and a first preset proportion (for example, 50% -80% of the width and the height) to obtain a first selection window shown in fig. 6a, and 5*5 characteristic points are selected as optical flow tracking points in the first selection window to cover eyes, noses and mouths; as shown in fig. 6b, if there is a mask in the face detection frame and the masked area is the lower right half corner of the first selection window obtained according to the above method, the feature points of the area where the first selection window is not masked are selected as optical flow tracking points, that is, the non-masked feature points in the feature points of 5*5 are simply selected as optical flow tracking points.
For a first target frame belonging to the second category: if it is not occluded, the optical flow tracking points are selected at a first preset height and a second preset height of the first target frame; if the first or second preset height is occluded, the optical flow tracking points are selected in the unoccluded regions at those heights. As shown in fig. 7a, if the first target frame is a human body detection frame and it is not occluded, 1/4 of the height of the human body detection frame (here near the head) is taken as the first preset height, and 5×3 feature points centered on the feature point at the center of the first preset height are selected as optical flow tracking points; likewise, 1/2 of the height of the human body detection frame (near the chest and abdomen) is taken as the second preset height, and 5×5 feature points centered on the feature point at the center of the second preset height are selected as optical flow tracking points. As shown in fig. 7b, if the human body detection frame is occluded and the occluded area covers most of the left-hand feature points at the second preset height, the 5×3 feature points are still selected at the first preset height, while at the second preset height only the unoccluded points among the 5×5 feature points are selected; similarly, if the first preset height is occluded, only the unoccluded points among the 5×3 feature points at the first preset height are selected.
For a first target frame belonging to the third category: if it is not occluded, it is shrunk about its center by a second preset ratio to obtain a second selection window, and the optical flow tracking points are selected within the second selection window; if the second selection window is occluded, the optical flow tracking points are selected in its unoccluded region. Taking a vehicle detection frame as an example: if the vehicle detection frame is not occluded, it is shrunk about its center by a second preset ratio (for example to 80% of its width and height) to obtain the second selection window shown in fig. 8a, within which 5×5 feature points are selected as optical flow tracking points; as shown in fig. 8b, if the vehicle detection frame is occluded and the occluded area is part of the second selection window obtained as above, the feature points of the unoccluded region of the second selection window are likewise selected as optical flow tracking points. A sketch of this category-dependent selection is given below.
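The sketch below illustrates the shared selection pattern across the three categories. The grid sizes and shrink ratios mirror the examples above, while the exact band extents for the human-body case and the point-in-rectangle occlusion test are simplifying assumptions:

```python
import numpy as np

def grid_points(cx, cy, w, h, rows, cols):
    """Uniform rows x cols grid inside a w x h window centered at (cx, cy)."""
    xs = np.linspace(cx - w / 2, cx + w / 2, cols)
    ys = np.linspace(cy - h / 2, cy + h / 2, rows)
    return np.array([(px, py) for py in ys for px in xs])

def inside(pt, box):
    """True if pt lies in the axis-aligned box (cx, cy, w, h)."""
    bx, by, bw, bh = box
    return abs(pt[0] - bx) <= bw / 2 and abs(pt[1] - by) <= bh / 2

def select_points(target, shrink=0.8):
    """Select optical flow tracking points for one first target frame."""
    cx, cy, w, h = target['box']
    if target['category'] in ('face', 'vehicle'):
        # first/third category: shrink the frame about its center to get
        # the selection window, then take a 5x5 grid inside it
        pts = grid_points(cx, cy, w * shrink, h * shrink, 5, 5)
    else:
        # second category (human body): a 5x3 grid near 1/4 height and a
        # 5x5 grid near 1/2 height; the band heights (10%/20%) are assumptions
        top = cy - h / 2
        pts = np.vstack([
            grid_points(cx, top + h / 4, w * shrink, h * 0.1, 3, 5),
            grid_points(cx, top + h / 2, w * shrink, h * 0.2, 5, 5),
        ])
    # drop candidates that fall inside any marked occluded region
    occluded = target.get('occluded_regions', [])
    return np.array([p for p in pts if not any(inside(p, o) for o in occluded)])
```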
S54: adding the optical flow tracking points selected for each first target frame to the same set to obtain the optical flow tracking point set;
In this embodiment of the application, after the optical flow tracking points are selected for each first target frame in step S53, they are added to one set, the optical flow tracking point set, which facilitates the subsequent single sparse optical flow calculation.
S55: performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame after the current image frame;
S56: acquiring a second target frame of the target object in the next image frame according to the target tracking points.
Steps of the embodiment shown in fig. 5 that are already described in the embodiment shown in fig. 2 achieve the same or similar beneficial effects and, to avoid repetition, are not described again here.
Based on the above description of the embodiments of the target tracking method, an embodiment of the present application further provides a target tracking apparatus, which may be a computer program (including program code) running in a terminal. The target tracking apparatus may perform the method shown in fig. 2 or fig. 5. Referring to fig. 9, the target tracking apparatus comprises:
a target detection module 91, configured to acquire a first target frame of at least one target object in a current image frame of a target video;
an occlusion detection module 92, configured to mark overlapping regions of the first target frame to obtain the occlusion status of the first target frame;
a tracking point set acquisition module 93, configured to acquire an optical flow tracking point set of the current image frame according to the occlusion status of the first target frame and the category of the target object;
an optical flow calculation module 94, configured to perform sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame after the current image frame;
and a target position determination module 95, configured to acquire a second target frame of the target object in the next image frame according to the target tracking points.
In one possible example, in acquiring the optical flow tracking point set of the current image frame according to the occlusion status of the first target frame and the category of the target object, the tracking point set acquisition module 93 is specifically configured to:
select optical flow tracking points from a preset area of each first target frame according to the occlusion status of each first target frame and the category of the target object corresponding to each first target frame;
and add the optical flow tracking points selected for each first target frame to the same set to obtain the optical flow tracking point set.
In one possible example, in selecting the optical flow tracking points from the preset area of each first target frame according to the occlusion status of each first target frame and the category of the corresponding target object, the tracking point set acquisition module 93 is specifically configured to:
for a first target frame belonging to a first category of target object: if the first target frame is not occluded, shrink it about its center by a first preset ratio to obtain a first selection window and select the optical flow tracking points within the first selection window; if the first selection window is occluded, select the optical flow tracking points in its unoccluded region;
for a first target frame belonging to a second category of target object: if the first target frame is not occluded, select the optical flow tracking points at a first preset height and a second preset height of the first target frame; if the first or second preset height is occluded, select the optical flow tracking points in the unoccluded regions at those heights;
for a first target frame belonging to a third category of target object: if the first target frame is not occluded, shrink it about its center by a second preset ratio to obtain a second selection window and select the optical flow tracking points within the second selection window; if the second selection window is occluded, select the optical flow tracking points in its unoccluded region.
In one possible example, in performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame, the optical flow calculation module 94 is specifically configured to:
convert the current image frame into a grayscale image and construct an image pyramid from the grayscale image, the bottom layer of the image pyramid being the current image frame;
acquire the coordinates of each optical flow tracking point in the set on each layer of the image pyramid;
calculate the optical flow of each optical flow tracking point in the set;
and obtain the target tracking points from the coordinates of each optical flow tracking point on the current image frame and its optical flow.
In one possible example, the occlusion detection module 92 is specifically configured to:
divide the first target frame into a preset number of target regions;
obtain the occlusion score of each of the preset number of target regions;
and mark each target region whose occlusion score is smaller than a threshold as an overlapping region.
In one possible example, in acquiring the second target frame of the target object in the next image frame according to the target tracking points, the target position determination module 95 is specifically configured to:
calculate the displacement of each optical flow tracking point in the X and Y directions from the coordinates of each optical flow tracking point in the set and the coordinates of its corresponding target tracking point;
calculate the absolute first distance between every two optical flow tracking points and the absolute second distance between the two corresponding target tracking points, obtain the distance ratio of the absolute second distance to the absolute first distance, and determine the median of the distance ratios as the scale factor;
select the first median of the displacements of the optical flow tracking points in the X direction and the second median of the displacements in the Y direction;
and calculate the second target frame from the center-point coordinates, width and height of the first target frame, the scale factor, the first median and the second median.
According to an embodiment of the present application, the units of the target tracking apparatus shown in fig. 9 may be separately or wholly combined into one or several other units, or one or more of them may be further split into functionally smaller units, without affecting the achievement of the technical effects of the embodiments of the present invention. The above units are divided on the basis of logical functions; in practical applications, the function of one unit may be realized by several units, or the functions of several units by one unit. In other embodiments of the present invention, the target tracking apparatus may likewise include other units, and in practical applications these functions may also be realized with the assistance of other units and through the cooperation of several units.
According to another embodiment of the present application, the apparatus shown in fig. 9 may be constructed, and the target tracking method of the embodiments of the present invention implemented, by running a computer program (including program code) capable of executing the steps of the methods shown in fig. 2 or fig. 5 on a general-purpose computing device, such as a computer, that comprises processing elements such as a central processing unit (CPU) and storage elements such as a random access memory (RAM) and a read-only memory (ROM). The computer program may be recorded on, for example, a computer-readable recording medium, and loaded into and run on the above computing device via that medium.
Based on the descriptions of the method embodiment and the apparatus embodiment, an embodiment of the present invention further provides an electronic device. Referring to fig. 10, the electronic device comprises at least a processor 1001, an input device 1002, an output device 1003 and a computer storage medium 1004, which may be connected by a bus or in another manner.
The computer storage medium 1004 may be stored in the memory of the electronic device; it is configured to store a computer program comprising program instructions, and the processor 1001 is configured to execute the program instructions stored in the computer storage medium 1004. The processor 1001, or CPU (Central Processing Unit), is the computing and control core of the electronic device, adapted to implement one or more instructions, and in particular to load and execute one or more instructions to realize the corresponding method flow or function.
In one embodiment, the processor 1001 of the electronic device provided in the embodiments of the present application may be configured to perform a series of target tracking operations:
acquiring a first target frame of at least one target object in a current image frame of a target video;
marking overlapping regions of the first target frame to obtain the occlusion status of the first target frame;
acquiring an optical flow tracking point set of the current image frame according to the occlusion status of the first target frame and the category of the target object;
performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame after the current image frame;
and acquiring a second target frame of the target object in the next image frame according to the target tracking points.
In one embodiment, when acquiring the optical flow tracking point set of the current image frame according to the occlusion status of the first target frame and the category of the target object, the processor 1001 performs:
selecting optical flow tracking points from a preset area of each first target frame according to the occlusion status of each first target frame and the category of the target object corresponding to each first target frame;
and adding the optical flow tracking points selected for each first target frame to the same set to obtain the optical flow tracking point set.
In one embodiment, when selecting the optical flow tracking points from the preset area of each first target frame according to the occlusion status of each first target frame and the category of the corresponding target object, the processor 1001 performs:
for a first target frame belonging to a first category of target object: if the first target frame is not occluded, shrinking it about its center by a first preset ratio to obtain a first selection window and selecting the optical flow tracking points within the first selection window; if the first selection window is occluded, selecting the optical flow tracking points in its unoccluded region;
for a first target frame belonging to a second category of target object: if the first target frame is not occluded, selecting the optical flow tracking points at a first preset height and a second preset height of the first target frame; if the first or second preset height is occluded, selecting the optical flow tracking points in the unoccluded regions at those heights;
for a first target frame belonging to a third category of target object: if the first target frame is not occluded, shrinking it about its center by a second preset ratio to obtain a second selection window and selecting the optical flow tracking points within the second selection window; if the second selection window is occluded, selecting the optical flow tracking points in its unoccluded region.
In one embodiment, when performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame, the processor 1001 performs:
converting the current image frame into a grayscale image and constructing an image pyramid from the grayscale image, the bottom layer of the image pyramid being the current image frame;
acquiring the coordinates of each optical flow tracking point in the set on each layer of the image pyramid;
calculating the optical flow of each optical flow tracking point in the set;
and obtaining the target tracking points from the coordinates of each optical flow tracking point on the current image frame and its optical flow.
In one embodiment, when marking overlapping regions of the first target frame, the processor 1001 performs:
dividing the first target frame into a preset number of target regions;
obtaining the occlusion score of each of the preset number of target regions;
and marking each target region whose occlusion score is smaller than a threshold as an overlapping region.
In one embodiment, when acquiring the second target frame of the target object in the next image frame according to the target tracking points, the processor 1001 performs:
calculating the displacement of each optical flow tracking point in the X and Y directions from the coordinates of each optical flow tracking point in the set and the coordinates of its corresponding target tracking point;
calculating the absolute first distance between every two optical flow tracking points and the absolute second distance between the two corresponding target tracking points, obtaining the distance ratio of the absolute second distance to the absolute first distance, and determining the median of the distance ratios as the scale factor;
selecting the first median of the displacements of the optical flow tracking points in the X direction and the second median of the displacements in the Y direction;
and calculating the second target frame from the center-point coordinates, width and height of the first target frame, the scale factor, the first median and the second median.
By way of example, the electronic device may be a computer, a notebook computer, a tablet computer, a palmtop computer, a server or the like. The electronic device may include, but is not limited to, the processor 1001, the input device 1002, the output device 1003 and the computer storage medium 1004. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of an electronic device and does not limit it; the device may include more or fewer components than shown, combine certain components, or have different components.
It should be noted that, since the steps of the above target tracking method are implemented when the processor 1001 of the electronic device executes the computer program, the embodiments of the target tracking method described above all apply to the electronic device and all achieve the same or similar beneficial effects.
The embodiment of the application also provides a computer storage medium (Memory), which is a memory device in the electronic device used for storing programs and data. It will be appreciated that the computer storage medium here may include both a storage medium built into the terminal and an extended storage medium supported by the terminal. The computer storage medium provides a storage space that stores the operating system of the terminal. Also stored in this space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor 1001. The computer storage medium here may be a high-speed RAM memory or a non-volatile memory, such as at least one magnetic disk memory; alternatively, it may be at least one computer storage medium located remotely from the processor 1001. In one embodiment, one or more instructions stored in the computer storage medium may be loaded and executed by the processor 1001 to implement the corresponding steps of the target tracking method described above.
It should be noted that, since the steps of the above target tracking method are implemented when the computer program on the computer storage medium is executed by a processor, all of the embodiments and implementations of the target tracking method apply equally to the computer storage medium and achieve the same or similar beneficial effects.
The embodiments of the present application have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. Meanwhile, a person skilled in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A method of target tracking, the method comprising:
acquiring a first target frame corresponding to at least one target object in a current image frame of a target video, wherein the acquiring comprises: inputting the current image frame into a pre-trained neural network for feature extraction and target detection, and outputting detection frames of all target objects in the current image frame;
carrying out overlapping-region marking on the first target frame to obtain an occlusion condition of the first target frame;
acquiring an optical flow tracking point set of the current image frame according to the occlusion condition of the first target frame and the category of the target object, wherein target objects of different categories have different areas from which the optical flow tracking points are selected;
performing sparse optical flow calculation on the optical flow tracking point set to obtain a target tracking point corresponding to each optical flow tracking point in the optical flow tracking point set in the next image frame of the current image frame;
and acquiring a second target frame of the target object in the next image frame according to the target tracking point.
2. The method of claim 1, wherein the acquiring the optical flow tracking point set of the current image frame according to the occlusion condition of the first target frame and the category of the target object comprises:
selecting the optical flow tracking points from a preset area of each first target frame according to the occlusion condition of each first target frame and the category of the target object corresponding to each first target frame;
and adding the optical flow tracking points selected by each first target frame into the same set to obtain the optical flow tracking point set.
3. The method of claim 2, wherein the selecting the optical flow tracking points from the preset area of each first target frame according to the occlusion condition of each first target frame and the category of the target object corresponding to each first target frame comprises:
for a first target frame belonging to a first category of target object, if the first target frame is not occluded, shrinking the first target frame about its center by a first preset ratio to obtain a first selection window, and selecting the optical flow tracking points within the first selection window; if the first selection window is occluded, selecting the optical flow tracking points in the area of the first selection window that is not occluded;
for a first target frame belonging to a second category of target object, if the first target frame is not occluded, selecting the optical flow tracking points at a first preset height and a second preset height of the first target frame; if the first preset height or the second preset height is occluded, selecting the optical flow tracking points in the area at the first or second preset height that is not occluded;
and for a first target frame belonging to a third category of target object, if the first target frame is not occluded, shrinking the first target frame about its center by a second preset ratio to obtain a second selection window, and selecting the optical flow tracking points within the second selection window; if the second selection window is occluded, selecting the optical flow tracking points in the area of the second selection window that is not occluded (a hedged sketch of this per-category selection follows the claims).
4. The method of claim 1, wherein the performing sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame of the current image frame comprises:
converting the current image frame into a grayscale image, and constructing an image pyramid from the grayscale image of the current image frame, where the bottommost layer of the image pyramid is the current image frame;
acquiring the coordinates of each optical flow tracking point in the set on each layer of the image pyramid;
calculating the optical flow of each optical flow tracking point in the set;
and obtaining the target tracking points from the coordinates of each optical flow tracking point on the current image frame and its calculated optical flow.
5. The method of claim 1, wherein the marking of the overlapping area of the first target frame comprises:
dividing the first target frame into a preset number of target areas;
obtaining an occlusion score for each of the preset number of target areas;
and marking each target area whose occlusion score is smaller than a threshold as an overlapping area.
6. The method of claim 1, wherein the acquiring a second target frame of the target object in the next image frame according to the target tracking points comprises:
calculating the displacement of each optical flow tracking point in the X and Y directions from the coordinates of each optical flow tracking point in the set and the coordinates of its corresponding target tracking point;
calculating a first distance between every two optical flow tracking points and taking its absolute value, calculating a second distance between the two corresponding target tracking points and taking its absolute value, obtaining the ratio of the two absolute distances, and determining the median of these distance ratios as the scaling factor;
selecting a first median of the X-direction displacements and a second median of the Y-direction displacements of the optical flow tracking points in the set;
and calculating the second target frame from the center-point coordinates, width, and height of the first target frame, the scaling factor, the first median, and the second median.
7. A multi-target tracking apparatus, the apparatus comprising:
the target detection module, configured to obtain a first target frame corresponding to at least one target object in a current image frame of a target video, by: inputting the current image frame into a pre-trained neural network for feature extraction and target detection, and outputting detection frames of all target objects in the current image frame;
the occlusion detection module, configured to mark the overlapping area of the first target frame to obtain an occlusion condition of the first target frame;
the tracking point set acquisition module, configured to acquire an optical flow tracking point set of the current image frame according to the occlusion condition of the first target frame and the category of the target object, wherein target objects of different categories have different areas from which the optical flow tracking points are selected;
the optical flow calculation module, configured to perform sparse optical flow calculation on the optical flow tracking point set to obtain, for each optical flow tracking point in the set, a corresponding target tracking point in the next image frame of the current image frame;
and the target position determining module, configured to acquire a second target frame of the target object in the next image frame according to the target tracking points.
8. The apparatus of claim 7, wherein the tracking point set acquisition module is configured to acquire the optical flow tracking point set of the current image frame according to the occlusion condition of the first target frame and the category of the target object by:
selecting the optical flow tracking points from a preset area of each first target frame according to the occlusion condition of each first target frame and the category of the target object corresponding to each first target frame;
and adding the optical flow tracking points selected by each first target frame into the same set to obtain the optical flow tracking point set.
9. An electronic device comprising an input device and an output device, and further comprising:
a processor adapted to implement one or more instructions; and
a computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the method of any one of claims 1-6.
10. A computer storage medium storing one or more instructions adapted to be loaded by a processor and to perform the method of any one of claims 1-6.
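As a hedged illustration of the per-category selection in claim 3, the sketch below shrinks the target frame for the first and third categories and samples two preset heights for the second. The mapping of categories to concrete object types, the ratios, the heights, and the grid density are all assumptions; occlusion handling is reduced to discarding points that fall inside marked overlap areas.

import numpy as np

def select_points(box, category, overlapped, grid=(4, 4),
                  shrink={'class1': 0.8, 'class3': 0.6},
                  heights={'class2': (0.25, 0.6)}):
    """box: (x1, y1, x2, y2). overlapped: predicate (x, y) -> bool that
    reports whether a coordinate lies in a marked overlap area."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    pts = []
    if category in shrink:                      # categories 1 and 3: shrunken window
        r = shrink[category]
        w, h = (x2 - x1) * r, (y2 - y1) * r
        xs = np.linspace(cx - w / 2, cx + w / 2, grid[1])
        ys = np.linspace(cy - h / 2, cy + h / 2, grid[0])
        pts = [(x, y) for y in ys for x in xs]
    elif category in heights:                   # category 2: two preset heights
        for frac in heights[category]:
            y = y1 + frac * (y2 - y1)
            pts += [(x, y) for x in np.linspace(x1, x2, grid[1])]
    # keep only points outside areas marked as overlapping
    return np.array([(x, y) for x, y in pts if not overlapped(x, y)],
                    dtype=np.float32).reshape(-1, 2)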
CN201911374132.XA 2019-12-26 2019-12-26 Target tracking method, device, electronic equipment and storage medium Active CN111047626B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911374132.XA CN111047626B (en) 2019-12-26 2019-12-26 Target tracking method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111047626A CN111047626A (en) 2020-04-21
CN111047626B (en) 2024-03-22

Family

ID=70240454

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911374132.XA Active CN111047626B (en) 2019-12-26 2019-12-26 Target tracking method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111047626B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113660443A (en) * 2020-05-12 2021-11-16 武汉Tcl集团工业研究院有限公司 Video frame insertion method, terminal and storage medium
CN114170267A (en) * 2020-09-10 2022-03-11 华为技术有限公司 Target tracking method, device, equipment and computer readable storage medium
CN112784680B (en) * 2020-12-23 2024-02-02 中国人民大学 Method and system for locking dense contactors in people stream dense places
CN112686204B (en) * 2021-01-12 2022-09-02 昆明理工大学 Video flow measurement method and device based on sparse pixel point tracking
CN112699854B (en) * 2021-03-22 2021-07-20 亮风台(上海)信息科技有限公司 Method and device for identifying stopped vehicle
CN113160149B (en) * 2021-03-31 2024-03-01 杭州海康威视数字技术股份有限公司 Target display method and device, electronic equipment and endoscope system
CN113284167B (en) * 2021-05-28 2023-03-07 深圳数联天下智能科技有限公司 Face tracking detection method, device, equipment and medium
CN113516093A (en) * 2021-07-27 2021-10-19 浙江大华技术股份有限公司 Marking method and device of identification information, storage medium and electronic device
CN115049954B (en) * 2022-05-09 2023-09-22 北京百度网讯科技有限公司 Target identification method, device, electronic equipment and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104599286A (en) * 2013-10-31 2015-05-06 展讯通信(天津)有限公司 Optical flow based feature tracking method and device
CN109558815A (en) * 2018-11-16 2019-04-02 恒安嘉新(北京)科技股份公司 A kind of detection of real time multi-human face and tracking
CN109785363A (en) * 2018-12-29 2019-05-21 中国电子科技集团公司第五十二研究所 A kind of unmanned plane video motion Small object real-time detection and tracking

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105574894B (en) * 2015-12-21 2018-10-16 天津远度科技有限公司 A kind of screening technique and system of moving object feature point tracking result

Also Published As

Publication number Publication date
CN111047626A (en) 2020-04-21

Similar Documents

Publication Publication Date Title
CN111047626B (en) Target tracking method, device, electronic equipment and storage medium
Chen et al. Piou loss: Towards accurate oriented object detection in complex environments
CN111160379B (en) Training method and device of image detection model, and target detection method and device
Yi et al. ASSD: Attentive single shot multibox detector
US10769493B2 (en) Method and apparatus for neural network training and construction and method and apparatus for object detection
Chen et al. Multimodal object detection via probabilistic ensembling
US9965719B2 (en) Subcategory-aware convolutional neural networks for object detection
CN110135243B (en) Pedestrian detection method and system based on two-stage attention mechanism
Leng et al. Realize your surroundings: Exploiting context information for small object detection
US10198689B2 (en) Method for object detection in digital image and video using spiking neural networks
CN108062525B (en) Deep learning hand detection method based on hand region prediction
Lei et al. Region-enhanced convolutional neural network for object detection in remote sensing images
CN108986152B (en) Foreign matter detection method and device based on difference image
Ahmad et al. Convolutional neural network–based person tracking using overhead views
CN108182695B (en) Target tracking model training method and device, electronic equipment and storage medium
CN109543641A (en) A kind of multiple target De-weight method, terminal device and the storage medium of real-time video
Wang et al. Pyramid-dilated deep convolutional neural network for crowd counting
CN113378675A (en) Face recognition method for simultaneous detection and feature extraction
CN114067428A (en) Multi-view multi-target tracking method and device, computer equipment and storage medium
Wan et al. Mixed local channel attention for object detection
Xing et al. Feature adaptation-based multipeak-redetection spatial-aware correlation filter for object tracking
Li et al. Occluded pedestrian detection through bi-center prediction in anchor-free network
Zhang et al. Construction of a feature enhancement network for small object detection
WO2019155628A1 (en) Image processing device, image processing method, and recording medium
CN113065379B (en) Image detection method and device integrating image quality and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant