CN113989332A - Target tracking method and device, storage medium and electronic equipment

Info

Publication number: CN113989332A (granted as CN113989332B)
Application number: CN202111358671.1A
Authority: CN (China)
Prior art keywords: target, target object, detection, observation, state
Inventors: 赵晓萌, 李发成, 张如高, 虞正华
Assignee (original and current): Suzhou Moshi Intelligent Technology Co., Ltd.
Other languages: Chinese (zh)
Legal status: Active (granted)

Classifications

    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/11 Region-based segmentation
    • G06T7/136 Segmentation; edge detection involving thresholding
    • G06T7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G06T2207/10016 Video; image sequence
    • G06T2207/20076 Probabilistic image processing
    • G06T2207/20081 Training; learning
    • G06T2207/20092 Interactive image processing based on input by user
    • G06T2207/20104 Interactive definition of region of interest [ROI]


Abstract

The invention discloses a target tracking method and apparatus, a storage medium, and an electronic device. The method includes: receiving an image uploaded by a sensor and performing target object detection in the image to obtain a plurality of target object detection frames; determining target observation points from the plurality of target object detection frames; processing the region where each target observation point is located; detecting the target object according to the processing result; and responding with the corresponding target object tracking operation according to the detection result. Because the target observation point is determined from the target object detection frame and the target object is detected from the processing result of the region where the observation point is located, no feature extraction over a number of feature points is needed for a single target block, which shortens the target tracking time.

Description

Target tracking method and device, storage medium and electronic equipment
Technical Field
The invention relates to the technical field of computer vision, in particular to a target tracking method, a target tracking device, a storage medium and electronic equipment.
Background
Target tracking is an important branch of computer vision that integrates advanced technologies and research results from related fields such as image processing, pattern recognition, artificial intelligence, automatic control, and computer application technology. The key to target tracking is to segment the target completely, extract its features reasonably and quickly, and identify it accurately, while also keeping the algorithm fast enough to run in real time. Tracking a target in a complex environment is very difficult; in the prior art, tracking in a complex environment requires extracting image features of the target and performing relatively complex feature matching, which takes a long time.
Disclosure of Invention
In view of this, embodiments of the present invention provide a target tracking method and apparatus, a storage medium, and an electronic device, so as to solve the technical problem in the prior art that target tracking is time-consuming in a complex occlusion environment.
The technical scheme provided by the invention is as follows:
a first aspect of an embodiment of the present invention provides a target tracking method, where the target tracking method includes: receiving an image uploaded by a sensor and carrying out target object detection in the image to obtain a plurality of target object detection frames; determining target observation points from the plurality of target detection frames; processing the area where the target observation point is located and detecting the target object according to the processing result; and responding to corresponding target object tracking operation according to the detection result.
Optionally, processing the region where the target observation point is located and detecting the target object according to the processing result includes: performing semantic segmentation on the region where the target observation point is located; and performing inverse projection transformation on the semantic segmentation result.
Optionally, detecting the target object according to the processing result includes performing target detection with a target observation model of the form:

p(Z | X) = Σ_{Z = C ⊎ O_1 ⊎ … ⊎ O_n} p(C) · ∏_{i=1}^{n} p(O_i | x_i)

where Ω = O_1 ∪ O_2 ∪ ... ∪ O_n is the multi-target observation set; C is the clutter observation set and O is a target observation set; Z = Ω ∪ C is the single-frame multi-target observation set, i.e. the union of the target observation set and the clutter set; the summation traverses every disjoint decomposition of the observation set Z; p(C) denotes the Poisson point process of C; λ(x) is its intensity function; and p(O | x) denotes the observation model of a single sensor given the single-target state vector x.
Optionally, responding with the corresponding target object tracking operation according to the detection result includes: when the target object is detected, responding with the corresponding target object tracking operation; and when the target object is not detected, determining whether the target object detection area is occluded, and responding with the corresponding target object tracking operation according to the occlusion determination result.
Optionally, when the target object is not detected, determining whether the target object detection area is occluded includes: calculating the target object detection probability according to the inverse projection transformation; when the target object detection probability satisfies a first probability condition, determining that the target object detection area is occluded; and when the target object detection probability satisfies a second probability condition, determining that the target object detection area is not occluded.
Optionally, responding with the corresponding target object tracking operation according to the occlusion determination result includes: when the target object detection area is not occluded, updating the state data of the target object at the previous time with the state data of the target object at the current time, and outputting the updated target estimate of the target object at the current time; and when the target object detection area is occluded, predicting the state data of the target object at the current time from the state data of the target object at the previous time and a preset state prediction model, and responding with the target object tracking operation according to the state data at the current time.
Optionally, updating the state data of the target object at the previous time with the state data of the target object at the current time includes performing the state data update with a target transition model of the form:

p(X | X′) = Σ_{X = B ⊎ S_1 ⊎ … ⊎ S_n} p(B) · ∏_{i=1}^{n} p(S_i | x_i′)

where X = Ξ ∪ B denotes the multi-target state set at the current time, Ξ = S_1 ∪ S_2 ∪ ... ∪ S_n is the set transitioned from the multi-target state set at the previous time, and B is the set of new targets generated by the multi-sensor at the current time; p(B) denotes the Poisson point process of B; λ(x) is its intensity function; and p(S | x) is a Bernoulli random finite set.
A second aspect of the embodiments of the present invention provides a target tracking apparatus, including: a receiving module, configured to receive the image uploaded by the sensor and perform target object detection in the image to obtain a plurality of target object detection frames; a determining module, configured to determine target observation points from the plurality of target object detection frames; a detection processing module, configured to process the region where the target observation point is located and detect the target object according to the processing result; and a tracking module, configured to respond with the corresponding target object tracking operation according to the detection result.
A third aspect of the embodiments of the present invention provides a computer-readable storage medium storing computer instructions for causing a computer to execute the target tracking method according to the first aspect of the embodiments of the present invention or any optional implementation thereof.
A fourth aspect of the embodiments of the present invention provides an electronic device, including a memory and a processor communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions to perform the target tracking method according to the first aspect of the embodiments of the present invention or any optional implementation thereof.
The technical scheme provided by the invention has the following effects:
the target tracking method provided by the embodiment of the invention receives the image uploaded by the sensor and detects the target object in the image to obtain a plurality of target object detection frames; determining target observation points from the plurality of target detection frames; processing the area where the target observation point is located and detecting the target object according to the processing result; and responding to corresponding target object tracking operation according to the detection result. According to the method, the target observation point is determined according to the target object detection frame, the target object is detected through the processing result of the area where the target observation point is located, the feature extraction of a plurality of feature points is not needed for a single target block, and the target tracking time is shortened.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow diagram of a target tracking method according to an embodiment of the invention;
FIG. 2 is a diagram illustrating the effect of probability matrix mapping provided by the target tracking method according to the embodiment of the present invention;
FIG. 3 is a block diagram of a target tracking device according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a computer-readable storage medium provided according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are some, but not all, embodiments of the present invention. All other embodiments derived by those skilled in the art from the given embodiments without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a target tracking method, as shown in fig. 1, the method includes the following steps:
step S101: and receiving the images uploaded by the sensors and carrying out target object detection in the images to obtain a plurality of target object detection frames. Specifically, a sensor (such as a camera) acquires images, and each frame of the images is detected and calculated through a depth learning (such as fast R-CNN, YOLO or FCOS) so as to obtain an image plane target-of-interest circumscribed rectangle frame or a vehicle tail frame, namely a plurality of target object detection frames.
Step S102: determining target observation points from the plurality of target object detection frames. Specifically, after the image is processed to obtain the plurality of target object detection frames, non-maximum suppression is applied and the center point of the bottom edge of each remaining target object detection frame is extracted as a target observation point. Non-maximum suppression searches for local maxima and suppresses non-maximal responses, i.e. it finds the optimal bounding boxes among the plurality of target object detection frames and eliminates the redundant ones.
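As an illustrative sketch of this step (the helper names and the IoU threshold are assumptions; the patent does not prescribe a particular NMS variant), the target observation points can be extracted as follows:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes."""
    boxes = np.asarray(boxes, dtype=float)
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # intersection of the top-scoring box with the remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_rest = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                    (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_rest - inter)
        order = order[1:][iou <= iou_thresh]  # drop redundant boxes
    return keep

def observation_points(boxes, scores):
    """Bottom-edge center (u, v) of each detection frame kept by NMS."""
    boxes = np.asarray(boxes, dtype=float)
    return [((b[0] + b[2]) / 2.0, b[3]) for b in boxes[nms(boxes, scores)]]
```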
Step S103: processing the region where the target observation point is located and detecting the target object according to the processing result. Specifically, after the target observation points are determined, the observations of the image satisfy the observation assumptions: each target produces at most one observation, and different targets produce different observations. The region where a target observation point is located, which may be the drivable area within the visible range of the camera, is then processed, for example by image processing such as resolution processing; after the region where the target observation point is located is processed, the target object is detected according to the processing result.
Step S104: responding with the corresponding target object tracking operation according to the detection result. Specifically, after the target object is detected, the corresponding target object is tracked according to the detection result.
The target tracking method provided by the embodiment of the present invention receives the image uploaded by the sensor and performs target object detection in the image to obtain a plurality of target object detection frames; determines target observation points from the plurality of target object detection frames; processes the region where the target observation point is located and detects the target object according to the processing result; and responds with the corresponding target object tracking operation according to the detection result. Because the target observation point is determined from the target object detection frame and the target object is detected from the processing result of the region where the observation point is located, no feature extraction over a number of feature points is needed for a single target block, which shortens the target tracking time.
As an optional implementation of the embodiment of the present invention, the region where the target observation point is located is processed as follows. Specifically, after the target observation point is determined, semantic segmentation is performed on the region where the target observation point is located, and inverse projection transformation is performed on the semantic segmentation result.
For each frame of the camera image, a deep-learning semantic segmentation network such as DeepLab, SegNet, or PSPNet outputs the drivable area of the target in the image, given in the form of per-pixel probability values:

p(d_i) = p_Φ(i | I)^{1(d_i = 1)} · (1 − p_Φ(i | I))^{1(d_i = 0)}

where d_i is the random variable corresponding to the i-th pixel in the current frame image, describing the probability distribution of the current pixel being drivable area; p_Φ(i | I) is the probability that the i-th pixel in the drivable-area segmentation result output by the semantic segmentation network is drivable area. Specifically, the semantic segmentation network is trained on a drivable-area semantic segmentation data set, and when a new image I is input, the network outputs the probability p_Φ(i | I) of the semantic class of any pixel i in the image I. Φ denotes the semantic segmentation network parameters, I is the current frame image, and 1(·) is the indicator function, which takes the value 1 when the condition in its argument holds and 0 otherwise.
In one embodiment, as shown in FIG. 2, region 1 is the segmentation result determined, according to the probability, to be the drivable area of the target, and regions 2 and 3 are the segmentation results determined to be non-drivable. The specific region segmentation can be set according to the needs of the user; for example, with a probability threshold of 80%, regions with probability values greater than or equal to 80% are taken as the target drivable area and regions below 80% as the non-drivable area.
In one embodiment, after the semantic segmentation, the semantic segmentation probability matrix is defined as

D = [p_Φ(i | I)]_i

where [·]_i denotes the matrix arranged by pixel index. Assuming the camera intrinsic and extrinsic parameters are known, the semantic segmentation probability matrix is projected with those parameters into an IPM (inverse perspective mapping) probability matrix D_IPM by the inverse projection transformation. The camera detection rate distribution p_D(d | x) in the IPM plane, conditioned on the target position state x, is defined as:

p_D(d | x) = ⟨D_IPM⟩(x)

where ⟨·⟩ denotes an interpolation operation. Since x represents the target state position and D_IPM is a probability matrix, bilinear interpolation is performed at the position x, which makes the target state trajectory smoother.
As an optional implementation of the embodiment of the present invention, the target object is detected according to the processing result by constructing a target observation model. Specifically, the camera observation model, i.e. the single-target observation model, is determined from the camera detection rate distribution as:

p(O | x) = 1 − p_D(d | x),        if O = ∅
p(O | x) = p_D(d | x) · g(z | x), if O = {z}

that is, p(O | x) is distributed as a Bernoulli random finite set (Bernoulli RFS), where g(z | x) is the single-target observation likelihood.

Let C be the clutter observation set of each frame of the sensor, and assume that the positions of the clutter C follow a Poisson point process, i.e.:

p(C) = e^{−⟨λ, 1⟩} · ∏_{z ∈ C} λ(z)

where λ(x) is the intensity function and ⟨λ, 1⟩ = ∫ λ(x) dx.

Further, given the multi-target state set X, the clutter state C is independent of the target observation states O, and the target observations are mutually independent. The single-frame multi-target observation set is the union Z = Ω ∪ C of the target observation sets and the clutter set, where the multi-target observation set is defined as Ω = O_1 ∪ O_2 ∪ ... ∪ O_n. Then, according to the convolution theorem for random finite sets, the multi-target observation model is constructed as:

p(Z | X) = Σ_{Z = C ⊎ O_1 ⊎ … ⊎ O_n} p(C) · ∏_{i=1}^{n} p(O_i | x_i)

where ⊎ denotes disjoint union, and the summation traverses every such decomposition of the observation set Z.
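A minimal sketch of the single-target Bernoulli observation density and the Poisson clutter density defined above; the callables p_d, g, and lam, and the precomputed clutter-intensity integral, are assumed to be supplied by the surrounding tracker:

```python
import math

def single_target_obs_density(O, x, p_d, g):
    """Bernoulli RFS density p(O | x): a target yields at most one observation."""
    if len(O) == 0:
        return 1.0 - p_d(x)           # missed detection
    if len(O) == 1:
        return p_d(x) * g(O[0], x)    # detected exactly once
    return 0.0                        # ruled out by the observation assumption

def clutter_density(C, lam, lam_integral):
    """Poisson point process density of the clutter set C with intensity lam."""
    val = math.exp(-lam_integral)     # e^{-<lambda, 1>}
    for z in C:
        val *= lam(z)                 # product of intensities over clutter points
    return val
```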
As an optional implementation of the embodiment of the present invention, the corresponding target object tracking operation is performed according to the detection result. When the target object is not detected, it is first determined whether the target object detection area is occluded. Specifically, the target object detection probability is calculated according to the inverse projection transformation; when the detection probability satisfies the first probability condition, the target object detection area is determined to be occluded, and when it satisfies the second probability condition, the area is determined not to be occluded. The target object detection area is thus determined without feature matching on the target object, which shortens the target tracking time.
In one embodiment, the target object detection probability is calculated from the camera detection rate distribution p_D(d | x): when the detection probability is close to 0, the target object detection area is determined to be occluded, and when it is close to 1, the area is determined not to be occluded.
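A small sketch of this decision rule; the two thresholds standing in for the first and second probability conditions are assumed values, since the patent only requires the probability to be close to 0 or close to 1:

```python
def occlusion_state(p_det, occluded_max=0.1, visible_min=0.9):
    """Classify the detection area from the detection probability p_D(d | x)."""
    if p_det <= occluded_max:   # first probability condition (assumed threshold)
        return "occluded"       # predict the state instead of updating it
    if p_det >= visible_min:    # second probability condition (assumed threshold)
        return "not occluded"   # update the state with the observation
    return "ambiguous"          # intermediate case, left to the implementer
```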
As an optional implementation of the embodiment of the present invention, when the target object detection area is not occluded, a target transition model is constructed, and the state data of the target object at the previous time is updated with the state data of the target object at the current time in combination with the target transition model. Specifically, before the target transition model is built, it is assumed that targets move independently of each other and that the new target state (or birth) model is a Poisson point process, i.e. the multi-target state set at the current time depends only on the state set at the previous time. The multi-target state set of the current frame is then X = Ξ ∪ B, where Ξ = S_1 ∪ S_2 ∪ ... ∪ S_n is the set transitioned from the multi-target state set at the previous time, and B is the new multi-target state set generated by the multi-sensor at the current time, i.e. the birth set. According to the convolution theorem for random finite sets, the multi-target transition probability model is obtained as:

p(X | X′) = Σ_{X = B ⊎ S_1 ⊎ … ⊎ S_n} p(B) · ∏_{i=1}^{n} p(S_i | x_i′)

where λ_b(·) is the intensity function of the birth Poisson point process in p(B), and p(S | x) has the form:

p(S | x) = 1 − p_S(x),          if S = ∅
p(S | x) = p_S(x) · f(x′ | x),  if S = {x′}

where p_S(x) is the survival probability of state x, i.e. p(S | x) is a Bernoulli random finite set, and f(x′ | x) is the Markov transition probability distribution; the multi-target transition model is thus also called the Markov motion model.
After the target transition model is built, the iterated state distribution is first constrained, on the basis of the Bernoulli random finite set, to follow a multi-Bernoulli mixture distribution, and the multi-target state parameters are then updated iteratively. Specifically, the constrained iterated state distribution follows a multi-Bernoulli mixture, i.e. the Multi-Bernoulli Mixture (MBM) random finite set has the form:

p(X) = Σ_h w_h · MB_h(X)

where MB(·) denotes a multi-Bernoulli random finite set density and w_h is the multi-Bernoulli weight coefficient under the global hypothesis h.
in MBM, to carry weighted sums to Bernoulli components, the number of summation items increases exponentially in a plurality of iterations, and particularly, a data correlation algorithm is adopted to reduce the summation item. Firstly, the definition of global and local assumptions is introduced according to the actual meaning of the summation term:
the local assumption is that: the data association records of all historical moments of a single track are recorded as h;
global assumptions: the data association records of all historical moments of the current survival trajectory can form a global hypothesis superset
Figure BDA0003357927460000095
Wherein each element contains several local hypotheses for a single global hypothesis set
Figure BDA0003357927460000096
The set of multi-target state parameters that must be updated iteratively is determined as:

the association weight w_h;

the Bernoulli RFS parameters, i.e. the existence probability r and the state density p(·) of each Bernoulli component.
interpreting the parameters as parameters of a weighted Bernoulli random finite set, and for a camera observation model (single-target observation model), a state updating formula is as follows:
updating the posterior-tape-weight Bernoulli RFS parameter under the condition of missing detection:
Figure BDA0003357927460000103
in the formula (II)
Figure BDA0003357927460000104
Updating the posterior-weighted Bernoulli RFS parameter under the assumption of correlated observation:
Figure BDA0003357927460000105
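A minimal Gaussian-case sketch of these two updates, assuming a linear observation model z = Hx + v with v ~ N(0, R) and a detection probability that is constant over the component; all names are illustrative:

```python
import numpy as np

def bernoulli_missed(w, r, pd):
    """Weighted-Bernoulli update under the missed-detection hypothesis."""
    denom = 1.0 - r + r * (1.0 - pd)
    return w * denom, r * (1.0 - pd) / denom  # density unchanged for constant pd

def bernoulli_detected(w, r, pd, mean, cov, z, H, R):
    """Weighted-Bernoulli update for an associated observation z (Kalman step)."""
    S = H @ cov @ H.T + R                      # innovation covariance
    K = cov @ H.T @ np.linalg.inv(S)           # Kalman gain
    resid = z - H @ mean
    m = len(z)
    # Gaussian predictive likelihood of z, i.e. <p, pd*g(z|.)> up to the pd factor
    lik = np.exp(-0.5 * resid @ np.linalg.solve(S, resid)) / \
          np.sqrt((2.0 * np.pi) ** m * np.linalg.det(S))
    return w * r * pd * lik, 1.0, mean + K @ resid, (np.eye(len(mean)) - K @ H) @ cov
```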
and after the state data of the target object is updated, responding to the tracking operation on the target object according to the state data at the current moment. Specifically, the timestamp sequence of a single dominant sensor is selected, and the updated target estimation value of the current time of the target object is output: selecting the optimal global hypothesis set with the maximum global hypothesis weight
Figure BDA0003357927460000106
And selecting local hypotheses in the optimal global hypothesis set, and calculating the estimated value of the Bernoulli RFS in each local hypothesis. In particular, traversal through the set of optimal global assumptionsLocal assumptions, if there is a probability in Bernoulli RFS in a local assumption
Figure BDA0003357927460000107
TeFor the estimation threshold, the target estimation value is:
Figure BDA0003357927460000108
and determining and outputting the state of the target object according to the target estimation value at the current moment to obtain a state estimation track.
As an optional implementation of the embodiment of the present invention, when the target object detection area is occluded, the state data of the target object at the current time can be predicted from the state data of the target object at the previous time and a preset state prediction model. Specifically, let X be the multi-target posterior state vector set at the current time and t_c the timestamp of the current time. When the detection result does not contain the target object, the arrival time t_s of the next group of observations is recorded and the time difference δ_t = t_s − t_c is calculated; the multi-Bernoulli parameters of each local hypothesis in the global hypothesis are then updated over δ_t:

r⁺ = r · ⟨p, p_S⟩

p⁺(x) = ∫ f_{δt}(x | x′) · p_S(x′) · p(x′) dx′ / ⟨p, p_S⟩

where f_{δt}(x | x′) is the Markov transition density over the interval δ_t.

After the multi-Bernoulli parameters are updated, the state data of the target object at the current time is predicted according to the Bayesian filter formula. The weighted Bernoulli random finite set parameters of the new multi-target state set B generated by the camera are then designed and calculated according to the camera observation model. Specifically, for the observation characteristics of the set S of cameras on the platform, each camera S_i is assigned a prior birth weight w_B,i and a prior birth distribution p_B,i(·), and a birth model of the target object state is constructed in combination with the camera observation model:

A first-order Taylor expansion of the general nonlinear observation function H gives the corresponding linear observation model; omitting the white-noise error term, the observation model linearized in a neighborhood is z = H · x, with the generalized inverse observation equation x = H⁺ · z, where H⁺ denotes the Moore-Penrose generalized inverse. The prior birth distribution p_B,i(·) is designed with first-order moment and second-order central moment:

E[x] = H⁺ · z

Cov[x] = P_B,i

where P_B,i is the designed prior birth covariance matrix.
The parameters of the birth Bernoulli RFS Ber(x; r_B,i, p_B,i(·)) are calculated as:

r_B,i = ρ(z) / (λ_c(z) + ρ(z))

w_B,i = λ_c(z) + ρ(z)

where λ_c(z) is the clutter intensity at observation z and ρ(z) is the predictive likelihood that z originates from a new target under the prior birth distribution p_B,i(·). The resulting parameter set of the weighted birth Bernoulli random finite sets, {w_B,i, r_B,i, p_B,i(·) | i = 1, …, n_s}, is then incorporated into the local hypothesis set.
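A minimal sketch of building one birth component from a single observation under the quantities defined above; ρ(z), the clutter intensity, and the prior birth covariance are assumed to be supplied:

```python
import numpy as np

def birth_bernoulli(z, H_pinv, P_b, lam_c, rho):
    """Measurement-driven birth component for observation z.

    H_pinv : Moore-Penrose generalized inverse of the linearized observation matrix
    P_b    : designed prior birth covariance matrix
    lam_c  : clutter intensity evaluated at z
    rho    : predictive birth likelihood of z
    """
    mean = H_pinv @ z              # x = H+ z
    w = lam_c + rho                # hypothesis weight w_B,i
    r = rho / (lam_c + rho)        # existence probability r_B,i
    return w, r, mean, P_b
```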
After the birth model of the target object state is constructed, the target object tracking operation is performed according to the state data at the current time. The specific implementation can follow the process of outputting the updated target estimate of the target object at the current time described above, and is not repeated here.
In one embodiment, the target tracking method involves integrals over the semantic segmentation output:

∫ p_D(d | x) · p_{i,h}(x) dx

∫ g(z_j | x) · p_D(d | x) · p_{i,h}(x) dx

where p_D(d | x) is obtained from the semantic segmentation network output through the inverse projection transformation.
The above integrals are then calculated using an unscented transformation. Specifically, for p_{i,h}(x) and g(z_j | x) · p_{i,h}(x) inside the integrals, according to RFS theory and the component distribution p_{i,h}, the corresponding mean and covariance can be computed analytically under the Gaussian assumption, so the integral part of both expressions can be transformed into the same form:

∫ F(x) p(x) dx

where F(x) is an arbitrary nonlinear function and p(x) is a known distribution with mean μ_x and covariance Σ_x. Introducing the unscented transformation parameter κ, the distribution p(x) is approximated by the unscented transformation as:

p(x) ≈ Σ_{i=0}^{2·n_x} w_i · δ_{x_i}(x)

where n_x is the dimension of the state quantity, δ_x(·) is the Dirac function, and {(w_i, x_i)} is the sigma point set, defined as:

x_0 = μ_x, w_0 = κ / (n_x + κ)

x_i = μ_x + (√((n_x + κ) · Σ_x))_{[:, i]}, x_{i+n_x} = μ_x − (√((n_x + κ) · Σ_x))_{[:, i]}, w_i = w_{i+n_x} = 1 / (2 · (n_x + κ)), for i = 1, …, n_x

where √M denotes the matrix square root, satisfying √M · (√M)ᵀ = M, and [:, i] denotes taking the i-th column of the matrix.
Substituting into ∫ F(x) p(x) dx gives:

∫ F(x) p(x) dx ≈ Σ_{i=0}^{2·n_x} w_i · F(x_i)

which completes the approximation using the sigma point set.
Approximating p_D(d | x) as constant in the neighborhood of each sigma point x_i, the integral ∫ g(z_j | x) · p_D(d | x) · p_{i,h}(x) dx becomes:

∫ g(z_j | x) · p_D(d | x) · p_{i,h}(x) dx ≈ Σ_{i=0}^{2·n_x} w_i · p_D(d | x_i) · g(z_j | x_i)
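A minimal sketch of the sigma-point approximation; the Cholesky factor is used as one valid matrix square root, and κ = 1 is an assumed default:

```python
import numpy as np

def unscented_integral(F, mu, Sigma, kappa=1.0):
    """Approximate the integral of F(x) p(x) dx for p = N(mu, Sigma)."""
    n = len(mu)
    L = np.linalg.cholesky((n + kappa) * Sigma)  # matrix square root
    pts = [mu] + [mu + L[:, i] for i in range(n)] + [mu - L[:, i] for i in range(n)]
    wts = [kappa / (n + kappa)] + [1.0 / (2.0 * (n + kappa))] * (2 * n)
    return sum(w * F(x) for w, x in zip(wts, pts))
```

For example, evaluating it with F(x) = p_D(d | x) · g(z_j | x), with p_D read from the interpolated IPM matrix, reproduces the approximation above.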
and calculating an analytic solution of an upper expression according to the updating step in Kalman filtering according to the Gaussian distribution of the observation hypothesis and the component distribution.
An embodiment of the present invention further provides a target tracking apparatus, as shown in fig. 3, the apparatus includes:
a receiving module 401, configured to receive an image uploaded by a sensor and perform target object detection in the image to obtain a plurality of target object detection frames; for details, refer to the related description of step S101 in the above method embodiment.
A determining module 402 for determining a target observation point from a plurality of target detection frames; for details, refer to the related description of step S102 in the above method embodiment.
A detection processing module 403, configured to process an area where the target observation point is located and detect a target object according to a processing result; for details, refer to the related description of step S103 in the above method embodiment.
A tracking module 404, configured to respond to a corresponding target tracking operation according to a detection result; for details, refer to the related description of step S104 in the above method embodiment.
The target tracking apparatus provided by the embodiment of the present invention receives the image uploaded by the sensor and performs target object detection in the image to obtain a plurality of target object detection frames; determines target observation points from the plurality of target object detection frames; processes the region where the target observation point is located and detects the target object according to the processing result; and responds with the corresponding target object tracking operation according to the detection result. Because the target observation point is determined from the target object detection frame and the target object is detected from the processing result of the region where the observation point is located, no feature extraction over a number of feature points is needed for a single target block, which shortens the target tracking time.
As an optional implementation of the embodiment of the present invention, the apparatus further includes: a first processing module, configured to perform semantic segmentation on the region where the target observation point is located; a second processing module, configured to perform inverse projection transformation on the semantic segmentation result; a first model building module, configured to build the target observation model; a second model building module, configured to build the target transition model; and an occlusion module, configured to determine whether the target object detection area is occluded when the target object is not detected.
As an optional implementation of the embodiment of the present invention, the tracking module further includes: an updating module, configured to update the state data of the target object at the previous time with the state data of the target object at the current time when the target object detection area is not occluded; a prediction module, configured to predict the state data of the target object at the current time from the state data of the target object at the previous time and a preset state prediction model when the target object detection area is occluded; and an output module, configured to output the updated target estimate of the target object at the current time.
For a detailed description of the functions of the target tracking device provided by the embodiment of the present invention, reference is made to the description of the target tracking method in the above embodiment.
An embodiment of the present invention further provides a storage medium, as shown in FIG. 4, on which a computer program 601 is stored; the instructions, when executed by a processor, implement the steps of the target tracking method in the above embodiments. The storage medium may also store audio and video stream data, feature frame data, interaction request signaling, encrypted data, preset data sizes, and the like. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory, a Hard Disk Drive (HDD), a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of the above kinds of memories.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk (Hard Disk Drive, abbreviated as HDD) or a Solid State Drive (SSD), etc.; the storage medium may also comprise a combination of memories of the kind described above.
An embodiment of the present invention further provides an electronic device, as shown in fig. 5, the electronic device may include a processor 51 and a memory 52, where the processor 51 and the memory 52 may be connected by a bus or in another manner, and fig. 5 takes the connection by the bus as an example.
The processor 51 may be a Central Processing Unit (CPU). The Processor 51 may also be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, or combinations thereof.
The memory 52, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as the corresponding program instructions/modules in the embodiments of the present invention. The processor 51 executes various functional applications and data processing of the processor by running non-transitory software programs, instructions and modules stored in the memory 52, that is, implements the target tracking method in the above-described method embodiments.
The memory 52 may include a program storage area and a data storage area, where the program storage area may store an operating system and the application program required for at least one function, and the data storage area may store data created by the processor 51, and the like. Further, the memory 52 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 52 may optionally include memory located remotely from the processor 51, and these remote memories may be connected to the processor 51 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The one or more modules are stored in the memory 52 and, when executed by the processor 51, perform the object tracking method in the embodiment shown in fig. 1-2.
The details of the electronic device may be understood by referring to the corresponding descriptions and effects in the embodiments shown in fig. 1 to fig. 2, and are not described herein again.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. A target tracking method, characterized by comprising the following steps:
receiving an image uploaded by a sensor and performing target object detection in the image to obtain a plurality of target object detection frames;
determining target observation points from the plurality of target object detection frames;
processing the region where the target observation point is located and detecting the target object according to the processing result;
and responding with the corresponding target object tracking operation according to the detection result.
2. The target tracking method according to claim 1, wherein processing the region where the target observation point is located comprises:
performing semantic segmentation on the region where the target observation point is located;
and performing inverse projection transformation on the semantic segmentation result.
3. The target tracking method according to claim 1, wherein detecting the target object according to the processing result comprises performing target detection with a target observation model of the form:

p(Z | X) = Σ_{Z = C ⊎ O_1 ⊎ … ⊎ O_n} p(C) · ∏_{i=1}^{n} p(O_i | x_i)

where Ω = O_1 ∪ O_2 ∪ ... ∪ O_n is the multi-target observation set; C is the clutter observation set and O is a target observation set; Z = Ω ∪ C is the single-frame multi-target observation set, i.e. the union of the target observation set and the clutter set; the summation traverses every disjoint decomposition of the observation set Z; p(C) denotes the Poisson point process of C; λ(x) is its intensity function; and p(O | x) denotes the observation model of a single sensor given the single-target state vector x.
4. The target tracking method according to claim 2, wherein responding with the corresponding target object tracking operation according to the detection result comprises:
when the target object is detected, responding with the corresponding target object tracking operation;
and when the target object is not detected, determining whether the target object detection area is occluded, and responding with the corresponding target object tracking operation according to the occlusion determination result.
5. The target tracking method according to claim 4, wherein determining whether the target object detection area is occluded when the target object is not detected comprises:
calculating the target object detection probability according to the inverse projection transformation;
when the target object detection probability satisfies a first probability condition, determining that the target object detection area is occluded; and when the target object detection probability satisfies a second probability condition, determining that the target object detection area is not occluded.
6. The target tracking method according to claim 4, wherein responding with the corresponding target object tracking operation according to the occlusion determination result comprises:
when the target object detection area is not occluded, updating the state data of the target object at the previous time with the state data of the target object at the current time, and outputting the updated target estimate of the target object at the current time;
and when the target object detection area is occluded, predicting the state data of the target object at the current time from the state data of the target object at the previous time and a preset state prediction model, and responding with the target object tracking operation according to the state data at the current time.
7. The target tracking method according to claim 6, wherein updating the state data of the target object at the previous time with the state data of the target object at the current time comprises performing the state data update with a target transition model of the form:

p(X | X′) = Σ_{X = B ⊎ S_1 ⊎ … ⊎ S_n} p(B) · ∏_{i=1}^{n} p(S_i | x_i′)

where X = Ξ ∪ B denotes the multi-target state set at the current time, Ξ = S_1 ∪ S_2 ∪ ... ∪ S_n is the set transitioned from the multi-target state set at the previous time, and B is the set of new targets generated by the multi-sensor at the current time; p(B) denotes the Poisson point process of B; λ(x) is its intensity function; and p(S | x) is a Bernoulli random finite set.
8. A target tracking apparatus, characterized by comprising:
a receiving module, configured to receive the image uploaded by the sensor and perform target object detection in the image to obtain a plurality of target object detection frames;
a determining module, configured to determine target observation points from the plurality of target object detection frames;
a detection processing module, configured to process the region where the target observation point is located and detect the target object according to the processing result;
and a tracking module, configured to respond with the corresponding target object tracking operation according to the detection result.
9. A computer-readable storage medium storing computer instructions for causing a computer to perform the target tracking method of any one of claims 1-7.
10. An electronic device, comprising: a memory and a processor communicatively connected to each other, the memory storing computer instructions, the processor executing the computer instructions to perform the target tracking method of any one of claims 1-7.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111358671.1A 2021-11-16 2021-11-16 Target tracking method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN113989332A 2022-01-28
CN113989332B 2022-08-23

Family

ID=79748879


Citations (7)

* Cited by examiner, † Cited by third party

Publication number Priority date Publication date Assignee Title
CN106846362A * 2016-12-26 2017-06-13 歌尔科技有限公司 Target detection and tracking method and device
CN110084831A * 2019-04-23 2019-08-02 江南大学 Multi-Bernoulli video multi-target detection and tracking method based on YOLOv3
CN110689556A * 2019-09-09 2020-01-14 苏州臻迪智能科技有限公司 Tracking method and device, and intelligent equipment
CN112435276A * 2020-11-13 2021-03-02 鹏城实验室 Vehicle tracking method and device, intelligent terminal and storage medium
WO2021075581A1 * 2019-10-18 2021-04-22 株式会社デンソー Tracking device, tracking method, and tracking program
CN113608663A * 2021-07-12 2021-11-05 哈尔滨工程大学 Fingertip tracking method based on deep learning and the K-curvature method
CN113610895A * 2021-08-06 2021-11-05 烟台艾睿光电科技有限公司 Target tracking method and device, electronic equipment and readable storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party

Title
J.R.R. Uijlings et al.: "Selective Search for Object Recognition", International Journal of Computer Vision *
颜巧: "Research on moving target tracking based on multiple cameras", China Master's Theses Full-text Database, Social Science I *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant