CN113393493A - Target object tracking method and device - Google Patents
Target object tracking method and device
- Publication number
- CN113393493A (application CN202110592320.0A)
- Authority
- CN
- China
- Prior art keywords
- frame image
- information
- current frame
- target
- filtering template
- Prior art date
- Legal status: Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/262—Analysis of motion using transform domain methods, e.g. Fourier domain methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20048—Transform domain processing
Abstract
The application discloses a target object tracking method and device. One embodiment of the method comprises performing the following target tracking operation for each frame of image in a video to be processed: determining the position information of a target object in the current frame image through the filtering template corresponding to the previous frame image; generating the response information of the previous frame image to the filtering template corresponding to the previous frame image; and obtaining the filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image. The method improves the universality, robustness, accuracy and reliability of target tracking.
Description
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a target object tracking method and device.
Background
In machine vision tasks, target tracking has wide application scenarios and great commercial value. Because correlation filtering algorithms are fast and efficient, easy to deploy on a Central Processing Unit (CPU), and able to track targets in real time, they are widely applied to target tracking tasks. However, current correlation filtering models update the filter model with reference only to a Gaussian pseudo-label preset for each frame of image.
Disclosure of Invention
The embodiment of the application provides a target object tracking method and device.
In a first aspect, an embodiment of the present application provides a target object tracking method, which performs the following target tracking operations for each frame of image in a video to be processed: determining the position information of a target object in the current frame image through a filter template corresponding to the previous frame image; generating response information of the previous frame image to a filtering template corresponding to the previous frame image; and obtaining a filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image.
In some embodiments, the obtaining a filtering template corresponding to the current frame image according to the position information and the response information includes: combining the position information and the response information to obtain label information of the current frame image; and obtaining a filtering template corresponding to the current frame image based on the minimization between the label information and the target response information, wherein the target response information represents the response information of the current frame image to the filtering template corresponding to the current frame image.
In some embodiments, the obtaining a filtering template corresponding to the current frame image based on the minimization between the tag information and the target response information includes: acting on the label information and the target response information through a preset weighting window to obtain acted information; and obtaining a filtering template corresponding to the current frame image based on the minimization of the acted information.
In some embodiments, the center point of the predetermined weighting window is a first weight, and the non-center point of the predetermined weighting window is a second weight; and the above acting on the tag information and the target response information through the preset weighting window to obtain acted information, including: and acting on the label information and the target response information through a preset weighting window to obtain the acted information of the matching loss of a central point and a non-central point of the target object distinguished by the first weight and the second weight.
In some embodiments, the obtaining a corresponding filtering template of the current frame image based on the minimization of the affected information includes: and based on the minimization of the acted information, obtaining a filtering template corresponding to the current frame image by using a preset constraint aiming at distinguishing the background and the foreground of the target object in the image.
In some embodiments, the tag information and the target response information are acted on through a preset weighting window to obtain acted information; based on the minimization of the acted information, obtaining a filtering template corresponding to the current frame image, wherein the filtering template comprises the following steps: for each channel of each frame image, acting on the label information and the target response information corresponding to the channel through a preset weighting window to obtain acted information corresponding to the channel, and obtaining a filtering template corresponding to the channel of the current frame image based on minimization of the acted information; and the determining the position information of the target object in the current frame image by the filter template corresponding to the previous frame image includes: and acting on each channel of the current frame image based on the one-to-one corresponding filtering template of each channel of the previous frame image to determine the position information of the target object in the current frame image.
In some embodiments, the obtaining the post-operation information corresponding to the channel by applying the preset weighting window to the tag information and the target response information corresponding to the channel, and obtaining the filtering template corresponding to the channel of the current frame image based on minimization of the post-operation information includes: and obtaining multi-scale filtering templates corresponding to the channels of the current frame image based on minimization of the acted information by using a multi-scale filtering mode and acting on the label information and the target scale response information corresponding to the channels through a preset weighting window, wherein the target scale response information represents the channels of the current frame image and is used for responding to the response information of the filtering templates with a single scale in the multi-scale filtering templates.
In some embodiments, the above method further comprises: and aiming at a first frame image in the video to be processed, determining a filtering template corresponding to the first frame image according to a target frame representing a target object in the first frame image.
In a second aspect, an embodiment of the present application provides a target object tracking apparatus that performs the following target tracking operations for each frame of image in a video to be processed: a first determining unit configured to determine the position information of a target object in the current frame image through the filtering template corresponding to the previous frame image; a second determining unit configured to generate the response information of the previous frame image to the filtering template corresponding to the previous frame image; and an obtaining unit configured to obtain the filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image.
In some embodiments, the deriving unit is further configured to: combining the position information and the response information to obtain label information of the current frame image; and obtaining a filtering template corresponding to the current frame image based on the minimization between the label information and the target response information, wherein the target response information represents the response information of the current frame image to the filtering template corresponding to the current frame image.
In some embodiments, the deriving unit is further configured to: acting on the label information and the target response information through a preset weighting window to obtain acted information; and obtaining a filtering template corresponding to the current frame image based on the minimization of the acted information.
In some embodiments, the center point of the predetermined weighting window is a first weight, and the non-center point of the predetermined weighting window is a second weight; and an obtaining unit further configured to: and acting on the label information and the target response information through a preset weighting window to obtain the acted information of the matching loss of a central point and a non-central point of the target object distinguished by the first weight and the second weight.
In some embodiments, the deriving unit is further configured to: and based on the minimization of the acted information, obtaining a filtering template corresponding to the current frame image by using a preset constraint aiming at distinguishing the background and the foreground of the target object in the image.
In some embodiments, the deriving unit is further configured to: for each channel of each frame image, acting on the label information and the target response information corresponding to the channel through a preset weighting window to obtain acted information corresponding to the channel, and obtaining a filtering template corresponding to the channel of the current frame image based on minimization of the acted information; and a first determination unit further configured to: and acting on each channel of the current frame image based on the one-to-one corresponding filtering template of each channel of the previous frame image to determine the position information of the target object in the current frame image.
In some embodiments, the deriving unit is further configured to: and obtaining multi-scale filtering templates corresponding to the channels of the current frame image based on minimization of the acted information by using a multi-scale filtering mode and acting on the label information and the target scale response information corresponding to the channels through a preset weighting window, wherein the target scale response information represents the channels of the current frame image and is used for responding to the response information of the filtering templates with a single scale in the multi-scale filtering templates.
In some embodiments, the apparatus further comprises: and the third determining unit is configured to determine, for a first frame image in the video to be processed, a filtering template corresponding to the first frame image according to a target frame in the first frame image, wherein the target frame characterizes a target object.
In a third aspect, the present application provides a computer-readable medium, on which a computer program is stored, where the program, when executed by a processor, implements the method as described in any implementation manner of the first aspect.
In a fourth aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors, cause the one or more processors to implement a method as described in any implementation of the first aspect.
According to the method and the device for tracking the target object, the following target tracking operation is executed for each frame of image in the video to be processed: determining the position information of the target object in the current frame image through the filtering template corresponding to the previous frame image; generating the response information of the previous frame image to the filtering template corresponding to the previous frame image; and obtaining the filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image. The method thus flexibly combines the position information of the target object in the current frame image with the response information corresponding to the previous frame image, obtains the filtering template for tracking the target object from the combined information, and improves the universality, robustness and accuracy of target tracking.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for tracking a target object according to the present application;
FIG. 3 is a schematic diagram of an application scenario of the target object tracking method according to the present embodiment;
FIG. 4 is a flow diagram of yet another embodiment of a target object tracking method according to the present application;
FIG. 5 is a block diagram of one embodiment of a target object tracking apparatus according to the present application;
FIG. 6 is a block diagram of a computer system suitable for use in implementing embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 illustrates an exemplary architecture 100 to which the target object tracking method and apparatus of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The communication connections between the terminal devices 101, 102, 103 form a topological network, and the network 104 serves to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The terminal devices 101, 102, 103 may be hardware devices or software that support network connections for data interaction and data processing. When the terminal devices 101, 102, and 103 are hardware, they may be various electronic devices supporting network connection, information acquisition, interaction, display, processing, and other functions, including but not limited to cameras, smart phones, tablet computers, e-book readers, laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. It may be implemented, for example, as multiple software or software modules to provide distributed services, or as a single software or software module. And is not particularly limited herein.
The server 105 may be a server providing various services, for example, a background server that acquires a to-be-processed video captured or sent by the terminal devices 101, 102, and 103 and tracks a target object in the to-be-processed video. Optionally, the background server may feed back the target tracking result to the terminal device. As an example, the server 105 may be a cloud server.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be further noted that the target object tracking method provided in the embodiments of the present application may be executed by the server, by the terminal device, or by the server and the terminal device in cooperation with each other. Accordingly, the parts (for example, the units) included in the target object tracking apparatus may all be provided in the server, all in the terminal device, or distributed between the server and the terminal device.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. When the electronic device on which the tracking method of the target object operates does not need to perform data transmission with other electronic devices, the system architecture may include only the electronic device (e.g., a server or a terminal device) on which the tracking method of the target object operates.
With continuing reference to FIG. 2, a flow 200 of one embodiment of a target object tracking method is shown; for each frame of image in a video to be processed, the following target tracking operations are performed:
Step 201, determining the position information of the target object in the current frame image through the filtering template corresponding to the previous frame image.

In this embodiment, an execution subject of the target object tracking method (for example, the server in fig. 1) may determine the position information of the target object in the current frame image through the filtering template corresponding to the previous frame image.
The video to be processed may be a video including an arbitrary target object. For example, the video to be processed is a video of a target object including a person, an animal, or the like, captured by a monitoring apparatus. Each frame of image in the video to be processed may include a plurality of target objects, or some frames may not include target objects.
As an example, the executing body may perform a convolution operation on the current frame image based on the filtering template corresponding to the previous frame image to obtain a response map representing response information corresponding to the current frame image; and then, determining the point with the maximum response in the response image as the central point of the target object in the current frame image so as to determine the position information of the target object. For example, the central point of the target object in the current frame image is directly determined as the position information of the target object; for another example, a target frame surrounding the target object is obtained according to the central point of the target object, so as to determine the position of the area where the target object is located.
The filtering template should satisfy the following condition: its convolution with the target object in an image has the largest response at the center point of the target object and smaller responses at non-center points; specifically, the more background an image region contains, the smaller the response of the filtering template to it.
Since the time interval between two frames of images of the video to be processed is short, the difference between the two frames of images should not be too large. Based on the above assumptions, the position information of the target object in the current frame image is consistent with the position information of the target object in the previous frame image with a high probability, that is, the two frame images have visual characteristics such as temporal continuity, spatial continuity, and the like. And performing convolution operation on the filtering template corresponding to the previous frame image and the current frame image to obtain a response image of the current frame image, and further taking the point with the maximum response in the response image as the central point of the target object in the current frame image. And for each frame of image in the video to be processed, determining the position information of the target object in the current frame of image, namely realizing the tracking of the target object in the video to be processed.
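As an illustrative sketch of this step (not a definitive implementation), the following NumPy code applies a filtering template to the feature map of the current frame in the Fourier domain and takes the peak of the response map as the target's center point; the function name, the Fourier-domain storage of the template, and the single-channel setting are assumptions of the example:

```python
import numpy as np

def locate_target(template_freq, frame_feat):
    """Apply a learned filtering template (stored in the Fourier domain,
    an assumed representation) to a single-channel feature map and return
    the peak of the response map as the target's center point."""
    # Convolution in the spatial domain is an element-wise product in the
    # Fourier domain, which is what keeps this step fast.
    response = np.real(np.fft.ifft2(template_freq * np.fft.fft2(frame_feat)))
    # The point with the maximum response is taken as the target's center.
    cy, cx = np.unravel_index(np.argmax(response), response.shape)
    return (cy, cx), response
```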
Step 202, generating the response information of the previous frame image to the filtering template corresponding to the previous frame image.

In this embodiment, the execution subject may generate the response information of the previous frame image to the filtering template corresponding to the previous frame image.
As an example, the execution subject may perform a convolution operation on the previous frame image based on the corresponding filtering template of the previous frame image, so as to determine the response information of the previous frame image to the filtering template. Wherein, the response information can be represented in a response graph.
It can be understood that the response map has the largest response at the center point of the target object.
Step 203, obtaining the filtering template corresponding to the current frame image according to the position information and the response information.
In this embodiment, the execution subject may obtain the filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image.
As an example, the execution subject may set a Gaussian distribution label for the current frame image centered on the center point of the target object in the response map corresponding to the current frame image (i.e., the point with the largest response in the response map). The center point corresponding to the Gaussian distribution label coincides with the center point of the target object in the response map corresponding to the current frame image.
The response map of the previous frame image to its filtering template is likewise largest at the center point of the target object and smaller at non-center points. The execution subject may combine the Gaussian distribution label representing the position information of the target object in the current frame image with the response information of the previous frame image to obtain combined information, and determine the filtering template corresponding to the current frame image using the combined information as the label. If the center point of the target object in the response map of the previous frame image to its filtering template does not coincide with the center point of the Gaussian distribution label, the former needs to be brought into coincidence with the latter through a matrix shift operation.
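A minimal sketch of such a matrix shift operation, assuming the response map is a NumPy array and the shift is circular (the names are illustrative):

```python
import numpy as np

def align_response(prev_response, label_center):
    """Circularly shift the previous frame's response map so that its peak
    coincides with the center point of the Gaussian distribution label."""
    py, px = np.unravel_index(np.argmax(prev_response), prev_response.shape)
    ly, lx = label_center
    return np.roll(prev_response, shift=(ly - py, lx - px), axis=(0, 1))
```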
In some optional implementations of this embodiment, the executing main body may execute the step 203 by:
firstly, combining the position information and the response information to obtain the label information of the current frame image.
The execution body may set a preset weight for the position information and the response information, and obtain the label information of the current frame image based on the preset weight in combination with the position information and the response information.
Secondly, a filtering template corresponding to the current frame image is obtained based on the minimization between the label information and the target response information.
The target response information represents the response information of the current frame image to the corresponding filtering template of the current frame image.
In this implementation, the execution subject may measure the distance between the label information and the target response information with the L2 norm, and then determine the filtering template corresponding to the current frame image based on the minimization of that distance. In order to improve the generalization ability of the filtering template, a regularization term on the filtering template can be introduced.
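The following sketch shows, under simplifying assumptions, the combination of the two kinds of information into label information and the closed-form Fourier-domain minimizer of the resulting L2 distance with a regularization term, for a single channel; the preset weighting window and the foreground constraint described below are deliberately omitted here, and theta and lam are placeholder values:

```python
import numpy as np

def make_label(gauss_label, aligned_prev_response, theta=0.5):
    """Combine position information (Gaussian label) and the previous
    frame's response information with a preset weight theta (assumed)."""
    return theta * gauss_label + (1.0 - theta) * aligned_prev_response

def solve_template(frame_feat, label, lam=1e-2):
    """Per-frequency-bin ridge solution of
    min_W ||W * x - l||_2^2 + lam * ||W||_2^2 (single channel, no
    weighting window): W_hat = conj(X) L / (conj(X) X + lam)."""
    X = np.fft.fft2(frame_feat)
    L = np.fft.fft2(label)
    return (np.conj(X) * L) / (np.conj(X) * X + lam)  # Fourier-domain template
```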
In some optional implementation manners of this embodiment, the execution main body may act on the tag information and the target response information through a preset weighting window to obtain the acted information, and obtain the filtering template corresponding to the current frame image based on minimization of the acted information.
Different weights are set at different positions in the preset weighting window. As an example, the weight of the center point in the preset weighting window is greater than the weight of the non-center points. The preset weighting window can make the filtering template concentrate more on the label matching loss at the center point position of the target object in the response map (realized by setting a larger weight at the center point of the preset weighting window), or reduce the label matching loss at the center position (realized by setting a smaller weight at the center point of the preset weighting window).
In some optional implementations of this embodiment, a central point of the preset weighting window is a first weight, and a non-central point of the preset weighting window is a second weight. In this implementation manner, the execution subject may act on the tag information and the target response information through a preset weighting window to obtain post-action information that distinguishes matching loss between a central point and a non-central point of the target object by using the first weight and the second weight.
Different weights are set for the center point and non-center points of the preset weighting window mainly out of the following three considerations. First, in the correlation filtering algorithm only the center point is treated as a positive sample; because of the circular-shift assumption underlying the discrete Fourier transform, the samples at non-center positions do not exist in the real scene, so the label matching at non-center points does not need to be weighted highly. Second, non-center positions are dominated by background, and this embodiment expects the local response of background-heavy image regions to be as low as possible, so the background part does not require accurate label matching. Third, in detecting the position of the target object in the current frame image, only the center point of the target object is used, so the attention of the filtering template should be on the center point position of the target object rather than on non-center positions. Therefore, different weights are given to the center point position and the non-center point positions of the target object.
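A minimal sketch of constructing such a preset weighting window, assuming the label is centered in the map; the concrete values of the first weight b and the second weight a are tuning choices, not fixed by this application:

```python
import numpy as np

def make_weight_window(shape, b, a):
    """Preset weighting window u: the first weight b at the (assumed
    centered) target center point, the second weight a at non-center points."""
    u = np.full(shape, a, dtype=np.float64)
    u[shape[0] // 2, shape[1] // 2] = b
    return u
```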
In some optional implementation manners of this embodiment, for the second step, the executing entity may obtain, based on the minimization of the post-action information, a filtering template corresponding to the current frame image by using a preset constraint aiming at distinguishing a background and a foreground of the target object in the image.
In this implementation, label matching of the background portion of the image is ignored, and label matching is concentrated on the region of the tracked target object.
In some optional implementation manners of this embodiment, the execution subject may determine, in units of channels of the image, a filter template corresponding to each channel of the image. Taking an RGB image as an example, including three channels of R (red), G (green), and B (blue), the execution body may determine three filtering templates corresponding to the three channels one by one.
Specifically, for each channel of each frame image, the preset weighting window is applied to the label information and the target response information corresponding to the channel to obtain the post-action information corresponding to the channel, and the filtering template corresponding to the channel of the current frame image is obtained based on the minimization of the post-action information.
In this implementation, the executing body executes the step 201 as follows: and acting on each channel of the current frame image based on the one-to-one corresponding filtering template of each channel of the previous frame image to determine the position information of the target object in the current frame image.
As an example, for each frame of image, the execution subject may obtain a response map of each channel through a filter template corresponding to each channel of the image, further fuse each response map, obtain a fused response map, determine a point with the maximum response in the fused response map as a central point of the target object, and further determine the position information of the target object.
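A sketch of the per-channel application and fusion, assuming one Fourier-domain template per channel and element-wise summation as the fusion rule (the fusion operator is an assumption of this example):

```python
import numpy as np

def fused_response(templates_freq, frame_feats):
    """Apply each channel's filtering template to the matching channel of
    the current frame, sum the per-channel response maps into one fused
    map, and return its peak as the target's center point."""
    fused = np.zeros(frame_feats[0].shape)
    for W, x in zip(templates_freq, frame_feats):
        fused += np.real(np.fft.ifft2(W * np.fft.fft2(x)))
    cy, cx = np.unravel_index(np.argmax(fused), fused.shape)
    return fused, (cy, cx)
```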
In this implementation, the executing body executes the step 202 as follows: and determining the response information of the previous frame image to each filtering template through the filtering templates corresponding to the channels of the previous frame image one by one.
In some optional implementations of this embodiment, the executing entity determines the filtering template corresponding to the first frame image in the video to be processed by: and aiming at a first frame image in the video to be processed, determining a filtering template corresponding to the first frame image according to a target frame representing a target object in the first frame image.
The target frame indicates the region position of the target object in the first frame image. Taking the target frame as the label, and based on the convolution of the filtering template with the target object in the image, the filtering template of the first frame image is determined under the condition that the response at the center point of the target object is largest and the responses at non-center points are small.
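A minimal sketch of building such a Gaussian distribution label from the annotated target frame of the first image; the bandwidth sigma is an assumed parameter:

```python
import numpy as np

def gaussian_label(shape, center, sigma=2.0):
    """Gaussian distribution label peaking at the center of the target
    frame annotated in the first frame image."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    cy, cx = center
    return np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma ** 2))
```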
A specific implementation of the present embodiment is given as follows:
first, a mathematical model of the filter template is constructed:
$$\varepsilon(W_t)=\sum_{k=1}^{C}\Big\|\,u\odot\big(W_t^{k}*x_t^{k}-l\big)\Big\|_2^2+\lambda\sum_{k=1}^{C}\big\|W_t^{k}\big\|_2^2\qquad(1)$$

wherein ε(W_t) is the solving function of the filtering template of the t-th frame image; W_t^k and W_{t−1}^k respectively denote the filtering template of the k-th channel of the t-th frame image and of the (t−1)-th frame image; u denotes the preset weighting window; y denotes the Gaussian distribution label of the t-th frame image; x_t^k and x_{t−1}^k respectively denote the image information of the k-th channel of the t-th frame image and of the (t−1)-th frame image; λΣ_k‖W_t^k‖² is the regularization term for improving the generalization ability of the function; θ denotes the weight parameter, λ the regularization coefficient, and C the number of channels of the image; Δ denotes the matrix shift operation, whose aim is to make the response map of the previous frame image as similar as possible to that of the current frame image, realizing a cross-time continuity prior, i.e., the responses of the filtering templates to the same target object in adjacent frame images should be consistent; ⊙ and * denote the Hadamard product and convolution, respectively; and l denotes the label (acted information) defined in equation (2) below.
Here, y, u, x, and W are all described taking a 1-dimensional vector as an example (specifically, n × 1); the calculation for a 2-dimensional image matrix is similar to the one-dimensional case. When using ordinary convolution for an n × n matrix, i.e., the 2-dimensional case, convolving an n × n matrix with an n × n filtering template has time complexity O(n⁴). Based on the correlation filtering model, the computational complexity can be reduced to O(n² log n), i.e., the complexity of the fast Fourier transform, thereby reducing the algorithm complexity and speeding up the tracking process.
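The complexity claim can be checked on a small example: direct circular convolution of an n × n matrix with an n × n template takes four nested loops, i.e., O(n⁴), while the Fourier-domain product gives the same result in O(n² log n). A sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x = rng.standard_normal((n, n))   # image (or feature) block
w = rng.standard_normal((n, n))   # filtering template

# Direct circular convolution: four nested loops, O(n^4).
direct = np.zeros((n, n))
for dy in range(n):
    for dx in range(n):
        for i in range(n):
            for j in range(n):
                direct[dy, dx] += w[i, j] * x[(dy - i) % n, (dx - j) % n]

# The same convolution through the FFT: O(n^2 log n).
fast = np.real(np.fft.ifft2(np.fft.fft2(w) * np.fft.fft2(x)))

assert np.allclose(direct, fast)  # identical results, very different cost
```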
x may be a Histogram of Oriented Gradients (HOG) feature, a Color Names (CN) feature, a grayscale feature, or another handcrafted feature, or may be a deep feature.
Wherein the acted information (the label l) is as follows:

$$l=\theta\,y+(1-\theta)\,\Delta\big(W_{t-1}^{k}*x_{t-1}^{k}\big)\qquad(2)$$
the post-contribution information may be viewed as a label that determines a filter template for the t-th frame image.
In order to simplify the operation, the positions matched against the label are divided into two regions: one is the center point, assigned the weight b; the other is the non-center-point region, assigned the weight a. This is characterized by the preset weighting window as follows:

$$u(i)=\begin{cases}b,& i\text{ is the center point}\\[2pt] a,&\text{otherwise}\end{cases}\qquad(3)$$
in order to enhance the distinguishing capability of the filtering template for the foreground and the background, the above equations (1), (2) and (3) are combined to obtain:
wherein,b is a mapping matrix of N x D as an auxiliary variable. All values in B are 0 (corresponding to background position) or 1 (corresponding to foreground position) aiming at ignoring tag matching of background part but tag matching concentrated on tracked target object area, and N>>D; (i) representing a mapping function, representing that the characteristic vector x of the kth channel of the t frame is translated by i units to obtain a new vector; u (i) and l (i) represent the values of the ith position in vectors u (characterizing the preset weighting window) and l (characterizing the post-operative information), respectively, and u (i),
Then, solving the above equation (4) with the block coordinate descent method and introducing the auxiliary variable g, we can obtain:

$$\varepsilon\big(W_t^{k},g_t^{k}\big)=\sum_{i=1}^{N}u(i)\Big({g_t^{k}}^{\top}x_t^{k}(i)-l(i)\Big)^{2}+\lambda\big\|W_t^{k}\big\|_2^{2}+\mu\big\|g_t^{k}-BW_t^{k}\big\|_2^{2}\qquad(6)$$

where μ is a penalty parameter for making the constraint condition g_t^k = BW_t^k be satisfied as far as possible. Equation (6) above can be divided into two iterated sub-problems, solving for W_t and g_t respectively.
Taking the derivative of equation (6) with respect to g_t^k and setting it to zero yields:

$$g_t^{k}=\Big(\sum_{i=1}^{N}u(i)^{2}\,x_t^{k}(i)\,x_t^{k}(i)^{\top}+\mu I\Big)^{-1}\Big(\sum_{i=1}^{N}u(i)^{2}\,l(i)\,x_t^{k}(i)+\mu BW_t^{k}\Big)\qquad(7)$$

Taking the derivative of equation (6) with respect to W_t^k and setting it to zero yields:

$$W_t^{k}=\big(\lambda I+\mu B^{\top}B\big)^{-1}\mu B^{\top}g_t^{k}\qquad(8)$$

where I denotes the identity matrix.
Substituting equation (3) into equation (7) reduces the above equation (7) to:

$$g_t^{k}=\big(a^{2}M+(b^{2}-a^{2})\,\varphi\varphi^{\top}+\mu I_{N}\big)^{-1}\Big(\sum_{i=1}^{N}u(i)^{2}\,l(i)\,x_t^{k}(i)+\mu BW_t^{k}\Big)\qquad(9)$$

wherein M is:

$$M=\sum_{i=1}^{N}x_t^{k}(i)\,x_t^{k}(i)^{\top}\qquad(10)$$
To avoid computing the inverse of the matrix directly, based on the properties of the circulant matrix we obtain:

$$M=F^{H}\,\mathrm{diag}\big(\hat{x}_t^{k}\odot\hat{x}_t^{k*}\big)\,F\qquad(11)$$

wherein diag denotes a diagonal matrix, F denotes the Fourier transform matrix, and H denotes the conjugate transpose.
Then, based on the rank-1-update Sherman–Morrison formula:

$$\big(A+xy^{\top}\big)^{-1}=A^{-1}-\frac{A^{-1}xy^{\top}A^{-1}}{1+y^{\top}A^{-1}x}\qquad(12)$$
In the Sherman–Morrison formula, A can be any invertible square matrix; x and y are both vectors.
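The identity can be verified numerically; in the sketch below A is inverted explicitly only for the check, whereas in the solver A⁻¹ is cheap to apply thanks to the diagonalized circulant structure above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n)) + n * np.eye(n)  # a well-conditioned square matrix
x = rng.standard_normal((n, 1))
y = rng.standard_normal((n, 1))

Ainv = np.linalg.inv(A)
# Sherman-Morrison: (A + x y^T)^-1 = A^-1 - (A^-1 x y^T A^-1) / (1 + y^T A^-1 x)
sm = Ainv - (Ainv @ x @ y.T @ Ainv) / (1.0 + float(y.T @ Ainv @ x))

assert np.allclose(sm, np.linalg.inv(A + x @ y.T))
```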
The initial optimal formula for W (through its auxiliary variable g) can be obtained:

$$g_t^{k}=A^{-1}\rho-\frac{(b^{2}-a^{2})\,A^{-1}\varphi\varphi^{\top}A^{-1}\rho}{1+(b^{2}-a^{2})\,\varphi^{\top}A^{-1}\varphi}\qquad(13)$$

wherein A = a²M + μI_N, φ = f_t(1), and ρ = Σᵢ u(i)² l(i) x_t^k(i) + μBW_t^k denotes the right-hand factor of equation (9).
Therefore, by combining the property of the circulant matrix with the above rank-1 optimization, computing the inverse of the n × n matrix is avoided, which greatly accelerates the solving.

The solution space above is very redundant and applies only to a single channel. Each channel of the image can thus be treated identically, and a filtering template for each channel is obtained based on the above solving process.
After the filtering template of the current frame image is obtained, it is regarded as a filter, and the response map of the next frame image is obtained from the filtering template of the current frame image and the feature x of the next frame image:

$$\mathrm{Res}_{t+1}=\sum_{k=1}^{C}W_t^{k}*x_{t+1}^{k}\qquad(14)$$
then, the position of the target object in the target frame of the next frame is calculated based on the following equation:
(x,y)=arg max Rest+1(x,y)
x,y
thus, target tracking for the t +1 frame is completed.
In some optional implementations of this embodiment, for each channel of each frame of image, a multi-scale filtering manner is adopted: the label information and the target scale response information corresponding to the channel are acted on through the preset weighting window to obtain the acted information corresponding to the channel, and the multi-scale filtering templates corresponding to the channel of the current frame image are obtained based on the minimization of the acted information. The target scale response information characterizes the response information of the channel of the current frame image to a single-scale filtering template among the multi-scale filtering templates.
In order to realize multi-scale target tracking, the correlation filtering algorithm generates a plurality of filtering templates of different scales, then generates a plurality of response maps of different scales, and determines from them the response map with the largest response; its scale is taken as the current scaling of the image.
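A minimal sketch of the scale selection, assuming parallel lists of single-scale Fourier-domain templates and correspondingly resampled feature maps (the list-based interface is an assumption of this example):

```python
import numpy as np

def best_scale(templates_freq, feats_by_scale):
    """Evaluate one single-scale filtering template per candidate scale and
    return the index of the scale whose response map peaks highest; that
    scale is taken as the current scaling of the target."""
    best_peak, best_idx = -np.inf, -1
    for idx, (W, x) in enumerate(zip(templates_freq, feats_by_scale)):
        resp = np.real(np.fft.ifft2(W * np.fft.fft2(x)))
        if resp.max() > best_peak:
            best_peak, best_idx = float(resp.max()), idx
    return best_idx
```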
The hyperparameters, i.e., the values a (non-center point) and b (center point) of the preset weighting window u in equation (3), can be flexibly adjusted according to the characteristics of the tracked target and the conditions of the data set, so as to adapt to different situations and improve the universality and accuracy of the method. In some cases the hyperparameters can even be adjusted so that the method degenerates to a general target model, ensuring that the universality of the present application is enhanced on the premise of being no weaker than the results of general methods. Moreover, the method in the present application does not require the images of the video to be processed to satisfy data distribution assumptions such as low rank; instead, treating the center point and non-center points differently through the weighting window enhances the model's ability to distinguish the data in the image and strengthens the robustness and reliability of the target object tracking method.
With continued reference to fig. 3, fig. 3 is a schematic diagram 300 of an application scenario of the target object tracking method according to the present embodiment. In the application scenario of fig. 3, the server 302 first acquires the surveillance video captured by the terminal device 301. For each frame of image in the surveillance video, the server 302 performs the following target tracking operations: first, it determines the filtering template i−1 corresponding to the (t−1)-th frame image, and determines the position information of the target object in the t-th frame image based on the filtering template i−1; then, it determines the response information of the (t−1)-th frame image to the filtering template i−1; and then, it obtains the filtering template i corresponding to the t-th frame image according to the position information and the response information. Furthermore, in the next target tracking operation, the position information of the target object in the (t+1)-th frame image can be determined through the filtering template i corresponding to the t-th frame image.
The method provided by the above embodiment of the present application performs the following target tracking operation for each frame of image in the video to be processed: determining the position information of a target object in the current frame image through a filter template corresponding to the previous frame image; generating response information of the previous frame image to a filtering template corresponding to the previous frame image; and obtaining a filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image, so that the target object tracking method is provided, and the universality, robustness, accuracy and reliability of target tracking are improved.
With continuing reference to FIG. 4, a flow 400 of yet another embodiment of the target object tracking method according to the present application is shown; for each frame of image in the video to be processed, the following target tracking operations are performed:
Step 401, determining the position information of the target object in the current frame image through the filtering template corresponding to the previous frame image.

Step 402, generating the response information of the previous frame image to the filtering template corresponding to the previous frame image.
Step 403, combining the position information and the response information to obtain the label information of the current frame image.
The filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image.
Step 404, acting on the label information and the target response information through the preset weighting window to obtain acted information that distinguishes the matching loss at the center point and non-center points of the target object by the first weight and the second weight, and obtaining the filtering template corresponding to the current frame image based on the minimization of the acted information.
The center point of the preset weighting window is a first weight, and the non-center point of the preset weighting window is a second weight.
As can be seen, compared with the embodiment corresponding to FIG. 2, the flow 400 of the target object tracking method in this embodiment specifies the process of determining the filtering template: in the correlation filtering, a label based on a temporal smoothing assumption is constructed, and a preset weighting window that matches the label with different weights is constructed, further improving the robustness and accuracy of target tracking.
With continuing reference to fig. 5, as an implementation of the method shown in the above figures, the present application provides an embodiment of a target object tracking apparatus, which corresponds to the method embodiment shown in fig. 2 and can be applied to various electronic devices.
As shown in fig. 5, the target object tracking apparatus performs the following target tracking operations for each frame of image in the video to be processed through the following units: a first determining unit 501 configured to determine the position information of the target object in the current frame image through the filtering template corresponding to the previous frame image; a second determining unit 502 configured to generate the response information of the previous frame image to the filtering template corresponding to the previous frame image; and an obtaining unit 503 configured to obtain, according to the position information and the response information, the filtering template corresponding to the current frame image, where the filtering template corresponding to the current frame image is used to determine the position information of the target object in the next frame image.
In some optional implementations of this embodiment, the obtaining unit 503 is further configured to: combining the position information and the response information to obtain label information of the current frame image; and obtaining a filtering template corresponding to the current frame image based on the minimization between the label information and the target response information, wherein the target response information represents the response information of the current frame image to the filtering template corresponding to the current frame image.
In some optional implementations of this embodiment, the obtaining unit 503 is further configured to: acting on the label information and the target response information through a preset weighting window to obtain acted information; and obtaining a filtering template corresponding to the current frame image based on the minimization of the acted information.
In some optional implementation manners of this embodiment, a central point of the preset weighting window is a first weight, and a non-central point of the preset weighting window is a second weight; a deriving unit 503, further configured to: and acting on the label information and the target response information through a preset weighting window to obtain the acted information of the matching loss of a central point and a non-central point of the target object distinguished by the first weight and the second weight.
In some optional implementations of this embodiment, the obtaining unit 503 is further configured to: and based on the minimization of the acted information, obtaining a filtering template corresponding to the current frame image by using a preset constraint aiming at distinguishing the background and the foreground of the target object in the image.
In some optional implementations of this embodiment, the obtaining unit 503 is further configured to: for each channel of each frame image, acting on the label information and the target response information corresponding to the channel through a preset weighting window to obtain acted information corresponding to the channel, and obtaining a filtering template corresponding to the channel of the current frame image based on minimization of the acted information; and a first determination unit further configured to: and acting on each channel of the current frame image based on the one-to-one corresponding filtering template of each channel of the previous frame image to determine the position information of the target object in the current frame image.
In some embodiments, the deriving unit 503 is further configured to: and for each channel of each frame of image, a multi-scale filtering mode is adopted, label information and target scale response information corresponding to the channel are acted through a preset weighting window to obtain acted information corresponding to the channel, and a multi-scale filtering template corresponding to the channel of the current frame of image is obtained based on minimization of the acted information, wherein the target scale response information represents the channel of the current frame of image, and response information of the filtering template with a single scale in the multi-scale filtering template is obtained.
In some embodiments, the apparatus further comprises: and a third determining unit (not shown in the figure) configured to determine, for a first frame image in the video to be processed, a corresponding filtering template of the first frame image according to a target frame representing a target object in the first frame image.
In this embodiment, the target object tracking apparatus performs the following target tracking operation for each frame of image in the video to be processed through the following units: the first determining unit determines the position information of the target object in the current frame image through the filtering template corresponding to the previous frame image; the second determining unit generates the response information of the previous frame image to the filtering template corresponding to the previous frame image; and the obtaining unit obtains the filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image. A target object tracking apparatus is thus provided, improving the universality, robustness and accuracy of target tracking.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing devices of embodiments of the present application (e.g., devices 101, 102, 103, 105 shown in FIG. 1). The apparatus shown in fig. 6 is only an example, and should not bring any limitation to the function and use range of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a processor (e.g., a CPU, central processing unit) 601 that can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the system 600. The processor 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is mounted in the storage section 608 as necessary.
In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program, when executed by the processor 601, performs the above-described functions defined in the method of the present application.
It should be noted that the computer readable medium of the present application can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the client computer, partly on the client computer, as a stand-alone software package, partly on the client computer and partly on a remote computer, or entirely on the remote computer or server. In the remote-computer case, the remote computer may be connected to the client computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor, which may, for example, be described as: a processor comprising a first determining unit, a second determining unit, and an obtaining unit. In some cases, the names of these units do not constitute a limitation of the units themselves; for example, the obtaining unit may also be described as a "unit that obtains a filtering template corresponding to the current frame image according to the position information and the response information".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to perform, for each frame of image in the video to be processed, the following target tracking operation: determining the position information of a target object in the current frame image through a filtering template corresponding to the previous frame image; generating response information of the previous frame image to the filtering template corresponding to the previous frame image; and obtaining a filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image.
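Read as an algorithm, this per-frame loop has the shape of a classic correlation-filter tracker. The sketch below is a minimal single-channel illustration, not the patented implementation: it assumes grayscale search patches, a MOSSE-style closed-form fit in the Fourier domain, and an ideal Gaussian label; all function names and the regularization constant `lam` are illustrative.

```python
import numpy as np

def gaussian_label(shape, center, sigma=2.0):
    """Ideal response: a Gaussian peak at the target position."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((yy - center[0]) ** 2 + (xx - center[1]) ** 2) / (2 * sigma ** 2))

def fit_template(patch, label, lam=1e-2):
    """MOSSE-style closed form: the template whose correlation with the
    patch best matches the label, with ridge regularization."""
    F, Y = np.fft.fft2(patch), np.fft.fft2(label)
    return (Y * np.conj(F)) / (F * np.conj(F) + lam)

def detect(patch, template):
    """Apply the previous frame's template; the response peak gives
    the target position in the current frame."""
    resp = np.real(np.fft.ifft2(template * np.fft.fft2(patch)))
    return np.unravel_index(np.argmax(resp), resp.shape), resp

# dummy video: a list of 64x64 grayscale search patches
rng = np.random.default_rng(0)
video_patches = [rng.random((64, 64)) for _ in range(5)]

template = None
for patch in video_patches:
    if template is None:                      # first frame: assume a centred target box
        pos = (patch.shape[0] // 2, patch.shape[1] // 2)
    else:
        pos, resp = detect(patch, template)   # step 1: locate the target with the old template
    label = gaussian_label(patch.shape, pos)  # steps 2-3: label built from the detected position
    template = fit_template(patch, label)     # new template, used on the *next* frame
```

The key property exploited here is that element-wise operations in the Fourier domain replace spatial correlation, so both detection and the template update run in O(N log N) per frame.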
The above description covers only preferred embodiments of the present application and illustrates the principles of the technology employed. Those skilled in the art will appreciate that the scope of the invention disclosed herein is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but are not limited to) features having similar functions disclosed in the present application.
Claims (13)
1. A method of tracking a target object, comprising: performing, for each frame of image in a video to be processed, the following target tracking operation:
determining the position information of a target object in the current frame image through a filtering template corresponding to the previous frame image;
generating response information of the previous frame image to the filtering template corresponding to the previous frame image;
and obtaining a filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image.
2. The method according to claim 1, wherein obtaining the filtering template corresponding to the current frame image according to the position information and the response information comprises:
combining the position information and the response information to obtain label information of the current frame image;
and obtaining the filtering template corresponding to the current frame image by minimizing the difference between the label information and target response information, wherein the target response information represents the response information of the current frame image to the filtering template corresponding to the current frame image.
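One plausible reading of this claim, sketched below under stated assumptions: the label blends an ideal peak at the detected position with the normalized previous response, and the template is the ridge-regression minimizer of the gap between that label and the template's response on the current frame. The blend weight `alpha`, `sigma`, and `lam` are invented for illustration.

```python
import numpy as np

def combine_label(shape, pos, prev_response, alpha=0.8, sigma=2.0):
    """Label information = position information (ideal peak at the
    detected position) combined with the previous response information."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    ideal = np.exp(-((yy - pos[0]) ** 2 + (xx - pos[1]) ** 2) / (2 * sigma ** 2))
    r = prev_response - prev_response.min()
    r = r / (r.max() + 1e-12)            # normalize the observed response to [0, 1]
    return alpha * ideal + (1.0 - alpha) * r

def minimize_template(patch, label, lam=1e-2):
    """Closed-form minimizer of ||response(template, patch) - label||^2
    + lam * ||template||^2, solved per frequency bin."""
    F, Y = np.fft.fft2(patch), np.fft.fft2(label)
    return (Y * np.conj(F)) / (F * np.conj(F) + lam)
```

Blending in the previous response, rather than regressing to a fixed ideal peak, is what lets the label information carry temporal evidence from frame to frame.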
3. The method of claim 2, wherein obtaining the filtering template corresponding to the current frame image by minimizing the difference between the label information and the target response information comprises:
applying a preset weighting window to the label information and the target response information to obtain weighted information;
and obtaining the filtering template corresponding to the current frame image by minimizing the weighted information.
4. The method according to claim 3, wherein the center point of the preset weighting window carries a first weight and the non-center points of the preset weighting window carry a second weight; and
applying the preset weighting window to the label information and the target response information to obtain the weighted information comprises:
applying the preset weighting window to the label information and the target response information to obtain weighted information in which the first weight and the second weight distinguish the matching loss at the center point of the target object from the matching loss at non-center points.
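A hedged sketch of what such a weighting window could look like: the first weight over a small neighborhood of the target center, the second weight everywhere else, so mismatches at the center contribute more to the matching loss. The neighborhood radius and the weight values are assumptions, not taken from the patent.

```python
import numpy as np

def preset_weighting_window(shape, center, w_center=4.0, w_other=1.0, radius=3):
    """First weight within a small neighbourhood of the centre point,
    second weight at all non-centre points."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    win = np.full(shape, w_other, dtype=float)
    win[(yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2] = w_center
    return win

def weighted_matching_loss(label, target_response, win):
    """The window acts on label and response: centre mismatches cost more."""
    return float(np.sum(win * (label - target_response) ** 2))
```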
5. The method of claim 3, wherein obtaining the filtering template corresponding to the current frame image by minimizing the weighted information comprises:
minimizing the weighted information under a preset constraint for distinguishing the background from the foreground of the target object in the image, to obtain the filtering template corresponding to the current frame image.
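The claim leaves the exact form of the constraint open. One common realization in correlation filtering (for example, BACF-style trackers) is a binary crop mask that confines the template's spatial support to the target region and zeroes the background; the sketch below assumes that reading, and both function names are illustrative.

```python
import numpy as np

def foreground_mask(filter_shape, target_shape):
    """1 over the centred target (foreground) region, 0 over the
    surrounding background; used as a support constraint."""
    mask = np.zeros(filter_shape)
    (fy, fx), (ty, tx) = filter_shape, target_shape
    y0, x0 = (fy - ty) // 2, (fx - tx) // 2
    mask[y0:y0 + ty, x0:x0 + tx] = 1.0
    return mask

def apply_constraint(template_freq, mask):
    """Project a Fourier-domain template onto the constraint set:
    back to the spatial domain, zero the background, forward again."""
    spatial = np.real(np.fft.ifft2(template_freq))
    return np.fft.fft2(spatial * mask)
```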
6. The method according to claim 3, wherein applying the preset weighting window to the label information and the target response information to obtain the weighted information, and obtaining the filtering template corresponding to the current frame image by minimizing the weighted information, comprise:
for each channel of each frame image, applying the preset weighting window to the label information and the target response information corresponding to the channel to obtain weighted information corresponding to the channel, and obtaining a filtering template corresponding to that channel of the current frame image by minimizing the weighted information; and
determining the position information of the target object in the current frame image through the filtering template corresponding to the previous frame image comprises:
applying, to each channel of the current frame image, the filtering template corresponding one-to-one to that channel of the previous frame image, and determining the position information of the target object in the current frame image.
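A sketch of this per-channel variant, assuming a C x H x W feature array and the same closed-form fit as the earlier examples: each channel gets its own template, and at detection time each template acts on its matching channel, with the per-channel responses accumulated before the peak is taken.

```python
import numpy as np

def fit_templates_per_channel(features, label, lam=1e-2):
    """features: C x H x W array; one template per feature channel."""
    Y = np.fft.fft2(label)
    templates = []
    for chan in features:
        F = np.fft.fft2(chan)
        templates.append((Y * np.conj(F)) / (F * np.conj(F) + lam))
    return templates

def detect_per_channel(features, templates):
    """Apply each channel's template to the matching channel of the
    current frame and sum the responses; the peak gives the position."""
    resp = sum(np.real(np.fft.ifft2(H * np.fft.fft2(chan)))
               for chan, H in zip(features, templates))
    return np.unravel_index(np.argmax(resp), resp.shape), resp
```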
7. The method of claim 6, wherein applying the preset weighting window to the label information and the target response information corresponding to the channel to obtain the weighted information corresponding to the channel, and obtaining the filtering template corresponding to that channel of the current frame image by minimizing the weighted information, comprise:
in a multi-scale filtering mode, applying the preset weighting window to the label information and the target scale response information corresponding to the channel, and obtaining multi-scale filtering templates corresponding to the channels of the current frame image by minimizing the weighted information, wherein the target scale response information represents the response information of the channel of the current frame image to a single-scale filtering template among the multi-scale filtering templates.
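Multi-scale filtering can be sketched as one template per scale, with the target scale chosen by whichever single-scale template produces the strongest peak. A single-channel illustration follows; the scale set and the nearest-neighbor rescale are assumptions.

```python
import numpy as np

def rescale(patch, s):
    """Nearest-neighbour rescale about the patch centre, same shape."""
    h, w = patch.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy = np.clip(np.round((np.arange(h) - cy) / s + cy).astype(int), 0, h - 1)
    xx = np.clip(np.round((np.arange(w) - cx) / s + cx).astype(int), 0, w - 1)
    return patch[np.ix_(yy, xx)]

def best_scale(patch, scale_templates, scales=(0.95, 1.0, 1.05)):
    """Evaluate each single-scale template and keep the strongest peak."""
    best = None
    for s, template in zip(scales, scale_templates):
        resp = np.real(np.fft.ifft2(template * np.fft.fft2(rescale(patch, s))))
        peak = resp.max()
        if best is None or peak > best[0]:
            best = (peak, s, np.unravel_index(np.argmax(resp), resp.shape))
    return best  # (peak value, chosen scale, position at that scale)
```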
8. The method of claim 1, further comprising:
for a first frame image in the video to be processed, determining a filtering template corresponding to the first frame image according to a target box used for representing the target object in the first frame image.
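A minimal reading of this first-frame initialization: crop the patch indicated by the target box out of the first frame and fit the initial template against an ideal centered Gaussian label. The (x, y, w, h) box convention and all parameters below are assumptions.

```python
import numpy as np

def init_template(first_frame, box, lam=1e-2, sigma=2.0):
    """first_frame: H x W array; box = (x, y, w, h) around the target."""
    x, y, w, h = box
    patch = first_frame[y:y + h, x:x + w]
    yy, xx = np.mgrid[0:h, 0:w]
    label = np.exp(-((yy - (h - 1) / 2) ** 2 + (xx - (w - 1) / 2) ** 2) / (2 * sigma ** 2))
    F, Y = np.fft.fft2(patch), np.fft.fft2(label)
    return (Y * np.conj(F)) / (F * np.conj(F) + lam)

# usage: build the template for frame 1, then the per-frame loop takes over
rng = np.random.default_rng(0)
frame0 = rng.random((240, 320))
template0 = init_template(frame0, box=(120, 80, 64, 64))
```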
9. A target object tracking device that performs, by the following units, a target tracking operation for each frame of image in a video to be processed:
a first determining unit configured to determine the position information of the target object in the current frame image through a filtering template corresponding to the previous frame image;
a second determining unit configured to generate response information of the previous frame image to the filtering template corresponding to the previous frame image;
and an obtaining unit configured to obtain a filtering template corresponding to the current frame image according to the position information and the response information, wherein the filtering template corresponding to the current frame image is used for determining the position information of the target object in the next frame image.
10. The apparatus of claim 9, wherein the obtaining unit is further configured to:
combine the position information and the response information to obtain label information of the current frame image; and obtain the filtering template corresponding to the current frame image by minimizing the difference between the label information and target response information, wherein the target response information represents the response information of the current frame image to the filtering template corresponding to the current frame image.
11. The apparatus of claim 10, wherein the obtaining unit is further configured to:
apply a preset weighting window to the label information and the target response information to obtain weighted information; and obtain the filtering template corresponding to the current frame image by minimizing the weighted information.
12. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-8.
13. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon, wherein the one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110592320.0A CN113393493B (en) | 2021-05-28 | 2021-05-28 | Target object tracking method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113393493A | 2021-09-14 |
CN113393493B | 2024-04-05 |
Family
ID=77619549
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110592320.0A (granted as CN113393493B, active) | Target object tracking method and device | 2021-05-28 | 2021-05-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113393493B (en) |
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109146912A * | 2018-07-26 | 2019-01-04 | Hunan University of Humanities, Science and Technology | Visual target tracking method based on objective analysis |
CN109410247A * | 2018-10-16 | 2019-03-01 | China University of Petroleum (East China) | Video tracking algorithm with multiple templates and adaptive feature selection |
CN110084836A * | 2019-04-26 | 2019-08-02 | Xidian University | Target tracking method based on response fusion of deep convolution dividing features |
CN110097575A * | 2019-04-28 | 2019-08-06 | University of Electronic Science and Technology of China | Target tracking method based on local features and scale pooling |
WO2020228522A1 * | 2019-05-10 | 2020-11-19 | Tencent Technology (Shenzhen) Co., Ltd. | Target tracking method and apparatus, storage medium and electronic device |
US20200380274A1 * | 2019-06-03 | 2020-12-03 | Nvidia Corporation | Multi-object tracking using correlation filters in video analytics applications |
CN110349190A * | 2019-06-10 | 2019-10-18 | Guangzhou Shiyuan Electronic Technology Co., Ltd. | Adaptive-learning target tracking method, apparatus and device, and readable storage medium |
CN111161323A * | 2019-12-31 | 2020-05-15 | Chongqing Innovation Center of Beijing Institute of Technology | Complex-scene target tracking method and system based on correlation filtering |
CN111899278A * | 2020-06-22 | 2020-11-06 | Beihang University | Mobile-terminal-based rapid target tracking method for unmanned aerial vehicle images |
CN111931722A * | 2020-09-23 | 2020-11-13 | Hangzhou Shiyu Intelligent Vision System Technology Co., Ltd. | Correlation filtering tracking method combining color ratio features |
CN112036381A * | 2020-11-03 | 2020-12-04 | Shenzhen Research Institute of Sun Yat-sen University | Visual tracking method, video monitoring method and terminal equipment |
CN112233143A * | 2020-12-14 | 2021-01-15 | Zhejiang Dahua Technology Co., Ltd. | Target tracking method, device and computer readable storage medium |
Non-Patent Citations (1)
Title |
---|
HOU Ying; WANG Ying; LIN Xinyu: "Research on multi-scale video target tracking algorithms", Information Technology and Informatization, no. 04 *
Also Published As
Publication number | Publication date |
---|---|
CN113393493B (en) | 2024-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10902245B2 (en) | Method and apparatus for facial recognition | |
CN108898086B (en) | Video image processing method and device, computer readable medium and electronic equipment | |
CN109816589B (en) | Method and apparatus for generating cartoon style conversion model | |
CN109255337B (en) | Face key point detection method and device | |
CN108197618B (en) | Method and device for generating human face detection model | |
CN112132847A (en) | Model training method, image segmentation method, device, electronic device and medium | |
CN109389072B (en) | Data processing method and device | |
US11941529B2 (en) | Method and apparatus for processing mouth image | |
EP3598386A1 (en) | Method and apparatus for processing image | |
CN109377508B (en) | Image processing method and device | |
CN110516678B (en) | Image processing method and device | |
CN110059623B (en) | Method and apparatus for generating information | |
CN113379627A (en) | Training method of image enhancement model and method for enhancing image | |
CN110874853A (en) | Method, device and equipment for determining target motion and storage medium | |
CN112907628A (en) | Video target tracking method and device, storage medium and electronic equipment | |
CN112991218B (en) | Image processing method, device, equipment and storage medium | |
CN114677422A (en) | Depth information generation method, image blurring method and video blurring method | |
CN111445496B (en) | Underwater image recognition tracking system and method | |
CN110288625B (en) | Method and apparatus for processing image | |
CN117011137B (en) | Image stitching method, device and equipment based on RGB similarity feature matching | |
CN110852250B (en) | Vehicle weight removing method and device based on maximum area method and storage medium | |
CN109523564B (en) | Method and apparatus for processing image | |
CN110895699B (en) | Method and apparatus for processing feature points of image | |
CN110827254A (en) | Method and device for determining image definition | |
US20240177409A1 (en) | Image processing method and apparatus, electronic device, and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | ||
Address after: 100176 601, 6th floor, building 2, No. 18, Kechuang 11th Street, Daxing Economic and Technological Development Zone, Beijing
Applicant after: Jingdong Technology Information Technology Co.,Ltd.
Address before: 100176 601, 6th floor, building 2, No. 18, Kechuang 11th Street, Daxing Economic and Technological Development Zone, Beijing
Applicant before: Jingdong Shuke Haiyi Information Technology Co.,Ltd.
GR01 | Patent grant | ||