CN110610510A - Target tracking method and device, electronic equipment and storage medium - Google Patents

Target tracking method and device, electronic equipment and storage medium

Info

Publication number
CN110610510A
CN110610510A (application CN201910810400.1A)
Authority
CN
China
Prior art keywords
frame
target
detection
detection data
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910810400.1A
Other languages
Chinese (zh)
Other versions
CN110610510B (en)
Inventor
许明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910810400.1A priority Critical patent/CN110610510B/en
Publication of CN110610510A publication Critical patent/CN110610510A/en
Application granted granted Critical
Publication of CN110610510B publication Critical patent/CN110610510B/en
Current legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00: Television systems
    • H04N 7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20024: Filtering details

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a target tracking method, a target tracking device, an electronic device, and a storage medium, belonging to the technical field of image processing. The method comprises the following steps: acquiring a plurality of frame images of a video to be tracked, and acquiring a plurality of key frames from the plurality of frame images; inputting the plurality of key frames into a detection network to obtain detection data, wherein the detection data comprises position information of a target object; and tracking the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm. When the plurality of frame images are acquired, the plurality of key frames can be acquired from them; the introduction of key frames effectively filters out blurred images from the plurality of frame images, which can improve the accuracy of target tracking to a certain extent.

Description

Target tracking method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a target tracking method and apparatus, an electronic device, and a storage medium.
Background
Target tracking technology has developed rapidly in recent years and remains an important subject and active research topic. Target tracking estimates the position and the area occupied by the shape of a tracked target in a continuous sequence of video images, determines motion information of the target such as its speed, direction, and trajectory, and thereby enables analysis and understanding of the moving target's motion in support of higher-level tasks. Clearly, target tracking is one of the core problems of video analysis, with wide application in fields such as video surveillance, human-computer interaction, and virtual reality. Although research on the technology has advanced greatly over the past decades, because of factors such as scale changes, illumination changes, fast motion, partial occlusion, and complex backgrounds, finding a tracking method that is fast, stable, and adaptable to changes in the object remains a challenge.
Disclosure of Invention
In view of the above, the present application provides a target tracking method, an apparatus, an electronic device, and a storage medium to solve the above problems.
In a first aspect, an embodiment of the present application provides a target tracking method, which is applied to an electronic device, and the method includes: acquiring a plurality of frame images of a video to be tracked, and acquiring a plurality of key frames from the plurality of frame images; inputting the plurality of key frames into a detection network to obtain detection data, wherein the detection data comprises position information of a target object; and tracking the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm.
In a second aspect, an embodiment of the present application provides a target tracking apparatus, which is applied to an electronic device. The apparatus comprises: a key frame acquisition module, a detection data acquisition module, and a target tracking module. The key frame acquisition module is used for acquiring a plurality of frame images of a video to be tracked and acquiring a plurality of key frames from the plurality of frame images. The detection data acquisition module is used for inputting the plurality of key frames into a detection network to obtain detection data, the detection data comprising position information of the target object. The target tracking module is used for tracking the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm.
In a third aspect, an embodiment of the present application further provides an electronic device, including one or more processors; a memory; and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the above-described method.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium, where a program code is stored in the computer-readable storage medium, and the program code can be called by a processor to execute the above method.
Compared with the prior art, the target tracking method, apparatus, electronic device, and storage medium provided by the embodiments of the application can acquire a plurality of frame images of a video to be tracked, acquire a plurality of key frames from the frame images, and input the plurality of key frames into a detection network to obtain detection data, where the detection data can include position information of a target object; finally, the target object in the plurality of frame images is tracked by using the acquired detection data and a kernel correlation filtering algorithm. After the plurality of frame images are acquired, they are not input directly into the detection network; instead, the plurality of key frames are first acquired from the frame images and then input into the detection network, which effectively filters out blurred images and improves the accuracy of target tracking to a certain extent.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and that other drawings can be obtained from them by those of ordinary skill in the art without creative effort.
FIG. 1 illustrates an overall framework diagram of object tracking provided by embodiments of the present application;
FIG. 2 illustrates a method flow diagram of a target tracking method provided by an embodiment of the present application;
FIG. 3 illustrates a method flow diagram of a target tracking method provided by another embodiment of the present application;
FIG. 4 shows a detailed flowchart of step S302 in the target tracking method provided by another embodiment of the present application;
FIG. 5 is a schematic diagram illustrating a target tracking method according to another embodiment of the present application for obtaining a plurality of keyframes;
FIG. 6 illustrates a method flow diagram of a target tracking method provided by yet another embodiment of the present application;
FIG. 7 illustrates a block diagram of a target tracking device provided by an embodiment of the present application;
FIG. 8 shows a block diagram of an electronic device provided by an embodiment of the present application;
FIG. 9 shows a storage unit for storing or carrying program code that implements the target tracking method according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
With the continuous development of computer technology, people's requirements for smart cameras keep rising, and different smart-camera functions have been developed accordingly. For example, with features such as Baidu's Baidu Lens and Huawei's intelligent recognition, a smart camera can display detailed information about the objects on its screen in real time. Generally, multiple objects are displayed on the screen, and in order to display the tag information of each object intuitively, each tag must be able to follow its object in real time, which imposes a tracking requirement on the electronic device. Existing target tracking methods generally use a CNN detection network to give the position of the target object, so during tracking the tag easily jitters around the object or gets stuck. Although some target tracking methods solve these problems, the cost of the improved accuracy is usually reduced real-time performance of the tracking.
Therefore, in order to overcome the above defects, the present application provides the overall framework of target tracking shown in fig. 1. As can be seen from fig. 1, target tracking can be divided into two parts: the first part acquires the data to be filtered, and the second part acquires the detection frame. That is, the application can acquire both the data to be filtered and the detection frame corresponding to the original image, and track the target object by combining the two. Specifically, the original image may be input to an initialization extractor, which performs feature extraction on the original image.
The initialization extractor in the embodiments of the present application may include a target detection network, which may be a MobileNet-SSD (a deep convolutional neural network), GOTURN (Generic Object Tracking Using Regression Networks), or the like. After initialization-time feature extraction is performed, feature data of the tracking target can be acquired and stored in a database (DB) of the electronic device, and the target can then be tracked using this feature data. In addition, as can be seen from fig. 1, during target tracking the application may continuously acquire new detection frames by using the update extractor, and may thereby continuously update the feature data stored in the database.
As shown in fig. 2, an embodiment of the present application provides a target tracking method applied to an electronic device, and the method may include steps S201 to S203.
Step S201: the method comprises the steps of obtaining a plurality of frame images of a video to be tracked, and obtaining a plurality of key frames from the plurality of frame images.
In an embodiment of the present application, the electronic device may be a mobile phone, a notebook computer, a tablet computer, a palmtop computer, a personal digital assistant (PDA), a mobile Internet device (MID), a wearable device (e.g., a smart watch such as an iWatch, a smart bracelet, or a pedometer), or another electronic device on which an instant messaging application client can be installed and deployed.
In an embodiment, after the video to be tracked is acquired, it may be subjected to framing processing to obtain a plurality of frame images, and the related parameters of the video to be tracked are stored. The related parameters may include: the video duration, the video coding format, the total number of video frames, the recording time point, the exposure time, the frame rate, the latitude and longitude of the recording location, and the like. In the embodiment of the present application, the acquisition of the frame images may start from the first frame of the video and proceed frame by frame in order, with the corresponding state and parameters saved after each frame image is acquired. For example, acquiring frame images from a video file requires saving the file name of the video file, the corresponding decoding type, and so on; the specific way in which the frame images of the video to be tracked are acquired is not limited here.
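As a concrete illustration of this framing step, the sketch below uses OpenCV to split a video file into frame images while recording related parameters such as the frame rate and total frame count. The use of OpenCV and the parameter names are assumptions for illustration, not choices specified by the patent.

```python
import cv2

def split_video_into_frames(video_path):
    """Read a video frame by frame and record its related parameters."""
    cap = cv2.VideoCapture(video_path)   # decoding backend chosen by OpenCV
    params = {
        "file_name": video_path,                          # saved for later decoding
        "fps": cap.get(cv2.CAP_PROP_FPS),                 # video frame rate
        "total_frames": int(cap.get(cv2.CAP_PROP_FRAME_COUNT)),
    }
    frames = []
    while True:
        ok, frame = cap.read()   # frame-by-frame, starting from the first frame
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames, params
```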
When the electronic device acquires the plurality of frame images of the video to be tracked, a plurality of key frames can be acquired from them, where the key frames may be any number of images selected at random from the frame images, or images selected according to a certain rule. The embodiment of the application can select the key frames by comparing the pixel values of the frame images, the pixel values of the key frames being higher than those of ordinary frame images; key frames can also be selected by comparing the brightness value of each frame image, i.e., the key frames have higher brightness values; alternatively, the distortion value, equalization value, or resolution of each frame image may be used to acquire the key frames. Clearly, a key frame mainly refers to an image of better quality among the frame images. In the embodiment of the present application, the key frames may be obtained by any one of the above methods or by combining them; a detailed description of how the key frames are obtained is omitted here.
Step S202: and inputting the plurality of key frames into a detection network to obtain detection data, wherein the detection data comprises the position information of the target object.
In an embodiment, the electronic device may input the acquired key frames into a detection network and perform feature extraction with it to obtain detection data, where the detection data may include position information of the target object. In the embodiment of the application, the detection network may be a target detection network, also called a target extraction network, whose main function is to segment the target object from the target image and identify it. Commonly used target detection networks include the GOTURN network, the MobileNet-SSD deep convolutional neural network, the Faster R-CNN neural network, the YOLO neural network, and the SPP-Net (Spatial Pyramid Pooling) neural network; the neural networks commonly used for video tracking are GOTURN and MobileNet-SSD.
The GOTURN neural network is a target tracking algorithm trained offline with a convolutional neural network. It extracts features with a CNN classification network pre-trained on an existing large-scale classification data set and classifies with an observation model to obtain the tracking result. The tracking process obtains the feature representations of the current frame image and the previous frame image through VGG-Net (Very Deep Convolutional Networks for Large-Scale Image Recognition), and then obtains the position of the target in the current frame through an offline-trained decision network. The GOTURN neural network tracks a complete target well in simple scenes and is strongly robust to scale changes, deformation, and the like.
The MobileNet-SSD deep convolutional neural network is based on a streamlined architecture and builds a lightweight deep neural network mainly through depthwise separable convolutions; that is, it is primarily a lightweight deep network model designed for mobile terminals. The MobileNet-SSD deep convolutional neural network includes two simple global hyperparameters that effectively trade off latency against accuracy, allowing the model builder to select a model of appropriate size according to the constraints of the problem, with slightly reduced accuracy but better real-time performance.
In one embodiment, the MobileNet-SSD deep convolutional neural network can adopt a feature pyramid for target detection. During detection, MobileNet (the base network) can be used as a feature extractor to extract several feature maps of different sizes, and Softmax classification and position regression are performed simultaneously on these feature maps. For a feature map with width W and height H, W x H prior boxes (preselected boxes) are set, with the centers of the prior boxes distributed uniformly over the whole map; the accurate positions of the targets are then acquired through Softmax classification and bounding box regression. In addition, the MobileNet-SSD deep convolutional neural network can run on a Neural Compute Stick, a USB deep learning device slightly larger than a USB flash drive; it consumes 1 W of power while reaching about 100 GFLOPS of floating-point performance, giving it the advantages of low power consumption and high performance.
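The uniform distribution of preselected-box centers described above can be sketched as follows. This is a schematic of the general SSD prior-box idea rather than code from the patent; box sizes and aspect ratios are omitted for brevity.

```python
def prior_box_centers(feature_w, feature_h):
    """Place one prior-box center per cell of a W x H feature map,
    uniformly distributed over the whole (normalized) image."""
    centers = []
    for j in range(feature_h):
        for i in range(feature_w):
            cx = (i + 0.5) / feature_w   # normalized x in [0, 1]
            cy = (j + 0.5) / feature_h   # normalized y in [0, 1]
            centers.append((cx, cy))
    return centers   # W * H centers, one per feature-map cell
```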
Compared with the GOTURN neural network, the MobileNet-SSD deep convolutional neural network offers higher detection precision, stronger robustness, and better real-time performance during target tracking; that is, it outputs detection frames with higher precision and at a higher speed. It can also preprocess the image according to the detection result and pass the preprocessed image to a next-stage classification model. Therefore, the embodiment of the application uses the MobileNet-SSD deep convolutional neural network as the detection network, mainly because it balances the speed and the accuracy of target detection well.
In one embodiment, the MobileNet-SSD deep convolutional neural network may include four parts: the first part is the input layer, mainly used for inputting images, i.e., the key frames; the second part is an improved MobileNet convolutional network, mainly used for extracting features of the input key frames; the third part is the SSD meta-structure, used for classification regression and boundary regression; the fourth part is the output layer, mainly used for outputting the detection result. The SSD structure is a regression model that performs classification regression and bounding-box regression using features output by different convolutional layers; it relieves the contradiction between translation invariance and translation variability well and strikes a good compromise between detection precision and speed, improving detection precision while also improving detection speed.
The process of obtaining the detection data with the MobileNet-SSD deep convolutional neural network is as follows: acquire the first key frame, acquire the target detection frame corresponding to the target object in that key frame, and store the position information corresponding to the target detection frame as detection data; then read the next key frame, acquire the target detection frame corresponding to the target object in it, and store the corresponding position information as detection data; repeat these steps until the target detection frame corresponding to the target object in every key frame has been acquired.
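A hedged sketch of this key-frame detection loop is given below, using the publicly available Caffe release of MobileNet-SSD through OpenCV's dnn module. The model file names, the 300x300 input size, and the scale/mean values are the commonly used defaults for that public model and are assumptions here, not values specified by the patent.

```python
import cv2

# Assumed file names of the public Caffe MobileNet-SSD release.
net = cv2.dnn.readNetFromCaffe("MobileNetSSD_deploy.prototxt",
                               "MobileNetSSD_deploy.caffemodel")

def detect_key_frame(frame):
    """Return (class_id, confidence, box) tuples for one key frame."""
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, scalefactor=0.007843,
                                 size=(300, 300), mean=127.5)
    net.setInput(blob)
    out = net.forward()              # shape (1, 1, N, 7)
    results = []
    for det in out[0, 0]:
        class_id, conf = int(det[1]), float(det[2])
        box = det[3:7] * [w, h, w, h]   # scale normalized coords to pixels
        results.append((class_id, conf, box.astype(int)))
    return results

key_frames = []      # filled by the key frame extractor of step S201
detection_db = []    # plays the role of the database (DB)
for key_frame in key_frames:
    detection_db.append(detect_key_frame(key_frame))
```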
Step S203: and tracking the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm.
In one embodiment, once the detection data has been acquired and stored in the database, the target object in the frame images may be tracked using a Kernel Correlation Filter (KCF), a target tracking algorithm based on an online learning model. The KCF tracking process can be divided into four steps: target feature extraction, template training, target position prediction, and template updating. KCF adopts a ridge regression model; ridge regression is a linear regression model with a simple closed-form solution, yet its classification effect is comparable to that of more complex models such as SVMs.
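For reference, the closed-form solution mentioned above can be written out explicitly. This is the standard ridge regression / KCF formulation from the literature, not notation taken from the patent itself.

```latex
% Ridge regression objective and its closed-form solution:
\min_{w}\ \lVert Xw - y \rVert^2 + \lambda \lVert w \rVert^2
\quad\Longrightarrow\quad
w = (X^{\top}X + \lambda I)^{-1} X^{\top} y
% Kernelized (dual) form with kernel matrix K:
\alpha = (K + \lambda I)^{-1} y
% KCF exploits the circulant structure of shifted samples to solve this
% element-wise in the Fourier domain:
\hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda}
```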
The specific process of tracking the target object in the frame images using the detection data and the kernel correlation filtering algorithm is as follows: the position information of the target object contained in the detection data is passed to the kernel correlation filtering algorithm, which tracks the detected target object to obtain its motion trajectory. The kernel correlation filtering algorithm is a discriminative tracking method: it trains a target detector during tracking, uses the target detector to check whether the predicted position in the next key frame is the target, and then updates the target detector with the new detection result. When the target detector is trained, the target area is generally selected as a positive sample and the areas around the target as negative samples, with areas closer to the target having a higher probability of being positive samples.
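A minimal sketch of handing the detected position to a kernel correlation filter is shown below, using the KCF tracker from opencv-contrib as a stand-in for the patent's own implementation. Depending on the OpenCV build, the constructor is cv2.TrackerKCF_create() or cv2.legacy.TrackerKCF_create(), and the initial bounding box is assumed to come from the detection data above.

```python
import cv2

def track_with_kcf(frames, init_box):
    """Track one target through a list of frames.

    init_box: (x, y, w, h) from the detection data, used to initialize KCF.
    Returns the per-frame box positions, i.e. the motion trajectory.
    """
    tracker = cv2.TrackerKCF_create()   # cv2.legacy.TrackerKCF_create() on some builds
    tracker.init(frames[0], init_box)
    trajectory = [init_box]
    for frame in frames[1:]:
        ok, box = tracker.update(frame)   # predict the position in this frame
        if ok:
            trajectory.append(tuple(int(v) for v in box))
    return trajectory
```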
According to the target tracking method described above, after the frame images of the video to be tracked are acquired, the key frames can be acquired from them. Since the overall quality of the key frames is better than that of the frame images as a whole, inputting the key frames into the detection network makes the acquired detection data more accurate, i.e., the efficiency and accuracy of target detection can be improved to a certain extent. Meanwhile, tracking the target object in the frame images with the kernel correlation filtering algorithm can meet the requirements of stable and effective tracking.
Another embodiment of the present application provides a target tracking method applied to an electronic device, please refer to fig. 3, where the target tracking method may include steps S301 to S304.
Step S301: the method comprises the steps of obtaining a plurality of frame images of a video to be tracked, and dividing the plurality of frame images into N image groups.
As described above, when a video to be tracked is acquired, the electronic device may perform framing processing on it to obtain a plurality of frame images. To improve the accuracy of target tracking, the embodiment of the present application may acquire a fixed number of key frames from the frame images, so the frame images need to be divided into N image groups, with each image group containing the same number of frame images. For example, a video to be tracked is acquired and framed into 14400 frame images, which can be divided into 720 image groups, i.e., each image group contains 20 frame images.
Step S302: and acquiring a key frame from each image group to obtain a plurality of key frames.
In an embodiment, after dividing a plurality of frame images of a video to be tracked into N image groups, one key frame may be obtained from each of the image groups, so that a plurality of key frames may be obtained. For example, after 14400 frame images can be divided into 720 image groups, one key frame can be obtained from each image group, that is, one key frame is selected from 20 frame images, and finally 720 key frames can be obtained. The specific steps of obtaining a plurality of key frames from each of the image groups are shown in fig. 4, and it can be known from fig. 4 that obtaining a plurality of key frames may include steps S401 to S405.
Step S401: an alternate frame is obtained from the current group of images.
In one embodiment, the first frame image in the current image group may be used as the candidate frame, or one frame image may be randomly acquired from the current image group as the candidate frame. In a specific implementation, each image group includes 20 frame images, in the embodiment of the present application, a first frame of the 20 frame images may be used as a candidate frame, or a tenth frame of the 20 frame images may be used as a candidate frame, and specifically which frame image in the current image group is used as a candidate frame is not specifically limited here.
Step S402: and judging whether the alternative frame is a clear image.
In one embodiment, determining whether the candidate frame is a clear image may include: acquiring the pixel mean value of the candidate frame and judging whether it is greater than a preset threshold; when the pixel mean value is greater than the preset threshold, the candidate frame is a clear image, and the process proceeds to step S403. The pixel mean value is computed by taking the mean of all R-channel, G-channel, and B-channel pixels in the candidate frame and then averaging the three channel means; the resulting value is the pixel mean value. When the pixel mean value is less than or equal to the preset threshold, it may be determined that the candidate frame is not a clear image, i.e., it is a blurred image; in that case the next frame image after the candidate frame may be acquired, i.e., the process proceeds to step S404.
In a specific embodiment, the current image group may include 20 frame images. The first of the 20 frame images is first used as the candidate frame, its pixel mean value is acquired, and whether the pixel mean value is greater than the preset threshold is judged. If so, the first frame image is a clear image and may be used as the key frame; if the pixel mean value is less than or equal to the preset threshold, the second of the 20 frame images is acquired and judged for clarity, and so on until a clear image is obtained. In addition, in the embodiment of the present application, whether the candidate frame is blurred may be determined by Laplacian edge detection, by the Brenner sharpness function, or by the Tenengrad gradient function; which method is used to judge the sharpness of the candidate frame is not limited here.
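The two checks discussed in this step can be sketched as below: the pixel-mean test against a preset threshold, and a Laplacian-based alternative that scores edge strength. The threshold value is an illustrative assumption, not a value given by the patent.

```python
import cv2

def pixel_mean(frame):
    """Average the R, G, and B channel means, as described above."""
    channel_means = frame.reshape(-1, 3).mean(axis=0)   # per-channel mean (3-channel image)
    return float(channel_means.mean())

def is_sharp(frame, threshold=100.0):
    """The candidate frame counts as a clear image if its pixel mean
    exceeds the preset threshold (threshold value is illustrative)."""
    return pixel_mean(frame) > threshold

def laplacian_sharpness(frame):
    """Alternative check: variance of the Laplacian; larger means sharper."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()
```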
Step S403: and taking the alternative frame as a key frame of the current image group.
If the candidate frame is determined to be a clear image, the candidate frame is used as a key frame of the current image group, after which the next image group can be acquired, i.e. the process proceeds to step S405.
Step S404: and acquiring a next frame image of the alternative frame, and taking the next frame image as a new alternative frame.
If the candidate frame is not a clear image, acquiring a next frame image of the candidate frame, taking the next frame image as a new candidate frame, and returning to execute the operation of judging whether the candidate frame is a clear image, namely, entering step S402. For example, after a first frame image of a first image group is acquired, it is determined that the first frame image is a blurred image, at this time, a second frame image of the first image group may be acquired, and the second frame image is used as a candidate frame to determine whether a pixel mean value of the second frame image is greater than a preset threshold, if the pixel mean value of the second frame image is still less than or equal to the preset threshold, a third frame image of the first image group is acquired, and it is determined whether a pixel mean value of the third frame image is greater than the preset threshold, and the steps are sequentially performed until a key frame meeting a condition is acquired.
It should be noted that, if the pixel mean values of all the frame images in the current image group are less than or equal to the preset threshold, the pixel mean values of the frame images may be sorted, and the frame image with the largest pixel mean value is selected as the key frame of the current image group; the last frame image in the current image group may also be directly used as the key frame of the current image group, or the key frame may not be acquired from the current image group, and the key frame may continue to be acquired from the next image group.
Step S405: and acquiring a next image group, taking the next image group as a current image group, and acquiring a key frame in the current image group.
In one embodiment, if the candidate frame is a clear image, the next image group may be acquired after the candidate frame is used as the key frame of the current image group, and, taking the next image group as the current image group, a key frame is acquired in it. To show more clearly how multiple key frames are obtained from the frame images, a specific example is given in which one key frame is obtained from every 20 frames. First, the initial frame (reference numeral 1) of the first image group is used as the candidate frame and judged for blur; if it is not blurred, the initial frame is input into the detection network as a key frame. The second image group is then acquired and its first frame (reference numeral 21) used as the candidate frame; if frame 21 is blurred, the second frame of the group (reference numeral 22) is acquired and used as the candidate frame, and if frame 22 is not blurred it is input into the detection network. The third image group is then acquired and its second frame (reference numeral 42) used as the candidate frame, and so on until the key frames of all image groups are obtained.
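Putting steps S401 to S405 together, the walk-through above corresponds to the following sketch. It reuses is_sharp() and pixel_mean() from the sharpness sketch earlier, assumes groups of 20 frames, and applies the fallback described above of taking the frame with the largest pixel mean when no frame in a group passes the threshold; the carried-over candidate index is an interpretation that mirrors the frame numbering in the walk-through.

```python
def select_key_frames(frames, group_size=20):
    """One key frame per group of `group_size` frames (steps S401 to S405).

    Relies on is_sharp() and pixel_mean() from the sharpness sketch above.
    The candidate index carries over between groups, mirroring the
    walk-through: if frame 22 (index 1) is chosen in group 2, group 3
    starts its search at frame 42 (index 1 of that group).
    """
    key_frames, offset = [], 0
    for start in range(0, len(frames), group_size):
        group = frames[start:start + group_size]
        chosen = None
        for idx in range(offset, len(group)):       # S401, S402, S404
            if is_sharp(group[idx]):
                chosen, offset = group[idx], idx    # S403
                break
        if chosen is None:
            # Fallback described above: no frame in the group passed the
            # threshold, so take the frame with the largest pixel mean.
            chosen, offset = max(group, key=pixel_mean), 0
        key_frames.append(chosen)                   # S405: move to next group
    return key_frames
```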
As shown in fig. 5, a specific schematic diagram of acquiring multiple key frames is given. As can be seen from fig. 5, the key frames are acquired mainly by a key frame extractor; before key frames are selected, the frame images need to be sent to the key frame extractor as a data stream. The frame rate of the data stream (frames sent per second) is 30 fps, so the time interval covered by every 20 frames is 20/30 s ≈ 0.67 s. This is also the update frequency of the convolution kernel of the kernel correlation filtering algorithm, which fully satisfies the scene-update frequency of actual use and avoids the power-consumption problem caused by running the detection network model too frequently.
In an embodiment, the key frame extractor may select one key frame from every 20 frame images. Before this, a candidate frame is acquired and judged for blur through Laplacian edge detection; if the candidate frame is not blurred, it is taken as the key frame, 20 frames are skipped, and the next candidate frame is acquired and judged for blur in turn. If the current candidate is blurred, the frame following it is selected and the blur judgment is repeated, and so on. The initial frame may be regarded as a special key frame (the first key frame), and the blur judgment may consist of checking whether the pixel mean value of the candidate frame is greater than the preset threshold; if so, the candidate frame is sent to the detection network as a key frame.
It should be noted that in the embodiment of the present application, one, two, or three key frames may be acquired from each image group; how many key frames are acquired from each image group is not specifically limited and may be chosen according to the actual situation. For example, if the overall quality of the video to be tracked is good, a few more key frames can be acquired from each image group; if the overall quality is poor and the sharpness is low, the number of key frames acquired from each image group can be reduced appropriately.
Step S303: and inputting the plurality of key frames into a detection network to obtain detection data, wherein the detection data comprises the position information of the target object.
Step S304: and tracking the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm.
In the target tracking method provided by this embodiment of the application, in order to make target tracking more accurate, the acquired frame images can be divided into N image groups and a key frame acquired from each image group, where the key frames can be selected by judging the pixel mean values of the images.
Still another embodiment of the present application provides a target tracking method applied to an electronic device, please refer to fig. 6, where the target tracking method may include steps S601 to S606.
Step S601: the method comprises the steps of obtaining a plurality of frame images of a video to be tracked, and obtaining a plurality of key frames from the plurality of frame images.
Step S602: and inputting the plurality of key frames into a detection network to obtain detection data, wherein the detection data comprises the position information of the target object.
In an embodiment, in order to make target tracking more accurate, the key frames may be input into the detection network to obtain the target detection frame corresponding to the target object in each key frame; the confidence corresponding to each target detection frame is then obtained, and whether it is greater than a confidence threshold is judged. If the confidence of the target detection frame is greater than the confidence threshold, the position information corresponding to the target detection frame is used as detection data. The confidence of a target detection frame refers to the probability that the candidate box really contains the target, and how the confidence is calculated depends on the specific detection network.
In addition, if the confidence of a target detection frame is less than or equal to the confidence threshold, its position information is not used as detection data; that is, detection frames whose confidence is less than or equal to the confidence threshold are not displayed when the screen of the electronic device displays the targets. For example, suppose the confidence threshold is 0.5 and the target objects in the key frame include a person, a dog, and a cat, with the detection network giving confidences of 0.88 for the person's detection frame, 0.75 for the dog's, and 0.45 for the cat's. The confidences of the person's and the dog's detection frames are both greater than the threshold of 0.5, so their position information can be used as detection data, while the cat's cannot. In the end, only the detection frames for the person and the dog are displayed on the screen of the electronic device, and the cat's detection frame is not displayed.
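Applied to the worked example above (person 0.88, dog 0.75, cat 0.45 against a threshold of 0.5), the confidence filter reduces to a simple comparison. The tuple layout below is an assumption matching the earlier detection sketch.

```python
CONF_THRESHOLD = 0.5   # illustrative value from the worked example

def filter_detections(detections):
    """Keep only detection frames whose confidence exceeds the threshold;
    their position information becomes the detection data."""
    return [(label, conf, box) for (label, conf, box) in detections
            if conf > CONF_THRESHOLD]

# Worked example from the text (boxes omitted):
sample = [("person", 0.88, None), ("dog", 0.75, None), ("cat", 0.45, None)]
print([label for (label, _, _) in filter_detections(sample)])  # ['person', 'dog']
```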
Step S603: and judging whether the detection data is successfully acquired.
As introduced above, after the electronic device acquires the key frames, it can input them into the detection network, perform feature extraction on them, and obtain detection data that may include the position information of the target object; the electronic device can then store the detection data in the database. The detection data may fail to be obtained when a key frame is not input into the detection network, when frames are lost while a key frame is being input, or when the parameters of the detection network raise an error warning at run time. Therefore, it is necessary to determine whether the detection data has been acquired successfully: if it has, the detection data and the kernel correlation filtering algorithm are used to track the target object in the frame images, i.e., step S604 is performed; if it has not, tracking of the target object in the target frame image among the frame images is suspended, i.e., step S605 is performed.
Step S604: and tracking the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm.
Step S605: pausing tracking the target object in a target frame image of the plurality of frame images.
In one embodiment, if the detection data is not successfully acquired, the tracking of the target object in the target frame image of the plurality of frame images is suspended, and then the next frame image of the target frame is taken as a new target frame image, i.e., the process proceeds to step S606.
Step S606: and taking the next frame image of the target frame as a new target frame image.
When it is determined that the electronic device has not successfully acquired the detection data, tracking of the target object in the target frame image may be suspended; the next frame image after the target frame is then used as the new target frame image, and the operation of determining whether the detection data has been successfully acquired is executed again, i.e., the process returns to step S603. In other words, until the initial detection data has been stored in the database (that is, until the initialization extractor has finished), the images passed to the kernel correlation filtering algorithm are discarded without processing, and target tracking starts once the kernel correlation filtering algorithm can access detection data in the database. During target tracking, the embodiment of the application still extracts new features from the position information obtained after each key frame is sent into the detection network and updates them into the database, and this latest data can immediately be used for target tracking as well.
As can be understood from fig. 1, in the embodiment of the present application, the operation of inputting the key frames into the detection network and the operation of tracking the target object in the frame images with the detection data and the kernel correlation filtering algorithm may be performed in parallel. In other words, obtaining the detection data with the detection network is structurally independent of the kernel correlation filtering algorithm: in the software framework design, the detection network and the kernel correlation filtering algorithm run in their own threads; the frame images are split by the key frame extractor and delivered to the threads corresponding to the detection network and to the kernel correlation filtering algorithm respectively; and the two threads interact only through the database, i.e., the detection network obtains detection data and stores it in the database, while the kernel correlation filtering algorithm extracts detection data from the database and uses it to track the target object.
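The two-thread design described here can be sketched schematically as follows. The detection thread and the tracking thread interact only through a lock-protected store that plays the role of the database; the queues and stub comments are illustrative placeholders rather than the patent's actual software framework.

```python
import threading
import queue

db_lock = threading.Lock()
database = {}                  # detection data keyed by frame index (the DB)
key_frame_q = queue.Queue()    # filled by the key frame extractor
frame_q = queue.Queue()        # all frame images, fed to the tracker thread

def detection_worker():
    """Detection network thread: the only writer to the database."""
    while True:
        idx, key_frame = key_frame_q.get()
        detections = detect_key_frame(key_frame)   # from the detection sketch
        with db_lock:
            database[idx] = detections             # sole interaction point

def tracking_worker():
    """Kernel correlation filtering thread: the only reader of the database."""
    while True:
        idx, frame = frame_q.get()
        with db_lock:
            snapshot = dict(database)              # latest detection data
        if not snapshot:
            continue   # no detection data yet: discard this target frame
        # ...update the KCF tracker for `frame` using `snapshot`...

threading.Thread(target=detection_worker, daemon=True).start()
threading.Thread(target=tracking_worker, daemon=True).start()
```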
According to the above description, the kernel correlation filtering algorithm does not need to wait for the detection network's results in order to track, so real-time performance is well preserved. The output precision and the number of detection frames of the detection network can be adjusted independently, which in turn controls the accuracy and the number of targets that the kernel correlation filtering algorithm can track. Meanwhile, the detection results of the detection network can be used both for target tracking and for image preprocessing, so an accurate classification result can be obtained even when the results are passed on to a network with poor real-time performance.
According to the target tracking method provided by the embodiment of the application, more accurate position information can be acquired with the detection network, which improves the tracking accuracy of the kernel correlation filtering algorithm's output; the number of output detection frames can be controlled through the confidence threshold, further improving tracking accuracy; and because the detection network and the kernel correlation filtering algorithm run in parallel, the kernel correlation filtering algorithm never has to wait on the detection network's results for its data updates and can keep running even if the detection network stops, which improves the real-time performance of target tracking.
Referring to fig. 7, a target tracking apparatus 700 according to an embodiment of the present application is applied to an electronic device, and specifically, the target tracking apparatus 700 includes: a key frame acquisition module 701, a detection data acquisition module 702 and a target tracking module 703.
The key frame obtaining module 701 is configured to obtain a plurality of frame images of a video to be tracked, and obtain a plurality of key frames from the plurality of frame images.
Further, the key frame acquiring module 701 is further configured to divide the plurality of frame images into N image groups, and acquire a key frame from each of the image groups to obtain a plurality of key frames.
Further, the key frame acquiring module 701 is further configured to acquire an alternative frame from the current image group, determine whether the alternative frame is a clear image, and if the alternative frame is a clear image, use the alternative frame as the key frame of the current image group.
Further, the key frame acquiring module 701 is further configured to, if the candidate frame is a clear image, acquire a next image group after taking the candidate frame as a key frame of the current image group, and acquire a key frame in the current image group by taking the next image group as the current image group.
Further, the key frame obtaining module 701 is further configured to obtain a pixel mean value of the candidate frame, and determine whether the pixel mean value is greater than a preset threshold, where when the pixel mean value is greater than the preset threshold, the candidate frame is a clear image.
Further, the key frame obtaining module 701 is further configured to, if the candidate frame is not a clear image, obtain a next frame image of the candidate frame, use the next frame image as a new candidate frame, and return to perform the operation of determining whether the candidate frame is a clear image.
A detection data obtaining module 702, configured to input the plurality of key frames to a detection network to obtain detection data, where the detection data includes position information of the target object.
Further, the detection data obtaining module 702 is further configured to input the plurality of key frames into a detection network, so as to obtain a target detection frame corresponding to the target object in each key frame, obtain a confidence level of the target detection frame, and determine whether the confidence level of the target detection frame is greater than a confidence threshold, and if the confidence level of the target detection frame is greater than the confidence threshold, use position information corresponding to the target detection frame as the detection data.
A target tracking module 703, configured to track the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm.
Further, the target tracking module 703 is further configured to determine whether the detection data is successfully acquired, and if so, track the target object in the plurality of frame images by using the detection data and the kernel correlation filtering algorithm.
Further, the target tracking module 703 is further configured to, if the detection data is not successfully acquired, suspend tracking the target object in a target frame image of the plurality of frame images, use a next frame image of the target frame as a new target frame image, and return to perform the operation of determining whether the detection data is successfully acquired.
In addition, the operation of inputting the plurality of key frames into a detection network and the operation of tracking the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm are executed in parallel.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling between the modules may be electrical, mechanical or other type of coupling.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
Referring to fig. 8, a block diagram of an electronic device according to an embodiment of the present disclosure is shown. The electronic device 100 may be a smart phone, a tablet computer, an electronic book reader, or another electronic device capable of running applications. The electronic device 100 in the present application may include one or more of the following components: a processor 110, a memory 120, a screen 130, and one or more applications, wherein the one or more applications may be stored in the memory 120 and configured to be executed by the one or more processors 110, the one or more applications being configured to perform the methods described in the foregoing method embodiments.
Processor 110 may include one or more processing cores. The processor 110 connects various parts of the electronic device 100 using various interfaces and lines, and performs the functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and by calling data stored in the memory 120. Optionally, the processor 110 may be implemented in hardware in at least one of the forms of digital signal processing (DSP), field-programmable gate array (FPGA), and programmable logic array (PLA). The processor 110 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like, where the CPU mainly handles the operating system, user interface, application programs, and so on; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication. It is understood that the modem may also not be integrated into the processor 110 and may instead be implemented by a separate communication chip.
The memory 120 may include a Random Access Memory (RAM) or a Read-only memory (Read-only memory). The memory 120 may be used to store instructions, programs, code sets, or instruction sets. The memory 120 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing various method embodiments described below, and the like. The data storage area may also store data created by the electronic device 100 during use (e.g., phone book, audio-video data, chat log data), and the like.
The screen 130 is used to display information input by the user, information provided to the user, and various graphical user interfaces of the electronic device; these graphical user interfaces may be composed of graphics, text, icons, numerals, video, and any combination thereof. In one example, a touch panel may be disposed on the display panel so as to form an integrated whole with it.
Referring to fig. 9, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable medium 1000 has stored therein program code that can be called by a processor to execute the method described in the above-described method embodiments.
The computer-readable storage medium 1000 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium 1000 comprises a non-transitory computer-readable storage medium. The computer-readable storage medium 1000 has storage space for program code 1010 that performs any of the method steps described above. The program code can be read from or written into one or more computer program products. The program code 1010 may, for example, be compressed in a suitable form.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and such modifications and substitutions do not make the corresponding technical solutions depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (13)

1. A target tracking method is applied to an electronic device, and the method comprises the following steps:
acquiring a plurality of frame images of a video to be tracked, and acquiring a plurality of key frames from the plurality of frame images;
inputting the plurality of key frames into a detection network to obtain detection data, wherein the detection data comprises position information of a target object;
and tracking the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm.
2. The method of claim 1, wherein said obtaining a plurality of keyframes from said plurality of frame images comprises:
dividing the plurality of frame images into N image groups;
and acquiring a key frame from each image group to obtain a plurality of key frames.
3. The method of claim 2, wherein said obtaining a key frame from each of said image groups comprises:
acquiring an alternative frame from a current image group;
judging whether the alternative frame is a clear image;
and if the alternative frame is a clear image, taking the alternative frame as a key frame of the current image group.
4. The method according to claim 3, wherein after taking the alternative frame as a key frame of the current image group if the alternative frame is a clear image, the method further comprises:
and acquiring a next image group, taking the next image group as a current image group, and acquiring a key frame in the current image group.
5. The method of claim 3, wherein the determining whether the candidate frame is a sharp image comprises:
acquiring a pixel mean value of the alternative frame, and judging whether the pixel mean value is greater than a preset threshold value;
and when the pixel mean value is larger than a preset threshold value, the alternative frame is a clear image.
6. The method of claim 3, further comprising:
and if the alternative frame is not a clear image, acquiring a next frame image of the alternative frame, taking the next frame image as a new alternative frame, and returning to execute the operation of judging whether the alternative frame is the clear image.
7. The method of claim 1, wherein the tracking the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm comprises:
judging whether the detection data is successfully acquired or not;
and if the detection data are successfully acquired, tracking the target object in the plurality of frame images by using the detection data and a nuclear correlation filtering algorithm.
8. The method of claim 7, further comprising:
if the detection data is not successfully acquired, suspending tracking of the target object in a target frame image among the plurality of frame images;
and taking a next frame image after the target frame image as a new target frame image, and returning to the operation of determining whether the detection data is successfully acquired.
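A compact sketch of this pause-and-retry logic, with detection_ready and track_frame as hypothetical helpers not named in the claims:

```python
def track_when_ready(frames, detection_ready, track_frame):
    """Claims 7 and 8: track only while detection data has been acquired;
    otherwise pause on the current target frame and re-check on the next."""
    for frame in frames:
        if detection_ready():
            track_frame(frame)  # claim 7: detection data acquired, so track
        # else: tracking is suspended for this target frame image and the
        # acquisition check is repeated with the next frame (claim 8)
```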
9. The method of claim 1, wherein the inputting of the plurality of key frames into the detection network and the tracking of the target object in the plurality of frame images by using the detection data and the kernel correlation filtering algorithm are performed in parallel.
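One plausible realisation of this parallelism, assuming a producer thread for the detection network and a queue that hands detection data to the tracking loop:

```python
import queue
import threading

detections = queue.Queue()  # hands detection data from detector to tracker

def detection_worker(key_frames, run_detector):
    """Producer thread: run the detection network over the key frames."""
    for kf in key_frames:
        box = run_detector(kf)
        if box is not None:
            detections.put(box)  # detection data: target position info

def start_parallel_detection(key_frames, run_detector):
    t = threading.Thread(target=detection_worker,
                         args=(key_frames, run_detector), daemon=True)
    t.start()
    return t
```

The tracking loop can then poll detections with get_nowait() and reinitialise the kernel correlation filter tracker whenever a fresh box arrives, so detection never blocks per-frame tracking.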
10. The method of claim 1, wherein inputting the plurality of key frames into a detection network to obtain detection data comprises:
inputting the plurality of key frames into the detection network to obtain a target detection frame corresponding to the target object in each key frame;
acquiring a confidence of the target detection frame, and determining whether the confidence of the target detection frame is greater than a confidence threshold;
and if the confidence of the target detection frame is greater than the confidence threshold, taking position information corresponding to the target detection frame as the detection data.
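A one-function sketch of this confidence filter, assuming the detection network returns (box, confidence) pairs; the 0.5 threshold is an assumed value:

```python
def filter_detection_frames(raw_detections, conf_threshold=0.5):
    """Claim 10: keep only target detection frames whose confidence
    exceeds the confidence threshold; their positions become detection data."""
    return [box for box, conf in raw_detections if conf > conf_threshold]
```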
11. A target tracking apparatus, applied to an electronic device, the apparatus comprising:
a key frame acquisition module, configured to acquire a plurality of frame images of a video to be tracked and acquire a plurality of key frames from the plurality of frame images;
a detection data acquisition module, configured to input the plurality of key frames into a detection network to obtain detection data, wherein the detection data comprises position information of a target object;
and a target tracking module, configured to track the target object in the plurality of frame images by using the detection data and a kernel correlation filtering algorithm.
12. An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the method according to any one of claims 1 to 10.
13. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 10.
CN201910810400.1A 2019-08-29 2019-08-29 Target tracking method and device, electronic equipment and storage medium Active CN110610510B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910810400.1A CN110610510B (en) 2019-08-29 2019-08-29 Target tracking method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910810400.1A CN110610510B (en) 2019-08-29 2019-08-29 Target tracking method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110610510A 2019-12-24
CN110610510B CN110610510B (en) 2022-12-16

Family

ID=68890436

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910810400.1A Active CN110610510B (en) 2019-08-29 2019-08-29 Target tracking method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110610510B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105208377A (en) * 2015-09-14 2015-12-30 北京奇虎科技有限公司 Image processing method and device
US20180137892A1 (en) * 2016-11-16 2018-05-17 Adobe Systems Incorporated Robust tracking of objects in videos
CN108629224A (en) * 2017-03-15 2018-10-09 北京京东尚科信息技术有限公司 Information demonstrating method and device
CN107609497A (en) * 2017-08-31 2018-01-19 武汉世纪金桥安全技术有限公司 The real-time video face identification method and system of view-based access control model tracking technique
CN108053427A (en) * 2017-10-31 2018-05-18 深圳大学 A kind of modified multi-object tracking method, system and device based on KCF and Kalman
CN110008793A (en) * 2018-01-05 2019-07-12 中国移动通信有限公司研究院 Face identification method, device and equipment
CN109035304A (en) * 2018-08-07 2018-12-18 北京清瑞维航技术发展有限公司 Method for tracking target, calculates equipment and device at medium
CN109151501A (en) * 2018-10-09 2019-01-04 北京周同科技有限公司 A kind of video key frame extracting method, device, terminal device and storage medium
CN109872342A (en) * 2019-02-01 2019-06-11 北京清帆科技有限公司 A kind of method for tracking target under special scenes

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
IVÁN GONZÁLEZ-DÍAZ et al.: "Temporal segmentation and keyframe selection methods for user-generated video search-based annotation", Elsevier *
BAO Xiaoan et al.: "Anti-occlusion target tracking algorithm based on KCF and SIFT features", Computer Measurement & Control *
LI Kejing et al.: "A target tracking algorithm based on deep belief networks", Electronic Design Engineering *
CAI Chengtao: "Computer Vision Technology in the Marine Environment", 31 October 2015 *

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860559A (en) * 2019-12-31 2020-10-30 滴图(北京)科技有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN113450387A (en) * 2020-03-26 2021-09-28 京东方科技集团股份有限公司 Target tracking method and device, electronic equipment and computer readable storage medium
CN113497889B (en) * 2020-04-08 2023-05-26 杭州萤石软件有限公司 Object tracking method and device under motion shooting condition and storage medium
CN113497889A (en) * 2020-04-08 2021-10-12 杭州萤石软件有限公司 Object tracking method and device under motion shooting condition and storage medium
CN113312949B (en) * 2020-04-13 2023-11-24 阿里巴巴集团控股有限公司 Video data processing method, video data processing device and electronic equipment
CN113312949A (en) * 2020-04-13 2021-08-27 阿里巴巴集团控股有限公司 Video data processing method, video data processing device and electronic equipment
CN111598924A (en) * 2020-05-08 2020-08-28 腾讯科技(深圳)有限公司 Target tracking method and device, computer equipment and storage medium
CN111598923A (en) * 2020-05-08 2020-08-28 腾讯科技(深圳)有限公司 Target tracking method and device, computer equipment and storage medium
CN111598924B (en) * 2020-05-08 2022-09-30 腾讯科技(深圳)有限公司 Target tracking method and device, computer equipment and storage medium
CN111879313A (en) * 2020-07-31 2020-11-03 中国人民解放军国防科技大学 Multi-target continuous positioning method and system based on unmanned aerial vehicle image recognition
CN111915639B (en) * 2020-08-06 2024-05-31 广州市百果园信息技术有限公司 Target detection tracking method, device, electronic equipment and storage medium
CN111915639A (en) * 2020-08-06 2020-11-10 广州市百果园信息技术有限公司 Target detection tracking method and device, electronic equipment and storage medium
WO2022028592A1 (en) * 2020-08-06 2022-02-10 百果园技术(新加坡)有限公司 Target detection and tracking method and apparatus, electronic device, and storage medium
CN112001950B (en) * 2020-08-25 2024-04-19 天地伟业技术有限公司 Multi-target tracking algorithm based on target detection and feature extraction combined model
CN112001950A (en) * 2020-08-25 2020-11-27 天地伟业技术有限公司 Multi-target tracking algorithm based on target detection and feature extraction combined model
CN112070802A (en) * 2020-09-02 2020-12-11 合肥英睿系统技术有限公司 Target tracking method, device, equipment and computer readable storage medium
CN112070802B (en) * 2020-09-02 2024-01-26 合肥英睿系统技术有限公司 Target tracking method, device, equipment and computer readable storage medium
CN112188212B (en) * 2020-10-12 2024-02-13 杭州电子科技大学 Intelligent transcoding method and device for high-definition monitoring video
CN112188212A (en) * 2020-10-12 2021-01-05 杭州电子科技大学 Method and device for intelligent transcoding of high-definition monitoring video
WO2022193990A1 (en) * 2021-03-17 2022-09-22 腾讯科技(深圳)有限公司 Method and apparatus for detection and tracking, device, storage medium, and computer program product
CN113076882A (en) * 2021-04-03 2021-07-06 国家计算机网络与信息安全管理中心 Specific mark detection method based on deep learning
WO2023221770A1 (en) * 2022-05-16 2023-11-23 京东方科技集团股份有限公司 Dynamic target analysis method and apparatus, device, and storage medium
CN115845350B (en) * 2023-03-02 2023-05-09 成都谷帝科技有限公司 Method and system for automatic ranging of standing long jump
CN115845350A (en) * 2023-03-02 2023-03-28 成都谷帝科技有限公司 Method and system for automatically measuring distance of standing long jump

Also Published As

Publication number Publication date
CN110610510B (en) 2022-12-16

Similar Documents

Publication Publication Date Title
CN110610510B (en) Target tracking method and device, electronic equipment and storage medium
EP3961485A1 (en) Image processing method, apparatus and device, and storage medium
CN111488791A (en) On-device classification of fingertip movement patterns as gestures in real time
US20180114071A1 (en) Method for analysing media content
TW202139183A (en) Method of detecting object based on artificial intelligence, device, equipment and computer-readable storage medium
CN111783620A (en) Expression recognition method, device, equipment and storage medium
CN112669344A (en) Method and device for positioning moving object, electronic equipment and storage medium
US20230267142A1 (en) Multimedia content publishing method and apparatus, electronic device, and storage medium
CN112861575A (en) Pedestrian structuring method, device, equipment and storage medium
Zhang et al. An imbalance compensation framework for background subtraction
US11681409B2 (en) Systems and methods for augmented or mixed reality writing
EP3933674A1 (en) Method, apparatus, device, storage medium and program for processing image
EP3852007A2 (en) Method, apparatus, electronic device, readable storage medium and program for classifying video
CN114117128A (en) Method, system and equipment for video annotation
CN111046746A (en) License plate detection method and device
CN111414910A (en) Small target enhancement detection method and device based on double convolutional neural network
JP2022090633A (en) Method, computer program product and computer system for improving object detection within high-resolution image
Qi et al. A DNN-based object detection system on mobile cloud computing
KR102440198B1 (en) VIDEO SEARCH METHOD AND APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM
CN113810765A (en) Video processing method, apparatus, device and medium
Deng et al. Deep learning in crowd counting: A survey
Delibasoglu et al. Motion detection in moving camera videos using background modeling and FlowNet
CN112580584A (en) Method, device and system for detecting standing behavior and storage medium
CN113361519B (en) Target processing method, training method of target processing model and device thereof
CN111914850A (en) Picture feature extraction method, device, server and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant