CN112926356A - Target tracking method and device - Google Patents

Target tracking method and device

Info

Publication number
CN112926356A
Authority
CN
China
Prior art keywords
frame image
target
frame
image
target detection
Prior art date
Legal status
Granted
Application number
CN201911236052.8A
Other languages
Chinese (zh)
Other versions
CN112926356B (en)
Inventor
朱兆琪
董玉新
安山
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN201911236052.8A
Publication of CN112926356A
Application granted
Publication of CN112926356B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464 Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target tracking method and device, relating to the field of computer technology. One embodiment of the method comprises: performing target detection on the k-th frame image and determining a target detection frame in the k-th frame image, where k is an integer greater than or equal to 1; determining a plurality of target key points in the k-th frame image and a plurality of target key points in the (k+1)-th frame image, respectively, based on the target detection frame of the k-th frame image; determining the average displacement between the target key points of the k-th frame image and those of the (k+1)-th frame image; and, when the average displacement is less than or equal to a threshold, correcting the target detection frame of the k-th frame image by the average displacement and using the corrected detection frame as the target detection frame of the (k+2)-th frame image, thereby realizing target tracking. The method and device address the problems that running target detection on every frame image is time-consuming and cannot meet real-time requirements, thereby improving detection efficiency and suiting application scenarios with high real-time demands.

Description

Target tracking method and device
Technical Field
The invention relates to the technical field of computers, in particular to a target tracking method and device.
Background
Target tracking is an important part of automatic identification systems, and the technology is widely applied. It generally means that, for any given image, some strategy is used to search the image and determine whether it contains a target (such as a human face); if so, the position, size, and other attributes of the target are returned.
In the process of implementing the invention, the inventors found at least the following problems in the prior art. Existing target tracking algorithms fall mainly into traditional algorithms and deep-learning algorithms. Traditional algorithms such as KCF (Kernelized Correlation Filter) and other correlation-filtering methods take a given target to be tracked and find the maximum response position in the image through a filter, thereby realizing target tracking. Deep-learning algorithms regress the position of the target in the image by extracting target features. However, both approaches are computationally heavy and demand high performance; given the limited performance of mobile terminals, they are difficult to deploy and run in real time on mobile devices.
Disclosure of Invention
In view of this, embodiments of the present invention provide a target tracking method and apparatus that address the problems that running target detection on every frame image is time-consuming and cannot meet real-time requirements, thereby improving detection efficiency and suiting application scenarios with high real-time demands.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a target tracking method including:
carrying out target detection on the kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the (k + 1) th frame image based on the target detection frame in the kth frame image;
determining the average displacement of the target key point in the kth frame image and the target key point in the (k + 1) th frame image;
and when the average displacement is smaller than or equal to a threshold value, correcting a target detection frame in the k frame image through the average displacement, and using the corrected target detection frame as a target detection frame of the k +2 frame image to realize target tracking.
Optionally, the method further comprises: and when the average displacement is larger than a threshold value, carrying out target detection on the (k + 2) th frame image, and determining a target detection frame in the (k + 2) th frame image so as to realize target tracking.
Optionally, determining an average displacement of the target keypoint in the k frame image and the target keypoint in the k +1 frame image comprises:
respectively determining the average positions of a plurality of target key points in the k frame image and the average positions of a plurality of target key points in the k +1 frame image;
and calculating the displacement difference between the average position of the plurality of target key points in the k +1 frame image and the average position of the plurality of target key points in the k frame image, and taking the displacement difference as the average displacement of the target key points in the k frame image and the target key points in the k +1 frame image.
Optionally, the correcting the target detection frame in the k frame image by the average displacement includes:
and translating the target detection frame in the k frame image according to the average displacement.
To achieve the above object, according to another aspect of embodiments of the present invention, there is provided an object tracking apparatus including:
the detection frame determining module is used for carrying out target detection on the kth frame image and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
a key point determining module, configured to determine, based on the target detection frame in the kth frame image, a plurality of target key points in the kth frame image and a plurality of target key points in the (k + 1) th frame image, respectively;
a displacement determining module, configured to determine an average displacement between a target key point in the k frame image and a target key point in the (k + 1) th frame image;
and the tracking module is used for correcting the target detection frame in the k frame image through the average displacement when the average displacement is less than or equal to a threshold value, and using the corrected target detection frame as the target detection frame of the k +2 frame image so as to realize target tracking.
Optionally, the tracking module is further configured to: and when the average displacement is larger than a threshold value, carrying out target detection on the (k + 2) th frame image, and determining a target detection frame in the (k + 2) th frame image so as to realize target tracking.
Optionally, the displacement determining module is further configured to:
respectively determining the average positions of a plurality of target key points in the k frame image and the average positions of a plurality of target key points in the k +1 frame image;
and calculating the displacement difference between the average position of the plurality of target key points in the k +1 frame image and the average position of the plurality of target key points in the k frame image, and taking the displacement difference as the average displacement of the target key points in the k frame image and the target key points in the k +1 frame image.
Optionally, the tracking module is further configured to: and translating the target detection frame in the k frame image according to the average displacement.
To achieve the above object, according to still another aspect of an embodiment of the present invention, there is provided an electronic apparatus including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the object tracking method of an embodiment of the present invention.
To achieve the above object, according to still another aspect of an embodiment of the present invention, there is provided a computer-readable medium on which a computer program is stored, the program implementing the object tracking method of an embodiment of the present invention when executed by a processor.
One embodiment of the above invention has the following advantages or beneficial effects. Because the target detection frame of the k-th frame image is used to determine the target key points in both the k-th frame image and the (k+1)-th frame image, that is, the detection frame of the k-th frame serves as the detection frame of the (k+1)-th frame, no target detection is performed on the (k+1)-th frame. This saves a detection pass, speeds up the whole target tracking process, saves time, and improves efficiency. Furthermore, when the average displacement between the target key points of the k-th frame image and those of the (k+1)-th frame image is less than or equal to a threshold, the detection frame of the k-th frame image is corrected by the average displacement and the corrected frame is used as the detection frame of the (k+2)-th frame image; in this case the (k+2)-th frame also skips target detection, saving another detection pass and further improving speed and efficiency. The target tracking method of the embodiment thus avoids running target detection on every frame image, overcoming the prior-art problems that per-frame detection is time-consuming and cannot meet real-time requirements, improving detection efficiency, and suiting application scenarios with high real-time requirements.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a main flow of a target tracking method of an embodiment of the present invention;
FIG. 2 is a schematic diagram of the major modules of a target tracking device of an embodiment of the present invention;
FIG. 3 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 4 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of a main flow of a target tracking method according to an embodiment of the present invention, as shown in fig. 1, the method includes:
step S101: carrying out target detection on the kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1.
In this embodiment, the target may be a human face in the image, or may also be a vehicle or other objects in the image, and the present invention is not limited herein.
In this step, target detection obtains the position at which the target appears in the image and returns the target detection frame. As an example, the position of the target detection frame can be obtained with the SSD detection algorithm as (x, y, w, h)_box, where (x, y)_box are the coordinates of the top-left corner of the target detection frame and w_box, h_box are its width and height, respectively. The SSD (Single Shot MultiBox Detector) algorithm is a regression-based deep convolutional neural network object detection algorithm.
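As an illustrative sketch (not part of the patent text), a detection frame in the (x, y, w, h) form described above can be used to crop the target region from a frame image; the detector itself is omitted here and a fixed example frame is assumed:

```python
import numpy as np

def crop_with_box(frame, box):
    """Crop the region described by a detection frame.

    box = (x, y, w, h): (x, y) is the top-left corner of the
    detection frame, w and h are its width and height, as in the text.
    """
    x, y, w, h = box
    return frame[y:y + h, x:x + w]

# A stand-in 480x640 grayscale frame and a hypothetical SSD-style box.
frame = np.zeros((480, 640), dtype=np.uint8)
box = (100, 50, 128, 128)  # x, y, w, h
patch = crop_with_box(frame, box)
print(patch.shape)  # (128, 128)
```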
Step S102: and respectively determining a plurality of target key points in the k frame image and a plurality of target key points in the (k + 1) th frame image based on the target detection frame in the k frame image.
The purpose of this step is to obtain point coordinates of a specific location of an object in an image, for example to obtain point coordinates of a specific location of a human face in an image. In the present embodiment, a plurality of target key points in the k-th frame image and a plurality of target key points in the k + 1-th frame image correspond to each other.
Specifically, the target key point can be obtained through the following process:
and obtaining a target position in the image through a target detection frame obtained by a target detection algorithm, extracting the image of a target part in the image by using the target detection frame to obtain an image of the target part, and finally inputting the image of the target part into a target key point detection model to obtain a target key point. The target key point detection model is a deep learning model and is obtained through training of training data, namely, a mapping relation from an image to a point is obtained through training of a target image and a corresponding target key point, and is set as f. When the model is used for detecting the target key points, the positions of the target key points can be obtained only by inputting image data into the model. In a specific embodiment, the number of the target key points may be set to 106.
Let I_k denote the image input at the k-th frame and f the image-to-key-point mapping model; target key point detection can then be expressed by formula (1):

f(I_k) = {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)}_k    (1)

where (x_i, y_i) are the key points output by the target key point detection model.
In this step, the target detection frame of the k-th frame image is used as the detection frame of the (k+1)-th frame image; that is, no target detection is performed on the (k+1)-th frame, which saves a detection pass, speeds up the whole target tracking process, saves time, and improves efficiency. Although the position of the detection frame from the k-th frame may deviate from the actual target position in the (k+1)-th frame, causing some deviation when it is used to extract the target from the (k+1)-th frame image, the target key point model has a certain generalization ability: even if the target image fed into it is off by some pixels, the model can still output the key point positions correctly.
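A minimal sketch of this step (not part of the patent text), with the learned mapping f replaced by a stub; `stub_model` is a hypothetical stand-in, and a real key point model (e.g. one trained on 106-point annotations) would take its place:

```python
import numpy as np

def detect_keypoints(frame, box, keypoint_model):
    """Apply formula (1): crop the target with the detection frame,
    run the key point model f on the crop, and map the key points
    back to full-frame coordinates."""
    x, y, w, h = box
    crop = frame[y:y + h, x:x + w]
    points = keypoint_model(crop)        # shape (n, 2), crop coordinates
    return points + np.array([x, y])     # full-frame coordinates

# Stub model: returns two fixed "key points" inside the crop.
stub_model = lambda crop: np.array([[10.0, 20.0], [30.0, 40.0]])
frame = np.zeros((480, 640), dtype=np.uint8)
pts = detect_keypoints(frame, (100, 50, 64, 64), stub_model)
# pts is in full-frame coordinates: [[110, 70], [130, 90]]
```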
Step S103: and determining the average displacement of the target key point in the k frame image and the target key point in the k +1 frame image.
Specifically, the method comprises the following steps:
respectively determining the average positions of a plurality of target key points in the k frame image and the average positions of a plurality of target key points in the k +1 frame image;
and calculating the displacement difference between the average position of the plurality of target key points in the k +1 frame image and the average position of the plurality of target key points in the k frame image, and taking the displacement difference as the average displacement of the target key points in the k frame image and the target key points in the k +1 frame image.
The average position of the n target key points in the k-th frame image is calculated according to formula (2):

p̄_k = (1/n) · Σ_{i=1}^{n} p_i^k    (2)

where p̄_k denotes the average position of the target key points in the k-th frame image and p_i^k = (x_i, y_i)_k denotes the position of the i-th target key point in the k-th frame image.

The average position of the target key points in the (k+1)-th frame image is calculated according to formula (3):

p̄_{k+1} = (1/n) · Σ_{i=1}^{n} p_i^{k+1}    (3)

where p̄_{k+1} denotes the average position of the target key points in the (k+1)-th frame image and p_i^{k+1} denotes the position of the i-th target key point in the (k+1)-th frame image.

The displacement difference between the two average positions is calculated according to formula (4):

Δp = p̄_{k+1} − p̄_k    (4)

where Δp denotes the displacement difference between the average position of the target key points in the (k+1)-th frame image and the average position of the target key points in the k-th frame image.
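The computations in formulas (2) to (4) reduce to a few array operations. A minimal sketch (not part of the patent text), assuming the key points of each frame are given as an (n, 2) numpy array:

```python
import numpy as np

def average_displacement(points_k, points_k1):
    """Formulas (2)-(4): mean key point position in each frame,
    then the difference of the two means."""
    mean_k = points_k.mean(axis=0)    # formula (2)
    mean_k1 = points_k1.mean(axis=0)  # formula (3)
    return mean_k1 - mean_k           # formula (4)

pts_k = np.array([[0.0, 0.0], [2.0, 2.0]])
pts_k1 = np.array([[1.0, 3.0], [3.0, 5.0]])
print(average_displacement(pts_k, pts_k1))  # [1. 3.]
```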
Step S104: comparing the average displacement with the threshold. In this step, the threshold may be set flexibly according to the application scenario; the invention is not limited here.
Step S105: and when the average displacement is smaller than or equal to a threshold value, correcting a target detection frame in the k frame image through the average displacement, and using the corrected target detection frame as a target detection frame of the k +2 frame image to realize target tracking.
In this embodiment, if the average displacement is less than or equal to the threshold, the target has moved only slightly between the k-th and (k+1)-th frame images, so the target detection frame of the k-th frame image is translated, as shown in formula (5):

box_{k+2} = box_k + Δp    (5)

where box_k denotes the position of the target detection frame in the k-th frame image and box_{k+2} denotes the position of the corrected target detection frame.
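Correcting the detection frame per formula (5) is a pure translation of the (x, y) corner, with width and height unchanged. A short sketch (the function name `translate_box` is illustrative, not from the source):

```python
def translate_box(box, displacement):
    """Formula (5): shift the (x, y, w, h) detection frame by the
    average key point displacement; width and height are unchanged."""
    x, y, w, h = box
    dx, dy = displacement
    return (x + dx, y + dy, w, h)

print(translate_box((100, 50, 64, 64), (3, -2)))  # (103, 48, 64, 64)
```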
Step S106: and when the average displacement is larger than a threshold value, carrying out target detection on the (k + 2) th frame image, and determining a target detection frame in the (k + 2) th frame image so as to realize target tracking.
In this embodiment, if the average displacement is greater than the threshold, the target has moved too much between the k-th and (k+1)-th frame images, so target detection must be performed again on the (k+2)-th frame image to obtain its target detection frame.
According to the target tracking method described above, the target key points in the k-th frame image and in the (k+1)-th frame image are both determined from the target detection frame of the k-th frame image; that is, the detection frame of the k-th frame serves as the detection frame of the (k+1)-th frame, so no target detection is performed on the (k+1)-th frame. This saves a detection pass, speeds up the whole tracking process, saves time, and improves efficiency. When the average displacement between the target key points of the k-th and (k+1)-th frame images is less than or equal to the threshold, the detection frame of the k-th frame is corrected by the average displacement and used as the detection frame of the (k+2)-th frame image, so the (k+2)-th frame also skips target detection, saving another detection pass and further improving speed and efficiency. The method thus avoids running target detection on every frame image, overcoming the prior-art problems that per-frame detection is time-consuming and cannot meet real-time requirements, improving detection efficiency, and suiting application scenarios with high real-time requirements.
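The per-frame control flow described above can be sketched as follows (not part of the patent text). `detect` (an SSD-style detector) and `keypoints` (the key point model applied through a detection frame) are assumed stand-ins, and the scalar comparison uses the Euclidean norm of the average displacement, which is one reasonable reading of the text:

```python
import numpy as np

def next_box(frame_k, frame_k1, frame_k2, box_k, detect, keypoints, threshold):
    """Decide the detection frame for frame k+2: the frame-k box is
    reused for key points in frames k and k+1; if the average key point
    displacement is small, translate box_k (formula (5)); otherwise
    run detection again on frame k+2."""
    pts_k = keypoints(frame_k, box_k)
    pts_k1 = keypoints(frame_k1, box_k)   # no detection on frame k+1
    disp = pts_k1.mean(axis=0) - pts_k.mean(axis=0)
    if np.linalg.norm(disp) <= threshold:
        x, y, w, h = box_k
        return (float(x + disp[0]), float(y + disp[1]), w, h)  # formula (5)
    return detect(frame_k2)               # motion too large: re-detect

# Toy stand-ins: "frames" are just integers, key points drift with the index.
detect = lambda frame: (0.0, 0.0, 10.0, 10.0)
keypoints = lambda frame, box: np.array([[float(frame), float(frame)]])
box0 = detect(0)
box2 = next_box(0, 1, 2, box0, detect, keypoints, threshold=2.0)
print(box2)  # (1.0, 1.0, 10.0, 10.0)
```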
In order to make the target tracking method of the embodiment of the invention clearer, the processing procedure is described again taking face tracking in images as an example:
(1) The processing procedure for the k-th frame image is as follows: a face frame is obtained through the SSD (Single Shot MultiBox Detector) detection algorithm; a face image is extracted using the face frame; the face image is input into a face key point detection model to obtain a plurality of face key points; and the average position of the face key points is calculated.
(2) The processing procedure for the (k+1)-th frame image is as follows: the face frame from the k-th frame image is used to extract the face image from the (k+1)-th frame image; the face image is input into the face key point detection model to obtain face key points; and the average position of the face key points is calculated. Note that the face frame used for the (k+1)-th frame is the face frame of the k-th frame image, not one detected in the (k+1)-th frame: the (k+1)-th frame is not passed through the SSD algorithm, which saves the SSD detection step for this frame. Although the position of the k-th frame's face frame may deviate slightly from the actual face position in the (k+1)-th frame, so the face cropped from the (k+1)-th frame may be somewhat off, the face key point model has a certain generalization ability, and even with pixel-level deviations in the input face image it can still output the key point positions correctly.
(3) The average displacement of the face key points between the k-th and (k+1)-th frame images is calculated. If the average displacement is less than or equal to the threshold, the face position has not moved much between the two frames; the average displacement is then used to correct the position of the k-th frame's face frame, and the corrected frame serves as the face frame of the (k+2)-th frame image. If the average displacement is greater than the threshold, the face has moved too much between the k-th and (k+1)-th frames; directly translating the face frame by the average displacement and using the result to crop the (k+2)-th frame image could then cause problems. Therefore, when the average displacement is greater than the threshold, face detection is run again on the (k+2)-th frame image before cropping and key point detection. In this way, even if the face moves quickly, the face frame fully corresponds to the face position in the (k+2)-th frame image, the cropped face image is correct, and tracking remains reliable.
It is worth noting that the corrected face frame is based on the face position in the (k+1)-th frame image; relative to the (k+2)-th frame image it may not be centered on the face. However, thanks to the generalization ability of the face key point model, even if the face position in the input face image deviates by some pixels, the model can still output the key point positions correctly. By this means the face tracking method of the embodiment reduces the number of frames on which face detection must be run.
(4) The processing flow for the (k+2)-th frame image is as follows:
a. If the average displacement of the face key points between the (k+1)-th and k-th frame images is less than or equal to the threshold, the face has moved little between those frames; the k-th frame's face frame is translated by the average displacement to obtain the face frame for the (k+2)-th frame image, the translated face frame is used to crop the (k+2)-th frame image, and the cropped face image is input into the face key point model;
b. If the average displacement of the face key points between the (k+1)-th and k-th frames is greater than the threshold, a directly translated face frame may be inaccurate; using it to crop the (k+2)-th frame image could introduce a large deviation, and the face key point model might not produce a correct result. Therefore, when the average displacement is greater than the threshold, SSD detection is performed directly on the (k+2)-th frame image, and the resulting face frame gives the position of the face in the (k+2)-th frame image.
According to the face tracking example above, the face key points in the k-th frame image and in the (k+1)-th frame image are both determined from the face frame of the k-th frame image; that is, the face frame of the k-th frame serves as the face frame of the (k+1)-th frame, so no face detection is performed on the (k+1)-th frame, saving a detection pass, speeding up the whole face tracking process, saving time, and improving efficiency. When the average displacement between the face key points of the k-th and (k+1)-th frame images is less than or equal to the threshold, the face frame of the k-th frame is corrected by the average displacement and used as the face frame of the (k+2)-th frame image, so the (k+2)-th frame also skips face detection, further improving speed and efficiency. The face tracking method of this embodiment thus avoids running face detection on every frame image, overcoming the prior-art problems that per-frame detection is time-consuming and cannot meet real-time requirements, improving detection efficiency, and suiting application scenarios with high real-time requirements.
Fig. 2 is a schematic diagram of the main modules of a target tracking apparatus 200 according to an embodiment of the present invention. As shown in Fig. 2, the apparatus 200 includes:
a detection frame determining module 201, configured to perform target detection on a k-th frame image, and determine a target detection frame in the k-th frame image; wherein k is an integer greater than or equal to 1;
a keypoint determination module 202, configured to determine, based on the target detection frame in the kth frame image, a plurality of target keypoints in the kth frame image and a plurality of target keypoints in the (k + 1) th frame image, respectively;
a displacement determining module 203, configured to determine the average displacement between the target key points in the kth frame image and the target key points in the (k+1)th frame image;
and a tracking module 204, configured to, when the average displacement is smaller than or equal to a threshold, correct the target detection frame in the kth frame image by the average displacement, and use the corrected target detection frame as the target detection frame of the (k+2)th frame image, so as to implement target tracking.
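A minimal sketch of how these modules could fit together. The `detect_target` and `locate_keypoints` stubs below are hypothetical stand-ins for the detector (e.g. SSD) and the key point model of the embodiment; the (x, y, w, h) box format and the magnitude-based threshold test are assumptions, not fixed by the text:

```python
import numpy as np

def detect_target(frame):
    """Hypothetical detector stub (stands in for SSD): returns (x, y, w, h)."""
    return np.array([10.0, 10.0, 40.0, 40.0])

def locate_keypoints(frame, box):
    """Hypothetical key point model stub: returns an (N, 2) array of points."""
    x, y, w, h = box
    return np.array([[x + w / 2.0, y + h / 2.0]])

def track(frames, threshold=5.0):
    """Return one detection box per frame, detecting only when necessary."""
    boxes = {0: detect_target(frames[0])}        # module 201: detect in frame k
    k = 0
    while k + 2 < len(frames):
        box = boxes[k]
        boxes[k + 1] = box                       # frame k+1 reuses frame k's box
        mean_k = locate_keypoints(frames[k], box).mean(axis=0)       # module 202
        mean_k1 = locate_keypoints(frames[k + 1], box).mean(axis=0)
        dx, dy = mean_k1 - mean_k                # module 203: average displacement
        if np.hypot(dx, dy) <= threshold:        # module 204: correct and reuse
            boxes[k + 2] = box + np.array([dx, dy, 0.0, 0.0])
        else:                                    # fall back to full detection
            boxes[k + 2] = detect_target(frames[k + 2])
        k += 2                                   # frame k+2 becomes the new frame k
    return boxes
```

With the constant stubs above, the displacement is always zero, so every frame after the first is boxed without another detection pass; only when the keypoints drift beyond the threshold does the else branch re-detect.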
In an alternative embodiment, the tracking module 204 is further configured to: and when the average displacement is larger than a threshold value, carrying out target detection on the (k + 2) th frame image, and determining a target detection frame in the (k + 2) th frame image so as to realize target tracking.
In an alternative embodiment, the displacement determining module 203 is further configured to:
respectively determining the average position of the plurality of target key points in the kth frame image and the average position of the plurality of target key points in the (k+1)th frame image;
and calculating the displacement difference between the average position of the plurality of target key points in the (k+1)th frame image and the average position of the plurality of target key points in the kth frame image, and taking the displacement difference as the average displacement of the target key points in the kth frame image and the target key points in the (k+1)th frame image.
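The computation above reduces to a difference of centroids. A minimal numerical sketch, with made-up keypoint coordinates standing in for the key point model's output:

```python
import numpy as np

# Hypothetical key point coordinates for frames k and k+1, shaped (N, 2);
# real values would come from the key point model.
kps_k = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
kps_k1 = np.array([[12.0, 21.0], [32.0, 41.0], [52.0, 61.0]])

mean_k = kps_k.mean(axis=0)          # average position in frame k: (30, 40)
mean_k1 = kps_k1.mean(axis=0)        # average position in frame k+1: (32, 41)
avg_displacement = mean_k1 - mean_k  # displacement difference: (2, 1)
```

Averaging before differencing makes the estimate robust to noise in any single key point, which is presumably why the embodiment uses the centroid rather than individual point displacements.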
In an alternative embodiment, the tracking module 204 is further configured to: translate the target detection frame in the kth frame image according to the average displacement.
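Correction by translation only shifts the box's position; its size is unchanged. A sketch with made-up numbers, assuming an (x, y, w, h) box representation (the text does not fix the format):

```python
import numpy as np

box_k = np.array([100.0, 80.0, 60.0, 60.0])  # (x, y, w, h), made-up values
avg_displacement = np.array([2.0, 1.0])      # (dx, dy) determined by module 203

# Translation shifts the box position only; width and height stay the same.
box_k2 = box_k + np.array([avg_displacement[0], avg_displacement[1], 0.0, 0.0])
```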
According to the above target tracking apparatus, the plurality of target key points in the kth frame image and in the (k+1)th frame image are both determined from the target detection frame of the kth frame image; that is, the target detection frame of the kth frame image is reused as the target detection frame of the (k+1)th frame image, so no target detection is performed on the (k+1)th frame image. This skips one detection pass, speeds up the whole target tracking process, saves time, and improves efficiency. When the average displacement between the target key points of the kth frame image and those of the (k+1)th frame image is smaller than or equal to the threshold, the target detection frame of the kth frame image is corrected by the average displacement and the corrected frame is used as the target detection frame of the (k+2)th frame image, so that target tracking is achieved without performing target detection on the (k+2)th frame image either; the detection pass is again skipped and the whole process is accelerated. The target tracking apparatus of the embodiment of the present invention thus avoids performing target detection on every frame, overcoming the prior-art problems that per-frame target detection is time-consuming and cannot meet real-time requirements, thereby improving detection efficiency and making it suitable for application scenarios with high real-time requirements.
The apparatus can execute the method provided by the embodiments of the present invention, and has the functional modules and beneficial effects corresponding to that method. For technical details not described in detail in this embodiment, reference may be made to the method provided by the embodiments of the present invention.
Fig. 3 illustrates an exemplary system architecture 300 to which a target tracking method or a target tracking apparatus of an embodiment of the present invention may be applied.
As shown in fig. 3, the system architecture 300 may include terminal devices 301, 302, 303, a network 304, and a server 305. The network 304 serves as a medium for providing communication links between the terminal devices 301, 302, 303 and the server 305. Network 304 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 301, 302, 303 to interact with the server 305 via the network 304, so as to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 301, 302, 303, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software.
The terminal devices 301, 302, 303 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 305 may be a server providing various services, such as a background management server providing support for shopping websites browsed by the user using the terminal devices 301, 302, 303. The background management server may analyze and perform other processing on the received data such as the product information query request, and feed back a processing result (e.g., target push information and product information) to the terminal device.
It should be noted that the target tracking method provided by the embodiment of the present invention is generally executed by the server 305, and accordingly, the target tracking apparatus is generally disposed in the server 305.
It should be understood that the number of terminal devices, networks, and servers in fig. 3 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 4, shown is a block diagram of a computer system 400 suitable for implementing a terminal device of an embodiment of the present invention. The terminal device shown in fig. 4 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in fig. 4, the computer system 400 includes a Central Processing Unit (CPU)401 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the system 400 are also stored. The CPU 401, ROM 402, and RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a display device such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card or a modem. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. A removable medium 411 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 410 as necessary, so that a computer program read therefrom is installed into the storage section 408 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. The computer program performs the above-described functions defined in the system of the present invention when executed by a Central Processing Unit (CPU) 401.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor including a sending module, an obtaining module, a determining module, and a first processing module. The names of these modules do not, in some cases, limit the modules themselves; for example, the sending module may also be described as a "module that sends a picture acquisition request to a connected server".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to:
carrying out target detection on the kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the (k + 1) th frame image based on the target detection frame in the kth frame image;
determining the average displacement of the target key point in the kth frame image and the target key point in the (k + 1) th frame image;
and when the average displacement is smaller than or equal to a threshold, correcting the target detection frame in the kth frame image by the average displacement, and using the corrected target detection frame as the target detection frame of the (k+2)th frame image, so as to implement target tracking.
According to the technical solution of the embodiment of the present invention, the plurality of target key points in the kth frame image and in the (k+1)th frame image are both determined from the target detection frame of the kth frame image; that is, the target detection frame of the kth frame image is reused as the target detection frame of the (k+1)th frame image, so no target detection is performed on the (k+1)th frame image. This skips one detection pass, speeds up the whole target tracking process, saves time, and improves efficiency. When the average displacement between the target key points of the kth frame image and those of the (k+1)th frame image is smaller than or equal to the threshold, the target detection frame of the kth frame image is corrected by the average displacement and the corrected frame is used as the target detection frame of the (k+2)th frame image, so that target tracking is achieved without performing target detection on the (k+2)th frame image either; the detection pass is again skipped and the whole process is accelerated. The target tracking method of the embodiment of the present invention thus avoids performing target detection on every frame, overcoming the prior-art problems that per-frame target detection is time-consuming and cannot meet real-time requirements, thereby improving detection efficiency and making the method suitable for application scenarios with high real-time requirements.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (7)

1. A target tracking method, comprising:
carrying out target detection on the kth frame image, and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
respectively determining a plurality of target key points in the kth frame image and a plurality of target key points in the (k + 1) th frame image based on the target detection frame in the kth frame image;
determining the average displacement of the target key point in the kth frame image and the target key point in the (k + 1) th frame image;
and when the average displacement is smaller than or equal to a threshold, correcting the target detection frame in the kth frame image by the average displacement, and using the corrected target detection frame as the target detection frame of the (k+2)th frame image, so as to implement target tracking.
2. The method of claim 1, further comprising:
and when the average displacement is larger than a threshold value, carrying out target detection on the (k + 2) th frame image, and determining a target detection frame in the (k + 2) th frame image so as to realize target tracking.
3. The method of claim 1, wherein determining the average displacement of the target key points in the kth frame image and the target key points in the (k+1)th frame image comprises:
respectively determining the average position of the plurality of target key points in the kth frame image and the average position of the plurality of target key points in the (k+1)th frame image;
and calculating the displacement difference between the average position of the plurality of target key points in the (k+1)th frame image and the average position of the plurality of target key points in the kth frame image, and taking the displacement difference as the average displacement of the target key points in the kth frame image and the target key points in the (k+1)th frame image.
4. The method of claim 1, wherein correcting the target detection frame in the kth frame image by the average displacement comprises:
translating the target detection frame in the kth frame image according to the average displacement.
5. An object tracking device, comprising:
the detection frame determining module is used for carrying out target detection on the kth frame image and determining a target detection frame in the kth frame image; wherein k is an integer greater than or equal to 1;
a key point determining module, configured to determine, based on the target detection frame in the kth frame image, a plurality of target key points in the kth frame image and a plurality of target key points in the (k + 1) th frame image, respectively;
a displacement determining module, configured to determine the average displacement between the target key points in the kth frame image and the target key points in the (k+1)th frame image;
and a tracking module, configured to, when the average displacement is smaller than or equal to a threshold, correct the target detection frame in the kth frame image by the average displacement, and use the corrected target detection frame as the target detection frame of the (k+2)th frame image, so as to implement target tracking.
6. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-4.
7. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-4.
CN201911236052.8A 2019-12-05 2019-12-05 Target tracking method and device Active CN112926356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911236052.8A CN112926356B (en) 2019-12-05 2019-12-05 Target tracking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911236052.8A CN112926356B (en) 2019-12-05 2019-12-05 Target tracking method and device

Publications (2)

Publication Number Publication Date
CN112926356A true CN112926356A (en) 2021-06-08
CN112926356B CN112926356B (en) 2024-06-18

Family

ID=76161900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911236052.8A Active CN112926356B (en) 2019-12-05 2019-12-05 Target tracking method and device

Country Status (1)

Country Link
CN (1) CN112926356B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007088759A1 (en) * 2006-02-01 2007-08-09 National University Corporation The University Of Electro-Communications Displacement detection method, displacement detection device, displacement detection program, characteristic point matching method, and characteristic point matching program
CN103077532A (en) * 2012-12-24 2013-05-01 天津市亚安科技股份有限公司 Real-time video object quick tracking method
CN103455797A (en) * 2013-09-07 2013-12-18 西安电子科技大学 Detection and tracking method of moving small target in aerial shot video
CN106846362A (en) * 2016-12-26 2017-06-13 歌尔科技有限公司 A kind of target detection tracking method and device
KR101837407B1 (en) * 2017-11-03 2018-03-12 국방과학연구소 Apparatus and method for image-based target tracking
CN109003245A (en) * 2018-08-21 2018-12-14 厦门美图之家科技有限公司 Coordinate processing method, device and electronic equipment
CN109214245A (en) * 2017-07-03 2019-01-15 株式会社理光 A kind of method for tracking target, device, equipment and computer readable storage medium
CN110349190A (en) * 2019-06-10 2019-10-18 广州视源电子科技股份有限公司 Target tracking method, device and equipment for adaptive learning and readable storage medium
CN110378264A (en) * 2019-07-08 2019-10-25 Oppo广东移动通信有限公司 Method for tracking target and device
CN110400332A (en) * 2018-04-25 2019-11-01 杭州海康威视数字技术股份有限公司 A kind of target detection tracking method, device and computer equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
尹彦; 耿兆丰: "Moving Object Detection and Tracking Based on a Background Model", Microcomputer Information, no. 16, 5 June 2008 (2008-06-05) *
栾庆磊; 陈正伟; 何勇: "A Detection Method for Moving Objects against a Moving Background", Computer & Digital Engineering, no. 10, 20 October 2008 (2008-10-20) *
谢永亮; 洪留荣; 葛方振; 郑颖; 孙雯; 贾平平: "Research on Arbitrary Object Tracking Methods against a Moving Background", Journal of Suzhou University of Science and Technology (Natural Science Edition), no. 03, 15 September 2016 (2016-09-15) *

Also Published As

Publication number Publication date
CN112926356B (en) 2024-06-18

Similar Documents

Publication Publication Date Title
US10796438B2 (en) Method and apparatus for tracking target profile in video
CN110225366B (en) Video data processing and advertisement space determining method, device, medium and electronic equipment
CN109255337B (en) Face key point detection method and device
US20210200971A1 (en) Image processing method and apparatus
CN110069961B (en) Object detection method and device
CN111192312B (en) Depth image acquisition method, device, equipment and medium based on deep learning
CN111815738B (en) Method and device for constructing map
CN108182457B (en) Method and apparatus for generating information
CN113158773B (en) Training method and training device for living body detection model
CN110349158A (en) A kind of method and apparatus handling point cloud data
CN110941978A (en) Face clustering method and device for unidentified personnel and storage medium
KR20240140057A (en) Facial recognition method and device
CN113033377A (en) Character position correction method, character position correction device, electronic equipment and storage medium
CN110288625B (en) Method and apparatus for processing image
CN114119990A (en) Method, apparatus and computer program product for image feature point matching
CN112651399A (en) Method for detecting same-line characters in oblique image and related equipment thereof
CN113362090A (en) User behavior data processing method and device
CN114724144B (en) Text recognition method, training device, training equipment and training medium for model
CN112926356B (en) Target tracking method and device
CN113808134B (en) Oil tank layout information generation method, oil tank layout information generation device, electronic apparatus, and medium
CN110634155A (en) Target detection method and device based on deep learning
CN112487943B (en) Key frame de-duplication method and device and electronic equipment
CN115376026A (en) Key area positioning method, device, equipment and storage medium
CN114581711A (en) Target object detection method, apparatus, device, storage medium, and program product
CN112000218B (en) Object display method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant