WO2022194157A1

WO2022194157A1 - Target tracking method and apparatus, device and medium

Info

Publication number: WO2022194157A1
Application number: PCT/CN2022/080977
Authority: WO
Inventors: 郭亨凯; 杜思聪
Original assignee: 北京字跳网络技术有限公司
Priority date: 2021-03-15
Filing date: 2022-03-15
Publication date: 2022-09-22
Also published as: CN115082515A

Abstract

The embodiments of the present disclosure relate to a target tracking method and apparatus, a device and a medium. The method comprises: extracting a first video frame from a target video, and determining first location information of a target area in the first video frame; performing optical flow tracking on a second video frame according to an initial feature point determined from the first location information, so as to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video; and fitting the target feature points to obtain second location information of the target area in the second video frame. By using the technical solution, on the basis of detecting a target area of a video frame of a video, the location of the target area in other video frames can be determined more accurately by means of feature point tracking and fitting, thereby avoiding the detection of each video frame, improving the calculation efficiency of tracking, and realizing rapid and accurate target recognition and tracking for each image frame in the video.

Description

A target tracking method, device, equipment and medium

Cross-reference to related applications

This application claims the priority of the Chinese Patent Application No. 202110276358.7 filed on March 15, 2021 and entitled "A Target Tracking Method, Apparatus, Equipment and Medium", the entire contents of which are incorporated herein by reference.

technical field

The present disclosure relates to the technical field of video processing, and in particular, to a target tracking method, apparatus, device and medium.

Background technique

With the continuous development of intelligent terminal technology, the demand for video content identification and tracking is increasing. At present, the identification and tracking of video content has the defects of low accuracy and can not meet the needs.

SUMMARY OF THE INVENTION

In order to solve the above technical problems or at least partially solve the above technical problems, the present disclosure provides a target tracking method, apparatus, device and medium.

The embodiment of the present disclosure provides a target tracking method, the method includes:

Extract the first video frame in the target video, and determine the first position information of the target area in the first video frame;

Perform optical flow tracking on the second video frame according to the initial feature point determined by the first position information to obtain the target feature point; wherein, the second video frame is the neighbor of the first video frame in the target video video frame;

Fitting the target feature points to obtain second position information of the target area in the second video frame.

Embodiments of the present disclosure also provide a target tracking device, the device comprising:

A first position module, for extracting the first video frame in the target video, and determining the first position information of the target area in the first video frame;

A tracking module, configured to perform optical flow tracking on the second video frame according to the initial feature point determined by the first position information to obtain a target feature point; wherein, the second video frame is the first video frame in the target video. Video frames adjacent to the video frame;

A second position module, configured to fit the target feature points to obtain second position information of the target area in the second video frame.

An embodiment of the present disclosure further provides an electronic device, the electronic device includes: a processor; a memory for storing instructions executable by the processor; the processor for reading the memory from the memory The instructions can be executed, and the instructions can be executed to implement the target tracking method provided by the embodiments of the present disclosure.

An embodiment of the present disclosure further provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to execute the target tracking method provided by the embodiment of the present disclosure.

Embodiments of the present disclosure also provide a computer program product, including a computer program, which, when executed by a processor, implements the target tracking method provided by the embodiments of the present disclosure.

An embodiment of the present disclosure also provides a computer program, where the computer program is stored in a computer-readable storage medium, and when the computer program is executed by a processor, implements the target tracking method provided by the embodiment of the present disclosure.

Compared with the prior art, the technical solution provided by the embodiment of the present disclosure has the following advantages: the target tracking solution provided by the embodiment of the present disclosure extracts the first video frame in the target video, and determines the first video frame of the target area in the first video frame. position information; perform optical flow tracking on the second video frame according to the initial feature point determined by the first position information to obtain the target feature point; wherein, the second video frame is the adjacent video frame of the first video frame in the target video; The target feature points are fitted to obtain second position information of the target area in the second video frame. By adopting the above technical solution, on the basis of detecting the target area of one video frame of the video, the position of the target area in other video frames can be more accurately determined through feature point tracking and fitting, avoiding the need for each video frame to be detected. It improves the computational efficiency of tracking and realizes fast and accurate target recognition and tracking of each image frame in the video.

Description of drawings

The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent when taken in conjunction with the accompanying drawings and with reference to the following detailed description. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that the originals and elements are not necessarily drawn to scale.

FIG. 1 is a schematic flowchart of a target tracking method according to an embodiment of the present disclosure;

FIG. 2 is a schematic flowchart of another target tracking method provided by an embodiment of the present disclosure;

3 is a schematic diagram of a target tracking provided by an embodiment of the present disclosure;

FIG. 4 is a schematic structural diagram of a target tracking device according to an embodiment of the present disclosure;

FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.

Detailed ways

Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for the purpose of A more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are only for exemplary purposes, and are not intended to limit the protection scope of the present disclosure.

It should be understood that the various steps described in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this regard.

As used herein, the term "including" and variations thereof are open-ended inclusions, ie, "including but not limited to". The term "based on" is "based at least in part on." The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms will be given in the description below.

It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different devices, modules or units, and are not used to limit the order of functions performed by these devices, modules or units or interdependence.

It should be noted that the modifications of "a" and "a plurality" mentioned in the present disclosure are illustrative rather than restrictive, and those skilled in the art should understand that unless the context clearly indicates otherwise, they should be understood as "one or a plurality of". multiple".

The names of messages or information exchanged between multiple devices in the embodiments of the present disclosure are only for illustrative purposes, and are not intended to limit the scope of these messages or information.

FIG. 1 is a schematic flowchart of a target tracking method according to an embodiment of the present disclosure. The method may be executed by a target tracking apparatus, where the apparatus may be implemented by software and/or hardware, and may generally be integrated in an electronic device. As shown in Figure 1, the method includes:

Step 101: Extract the first video frame in the target video, and determine the first position information of the target area in the first video frame.

The target video may be any video that needs to be detected and tracked, may be a video captured by a device with a video capture function, or may be a video obtained from the Internet or other devices, which is not limited in detail. A video frame is also called an image frame, which can be the smallest unit that composes a video. The first video frame can be any video frame in the target video. video frame as an example. The target area refers to an area with a preset shape. In the video, it can be the area where an object with a preset shape is located. The preset shape is not limited. For example, the preset shape can include an ellipse, a circle, and a rectangle. The target area can be the area where the elliptical object is located.

In the embodiment of the present disclosure, after the target video is acquired, the first video frame may be extracted from the target video, and a preset detection algorithm is used to detect the target area of the first video frame, and determine the No. 1 position of the target area in the first video frame. a location information. The above-mentioned preset detection algorithm may be a deep learning-based detection algorithm or a contour detection algorithm, etc., which may be determined according to the actual situation. For example, when the target area is an oval area, the preset detection algorithm may be any ellipse detection algorithm, and an ellipse detection algorithm is used. The detection algorithm performs contour detection on the first video frame, and then fits the elliptical contour obtained by the contour detection to obtain the position information of the target area in the first video frame. The first position information may be information that can represent the position of the target area in the first video frame, and may specifically include information such as vertex coordinates and center point coordinates of the target area in the first video frame.

Step 102: Perform optical flow tracking on the second video frame according to the initial feature point determined by the first position information to obtain the target feature point; wherein the second video frame is an adjacent video frame of the first video frame in the target video.

The second video frame refers to a video frame adjacent to the first video frame in the target video, which may be the next video frame in time sequence. The initial feature points may be points obtained by sampling the contour of the target area in the first video frame.

In the embodiment of the present disclosure, determining the initial feature point according to the first position information includes: sampling the edge contour of the target area in the first video frame according to the first position information to determine the initial feature point. Optionally, sampling the edge contour of the target area in the first video frame according to the first position information, and determining the initial feature points, including: when the target area is an elliptical area, according to the first position information, the target area is in polar coordinates. Perform the following representation to obtain an ellipse outline; wherein, the first position information includes vertex coordinates and/or center point coordinates of the target area in the first video frame; sampling is performed in the ellipse outline according to preset polar angle intervals to obtain initial feature points.

The preset polar angle interval may be set according to actual conditions, for example, the preset polar angle interval may be set to 5 degrees. In this embodiment of the present disclosure, the target area in the first video frame may be sampled based on the first location information determined above, and then the initial feature points may be determined. Taking the target area as an elliptical area as an example, the The ellipse equation of the elliptical region of a video frame is expressed in polar coordinates to obtain the ellipse outline, and sampling is performed on the ellipse outline according to the preset polar angle interval, and a feature point is collected at each preset polar angle interval to obtain a plurality of initial Feature points. After that, in the second video frame, the optical flow tracking algorithm is used to track the initial feature points obtained by the above sampling, and the feature points that are successfully tracked are reserved as the target feature points, and the feature points that fail to be tracked are eliminated.

Step 103: Fit the target feature points to obtain second position information of the target area in the second video frame.

In the embodiment of the present disclosure, fitting the target area based on the target feature points, and determining the second position information of the target area in the second video frame, including: if the coverage area of the target feature points on the edge contour of the target area is greater than or equal to a predetermined If the range is set, the target feature points are fitted to obtain the second position information of the target area in the second video frame.

The preset range refers to a preset range that satisfies the shape of the target area, which may be set according to actual conditions. For example, the preset range may be 3/4 of the entire range of the edge contour. Specifically, after determining the target feature points, it can be determined whether the coverage of the target points on the edge contour of the target area is greater than or equal to the preset range, and if so, use a fitting algorithm to fit the target feature points to obtain the target area in second position information in the second video frame.

Exemplarily, when the target area is an elliptical area, if the score of the target feature points on the elliptical outline is greater than or equal to 3/4 of the elliptical outline, a random sampling consistency (Random Sample Consensus, RANSAC) algorithm is used to perform ellipse fitting. , that is, 5 points are randomly selected from the target feature points each time, and the number of interior point sets in these 5 points is judged until the largest interior point set is found, and the 5 points corresponding to the maximum interior point set are used for ellipse fitting. The above The interior point set refers to the set of points on the contour of the ellipse.

In the embodiment of the present disclosure, the target tracking method may further include: if the coverage of the target feature points on the edge contour of the target area is smaller than the preset range, determining that the target area is in the second video frame by detecting the second video frame the second location information. When the coverage area of the target point on the edge contour of the target area is smaller than the preset area, it is determined that the tracking fails, and a preset detection algorithm can be used to re-detect the second video frame to determine the second position information of the target area. The above-mentioned preset detection algorithm may be implemented by detecting the second video frame, and may be a deep learning-based detection algorithm or a contour detection algorithm, etc., which is not limited in particular.

It can be understood that after determining the position information of the target area in the second video frame, the second video frame can be determined as the new first video frame and the third video frame adjacent to the second video frame can be determined as the new first video frame. For two video frames, go back to step 102 until the determination of the position of the target area of each video frame in the video is completed.

The target tracking solution provided by the embodiment of the present disclosure extracts the first video frame in the target video, and determines the first position information of the target area in the first video frame; The frame is subjected to optical flow tracking to obtain target feature points; wherein, the second video frame is the adjacent video frame of the first video frame in the target video; the target feature points are fitted to obtain the first video frame of the target area in the second video frame. 2. Location information. By adopting the above technical solution, on the basis of detecting the target area of one video frame of the video, the position of the target area in other video frames can be more accurately determined through feature point tracking and fitting, avoiding the need for each video frame to be detected. It improves the computational efficiency of tracking and realizes fast and accurate target recognition and tracking of each image frame in the video.

In some embodiments, after determining the first position information of the target area in the first video frame, the method further includes: determining a change parameter of the second video frame relative to the first video frame; initial features determined according to the first position information Perform optical flow tracking on the second video frame to obtain the target feature point, including: if it is determined based on the change parameter that the second video frame does not meet the multiplexing condition, performing the initial feature point determined according to the first position information to the second video frame. Perform optical flow tracking to obtain target feature points. Optionally, the multiplexing condition is that the change parameter is less than or equal to the change threshold.

The transformation parameter refers to a parameter representing the change of the second video frame relative to the first video frame. The multiplexing condition refers to a specific judging condition for determining whether the first video frame can be multiplexed by the second video frame to the position of the target area. The change threshold refers to a preset threshold, which can be set according to the actual situation. For example, when the change parameter is represented by the movement information of the feature points in the second video frame relative to the corresponding feature points in the first video frame, the transformation threshold can be Distance threshold, set to 0.8. Specifically, after determining the change parameter of the second video frame relative to the first video frame, the change parameter may be compared with the change threshold, and if it is determined that the change parameter is greater than the change threshold, it may be determined that the second video frame does not meet the multiplexing condition , re-tracking is required, and optical flow tracking is performed on the second video frame based on the initial feature point determined according to the first position information to obtain the target feature point; otherwise, it is determined that the second video frame satisfies the multiplexing condition.

In some embodiments, determining a change parameter of the second video frame relative to the first video frame includes: extracting a first feature point in the first video frame; performing optical flow tracking on the second video frame according to the first feature point, A second feature point is determined, and a moving distance between the second feature point and the first feature point is determined as a change parameter. The above-mentioned first feature point may be a corner point detected on the first video frame by adopting an accelerated segmentation test (Features From Accelerated Segment Test, FAST) corner detection algorithm, and a corner point refers to an extreme point, that is, in a certain aspect Attributes that stand out in particular. The detected object may be the entire first video frame, or may only be the above-mentioned target area, which is not particularly limited.

Specifically, the FAST corner detection algorithm can be used to extract the first feature point for the first video frame, and the first feature point can be used as the input of the KLT (Kanade Lucas Tomasi) optical flow tracking algorithm to obtain the output of the second feature of successful tracking. Then, since the number of the first feature point and the second feature point can be multiple, the average value of the moving distance of the first feature point and the second whole point can be calculated, and the average value of the moving distance is determined as the transformation parameter.

In the embodiment of the present disclosure, the target tracking method may further include: if it is determined based on the change parameter that the second video frame satisfies the multiplexing condition, determining the first position information as the second position information of the target area in the second video frame. If it is determined that the change parameter is less than or equal to the change threshold, it means that the current camera is basically in a stationary state, the positions of the target areas of two adjacent video frames are similar, and the second video frame satisfies the multiplexing condition, and the first position information can be assigned to the first position information. The two video frames, that is, the location information of the target area in the first video frame and the second video frame are the same.

In the above scheme, by adding the judgment of multiplexing conditions to two adjacent video frames, when the change of the two adjacent video frames in the video is large, the above-mentioned feature point tracking and fitting are used to realize the determination of the position of the target area; When the change or difference between two adjacent video frames in the video is small, the similarity between the two video frames is high, and the next video frame can directly reuse the position information of the target area of the previous video frame without redoing The detection saves the workload and improves the computing efficiency.

FIG. 2 is a schematic flowchart of another target tracking method provided by an embodiment of the present disclosure. On the basis of the foregoing embodiment, this embodiment further optimizes the foregoing target tracking method. As shown in Figure 2, the method includes:

Step 201: Extract the first video frame in the target video, and determine the first position information of the target area in the first video frame.

Step 202: Extract the first feature point in the first video frame.

Step 203: Perform optical flow tracking on the second video frame according to the first feature point, determine the second feature point, and determine the moving distance between the second feature point and the first feature point as a change parameter.

The second video frame is an adjacent video frame of the first video frame in the target video.

Step 204: Determine whether the second video frame satisfies the multiplexing condition based on the change parameter, if yes, go to Step 210; otherwise, go to Step 205.

The multiplexing condition is that the change parameter is less than or equal to the change threshold. If the change parameter is greater than the change threshold, it is determined that the second video frame does not meet the multiplexing condition, and step 205 is executed; otherwise, it is determined that the second video frame meets the multiplexing condition, and step 210 is executed.

Step 205: Sampling the edge contour of the target area in the first video frame according to the first position information to determine initial feature points.

Optionally, sampling the edge contour of the target area in the first video frame according to the first position information, and determining the initial feature points, including: when the target area is an elliptical area, according to the first position information, the target area is in polar coordinates. Perform the following representation to obtain an ellipse outline; wherein, the first position information includes vertex coordinates and/or center point coordinates of the target area in the first video frame; sampling is performed in the ellipse outline according to preset polar angle intervals to obtain initial feature points.

Step 206: Perform optical flow tracking on the second video frame according to the initial feature points determined by the first position information to obtain target feature points.

Step 207: Check whether the coverage range of the target feature point on the edge contour of the target area is greater than or equal to the preset range, if so, go to Step 208; otherwise, go to Step 209.

If the coverage area of the target feature point on the edge contour of the target area is greater than or equal to the preset area, step 208 is performed; otherwise, step 209 is performed.

Step 208: Fit the target feature points to obtain second position information of the target area in the second video frame.

Step 209: Determine the second position information of the target area in the second video frame by detecting the second video frame.

Step 210: Determine the first position information as the second position information of the target area in the second video frame.

Next, the target tracking method in the embodiment of the present disclosure will be further described by using a specific example. Exemplarily, FIG. 3 is a schematic diagram of a target tracking provided by an embodiment of the present disclosure. The tracking process for a video may include: Step 21 , performing ellipse detection on the previous frame. The previous frame may be the first frame of the video. Specifically, any ellipse detection method may be used for detection to determine the ellipse position of the previous frame. Step 22: Whether the current frame stillness detection is passed, if yes, go to Step 26; otherwise, go to Step 23. Specifically, FAST corner detection is performed on the previous frame, and KLT optical flow tracking is performed on the current frame based on the corners of the previous frame. Calculate the average moving distance of matching points in the two frames before and after. If the distance is less than 0.8, it means that the camera is basically in a stationary state, and the stationary detection passes, then the ellipse position of the current frame should be similar to the previous frame, and directly assign the ellipse position of the previous frame to the current frame, and go to step 26. If the distance is greater than 0.8, the static detection fails, and step 23 is executed. Step 23: Sampling the circular polar angle, and track the sampling points. The ellipse equation of the previous frame is represented in polar coordinates, and the feature points are sampled on the circumference of the ellipse according to the polar angle, and a point is taken every 5 degrees, a total of 72 points; in the current frame image, the sampling obtained The feature points are tracked using optical flow, and the points that are successfully tracked are retained, and the points that fail to be tracked are eliminated. Step 24: Determine whether the sampling point range meets the requirements, if yes, go to Step 25; otherwise, go to Step 27. If the distribution of successfully tracked points on the circumference of the ellipse is greater than 3/4 of the circumference of the ellipse, it is determined that the sampling point range meets the requirements, and step 25 is performed. Otherwise, it is determined that the sampling point range does not meet the requirements, and it is considered that the tracking fails, and step 27 is executed. Step 25, RANSAC fitting. The ellipse fitting is performed according to the feature points, and the ellipse fitting is done by RANSAC, that is, 5 points are randomly sampled from the point set each time until the ellipse model with the largest inner point set is found. Step 26: The current frame ends and the next frame begins. Step 27, ellipse detection. The ellipse detection is performed again on the current frame, and after the ellipse position is determined, step 26 is continued until the ellipse position is determined for each frame in the video.

In this solution, the optical flow tracking of feature points, the still detection of video frame sequences and the quality discrimination of ellipse tracking can be used to quickly and accurately complete the ellipse tracking of each image frame in the video, and it is not necessary to detect each video frame. , reducing the amount of calculation and ensuring the real-time performance of target tracking.

FIG. 4 is a schematic structural diagram of a target tracking apparatus provided by an embodiment of the present disclosure. The apparatus may be implemented by software and/or hardware, and may generally be integrated into an electronic device. As shown in Figure 4, the device includes:

The first position module 301 is used to extract the first video frame in the target video, and determine the first position information of the target area in the first video frame;

A tracking module 302, configured to perform optical flow tracking on the second video frame according to the initial feature point determined by the first position information, to obtain a target feature point; wherein, the second video frame is the first video frame in the target video. A video frame adjacent to a video frame;

The second position module 303 is configured to fit the target feature points to obtain second position information of the target area in the second video frame.

Optionally, the tracking module 302 is used for:

The edge contour of the target area in the first video frame is sampled according to the first position information to determine initial feature points.

Optionally, the tracking module 302 is used for:

When the target area is an elliptical area, an elliptical outline is obtained by representing the target area in polar coordinates according to the first position information; wherein, the first position information includes that the target area is in the first Vertex coordinates and/or center point coordinates in the video frame; sampling in the elliptical outline according to preset polar angle intervals to obtain the initial feature points.

Optionally, the second location module 303 is used for:

If the coverage range of the target feature point on the edge contour of the target area is greater than or equal to a preset range, perform fitting on the target feature point to obtain the coverage area of the target area in the second video frame. second location information.

Optionally, the device further includes a detection module for:

If the coverage range of the target feature point on the edge contour of the target area is smaller than the preset range, determining the coverage of the target area in the second video frame by detecting the second video frame second location information.

Optionally, the device further includes a multiplexing judging module, configured to: after determining the first position information of the target area in the first video frame,

determining a change parameter of the second video frame relative to the first video frame;

Perform optical flow tracking on the second video frame according to the initial feature points determined by the first position information to obtain target feature points, including:

If it is determined based on the change parameter that the second video frame does not meet the multiplexing condition, the initial feature point determined according to the first position information is performed to perform optical flow tracking on the second video frame to obtain target features point.

Optionally, the multiplexing judgment module is specifically used for:

extracting the first feature point in the first video frame;

Perform optical flow tracking on the second video frame according to the first feature point, determine a second feature point, and determine the moving distance between the second feature point and the first feature point as the change parameter .

Optionally, the multiplexing condition is that the change parameter is less than or equal to a change threshold.

Optionally, the device also includes a multiplexing module for:

If it is determined based on the change parameter that the second video frame satisfies the multiplexing condition, the first position information is determined as the second position information of the target area in the second video frame.

The target tracking device provided by the embodiment of the present disclosure can execute the target tracking method provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method.

An embodiment of the present disclosure also provides a computer program product, including a computer program/instruction, when the computer program/instruction is executed by a processor, the target tracking method provided by any embodiment of the present disclosure is implemented.

FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. Referring specifically to FIG. 5 below, it shows a schematic structural diagram of an electronic device 400 suitable for implementing an embodiment of the present disclosure. The electronic device 400 in the embodiment of the present disclosure may include, but is not limited to, such as a mobile phone, a notebook computer, a digital broadcast receiver, a Personal Digital Assistant (PDA), a PAD (tablet computer), a portable multimedia player (Portable Media Player, PMP), in-vehicle terminals (eg, in-vehicle navigation terminals), etc., and stationary terminals such as digital televisions (Television, TV), desktop computers, and the like. The electronic device shown in FIG. 5 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.

As shown in FIG. 5 , the electronic device 400 may include a processing device (eg, a central processing unit, a graphics processor, etc.) 401, which may be based on a program stored in a read-only memory (Read-Only Memory, ROM) 402 or from a storage device 408 is a program loaded into a random access memory (RAM) 403 to perform various appropriate actions and processes. In the RAM 403, various programs and data required for the operation of the electronic device 400 are also stored. The processing device 401, the ROM 402, and the RAM 403 are connected to each other through a bus 404. An Input/Output (I/O) interface 405 is also connected to the bus 404 .

Typically, the following devices can be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; including, for example, a Liquid Crystal Display (LCD) output device 407 , a speaker, a vibrator, etc.; a storage device 408 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 409 . Communication means 409 may allow electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 5 shows electronic device 400 having various means, it should be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.

In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from the network via the communication device 409, or from the storage device 408, or from the ROM 402. When the computer program is executed by the processing device 401, the above-mentioned functions defined in the target tracking method of the embodiment of the present disclosure are executed.

It should be noted that the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two. The computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of computer-readable storage media may include, but are not limited to: electrical connections with one or more wires, portable computer disks, hard disks, RAM, ROM, Erasable Programmable Read-Only Memory (Erasable Programmable Read-Only Memory) Memory, EPROM, or flash memory), optical fiber, portable compact disk read-only memory (Compact Disc Read-Only Memory, CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above. In this disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In the present disclosure, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device . The program code embodied on the computer readable medium can be transmitted by any suitable medium, including but not limited to: electric wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the above.

In some embodiments, the client and server can use any currently known or future developed network protocol such as HTTP (HyperText Transfer Protocol) to communicate, and can communicate with digital data in any form or medium Communication (eg, a communication network) interconnects. Examples of communication networks include Local Area Network (LAN), Wide Area Network (WAN), Internet (eg, the Internet), and peer-to-peer networks (eg, Ad-Hoc peer-to-peer network), as well as any current Known or future developed networks.

The above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.

The above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device: extracts the first video frame in the target video, and determines that the target area is in the first video frame. the first position information in a video frame; perform optical flow tracking on the second video frame according to the initial feature points determined by the first position information to obtain target feature points; wherein, the second video frame is the target video The adjacent video frames of the first video frame in the above; and fitting the target feature points to obtain the second position information of the target area in the second video frame.

Computer program code for performing operations of the present disclosure may be written in one or more programming languages, including but not limited to object-oriented programming languages—such as Java, Smalltalk, C++, and This includes conventional procedural programming languages - such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (eg, using an Internet service provider to connect through the Internet).

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more logical functions for implementing the specified functions executable instructions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It is also noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations , or can be implemented in a combination of dedicated hardware and computer instructions.

The units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner. Among them, the name of the unit does not constitute a limitation of the unit itself under certain circumstances.

The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (Application Specific Standard Products) Standard Product, ASSP), system on chip (System on Chip, SOC), complex programmable logic device (Complex Programmable Logic Device, CPLD) and so on.

According to one or more embodiments of the present disclosure, the present disclosure provides a target tracking method, including:

According to one or more embodiments of the present disclosure, in the target tracking method provided by the present disclosure, determining an initial feature point according to the first position information includes:

According to one or more embodiments of the present disclosure, in the target tracking method provided by the present disclosure, the edge contour of the target area in the first video frame is sampled according to the first position information, and the initial feature points are determined, including: :

According to one or more embodiments of the present disclosure, in the target tracking method provided by the present disclosure, the target feature points are fitted to obtain second position information of the target area in the second video frame, including: :

According to one or more embodiments of the present disclosure, the target tracking method provided by the present disclosure further includes:

According to one or more embodiments of the present disclosure, in the target tracking method provided by the present disclosure, after the determining the first position information of the target area in the first video frame, the method further includes:

According to one or more embodiments of the present disclosure, in the target tracking method provided by the present disclosure, the determining a change parameter of the second video frame relative to the first video frame includes:

extracting the first feature point in the first video frame;

According to one or more embodiments of the present disclosure, in the target tracking method provided by the present disclosure, the multiplexing condition is that the change parameter is less than or equal to a change threshold.

According to one or more embodiments of the present disclosure, the present disclosure provides a target tracking device, including:

According to one or more embodiments of the present disclosure, in the target tracking device provided by the present disclosure, the tracking module is configured to:

According to one or more embodiments of the present disclosure, in the target tracking device provided by the present disclosure, the second location module is used for:

According to one or more embodiments of the present disclosure, in the target tracking device provided by the present disclosure, the device further includes a detection module for:

According to one or more embodiments of the present disclosure, in the target tracking device provided by the present disclosure, the device further includes a multiplexing judgment module, configured to: in the first video frame of the determined target area in the first video frame After location information,

According to one or more embodiments of the present disclosure, in the target tracking device provided by the present disclosure, the multiplexing judgment module is specifically configured to:

extracting the first feature point in the first video frame;

According to one or more embodiments of the present disclosure, in the target tracking device provided by the present disclosure, the multiplexing condition is that the change parameter is less than or equal to a change threshold.

According to one or more embodiments of the present disclosure, in the target tracking device provided by the present disclosure, the device further includes a multiplexing module for:

According to one or more embodiments of the present disclosure, the present disclosure provides an electronic device, comprising:

processor;

a memory for storing the processor-executable instructions;

The processor is configured to read the executable instructions from the memory, and execute the instructions to implement any one of the target tracking methods provided in the present disclosure.

According to one or more embodiments of the present disclosure, the present disclosure provides a computer-readable storage medium storing a computer program for executing any of the objects provided by the present disclosure tracking method.

According to one or more embodiments of the present disclosure, the present disclosure provides a computer program product, including a computer program, which, when executed by a processor, implements the target tracking method as provided in any one of the present disclosure.

According to one or more embodiments of the present disclosure, the present disclosure provides a computer program, the computer program is stored in a computer-readable storage medium, and when the computer program is executed by a processor, implements any one of the methods provided by the present disclosure. The described target tracking method.

The above description is merely a preferred embodiment of the present disclosure and an illustration of the technical principles employed. Those skilled in the art should understand that the scope of the disclosure involved in the present disclosure is not limited to the technical solutions formed by the specific combination of the above-mentioned technical features, and should also cover, without departing from the above-mentioned disclosed concept, the technical solutions formed by the above-mentioned technical features or Other technical solutions formed by any combination of its equivalent features. For example, a technical solution is formed by replacing the above features with the technical features disclosed in the present disclosure (but not limited to) with similar functions.

Additionally, although operations are depicted in a particular order, this should not be construed as requiring that the operations be performed in the particular order shown or in a sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several implementation-specific details, these should not be construed as limitations on the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.

Although the subject matter has been described in language specific to structural features and/or logical acts of method, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims

A target tracking method, comprising:

Extract the first video frame in the target video, and determine the first position information of the target area in the first video frame;

Perform optical flow tracking on the second video frame according to the initial feature point determined by the first position information to obtain the target feature point; wherein, the second video frame is the neighbor of the first video frame in the target video video frame;

Fitting the target feature points to obtain second position information of the target area in the second video frame.
The method according to claim 1, wherein, before the initial feature point determined according to the first position information performs optical flow tracking on the second video frame to obtain the target feature point, the method further comprises:

The edge contour of the target area in the first video frame is sampled according to the first position information to determine initial feature points.
The method according to claim 2, wherein sampling the edge contour of the target area in the first video frame according to the first position information to determine initial feature points, comprising:

When the target area is an elliptical area, an elliptical outline is obtained by representing the target area in polar coordinates according to the first position information; wherein, the first position information includes that the target area is in the first vertex coordinates and/or center point coordinates in the video frame;

Sampling is performed in the elliptical outline according to a preset polar angle interval to obtain the initial feature point.
The method according to any one of claims 1-3, wherein the fitting of the target feature points to obtain the second position information of the target area in the second video frame, comprising:

If the coverage range of the target feature point on the edge contour of the target area is greater than or equal to a preset range, perform fitting on the target feature point to obtain the coverage area of the target area in the second video frame. second location information.
The method of claim 4, further comprising:

If the coverage range of the target feature point on the edge contour of the target area is smaller than the preset range, determining the coverage of the target area in the second video frame by detecting the second video frame second location information.
The method according to any one of claims 1-3, wherein after the determining the first position information of the target area in the first video frame, the method further comprises:

determining a change parameter of the second video frame relative to the first video frame;

Perform optical flow tracking on the second video frame according to the initial feature points determined by the first position information to obtain target feature points, including:

If it is determined based on the change parameter that the second video frame does not meet the multiplexing condition, the initial feature point determined according to the first position information is performed to perform optical flow tracking on the second video frame to obtain target features point.
The method according to claim 6, wherein the determining a change parameter of the second video frame relative to the first video frame comprises:

extracting the first feature point in the first video frame;

Perform optical flow tracking on the second video frame according to the first feature point, determine a second feature point, and determine the moving distance between the second feature point and the first feature point as the change parameter .
The method according to claim 6, wherein the multiplexing condition is that the change parameter is less than or equal to a change threshold.
The method of claim 6, further comprising:

If it is determined based on the change parameter that the second video frame satisfies the multiplexing condition, the first position information is determined as the second position information of the target area in the second video frame.
A target tracking device, comprising:

A first position module, for extracting the first video frame in the target video, and determining the first position information of the target area in the first video frame;

A tracking module, configured to perform optical flow tracking on the second video frame according to the initial feature point determined by the first position information to obtain a target feature point; wherein, the second video frame is the first video frame in the target video. Video frames adjacent to the video frame;

A second position module, configured to fit the target feature points to obtain second position information of the target area in the second video frame.
An electronic device, characterized in that the electronic device comprises:

processor;

a memory for storing the processor-executable instructions;

The processor is configured to read the executable instructions from the memory and execute the instructions to implement the target tracking method according to any one of the preceding claims 1-9.
A computer-readable storage medium, characterized in that the storage medium stores a computer program, and the computer program is used to execute the target tracking method according to any one of the preceding claims 1-9.
A computer program product, characterized in that it includes a computer program, which, when executed by a processor, implements the target tracking method according to any one of claims 1-9.
A computer program, characterized in that, the computer program is stored in a computer-readable storage medium, and when the computer program is executed by a processor, the target tracking method according to any one of the preceding claims 1-9 is implemented.