WO2022194157A1 - Target tracking method and apparatus, device and medium - Google Patents

Target tracking method and apparatus, device and medium

Info

Publication number
WO2022194157A1
WO2022194157A1 (PCT/CN2022/080977; CN2022080977W)
Authority
WO
WIPO (PCT)
Prior art keywords
video frame
target
position information
target area
video
Prior art date
Application number
PCT/CN2022/080977
Other languages
English (en)
Chinese (zh)
Inventor
郭亨凯
杜思聪
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司
Publication of WO2022194157A1
Priority to US18/468,647 (US20240005552A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20164 Salient point detection; Corner detection

Definitions

  • the present disclosure relates to the technical field of video processing, and in particular, to a target tracking method, apparatus, device and medium.
  • the present disclosure provides a target tracking method, apparatus, device and medium.
  • the embodiment of the present disclosure provides a target tracking method, the method including: extracting the first video frame in the target video, and determining the first position information of the target area in the first video frame; performing optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video; and fitting the target feature points to obtain the second position information of the target area in the second video frame.
  • Embodiments of the present disclosure also provide a target tracking device, the device comprising:
  • a first position module for extracting the first video frame in the target video, and determining the first position information of the target area in the first video frame;
  • a tracking module, configured to perform optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video;
  • a second position module configured to fit the target feature points to obtain second position information of the target area in the second video frame.
  • An embodiment of the present disclosure further provides an electronic device, the electronic device including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the target tracking method provided by the embodiments of the present disclosure.
  • An embodiment of the present disclosure further provides a computer-readable storage medium, where the storage medium stores a computer program, and the computer program is used to execute the target tracking method provided by the embodiment of the present disclosure.
  • Embodiments of the present disclosure also provide a computer program product, including a computer program, which, when executed by a processor, implements the target tracking method provided by the embodiments of the present disclosure.
  • An embodiment of the present disclosure also provides a computer program, where the computer program is stored in a computer-readable storage medium, and, when the computer program is executed by a processor, it implements the target tracking method provided by the embodiments of the present disclosure.
  • the target tracking solution provided by the embodiment of the present disclosure extracts the first video frame in the target video and determines the first position information of the target area in the first video frame; performs optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video; and fits the target feature points to obtain the second position information of the target area in the second video frame.
  • on the basis of detecting the target area in one video frame of the video, the position of the target area in other video frames can be determined more accurately through feature point tracking and fitting, avoiding the need to run detection on every video frame; this improves the computational efficiency of tracking and enables fast and accurate target recognition and tracking for each image frame in the video.
  • FIG. 1 is a schematic flowchart of a target tracking method according to an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of another target tracking method provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of target tracking provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a target tracking device according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • the term “including” and variations thereof are open-ended inclusions, i.e., “including but not limited to”.
  • the term “based on” is “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • FIG. 1 is a schematic flowchart of a target tracking method according to an embodiment of the present disclosure.
  • the method may be executed by a target tracking apparatus, where the apparatus may be implemented by software and/or hardware, and may generally be integrated in an electronic device.
  • the method includes:
  • Step 101 Extract the first video frame in the target video, and determine the first position information of the target area in the first video frame.
  • the target video may be any video that needs to be detected and tracked; it may be a video captured by a device with a video capture function, or a video obtained from the Internet or from another device, which is not specifically limited here.
  • a video frame is also called an image frame, which can be the smallest unit that composes a video.
  • the first video frame can be any video frame in the target video; this disclosure takes the first frame of the target video as an example.
  • the target area refers to an area with a preset shape. In the video, it can be the area where an object with a preset shape is located.
  • the preset shape is not limited.
  • the preset shape can include an ellipse, a circle, and a rectangle.
  • the target area can be the area where the elliptical object is located.
  • the first video frame may be extracted from the target video, and a preset detection algorithm is used to detect the target area in the first video frame and determine the first position information of the target area in the first video frame.
  • the above-mentioned preset detection algorithm may be a deep learning-based detection algorithm or a contour detection algorithm, etc., which may be determined according to the actual situation.
  • when the target area is an elliptical area, the preset detection algorithm may be any ellipse detection algorithm. The ellipse detection algorithm performs contour detection on the first video frame and then fits the elliptical contour obtained by the contour detection to obtain the position information of the target area in the first video frame.
  • the first position information may be information that can represent the position of the target area in the first video frame, and may specifically include information such as vertex coordinates and center point coordinates of the target area in the first video frame.
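  • For illustration only, the following is a minimal sketch of one possible contour-based ellipse detector for the first video frame, written with OpenCV. The function name, the Canny thresholds, and the largest-contour heuristic are assumptions for demonstration; the disclosure allows any ellipse detection algorithm.

```python
import cv2

def detect_ellipse(frame):
    """Detect the dominant ellipse in a frame and return its parameters
    ((cx, cy), (major_axis, minor_axis), angle_deg), or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)  # illustrative thresholds
    contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    candidates = [c for c in contours if len(c) >= 5]  # fitEllipse needs >= 5 points
    if not candidates:
        return None
    contour = max(candidates, key=cv2.contourArea)
    # The fitted ellipse serves as the first position information of the target area.
    return cv2.fitEllipse(contour)
```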
  • Step 102 Perform optical flow tracking on the second video frame according to the initial feature point determined by the first position information to obtain the target feature point; wherein the second video frame is an adjacent video frame of the first video frame in the target video.
  • the second video frame refers to a video frame adjacent to the first video frame in the target video, which may be the next video frame in time sequence.
  • the initial feature points may be points obtained by sampling the contour of the target area in the first video frame.
  • determining the initial feature point according to the first position information includes: sampling the edge contour of the target area in the first video frame according to the first position information to determine the initial feature point.
  • sampling the edge contour of the target area in the first video frame according to the first position information to determine the initial feature points may include: when the target area is an elliptical area, representing the target area in polar coordinates according to the first position information to obtain an elliptical contour, and sampling the elliptical contour at preset polar angle intervals to obtain the initial feature points.
  • the preset polar angle interval may be set according to actual conditions, for example, the preset polar angle interval may be set to 5 degrees.
  • the target area in the first video frame may be sampled based on the first position information determined above to obtain the initial feature points. Taking an elliptical target area as an example, the ellipse equation of the elliptical area of the first video frame is expressed in polar coordinates to obtain the ellipse contour, and the ellipse contour is sampled at the preset polar angle interval, collecting one feature point per interval, to obtain a plurality of initial feature points. After that, in the second video frame, the optical flow tracking algorithm is used to track the initial feature points obtained by the above sampling; the feature points that are successfully tracked are retained as the target feature points, and the feature points that fail to be tracked are discarded.
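  • As an illustrative sketch of this sampling-and-tracking step (assuming the elliptical target area and the 5-degree polar angle interval mentioned above), the ellipse contour can be sampled in polar coordinates and the samples tracked into the second video frame with pyramidal Lucas-Kanade optical flow; the helper names are hypothetical:

```python
import cv2
import numpy as np

def sample_ellipse(ellipse, step_deg=5.0):
    """Sample the ellipse contour once per preset polar angle interval."""
    (cx, cy), (major, minor), angle = ellipse
    a, b = major / 2.0, minor / 2.0
    phi = np.deg2rad(angle)
    t = np.deg2rad(np.arange(0.0, 360.0, step_deg))  # polar angles of the samples
    x = cx + a * np.cos(t) * np.cos(phi) - b * np.sin(t) * np.sin(phi)
    y = cy + a * np.cos(t) * np.sin(phi) + b * np.sin(t) * np.cos(phi)
    return np.stack([x, y], axis=1).astype(np.float32).reshape(-1, 1, 2)

def track_points(prev_gray, cur_gray, pts):
    """KLT-track sampled points; keep only successes (the target feature points)."""
    nxt, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, pts, None)
    ok = status.ravel() == 1
    return pts[ok], nxt[ok]
```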
  • Step 103 Fit the target feature points to obtain second position information of the target area in the second video frame.
  • fitting the target feature points to determine the second position information of the target area in the second video frame may include: if the coverage of the target feature points on the edge contour of the target area is greater than or equal to a preset range, fitting the target feature points to obtain the second position information of the target area in the second video frame.
  • the preset range refers to a preset range that satisfies the shape of the target area, which may be set according to actual conditions.
  • the preset range may be 3/4 of the entire range of the edge contour.
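  • A sketch of one way to implement this coverage check: bin the tracked points by polar angle around the previous ellipse center and require the non-empty bins to span at least 3/4 of the full circle. The binning scheme is an assumption; the disclosure only requires comparing the coverage with a preset range.

```python
import numpy as np

def coverage_ok(points, center, bin_deg=5.0, min_fraction=0.75):
    """Return True if the tracked points cover >= min_fraction of the contour."""
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    ang = np.degrees(np.arctan2(pts[:, 1] - center[1],
                                pts[:, 0] - center[0])) % 360.0
    covered_bins = np.unique((ang // bin_deg).astype(int))
    return len(covered_bins) * bin_deg / 360.0 >= min_fraction
```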
  • when the target area is an elliptical area, a Random Sample Consensus (RANSAC) algorithm may be used to perform the ellipse fitting: 5 points are randomly selected from the target feature points each time and the size of the resulting inlier set is evaluated, until the largest inlier set is found; the 5 points corresponding to the largest inlier set are then used for the ellipse fitting. The inlier set refers to the set of points lying on the contour of the fitted ellipse.
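  • The RANSAC loop described above might look like the following sketch. Fitting each 5-point sample with cv2.fitEllipse and testing inliers via a normalized-radius tolerance are implementation assumptions, since the disclosure does not fix a distance measure or an iteration count.

```python
import cv2
import numpy as np

def ransac_ellipse(points, iters=100, tol=0.05, seed=0):
    """Fit an ellipse with RANSAC: 5 random points per trial (the minimum
    cv2.fitEllipse accepts); keep the model with the largest inlier set."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=np.float32).reshape(-1, 2)
    if len(pts) < 5:
        return None
    best_model, best_count = None, 0
    for _ in range(iters):
        sample = pts[rng.choice(len(pts), size=5, replace=False)]
        (cx, cy), (major, minor), angle = cv2.fitEllipse(sample)
        a, b = major / 2.0, minor / 2.0
        if a < 1e-6 or b < 1e-6:
            continue
        # In the ellipse frame, contour points satisfy (u/a)^2 + (v/b)^2 == 1.
        phi = np.deg2rad(angle)
        dx, dy = pts[:, 0] - cx, pts[:, 1] - cy
        u = dx * np.cos(phi) + dy * np.sin(phi)
        v = -dx * np.sin(phi) + dy * np.cos(phi)
        r = (u / a) ** 2 + (v / b) ** 2
        count = int(np.sum(np.abs(r - 1.0) < tol))
        if count > best_count:
            best_model, best_count = ((cx, cy), (major, minor), angle), count
    return best_model  # second position information of the target area
```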
  • the target tracking method may further include: if the coverage of the target feature points on the edge contour of the target area is smaller than the preset range, determining the second position information of the target area in the second video frame by detecting the second video frame.
  • a preset detection algorithm can be used to re-detect the second video frame to determine the second position information of the target area.
  • the preset detection algorithm used to detect the second video frame may be a deep-learning-based detection algorithm or a contour detection algorithm, etc., which is not specifically limited.
  • after that, the second video frame can be taken as the new first video frame, and the third video frame adjacent to the second video frame as the new second video frame, returning to step 102 until the position of the target area has been determined for each video frame in the video.
  • the target tracking solution extracts the first video frame in the target video and determines the first position information of the target area in the first video frame; performs optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video; and fits the target feature points to obtain the second position information of the target area in the second video frame. By adopting the above technical solution, on the basis of detecting the target area in one video frame of the video, the position of the target area in other video frames can be determined more accurately through feature point tracking and fitting, avoiding the need to run detection on every video frame; this improves the computational efficiency of tracking and enables fast and accurate target recognition and tracking for each image frame in the video.
  • the method may further include: determining a change parameter of the second video frame relative to the first video frame. Performing optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain the target feature points then includes: if it is determined based on the change parameter that the second video frame does not satisfy the multiplexing condition, performing optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain the target feature points.
  • the multiplexing condition is that the change parameter is less than or equal to the change threshold.
  • the change parameter refers to a parameter representing the change of the second video frame relative to the first video frame.
  • the multiplexing condition refers to the specific judgment condition for determining whether the second video frame can reuse the position of the target area in the first video frame.
  • the change threshold refers to a preset threshold, which can be set according to the actual situation. For example, when the change parameter is represented by the movement of feature points in the second video frame relative to the corresponding feature points in the first video frame, the change threshold can be a distance threshold, for example set to 0.8.
  • the change parameter may be compared with the change threshold; if the change parameter is greater than the change threshold, it is determined that the second video frame does not satisfy the multiplexing condition and re-tracking is required, so optical flow tracking is performed on the second video frame based on the initial feature points determined from the first position information to obtain the target feature points; otherwise, it is determined that the second video frame satisfies the multiplexing condition.
  • determining the change parameter of the second video frame relative to the first video frame includes: extracting a first feature point in the first video frame; performing optical flow tracking on the second video frame according to the first feature point to determine a second feature point; and determining the moving distance between the second feature point and the first feature point as the change parameter.
  • the above-mentioned first feature point may be a corner point detected on the first video frame using the Features from Accelerated Segment Test (FAST) corner detection algorithm; a corner point refers to an extreme point, that is, a point whose attributes are particularly prominent in some respect.
  • the detected object may be the entire first video frame, or may only be the above-mentioned target area, which is not particularly limited.
  • the FAST corner detection algorithm can be used to extract the first feature points from the first video frame, and the first feature points can be used as the input of the KLT (Kanade-Lucas-Tomasi) optical flow tracking algorithm to obtain the successfully tracked second feature points as output. Since there can be multiple first feature points and second feature points, the average moving distance between the first feature points and the second feature points can be calculated, and this average is determined as the change parameter.
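  • A sketch of this stillness check is shown below. The FAST threshold is an illustrative assumption; the 0.8 change threshold follows the distance-threshold example given earlier.

```python
import cv2
import numpy as np

def change_parameter(prev_gray, cur_gray):
    """Mean displacement of FAST corners tracked from the previous frame
    into the current frame, or None if nothing could be tracked."""
    fast = cv2.FastFeatureDetector_create(threshold=20)  # illustrative threshold
    keypoints = fast.detect(prev_gray, None)
    if not keypoints:
        return None
    p0 = np.float32([kp.pt for kp in keypoints]).reshape(-1, 1, 2)
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, p0, None)
    ok = status.ravel() == 1
    if not ok.any():
        return None
    return float(np.linalg.norm((p1[ok] - p0[ok]).reshape(-1, 2), axis=1).mean())

def satisfies_multiplexing(prev_gray, cur_gray, change_threshold=0.8):
    """Multiplexing condition: change parameter <= change threshold."""
    d = change_parameter(prev_gray, cur_gray)
    return d is not None and d <= change_threshold
```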
  • the target tracking method may further include: if it is determined based on the change parameter that the second video frame satisfies the multiplexing condition, determining the first position information as the second position information of the target area in the second video frame. If the change parameter is less than or equal to the change threshold, it means that the camera is basically stationary and the positions of the target area in the two adjacent video frames are similar; the second video frame then satisfies the multiplexing condition, and the first position information can be assigned to the second video frame, that is, the position information of the target area is the same in the first video frame and the second video frame.
  • in the above scheme, feature point tracking and fitting are used to determine the position of the target area; when the change or difference between two adjacent video frames in the video is small, the similarity between the two video frames is high, and the next video frame can directly reuse the position information of the target area from the previous video frame without redoing the detection, which saves work and improves computing efficiency.
  • FIG. 2 is a schematic flowchart of another target tracking method provided by an embodiment of the present disclosure. On the basis of the foregoing embodiment, this embodiment further optimizes the foregoing target tracking method. As shown in Figure 2, the method includes:
  • Step 201 Extract the first video frame in the target video, and determine the first position information of the target area in the first video frame.
  • Step 202 Extract the first feature point in the first video frame.
  • Step 203 Perform optical flow tracking on the second video frame according to the first feature point, determine the second feature point, and determine the moving distance between the second feature point and the first feature point as a change parameter.
  • the second video frame is an adjacent video frame of the first video frame in the target video.
  • Step 204 Determine whether the second video frame satisfies the multiplexing condition based on the change parameter, if yes, go to Step 210; otherwise, go to Step 205.
  • the multiplexing condition is that the change parameter is less than or equal to the change threshold. If the change parameter is greater than the change threshold, it is determined that the second video frame does not meet the multiplexing condition, and step 205 is executed; otherwise, it is determined that the second video frame meets the multiplexing condition, and step 210 is executed.
  • Step 205 Sample the edge contour of the target area in the first video frame according to the first position information to determine the initial feature points.
  • sampling the edge contour of the target area in the first video frame according to the first position information to determine the initial feature points may include: when the target area is an elliptical area, representing the target area in polar coordinates according to the first position information to obtain an elliptical contour, and sampling the elliptical contour at preset polar angle intervals to obtain the initial feature points.
  • Step 206 Perform optical flow tracking on the second video frame according to the initial feature points determined by the first position information to obtain target feature points.
  • Step 207 Check whether the coverage range of the target feature points on the edge contour of the target area is greater than or equal to the preset range. If the coverage range is greater than or equal to the preset range, step 208 is performed; otherwise, step 209 is performed.
  • Step 208 Fit the target feature points to obtain second position information of the target area in the second video frame.
  • Step 209 Determine the second position information of the target area in the second video frame by detecting the second video frame.
  • Step 210 Determine the first position information as the second position information of the target area in the second video frame.
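  • Pieced together, steps 201-210 correspond to a per-frame loop like the sketch below, which reuses the hypothetical helpers sketched earlier (detect_ellipse, satisfies_multiplexing, sample_ellipse, track_points, coverage_ok, ransac_ellipse); error handling is kept minimal.

```python
import cv2

def track_video(path):
    cap = cv2.VideoCapture(path)
    ok, prev = cap.read()
    if not ok:
        return []
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    ellipse = detect_ellipse(prev)                          # step 201
    positions = [ellipse]
    while True:
        ok, cur = cap.read()
        if not ok:
            break
        cur_gray = cv2.cvtColor(cur, cv2.COLOR_BGR2GRAY)
        if not satisfies_multiplexing(prev_gray, cur_gray):      # steps 202-204
            pts = sample_ellipse(ellipse)                        # step 205
            _src, dst = track_points(prev_gray, cur_gray, pts)   # step 206
            if coverage_ok(dst, ellipse[0]):                     # step 207
                ellipse = ransac_ellipse(dst)                    # step 208
            else:
                ellipse = detect_ellipse(cur)                    # step 209
        # else: step 210 -- reuse the previous position unchanged
        positions.append(ellipse)
        prev_gray = cur_gray
    cap.release()
    return positions
```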
  • FIG. 3 is a schematic diagram of target tracking provided by an embodiment of the present disclosure.
  • the tracking process for a video may include: Step 21, performing ellipse detection on the previous frame.
  • the previous frame may be the first frame of the video.
  • any ellipse detection method may be used for detection to determine the ellipse position of the previous frame.
  • Step 22 Determine whether the current frame passes the stillness detection; if yes, go to Step 26; otherwise, go to Step 23. In the stillness detection, FAST corner detection is performed on the previous frame, and KLT optical flow tracking is performed on the current frame based on the corners of the previous frame; the detection passes when the average movement of the tracked corners does not exceed the change threshold.
  • Step 23 Sample the elliptical contour by polar angle, and track the sampling points.
  • Step 24 Determine whether the sampling point range meets the requirements; if yes, go to Step 25; otherwise, go to Step 27. If the distribution of successfully tracked points covers more than 3/4 of the circumference of the ellipse, it is determined that the sampling point range meets the requirements and step 25 is performed; otherwise, step 27 is performed.
  • Step 25 Perform RANSAC fitting. The ellipse fitting is performed on the tracked feature points using RANSAC, that is, 5 points are randomly sampled from the point set each time until the ellipse model with the largest inlier set is found.
  • Step 26 The processing of the current frame ends and the next frame begins.
  • by combining optical flow tracking of feature points, stillness detection on the video frame sequence, and quality discrimination of the ellipse tracking, the ellipse tracking of each image frame in the video can be completed quickly and accurately without detecting every video frame, reducing the amount of calculation and ensuring the real-time performance of target tracking.
  • the target tracking solution extracts the first video frame in the target video and determines the first position information of the target area in the first video frame; performs optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video; and fits the target feature points to obtain the second position information of the target area in the second video frame. By adopting the above technical solution, on the basis of detecting the target area in one video frame of the video, the position of the target area in other video frames can be determined more accurately through feature point tracking and fitting, avoiding the need to run detection on every video frame; this improves the computational efficiency of tracking and enables fast and accurate target recognition and tracking for each image frame in the video.
  • FIG. 4 is a schematic structural diagram of a target tracking apparatus provided by an embodiment of the present disclosure.
  • the apparatus may be implemented by software and/or hardware, and may generally be integrated into an electronic device.
  • the device includes:
  • the first position module 301 is used to extract the first video frame in the target video, and determine the first position information of the target area in the first video frame;
  • a tracking module 302, configured to perform optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video;
  • the second position module 303 is configured to fit the target feature points to obtain second position information of the target area in the second video frame.
  • the tracking module 302 is used for:
  • the edge contour of the target area in the first video frame is sampled according to the first position information to determine initial feature points.
  • the tracking module 302 is used for:
  • when the target area is an elliptical area, an elliptical contour is obtained by representing the target area in polar coordinates according to the first position information, wherein the first position information includes the vertex coordinates and/or the center point coordinates of the target area in the first video frame; the elliptical contour is sampled at preset polar angle intervals to obtain the initial feature points.
  • the second location module 303 is used for:
  • if the coverage range of the target feature points on the edge contour of the target area is greater than or equal to a preset range, fit the target feature points to obtain the second position information of the target area in the second video frame.
  • the device further includes a detection module for:
  • if the coverage range of the target feature points on the edge contour of the target area is smaller than the preset range, determine the second position information of the target area in the second video frame by detecting the second video frame.
  • the device further includes a multiplexing judging module, configured to: after the first position information of the target area in the first video frame is determined, determine a change parameter of the second video frame relative to the first video frame, wherein optical flow tracking is performed on the second video frame according to the initial feature points determined from the first position information to obtain the target feature points if it is determined based on the change parameter that the second video frame does not satisfy the multiplexing condition.
  • the multiplexing judgment module is specifically used for: extracting a first feature point in the first video frame; performing optical flow tracking on the second video frame according to the first feature point to determine a second feature point; and determining the moving distance between the second feature point and the first feature point as the change parameter; wherein the multiplexing condition is that the change parameter is less than or equal to a change threshold.
  • the device also includes a multiplexing module for:
  • if it is determined based on the change parameter that the second video frame satisfies the multiplexing condition, the first position information is determined as the second position information of the target area in the second video frame.
  • the target tracking device provided by the embodiment of the present disclosure can execute the target tracking method provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the execution method.
  • An embodiment of the present disclosure also provides a computer program product, including a computer program/instruction, when the computer program/instruction is executed by a processor, the target tracking method provided by any embodiment of the present disclosure is implemented.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. Referring to FIG. 5, it shows a schematic structural diagram of an electronic device 400 suitable for implementing an embodiment of the present disclosure.
  • the electronic device 400 in the embodiment of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a personal digital assistant (PDA), a tablet computer (PAD), a portable multimedia player (PMP), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as stationary terminals such as digital televisions (TVs) and desktop computers.
  • the electronic device shown in FIG. 5 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
  • the electronic device 400 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 401, which may perform various appropriate actions and processes based on a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 408 into a random access memory (RAM) 403.
  • the RAM 403 also stores various programs and data required for the operation of the electronic device 400.
  • the processing device 401, the ROM 402, and the RAM 403 are connected to each other through a bus 404.
  • An Input/Output (I/O) interface 405 is also connected to the bus 404.
  • the following devices can be connected to the I/O interface 405: input devices 406 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; output devices 407 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; storage devices 408 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 409.
  • Communication means 409 may allow electronic device 400 to communicate wirelessly or by wire with other devices to exchange data.
  • while FIG. 5 shows the electronic device 400 having various means, it should be understood that not all of the illustrated means are required to be implemented or provided; more or fewer means may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 409, or from the storage device 408, or from the ROM 402.
  • when the computer program is executed by the processing device 401, the above-mentioned functions defined in the target tracking method of the embodiments of the present disclosure are executed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, RAM, ROM, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the program code embodied on the computer readable medium can be transmitted by any suitable medium, including but not limited to: electric wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the above.
  • the client and server can communicate using any currently known or future-developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include a local area network (LAN), a wide area network (WAN), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad-hoc peer-to-peer networks), as well as any currently known or future-developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device: extracts the first video frame in the target video, and determines the first position information of the target area in the first video frame; performs optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video; and fits the target feature points to obtain the second position information of the target area in the second video frame.
  • Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user computer through any kind of network, including a LAN or WAN, or may be connected to an external computer (eg, using an Internet service provider to connect through the Internet).
  • each block in the flowchart or block diagrams may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in software or in hardware, and in some cases the name of a unit does not constitute a limitation on the unit itself.
  • exemplary types of hardware logic components that may be used include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
  • the present disclosure provides a target tracking method, including: extracting the first video frame in the target video, and determining the first position information of the target area in the first video frame; performing optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video; and fitting the target feature points to obtain the second position information of the target area in the second video frame.
  • determining an initial feature point according to the first position information includes:
  • the edge contour of the target area in the first video frame is sampled according to the first position information to determine initial feature points.
  • the sampling of the edge contour of the target area in the first video frame according to the first position information to determine the initial feature points includes: when the target area is an elliptical area, obtaining an elliptical contour by representing the target area in polar coordinates according to the first position information, wherein the first position information includes the vertex coordinates and/or the center point coordinates of the target area in the first video frame; and sampling the elliptical contour at preset polar angle intervals to obtain the initial feature points.
  • the fitting of the target feature points to obtain the second position information of the target area in the second video frame includes: if the coverage range of the target feature points on the edge contour of the target area is greater than or equal to a preset range, fitting the target feature points to obtain the second position information of the target area in the second video frame.
  • the target tracking method provided by the present disclosure further includes:
  • if the coverage range of the target feature points on the edge contour of the target area is smaller than the preset range, determining the second position information of the target area in the second video frame by detecting the second video frame.
  • the method further includes: determining a change parameter of the second video frame relative to the first video frame; and, if it is determined based on the change parameter that the second video frame does not satisfy the multiplexing condition, performing optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain the target feature points.
  • the determining of the change parameter of the second video frame relative to the first video frame includes: extracting a first feature point in the first video frame; performing optical flow tracking on the second video frame according to the first feature point to determine a second feature point; and determining the moving distance between the second feature point and the first feature point as the change parameter; wherein the multiplexing condition is that the change parameter is less than or equal to a change threshold.
  • the target tracking method provided by the present disclosure further includes:
  • if it is determined based on the change parameter that the second video frame satisfies the multiplexing condition, the first position information is determined as the second position information of the target area in the second video frame.
  • the present disclosure provides a target tracking device, including:
  • a first position module for extracting the first video frame in the target video, and determining the first position information of the target area in the first video frame;
  • a tracking module, configured to perform optical flow tracking on the second video frame according to the initial feature points determined from the first position information to obtain target feature points, wherein the second video frame is a video frame adjacent to the first video frame in the target video;
  • a second position module configured to fit the target feature points to obtain second position information of the target area in the second video frame.
  • the tracking module is configured to:
  • the edge contour of the target area in the first video frame is sampled according to the first position information to determine initial feature points.
  • the tracking module is configured to:
  • when the target area is an elliptical area, an elliptical contour is obtained by representing the target area in polar coordinates according to the first position information, wherein the first position information includes the vertex coordinates and/or the center point coordinates of the target area in the first video frame; the elliptical contour is sampled at preset polar angle intervals to obtain the initial feature points.
  • the second location module is used for:
  • if the coverage range of the target feature points on the edge contour of the target area is greater than or equal to a preset range, fit the target feature points to obtain the second position information of the target area in the second video frame.
  • the device further includes a detection module for:
  • if the coverage range of the target feature points on the edge contour of the target area is smaller than the preset range, determine the second position information of the target area in the second video frame by detecting the second video frame.
  • the device further includes a multiplexing judgment module, configured to: after the first position information of the target area in the first video frame is determined, determine a change parameter of the second video frame relative to the first video frame, wherein optical flow tracking is performed on the second video frame according to the initial feature points determined from the first position information to obtain the target feature points if it is determined based on the change parameter that the second video frame does not satisfy the multiplexing condition.
  • the multiplexing judgment module is specifically configured to: extract a first feature point in the first video frame; perform optical flow tracking on the second video frame according to the first feature point to determine a second feature point; and determine the moving distance between the second feature point and the first feature point as the change parameter; wherein the multiplexing condition is that the change parameter is less than or equal to a change threshold.
  • the device further includes a multiplexing module for:
  • if it is determined based on the change parameter that the second video frame satisfies the multiplexing condition, determine the first position information as the second position information of the target area in the second video frame.
  • the present disclosure provides an electronic device, comprising:
  • a processor; and a memory for storing the processor-executable instructions;
  • the processor is configured to read the executable instructions from the memory, and execute the instructions to implement any one of the target tracking methods provided in the present disclosure.
  • the present disclosure provides a computer-readable storage medium storing a computer program, the computer program being used to execute any one of the target tracking methods provided by the present disclosure.
  • the present disclosure provides a computer program product, including a computer program, which, when executed by a processor, implements the target tracking method as provided in any one of the present disclosure.
  • the present disclosure provides a computer program, the computer program being stored in a computer-readable storage medium; when the computer program is executed by a processor, it implements any one of the target tracking methods provided by the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present disclosure relate to a target tracking method and apparatus, a device, and a medium. The method comprises: extracting a first video frame from a target video, and determining first position information of a target area in the first video frame; performing optical flow tracking on a second video frame according to an initial feature point determined from the first position information, so as to obtain target feature points, the second video frame being a video frame adjacent to the first video frame in the target video; and fitting the target feature points to obtain second position information of the target area in the second video frame. By means of this technical solution, on the basis of detecting the target area of one video frame of a video, the position of the target area in other video frames can be determined more accurately through feature point tracking and fitting, thereby avoiding the detection of every video frame, improving the computational efficiency of tracking, and achieving fast and accurate target recognition and tracking for each image frame in the video.
PCT/CN2022/080977 2021-03-15 2022-03-15 Target tracking method and apparatus, device and medium WO2022194157A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/468,647 US20240005552A1 (en) 2021-03-15 2023-09-15 Target tracking method and apparatus, device, and medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110276358.7A 2021-03-15 2021-03-15 A target tracking method, apparatus, device and medium
CN202110276358.7 2021-03-15

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/080985 Continuation-In-Part WO2022194158A1 (fr) Target tracking method and apparatus, device and medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/468,647 Continuation-In-Part US20240005552A1 (en) 2021-03-15 2023-09-15 Target tracking method and apparatus, device, and medium

Publications (1)

Publication Number Publication Date
WO2022194157A1 (fr)

Family

ID=83240774

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/080977 WO2022194157A1 (fr) Target tracking method and apparatus, device and medium

Country Status (2)

Country Link
CN (1) CN115082515A (fr)
WO (1) WO2022194157A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108492315A (zh) * 2018-02-09 2018-09-04 湖南华诺星空电子技术有限公司 A dynamic face tracking method
CN109598744A (zh) * 2018-11-29 2019-04-09 广州市百果园信息技术有限公司 Video tracking method, apparatus, device and storage medium
CN109919971A (zh) * 2017-12-13 2019-06-21 北京金山云网络技术有限公司 Image processing method and apparatus, electronic device and computer-readable storage medium
CN112258556A (zh) * 2020-10-22 2021-01-22 北京字跳网络技术有限公司 Method and apparatus for tracking a specified area in a video, readable medium and electronic device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4645433B2 (ja) * 2005-12-14 2011-03-09 株式会社デンソー Figure center detection method, ellipse detection method, image recognition device, and control device
JP5526401B2 (ja) * 2009-03-31 2014-06-18 国立大学法人東京農工大学 Ventricular wall information extraction device
JP4990960B2 (ja) * 2009-12-24 2012-08-01 エヌ・ティ・ティ・コムウェア株式会社 Object identification device, object identification method, and object identification program
CN107103323B (zh) * 2017-03-09 2020-06-16 广东顺德中山大学卡内基梅隆大学国际联合研究院 A target recognition method based on image contour features
CN107590453B (zh) * 2017-09-04 2019-01-11 腾讯科技(深圳)有限公司 Processing method, apparatus and device for augmented reality scenes, and computer storage medium
CN107610108B (zh) * 2017-09-04 2019-04-26 腾讯科技(深圳)有限公司 Image processing method and apparatus
CN111429477B (zh) * 2020-04-13 2022-08-26 展讯通信(上海)有限公司 Target tracking method and apparatus, storage medium, and computer device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919971A (zh) * 2017-12-13 2019-06-21 北京金山云网络技术有限公司 Image processing method and apparatus, electronic device and computer-readable storage medium
CN108492315A (zh) * 2018-02-09 2018-09-04 湖南华诺星空电子技术有限公司 A dynamic face tracking method
CN109598744A (zh) * 2018-11-29 2019-04-09 广州市百果园信息技术有限公司 Video tracking method, apparatus, device and storage medium
CN112258556A (zh) * 2020-10-22 2021-01-22 北京字跳网络技术有限公司 Method and apparatus for tracking a specified area in a video, readable medium and electronic device

Also Published As

Publication number Publication date
CN115082515A (zh) 2022-09-20

Similar Documents

Publication Publication Date Title
CN111783626B (zh) Image recognition method and apparatus, electronic device and storage medium
WO2020253616A1 (fr) Method and apparatus for positioning an audio collection device, and speaker recognition method and system
WO2019080702A1 (fr) Image processing method and apparatus
WO2020062494A1 (fr) Image processing method and apparatus
CN111784712A (zh) Image processing method, apparatus, device and computer-readable medium
CN112488095A (zh) Seal image recognition method, apparatus and electronic device
WO2022028253A1 (fr) Positioning model optimization method, positioning method, positioning device, and storage medium
CN111126159A (zh) Method, apparatus, electronic device and medium for real-time pedestrian tracking
CN112085733B (zh) Image processing method, apparatus, electronic device and computer-readable medium
CN112257598B (zh) Method and apparatus for recognizing quadrilaterals in an image, readable medium and electronic device
CN111783632B (zh) Face detection method and apparatus for video streams, electronic device and storage medium
CN112101258A (zh) Image processing method, apparatus, electronic device and computer-readable medium
WO2023020268A1 (fr) Gesture recognition method and apparatus, device and medium
WO2023138540A1 (fr) Edge extraction method and apparatus, electronic device and storage medium
WO2022194145A1 (fr) Method and apparatus for determining photographing position, device and medium
WO2022194158A1 (fr) Target tracking method and apparatus, device and medium
WO2022194157A1 (fr) Target tracking method and apparatus, device and medium
CN110765304A (zh) Image processing method and apparatus, electronic device and computer-readable medium
WO2022105622A1 (fr) Image segmentation method and apparatus, readable medium and electronic device
WO2022052889A1 (fr) Image recognition method and apparatus, electronic device and computer-readable medium
CN114863124A (zh) Model training method, polyp detection method, corresponding apparatus, medium and device
CN111401182B (зh) Image detection method and apparatus for feeding pens
CN110348374B (zh) Vehicle detection method and apparatus, electronic device and storage medium
CN111860209B (zh) Hand recognition method and apparatus, electronic device and storage medium
CN111340813A (зh) Image instance segmentation method and apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22770510

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22770510

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20-02-2024)