CN116668806B - Method and device for adding target tracking mark at playing end - Google Patents

Method and device for adding target tracking mark at playing end

Info

Publication number
CN116668806B
CN116668806B (application CN202310916067.9A)
Authority
CN
China
Prior art keywords
target tracking
information
video
frame
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310916067.9A
Other languages
Chinese (zh)
Other versions
CN116668806A (en)
Inventor
王雪亮
王金龙
Current Assignee (the listed assignees may be inaccurate)
ZTE Intelligent IoT Technology Co Ltd
Original Assignee
ZTE Intelligent IoT Technology Co Ltd
Priority date (the priority date is an assumption and is not a legal conclusion)
Filing date
Publication date
Application filed by ZTE Intelligent IoT Technology Co Ltd
Priority to CN202310916067.9A
Publication of CN116668806A
Application granted
Publication of CN116668806B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83: Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845: Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456: Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/40: Scenes; Scene-specific elements in video content
    • G06V20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/02: Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/161: Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
    • H04L69/162: Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields involving adaptations of sockets based mechanisms
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302: Content synchronisation processes, e.g. decoder synchronisation

Abstract

The invention provides a method and a device for adding a target tracking mark at a playing end, comprising the following steps: acquiring video information and characteristic image information of a vehicle; determining a video key frame according to the video information of the vehicle, and calculating target tracking information data of the characteristic image information in the video key frame, wherein the target tracking information data comprises position information and time stamp information; marking the target tracking information data, and sending it to the front end through the WebSocket communication protocol; separating, at the front end, video stream data and target tracking information data according to a predefined protocol; and playing the video stream data, obtaining the target-tracking frame data through a positioning strategy, and realizing a synchronization mechanism according to the time stamp. The method solves the problems that current target tracking adds marks to targets on the original video after processing, so the marked video must be backed up separately, and multiple video backups are required when different targets are tracked and marked in the same video.

Description

Method and device for adding target tracking mark at playing end
Technical Field
The invention relates to the technical field of video tracking and display, in particular to a method and a device for adding a target tracking mark at a playing end.
Background
In recent years, with the development and interactive integration of big data, cloud computing, artificial intelligence and other fields, concepts such as smart electronic commerce, smart transportation and smart city are attracting attention. As people pursue more intelligent, more convenient and higher-quality lives, and given the great academic value and wide commercial prospects involved, many universities, scientific research institutions and related industries have invested a great deal of manpower, material and financial resources. Artificial intelligence is quietly penetrating various industries and changing our lifestyle. Computer vision is an important branch of the field of artificial intelligence, aiming to study how to let a computer perceive, analyze and process the real world as intelligently as the human visual system. Various computer vision algorithms using images and videos as information carriers have penetrated the daily life of the public, such as face recognition, human-machine interaction, commodity retrieval, intelligent monitoring and visual navigation. Video object tracking, one of the fundamental and important research directions in the field of computer vision, has long been a focus of researchers' attention.
Video object tracking requires continuous localization and scale estimation of the object in subsequent video frames, given the position and scale information of the object of interest in the first frame. However, current target tracking adds marks to the targets on the original video after processing it, so not only must the marked video be backed up separately, but multiple copies of the video must be backed up when the same video is used to track and mark different targets. What is lacking is a method to display the tracked object while the video is played and presented, without changing the video file or adding new files.
Disclosure of Invention
In view of the above, the present invention provides a method and apparatus for adding a target tracking mark at a playing end.
In order to solve the technical problems, the invention adopts the following technical scheme: a method for adding a target tracking mark at a playing end comprises the following steps:
s1: acquiring video information and characteristic image information of a vehicle;
s2: determining a video key frame according to video information of a vehicle, and calculating target tracking information data of characteristic image information in the video key frame, wherein the target tracking information data comprises position information and time stamp information;
s3: marking target tracking information data, and sending the target tracking information data to the front end through a WebSocket communication protocol;
s4: the front end separates video stream data and target tracking information data according to a predefined protocol;
s5: and playing the video stream data, obtaining frame data of target tracking by adopting a positioning strategy, and realizing a synchronization mechanism according to the time stamp.
In the present invention, preferably, the specific steps of determining the video key frame according to the video information of the vehicle are as follows:
the video information comprises a plurality of image frames arranged in time sequence, each image frame corresponds to a plurality of characteristic points in the characteristic image information, and the first image frame is used as the first key frame;
the image frames after the first key frame are traversed, and when the difference value between the characteristic points in an image frame and those of the first key frame is smaller than a preset value, that image frame is determined to be the second key frame.
In the present invention, preferably, the first key frame includes the object of interest and position information of a region of interest; the position information of the region of interest is used by the image acquisition device to perform target tracking, and the first key frame is generated according to the acquired region of interest containing the object of interest, where the region of interest information includes the coordinates of key points in the region and the region's length and width.
In the present invention, preferably, calculating the target tracking information data of the characteristic image information in the video key frame specifically means inputting the characteristic points corresponding to the first key frame and the second key frame into a preset object-of-interest judgment model for processing, and establishing a mapping relation among the image frame, the characteristic points and the target tracking information. The object-of-interest judgment model comprises a first judgment model and a second judgment model: the characteristic points are matched with the position information through the first judgment model to obtain the position information of the video key frame, and after the matching is completed, the second judgment model is called to match the characteristic points with the time stamp information to obtain the time stamp information of the video key frame.
In the present invention, preferably, the position information is set to a frame number of the image frame.
In the present invention, preferably, the specific process of marking the target tracking information data is: the calculated target tracking information data is marked as additional information, and the additional information is added into a supplemental enhancement information frame of the video.
In the present invention, preferably, the supplemental enhancement information frame is a characteristic supported by the H.264 standard.
In the present invention, preferably, the positioning strategy specifically adopts a cascading style sheet method to obtain frame data of target tracking.
The invention has the following advantages and positive effects: according to the method, a video key frame is determined according to the video information of a vehicle, and target tracking information data of the characteristic image information in the video key frame is calculated, the target tracking information data comprising position information and time stamp information; the target tracking information data is marked and sent to the front end through the WebSocket communication protocol; the front end separates video stream data and target tracking information data according to a predefined protocol; the video stream data is played, the target-tracking frame data is obtained through a positioning strategy, and a synchronization mechanism is realized according to the time stamp. This solves the problems that current target tracking adds marks to targets on the original video after processing, so the marked video must be backed up separately, and multiple video backups are required when different targets are tracked and marked in the same video.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
fig. 1 is a flow chart of a method for adding a target tracking mark at a playing end according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It will be understood that when an element is referred to as being "fixed to" another element, it can be directly on the other element or intervening elements may also be present. When a component is considered to be "connected" to another component, it can be directly connected to the other component or intervening components may also be present. When an element is referred to as being "disposed on" another element, it can be directly on the other element or intervening elements may also be present. The terms "vertical," "horizontal," "left," "right," and the like are used herein for illustrative purposes only.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
As shown in fig. 1, the present invention provides a method for adding a target tracking mark at a playing end, which includes the following steps:
s1: acquiring video information and characteristic image information of a vehicle;
s2: determining a video key frame according to video information of a vehicle, and calculating target tracking information data of characteristic image information in the video key frame, wherein the target tracking information data comprises position information and time stamp information;
s3: marking target tracking information data, and sending the target tracking information data to the front end through a WebSocket communication protocol;
s4: the front end separates video stream data and target tracking information data according to a predefined protocol;
s5: and playing the video stream data, obtaining frame data of target tracking by adopting a positioning strategy, and realizing a synchronization mechanism according to the time stamp. The identity of the vehicles is uniquely determined by a reader unit that collects the vehicle's electronic tag radio frequency signals, each corresponding to several video messages.
In this embodiment, further, the specific steps of determining the video key frame according to the video information of the vehicle are as follows:
the video information comprises a plurality of image frames arranged in time sequence, each image frame corresponds to a plurality of characteristic points in the characteristic image information, and the first image frame is used as the first key frame;
the image frames after the first key frame are traversed, and when the difference value between the characteristic points in an image frame and those of the first key frame is smaller than a preset value, that image frame is determined to be the second key frame.
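The key-frame rule above can be sketched in a few lines. Note that the patent does not define the "difference value" metric; using the mean point-to-point Euclidean distance between corresponding feature points is an assumption made here for illustration.

```javascript
// Mean Euclidean distance between corresponding feature points
// (assumed metric for the "difference value" in the text).
function featureDiff(pointsA, pointsB) {
  let sum = 0;
  for (let i = 0; i < pointsA.length; i++) {
    const dx = pointsA[i][0] - pointsB[i][0];
    const dy = pointsA[i][1] - pointsB[i][1];
    sum += Math.hypot(dx, dy);
  }
  return sum / pointsA.length;
}

// The first frame is always the first key frame; the first later frame
// whose difference falls below the preset value becomes the second.
function findSecondKeyFrame(frames, threshold) {
  const first = frames[0];
  for (let i = 1; i < frames.length; i++) {
    if (featureDiff(frames[i].points, first.points) < threshold) {
      return i; // index of the second key frame
    }
  }
  return -1; // no frame was similar enough
}
```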
In this embodiment, further, the first key frame includes the object of interest and position information of a region of interest; the position information of the region of interest is used by the image acquisition device to perform target tracking, and the first key frame is generated according to the acquired region of interest containing the object of interest, where the region of interest information includes the coordinates of key points in the region and the region's length and width. This is a process of extracting the region of interest from the video information so that tracking of the vehicle target is more targeted; it further reduces the hardware storage space occupied during the extraction process and lightens the image-processing load.
In this embodiment, further, calculating the target tracking information data of the characteristic image information in the video key frame specifically means inputting the characteristic points corresponding to the first key frame and the second key frame into a preset object-of-interest judgment model for processing, and establishing a mapping relation among the image frame, the characteristic points and the target tracking information. The object-of-interest judgment model comprises a first judgment model and a second judgment model: the first judgment model matches the characteristic points with the position information to obtain the position information of the video key frame, and after matching is completed, the second judgment model is called to match the characteristic points with the time stamp information to obtain the time stamp information of the video key frame.
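One way to picture the mapping relation described above is a record per key frame linking frame number, feature points, and the tracking data (box position plus time stamp). The two "judgment models" are reduced here to plain callback functions; this is an illustrative data shape under that simplification, not the patent's actual models.

```javascript
// Build the image frame -> (feature points, tracking info) mapping.
// positionModel and timestampModel stand in for the first and second
// judgment models of the text (hypothetical signatures).
function buildTrackingMap(keyFrames, positionModel, timestampModel) {
  const map = new Map();
  for (const kf of keyFrames) {
    map.set(kf.frameNo, {
      points: kf.points,
      position: positionModel(kf.points), // first model: points -> box
      pts: timestampModel(kf.frameNo),    // second model: frame -> timestamp
    });
  }
  return map;
}
```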
In this embodiment, further, the position information is set to the frame number of the image frame.
In this embodiment, further, the specific process of marking the target tracking information data is: the calculated target tracking information data is marked as additional information, and the additional information is added into a supplemental enhancement information frame of the video.
In this embodiment, further, the supplemental enhancement information frame is a characteristic supported by the H.264 standard.
In this embodiment, further, the positioning strategy specifically uses a cascading style sheet method to obtain the target-tracking frame data. Specifically, the cascading style sheet defines the initial position of the target-tracked image frame: the div tag is given relative positioning, and the dynamic picture in the img tag is given absolute positioning together with its initial position.
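The div/img positioning described above can be sketched as a small style generator: the container div (wrapping the video element) is relatively positioned, and the marker img is absolutely positioned inside it at the tracked box's coordinates. Element roles and the box shape are illustrative assumptions.

```javascript
// Produce inline CSS for the overlay, following the relative-container /
// absolute-marker scheme in the text. box = {x, y, w, h} in pixels
// (hypothetical shape for the tracked position information).
function markerStyles(box) {
  return {
    // the div wrapping the <video> anchors the overlay
    container: 'position: relative;',
    // the img marker sits at the tracked target's box
    marker: `position: absolute; left: ${box.x}px; top: ${box.y}px; ` +
            `width: ${box.w}px; height: ${box.h}px;`,
  };
}
```

At playback time the front end would apply `marker` to the img element whenever a tracking record's time stamp matches the current video time, which is how the time-stamp synchronization mechanism of S5 can be realized.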
The working principle and process of the invention are as follows: the image acquisition device acquires video and tracking-target information; the target tracking unit performs AI tracking on the video according to the tracking-target information, separates the position information of the tracked target from the AI tracking result, and adds this information into a supplemental enhancement information (SEI) frame of the video. SEI frames belong to the code stream and provide a way to add additional information to the video code stream; this is one of the characteristics of the H.264 standard. SEI is not an essential part of the video decoding process, yet it can be integrated into the video stream for playback and display: during playback, the additional information is extracted and marked on each frame's picture for display.
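Extracting the additional information at playback time means pulling SEI payloads out of the H.264 byte stream. The sketch below scans an Annex-B stream for NAL units of type 6 (SEI). Emulation-prevention bytes, four-byte start codes, and the SEI payload-type/size syntax are deliberately skipped, so this is a simplified illustration rather than a conformant parser.

```javascript
// Return the raw payloads of SEI NAL units (nal_unit_type == 6) found
// in an Annex-B H.264 byte stream. Simplified: only 3-byte start codes,
// no emulation-prevention handling.
function extractSeiPayloads(stream) {
  const payloads = [];
  const starts = [];
  // locate 0x000001 start codes
  for (let i = 0; i + 2 < stream.length; i++) {
    if (stream[i] === 0 && stream[i + 1] === 0 && stream[i + 2] === 1) {
      starts.push(i + 3);
    }
  }
  for (let s = 0; s < starts.length; s++) {
    const begin = starts[s];
    const end = s + 1 < starts.length ? starts[s + 1] - 3 : stream.length;
    const nalType = stream[begin] & 0x1f; // low 5 bits of the NAL header
    if (nalType === 6) {
      payloads.push(stream.subarray(begin + 1, end));
    }
  }
  return payloads;
}
```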
The device is installed on the equipment that records video and collects characteristic target information, and can specifically be divided into a reader unit for collecting radio-frequency signals from the vehicle's electronic tag and a dedicated camera for collecting target image information.
According to the method, a video key frame is determined according to the video information of a vehicle, and target tracking information data of the characteristic image information in the video key frame is calculated, the target tracking information data comprising position information and time stamp information; the target tracking information data is marked and sent to the front end through the WebSocket communication protocol; the front end separates video stream data and target tracking information data according to a predefined protocol; the video stream data is played, the target-tracking frame data is obtained through a positioning strategy, and a synchronization mechanism is realized according to the time stamp. This solves the problems that current target tracking adds marks to targets on the original video after processing, so the marked video must be backed up separately, and multiple video backups are required when different targets are tracked and marked in the same video.
The foregoing describes the embodiments of the present invention in detail, but the description is only a preferred embodiment of the present invention and should not be construed as limiting the scope of the invention. All equivalent changes and modifications within the scope of the present invention are intended to be covered by this patent.

Claims (7)

1. A method for adding a target tracking mark at a playing end, characterized by comprising the following steps:
s1: acquiring video information and characteristic image information of a vehicle;
s2: determining a video key frame according to video information of a vehicle, and calculating target tracking information data of characteristic image information in the video key frame, wherein the target tracking information data comprises position information and time stamp information;
s3: marking target tracking information data, and sending the target tracking information data to the front end through a WebSocket communication protocol;
s4: the front end separates video stream data and target tracking information data according to a predefined protocol;
s5: playing video stream data, obtaining frame data of target tracking by adopting a positioning strategy, and realizing a synchronization mechanism according to a time stamp;
the specific steps of determining the video key frames according to the video information of the vehicle are as follows:
the video information comprises a plurality of image frames arranged in time sequence, each image frame corresponds to a plurality of characteristic points in the characteristic image information, and the first image frame is used as the first key frame;
traversing the image frames after the first key frame, and when the difference value between the characteristic points in an image frame and those of the first key frame is smaller than a preset value, determining that image frame to be the second key frame; wherein calculating the target tracking information data of the characteristic image information in the video key frame specifically means inputting the characteristic points corresponding to the first key frame and the second key frame into a preset object-of-interest judgment model for processing and establishing a mapping relation among the image frame, the characteristic points and the target tracking information, the object-of-interest judgment model comprising a first judgment model and a second judgment model, the characteristic points being matched with the position information through the first judgment model to obtain the position information of the video key frame, and after the matching is completed, the second judgment model being called to match the characteristic points with the time stamp information to obtain the time stamp information of the video key frame.
2. The method for adding a target tracking mark at a playing end according to claim 1, wherein the first key frame includes the object of interest and position information of a region of interest; the position information of the region of interest is used by the image acquisition device to perform target tracking, and the first key frame is generated according to the acquired region of interest containing the object of interest, the region of interest including coordinates of key points in the region and the region's length and width.
3. The method for adding a target tracking mark at a playback end according to claim 1, wherein said position information is set to a frame number of an image frame.
4. The method for adding a target tracking mark at a playing end according to claim 1, wherein the specific process of marking the target tracking information data is: the calculated target tracking information data is marked as additional information, and the additional information is added into a supplemental enhancement information frame of the video.
5. The method for adding a target tracking mark at a playing end according to claim 4, wherein the supplemental enhancement information frame is a characteristic supported by the H.264 standard.
6. The method for adding a target tracking mark to a playing end according to claim 1, wherein the positioning strategy specifically adopts a cascading style sheet method to obtain frame data of target tracking.
7. A device for adding a target tracking mark at a playing end, characterized in that it adopts the method for adding a target tracking mark at a playing end according to any one of claims 1 to 6.
CN202310916067.9A 2023-07-25 2023-07-25 Method and device for adding target tracking mark at playing end Active CN116668806B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310916067.9A CN116668806B (en) 2023-07-25 2023-07-25 Method and device for adding target tracking mark at playing end

Publications (2)

Publication Number Publication Date
CN116668806A (en) 2023-08-29
CN116668806B (en) 2023-10-27

Family

ID=87724375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310916067.9A Active CN116668806B (en) 2023-07-25 2023-07-25 Method and device for adding target tracking mark at playing end

Country Status (1)

Country Link
CN (1) CN116668806B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004046647A (en) * 2002-07-12 2004-02-12 Univ Waseda Method and device for tracking moving object based on dynamic image data
FR3070766A1 (en) * 2017-09-01 2019-03-08 Thales IMPROVED RADAR SYSTEM FOR ENHANCED TRACKING
CN110868600A (en) * 2019-11-11 2020-03-06 腾讯云计算(北京)有限责任公司 Target tracking video plug-flow method, display method, device and storage medium
CN112905824A (en) * 2021-02-08 2021-06-04 智慧眼科技股份有限公司 Target vehicle tracking method and device, computer equipment and storage medium
CN113760091A (en) * 2021-08-13 2021-12-07 江苏仁哆多智能化科技有限公司 Mobile terminal perception computing technology and Internet of things technology application system
CN115623264A (en) * 2022-10-20 2023-01-17 上海哔哩哔哩科技有限公司 Live stream subtitle processing method and device and live stream playing method and device
CN116320506A (en) * 2023-03-13 2023-06-23 苏州青林文化传播有限公司 Stereoscopic interaction service management method for film and television videos


Also Published As

Publication number Publication date
CN116668806A (en) 2023-08-29

Similar Documents

Publication Publication Date Title
Ma et al. A saliency prior context model for real-time object tracking
CN109961051B (en) Pedestrian re-identification method based on clustering and block feature extraction
CN107004271B (en) Display method, display apparatus, electronic device, computer program product, and storage medium
CN102077580B (en) Display control device, display control method
US8620026B2 (en) Video-based detection of multiple object types under varying poses
CN110189333B (en) Semi-automatic marking method and device for semantic segmentation of picture
CN112669349A (en) Passenger flow statistical method, electronic equipment and storage medium
Zhang et al. Coarse-to-fine object detection in unmanned aerial vehicle imagery using lightweight convolutional neural network and deep motion saliency
CN110458115B (en) Multi-frame integrated target detection algorithm based on time sequence
CN112132103A (en) Video face detection and recognition method and system
CN108876672A (en) A kind of long-distance education teacher automatic identification image optimization tracking and system
Song et al. A novel deep learning network for accurate lane detection in low-light environments
Zhang et al. Fine-grained-based multi-feature fusion for occluded person re-identification
CN113572981B (en) Video dubbing method and device, electronic equipment and storage medium
CN116668806B (en) Method and device for adding target tracking mark at playing end
CN111310595B (en) Method and device for generating information
CN109886996B (en) Visual tracking optimization method
CN103974074A (en) Education video and lantern slide synchronization method
CN112215205B (en) Target identification method and device, computer equipment and storage medium
CN114067356B (en) Pedestrian re-recognition method based on combined local guidance and attribute clustering
Chen et al. Surveillance video summarisation by jointly applying moving object detection and tracking
Liu et al. Fast tracking via spatio-temporal context learning based on multi-color attributes and pca
CN115019241A (en) Pedestrian identification and tracking method and device, readable storage medium and equipment
CN114004854A (en) System and method for processing and displaying slice image under microscope in real time
CN113378765A (en) Intelligent statistical method and device for advertisement attention crowd and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant