CN116389855A - Video tagging method based on OCR - Google Patents

Video tagging method based on OCR

Info

Publication number
CN116389855A
CN116389855A
Authority
CN
China
Prior art keywords
video
image
ocr
analysis
tag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310639256.6A
Other languages
Chinese (zh)
Inventor
杨龑骄
田国彦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kuangzhi Zhongke Beijing Technology Co ltd
Original Assignee
Kuangzhi Zhongke Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kuangzhi Zhongke Beijing Technology Co ltd filed Critical Kuangzhi Zhongke Beijing Technology Co ltd
Priority to CN202310639256.6A
Publication of CN116389855A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84Generation or processing of descriptive data, e.g. content descriptors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/28Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/49Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/153Segmentation of character regions using recognition of characters or words
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84Generation or processing of descriptive data, e.g. content descriptors
    • H04N21/8405Generation or processing of descriptive data, e.g. content descriptors represented by keywords
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Abstract

The invention discloses an OCR-based video tagging method in which an input on-site command video and an on-site radar video are analyzed separately: the command video is defined as the left-path video and the radar video as the right-path video. Both videos are analyzed in segments, the length of each segment is set through front-end configuration, a video tag is generated for each time interval, and the uniqueness of each tag is established by analysis against the records of historical intervals. The method begins by processing the left-path video: step one, a video frame sequence is continuously acquired, and each time interval T1 (T1 ≥ 1 min) is set as one video analysis unit, yielding S video analysis units. When the method is implemented, each video unit obtains a unique tag, enabling quick preview of key events and saving storage space.

Description

Video tagging method based on OCR
Technical Field
The invention belongs to the technical field of intelligent video analysis, and particularly relates to an OCR-based video tagging method.
Background
A video tag is a short phrase that describes the features of a video; tagging a video helps users retrieve its content quickly and efficiently. Existing tag-generation methods rely mainly on manual annotation, while online tag-generation methods typically start from image, video, or speech-and-text understanding. From the image perspective, frames are extracted from the video to obtain pictures, each picture is annotated, and the per-image tags are finally integrated into a video tag. From the video perspective, tags are obtained with video-understanding methods.
In actual command operations, a command video image and a radar video image are usually combined into one video stream by a video acquisition device, and this video contains a large amount of useless information. The combined video data therefore needs to be tagged so that more of the useful video information is retained.
Disclosure of Invention
Accordingly, the present invention has been made to solve the above-mentioned problems in the prior art by providing an OCR-based video tagging method.
In order to achieve the above object, the present invention provides the following technical solutions:
In the OCR-based video tagging method, the input on-site command video and radar video are analyzed separately: the command video is defined as the left-path video and the radar video as the right-path video. Both videos are analyzed in segments, the length of each segment can be set through front-end configuration, a video tag is generated for each time interval, and the uniqueness of each tag is established by analysis against the records of historical intervals.
Further, the specific steps are as follows:
Left-path video processing:
Step one: a video frame sequence is continuously acquired, and each time interval T1 (T1 ≥ 1 min) is set as one video analysis unit, yielding S video analysis units;
Step two: in the Si-th video analysis unit, one frame image Ik is taken every N seconds for edge-feature extraction, so the Si-th unit yields T1 × 60/N image features containing edge information. Specifically: the image Ik is first converted from a color image to a gray image Gk; the edges of Gk are then extracted, where the edge-extraction algorithm may first use conventional prior-art operators such as Sobel or Canny to obtain a binarized feature image;
Step three: each image feature containing edge information is normalized. Specifically, the number of binarized feature points is counted and recorded as Numk, and the feature value is normalized as
fk = Numk / (w × h),
where w is the image width and h is the image height, yielding a set of T1 × 60/N-dimensional vectors ArrayL_Si;
Step four: each video unit is analyzed by computing the Euclidean distance between the current ArrayL_Si vector and each vector in the history queue ArrayL_H. If the distance is greater than a threshold T, FlagL = 0 is set; if it is less than or equal to T, FlagL = 1 is set. The current ArrayL_Si is then added to the history queue ArrayL_H as the judgment basis for the next round.
Further, the specific steps are as follows:
Right-path video processing:
s1, continuously collecting a right-path video frame sequence, T2,
Figure SMS_4
obtaining M right-path video analysis units for one video analysis unit;
s2, in a Mi video analysis unit, taking a frame of image Pk every N seconds to obtain T2 x 60/N pieces of image data to be analyzed;
s3, T1 x 60/N-dimensional vector array based on left-path video analysis s The RN vectors in the samples are recorded as
Figure SMS_5
For ArrayR N The vectors are ordered from big to small;
s4, selecting the first Q ArrayRs N OCR recognition is carried out on the image corresponding to the vector value;
s5, merging the recognition results based on a statistical method for the Q recognition results obtained by recognition, namely, performing comparative analysis on the recognition results of the OCR at the corresponding positions, and recording the statistical results in a form of a table;
s6, comparing the historical OCR information with the historical OCR information, if the similarity is larger than a threshold D, setting flagR=0, and setting the similarity smaller than or equal to D, setting flagR=1, and simultaneously adding the current OCR recognition information into a historical information queue to be used as a judgment basis of the next round.
Further, the specific steps are as follows:
Data fusion processing:
The analyses of the left-path and right-path videos yield a result value for each configured video unit, and the next stage of analysis proceeds from these values, implemented as follows:
the state values of the left-path video flag FlagL and the right-path video flag FlagR are obtained;
an AND operation is performed on the state values of FlagL and FlagR; when the result is 1, the video tag is a unique tag;
the character information of the unique tag obtained by OCR is combined into a character string, and the video unit is then renamed to generate a tagged video file.
Further, the interval of each video segment may be 5 minutes, 10 minutes, or 15 minutes.
The invention has the following advantages:
when the method is implemented, each video unit obtains the unique label, and the purposes of quickly previewing the key event and saving more storage space are achieved.
Drawings
Fig. 1 is a flow chart of an OCR-based video tagging method provided in some embodiments of the present invention.
Detailed Description
Other advantages and benefits of the present invention will become apparent to those skilled in the art from the following detailed description, which describes, by way of illustration, certain specific embodiments but not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
Example 1
As shown in fig. 1, in the OCR-based video tagging method according to the embodiment of the first aspect of the present invention, the input on-site command video and radar video are analyzed separately: the command video is defined as the left-path video and the radar video as the right-path video. To tag the two videos more precisely, each is analyzed in segments, and the length of each segment can be set through front-end configuration; the time interval is typically 5, 10, or 15 minutes, so one video tag is generated per interval, and the uniqueness of each tag is established by analysis against the records of historical intervals. The method specifically includes the following steps:
Left-path video processing:
A video frame sequence is continuously acquired, and each time interval T1 (T1 ≥ 1 min) is set as one video analysis unit, yielding S video analysis units;
In the Si-th video analysis unit, one frame image Ik is taken every N seconds for edge-feature extraction, so the Si-th unit yields T1 × 60/N image features containing edge information. Specifically: the image Ik is first converted from a color image to a gray image Gk; the edges of Gk are then extracted, where the edge-extraction algorithm may first use conventional prior-art operators such as Sobel or Canny to obtain a binarized feature image;
Each image feature containing edge information is then normalized: the number of binarized feature points is counted and recorded as Numk, and the feature value is normalized as
fk = Numk / (w × h),
where w is the image width and h is the image height, yielding a set of T1 × 60/N-dimensional vectors ArrayL_Si.
Each video unit is analyzed by computing the Euclidean distance between the current ArrayL_Si vector and each vector in the history queue ArrayL_H. If the distance is greater than a threshold T, FlagL = 0 is set; if it is less than or equal to T, FlagL = 1 is set. The current ArrayL_Si is then added to the history queue ArrayL_H as the judgment basis for the next round.
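For illustration only, the left-path processing above can be sketched in Python with OpenCV and NumPy. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the choice of Canny as the edge operator, its thresholds, and the rule that any history distance above T clears the flag (the patent does not say how distances to multiple history vectors are combined) are all illustrative.

```python
import cv2
import numpy as np

def extract_unit_vector(frames, fps, n_seconds):
    """Sample one frame every N seconds, convert to gray, extract a
    binarized edge image, and normalize the edge-point count by w*h."""
    step = max(1, int(fps * n_seconds))
    features = []
    for k in range(0, len(frames), step):
        gray = cv2.cvtColor(frames[k], cv2.COLOR_BGR2GRAY)   # graying: color -> gray image Gk
        edges = cv2.Canny(gray, 100, 200)                    # binarized edge feature image
        h, w = edges.shape
        features.append(cv2.countNonZero(edges) / (w * h))   # fk = Numk / (w x h)
    return np.array(features)                                # ArrayL_Si: one value per sampled frame

def flag_left(array_l_si, history, threshold_t):
    """Euclidean distance of the current unit vector against each history
    vector; FlagL = 0 if any distance exceeds T, else 1 (combining rule assumed)."""
    flag_l = 1
    for past in history:
        if np.linalg.norm(array_l_si - past) > threshold_t:
            flag_l = 0
            break
    history.append(array_l_si)  # judgment basis for the next round
    return flag_l
```

For example, with T1 = 5 minutes and N = 10 seconds, each ArrayL_Si is a 5 × 60 / 10 = 30-dimensional vector.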
Right-path video processing:
A right-path video frame sequence is continuously acquired, and each time interval T2 (T2 ≥ 1 min) is set as one video analysis unit, yielding M right-path video analysis units;
In the Mi-th video analysis unit, one frame image Pk is taken every N seconds, yielding T2 × 60/N images to be analyzed;
Based on the T1 × 60/N-dimensional vector ArrayL_Si from the left-path video analysis, the RN vector values in the samples are recorded as ArrayR_N, and the values in ArrayR_N are sorted in descending order;
The first Q images corresponding to the largest ArrayR_N vector values are selected for OCR recognition;
the Q OCR recognition results obtained by recognition are combined based on a statistical method, and the recognition results are specifically: performing comparative analysis on OCR recognition results of corresponding positions, and recording statistical results in a form of a table;
The merged result is compared with the historical OCR information; if the similarity is greater than a threshold D, FlagR = 0 is set, and if it is less than or equal to D, FlagR = 1 is set; the current OCR recognition information is then added to the history queue as the judgment basis for the next round.
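A corresponding sketch of the right-path steps, again illustrative only: pytesseract stands in for the unspecified OCR engine, a per-position majority vote stands in for the "statistical method" of the merging step, and difflib's ratio stands in for the unspecified similarity measure used against the history.

```python
from collections import Counter
from difflib import SequenceMatcher

import pytesseract  # assumed OCR engine; the patent names none

def ocr_top_q(images, scores, q):
    """Rank frames by their feature score in descending order and OCR the top Q."""
    ranked = sorted(zip(scores, images), key=lambda pair: pair[0], reverse=True)
    return [pytesseract.image_to_string(img).strip() for _, img in ranked[:q]]

def merge_results(texts):
    """Merge the Q recognition results by majority vote at each character
    position (one plausible reading of the 'statistical method')."""
    merged = []
    for i in range(max(len(t) for t in texts)):
        chars = [t[i] for t in texts if i < len(t)]
        merged.append(Counter(chars).most_common(1)[0][0])
    return "".join(merged)

def flag_right(current_text, history, threshold_d):
    """Similarity to any historical OCR string above D -> FlagR = 0, else 1."""
    flag_r = 1
    for past in history:
        if SequenceMatcher(None, current_text, past).ratio() > threshold_d:
            flag_r = 0
            break
    history.append(current_text)  # judgment basis for the next round
    return flag_r
```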
Data fusion processing:
The analyses of the left-path and right-path videos yield a result value for each configured video unit, and the next stage of analysis proceeds from these values, implemented as follows:
the state values of the left-path video flag FlagL and the right-path video flag FlagR are obtained;
an AND operation is performed on the state values of FlagL and FlagR; when the result is 1, the video tag is a unique tag;
the character information of the unique tag obtained by OCR is combined into a character string, and the video unit is renamed to generate a tagged video file.
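The fusion stage then reduces to an AND of the two flags followed by a file rename. A minimal sketch, assuming a simple sanitization rule and naming pattern that the patent does not specify:

```python
import re
from pathlib import Path

def fuse_and_tag(flag_l, flag_r, ocr_text, unit_path):
    """AND FlagL and FlagR; when the result is 1 the tag is unique, so the
    video unit file is renamed after the OCR-derived character string."""
    if (flag_l & flag_r) == 1:
        tag = re.sub(r"[^\w-]+", "_", ocr_text).strip("_")  # file-system-safe tag string
        src = Path(unit_path)
        return src.rename(src.with_name(f"{tag}{src.suffix}"))
    return Path(unit_path)  # flags not both 1: keep the original name
```

For instance, with this rule a unit file segment_007.mp4 whose merged OCR string is "EXERCISE 0601 1430" would become EXERCISE_0601_1430.mp4 (names hypothetical).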
The standard parts used in the invention are commercially available, and special-shaped parts can be custom-made according to the description and drawings. The parts are connected by conventional prior-art means such as bolts, rivets, and welding; the machinery, parts, and equipment are of conventional prior-art types; and the circuit connections are conventional prior-art connections. These are therefore not described in detail here, as they belong to the prior art known to those skilled in the art.
Although the present invention has been described with reference to the foregoing embodiments, it will be apparent to those skilled in the art that the described embodiments may be modified or their elements replaced by equivalents. Any modifications, equivalents, and improvements made without departing from the spirit and principles of the present invention are intended to fall within its scope.

Claims (5)

1. An OCR-based video tagging method, characterized in that an input on-site command video and an on-site radar video are analyzed separately: the command video is defined as the left-path video and the radar video as the right-path video; both videos are analyzed in segments, the length of each segment can be set through front-end configuration, a video tag is generated for each time interval, and the uniqueness of each tag is established by analysis against the records of historical intervals.
2. The OCR-based video tagging method of claim 1, comprising the specific steps of:
left-path video processing:
step one, a video frame sequence is continuously acquired, and each time interval T1 (T1 ≥ 1 min) is set as one video analysis unit, yielding S video analysis units;
step two, in the Si-th video analysis unit, one frame image Ik is taken every N seconds for edge-feature extraction, so the Si-th unit yields T1 × 60/N image features containing edge information; specifically: the image Ik is first converted from a color image to a gray image Gk, and the edges of Gk are then extracted, where the edge-extraction algorithm may first use conventional prior-art operators such as Sobel or Canny to obtain a binarized feature image;
step three, each image feature containing edge information is normalized; specifically, the number of binarized feature points is counted and recorded as Numk, and the feature value is normalized as fk = Numk / (w × h), where w is the image width and h is the image height, yielding a set of T1 × 60/N-dimensional vectors ArrayL_Si;
step four, each video unit is analyzed by computing the Euclidean distance between the current ArrayL_Si vector and each vector in the history queue ArrayL_H; if the distance is greater than a threshold T, FlagL = 0 is set, and if it is less than or equal to T, FlagL = 1 is set; the current ArrayL_Si is then added to the history queue ArrayL_H as the judgment basis for the next round.
3. The OCR-based video tagging method of claim 2, comprising the specific steps of:
right-path video processing:
s1, continuously collecting a right-path video frame sequence, T2,
Figure QLYQS_4
obtaining M right-path video analysis units for one video analysis unit;
s2, in a Mi video analysis unit, taking a frame of image Pk every N seconds to obtain T2 x 60/N pieces of image data to be analyzed;
s3, T1 x 60/N-dimensional vector array based on left-path video analysis s The RN vectors in the samples are recorded as
Figure QLYQS_5
For ArrayR N The vectors are ordered from big to small;
s4, selecting the first Q ArrayRs N OCR recognition is carried out on the image corresponding to the vector value;
s5, merging the recognition results based on a statistical method for the Q recognition results obtained by recognition, namely, performing comparative analysis on the recognition results of the OCR at the corresponding positions, and recording the statistical results in a form of a table;
s6, comparing the historical OCR information with the historical OCR information, if the similarity is larger than a threshold D, setting flagR=0, and setting the similarity smaller than or equal to D, setting flagR=1, and simultaneously adding the current OCR recognition information into a historical information queue to be used as a judgment basis of the next round.
4. The OCR-based video tagging method of claim 3, comprising the specific steps of:
data fusion processing:
the analyses of the left-path and right-path videos yield a result value for each configured video unit, and the next stage of analysis proceeds from these values, implemented as follows:
the state values of the left-path video flag FlagL and the right-path video flag FlagR are obtained;
an AND operation is performed on the state values of FlagL and FlagR; when the result is 1, the video tag is a unique tag;
the character information of the unique tag obtained by OCR is combined into a character string, and the video unit is then renamed to generate a tagged video file.
5. The OCR-based video tagging method of claim 1, wherein the interval of each video segment may be 5 minutes, 10 minutes, or 15 minutes.
CN202310639256.6A 2023-06-01 2023-06-01 Video tagging method based on OCR Pending CN116389855A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310639256.6A CN116389855A (en) 2023-06-01 2023-06-01 Video tagging method based on OCR

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310639256.6A CN116389855A (en) 2023-06-01 2023-06-01 Video tagging method based on OCR

Publications (1)

Publication Number Publication Date
CN116389855A (en) 2023-07-04

Family

ID=86971385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310639256.6A Pending CN116389855A (en) 2023-06-01 2023-06-01 Video tagging method based on OCR

Country Status (1)

Country Link
CN (1) CN116389855A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663391A (en) * 2012-02-27 2012-09-12 安科智慧城市技术(中国)有限公司 Image multifeature extraction and fusion method and system
CN113490049A (en) * 2021-08-10 2021-10-08 深圳市前海动竞体育科技有限公司 Sports event video editing method and system based on artificial intelligence
CN113534146A (en) * 2021-07-26 2021-10-22 中国人民解放军海军航空大学 Radar video image target automatic detection method and system
CN113901259A (en) * 2021-09-09 2022-01-07 特赞(上海)信息科技有限公司 Video annotation method and system based on artificial intelligence and storage medium
CN114089370A (en) * 2021-11-17 2022-02-25 海华电子企业(中国)有限公司 Method, system and equipment for processing radar echo video data vectorization
CN116027319A (en) * 2021-10-27 2023-04-28 南方海洋科学与工程广东省实验室(广州) Radar automatic labeling system and method based on radar photoelectric target fusion


Similar Documents

Publication Publication Date Title
Shahab et al. ICDAR 2011 robust reading competition challenge 2: Reading text in scene images
US9626594B2 (en) Method and system to perform text-to-image queries with wildcards
US9384423B2 (en) System and method for OCR output verification
Yang et al. A framework for improved video text detection and recognition
US8315465B1 (en) Effective feature classification in images
Zhang et al. Automatic discrimination of text and non-text natural images
Karanje et al. Survey on text detection, segmentation and recognition from a natural scene images
Mirza et al. Urdu caption text detection using textural features
Dinh et al. An efficient method for text detection in video based on stroke width similarity
Gao et al. Automatic news video caption extraction and recognition
CN116389855A (en) Video tagging method based on OCR
Zhuge et al. Robust video text detection with morphological filtering enhanced MSER
Yang A Multi-Person Video Dataset Annotation Method of Spatio-Temporally Actions
Kovar et al. Logo detection and classification in a sport video: video indexing for sponsorship revenue control
Paliwal et al. A survey on various text detection and extraction techniques from videos and images
Lokkondra et al. ETDR: An Exploratory View of Text Detection and Recognition in Images and Videos.
Wang et al. An efficient coarse-to-fine scheme for text detection in videos
Smitha et al. Illumination invariant text recognition system based on contrast limit adaptive histogram equalization in videos/images
Lalonde et al. Key-text spotting in documentary videos using adaboost
Islam et al. Towards a standard bangla photoocr: Text detection and localization
Al-Asadi et al. Arabic-text extraction from video images
Thounaojam et al. Video shot boundary detection using gray level cooccurrence matrix
Cao et al. Adaptive and robust feature selection for low bitrate mobile augmented reality applications
Dahake Face Recognition from Video using Threshold based Clustering
Aggarwal et al. Event summarization in videos

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230704