CN117998141A - Video processing method, device, electronic equipment and medium - Google Patents

Video processing method, device, electronic equipment and medium

Info

Publication number
CN117998141A
CN117998141A (application CN202211337838.0A)
Authority
CN
China
Prior art keywords
processed
video
predicted frame
frame
predicted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211337838.0A
Other languages
Chinese (zh)
Inventor
郝明珠
顾武强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Uniview Technologies Co Ltd
Original Assignee
Zhejiang Uniview Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Uniview Technologies Co Ltd filed Critical Zhejiang Uniview Technologies Co Ltd
Priority to CN202211337838.0A priority Critical patent/CN117998141A/en
Publication of CN117998141A publication Critical patent/CN117998141A/en
Pending legal-status Critical Current

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiment of the application discloses a video processing method, a video processing apparatus, an electronic device, and a medium. The method comprises the following steps: determining a predicted frame to be processed in a video to be processed, and determining identification information of the predicted frame to be processed; comparing the identification information of the predicted frame to be processed with the identification information of a target predicted frame in a processed video, wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bidirectional predicted frames; and determining, according to the comparison result, whether the video to be processed and the processed video contain duplicate video segments, so as to determine the processing mode of the video to be processed according to the determination result. When performing duplicate analysis on videos with many image frames, this technical scheme reduces the number of image-frame comparisons between the two videos while still accurately determining whether the two videos are duplicates, thereby reducing the computing power required for video processing and improving video processing efficiency.

Description

Video processing method, device, electronic equipment and medium
Technical Field
The present invention relates to the field of video processing technologies, and in particular to a video processing method, an apparatus, an electronic device, and a medium.
Background
With the development of video processing technology, higher demands are placed on the efficiency of video processing. Event videos captured by police equipment of the same type often use the same coding mode and the same frame spacing, because the capture equipment and capture requirements are the same. Videos captured offline must be uploaded to an intelligent server for processing; however, a manually uploaded or offline video may be uploaded multiple times, causing the same video to be processed repeatedly. This increases the computation and time cost of video processing, and ultimately wastes resources and lowers processing efficiency. Therefore, how to avoid repeatedly processing the same video, thereby reducing the computing power required for video processing and improving video processing efficiency, is one of the problems to be solved in current video processing technology.
The current mainstream scheme first extracts I-frame (key-frame) feature values from two consecutive videos, then compares the tail of the earlier video with the head of the later video in time order, and thereby judges whether the videos are duplicates. However, I frames generally carry a large amount of data, so feature extraction is computationally expensive and time-consuming; the scheme is also ill-suited to long videos (which contain more I frames), making feature comparison inefficient.
Disclosure of Invention
The invention provides a video processing method, a video processing apparatus, an electronic device, and a medium, which can effectively avoid repeated processing of the same video, reduce the computing power required for video processing, and improve video processing efficiency.
According to an aspect of the present invention, there is provided a video processing method, the method comprising:
determining a predicted frame to be processed in a video to be processed, and determining identification information of the predicted frame to be processed;
comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video; wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bi-directional predicted frames;
and determining whether the video to be processed and the processed video comprise repeated video segments according to the comparison result, so as to determine the processing mode of the video to be processed according to the judgment result.
According to another aspect of the present invention, there is provided a video processing apparatus including:
The identification information determining module is used for determining a predicted frame to be processed in the video to be processed and determining identification information of the predicted frame to be processed;
The identification information comparison module is used for comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video; wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bi-directional predicted frames;
And the repeated video segment determining module is used for determining whether the video to be processed and the processed video comprise repeated video segments according to the comparison result so as to determine the processing mode of the video to be processed according to the judgment result.
According to another aspect of the present invention, there is provided a video processing electronic device, the electronic device comprising:
at least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the video processing method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute a video processing method according to any one of the embodiments of the present invention.
According to the technical scheme, the predicted frame to be processed in the video to be processed is determined, and the identification information of the predicted frame to be processed is determined; comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video; wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bidirectional predicted frames; and determining whether the video to be processed and the processed video comprise repeated video segments according to the comparison result, so as to determine the processing mode of the video to be processed according to the judgment result. According to the technical scheme, repeated processing of the same video can be effectively avoided, the video processing calculation force is reduced, and the video processing efficiency is improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a video processing method according to a first embodiment of the present application;
fig. 2 is a flowchart of a video processing method according to a second embodiment of the present application;
FIG. 3 is a flow chart of a preferred video processing method according to a second embodiment of the present application;
fig. 4 is a schematic structural diagram of a video processing apparatus according to a third embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device implementing a video processing method according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art will better understand the present application, a technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
It should be noted that the terms "first," "second," "target," and the like in the description and claims of the present invention and in the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a video processing method according to a first embodiment of the present application, where the method can be performed by a video processing apparatus, and the video processing apparatus can be implemented in hardware and/or software, and the video processing apparatus can be configured in an electronic device with data processing capability. As shown in fig. 1, the method includes:
s110, determining a predicted frame to be processed in the video to be processed, and determining identification information of the predicted frame to be processed.
The video to be processed refers to a video awaiting processing. The number of predicted frames to be processed may be at least two. A predicted frame to be processed is a predicted image frame awaiting processing. Specifically, the predicted frames to be processed may include forward predicted frames (P frames) and/or bidirectional predicted frames (B frames). A P frame represents the difference between the current frame and the previous frame, and typically occupies fewer data bits than an I frame. A B frame represents the difference between the current frame and both the previous and subsequent frames, and typically occupies even fewer data bits than a P frame. The identification information is used to uniquely identify a predicted image frame; for example, it may include a number and a hash value. Specifically, the identification information of a predicted frame to be processed may include the number of the frame within the video to be processed and/or the hash value corresponding to the frame.
In the embodiment of the application, if the coding mode adopted by the video to be processed is a coding mode based on at least two reference frames and a weighted prediction technology, determining a predicted frame to be processed according to predicted frames except the reference frames in the predicted frames of the video to be processed; wherein the reference frame is a frame referred to when the predicted frame is encoded.
Illustratively, a reference frame is an image frame on which inter-frame compression encoding depends. For example, encoding a P frame requires reference to the preceding I frame or P frame, and that preceding I frame or P frame is its reference frame. If the video to be processed is encoded in a mode that supports at least two reference frames and weighted prediction (for example, the coding standards after H.263, such as H.264/AVC and H.265/HEVC), the reference frames may be removed from the predicted frames, and the predicted frames to be processed are taken either as all predicted frames remaining after the removal or as a sample drawn from them. For example, if predicted frames P1, P3, and P6 are reference frames of predicted frame P7, and P7 is not used as a reference frame by any other predicted frame, then P1, P3, and P6 are removed and P7 is kept as a predicted frame to be processed. Because a predicted frame is obtained by weighted prediction from several reference frames, it contains the information of those reference frames; retaining only the predicted frames other than the reference frames when selecting the predicted frames to be processed therefore effectively reduces their number, and in turn reduces the amount of computation.
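As a minimal sketch of the reference-frame filtering described above (the `Frame` structure, its field names, and the `frames_to_process` helper are illustrative assumptions, not part of the patent), the predicted frames to be processed can be selected like this:

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    index: int                                # frame number within the video
    ftype: str                                # "I", "P", or "B"
    refs: list = field(default_factory=list)  # indices of this frame's reference frames

def frames_to_process(frames):
    # Every frame index that some frame depends on is a reference frame.
    referenced = {r for f in frames for r in f.refs}
    # Keep predicted frames that are never themselves used as references: through
    # weighted prediction they already carry the information of their references.
    return [f for f in frames if f.ftype in ("P", "B") and f.index not in referenced]

frames = [
    Frame(0, "I"),
    Frame(1, "P", refs=[0]),
    Frame(3, "P", refs=[1]),
    Frame(6, "P", refs=[3]),
    Frame(7, "P", refs=[1, 3, 6]),  # P7 predicted from P1, P3, P6, as in the example
]
print([f.index for f in frames_to_process(frames)])  # → [7]
```

Only P7 survives, matching the example: P1, P3, and P6 are removed because they serve as references.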
S120, comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video; wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bi-directional predicted frames.
Wherein the processed video may refer to a video that has been processed. The target predicted frame may refer to a predicted image frame in the processed video. The target predicted frame may also include a forward predicted frame (P-frame) and/or a bi-directional predicted frame (B-frame). It should be noted that the types of the predicted frame to be processed and the target predicted frame remain identical. That is, if the target predicted frame includes only P frames, the predicted frame to be processed also includes only P frames; if the target predicted frame only comprises the B frame, the predicted frame to be processed also only comprises the B frame; if the target predicted frame includes both a P frame and a B frame, the predicted frame to be processed also includes both a P frame and a B frame. Optionally, the identification information of the target predicted frame includes a hash value of the target predicted frame and a number of the target predicted frame in the processed video.
S130, determining whether the video to be processed and the processed video comprise repeated video segments according to the comparison result, so as to determine the processing mode of the video to be processed according to the judgment result.
The comparison result may be either success or failure. A duplicate video segment is a segment of video that appears in both the video to be processed and the processed video. The processing mode is either to perform video processing on the video to be processed or to skip processing it.
In this embodiment, if the comparison succeeds, the video to be processed and the processed video contain duplicate video segments; in that case the video to be processed does not need to be processed again, and the existing processing result stored on the intelligent server is reused for the duplicate segment. If the comparison fails, the two videos contain no duplicate segments; the video to be processed is then processed, and its video processing result is stored.
According to the technical scheme, the predicted frame to be processed in the video to be processed is determined, and the identification information of the predicted frame to be processed is determined; comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video; wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bidirectional predicted frames; and determining whether the video to be processed and the processed video comprise repeated video segments according to the comparison result, so as to determine the processing mode of the video to be processed according to the judgment result. According to the technical scheme, repeated processing of the same video can be effectively avoided, the video processing calculation force is reduced, and the video processing efficiency is improved.
In this embodiment, optionally, determining identification information of the predicted frame to be processed includes: sampling a predicted frame in a video to be processed, determining the predicted frame to be processed, and determining the number of the predicted frame to be processed in the video to be processed; and generating a hash value corresponding to the predicted frame to be processed according to the predicted frame to be processed.
In this embodiment, the video to be processed may be sampled by the intelligent server to determine the predicted frames to be processed. Note that this embodiment does not limit the sampling manner, which may be set flexibly according to actual requirements. By way of example, the video to be processed may be sampled once per time interval or once every few image frames; the first P frame in each I-frame group may be fixedly selected as the sampled P frame (which P frames are selected, and how many, is not limited); or the first n P frames (and/or B frames) and the last m P frames (and/or B frames) of the video to be processed may be selected as the predicted frames to be processed. Assuming the total number of P frames (and/or B frames) is s, both m and n are less than s/2.
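The last sampling strategy above, taking the first n and last m predicted frames, might be sketched as follows (the function name and its exact shape are illustrative assumptions):

```python
def sample_predicted_frames(pred_numbers, n, m):
    """Take the first n and the last m predicted-frame numbers of a video.

    pred_numbers: numbers of all P (and/or B) frames in time order, length s.
    Requiring n < s/2 and m < s/2 guarantees the two slices cannot overlap.
    """
    s = len(pred_numbers)
    if not (n < s / 2 and m < s / 2):
        raise ValueError("n and m must both be less than s / 2")
    return pred_numbers[:n] + pred_numbers[-m:]

# Ten predicted frames; sample the first 2 and the last 3.
print(sample_predicted_frames(list(range(1, 11)), 2, 3))  # → [1, 2, 8, 9, 10]
```

Since n + m < s follows from the two bounds, the head and tail samples never share a frame.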
In this embodiment, the number of each predicted frame to be processed within the video to be processed is determined according to a preset numbering scheme. This embodiment does not limit the numbering scheme, which may be set according to actual requirements. For example, the image frames in the video may be numbered incrementally in time order starting from the first frame; alternatively, the type of each image frame (I frame, P frame, or B frame) may be distinguished first, and the frames of each type then numbered incrementally in time order, with the numbers of different frame types kept in distinct categories.
For example, assuming the image frames of a certain video are, in time order, I-B-B-P-B-B-P-B-B-I-B-B-P, the frames may be numbered 1, 2, 3, …, 13 in sequence. Alternatively, distinguishing I frames, P frames, and B frames, the I frames are labeled a1, a2 in time order; the P frames b1, b2, b3; and the B frames c1, c2, …, c7, c8.
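The typed numbering scheme in that example can be sketched as follows (the a/b/c prefixes follow the example; the function itself is an illustrative assumption):

```python
def number_by_type(frame_types):
    # Number each frame type independently, in time order, with a distinguishing
    # prefix per type: a = I frames, b = P frames, c = B frames.
    prefixes = {"I": "a", "P": "b", "B": "c"}
    counters = {"I": 0, "P": 0, "B": 0}
    labels = []
    for t in frame_types:
        counters[t] += 1
        labels.append(f"{prefixes[t]}{counters[t]}")
    return labels

print(number_by_type(["I", "B", "B", "P"]))  # → ['a1', 'c1', 'c2', 'b1']
```

Running it over the full 13-frame sequence yields two a-labels, three b-labels, and eight c-labels, matching the example.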
After determining the number of the predicted frame to be processed in the video to be processed, a hash value corresponding to the predicted frame to be processed also needs to be generated from the frame. Note that this embodiment does not limit the method of calculating the hash value, which may be chosen according to actual requirements. By way of example, the following may be used to determine the hash value: (1) MD4 (a message-digest algorithm); (2) MD5 (an improved message-digest algorithm); (3) SHA-1 (Secure Hash Algorithm 1); (4) a consistent hashing algorithm; (5) a hash-collision avoidance strategy. A hash algorithm converts an input of arbitrary length into an output of fixed length, namely the hash value.
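A fixed-length frame hash can be produced with any of the digest algorithms listed; for instance with MD5 from Python's standard `hashlib` module (the choice of MD5 here is illustrative, not prescribed by the patent):

```python
import hashlib

def frame_hash(frame_bytes: bytes) -> str:
    # An input of any length maps to a fixed-length digest
    # (128 bits for MD5, i.e. 32 hex characters).
    return hashlib.md5(frame_bytes).hexdigest()

print(frame_hash(b"abc"))  # → 900150983cd24fb0d6963f7d28e17f72
```

The same call applied to the raw bytes of each predicted frame yields the hash value used as identification information.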
Through such arrangement, the identification information of the predicted frame to be processed can be rapidly, conveniently and accurately determined by determining the number and the hash value, so that whether the video to be processed and the processed video comprise repeated video segments can be judged according to the identification information.
In this embodiment, optionally, the method further includes: determining, for the processed video, a hash value of a target predicted frame in the processed video; wherein the target predicted frame comprises all forward predicted frames and/or all bi-directional predicted frames of the processed video; storing the identification information of the processed video and the identification information of the target prediction frame; the identification information of the target predicted frame comprises the number of the target predicted frame in the processed video and the hash value of the target predicted frame.
In this embodiment, for a processed video, a hash value of a target predicted frame in the processed video is first determined. Wherein the target predicted frame comprises all forward predicted frames and/or all bi-directional predicted frames of the processed video. The method for determining the hash value of the target predicted frame can be referred to the method for determining the hash value of the predicted frame to be processed. And then determining the number of the target predicted frame in the processed video, and taking the number of the target predicted frame in the processed video and the hash value of the target predicted frame as the identification information of the target predicted frame. The numbering mode of the target predicted frame in the processed video is required to be consistent with the numbering mode of the predicted frame to be processed in the video to be processed. And then the identification information of the processed video and the identification information of the target predicted frame are stored. Wherein the identification information of the processed video may be used to uniquely identify the processed video. For example, the identification information of the processed video may be a preset video code.
Through the arrangement, the identification information of the processed video and the identification information of the target prediction frame are stored, so that the processed video can be quickly and accurately called in subsequent use, the video processing time is shortened, and the video processing efficiency is improved.
In this embodiment, optionally, storing the identification information of the processed video and the identification information of the target prediction frame includes: and processing the hash value of the target predicted frame based on the hash function, and storing the identification information of the processed video and the identification information of the target predicted frame in a chained hash table according to the processing result.
Where the hash function may refer to a function that maps key values of elements to storage locations of the elements. A certain corresponding relation can be established between the keywords of the element and the storage position through the hash function, so that each keyword corresponds to a unique storage position.
It should be noted that this embodiment does not limit how the hash function is determined; it may be set according to actual requirements. For example, the hash function may be determined by the remainder method or the multiplication method. Taking the remainder method as an example, the hash function may be expressed as h(k) = k mod m, where k is the hash value, m is the divisor, mod is the modulo operation, and h(k) is the remainder of dividing k by m. Assuming m = 5, processing the hash values of the target predicted frames with this hash function yields 5 possible remainders: 0, 1, 2, 3, and 4.
The identification information of processed videos and of target predicted frames corresponding to the different remainder cases is stored in different linked lists of the chained hash table. Here, the identification information corresponding to remainders 0, 1, 2, 3, and 4 is stored in five separate linked lists. Illustratively, the storage format may be: identification information of the processed video - number of the target predicted frame in the processed video - hash value of the target predicted frame.
Through the arrangement, the chain hash table is used for storing the identification information of the processed video and the target predicted frame, so that the search range can be effectively reduced in the subsequent search, and the search efficiency of the identification information of the processed video and the target predicted frame is improved.
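A minimal sketch of this chained-hash-table storage, assuming m = 5 buckets and the remainder-method hash function h(k) = k mod m (the entry format mirrors the one described above; all names are illustrative):

```python
M = 5  # divisor of the remainder-method hash function h(k) = k mod M

# Chained hash table: one "linked list" (here a Python list) per remainder.
table = [[] for _ in range(M)]

def store(video_id, frame_number, frame_hash_hex):
    # Entry format: identification information of the processed video -
    # number of the target predicted frame - hash value of the target predicted frame.
    k = int(frame_hash_hex, 16)          # interpret the hex digest as an integer key
    table[k % M].append((video_id, frame_number, frame_hash_hex))

store("video-001", "b1", "1f")           # int("1f", 16) = 31, and 31 mod 5 = 1
print(table[1])  # → [('video-001', 'b1', '1f')]
```

Each entry lands in exactly one of the five chains, so a later search only has to traverse the chain its remainder selects.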
In this embodiment, optionally, before comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video, the method further includes: processing hash values in the identification information of the predicted frames to be processed based on the hash function, and determining a target linked list in the chain hash table according to the processing result; and comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame stored in the target linked list.
The target linked list is the linked list in the chained hash table that may contain identification information matching the hash value in the identification information of the predicted frame to be processed. In this embodiment, before comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video, the same hash function is used to process the hash value in the identification information of the predicted frame to be processed, and the target linked list is determined from the linked lists of the chained hash table. The identification information of the predicted frame to be processed is then compared with the identification information of the target predicted frames stored in the target linked list. For example, suppose dividing the hash value in the identification information of the predicted frame to be processed by 10 leaves a remainder of 3 (the hash values of the target predicted frames having previously been classified into the linked lists of the chained hash table according to their remainders when divided by 10); the linked list corresponding to remainder 3 is then taken as the target linked list. The identification information in the target linked list is traversed to search for a match with the identification information of the predicted frame to be processed.
According to the scheme provided by the embodiment of the application, the identification information of the predicted frame to be processed is compared with the identification information of the target predicted frame stored in the target linked list, so that the comparison range of the identification information is greatly shortened, the identification information comparison time is shortened, and the identification information comparison efficiency is improved.
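The lookup side can be sketched against the same structure (a self-contained toy table with m = 5; all names and values are illustrative assumptions):

```python
M = 5  # must match the divisor used when the chained hash table was built

# Toy chained hash table as it might look after storing one processed video:
# each bucket is a list of (video_id, frame_number, frame_hash) entries.
table = [[] for _ in range(M)]
table[int("1f", 16) % M].append(("video-001", "b1", "1f"))

def lookup(frame_hash_hex):
    # Hash the candidate with the same function to pick the target linked list,
    # then traverse only that chain instead of the whole table.
    chain = table[int(frame_hash_hex, 16) % M]
    return [entry for entry in chain if entry[2] == frame_hash_hex]

print(lookup("1f"))  # → [('video-001', 'b1', '1f')]
print(lookup("2a"))  # → []  (its bucket is empty, so nothing is traversed)
```

Because only one chain is scanned per query, the comparison range shrinks roughly by the number of buckets.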
Example two
Fig. 2 is a flowchart of a video processing method according to a second embodiment of the present application, which is optimized based on the foregoing embodiment. The concrete optimization is as follows: comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video, wherein the method comprises the following steps: comparing the hash value of the predicted frame to be processed with the hash value of the target predicted frame; if the hash value of the target predicted frame is the same as the hash value of the predicted frame to be processed, determining first interval characteristic information of the predicted frame number to be processed corresponding to the same hash value and second interval characteristic information of the target predicted frame number; and comparing the first interval characteristic information with the second interval characteristic information.
As shown in fig. 2, the method of this embodiment specifically includes the following steps:
s210, determining a predicted frame to be processed in the video to be processed, and determining identification information of the predicted frame to be processed.
S220, comparing the hash value of the predicted frame to be processed with the hash value of the target predicted frame; wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bi-directional predicted frames.
In this embodiment, after the identification information of the predicted frame to be processed is determined, the hash value may be selected from the identification information of the predicted frame to be processed, and compared with the hash value of the target predicted frame. Wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bi-directional predicted frames.
S230, if the hash value of the target predicted frame is the same as the hash value of the predicted frame to be processed, determining first interval characteristic information of the numbers of the predicted frames to be processed corresponding to the same hash value and second interval characteristic information of the numbers of the target predicted frames.
The first interval characteristic information refers to the difference characteristics of the numbers of the predicted frames to be processed that correspond to the same hash value. For example, if the numbers of the predicted frames to be processed corresponding to the same hash value are 2, 5, 8, and 11, the difference between adjacent numbers is always 3, so the first interval characteristic information is 3.
S240, comparing the first interval characteristic information with the second interval characteristic information.
In this embodiment, after the first interval feature information and the second interval feature information are determined, the first interval feature information and the second interval feature information may be further compared, and a comparison result may be determined. Specifically, if the first interval characteristic information is the same as the second interval characteristic information, the comparison is successful; if the first interval characteristic information is different from the second interval characteristic information, the comparison is failed.
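Under the assumption that the interval characteristic is the set of adjacent differences of the sorted frame numbers (as in the 2, 5, 8, 11 example above), steps S230–S240 might be sketched as follows; the function names are invented for illustration:

```python
def interval_feature(frame_numbers):
    """Return the set of adjacent differences of the sorted frame numbers.

    For numbers 2, 5, 8, 11 every adjacent difference is 3, so the
    interval characteristic information is {3}.
    """
    ordered = sorted(frame_numbers)
    return {b - a for a, b in zip(ordered, ordered[1:])}

def intervals_match(pending_numbers, target_numbers):
    """Comparison succeeds only when both interval features are identical."""
    return interval_feature(pending_numbers) == interval_feature(target_numbers)
```

Note that two videos can match even when their matching frames start at different numbers, since only the spacing pattern is compared.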
S250, determining whether the video to be processed and the processed video comprise repeated video segments according to the comparison result, so as to determine the processing mode of the video to be processed according to the judgment result.
In the technical scheme of this embodiment, the hash value of the predicted frame to be processed is compared with the hash value of the target predicted frame; if the two are the same, the first interval characteristic information of the numbers of the predicted frames to be processed corresponding to that hash value and the second interval characteristic information of the numbers of the target predicted frames are determined and compared. Efficient retrieval over a hash chain of frame identification information improves retrieval efficiency and thus the efficiency of duplicate-video detection. Because the comparison relies on both the hash values and the interval characteristics of the frame numbers, repeated processing of the same video is avoided, the computational cost of video processing is reduced, and the comparison of identification information is more accurate, which in turn makes the judgment of repeated video segments more reliable and helps process the video to be processed correctly.
In this embodiment, optionally, determining, according to the comparison result, whether the video to be processed and the processed video include the duplicate video segment includes: if the first interval characteristic information is consistent with the second interval characteristic information, determining a video segment from the predicted frame to be processed with the minimum coding to the predicted frame to be processed with the maximum coding in the video to be processed as a repeated video segment.
In this embodiment, if the first interval feature information is consistent with the second interval feature information, it indicates that the comparison of the identification information is successful, and at this time, a video segment from a predicted frame to be processed with a minimum code to a predicted frame to be processed with a maximum code in the video to be processed may be determined as a repeated video segment.
Through the arrangement, the repeated video segments can be rapidly and accurately determined according to the comparison result of the first interval characteristic information and the second interval characteristic information.
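A minimal sketch of this determination, assuming the repeated segment is expressed by the smallest and largest numbers among the matching predicted frames to be processed (function name invented):

```python
def repeated_segment(pending_numbers, first_intervals, second_intervals):
    """Return (first_frame, last_frame) of the repeated segment, or None.

    The segment spans from the predicted frame to be processed with the
    minimum coding to the one with the maximum coding, and exists only
    when the two interval characteristics are consistent.
    """
    if first_intervals != second_intervals:
        return None
    return min(pending_numbers), max(pending_numbers)
```
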
In this embodiment, optionally, determining, according to the comparison result, whether the video to be processed and the processed video include the duplicate video segment includes: if the first interval characteristic information is consistent with the second interval characteristic information, determining a target prediction frame with the minimum coding and a target prediction frame with the maximum coding, which correspond to the same hash value; extracting a key frame corresponding to a target predicted frame with the minimum coding and a key frame corresponding to a target predicted frame with the maximum coding from the analyzed video, and determining a first hash value of the key frame corresponding to the target predicted frame with the minimum coding and a second hash value of the key frame corresponding to the target predicted frame with the maximum coding; determining a third hash value of a key frame corresponding to the predicted frame to be processed with the minimum coding and a fourth hash value of the key frame corresponding to the predicted frame to be processed with the maximum coding; if the first hash value is the same as the third hash value and the second hash value is the same as the fourth hash value, determining that a video segment from a key frame corresponding to a predicted frame to be processed with minimum coding to a key frame corresponding to a predicted frame to be processed with maximum coding in the video to be processed is a repeated video segment.
The first hash value may refer to a hash value of a key frame corresponding to the target predicted frame with the minimum encoding. The second hash value may refer to a hash value of a key frame corresponding to the target predicted frame having the largest encoding. The third hash value may refer to a hash value of a key frame corresponding to the predicted frame to be processed with the minimum encoding. The fourth hash value may refer to a hash value of a key frame corresponding to the predicted frame to be processed having the largest encoding.
In this embodiment, the predicted frames to be processed (P-frames and/or B-frames) in the video to be processed are first compared with the target predicted frames in the processed video. If the comparison shows that the first interval characteristic information is consistent with the second interval characteristic information, the key frames corresponding to the predicted frames to be processed are further compared with the key frames corresponding to the target predicted frames, and whether the video to be processed and the processed video include a repeated video segment is finally determined according to the key-frame comparison result, as shown in Fig. 3, a flowchart of a preferred video processing method according to the second embodiment of the present application. Specifically, if the first interval characteristic information is consistent with the second interval characteristic information, the target predicted frame with the minimum coding and the target predicted frame with the maximum coding corresponding to the same hash value are determined. The key frames corresponding to these two target predicted frames are then extracted from the analyzed video; the hash value of the key frame corresponding to the target predicted frame with the minimum coding is taken as the first hash value, and the hash value of the key frame corresponding to the target predicted frame with the maximum coding is taken as the second hash value. Likewise, the hash value of the key frame corresponding to the predicted frame to be processed with the minimum coding is taken as the third hash value, and the hash value of the key frame corresponding to the predicted frame to be processed with the maximum coding is taken as the fourth hash value.
And comparing the first hash value with the third hash value, and comparing the second hash value with the fourth hash value, and determining the repeated video segment according to the hash value comparison result. Specifically, if the first hash value is the same as the third hash value and the second hash value is the same as the fourth hash value, a video segment from a key frame corresponding to a predicted frame to be processed with the minimum coding to a key frame corresponding to a predicted frame to be processed with the maximum coding in the video to be processed can be determined as a repeated video segment.
With this arrangement, key-frame hash comparison is introduced on top of the interval-characteristic comparison, and the repeated video segments are determined from both results, which further improves the accuracy of identifying repeated video segments.
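The boundary-key-frame double-check described above can be sketched as follows. This is an illustration only: `keyframe_hash` is an invented helper standing in for "look up the key frame that a given predicted frame belongs to, and hash it".

```python
def keyframes_confirm(pending_min, pending_max, target_min, target_max,
                      keyframe_hash):
    """Confirm a candidate repeated segment by hashing boundary key frames.

    keyframe_hash(video_id, predicted_frame_number) -> hash of the key
    frame corresponding to that predicted frame (invented helper).
    """
    first = keyframe_hash('target', target_min)     # first hash value
    second = keyframe_hash('target', target_max)    # second hash value
    third = keyframe_hash('pending', pending_min)   # third hash value
    fourth = keyframe_hash('pending', pending_max)  # fourth hash value
    return first == third and second == fourth
```

If the function returns True, the segment from the key frame corresponding to the minimum-coding predicted frame to that of the maximum-coding predicted frame is treated as repeated.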
Example III
Fig. 4 is a schematic structural diagram of a video processing apparatus according to a third embodiment of the present application, where the apparatus may execute the video processing method according to any embodiment of the present application, and the apparatus has functional modules and beneficial effects corresponding to the execution method. As shown in fig. 4, the apparatus includes:
An identification information determining module 310, configured to determine a predicted frame to be processed in a video to be processed, and determine identification information of the predicted frame to be processed;
An identification information comparison module 320, configured to compare the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video; wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bi-directional predicted frames;
And the repeated video segment determining module 330 is configured to determine whether the video to be processed and the processed video include repeated video segments according to the comparison result, so as to determine a processing manner of the video to be processed according to the determination result.
Optionally, the identification information determining module 310 is specifically configured to:
Sampling a predicted frame in the video to be processed, determining the predicted frame to be processed, and determining the number of the predicted frame to be processed in the video to be processed;
and generating a hash value corresponding to the predicted frame to be processed according to the predicted frame to be processed.
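The sampling and hash generation performed by the identification information determining module might look like the following sketch. The sampling step and the use of MD5 over the encoded frame payload are illustrative choices, not requirements of the patent:

```python
import hashlib

def sample_predicted_frames(frames, step=3):
    """Take every `step`-th predicted frame together with its number.

    frames: list of (number, frame_type, payload_bytes). Only P/B
    (forward / bidirectional predicted) frames are candidates.
    """
    predicted = [(n, data) for n, ftype, data in frames
                 if ftype in ('P', 'B')]
    return predicted[::step]

def frame_hash(payload):
    """Hash the encoded frame payload (MD5 chosen only for illustration)."""
    return hashlib.md5(payload).hexdigest()
```
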
Optionally, the identification information of the target predicted frame includes a hash value of the target predicted frame and a number of the target predicted frame in the processed video; the number of the predicted frames to be processed is at least two;
The identification information comparison module 320 is configured to:
comparing the hash value of the predicted frame to be processed with the hash value of the target predicted frame;
if the hash value of the target predicted frame is the same as the hash value of the predicted frame to be processed, determining first interval characteristic information of the number of the predicted frame to be processed corresponding to the same hash value and second interval characteristic information of the number of the target predicted frame;
and comparing the first interval characteristic information with the second interval characteristic information.
Optionally, the duplicate video segment determining module 330 is configured to:
and if the first interval characteristic information is consistent with the second interval characteristic information, determining a video segment from the predicted frame to be processed with the minimum coding to the predicted frame to be processed with the maximum coding in the video to be processed as a repeated video segment.
Optionally, the repeated video segment determining module 330 is further configured to:
If the first interval characteristic information is consistent with the second interval characteristic information, determining a target prediction frame with the minimum coding and a target prediction frame with the maximum coding, which correspond to the same hash value;
Extracting a key frame corresponding to the target predicted frame with the minimum coding and a key frame corresponding to the target predicted frame with the maximum coding from the analyzed video, and determining a first hash value of the key frame corresponding to the target predicted frame with the minimum coding and a second hash value of the key frame corresponding to the target predicted frame with the maximum coding;
Determining a third hash value of a key frame corresponding to the predicted frame to be processed with the minimum coding and a fourth hash value of the key frame corresponding to the predicted frame to be processed with the maximum coding;
And if the first hash value is the same as the third hash value and the second hash value is the same as the fourth hash value, determining that a video segment from a key frame corresponding to a predicted frame to be processed with minimum coding to a key frame corresponding to a predicted frame to be processed with maximum coding in the video to be processed is a repeated video segment.
Optionally, the apparatus further includes:
a hash value determining unit configured to determine, for a processed video, a hash value of a target predicted frame in the processed video; wherein the target predicted frame comprises all forward predicted frames and/or all bi-predicted frames of the processed video;
a storage unit for storing the identification information of the processed video and the identification information of the target prediction frame; wherein the identification information of the target predicted frame includes a number of the target predicted frame in the processed video and a hash value of the target predicted frame.
Optionally, the storage unit is configured to:
and processing the hash value of the target predicted frame based on a hash function, and storing the identification information of the processed video and the identification information of the target predicted frame in a chain hash table according to a processing result.
Optionally, the apparatus further includes:
The target chain table determining module is used for processing the hash value in the identification information of the predicted frame to be processed based on a hash function before comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video, and determining a target chain table in the chain hash table according to a processing result;
And the second identification information comparison module is used for comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame stored in the target linked list.
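The chain hash table and target-linked-list lookup used by the storage unit and the target chain table determining module can be sketched as below. The class and its bucket count are invented for illustration; a Python list stands in for each linked-list chain:

```python
class ChainedHashTable:
    """Chained hash table keyed by frame hash (illustrative sketch).

    Each bucket is a chain of (video_id, frame_number, frame_hash)
    entries; lookup returns the one chain that needs to be compared.
    """

    def __init__(self, buckets=1024):
        self.buckets = [[] for _ in range(buckets)]

    def _index(self, frame_hash):
        # Hash function mapping the frame hash to a chain index.
        return hash(frame_hash) % len(self.buckets)

    def store(self, video_id, frame_number, frame_hash):
        self.buckets[self._index(frame_hash)].append(
            (video_id, frame_number, frame_hash))

    def target_chain(self, frame_hash):
        # The "target linked list": only this chain is scanned, which is
        # what shrinks the comparison range for a predicted frame.
        return self.buckets[self._index(frame_hash)]
```
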
Optionally, the identification information determining module 310 is specifically configured to:
if the coding mode adopted by the video to be processed is a coding mode based on at least two reference frames and a weighted prediction technology, determining a predicted frame to be processed according to predicted frames except the reference frames in the predicted frames of the video to be processed; wherein the reference frame is a frame referred to when the predicted frame is encoded.
The video processing device provided by the embodiment of the application can execute the video processing method provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as video processing methods.
In some embodiments, the video processing method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the video processing method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the video processing method in any other suitable way (e.g. by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs, which may be executed and/or interpreted on a programmable system including at least one programmable processor; the programmable processor may be special purpose or general purpose, and may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and a server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server (also called a cloud computing server or cloud host), a host product in a cloud computing service system, which overcomes the drawbacks of difficult management and weak service scalability found in traditional physical hosts and VPS (Virtual Private Server) services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of video processing, the method comprising:
determining a predicted frame to be processed in a video to be processed, and determining identification information of the predicted frame to be processed;
comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video; wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bi-directional predicted frames;
and determining whether the video to be processed and the processed video comprise repeated video segments according to the comparison result, so as to determine the processing mode of the video to be processed according to the judgment result.
2. The method of claim 1, wherein determining identification information of the predicted frame to be processed comprises:
Sampling a predicted frame in the video to be processed, determining the predicted frame to be processed, and determining the number of the predicted frame to be processed in the video to be processed;
and generating a hash value corresponding to the predicted frame to be processed according to the predicted frame to be processed.
3. The method of claim 2, wherein the identification information of the target predicted frame includes a hash value of the target predicted frame and a number of the target predicted frame in the processed video; the number of the predicted frames to be processed is at least two;
Comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video, wherein the method comprises the following steps:
comparing the hash value of the predicted frame to be processed with the hash value of the target predicted frame;
if the hash value of the target predicted frame is the same as the hash value of the predicted frame to be processed, determining first interval characteristic information of the number of the predicted frame to be processed corresponding to the same hash value and second interval characteristic information of the number of the target predicted frame;
comparing the first interval characteristic information with the second interval characteristic information;
Determining whether the video to be processed and the processed video comprise repeated video segments according to the comparison result comprises the following steps:
and if the first interval characteristic information is consistent with the second interval characteristic information, determining a video segment from the predicted frame to be processed with the minimum coding to the predicted frame to be processed with the maximum coding in the video to be processed as a repeated video segment.
4. A method according to claim 3, wherein determining whether the video to be processed and the processed video include duplicate video segments based on the comparison result comprises:
If the first interval characteristic information is consistent with the second interval characteristic information, determining a target prediction frame with the minimum coding and a target prediction frame with the maximum coding, which correspond to the same hash value;
Extracting a key frame corresponding to the target predicted frame with the minimum coding and a key frame corresponding to the target predicted frame with the maximum coding from the analyzed video, and determining a first hash value of the key frame corresponding to the target predicted frame with the minimum coding and a second hash value of the key frame corresponding to the target predicted frame with the maximum coding;
Determining a third hash value of a key frame corresponding to the predicted frame to be processed with the minimum coding and a fourth hash value of the key frame corresponding to the predicted frame to be processed with the maximum coding;
And if the first hash value is the same as the third hash value and the second hash value is the same as the fourth hash value, determining that a video segment from a key frame corresponding to a predicted frame to be processed with minimum coding to a key frame corresponding to a predicted frame to be processed with maximum coding in the video to be processed is a repeated video segment.
5. The method according to claim 1, wherein the method further comprises:
determining, for a processed video, a hash value of a target predicted frame in the processed video; wherein the target predicted frame comprises all forward predicted frames and/or all bi-predicted frames of the processed video;
Storing the identification information of the processed video and the identification information of the target prediction frame; wherein the identification information of the target predicted frame comprises the number of the target predicted frame in the processed video and a hash value of the target predicted frame;
Storing the identification information of the processed video and the identification information of the target prediction frame, including:
and processing the hash value of the target predicted frame based on a hash function, and storing the identification information of the processed video and the identification information of the target predicted frame in a chain hash table according to a processing result.
6. The method of claim 5, wherein before comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame in the processed video, the method further comprises:
processing hash values in the identification information of the predicted frames to be processed based on a hash function, and determining a target linked list in the chain hash table according to a processing result;
and comparing the identification information of the predicted frame to be processed with the identification information of the target predicted frame stored in the target linked list.
7. The method of claim 1, wherein determining a predicted frame to process in the video to process comprises:
if the coding mode adopted by the video to be processed is a coding mode based on at least two reference frames and a weighted prediction technology, determining a predicted frame to be processed according to predicted frames except the reference frames in the predicted frames of the video to be processed; wherein the reference frame is a frame referred to when the predicted frame is encoded.
8. A video processing apparatus, the apparatus comprising:
an identification information determination module, configured to determine a predicted frame to be processed in a video to be processed, and to determine identification information of the predicted frame to be processed;
an identification information comparison module, configured to compare the identification information of the predicted frame to be processed with identification information of a target predicted frame in a processed video; wherein the predicted frame to be processed and the target predicted frame comprise forward predicted frames and/or bidirectional predicted frames; and
a repeated video segment determination module, configured to determine, according to the comparison result, whether the video to be processed and the processed video include a repeated video segment, so as to determine a processing mode for the video to be processed according to the determination result.
9. A video processing electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the video processing method of any one of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the video processing method of any one of claims 1-7.
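The chained hash table mechanism of claims 5 and 6 can be sketched as follows. This is a minimal illustration, not the patented implementation: the bucket count, the use of MD5 as the per-frame hash, and all function names are hypothetical. Each bucket holds a chain of (video id, frame number, frame hash) entries; lookup first selects the target chain via the hash function, then compares identification information within that chain only.

```python
import hashlib

NUM_BUCKETS = 64  # hypothetical bucket count for the chained hash table


def frame_hash(frame_bytes: bytes) -> str:
    """Hypothetical per-frame hash over the encoded predicted-frame payload."""
    return hashlib.md5(frame_bytes).hexdigest()


def bucket_index(fhash: str) -> int:
    """Hash-function step of claim 5: map a frame hash value to a bucket."""
    return int(fhash, 16) % NUM_BUCKETS


def store_frame(table: dict, video_id: str, frame_number: int, fhash: str) -> None:
    """Claim 5: append the frame's identification information to its bucket's chain."""
    table.setdefault(bucket_index(fhash), []).append((video_id, frame_number, fhash))


def lookup_frame(table: dict, fhash: str) -> list:
    """Claim 6: determine the target linked list, then compare identification
    information against the entries stored in that chain."""
    chain = table.get(bucket_index(fhash), [])
    return [(vid, num) for vid, num, h in chain if h == fhash]
```

Only the chain selected by the hash function is scanned, so a predicted frame of the video to be processed is compared against a small subset of stored frames rather than every frame of every processed video, which is the computational saving the abstract describes.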
CN202211337838.0A 2022-10-28 2022-10-28 Video processing method, device, electronic equipment and medium Pending CN117998141A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211337838.0A CN117998141A (en) 2022-10-28 2022-10-28 Video processing method, device, electronic equipment and medium


Publications (1)

Publication Number Publication Date
CN117998141A (en) 2024-05-07

Family

ID=90893018

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211337838.0A Pending CN117998141A (en) 2022-10-28 2022-10-28 Video processing method, device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN117998141A (en)

Similar Documents

Publication Publication Date Title
CN111967302B (en) Video tag generation method and device and electronic equipment
CN113313022B (en) Training method of character recognition model and method for recognizing characters in image
CN113642584B (en) Character recognition method, device, equipment, storage medium and intelligent dictionary pen
CN114817651B (en) Data storage method, data query method, device and equipment
CN113254712B (en) Video matching method, video processing device, electronic equipment and medium
CN116309963B (en) Batch labeling method and device for images, electronic equipment and storage medium
CN113963197A (en) Image recognition method and device, electronic equipment and readable storage medium
CN115169489B (en) Data retrieval method, device, equipment and storage medium
CN116761020A (en) Video processing method, device, equipment and medium
CN115774602A (en) Container resource allocation method, device, equipment and storage medium
CN113255484A (en) Video matching method, video processing device, electronic equipment and medium
CN117615137B (en) Video processing method, device, equipment and storage medium
CN115052160B (en) Image coding method and device based on cloud data automatic downloading and electronic equipment
WO2023142663A1 (en) Motion estimation method and apparatus in encoding process, device, storage medium, and program product
CN113627363B (en) Video file processing method, device, equipment and storage medium
CN115601612A (en) Image prediction model training method and device, electronic equipment and storage medium
CN117076239A (en) Operation and maintenance data anomaly detection method and device, electronic equipment and storage medium
CN116304173A (en) Similar video search library construction method, application method and device
CN113946503A (en) Method and device for determining interface state, storage medium and electronic equipment
CN116720186A (en) Malicious code identification method and device, electronic equipment and storage medium
CN117676239A (en) Video transmission method, device, equipment and medium
CN117610512A (en) Method, device and equipment for removing header and footer and storage medium
CN111008301A (en) Method for searching video by using picture
CN116455999A (en) Application state management method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination