CN114005062A - Abnormal frame processing method, abnormal frame processing device, server and storage medium - Google Patents


Info

Publication number
CN114005062A
Authority
CN
China
Prior art keywords
frame
video
abnormal
identified
frames
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111280047.4A
Other languages
Chinese (zh)
Inventor
贾碧莹
郭鹏
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202111280047.4A priority Critical patent/CN114005062A/en
Publication of CN114005062A publication Critical patent/CN114005062A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to an abnormal frame processing method, an abnormal frame processing device, a server and a storage medium. The method comprises: identifying an abnormal frame from target video frames of a video to be identified; acquiring frame information corresponding to the abnormal frame, the frame information comprising a frame tag of the abnormal frame and the playing time of the abnormal frame in the video to be identified; and sending the frame information to a terminal, so that the terminal determines the corresponding abnormal frame from the video to be identified according to the frame tag and identifies the abnormal frame based on the video frames before and after the playing time. With this method, the position of the abnormal frame in the video to be identified can be determined quickly, and an auditor does not need to watch the complete video to locate the abnormal frame, which improves video auditing efficiency.

Description

Abnormal frame processing method, abnormal frame processing device, server and storage medium
Technical Field
The present disclosure relates to the field of video processing technologies, and in particular, to an abnormal frame processing method, an abnormal frame processing apparatus, a server, and a storage medium.
Background
With the development of internet technology, more and more videos are published online. To supervise published content, a video publishing platform typically audits and labels each video before publishing it.
Current video auditing generally combines watching the video with inspecting extracted frame images: whether a video violates the rules is checked and marked manually, and long videos are audited mainly through their frame images. However, because this method relies only on frame images, an auditor who finishes reviewing the frames still has to watch the video itself, which is time-consuming and inefficient.
Disclosure of Invention
The disclosure provides an abnormal frame processing method, an abnormal frame processing device, a server and a storage medium, to at least solve the problem in the related art that video auditing is time-consuming and inefficient. The technical scheme of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided an abnormal frame processing method, including:
identifying abnormal frames from target video frames of a video to be identified;
acquiring frame information corresponding to the abnormal frame; the frame information comprises a frame tag of the abnormal frame and the playing time of the abnormal frame in the video to be identified;
and sending the frame information to a terminal, so that the terminal determines the corresponding abnormal frame from the video to be identified according to the frame tag and identifies the abnormal frame based on the video frames before and after the playing time in the video to be identified.
In an exemplary embodiment, the identifying an abnormal frame from the target video frames of the video to be identified includes:
acquiring the weight and the abnormal probability of the target video frame;
and screening out the video frames with the weight larger than a weight threshold value and the abnormal probability larger than an abnormal probability threshold value from the target video frames as the abnormal frames.
In an exemplary embodiment, the obtaining the weight and the abnormal probability of the target video frame includes:
extracting image feature information of the target video frame;
inputting the image feature information into a trained self-positioning attention model to obtain the weight and the abnormal probability of the target video frame; the self-positioning attention model is obtained by training a neural network model according to the image feature information, weights and abnormal probabilities of sample video frames.
In an exemplary embodiment, before the abnormal frame is identified from the target video frame of the video to be identified, the method further includes:
extracting a plurality of video frames in the video to be identified; the plurality of video frames comprise a first video frame and an intermediate video frame of the video to be identified;
and carrying out similar frame filtering processing on the plurality of video frames to obtain a target video frame of the video to be identified.
In an exemplary embodiment, the performing similar frame filtering processing on the plurality of video frames to obtain a target video frame of the video to be identified includes:
acquiring an identification value of each video frame; the identification value is determined based on image feature information of the video frame;
screening out, from the plurality of video frames, video frames whose identification values differ from each other within a preset range, as similar frames;
and performing picture duplicate checking processing on the similar frames, and reserving any one frame from the similar frames whose duplication degree is greater than the duplication degree threshold, to obtain the target video frame of the video to be identified.
In an exemplary embodiment, the extracting a plurality of video frames in the video to be identified includes:
acquiring the time length corresponding to the video to be identified;
determining a time interval based on the length of time; the time interval represents the time difference between two adjacent video frames extracted from the video to be identified;
and according to the time interval, extracting a first video frame and a middle video frame from the video to be identified to form a plurality of video frames in the video to be identified.
In an exemplary embodiment, after obtaining frame information corresponding to the abnormal frame, the method further includes:
acquiring a first video frame in a set time period before the playing time of the abnormal frame and a second video frame in a set time period after the playing time;
according to the playing time sequence, carrying out image synthesis processing on the first video frame, the second video frame and the abnormal frame to obtain a dynamic image corresponding to the abnormal frame;
and sending the dynamic image to the terminal so that the terminal can identify the abnormal frame based on the dynamic image.
According to a second aspect of the embodiments of the present disclosure, there is provided an abnormal frame processing apparatus including:
the identification unit is configured to identify abnormal frames from target video frames of the video to be identified;
an acquisition unit configured to perform acquisition of frame information corresponding to the abnormal frame; the frame information comprises a frame tag of the abnormal frame and the playing time of the abnormal frame in the video to be identified;
and the sending unit is configured to execute sending of the frame information to a terminal, so that the terminal determines a corresponding abnormal frame from the video to be identified according to the frame tag, and identifies the abnormal frame based on video frames before and after the playing time in the video to be identified.
In an exemplary embodiment, the identifying unit is further configured to perform obtaining the weight and the abnormal probability of the target video frame; and screening out the video frames with the weight larger than a weight threshold value and the abnormal probability larger than an abnormal probability threshold value from the target video frames as the abnormal frames.
In an exemplary embodiment, the identifying unit is further configured to perform extracting image feature information of the target video frame; inputting the image feature information into a trained self-positioning attention model to obtain the weight and the abnormal probability of the target video frame; the self-positioning attention model is obtained by training a neural network model according to the image feature information, weights and abnormal probabilities of sample video frames.
In an exemplary embodiment, the apparatus further comprises:
an extraction unit configured to perform extraction of a plurality of video frames in the video to be identified; the plurality of video frames comprise a first video frame and an intermediate video frame of the video to be identified;
and the filtering unit is configured to perform similar frame filtering processing on the plurality of video frames to obtain a target video frame of the video to be identified.
In an exemplary embodiment, the filtering unit is further configured to perform obtaining an identification value of each of the video frames; the identification value is determined based on image feature information of the video frame; screening out video frames with the difference values between the identification values within a preset range from the plurality of video frames as similar frames; and performing picture duplicate checking processing on the similar frames, and reserving any frame from the similar frames with the duplication degree greater than the duplication degree threshold value to obtain the target video frame of the video to be identified.
In an exemplary embodiment, the extracting unit is further configured to perform obtaining a time length corresponding to the video to be identified; determining a time interval based on the length of time; the time interval represents the time difference between two adjacent video frames extracted from the video to be identified; and according to the time interval, extracting a first video frame and a middle video frame from the video to be identified to form a plurality of video frames in the video to be identified.
In an exemplary embodiment, the apparatus further includes a synthesizing unit configured to perform acquiring a first video frame within a set period before a playback time of the abnormal frame and a second video frame within a set period after the playback time; according to the playing time sequence, carrying out image synthesis processing on the first video frame, the second video frame and the abnormal frame to obtain a dynamic image corresponding to the abnormal frame; and sending the dynamic image to the terminal so that the terminal can identify the abnormal frame based on the dynamic image.
According to a third aspect of the embodiments of the present disclosure, there is provided a server, including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the method of any one of the above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein instructions, when executed by a processor of a server, enable the server to perform the method of any one of the above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising instructions which, when executed by a processor of a server, enable the server to perform the method as defined in any one of the above.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
After an abnormal frame is identified from the target video frames of the video to be identified, the frame tag of the abnormal frame and its playing time in the video to be identified are obtained and sent to the terminal. The terminal can then quickly determine the corresponding abnormal frame from the video to be identified according to the frame tag, without each video frame having to be checked manually, and can jump directly to the position of the abnormal frame based on its playing time. As a result, the abnormal frame can be verified by watching only the video frames around the corresponding playing time, the auditor does not need to watch the complete video, and video auditing efficiency is greatly improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a diagram illustrating an application environment for a method of exception frame handling in accordance with an illustrative embodiment.
Fig. 2 is a flowchart illustrating a method of exception frame handling according to an example embodiment.
Fig. 3 is a flowchart illustrating a method of exception frame handling according to another example embodiment.
FIG. 4 is a flowchart illustrating a method of anomalous frame identification and processing in accordance with an exemplary embodiment.
Fig. 5 is a block diagram illustrating a structure of an abnormal frame processing apparatus according to an exemplary embodiment.
FIG. 6 is a block diagram illustrating a server in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The abnormal frame processing method provided by the present disclosure may be applied to the application environment shown in Fig. 1, which includes a terminal 110 and a server 120 interacting through a network. The terminal 110 may be, but is not limited to, a personal computer, a notebook computer, a smart phone, a tablet computer or a portable wearable device, and the server 120 may be implemented by an independent server or by a server cluster consisting of a plurality of servers. In the application scenario of the present disclosure, the server 120 identifies an abnormal frame from the target video frames of a video to be identified, acquires the frame tag of the abnormal frame and its playing time in the video to be identified as frame information, and sends the frame information to the terminal 110; the terminal 110 determines the corresponding abnormal frame from the video to be identified according to the frame tag and identifies the abnormal frame based on the video frames before and after the playing time.
Fig. 2 is a flowchart illustrating an abnormal frame processing method according to an exemplary embodiment, and as shown in fig. 2, the method is described as being applied to the server 120, and includes the following steps:
in step S210, an abnormal frame is identified from the target video frames of the video to be identified.
The target video frame represents a plurality of video frames extracted from the video to be identified.
The abnormal frame may represent a video frame containing abnormal content.
In a specific implementation, after the server 120 obtains the video to be identified, it may use an ffmpeg command to extract video frames from the video, obtaining a plurality of target video frames, store them in a sharded database table, and then perform anomaly identification on the target video frames to screen out abnormal frames. More specifically, the weight and the abnormal probability of each target video frame may be obtained, and the abnormal frames determined from the target video frames based on those weights and abnormal probabilities.
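As a concrete illustration, the ffmpeg-based frame extraction described above might be scripted as below. This is a minimal sketch, not the patent's actual implementation: the output naming pattern and the `fps` sampling filter are assumptions.

```python
import subprocess
from pathlib import Path


def build_extract_cmd(video_path: str, out_dir: str, interval_s: int) -> list:
    """Build an ffmpeg command that samples one frame every interval_s seconds.

    The fps=1/interval filter emits the frame at second 0 (the first frame)
    and then one frame per interval after it.
    """
    return [
        "ffmpeg", "-i", video_path,
        "-vf", "fps=1/%d" % interval_s,
        str(Path(out_dir) / "frame_%06d.png"),
    ]


def extract_frames(video_path: str, out_dir: str, interval_s: int) -> None:
    """Run the extraction, creating the output directory if needed."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_extract_cmd(video_path, out_dir, interval_s), check=True)
```

The extracted image files would then be fed to the anomaly identification step; storing them in the sharded table is omitted here.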
In step S220, frame information corresponding to the abnormal frame is acquired; the frame information comprises a frame tag of the abnormal frame and the playing time of the abnormal frame in the video to be identified.
In practical application, the frame tag may be a number, a symbol, a character or the like; for example, 0 may be used as the normal frame tag and 1 as the abnormal frame tag.
In a specific implementation, after the server 120 identifies the abnormal frames and before the target video frames are stored, a frame tag may be added to each video frame to indicate whether it is a normal frame or an abnormal frame. The playing time of each target video frame in the video to be identified may also be obtained, and each target video frame is then stored in the sharded table together with its associated frame tag and playing time.
In practical applications, a frame type (frameType) field may be added to the frame information of each video frame and the frame tag written into it, so that the type of the video frame can be identified from the frame tag in the frame type field.
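To make the frameType idea concrete, a minimal sketch of such a frame-information record follows. The field name `playTimeMs` and the 0/1 tag values are illustrative assumptions; the text only says tags may be numbers, symbols or characters.

```python
from dataclasses import dataclass

NORMAL_FRAME = 0    # example tag values: 0 = normal frame, 1 = abnormal frame
ABNORMAL_FRAME = 1


@dataclass
class FrameInfo:
    frameType: int   # frame tag written into the added frame-type field
    playTimeMs: int  # playing time of the frame in the video (assumed unit)


def is_abnormal(info: FrameInfo) -> bool:
    """Identify the type of a video frame from its frame-type field."""
    return info.frameType == ABNORMAL_FRAME
```

The terminal would read this field per frame to decide which frames to mark and which playing positions to offer as jump targets.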
In step S230, frame information is sent to the terminal, so that the terminal determines a corresponding abnormal frame from the video to be identified according to the frame tag, and identifies the abnormal frame based on the video frames before and after the playing time in the video to be identified.
In a specific implementation, after obtaining the frame tag and the playing time of the abnormal frame and forming the frame information, the server 120 may send the frame information to the terminal 110. After receiving the frame information, the terminal 110 may determine whether each target video frame is an abnormal frame or a normal frame based on the frame tag in the frame information, so as to mark the abnormal frames, and may jump to the playing position of an abnormal frame in the video to be identified based on its playing time, so that the abnormal frame is audited through the video frames in the periods before and after that position. The marking of an abnormal frame may be realized by highlighting its border, adding a color, and the like, so that it is distinguished from normal frames.
According to the abnormal frame processing method above, after an abnormal frame is identified from the target video frames of the video to be identified, the frame tag of the abnormal frame and its playing time in the video to be identified are obtained and sent to the terminal. The terminal can quickly determine the corresponding abnormal frame according to the frame tag, without each video frame having to be checked manually, and can jump directly to the position of the abnormal frame based on its playing time. The abnormal frame can therefore be verified by watching only the video frames around the corresponding playing time; the auditor does not need to watch the complete video, and video auditing efficiency is greatly improved.
In an exemplary embodiment, in step S210, the abnormal frame is identified from the target video frame of the video to be identified, which may be specifically implemented by the following steps: acquiring the weight and the abnormal probability of a target video frame; and screening out the video frames with the weight larger than the weight threshold value and the abnormal probability larger than the abnormal probability threshold value from the target video frames as abnormal frames.
In a specific implementation, after the weight and the abnormal probability of each target video frame are obtained, they may be compared with a preset weight threshold and a preset abnormal probability threshold, respectively, and the video frames whose weight is greater than the weight threshold and whose abnormal probability is greater than the abnormal probability threshold are taken as the abnormal frames.
In practical applications, the video frames whose abnormal probability is greater than the abnormal probability threshold may first be screened out of the target video frames, and the frame with the maximum weight among them then determined as the abnormal frame; the abnormal probability threshold may be set to 0.14, for example, and when a plurality of frames share the maximum weight, each of them is determined to be an abnormal frame.
In this embodiment, combining the weight of the target video frame with its abnormal probability when determining abnormal frames improves the accuracy of the determined abnormal frames.
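Both screening variants above can be sketched as small filters. The 0.14 probability threshold is the example value from the text; the 0.5 weight threshold is an assumed placeholder.

```python
def screen_abnormal_frames(frames, weight_threshold=0.5, prob_threshold=0.14):
    """Keep frames whose weight and abnormal probability both exceed thresholds.

    frames: list of (frame_id, weight, abnormal_probability) tuples.
    """
    return [fid for fid, w, p in frames
            if w > weight_threshold and p > prob_threshold]


def screen_by_max_weight(frames, prob_threshold=0.14):
    """Variant from the text: among probability-qualified frames, keep every
    frame tied for the maximum weight."""
    qualified = [(fid, w) for fid, w, p in frames if p > prob_threshold]
    if not qualified:
        return []
    w_max = max(w for _, w in qualified)
    return [fid for fid, w in qualified if w == w_max]
```

For example, a frame with weight 0.9 and abnormal probability 0.2 passes both filters, while a frame with probability 0.1 is rejected by the probability threshold alone.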
Further, in an exemplary embodiment, the step of obtaining the weight and the abnormal probability of the target video frame may be implemented by: extracting image characteristic information of a target video frame; inputting image characteristic information into a trained self-positioning attention model to obtain the weight and the abnormal probability of a target video frame; and the self-positioning attention model is obtained by training the neural network model according to the image characteristic information, the weight and the abnormal probability of the sample video frame.
In a specific implementation, a neural network model may be constructed in advance as the self-positioning attention model to be trained. Sample video frames together with their weights and abnormal probabilities are obtained, and the image feature information of the sample video frames is extracted as the input of the model. During training, the weights and abnormal probabilities output by the model are compared with the actual weights and abnormal probabilities of the sample video frames to obtain a precision value for the training round; when the precision value is greater than a threshold, the model parameters are adjusted and the model is retrained, until the precision value is smaller than the threshold and the trained self-positioning attention model is obtained. After a target video frame is obtained, its image feature information can then be extracted and input into the trained self-positioning attention model to obtain its weight and abnormal probability.
In this embodiment, a self-positioning attention model is trained to produce the weight and the abnormal probability of the target video frame, so that abnormal frames can be identified based on these values, improving the accuracy of the identification result.
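The train-adjust-retrain cycle above can be sketched independently of the network itself. The snippet below captures only the stopping logic; the actual self-positioning attention model, its precision metric, and the threshold value are not specified in this document, so the training step is taken as a caller-provided callable.

```python
def train_until_precise(model_step, precision_threshold=0.05, max_rounds=100):
    """Run training rounds until the precision value drops below the threshold.

    model_step: callable performing one training round (forward pass on the
    sample frames, comparison of predicted weight/abnormal probability with
    the labels, parameter adjustment) and returning that round's precision
    value. Returns (rounds_run, final_precision).
    """
    precision = float("inf")
    for round_idx in range(max_rounds):
        precision = model_step()
        if precision < precision_threshold:
            return round_idx + 1, precision
    return max_rounds, precision
```

A real implementation would wrap an optimizer step over the neural network inside `model_step`; `max_rounds` guards against training that never reaches the threshold.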
In an exemplary embodiment, before the step S210 identifies an abnormal frame from the target video frames of the video to be identified, the method further includes: extracting a plurality of video frames in a video to be identified; the multiple video frames comprise a first video frame and a middle video frame of a video to be identified; and carrying out similar frame filtering processing on the plurality of video frames to obtain a target video frame of the video to be identified.
In a specific implementation, considering that video frames are easily missed when the video plays too fast or stalls while loading just after playback begins, the first video frame of the video to be identified may be extracted first. Specifically, an ffmpeg command may be used to extract the frame at second 0 of the video to obtain the first video frame; video frames are then extracted at a preset time interval starting from second 0 to obtain a plurality of intermediate video frames, and the first video frame and the intermediate video frames form the plurality of video frames of the video to be identified. Because many frames may be extracted, and because for videos with a single scene and little content change (such as lecture-style videos) the extracted frames are highly repetitive, similar frame filtering may further be performed on the extracted frames, keeping only one of any set of highly similar frames, so as to reduce redundancy.
In practical application, a plurality of video frames extracted from a video to be identified can be all video frames in the video to be identified, so that the omission of the video frames is reduced.
In this embodiment, extracting the first video frame and a plurality of intermediate video frames of the video to be identified reduces missed detections, while similar frame filtering of the extracted frames reduces redundancy, so that video frame identification efficiency is improved without missing abnormal frames.
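The "first frame plus intermediate frames" sampling can be sketched as a list of extraction timestamps; the interval itself is assumed to be given.

```python
def sample_timestamps(duration_s: float, interval_s: float) -> list:
    """Timestamps at which frames are extracted: second 0 (the first frame,
    so fast-playing or stalled openings are not missed) plus one
    intermediate frame per interval thereafter."""
    timestamps = [0.0]
    t = interval_s
    while t < duration_s:
        timestamps.append(t)
        t += interval_s
    return timestamps
```

Each timestamp would then be passed to the ffmpeg extraction; similar-frame filtering happens afterwards on the extracted images.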
In an exemplary embodiment, similar frame filtering processing is performed on a plurality of video frames to obtain a target video frame of a video to be identified, and the method can be implemented by the following steps: acquiring an identification value of each video frame; the identification value is determined based on image characteristic information of the video frame; screening out video frames with the difference values between the identification values within a preset range from the plurality of video frames as similar frames; and performing picture duplicate checking processing on the similar frames, and reserving any frame from the similar frames with the duplication degree greater than the duplication degree threshold value to obtain a target video frame of the video to be identified.
The identification value is a value derived from the image feature information of the video frame, and may specifically be an MD5 value; MD5 (the MD5 message-digest algorithm) is a widely used hash function that produces a 128-bit hash value.
In a specific implementation, the identification value of each video frame can be calculated with the message-digest algorithm, and the difference between the identification values of every two video frames is then computed; frames whose difference falls within the preset range are taken as similar frames. Picture duplicate checking is then performed on the similar frames to obtain the duplication degree between them; a duplication degree greater than the duplication degree threshold indicates that the content of the similar frames largely coincides, so any one frame can be kept from such a group and the others discarded, reducing redundancy among the video frames and yielding the target video frames of the video to be identified.
In this embodiment, the plurality of video frames extracted from the video to be identified are first preliminarily screened by their identification values to obtain similar frames, and the similar frames are then put through picture duplicate checking for further identification and de-duplication, which improves the accuracy of the de-duplication result.
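The two-stage filtering can be sketched as follows. This takes the text literally by comparing raw MD5 values numerically; in practice a perceptual hash compared by Hamming distance is the usual choice, since raw MD5 differences do not track visual similarity. The `duplication_degree` picture duplicate check is assumed to be supplied by the caller.

```python
import hashlib


def identification_value(feature_bytes: bytes) -> int:
    """Identification value of a frame: the MD5 digest of its image-feature
    bytes, read as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(feature_bytes).digest(), "big")


def filter_similar_frames(frames, id_range, dup_threshold, duplication_degree):
    """Two-stage similar-frame filtering.

    frames: list of (frame_id, feature_bytes).
    Stage 1: frames whose identification values differ by at most id_range
    are similar-frame candidates.
    Stage 2: duplication_degree(a, b) decides whether a candidate duplicates
    an already-kept frame; only one frame per duplicate group is kept.
    """
    kept = []
    ids = {fid: identification_value(fb) for fid, fb in frames}
    for fid, fb in frames:
        duplicate = any(
            abs(ids[fid] - ids[kfid]) <= id_range
            and duplication_degree(fb, kfb) > dup_threshold
            for kfid, kfb in kept
        )
        if not duplicate:
            kept.append((fid, fb))
    return [fid for fid, _ in kept]
```

The surviving frame ids are the target video frames passed on to anomaly identification.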
In an exemplary embodiment, extracting a plurality of video frames in a video to be identified includes: acquiring the time length corresponding to a video to be identified; determining a time interval based on the length of time; the time interval represents the time difference between two adjacent video frames extracted from the video to be identified; according to the time interval, extracting a first video frame and a middle video frame from the video to be identified to form a plurality of video frames in the video to be identified.
In a specific implementation, before the plurality of video frames are extracted, the time length corresponding to the video to be identified, that is, the time required to play it, is obtained. A time interval for extracting video frames is then determined from this time length, following the principle that the longer the time length, the larger the time interval, and the shorter the time length, the smaller the time interval. The first video frame and the intermediate video frames are then extracted from the video to be identified at this time interval to form the plurality of video frames.
In this embodiment, the time interval is determined on the principle that the longer the time length, the larger the time interval, and the shorter the time length, the smaller the time interval, so that the interval adapts to the time length of the video to be recognized. This avoids two problems: a long video paired with a small interval would yield too many extracted frames and increase the amount of calculation, while a short video paired with a large interval would yield too few frames and omit video frames.
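The interval selection above can be sketched as a simple rule. The `target_frames` and `min_interval_s` parameters are illustrative assumptions; the patent only states that the interval grows with the time length.

```python
def choose_interval(duration_s: float,
                    target_frames: int = 60,
                    min_interval_s: float = 1.0) -> float:
    # The interval grows with the video's time length: a long video is
    # sampled sparsely, a short one densely (never below min_interval_s).
    return max(min_interval_s, duration_s / target_frames)

def frame_timestamps(duration_s: float) -> list:
    # First video frame at 0 s, then intermediate frames one interval apart.
    interval = choose_interval(duration_s)
    times, t = [], 0.0
    while t < duration_s:
        times.append(round(t, 3))
        t += interval
    return times
```

With these defaults, a 30-second video is sampled once per second, while a 600-second video is sampled every 10 seconds, keeping the extracted frame count bounded.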
In an exemplary embodiment, after obtaining frame information corresponding to the abnormal frame, the method further includes: acquiring a first video frame in a set time period before the playing time of the abnormal frame and a second video frame in the set time period after the playing time; according to the playing time sequence, carrying out image synthesis processing on the first video frame, the second video frame and the abnormal frame to obtain a dynamic image corresponding to the abnormal frame; and sending the dynamic image to the terminal so that the terminal can identify the abnormal frame based on the dynamic image.
In a specific implementation, the dynamic image synthesized from the first video frame, the second video frame and the abnormal frame may be a GIF (Graphics Interchange Format) moving image. For example, if the playing time of the acquired abnormal frame is 1 minute 30 seconds, the video frames within the 3 seconds before it (from 1 minute 27 seconds to 1 minute 30 seconds in the video to be identified) may be acquired as the first video frames, and the video frames within the 3 seconds after it (from 1 minute 30 seconds to 1 minute 33 seconds) as the second video frames. The first video frames, the abnormal frame and the second video frames are then image-synthesized, ordered by each frame's playing time in the video to be identified, to obtain the GIF moving image corresponding to the abnormal frame, which is transmitted to the terminal so that the terminal can identify the abnormal frame based on it.
In this embodiment, the video frames within the two time periods before and after the playing time of the abnormal frame are image-synthesized with the abnormal frame to obtain a dynamic image, and the dynamic image is sent to the terminal. The terminal can thus identify the abnormal frame directly from the dynamic image without reviewing the entire video to be identified, which improves the auditing efficiency.
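The window selection in the example above can be sketched as follows; only the timestamp logic is shown. The actual GIF assembly is not part of the patent's disclosure beyond "image synthesis processing" (with a library such as Pillow, `Image.save(..., save_all=True, append_images=...)` could serve, but that choice is an assumption).

```python
def gif_window(frame_times, abnormal_time, span_s=3.0):
    # First video frames: within span_s seconds before the abnormal frame's
    # playing time; second video frames: within span_s seconds after it.
    before = [t for t in frame_times if abnormal_time - span_s <= t < abnormal_time]
    after = [t for t in frame_times if abnormal_time < t <= abnormal_time + span_s]
    # Concatenating in playing-time order gives the frame sequence to
    # synthesize into the GIF moving image.
    return before + [abnormal_time] + after
```

For the 1-minute-30-second example, `abnormal_time` would be 90.0 and the window covers the frames from 87 s to 93 s.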
In another exemplary embodiment, fig. 3 shows a flowchart of an abnormal frame processing method according to another exemplary embodiment. In this embodiment, the method includes the following steps:
step S302, extracting a plurality of video frames in a video to be identified; the multiple video frames comprise a first video frame and a middle video frame of a video to be identified;
step S304, carrying out similar frame filtering processing on a plurality of video frames to obtain a target video frame of the video to be identified;
step S306, extracting image characteristic information of the target video frame;
step S308, inputting the image characteristic information into a trained self-positioning attention model to obtain the weight and the abnormal probability of the target video frame; the self-positioning attention model is obtained by training a neural network model according to image characteristic information, weight and abnormal probability of a sample video frame;
step S310, screening out video frames with the weight larger than a weight threshold value and the abnormal probability larger than an abnormal probability threshold value from the target video frames as abnormal frames;
step S312, acquiring frame information corresponding to the abnormal frame; the frame information comprises a frame label of the abnormal frame and the playing time of the abnormal frame in the video to be identified;
step S314, sending frame information to the terminal so that the terminal can determine corresponding abnormal frames from the video to be identified according to the frame tags and identify the abnormal frames based on the video frames before and after the playing time in the video to be identified.
The abnormal frame processing method provided by this embodiment extracts the first video frame and a plurality of intermediate video frames of the video to be identified, which reduces missed detection of video frames, and performs similar frame filtering on the extracted frames, which reduces redundancy and improves identification efficiency; identification efficiency is thus improved without missing abnormal frames. After the abnormal frames are identified from the target video frames based on their weight and abnormal probability, the frame tags of the abnormal frames and their playing times in the video to be identified are acquired and sent to the terminal. The terminal can then quickly determine the corresponding abnormal frames from the video to be identified according to the frame tags, without each video frame being reviewed manually, and can jump directly to the position of an abnormal frame based on its playing time. An auditor therefore only needs to watch the video frames around the playing time corresponding to the abnormal frame, rather than the complete video, which greatly improves the auditing efficiency of the video.
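The screening in step S310 can be sketched as a simple filter. The weight threshold of 0.5 and the field names are illustrative assumptions; the abnormal probability threshold of 0.14 is taken from the application example below.

```python
def screen_abnormal_frames(frames, weight_threshold=0.5, prob_threshold=0.14):
    # Step S310: a frame is abnormal only if both its model-assigned weight
    # and its abnormal probability exceed their thresholds.
    return [f for f in frames
            if f["weight"] > weight_threshold
            and f["abnormal_probability"] > prob_threshold]
```

In the full pipeline, `weight` and `abnormal_probability` would come from the self-positioning attention model of step S308.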
In an exemplary embodiment, to facilitate understanding of the embodiments of the present application by those skilled in the art, reference will now be made to the specific examples illustrated in the accompanying drawings. Referring to fig. 4, a flow chart of a method for identifying and processing an abnormal frame in an application example is shown, which includes the following steps:
(1) After the video to be identified is uploaded to the server, an ffmpeg command is used to extract frames from it. Specifically, the video frame at 0 seconds of the video resource is extracted as the first video frame, and a plurality of further video frames are selected from all frames of the video with an extraction strategy of one frame per second and stored in a BlobStore.
(2) Similar frame filtering processing is performed on the extracted video frames. Specifically, highly similar frames are filtered out through the MD5 algorithm and a picture duplicate checking algorithm, and any one frame image among the similar frames is retained to obtain the target video frames.
(3) The weight and the abnormal probability of each target video frame are obtained through a self-positioning attention model (Self-position Attention), and the video frame whose abnormal probability is greater than 0.14 and whose weight is the largest is screened out from the target video frames as the abnormal frame.
(4) A frameType field is added for each video frame, and the frameType value of an abnormal frame is set to an abnormal frame tag. Through the identification capability of the MMU algorithm, abnormal frame information is monitored via asynchronous messages to obtain the playing time of the abnormal frame in the video to be identified. The abnormal frame, its frameType value and its playing time are then sent to the auditing platform and stored in a shared table (a database-backed table).
(5) On one hand, the auditing platform takes the abnormal frame out of the shared table according to its frameType value, constructs a data structure and sends it to the front end, so that the front end can quickly determine the abnormal frame from the frameType value and highlight it with a frame label. On the other hand, the abnormal frame and its corresponding playing time are taken out of the shared table according to the frameType value, and a data structure is constructed and sent to the front end, so that the front end can locate the playing position of the abnormal frame in the video to be identified according to that playing time and check the abnormal frame based on the video frames before and after that position.
The method for identifying and processing abnormal frames provided by this embodiment sends the frameType value of each video frame to the front end, so that the front end can directly and quickly determine the abnormal frame from the frameType value and label it, improving labeling efficiency and accuracy and saving manpower. It also sends the playing time of the abnormal frame in the video to be identified to the front end, so that the front end can quickly locate the abnormal frame's playing position from that time without the complete video being watched, improving the auditing efficiency for abnormal frames.
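The record stored in the shared table and sent to the front end in steps (4)-(5) might look like the following sketch. The field names (`frameId`, `frameType`, `playTime`) and values are hypothetical; only the existence of a frameType field is stated in the source.

```python
def build_frame_record(frame_id, play_time_s, is_abnormal):
    # Record for the shared table and the front end; the frameType value
    # lets the front end highlight abnormal frames and seek the player
    # to play_time_s.
    return {
        "frameId": frame_id,
        "frameType": "abnormal" if is_abnormal else "normal",
        "playTime": play_time_s,
    }
```

The front end would filter records on `frameType` to highlight abnormal frames and use `playTime` to seek the video player.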
It should be understood that although the steps in the flowcharts of figs. 2-4 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, there is no strict order limitation on the execution of these steps, and they may be performed in other orders. Moreover, at least some of the steps in figs. 2-4 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
It is understood that the same or similar parts of the method embodiments described above in this specification may be referred to from one another. Each embodiment focuses on its differences from the other embodiments; for the relevant common points, reference may be made to the descriptions of the other method embodiments.
Fig. 5 is a block diagram illustrating a structure of an abnormal frame processing apparatus according to an exemplary embodiment. Referring to fig. 5, the apparatus includes: an identification unit 502, an acquisition unit 504 and a sending unit 506, wherein,
an identifying unit 502 configured to perform identifying an abnormal frame from target video frames of a video to be identified;
an obtaining unit 504 configured to perform obtaining frame information corresponding to the abnormal frame; the frame information comprises a frame label of the abnormal frame and the playing time of the abnormal frame in the video to be identified;
a sending unit 506, configured to execute sending of the frame information to a terminal, so that the terminal determines a corresponding abnormal frame from the video to be identified according to the frame tag, and identifies the abnormal frame based on video frames before and after the playing time in the video to be identified.
In an exemplary embodiment, the identifying unit 502 is further configured to perform obtaining the weight and the abnormal probability of the target video frame; and screening out the video frames with the weight larger than a weight threshold value and the abnormal probability larger than an abnormal probability threshold value from the target video frames as the abnormal frames.
In an exemplary embodiment, the identifying unit 502 is further configured to perform extracting image feature information of the target video frame; inputting the image characteristic information into a trained self-positioning attention model to obtain the weight and the abnormal probability of the target video frame; the self-positioning attention model is obtained by training a neural network model according to image feature information, weight and abnormal probability of a sample video frame.
In an exemplary embodiment, the apparatus further comprises:
an extraction unit configured to perform extraction of a plurality of video frames in the video to be identified; the plurality of video frames comprise a first video frame and an intermediate video frame of the video to be identified;
and the filtering unit is configured to perform similar frame filtering processing on the plurality of video frames to obtain a target video frame of the video to be identified.
In an exemplary embodiment, the filtering unit is further configured to perform obtaining an identification value of each of the video frames; the identification value is determined based on image feature information of the video frame; screening out video frames with the difference values between the identification values within a preset range from the plurality of video frames as similar frames; and performing picture duplicate checking processing on the similar frames, and reserving any frame from the similar frames with the duplication degree greater than the duplication degree threshold value to obtain the target video frame of the video to be identified.
In an exemplary embodiment, the extracting unit is further configured to perform obtaining a time length corresponding to the video to be identified; determining a time interval based on the length of time; the time interval represents the time difference between two adjacent video frames extracted from the video to be identified; and according to the time interval, extracting a first video frame and a middle video frame from the video to be identified to form a plurality of video frames in the video to be identified.
In an exemplary embodiment, the apparatus further includes a synthesizing unit configured to perform acquiring a first video frame within a set period before a playback time of the abnormal frame and a second video frame within a set period after the playback time; according to the playing time sequence, carrying out image synthesis processing on the first video frame, the second video frame and the abnormal frame to obtain a dynamic image corresponding to the abnormal frame; and sending the dynamic image to the terminal so that the terminal can identify the abnormal frame based on the dynamic image.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 6 is a block diagram illustrating a server 600 for implementing an exception frame handling method according to an example embodiment. The server 600 includes a processing component 620 that further includes one or more processors, and memory resources, represented by memory 622, for storing instructions, such as application programs, that are executable by the processing component 620. The application programs stored in memory 622 may include one or more modules that each correspond to a set of instructions. Further, the processing component 620 is configured to execute instructions to perform the above-described methods.
The server 600 may further include: a power component 624 configured to perform power management of the server 600, a wired or wireless network interface 626 configured to connect the server 600 to a network, and an input/output (I/O) interface 628. The server 600 may operate based on an operating system stored in the memory 622, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.
In an exemplary embodiment, a computer-readable storage medium comprising instructions, such as the memory 622 comprising instructions, executable by the processor of the server 600 to perform the above-described method is also provided. The storage medium may be a computer-readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, which includes instructions executable by a processor of the server 600 to perform the above-described method.
It should be noted that the descriptions of the above-mentioned apparatus, server, computer-readable storage medium, computer program product, etc. according to the method embodiments may also include other embodiments, and specific implementations may refer to the descriptions of the related method embodiments, which are not described herein in detail.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. An abnormal frame processing method, comprising:
identifying abnormal frames from target video frames of a video to be identified;
acquiring frame information corresponding to the abnormal frame; the frame information comprises a frame label of the abnormal frame and the playing time of the abnormal frame in the video to be identified;
and sending the frame information to a terminal so that the terminal determines a corresponding abnormal frame from the video to be identified according to the frame tag, and identifying the abnormal frame based on the video frames before and after the playing time in the video to be identified.
2. The method according to claim 1, wherein the identifying the abnormal frame from the target video frame of the video to be identified comprises:
acquiring the weight and the abnormal probability of the target video frame;
and screening out the video frames with the weight larger than a weight threshold value and the abnormal probability larger than an abnormal probability threshold value from the target video frames as the abnormal frames.
3. The method of claim 2, wherein the obtaining the weight and the anomaly probability of the target video frame comprises:
extracting image characteristic information of the target video frame;
inputting the image characteristic information into a trained self-positioning attention model to obtain the weight and the abnormal probability of the target video frame; the self-positioning attention model is obtained by training a neural network model according to image feature information, weight and abnormal probability of a sample video frame.
4. The method of claim 1, further comprising, before identifying an abnormal frame from the target video frames of the video to be identified:
extracting a plurality of video frames in the video to be identified; the plurality of video frames comprise a first video frame and an intermediate video frame of the video to be identified;
and carrying out similar frame filtering processing on the plurality of video frames to obtain a target video frame of the video to be identified.
5. The method according to claim 4, wherein the performing similar frame filtering processing on the plurality of video frames to obtain a target video frame of the video to be identified comprises:
acquiring an identification value of each video frame; the identification value is determined based on image feature information of the video frame;
screening out video frames with the difference values between the identification values within a preset range from the plurality of video frames as similar frames;
and performing picture duplicate checking processing on the similar frames, and reserving any frame from the similar frames with the duplication degree greater than the duplication degree threshold value to obtain the target video frame of the video to be identified.
6. The method according to claim 4, wherein the extracting a plurality of video frames in the video to be identified comprises:
acquiring the time length corresponding to the video to be identified;
determining a time interval based on the length of time; the time interval represents the time difference between two adjacent video frames extracted from the video to be identified;
and according to the time interval, extracting a first video frame and a middle video frame from the video to be identified to form a plurality of video frames in the video to be identified.
7. An abnormal frame processing apparatus, comprising:
the identification unit is configured to identify abnormal frames from target video frames of the video to be identified;
an acquisition unit configured to perform acquisition of frame information corresponding to the abnormal frame; the frame information comprises a frame label of the abnormal frame and the playing time of the abnormal frame in the video to be identified;
and the sending unit is configured to execute sending of the frame information to a terminal, so that the terminal determines a corresponding abnormal frame from the video to be identified according to the frame tag, and identifies the abnormal frame based on video frames before and after the playing time in the video to be identified.
8. A server, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the exception frame handling method of any of claims 1 to 6.
9. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of a server, enable the server to perform the method of any of claims 1-6.
10. A computer program product comprising instructions which, when executed by a processor of a server, enable the server to perform the method of exception frame handling of any one of claims 1 to 6.
CN202111280047.4A 2021-10-29 2021-10-29 Abnormal frame processing method, abnormal frame processing device, server and storage medium Pending CN114005062A (en)


