CN110796053B

CN110796053B - Video detection method and device, electronic equipment and computer readable storage medium

Info

Publication number: CN110796053B
Application number: CN201911001034.1A
Authority: CN
Inventors: 陆瀛海
Original assignee: Beijing QIYI Century Science and Technology Co Ltd
Current assignee: Beijing QIYI Century Science and Technology Co Ltd
Priority date: 2019-10-21
Filing date: 2019-10-21
Publication date: 2022-07-29
Anticipated expiration: 2039-10-21
Also published as: CN110796053A

Abstract

Embodiments of the invention provide a video detection method, a video detection device, electronic equipment, and a computer-readable storage medium. The method comprises: acquiring a video to be detected, and detecting the interval between every two adjacent I frames in the video to be detected, wherein the interval between two adjacent I frames indicates the number of frames between them; acquiring candidate frames, wherein the candidate frames at least comprise all I frames whose interval is smaller than a preset threshold; and detecting illegal content in the video to be detected based on the candidate frames to obtain and output a detection result. When only a few frames containing illegal content are embedded in a video, embodiments of the invention reduce the cases in which the illegal content is missed because the frames containing it are not sampled, thereby improving detection accuracy.

Description

Video detection method and device, electronic equipment and computer readable storage medium

Technical Field

The present invention relates to the field of video detection technologies, and in particular, to a video detection method, an apparatus, an electronic device, and a computer-readable storage medium.

Background

Existing videos are often maliciously embedded with illegal content, and such videos influence viewers by displaying that content; for example, embedded advertisements steer users toward purchases, and embedded illegal video clips have a negative effect on users. Auditing video content to detect illegal content is therefore a very important task for a video distribution platform.

At present, video content is usually audited by sampling frames from the video, for example by extracting one or several frames at regular intervals and checking them for illegal content. However, if only a few frames containing illegal content are embedded in the video, for example a single frame, interval-based frame extraction is likely to miss the illegal content because the frames containing it are never sampled, which results in low detection accuracy.

Disclosure of Invention

Embodiments of the present invention provide a video detection method, an apparatus, an electronic device, and a computer-readable storage medium, so as to achieve the purpose of improving the detection accuracy of illegal content when the number of frames embedded with the illegal content in a video is small. The specific technical scheme is as follows:

In a first aspect of the present invention, there is provided a video detection method, including:

acquiring a video to be detected;

detecting the interval between every two adjacent I frames in the video to be detected; wherein the interval between the two adjacent I frames is used for indicating the number of frames between the two adjacent I frames;

acquiring candidate frames, wherein the candidate frames at least comprise all I frames with the interval smaller than a preset threshold;

and detecting illegal contents of the video to be detected based on the candidate frames to obtain and output a detection result.

In a second aspect of the present invention, there is also provided a video detection apparatus, comprising:

the first acquisition module is used for acquiring a video to be detected;

the first detection module is used for detecting the interval between every two adjacent I frames in the video to be detected; wherein the interval between the two adjacent I frames is used for indicating the number of frames between the two adjacent I frames;

a second obtaining module, configured to obtain candidate frames, where the candidate frames at least include all I frames whose intervals are smaller than a preset threshold;

and the second detection module is used for detecting the violation content of the video to be detected based on the candidate frames to obtain and output a detection result.

In yet another aspect of the present invention, there is also provided a computer-readable storage medium having stored therein instructions, which when run on a computer, cause the computer to execute any of the above-described video detection methods.

In yet another aspect of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the video detection methods described above.

According to the video detection method, the video detection device, the electronic equipment and the computer readable storage medium, the I frame with the interval between two adjacent I frames in the video to be detected smaller than the preset threshold value is selected to serve as the candidate frame for detecting the illegal content, the illegal content is detected on the basis of the candidate frame, and the detection result is obtained and output.

Because embedding illegal content in a video often requires a new I frame to be generated, the I-frame spacing of a video with embedded illegal content changes, and the interval between two adjacent I frames shrinks. Selecting all I frames whose interval is smaller than a preset threshold as candidate frames for illegal content detection therefore reduces the cases in which illegal content is missed because the frames containing it are not sampled when only a few such frames are embedded, and thereby improves detection accuracy.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.

FIG. 1 is a flowchart illustrating a video detection method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a video frame to be detected with a single frame embedded therein that contains illegal content;

FIG. 3 is a second flowchart illustrating a video detection method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram illustrating an interval between two adjacent I frames in a video to be detected;

FIG. 5 is a schematic structural diagram of a video detection apparatus according to an embodiment of the present invention;

FIG. 6 is a schematic diagram illustrating a detailed structure of a second detection module of the video detection apparatus according to the embodiment of the present invention;

FIG. 7 is a schematic diagram illustrating a detailed structure of a first detection module of the video detection apparatus according to the embodiment of the present invention;

FIG. 8 is a schematic structural diagram of an electronic device in an embodiment of the invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.

First, a video detection method provided by an embodiment of the present invention is explained.

It should be noted that the video detection method provided by the embodiment of the present invention can be applied to an electronic device. Optionally, the electronic device may be a server of a video distribution platform, configured to detect illegal content in a video before the platform distributes it.

Referring to fig. 1, a flowchart of a video detection method according to an embodiment of the present invention is shown. As shown in fig. 1, the method may include the steps of:

step 101, acquiring a video to be detected;

step 102, detecting the interval between every two adjacent I frames in the video to be detected; wherein the interval between the two adjacent I frames is used for indicating the number of frames between the two adjacent I frames;

step 103, obtaining candidate frames, wherein the candidate frames at least comprise all I frames with the interval smaller than a preset threshold;

step 104, detecting violation contents of the video to be detected based on the candidate frames to obtain and output a detection result.
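The four steps above can be sketched end to end as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the video is represented by its per-frame picture types, `is_violation` is a hypothetical callback standing in for whatever content detector is used, the default threshold of 4 is an example value, and both I frames bounding a short interval are treated as candidates.

```python
from typing import Callable, List, Sequence


def detect_video(
    frame_types: Sequence[str],
    is_violation: Callable[[int], bool],
    threshold: int = 4,
) -> List[int]:
    """Return 1-based sequence numbers of candidate I frames flagged as violations.

    frame_types  -- picture type of every frame (step 101 input), e.g. "IPPBI"
    is_violation -- hypothetical callback that inspects the picture of frame n
    threshold    -- preset threshold on the I-frame interval (assumed value)
    """
    # Step 102: locate the I frames and measure the gap between neighbours.
    i_positions = [n for n, t in enumerate(frame_types, start=1) if t == "I"]
    candidates = set()
    for prev, nxt in zip(i_positions, i_positions[1:]):
        interval = nxt - prev - 1  # number of frames between the two I frames
        # Step 103: keep both I frames bounding a short interval (assumption).
        if interval < threshold:
            candidates.update((prev, nxt))
    # Step 104: run content detection on the candidate frames only.
    return [n for n in sorted(candidates) if is_violation(n)]
```

Only the candidate frames are passed to the detector, which is why the approach can be cheaper than inspecting every frame.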

In step 101, the video to be detected may be any video, for example a video that is to be published by a video publishing platform and must be checked for illegal content. To ensure that videos published by the platform do not contain illegal content, the content must be checked so that videos carrying illegal content are not published.

The video to be detected can be stored in a server of the video publishing platform, for example, in a memory of the server, and when the detection of the illegal content in the video is started, the video to be detected is directly obtained from the server. The server may also include a user interface, and may obtain the video to be detected uploaded from the user interface. Of course, the server may further include a communication interface, and the communication interface may be configured to receive the video to be detected, which is sent by the terminal of the video distribution platform.

Illegal content can be understood as content that violates the normal playing order of the video. It usually differs greatly from the pictures the video normally plays, and some of it even breaks the law, for example advertisements that are not normally online, pictures carrying unhealthy images or links, and picture text carrying extremist ideas.

Illegal content can be embedded in the video to be detected in two ways, distinguished by the number of frames that result from compression-coding the picture or pictures embedded at one playing position. In the first embedding mode, a first type of violation frame is embedded at a playing position: one picture, or several consecutive pictures, containing illegal content, so that the number of embedded frames is small, for example fewer than 5. In the second embedding mode, a second type of violation frame is embedded at a playing position: a short video containing illegal content, so that the number of embedded frames is large, for example 5 or more.

For example, a short video containing illegal content, such as an advertisement that is not normally online, is embedded at the point where the video to be detected has played normally for 24 minutes.

Of course, illegal contents may be embedded in a plurality of playing positions of the video to be detected, for example, for the video a, one picture containing the illegal contents is embedded in the playing position 1, 4 continuous pictures containing the illegal contents are embedded in the playing position 2, and a small video containing the illegal contents is embedded in the playing position 3.

Correspondingly, the video to be detected falls into one of three cases according to the embedding mode. In the first case, illegal content is embedded only in the first embedding mode, that is, only violation frames of the first type are embedded at one or more playing positions. In the second case, illegal content is embedded only in the second embedding mode, that is, only violation frames of the second type are embedded at one or more playing positions. In the third case, illegal content is embedded in both modes, that is, the video to be detected contains violation frames of both the first type and the second type.

In an embodiment, what counts as a small or a large number of frames may be determined according to actual conditions, and may be defined relative to the frame extraction interval used in interval-based frame extraction. For example, if the frame extraction interval is 5, a small number of frames is defined as fewer than 5 and a large number as 5 or more.
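To illustrate why the boundary is tied to the frame extraction interval, the sketch below checks whether fixed-interval sampling touches an embedded run of frames. It assumes that an "interval of 5" means one frame is taken every 5 frames starting from frame 1; the function names are hypothetical and the source does not prescribe this sampling scheme.

```python
def sampled_frames(total_frames: int, step: int) -> set:
    """Sequence numbers (1-based) picked by fixed-interval sampling."""
    return set(range(1, total_frames + 1, step))


def run_is_sampled(start: int, length: int, total_frames: int, step: int) -> bool:
    """True if at least one frame of the embedded run [start, start+length) is sampled."""
    run = set(range(start, start + length))
    return bool(run & sampled_frames(total_frames, step))


# A single embedded frame at position 8 slips between samples 6 and 11,
# while a run of 5 frames starting at the same position is always hit.
print(run_is_sampled(8, 1, 100, 5))
print(run_is_sampled(8, 5, 100, 5))
```

Under this assumption, any run at least as long as the sampling step is guaranteed to be sampled, whereas shorter runs may be missed depending on where they fall.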

In step 102, the video to be detected is a video compressed with a video compression technology. According to the video compression principle, such a video includes a plurality of groups of pictures (GOPs). Each GOP includes an I frame and may further include P frames and/or B frames, and a certain interval may exist between two adjacent GOPs, that is, between two adjacent I frames.

It should be noted that if no illegal content is embedded in the video to be detected, the picture is usually smooth and does not change abruptly under normal conditions, so the intervals between adjacent I frames produced by encoding the video are relatively stable and relatively large. When illegal content is embedded, the embedded picture often differs greatly from the pictures of the video to be detected. According to the video compression principle, if the pictures of two adjacent frames differ too much, the later frame cannot be coded redundantly by reference to the earlier one, and a new I frame is generated. FIG. 2 is a schematic diagram of the frames of a video to be detected in which a single frame containing illegal content is embedded. As shown in FIG. 2, the video to be detected includes frames 201, 202, 203, 204, 205, and 206. Frame 203 is an I frame newly added when the embedded picture containing illegal content is compression-coded. The normal picture that follows differs sharply from the picture containing illegal content, so another new I frame, frame 204, is generated. The number of frames between these two adjacent I frames is 0, so the interval between frame 203 and frame 204 is 0.

Based on this principle, embedding illegal content inserts a new I frame between two adjacent GOPs in the video to be detected, which reduces the interval between two adjacent I frames. The distribution of intervals between adjacent I frames can therefore be used to detect illegal content in the video to be detected.

The position of each I frame can be determined by marking each I frame in the video to be detected, and the interval between every two adjacent I frames is then determined from those positions. The interval can be calculated as d = I_{i+1} - I_i - 1, where i is a positive integer greater than or equal to 1, I_{i+1} denotes the position of the (i+1)-th I frame, I_i denotes the position of the i-th I frame, and d denotes the interval, that is, the number of frames between the i-th I frame and the (i+1)-th I frame. In practical applications, the sequence numbers of the I frames in the video to be detected may be obtained with ffmpeg or a similar tool, for example frame 1, frame 10, and so on, and the interval between every two adjacent I frames in the video to be detected is then calculated from these sequence numbers using d = I_{i+1} - I_i - 1. For example, if video A includes 4 I frames with sequence numbers 1, 10, 18, and 19, the intervals between adjacent I frames are 8, 7, and 0, respectively.
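One possible way to obtain the I-frame sequence numbers and apply the interval formula is sketched below. It assumes the `ffprobe` tool from the ffmpeg suite is installed and that numbering frames in the order ffprobe reports them matches the numbering intended here; the helper names are hypothetical, and the source only says that ffmpeg "and other methods" may be used.

```python
import subprocess
from typing import List


def probe_picture_types(path: str) -> str:
    """Run ffprobe and return its listing: one picture type (I, P or B) per line."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",            # first video stream only
        "-show_entries", "frame=pict_type",  # print just the picture type
        "-of", "csv=p=0",                    # plain values, no section prefix
        path,
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def i_frame_positions(ffprobe_output: str) -> List[int]:
    """1-based sequence numbers of the I frames in a per-frame type listing."""
    positions = []
    frame_no = 0
    for line in ffprobe_output.splitlines():
        pict_type = line.strip().split(",")[0]
        if not pict_type:
            continue  # skip blank lines
        frame_no += 1
        if pict_type == "I":
            positions.append(frame_no)
    return positions


def i_frame_intervals(positions: List[int]) -> List[int]:
    """d = I_{i+1} - I_i - 1 for each pair of adjacent I frames."""
    return [nxt - prev - 1 for prev, nxt in zip(positions, positions[1:])]


# The worked example from the text: I frames at 1, 10, 18 and 19.
print(i_frame_intervals([1, 10, 18, 19]))  # [8, 7, 0]
```

For a real file, the two helpers would be chained as `i_frame_intervals(i_frame_positions(probe_picture_types("input.mp4")))`, where `input.mp4` is a placeholder path.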

In step 103, candidate frames include at least all I frames whose interval between two adjacent I frames is smaller than a preset threshold, which have a high possibility of containing illegal contents and need to be detected. Certainly, the candidate frames may also include other frames, for example, in order to ensure that the detection completeness of the illegal content in the video to be detected meets the requirement, the detection of the illegal content in the video to be detected may be performed by using an interval frame extraction detection method, and correspondingly, the candidate frames may also include video frames extracted at regular intervals.

If each interval between two adjacent I frames in the video to be detected is greater than or equal to a preset threshold, the candidate frame may be empty, that is, the video to be detected may not have illegal content embedded therein, or the video to be detected may not have a first type of illegal frame embedded therein.

It should be noted that the preset threshold may be set according to actual situations, and is usually set to a value smaller than the interval in the interval frame-decimation manner, for example, the preset threshold may be set to 4, and the candidate frames include all I frames whose interval between two adjacent I frames is smaller than 4.
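Candidate selection can be sketched as follows. Two points are assumptions rather than statements from the source: both I frames that bound a short interval are kept as candidates, and the optional union with interval-sampled frames takes one frame every `step` frames starting from frame 1. The function names are hypothetical.

```python
from typing import Iterable, List


def select_candidates(i_positions: List[int], threshold: int = 4) -> List[int]:
    """I frames adjacent to an interval smaller than the preset threshold."""
    candidates = set()
    for prev, nxt in zip(i_positions, i_positions[1:]):
        if nxt - prev - 1 < threshold:
            candidates.update((prev, nxt))  # keep both endpoints (assumption)
    return sorted(candidates)


def with_interval_samples(candidates: Iterable[int], total_frames: int, step: int) -> List[int]:
    """Optionally add frames extracted at regular intervals to the candidates."""
    return sorted(set(candidates) | set(range(1, total_frames + 1, step)))


print(select_candidates([1, 10, 18, 19], threshold=4))  # [18, 19]
```

With I frames at 1, 10, 18, and 19 and a threshold of 4, only the pair separated by an interval of 0 is selected; if every interval is at least the threshold, the result is empty.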

It should also be noted that the comparison with the preset threshold may be performed after the intervals between all adjacent I frames in the video to be detected have been detected, comparing each interval with the preset threshold to obtain the candidate frames. Alternatively, the comparison may be performed while detection is still in progress: each interval is compared with the preset threshold as soon as it is detected, until all intervals have been detected and compared, which likewise yields the candidate frames. The implementation is not limited herein.

In step 104, the picture corresponding to each frame in the candidate frames may be obtained, and illegal content in the video to be detected is detected based on those pictures. Specifically, illegal content detection is first performed on the picture corresponding to each candidate frame; the violation frames of the video to be detected are then determined from the detection results; finally, the detection result is obtained and output based on those violation frames. When a picture is checked, it is judged whether the picture itself is illegal content or carries illegal content, and if so, the corresponding candidate frame is determined to be a violation frame of the video to be detected.

The detection result may include only the violation frames of the video to be detected, that is, only the pictures in the video that include preset violation content are output. Based on this result, reviewers can further confirm whether each picture includes the preset violation content and delete the pictures that do. The detection result may instead include only the sequence numbers of the violation frames, in which case the server locates the corresponding frames by sequence number and deletes them directly. Of course, the detection result may also include both, which is not limited herein.
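The second option, comparing each interval as soon as it is known, can be sketched with a generator. This is an illustrative assumption about how such incremental processing might look; the name `stream_candidates` is hypothetical, and as before both I frames bounding a short interval are treated as candidates.

```python
from typing import Iterable, Iterator, Optional


def stream_candidates(i_positions: Iterable[int], threshold: int = 4) -> Iterator[int]:
    """Yield candidate I frames as soon as a short interval is observed."""
    prev: Optional[int] = None
    last_yielded: Optional[int] = None
    for pos in i_positions:
        if prev is not None and pos - prev - 1 < threshold:
            if last_yielded != prev:
                yield prev  # earlier endpoint, unless it was already emitted
            yield pos
            last_yielded = pos
        prev = pos


print(list(stream_candidates(iter([1, 10, 18, 19, 20]))))  # [18, 19, 20]
```

The generator produces the same candidates as a batch comparison but lets content detection start before the whole video has been scanned.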

In addition, the detection result may further include the type of the violation frame, for example, the type of the violation frame is a violent type, that is, the violation frame is embedded with violent violation content, so that the detection personnel can perform targeted detection. For example, for a video distribution platform, many lawbreakers would like to embed violent illegal content, and the video distribution platform may need to detect violent illegal content more specifically.

It should be noted that, before detection, it is not clear what type of violation frame is embedded in the video to be detected, and therefore, in order to ensure the detection completeness of the violation content in the video to be detected, in practical applications, two or more detection modes may be required. For example, the detection mode in the embodiment of the present invention may be used in cooperation with an interval frame extraction detection mode, so as to improve a situation that the illegal content is missed for detection due to the fact that the video frame of the illegal content with a small frame number in the video to be detected is not sampled, and further improve the detection accuracy of the illegal content.

With the video detection method provided by this embodiment, the I frames whose interval to an adjacent I frame in the video to be detected is smaller than the preset threshold are selected as candidate frames, and illegal content detection is performed on the video based on those candidate frames. This reduces the cases in which illegal content is missed because the frames containing it are not sampled when only a few such frames are embedded in the video, and thereby improves detection accuracy.

In addition, since the candidate frame is detected in a targeted manner, the detection speed is high, the efficiency is high, and the resource consumption is low.

Further, based on the first embodiment, FIG. 3 shows a second flowchart of the video detection method according to a second embodiment of the invention. As shown in FIG. 3, the method may include the following steps:

step 301, acquiring a video to be detected;

step 302, detecting the interval between every two adjacent I frames in the video to be detected; wherein the interval between the two adjacent I frames is used for indicating the number of frames between the two adjacent I frames;

step 303, obtaining candidate frames, where the candidate frames at least include all I frames whose intervals are smaller than a preset threshold;

step 304, detecting violation contents of each I frame in the candidate frames, and determining a first violation frame; the first violation frame comprises preset violation content;

step 305, determining the violation frame of the video to be detected based on the first violation frame;

step 306, obtaining and outputting a detection result based on the violation frame of the video to be detected; and the detection result comprises the violation frame of the video to be detected and/or the sequence number of the violation frame of the video to be detected.

Step 301 is similar to step 101 of the first embodiment, step 302 is similar to step 102 of the first embodiment, and step 303 is similar to step 103 of the first embodiment, so that the explanation can refer to step 101, step 102, and step 103 of the first embodiment, respectively, which will not be described again.

In step 304, the first violation frame is a frame corresponding to a picture carrying preset violation content, and detection of the violation content may be performed on each I frame in the candidate frames in two ways.

In the first mode, the picture corresponding to each I frame in the candidate frames is compared with the illegal pictures in an image library. Specifically, a picture similarity comparison algorithm may be used. For example, to compare the picture corresponding to one I frame with one illegal picture, a feature value is calculated for each of the two pictures and the two feature values are compared; the closer the feature values, the greater the similarity between the picture corresponding to the I frame and the illegal picture.

If the similarity between the picture corresponding to the target I frame and the illegal picture is greater than the preset similarity, the target I frame can be determined as a first illegal frame. The illegal picture is a picture carrying illegal content.
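The source does not name a particular similarity algorithm or feature. As one illustrative choice only, the sketch below uses an average hash over a small grayscale thumbnail as the "feature value" and the fraction of matching bits as the similarity. The names and the 0.9 default are hypothetical.

```python
from typing import List, Sequence

Gray = Sequence[Sequence[int]]  # small grayscale thumbnail, values 0-255


def average_hash(pixels: Gray) -> List[int]:
    """One bit per pixel: 1 if the pixel is brighter than the thumbnail mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]


def similarity(a: Gray, b: Gray) -> float:
    """Fraction of matching hash bits, in the range [0, 1]."""
    ha, hb = average_hash(a), average_hash(b)
    if len(ha) != len(hb):
        raise ValueError("thumbnails must have the same size")
    return sum(x == y for x, y in zip(ha, hb)) / len(ha)


def matches_image_library(frame: Gray, library: Sequence[Gray],
                          preset_similarity: float = 0.9) -> bool:
    """True if the frame is more similar than the preset value to any illegal picture."""
    return any(similarity(frame, ref) > preset_similarity for ref in library)
```

A production system would more likely use a learned embedding or a perceptual hash from an image library; the structure of the comparison against a preset similarity stays the same.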

In a second mode, illegal contents are identified for picture contents corresponding to each I frame in the candidate frames; and under the condition that the picture content corresponding to the target I frame meets a preset violation condition, determining the target I frame as a first violation frame.

The violation condition may be preset, and the violation condition may include one or more of the following conditions:

the semantic content of the picture comprises preset content;

the links carried in the picture comprise preset links; and

the picture content contains advertisements that are not normally online.

Parameters of the violation conditions, such as the preset content, the preset links, and the advertisements that are not normally online, may be stored in a database in advance and read from the database when illegal content needs to be identified in the picture content corresponding to each I frame in the candidate frames. The parameters may also be stored on the network and downloaded from a cloud platform when needed; for example, they may be stored in Baidu Cloud and downloaded from there when the violation conditions are to be used.

The preset content may include unhealthy content such as violent, bloody, or pornographic content, and unsafe or illegal items such as knives, firearms, and drugs. A preset link is a link leading to unsafe, unhealthy, or illegal content, such as a link used to defraud people of money. An advertisement that is not normally online is an advertisement that the video publishing platform has not permitted to be placed, such as an advertisement for which no advertising slot has been purchased on the platform.

In implementation, whether the semantic content of the picture includes the preset content may be determined by a semantic similarity calculation. For example, the distance between the semantic content of the picture and the preset content may be calculated; the smaller the distance, the greater the semantic similarity. If the distance is smaller than a first preset distance, the semantic content of the picture is taken to include the preset content; if the distance is greater than a second preset distance, it is taken not to include the preset content, where the second preset distance is greater than the first preset distance. For example, if the semantic content of the picture is "gun" and the preset content includes "gun", the calculated distance between the two is smaller than the first preset distance, which indicates that the semantic content of the picture includes the preset content.
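The two-distance decision can be sketched as below. The source does not say how semantic content is represented or which distance is used, so the sketch assumes both the picture's semantic content and the preset content are embedding vectors compared by Euclidean distance. The function name and the default distances are hypothetical, and the behaviour between the two distances is left undecided because the source does not specify it.

```python
import math
from typing import Optional, Sequence


def includes_preset_content(
    picture_vec: Sequence[float],
    preset_vec: Sequence[float],
    first_distance: float = 0.5,
    second_distance: float = 1.0,
) -> Optional[bool]:
    """True / False per the two preset distances, None when in between."""
    if second_distance <= first_distance:
        raise ValueError("second preset distance must exceed the first")
    d = math.dist(picture_vec, preset_vec)  # Euclidean distance (Python 3.8+)
    if d < first_distance:
        return True   # close enough: picture includes the preset content
    if d > second_distance:
        return False  # far enough: picture does not include it
    return None       # undecided band, not defined by the source
```

Returning `None` for the middle band makes the gap explicit, so a caller can route such pictures to manual review or another detector.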

In the implementation process, whether the link carried in the picture includes the preset link or not can be judged through a matching method, for example, the link carried in the picture can be extracted and matched with the preset link, if the link carried in the picture is matched, the link carried in the picture can be shown to include the preset link, and if the link carried in the picture is not matched, the link carried in the picture can be shown not to include the preset link.

In the implementation process, whether the picture content includes the advertisement which is not normally online or not can be judged through a matching method, for example, a product of the advertisement in the picture can be extracted and matched with the product of the advertisement which is not normally online, if the product of the advertisement in the picture is matched, the picture content can be indicated to include the advertisement which is not normally online, and if the product of the advertisement in the picture is not matched, the picture content can be indicated to not include the advertisement which is not normally online.

In addition, if the advertisement of the advertisement position is purchased in the video publishing platform, namely the advertisement which is normally on line, the video publishing platform usually has records, therefore, the product of the advertisement in the picture can be extracted and matched with the product of the advertisement which is normally on line and recorded by the video publishing platform, if the product of the advertisement in the picture is matched, the content of the picture can be indicated not to contain the advertisement which is not normally on line, and if the product of the advertisement in the picture is not matched, the content of the picture can be indicated to contain the advertisement which is not normally on line.

Specifically, semantic recognition can be performed on the picture content corresponding to each I frame in the candidate frames to obtain the semantic content of each picture. If the picture corresponding to a target I frame shows a gun and carries a link leading to gun trading, the semantic content of the picture includes preset content and the link carried in the picture includes a preset link, so the target I frame is determined to be a first violation frame. Of course, the picture content corresponding to the target I frame only needs to satisfy any one of the violation conditions for the target I frame to be determined as a first violation frame.
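The "any one condition suffices" rule can be sketched as follows. The data classes, field names, and example values are hypothetical; exact set matching stands in for the semantic and matching methods described above, and unapproved advertisements are detected by checking against the platform's record of normally online advertisements.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class PictureInfo:
    labels: Set[str] = field(default_factory=set)       # recognised semantic content
    links: Set[str] = field(default_factory=set)        # links extracted from the picture
    ad_products: Set[str] = field(default_factory=set)  # advertised products found


@dataclass
class ViolationConditions:
    preset_content: Set[str]
    preset_links: Set[str]
    approved_ad_products: Set[str]  # products of normally online advertisements


def violated_conditions(info: PictureInfo, cond: ViolationConditions) -> List[str]:
    """Names of the violation conditions that the picture satisfies."""
    hits = []
    if info.labels & cond.preset_content:
        hits.append("preset_content")
    if info.links & cond.preset_links:
        hits.append("preset_link")
    if info.ad_products - cond.approved_ad_products:
        hits.append("unapproved_ad")
    return hits


def meets_violation_condition(info: PictureInfo, cond: ViolationConditions) -> bool:
    """A target I frame is a first violation frame if any condition is met."""
    return bool(violated_conditions(info, cond))
```

Keeping the list of satisfied conditions, rather than only a boolean, also gives a natural source for the violation type reported in the detection result.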

In step 305, the first violation frame may include many target I-frames, which may include two types, the first type being: the inter-frame coding frame related to the target I frame does not exist behind the target I frame, the violation frame only comprises the target I frame, and at the moment, the target I frame can be directly determined as the violation frame of the video to be detected; the second type is: and an inter-frame coding frame related to the target I frame exists behind the target I frame, and the illegal frame is the target I frame and the inter-frame coding frame related to the target I frame.
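The two types can be told apart by looking at what follows the target I frame. The sketch below assumes that the inter-coded frames "associated with" a target I frame are the P and B frames that follow it up to the next I frame, which ignores open GOPs and B-frame reordering; the function name is hypothetical, and the sketch only collects the frames without re-checking their content.

```python
from typing import List, Sequence


def frames_associated_with(target: int, frame_types: Sequence[str]) -> List[int]:
    """Target I frame plus the inter-coded frames that follow it in its GOP.

    target      -- 1-based sequence number of the target I frame
    frame_types -- picture type of every frame, e.g. "IPPIIPB"
    """
    if frame_types[target - 1] != "I":
        raise ValueError("target must be an I frame")
    frames = [target]
    for n in range(target + 1, len(frame_types) + 1):
        if frame_types[n - 1] == "I":
            break  # next GOP starts here
        frames.append(n)
    return frames


print(frames_associated_with(4, "IPPIIPB"))  # [4]        first type
print(frames_associated_with(5, "IPPIIPB"))  # [5, 6, 7]  second type
```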

In step 306, the detection result may include only the violation frames of the video to be detected, that is, only the pictures in the video to be detected that contain the preset violation content are output. Based on this detection result, a reviewer can further confirm whether the pictures contain the preset violation content and, if they do, delete them. Alternatively, the detection result may include only the sequence numbers of the violation frames of the video to be detected; based on this detection result, the server finds the corresponding frames by sequence number and deletes them directly. Of course, the detection result may also include both, which is not limited herein.

In addition, the detection result may further include the type of the violation frame. For example, if the type of the violation frame is the violence type, that is, violent violation content is embedded in the violation frame, a reviewer can perform targeted detection. For example, if lawbreakers frequently embed violent violation content in videos on a video publishing platform, the platform may need to detect violent violation content in a more targeted manner.
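One possible shape for such a detection result is sketched below. The class name, field names, and the helper `frames_to_keep` are hypothetical and only illustrate the options described above: violation frames, their sequence numbers, or both, optionally with a type.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DetectionResult:
    # Either list may be left empty, since the result may carry the frames,
    # their sequence numbers, or both.
    violation_frames: List[bytes] = field(default_factory=list)  # frame pictures
    violation_numbers: List[int] = field(default_factory=list)   # sequence numbers
    violation_type: Optional[str] = None                         # e.g. "violence"


def frames_to_keep(total_frames, result):
    """Sequence numbers (1-based) that remain after deleting the violation frames."""
    to_delete = set(result.violation_numbers)
    return [n for n in range(1, total_frames + 1) if n not in to_delete]


result = DetectionResult(violation_numbers=[10, 11, 12], violation_type="violence")
print(frames_to_keep(14, result))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 13, 14]
```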

In this embodiment, illegal content is identified in the picture content corresponding to each I frame in the candidate frames, a first violation frame related to the illegal content is detected, the violation frame of the video to be detected is determined based on the first violation frame, and a detection result is obtained and output based on the violation frame of the video to be detected. In this way, a single frame containing illegal content in the video to be detected can be detected, and the detection accuracy is high.

Further, if an inter-frame coding frame associated with the target I frame exists after the target I frame, then on the basis of the second embodiment, step 305 specifically includes:

detecting violation content of the inter-frame coding frame corresponding to the first violation frame, and determining a second violation frame; wherein the second violation frame includes the preset violation content;

and determining the first violation frame and the second violation frame as violation frames of the video to be detected.

The step of detecting violation content of the inter-frame coding frame corresponding to the first violation frame and determining the second violation frame may be similar to step 304 in the second embodiment, and is not described herein again.

Because the inter-frame coding frame corresponding to the target I frame is compressed and coded directly or indirectly based on the target I frame, the picture corresponding to the inter-frame coding frame is very likely to be the same as or similar to the picture corresponding to the target I frame. Therefore, in practical applications, the inter-frame coding frame corresponding to the target I frame can be directly determined as the second violation frame; of course, the picture corresponding to the inter-frame coding frame can also be checked manually.

In this embodiment, a second violation frame is determined by detecting violation content of the inter-frame coding frame corresponding to the first violation frame, and the first violation frame and the second violation frame are determined as violation frames of the video to be detected. In this way, in addition to a single frame containing illegal content, multiple consecutive frames containing illegal content in the video to be detected can be detected, and the detection accuracy is high.
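The two cases of step 305 can be sketched as follows. The sketch takes the shortcut mentioned above, in which the inter-frame coding frames associated with a target I frame are treated directly as second violation frames. It also assumes, for illustration only, that the frames associated with an I frame are exactly those lying between it and the next I frame in sequence order; the function names are hypothetical.

```python
def associated_inter_frames(target, i_frames, total_frames):
    """Sequence numbers of the frames after `target` and before the next I frame."""
    later = [n for n in i_frames if n > target]
    end = min(later) if later else total_frames + 1
    return list(range(target + 1, end))


def violation_frames(first_violations, i_frames, total_frames):
    frames = set()
    for target in first_violations:
        frames.add(target)
        # Shortcut: treat associated inter-frame coding frames as second
        # violation frames without checking their pictures again.
        frames.update(associated_inter_frames(target, i_frames, total_frames))
    return sorted(frames)


i_frames = [1, 10, 13, 14, 40]
# Frame 10 is followed by frames 11 and 12 (second type);
# frame 13 is followed immediately by another I frame (first type).
print(violation_frames([10, 13], i_frames, total_frames=45))  # [10, 11, 12, 13]
```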

In addition, in the above embodiments, the step of detecting the interval between every two adjacent I frames in the video to be detected includes:

determining the sequence number of each I frame in the video to be detected;

and determining the interval between every two adjacent I frames in the video to be detected based on the sequence number of each I frame.

The sequence numbers of the I frames in the video to be detected can be obtained with a tool such as ffmpeg; for example, the I frames are the 1st frame, the 10th frame, and so on. The interval between every two adjacent I frames in the video to be detected is then calculated based on the sequence number of each I frame.
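The two sub-steps can be sketched as follows. The sketch assumes that the picture type of every frame (I, P, or B) has already been obtained, for example with a tool such as ffmpeg, and is given as a string in frame order; the function names and the sample string are hypothetical. Following the definition used in this document, the interval is the number of frames lying between two adjacent I frames.

```python
def i_frame_numbers(pict_types):
    """1-based sequence numbers of the I frames in a sequence of picture types."""
    return [n for n, t in enumerate(pict_types, start=1) if t == "I"]


def i_frame_intervals(numbers):
    """Number of frames lying between each pair of adjacent I frames."""
    return [nxt - cur - 1 for cur, nxt in zip(numbers, numbers[1:])]


pict_types = "IPPBPPBPPIPPIPP"  # hypothetical 15-frame video
numbers = i_frame_numbers(pict_types)
print(numbers)                     # [1, 10, 13]
print(i_frame_intervals(numbers))  # [8, 2]
```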

In addition, the interval between two adjacent I frames may also be determined from the GOP structure. For example, if the size of a GOP is 5, the interval between the I frame in that GOP and the next adjacent I frame may be determined to be 4.
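A short sketch of the GOP-based calculation follows. It assumes that each GOP contains exactly one I frame and that this I frame is the first frame of the GOP; the function names and the GOP sizes are hypothetical. The last GOP yields no interval because no I frame follows it.

```python
from itertools import accumulate


def intervals_from_gop_sizes(gop_sizes):
    """Interval after each I frame: a GOP of size s leaves s - 1 frames before
    the next I frame. The last GOP has no following I frame, so it is skipped."""
    return [size - 1 for size in gop_sizes[:-1]]


def i_frame_numbers_from_gop_sizes(gop_sizes):
    """1-based sequence number of the I frame that starts each GOP."""
    return list(accumulate([1] + list(gop_sizes[:-1])))


gop_sizes = [5, 3, 9]
print(i_frame_numbers_from_gop_sizes(gop_sizes))  # [1, 6, 9]
print(intervals_from_gop_sizes(gop_sizes))        # [4, 2]
```

The result agrees with the sequence-number method: I frames at frames 1, 6, and 9 have 4 and 2 frames between them.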

The following describes the video detection method provided by the embodiment of the present invention in detail by way of example.

Application scenario: the video to be detected is a video that the video publishing platform has not yet published.

First, obtaining the sequence numbers of the I frames by using ffmpeg or a similar tool, and calculating the interval between every two adjacent I frames in the video to be detected based on the sequence numbers. Fig. 4 is a schematic diagram of the intervals between adjacent I frames in a video to be detected. As shown in fig. 4, the abscissa represents the sequence number of the I frame and the ordinate represents the interval between two adjacent I frames; the intervals differ greatly, and there are I frames whose interval is smaller than 4 (the preset threshold is 4);

then, obtaining candidate frames, wherein the candidate frames at least include all I frames whose interval is smaller than 4;

then, identifying illegal content in the picture content corresponding to each I frame in the candidate frames, and judging whether the picture content corresponding to the target I frame meets the violation conditions;

determining the target I frame as a first violation frame under the condition that the picture content corresponding to the target I frame meets violation conditions;

then, judging, for each target I frame in the first violation frame, whether an inter-frame coding frame associated with the target I frame exists, and if not, directly determining the target I frame as a violation frame of the video to be detected; if so, detecting the inter-frame coding frame corresponding to the target I frame, determining a second violation frame, and determining the target I frame and the second violation frame as violation frames of the video to be detected;

and finally, deleting the determined violation frames, and publishing the video once the detection of illegal content in the video to be detected is complete.
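The example flow above can be sketched end to end as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: the picture types are given as a string, the content recognizer is replaced by a hypothetical callable `is_violating`, each short interval is attributed to the I frame that precedes it, and the frames between two I frames are taken to be the inter-frame coding frames associated with the earlier one.

```python
def detect_video(pict_types, is_violating, threshold=4):
    """Return the sequence numbers (1-based) of the violation frames.

    pict_types   -- picture type of every frame, e.g. "IPPPI..."
    is_violating -- callable(sequence_number) -> bool; stands in for the
                    picture content recognizer, which is not shown here
    """
    i_frames = [n for n, t in enumerate(pict_types, start=1) if t == "I"]

    # Candidate frames: I frames with fewer than `threshold` frames
    # between them and the next I frame.
    candidates = [
        cur for cur, nxt in zip(i_frames, i_frames[1:]) if nxt - cur - 1 < threshold
    ]

    violations = set()
    for target in candidates:
        if not is_violating(target):
            continue
        violations.add(target)  # first violation frame
        nxt = min(n for n in i_frames if n > target)
        # Check the associated inter-frame coding frames (second violation frames).
        violations.update(n for n in range(target + 1, nxt) if is_violating(n))
    return sorted(violations)


# Hypothetical 20-frame video with I frames at frames 1, 10, and 13.
pict_types = "I" + "P" * 8 + "I" + "PP" + "I" + "P" * 7
violations = detect_video(pict_types, is_violating=lambda n: 10 <= n <= 12)
print(violations)  # [10, 11, 12]

# Frames that remain once the violation frames are deleted before publishing.
remaining = [n for n in range(1, len(pict_types) + 1) if n not in violations]
print(len(remaining))  # 17
```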

The following describes a video detection apparatus according to an embodiment of the present invention.

Referring to fig. 5, a schematic structural diagram of a video detection apparatus according to an embodiment of the present invention is shown. As shown in fig. 5, the video detection apparatus 500 includes:

A first obtaining module 501, configured to obtain a video to be detected;

a first detection module 502, configured to detect an interval between every two adjacent I frames in the video to be detected; wherein the interval between the two adjacent I frames is used for indicating the number of frames between the two adjacent I frames;

a second obtaining module 503, configured to obtain candidate frames, where the candidate frames at least include all I frames whose intervals are smaller than a preset threshold;

and a second detection module 504, configured to perform detection on violation content of the video to be detected based on the candidate frame, and obtain and output a detection result.

Optionally, referring to fig. 6, a schematic diagram of a detailed structure of a second detection module of the video detection apparatus in the embodiment of the present invention is shown. As shown in fig. 6, the second detection module 504 includes:

a detecting unit 5041, configured to perform detection on violation content for each I frame in the candidate frames, and determine a first violation frame; the first violation frame comprises preset violation content;

a first determining unit 5042, configured to determine, based on the first violation frame, a violation frame of the video to be detected;

an output unit 5043, configured to obtain and output a detection result based on the violation frame of the video to be detected; and the detection result comprises the violation frame of the video to be detected and/or the sequence number of the violation frame of the video to be detected.

Optionally, the detecting unit 5041 is specifically configured to identify illegal content of the picture content corresponding to each I frame in the candidate frames; and under the condition that the picture content corresponding to the target I frame meets a preset violation condition, determining the target I frame as a first violation frame.

Optionally, if the first violation frame corresponds to an inter-frame coding frame, the first determining unit 5042 is specifically configured to perform violation content detection on the inter-frame coding frame corresponding to the first violation frame, and determine a second violation frame, wherein the second violation frame includes the preset violation content; and to determine the first violation frame and the second violation frame as violation frames of the video to be detected.

Optionally, referring to fig. 7, a schematic diagram of a detailed structure of a first detection module of the video detection apparatus in the embodiment of the present invention is shown. As shown in fig. 7, the first detection module 502 includes:

a second determining unit 5021, configured to determine a sequence number of each I frame in the video to be detected;

a third determining unit 5022, configured to determine an interval between every two adjacent I frames in the video to be detected based on the sequence number of each I frame.

The device provided by the embodiment of the present invention can implement each process implemented in the above method embodiments; to avoid repetition, details are not described here again.

The video detection device provided by this embodiment selects, as candidate frames for detecting illegal content, the I frames in the video to be detected whose interval to the adjacent I frame is smaller than the preset threshold, and detects illegal content in the video to be detected based on the candidate frames. This mitigates the situation in which illegal content is missed because the video frames containing it are not sampled when only a small number of such frames are embedded in the video to be detected, thereby improving the detection accuracy.
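The difference from fixed-interval frame extraction can be illustrated with a small sketch. All numbers below are hypothetical: a 120-frame video in which frames 52 to 54 are an embedded clip that begins with its own I frame, a sampling step of 25 frames, and a preset threshold of 4. The short interval is attributed to the I frame that precedes it, which is an assumption of this sketch.

```python
def interval_sample(total_frames, step):
    """Fixed-interval frame extraction: every `step`-th frame, starting at frame 1."""
    return list(range(1, total_frames + 1, step))


def short_interval_i_frames(i_frames, threshold):
    """I frames with fewer than `threshold` frames before the next I frame."""
    return [
        cur for cur, nxt in zip(i_frames, i_frames[1:]) if nxt - cur - 1 < threshold
    ]


embedded = {52, 53, 54}        # frames of the embedded clip
i_frames = [1, 52, 55, 100]    # I frames of the whole video

sampled = interval_sample(120, step=25)
print(sampled)                                         # [1, 26, 51, 76, 101]
print(embedded & set(sampled))                         # set()
print(short_interval_i_frames(i_frames, threshold=4))  # [52]
```

In this example the fixed-interval sampler skips the embedded clip entirely, while the I frame at frame 52 is selected as a candidate because only 2 frames separate it from the next I frame.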

The following describes an electronic device provided in an embodiment of the present invention.

An embodiment of the present invention further provides an electronic device, as shown in fig. 8, which includes a processor 801, a communication interface 802, a memory 803, and a communication bus 804, where the processor 801, the communication interface 802, and the memory 803 complete mutual communication through the communication bus 804,

a memory 803 for storing a computer program;

the processor 801 is configured to implement the following steps when executing the program stored in the memory 803:

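acquiring a video to be detected;

detecting the interval between every two adjacent I frames in the video to be detected; wherein the interval between the two adjacent I frames is used for indicating the number of frames between the two adjacent I frames;

acquiring candidate frames, wherein the candidate frames at least comprise all I frames with the interval smaller than a preset threshold;

and detecting violation content of the video to be detected based on the candidate frames to obtain and output a detection result.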

Optionally, the processor 801 is specifically configured to:

detecting violation contents of each I frame in the candidate frames, and determining a first violation frame; the first violation frame comprises preset violation content;

determining the violation frame of the video to be detected based on the first violation frame;

obtaining and outputting a detection result based on the violation frame of the video to be detected; and the detection result comprises the violation frame of the video to be detected and/or the sequence number of the violation frame of the video to be detected.

Optionally, the processor 801 is specifically configured to:

identifying illegal contents of picture contents corresponding to each I frame in the candidate frames;

and under the condition that the picture content corresponding to the target I frame meets a preset violation condition, determining the target I frame as a first violation frame.

Optionally, the processor 801 is specifically configured to:

if the first violation frame corresponds to an inter-frame coding frame, detecting violation content of the inter-frame coding frame corresponding to the first violation frame, and determining a second violation frame; wherein the second violation frame includes the preset violation content;

and determining the first violation frame and the second violation frame as violation frames of the video to be detected.

Optionally, the processor 801 is specifically configured to:

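determining the sequence number of each I frame in the video to be detected;

and determining the interval between every two adjacent I frames in the video to be detected based on the sequence number of each I frame.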

The communication bus of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the electronic device and other devices.

The memory may include a Random Access Memory (RAM) or a non-volatile memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.

The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In yet another embodiment of the present invention, a computer-readable storage medium is further provided, which stores instructions that, when executed on a computer, cause the computer to perform the video detection method described in any of the above embodiments.

In yet another embodiment, a computer program product containing instructions is provided, which when run on a computer, causes the computer to perform the video detection method as described in any of the above embodiments.

The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

It should be noted that, in this document, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner; the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, it is described relatively briefly, and for the relevant points reference may be made to the corresponding description of the method embodiment.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for video detection, the method comprising:

acquiring a video to be detected;

detecting the interval between every two adjacent I frames in the video to be detected; the interval between two adjacent I frames is used for indicating the number of frames between two adjacent I frames, and the I frames are I frames in each GOP (group of pictures) in the video compression technology;

acquiring candidate frames, wherein the candidate frames are all I frames whose interval is smaller than a preset threshold, and the preset threshold is set to a value smaller than the interval used in interval-based frame extraction;

2. The method according to claim 1, wherein the step of detecting the illegal content of the video to be detected based on the candidate frames and obtaining and outputting a detection result comprises:

determining a violation frame of the video to be detected based on the first violation frame;

3. The method of claim 2, wherein the step of detecting violation content of each I frame in the candidate frames and determining the first violation frame comprises:

4. The method of claim 2, wherein if the first violation frame corresponds to an inter-frame coding frame, the step of determining the violation frame of the video to be detected based on the first violation frame comprises:

5. The method according to claim 1, wherein the step of detecting the interval between every two adjacent I frames in the video to be detected comprises:

determining the sequence number of each I frame in the video to be detected;

6. A video detection apparatus, characterized in that the apparatus comprises:

the first acquisition module is used for acquiring a video to be detected;

the first detection module is used for detecting the interval between every two adjacent I frames in the video to be detected; the interval between two adjacent I frames is used for indicating the number of frames between two adjacent I frames, and the I frames are I frames in each GOP (group of pictures) in the video compression technology;

a second obtaining module, configured to obtain candidate frames, where the candidate frames are all I frames whose interval is smaller than a preset threshold, and the preset threshold is set to a value smaller than the interval used in interval-based frame extraction;

7. The apparatus of claim 6, wherein the second detection module comprises:

The detection unit is used for detecting violation content of each I frame in the candidate frames and determining a first violation frame; the first violation frame comprises preset violation content;

the first determining unit is used for determining the violation frame of the video to be detected based on the first violation frame;

the output unit is used for obtaining and outputting a detection result based on the violation frame of the video to be detected; and the detection result comprises the violation frame of the video to be detected and/or the sequence number of the violation frame of the video to be detected.

8. The apparatus according to claim 7, wherein the detection unit is specifically configured to:

9. The apparatus of claim 7, wherein if the first violation frame corresponds to an inter-frame coding frame, the first determining unit is specifically configured to:

And determining the first violation frame and the second violation frame as the violation frames of the video to be detected.

10. The apparatus of claim 6, wherein the first detection module comprises:

the second determining unit is used for determining the sequence number of each I frame in the video to be detected;

and the third determining unit is used for determining the interval between every two adjacent I frames in the video to be detected based on the sequence number of each I frame.

11. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

a processor for implementing the method steps of any one of claims 1 to 5 when executing a program stored in the memory.

12. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-5.