CN108229262B - Pornographic video detection method and device

Pornographic video detection method and device

Info

Publication number
CN108229262B
Authority
CN
China
Prior art keywords: video, model, target, model parameter, preset threshold
Legal status: Active
Application number
CN201611200177.1A
Other languages
Chinese (zh)
Other versions
CN108229262A (en)
Inventor
Hou Xin (侯鑫)
Niu Zhiwei (牛志伟)
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201611200177.1A
Publication of CN108229262A
Application granted
Publication of CN108229262B

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 — Scenes; Scene-specific elements
    • G06V20/40 — Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science
  • Physics & Mathematics
  • General Physics & Mathematics
  • Multimedia
  • Theoretical Computer Science
  • Image Analysis

Abstract

The embodiment of the invention discloses a pornographic video detection method for quickly identifying pornographic videos and improving identification accuracy. The method provided by the embodiment of the invention comprises the following steps: extracting a plurality of groups of video frame sequences from a target video; extracting, through a first deep learning model, first motion information features and/or first picture content features corresponding to each group of video frame sequences, and calculating, according to the first motion information features and/or the first picture content features, first scores respectively corresponding to each group of video frame sequences; determining the maximum value among the first scores and judging whether the maximum value is greater than a first preset threshold; and if the maximum value is greater than the first preset threshold, determining that the target video is a pornographic video. The embodiment of the invention also discloses a pornographic video detection device for quickly identifying pornographic videos and improving identification accuracy.

Description

Pornographic video detection method and device
Technical Field
The invention relates to the field of Internet application, in particular to a pornographic video detection method and device.
Background
Through the internet, users can share a great deal of resource information; however, alongside much useful information, users also encounter undesirable information, of which pornographic video is the most harmful. Such videos are characterized by complex content, strong concealment, large quantity, and rapid variation over time, and cause great harm to the public once widely spread. Detecting and filtering pornographic videos is therefore of great significance.
In the prior art, pornographic video detection mainly relies on algorithms that judge whether a video is pornographic based on the proportion of human skin color in its pictures.
However, methods based on distinguishing human skin color tend to treat images with little clothing and much exposed skin as pornographic, so they misjudge some non-pornographic videos; that is, the false kill rate is high and the accuracy is low.
Disclosure of Invention
The embodiment of the invention provides a pornographic video detection method and device, which are used for quickly identifying pornographic videos and improving the identification accuracy.
In view of this, a first aspect of the embodiments of the present invention provides a pornographic video detecting method, including:
extracting a plurality of groups of video frame sequences from a target video;
extracting first motion information characteristics and/or first picture content characteristics corresponding to each group of video frame sequences through a first deep learning model, and calculating first scores corresponding to each group of video frame sequences respectively according to the first motion information characteristics and/or the first picture content characteristics;
determining the maximum value among the first scores, and judging whether the maximum value is greater than a first preset threshold;
and if the maximum value is greater than the first preset threshold, determining that the target video is a pornographic video.
A second aspect of an embodiment of the present invention provides a detection apparatus, including:
the extraction module is used for extracting a plurality of groups of video frame sequences from the target video;
the first calculation module is used for extracting a first motion information characteristic and/or a first image content characteristic corresponding to each group of video frame sequences through a first deep learning model and calculating first scores respectively corresponding to each group of video frame sequences according to the first motion information characteristic and/or the first image content characteristic;
the first judging module is used for determining the maximum value among the first scores and judging whether the maximum value is greater than a first preset threshold;
and the first determining module is used for determining that the target video is a pornographic video when the first judging module determines that the maximum value is greater than the first preset threshold.
According to the technical scheme, the embodiment of the invention has the following advantages:
the method comprises the steps of extracting motion information characteristics and/or picture content characteristics of a plurality of groups of video frame sequences in a target video through a deep learning model, calculating scores corresponding to each group of video frame sequences according to the motion information characteristics and/or the picture content characteristics, judging whether the maximum value of the scores is larger than a preset threshold value or not, and determining that the target video is a pornographic video if the maximum value of the scores is larger than the preset threshold value. The deep learning model is obtained through training of a large amount of video data, so that the recognition accuracy is high. Therefore, the excellent emotion video can be rapidly identified, and the identification accuracy is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is apparent that the drawings in the following description are only some embodiments of the present invention.
FIG. 1 is a schematic diagram of an embodiment of a pornographic video detection system in an embodiment of the invention;
FIG. 2 is a flow chart of an embodiment of a pornographic video detection method in an embodiment of the invention;
FIG. 3 is a flow chart of another embodiment of a pornographic video detection method in an embodiment of the present invention;
FIG. 4 is a schematic diagram of an embodiment of a detection device in an embodiment of the invention;
FIG. 5 is a schematic diagram of another embodiment of a detection device in an embodiment of the invention;
FIG. 6 is a schematic diagram of another embodiment of a detection device in an embodiment of the invention;
fig. 7 is a schematic structural diagram of a server in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
To facilitate understanding of the embodiment of the present invention, a scene to which the embodiment of the present invention is applied is briefly introduced below, and referring to fig. 1, a schematic diagram of a system composition structure to which the method and the apparatus for detecting pornographic video according to the embodiment of the present invention are applied is shown.
As shown in fig. 1, the system may include a service system composed of at least one server 101, and a plurality of terminals 102. Among them, the server 101 in the service system may store data for detecting the target video and transmit the detection result to the terminal. The terminal 102 may be configured to upload target video data to be detected to the server, and display a detection result returned by the server to the user. It should be understood that the terminal 102 is not limited to a Personal Computer (PC) shown in fig. 1, but may also be a mobile phone, a tablet Computer, or other devices capable of uploading video.
It should be understood that the pornography detection method and apparatus in the embodiment of the present invention can be used in other scenarios besides the system components shown in fig. 1, and are not limited herein.
To facilitate understanding of the embodiments of the present invention, some terms used in the embodiments of the present invention are described below:
Pornographic video: a video whose content includes, but is not limited to, sexual acts, exposure of sensitive body parts, and the like;
Common video: a video that is not a pornographic video;
False kill rate: given N common videos, if an algorithm judges M of them to be pornographic, the false kill rate is M/N;
Accuracy: given A pornographic videos and B common videos, if an algorithm judges C videos to be pornographic, of which D are truly pornographic, the accuracy is (B - C + 2D)/(A + B), as shown in the formulas after these definitions;
Deep learning: a new field in machine learning research, motivated by building and simulating neural networks that analyze and learn in the manner of the human brain, so as to interpret data such as images, sounds, and text. It refers to algorithms that perform high-level abstraction of data using multiple processing layers containing complex structures or composed of multiple nonlinear transformations. Its essence is to learn more useful features by constructing machine learning models with many hidden layers and massive training data, thereby ultimately improving the accuracy of classification or prediction.
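As a check on the accuracy definition above (a restatement, not an addition to the patent): of the A + B videos, the algorithm handles correctly the D true pornographic videos it flags plus the B - (C - D) common videos it leaves alone, giving:

```latex
\text{false kill rate} = \frac{M}{N}, \qquad
\text{accuracy} = \frac{D + \bigl(B - (C - D)\bigr)}{A + B} = \frac{B - C + 2D}{A + B}
```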
The embodiment of the invention provides a pornographic video detection method and device, which are used for quickly and accurately identifying pornographic videos. Referring to fig. 2, a pornographic video detecting method according to an embodiment of the present invention is described below, where an embodiment of the pornographic video detecting method according to the present invention includes:
201. extracting a plurality of groups of video frame sequences from a target video;
after a user uploads a target video through the Internet, a detection device acquires the target video and extracts a plurality of groups of video frame sequences from the target video.
It should be noted that, in the embodiment of the present invention, the target video includes a plurality of video frames, and each group of video frame sequences includes at least one video frame.
202. Extracting first motion information characteristics and/or first picture content characteristics corresponding to each group of video frame sequences through a first deep learning model, and calculating first scores corresponding to each group of video frame sequences respectively according to the first motion information characteristics and/or the first picture content characteristics;
after acquiring a plurality of groups of video frame sequences, the detection device extracts, through the first deep learning model, motion information features and/or picture content features corresponding to each group of video frame sequences, where the motion information features refer to motion information features of the group of video frame sequences in a time dimension, and may specifically include a direction of motion of an object, or a motion pattern of the object, or other features, which is not limited herein. The picture content feature refers to a picture content feature of each image in the set of video frame sequences, and may specifically include texture information of the image, such as a key contour of an object, or a color space feature of the image, and may further include other information, which is not limited herein. For convenience of description, the embodiment of the present invention refers to the motion information feature extracted by the first deep learning model as a first motion information feature, and the picture content feature extracted by the first deep learning model as a first picture content feature.
After the first deep learning model extracts the first motion information feature and/or the first picture content feature, a score of each set of video frame sequences can be calculated by the first deep learning model according to the first motion information feature and/or the first picture content feature, and it should be understood that the score is used for evaluating the relationship between the video frame sequences and the pornographic content. For convenience of description, the embodiment of the present invention refers to the score calculated by the first deep learning model as a first score.
It should be understood that which feature information the detection device extracts from a video frame sequence is determined by the deep learning model. Based on different deep learning models, the detection device may extract only the motion information features of the video frame sequence and calculate the score according to the motion information features, or extract only the picture content features and calculate the score according to the picture content features, or extract both the motion information features and the picture content features and calculate the score according to both. For example, the Pooling Conv model and the Conv3D model extract both motion information and picture content features for the calculation, while the ImageNet model extracts only picture content features for the calculation. There are many other deep learning models that can extract video feature information; they are not listed here.
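As an illustration only (the patent does not prescribe a particular implementation or library), the following PyTorch-style sketch shows how step 202 might score one group of video frames with a trained model; the model interface, the porn_class index, and the input layout are assumptions made for the example:

```python
import torch

def score_sequence(model, frames, porn_class=1):
    """Score one group of video frames with a trained deep learning model.

    frames: tensor of shape (num_frames, channels, height, width); the model
    is assumed to consume the whole clip and output class logits.
    """
    model.eval()
    with torch.no_grad():
        clip = frames.unsqueeze(0)          # add a batch dimension
        logits = model(clip)                # shape (1, num_classes)
        probs = torch.softmax(logits, dim=1)
    # The "first score" of the sequence: probability of the pornographic class
    return probs[0, porn_class].item()
```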
203. Determining the maximum value in the first scores corresponding to each group of video frame sequences;
after first scores corresponding to each group of video frame sequences are determined through the first deep learning model, the maximum value of the first scores is determined.
204. Judging whether the maximum value is greater than a first preset threshold value, if so, executing step 205;
after the maximum value is determined, it is determined whether the maximum value is greater than a first preset threshold, and if so, step 205 is executed.
It should be understood that the first preset threshold is preset by a user or a system, and may be obtained through a test of a large amount of sample data, or may be obtained through other manners, which are not limited herein.
205. And determining the target video as the pornographic video.
When the maximum value is larger than a first preset threshold value, the detection device determines that the target video is the pornographic video.
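Steps 203–205 then amount to a single comparison; a minimal sketch (the helper name is hypothetical), where first_scores holds the per-sequence scores from step 202:

```python
def is_porn_by_max_score(first_scores, first_threshold):
    """Steps 203-205: determine the maximum first score and compare it
    with the first preset threshold."""
    return max(first_scores) > first_threshold
```

Because the maximum is used, a single high-scoring frame sequence suffices to flag the whole video.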
The method extracts, through a deep learning model, the motion information features and/or picture content features of a plurality of groups of video frame sequences in a target video, calculates a score corresponding to each group of video frame sequences according to those features, judges whether the maximum value of the scores is greater than a preset threshold, and, if so, determines that the target video is a pornographic video. Because the deep learning model is obtained through training on a large amount of video data, the recognition accuracy is high. Pornographic videos can therefore be rapidly identified, and the identification accuracy is improved.
Based on the above embodiment corresponding to fig. 2, when the maximum value in the first scores corresponding to each group of video frame sequences is not greater than the first preset threshold, the detecting device may further determine the target video in various ways to identify whether the target video is a pornographic video, and as described in detail below by taking the embodiment corresponding to fig. 3 as an example, referring to fig. 3, another embodiment of the pornographic video detecting method in the embodiment of the present invention includes:
301. extracting a plurality of groups of video frame sequences from a target video;
after a user uploads a target video through the Internet, a detection device acquires the target video and extracts a plurality of groups of video frame sequences from the target video.
Specifically, the detection apparatus may set the extraction time interval between groups of video frame sequences, thereby determining the number of groups to extract; it may also set the number of groups according to the time length of the target video, or determine the number of groups in other manners, which is not limited herein. Further, the number of video frames included in each group may be limited, that is, each group of video frame sequences includes a preset number of video frames; generally the number of frames in each group is the same, but it may also differ, which is not limited herein. The time interval between the video frames within each group may also be defined, for example by extracting the preset number of video frames at a fixed time interval as a group of video frame sequences, so that the time intervals between any two adjacent video frames in each group are equal. A plurality of groups of video frame sequences may also be extracted from the target video according to other extraction rules, which is not limited herein.
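A sketch of one such extraction rule, using OpenCV; the group count, frames per group, and spacing below are illustrative choices, not values fixed by the patent:

```python
import cv2
import random

def extract_sequences(video_path, num_groups=3, frames_per_group=16, frame_step=5):
    """Extract num_groups sequences; each sequence holds frames_per_group
    frames taken frame_step frames apart (i.e., equal time intervals)."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    span = frames_per_group * frame_step
    sequences = []
    for _ in range(num_groups):
        start = random.randint(0, max(0, total - span - 1))
        frames = []
        for i in range(frames_per_group):
            cap.set(cv2.CAP_PROP_POS_FRAMES, start + i * frame_step)
            ok, frame = cap.read()
            if ok:
                frames.append(frame)
        sequences.append(frames)
    cap.release()
    return sequences
```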
302. Extracting first motion information characteristics and/or first picture content characteristics corresponding to each group of video frame sequences through a first deep learning model, and calculating first scores corresponding to each group of video frame sequences respectively according to the first motion information characteristics and/or the first picture content characteristics;
After acquiring the plurality of groups of video frame sequences, the detection device extracts, through the first deep learning model, the motion information features and/or picture content features corresponding to each group of video frame sequences, where the motion information features refer to the motion information features of the group of video frame sequences in the time dimension and may specifically include the direction of motion of an object, the motion pattern of an object, or other features, which is not limited herein. The picture content features refer to the picture content features of each image in the group of video frame sequences and may specifically include texture information of the image, such as the key contour of an object, or the color space features of the image, and may further include other information, which is not limited herein. For convenience of description, the embodiment of the present invention refers to the motion information features extracted by the first deep learning model as the first motion information features, and the picture content features extracted by the first deep learning model as the first picture content features.
After the first deep learning model extracts the first motion information feature and/or the first picture content feature, a score for each set of video frame sequences may be calculated according to the first deep learning model and the first motion information feature and/or the first picture content feature. For convenience of description, the embodiment of the present invention refers to the score calculated by the first deep learning model as a first score.
It should be noted that the first deep learning model may be obtained by the detection apparatus from another apparatus before performing step 301, or may be obtained by training the detection apparatus with a large amount of training data before performing step 301, and is not limited herein. Specifically, the first deep learning model can be obtained by training in the following way:
(1) determining a model to be trained and a first model parameter corresponding to the model to be trained;
The detection device determines a model to be trained and an initial model parameter corresponding to the model to be trained. It should be understood that the initial model parameter is continuously optimized in the subsequent training process; for convenience of description, the initial model parameter is referred to as the first model parameter in the embodiment of the present invention.
(2) Taking the first model parameter as a target model parameter, and then respectively executing the steps (3) and (4);
After determining the model to be trained and the first model parameter, the detection device takes the first model parameter as the target model parameter. It should be understood that the target model parameter is a variable: taking the first model parameter as the target model parameter means that the detection device stores the value of the first model parameter as the initial value of the target-model-parameter variable.
(3) Sending the model to be trained and the target model parameters to a first computing server, and then executing the step (5);
After determining the target model parameters, the detection device sends the model to be trained and the target model parameters to the first computing server. After receiving them, the first computing server obtains training samples for training the target model parameters, calculates the gradient values corresponding to those training samples according to the model to be trained and the target model parameters received this time, and updates the model to be trained according to the calculated gradient values to obtain new model parameters (referred to below as the second model parameters), which it returns to the detection device.
It should be understood that the target model parameters are a variable, so the target model parameters sent by the detection device may differ from one time to the next. It should also be understood that the training samples are obtained from a large amount of manually labeled data, include pornographic video data and normal video data, and differ each time the first computing server obtains them.
It should be further noted that the first computation server may compute the gradient value corresponding to the training sample through a forward propagation algorithm and a backward propagation algorithm, or may compute the gradient value through other manners, which is not limited herein.
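For reference, computing a gradient value "through a forward propagation algorithm and a backward propagation algorithm" corresponds to ordinary automatic differentiation; a minimal PyTorch illustration (the loss function and labels are placeholders for whatever the training setup actually uses):

```python
import torch
import torch.nn.functional as F

def compute_gradients(model, clips, labels):
    """Forward pass, then back-propagation, yielding one gradient tensor
    per model parameter for this batch of training samples."""
    model.zero_grad()
    loss = F.cross_entropy(model(clips), labels)  # forward propagation
    loss.backward()                               # backward propagation
    return [p.grad.clone() for p in model.parameters() if p.grad is not None]
```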
(4) Sending the model to be trained and the target model parameters to a second computing server, and then executing the step (6);
After determining the target model parameters, the detection device sends the model to be trained and the target model parameters to the second computing server. After receiving them, the second computing server obtains training samples for training the target model parameters, calculates the gradient values corresponding to those training samples according to the model to be trained and the target model parameters received this time, and updates the model to be trained according to the calculated gradient values to obtain new model parameters (referred to below as the third model parameters), which it returns to the detection device.
It should be understood that the target model parameters are a variable, so the target model parameters sent by the detection device may differ from one time to the next. It should also be understood that the training samples are obtained from a large amount of manually labeled data, include pornographic video data and normal video data, and differ each time the second computing server obtains them.
It should be further noted that the second computing server may compute the gradient values corresponding to the training samples through a forward propagation algorithm and a backward propagation algorithm, or may compute the gradient values in other manners, which is not limited herein.
(5) Performing difference processing according to the second model parameter and the target model parameter to obtain a fourth model parameter, taking the fourth model parameter as the target model parameter, and executing step (6) and/or step (3) until the model to be trained converges;
When the detection device receives the second model parameters sent by the first computing server, it performs difference processing according to the received second model parameters and the target model parameters it currently stores to obtain new model parameters; for convenience of description, the model parameters obtained by the detection device from the second model parameters and the target model parameters are called the fourth model parameters.
Each time the detection device performs difference processing to obtain a fourth model parameter, it determines the fourth model parameter as the target model parameter, that is, it stores the value of the fourth model parameter as the latest value of the target-model-parameter variable at the detection device, and then executes step (6) and/or step (3).
(6) Performing difference processing according to the third model parameter and the target model parameter to obtain a fifth model parameter, taking the fifth model parameter as the target model parameter, and executing step (5) and/or step (4) until the model to be trained converges;
Each time the second computing server updates the model to be trained to obtain a third model parameter, it returns the third model parameter to the detection device; when the detection device receives the third model parameter sent by the second computing server, it performs difference processing according to the received third model parameter and the target model parameter it currently stores to obtain a new model parameter.
Each time the detection device performs difference processing to obtain a fifth model parameter, it determines the fifth model parameter as the target model parameter, that is, it stores the value of the fifth model parameter as the latest value of the target-model-parameter variable at the detection device, and then executes step (5) and/or step (4).
It should be understood that step (5) can be executed only after the detection device has sent the model to be trained and the target model parameters to the first computing server in step (3), and step (6) only after the detection device has sent them to the second computing server in step (4). Step (5) is executed when the detection device receives the training result (the second model parameters) returned by the first computing server, and step (6) when it receives the training result (the third model parameters) returned by the second computing server, so the execution order of steps (5) and (6) depends on which of the first and second computing servers returns its training result first.
If the first computing server returns its training result first, the detection device first executes step (5), then returns the target model parameters updated in step (5) to the first computing server, that is, returns to execute step (3), and executes step (5) again when the condition is met. Meanwhile, if the triggering condition of step (6) is met, that is, the training result returned by the second computing server is received, the detection device also executes step (6); at this time, the target model parameters used in the difference processing of step (6) are the target model parameters updated in step (5). After step (6) is executed, the detection device returns the latest target model parameters updated in step (6) to the second computing server, that is, returns to step (4), and executes step (6) again when the condition is met. The steps are executed in a loop in this way until the model to be trained converges in step (5) or step (6); the loop then ends and step (7) is executed.
If the second computing server returns its training result first, the detection device first executes step (6), then returns the target model parameters updated in step (6) to the second computing server, that is, returns to execute step (4), and executes step (6) again when the condition is met. Meanwhile, if the triggering condition of step (5) is met, that is, the training result returned by the first computing server is received, the detection device also executes step (5); at this time, the target model parameters used in the difference processing of step (5) are the target model parameters updated in step (6). After step (5) is executed, the detection device returns the latest target model parameters updated in step (5) to the first computing server, that is, returns to step (3), and executes step (5) again when the condition is met. The steps are executed in a loop in this way until the model to be trained converges in step (5) or step (6); the loop then ends and step (7) is executed.
In brief: each time the first or second computing server receives the target model parameters sent by the detection device, it trains and updates the model according to them and returns its result to the detection device. Each time the detection device receives a result returned by the first or second computing server, it performs difference processing between the returned result and the latest target model parameters it stores, updates the target model parameters again, and returns the newly updated target model parameters to the corresponding computing server: if the target model parameters were updated according to a result returned by the first computing server, they are returned to the first computing server; if according to a result returned by the second computing server, to the second computing server. This continues until the detection device updates the target model parameters to the point where the model to be trained converges, after which it no longer sends the latest target model parameters to the first or second computing server, and step (7) is executed.
(7) Determining the converged model to be trained obtained in step (5) or step (6) as the first deep learning model.
When the detection device determines that the model to be trained converges, the detection device takes the converged model to be trained as a first deep learning model. It should be understood that the converged model to be trained may be obtained after being updated through step (5) or may be obtained after being updated through step (6).
It should be noted that the detection apparatus may obtain the first deep learning model by the distributed server training method described in the above steps (1) to (7), and may also obtain the first deep learning model by training in other manners, which is not limited herein.
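The patent does not spell out the "difference processing"; the sketch below assumes it means applying each computing server's parameter delta (its returned parameters minus the parameters it was sent) to the current target model parameters, in the asynchronous spirit of steps (3)–(6). All names are illustrative, and the two servers are simulated by plain callables polled in turn:

```python
import numpy as np

def train_distributed(init_params, compute_servers, max_rounds=1000, tol=1e-6):
    """Sketch of steps (1)-(7): a coordinator (the detection device) keeps
    target model parameters and merges asynchronous updates from servers.

    compute_servers: callables that take the current target parameters,
    draw their own training sample, run forward/backward propagation,
    and return updated model parameters (steps (3) and (4)).
    """
    target = np.asarray(init_params, dtype=float)   # steps (1)-(2)
    for _ in range(max_rounds):
        for server in compute_servers:
            sent = target.copy()             # parameters sent to this server
            returned = server(sent)          # server trains, returns new params
            delta = returned - sent          # difference processing, steps (5)/(6)
            target = target + delta          # update the target model parameters
            if np.linalg.norm(delta) < tol:  # crude convergence test
                return target                # step (7): converged parameters
    return target
```

Because each server's delta is merged into whatever the coordinator currently holds, updates from the two servers can interleave in any order, which matches the "whichever server returns first" behavior described above.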
It should be understood that the feature information the detection device extracts from a video frame sequence is determined by the deep learning model: based on different deep learning models, the detection device may extract only the motion information features of the video frame sequence, or only the picture content features, or both. For example, the Pooling Conv model and the Conv3D model both extract motion information and picture content features, and the calculation accuracy of Conv3D is higher than that of Pooling Conv; the ImageNet model extracts only picture content features, and its calculation accuracy is lower than that of the other two models because less feature information is extracted. Many other deep learning models can extract video feature information; they are not listed here. In the embodiment of the present invention, the first deep learning model may be any one of these examples or another deep learning model, which is not limited herein. The second deep learning model mentioned later is any learning model with higher calculation accuracy than the first deep learning model; for example, the first deep learning model is an ImageNet model and the second is a Pooling Conv model, or the first is a Pooling Conv model and the second is a Conv3D model, and so on, which are not listed here.
303. Determining the maximum value in the first scores corresponding to each group of video frame sequences;
after first scores corresponding to each group of video frame sequences are determined through the first deep learning model, the maximum value of the first scores is determined.
304. Judging whether the maximum value is larger than a first preset threshold value, if so, executing step 309, and if not, executing step 305;
after the maximum value is determined, it is determined whether the maximum value is greater than a first preset threshold, if so, step 309 is executed, and if not, step 305 is executed. It should be understood that the first preset threshold is preset by a user or a system, and may be obtained through a test of a large amount of sample data, or may be obtained through other manners, which are not limited herein.
305. Judging whether the maximum value is smaller than a second preset threshold value, if so, executing a step 310, and if not, executing a step 306;
when the maximum value is not greater than the first preset threshold value, whether the maximum value is smaller than a second preset threshold value is judged, if yes, step 310 is executed, and if not, step 306 is executed. It should be understood that the second preset threshold is preset by a user or a system, and may be obtained by testing a large amount of sample data, or may be determined by other methods, which are not limited herein. It will also be appreciated that the second preset threshold is less than the first preset threshold.
306. Determining the video frame sequence corresponding to the maximum value as a target video frame sequence;
When the detection device determines that the maximum value is not less than the second preset threshold, that is, the maximum value lies between the second preset threshold and the first preset threshold, the target video is considered a suspected pornographic video and is judged further: the detection device determines the video frame sequence corresponding to the maximum value as the target video frame sequence.
307. Extracting a second motion information feature and/or a second picture content feature corresponding to the target video frame sequence through a second deep learning model, and calculating a second score corresponding to the target video frame sequence according to the second motion information feature and/or the second picture content feature;
After the detection device determines the target video frame sequence, it extracts, through the second deep learning model, the motion information features and/or picture content features corresponding to the target video frame sequence, and calculates a second score corresponding to the target video frame sequence according to those features, where the motion information features refer to the motion information features of the group of video frame sequences in the time dimension, and the picture content features refer to the picture content features of each frame image in the group of video frame sequences.
It should be understood that the second deep learning model and the first deep learning model are two different models, so the motion information features and/or picture content features that the second deep learning model extracts from the target video frame sequence differ from those extracted by the first deep learning model, and the calculation manner also differs.
It should also be understood that the second deep learning model may be obtained by the detection apparatus from another apparatus before performing step 307, or may be obtained by the detection apparatus training the model through a large amount of training data before performing step 307, which is not limited herein. Specifically, the detection apparatus may train the model to be trained based on the training architecture of the distributed server to obtain the second deep learning model, and the training mode based on the training architecture of the distributed server is similar to the training mode in step 302, which is not described herein again. The second deep learning model may also be obtained by training the model to be trained in other manners, which is not limited herein.
308. Judging whether the second score is larger than a third preset threshold, if so, executing step 309, otherwise, executing step 310;
after the detection device obtains a second score corresponding to the target video frame sequence through calculation of the second deep learning model, it is determined whether the second score is greater than a third preset threshold, if so, step 309 is executed, and if not, step 310 is executed. It should be understood that the third preset threshold is preset by a user or a system, and may be obtained by testing a large amount of sample data, or may be determined by other ways, which is not limited herein.
309. Determining the target video as a pornographic video;
when the detection device determines that the maximum value is larger than a first preset threshold value or determines that a second score corresponding to the target video frame sequence is larger than a third preset threshold value, the detection device determines that the target video is a pornographic video.
310. And determining that the target video is a common video.
When the detection device determines that the maximum value is smaller than a second preset threshold value or determines that a second score corresponding to the target video frame sequence is not larger than a third preset threshold value, the detection device determines that the target video is a common video.
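Putting steps 301–310 together, the two-model cascade can be sketched as below; extract_sequences and score_sequence refer to the earlier sketches (frame-to-tensor conversion omitted), and the threshold arguments correspond to the first, second, and third preset thresholds:

```python
def classify_video(video_path, model1, model2, t1, t2, t3):
    """Two-stage cascade of steps 301-310: the first model filters clear
    cases; the more accurate second model re-scores suspected videos."""
    sequences = extract_sequences(video_path)
    first_scores = [score_sequence(model1, seq) for seq in sequences]
    max_score = max(first_scores)
    if max_score > t1:                  # step 304: clearly pornographic
        return "pornographic"
    if max_score < t2:                  # step 305: clearly a common video
        return "common"
    # steps 306-308: suspected video, re-examine the best-scoring sequence
    target_sequence = sequences[first_scores.index(max_score)]
    second_score = score_sequence(model2, target_sequence)
    return "pornographic" if second_score > t3 else "common"
```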
The method extracts, through a deep learning model, the motion information features and/or picture content features of a plurality of groups of video frame sequences in a target video, calculates a score corresponding to each group of video frame sequences according to those features, judges whether the maximum value of the scores is greater than a preset threshold, and, if so, determines that the target video is a pornographic video. Because the deep learning model is obtained through training on a large amount of video data, the recognition accuracy is high. Pornographic videos can therefore be rapidly identified, and the identification accuracy is improved.
Secondly, in the embodiment of the invention, when the maximum value is not greater than the preset threshold value, the detection device can further identify the target video through another deep learning model with higher accuracy, so that the identification accuracy can be further improved.
It should be further noted that, in addition to the embodiment corresponding to fig. 3, the detection device may use two deep learning models to detect the pornographic video of the target video, and may also use three or more deep learning models to detect the pornographic video of the target video, so as to achieve higher identification accuracy, which is not described herein again in detail.
For convenience of understanding, the pornographic video identification detection method in the embodiment of the invention is described in an application scenario as follows:
the Tencent video website server obtains a Pooling Conv model and a Conv3D model which are trained from a parameter server in advance, and a large amount of sample data tests show that the video frame sequences which are calculated by the Pooling Conv model and are higher than 0.97 all contain pornography content, so 0.97 is set as a first preset threshold, the video frame sequences which are calculated by the Pooling Conv model and are lower than 0.3 do not contain pornography content, 0.3 is set as a second preset threshold, the video frame sequences which are calculated by the Conv3D and are higher than 0.92 all contain pornography video basically, the video frame sequences which are lower than 0.92 do not contain pornography video basically, and 0.92 is set as a third preset threshold.
A user uploads a 20-minute video A to the Tencent video website. The Tencent video website server acquires video A and, according to the time length of the video, randomly extracts 3 groups of video frame sequences from it; specifically, 16 frames are extracted at a fixed time interval of 0.2 seconds to form each group, and the 3 extracted groups are recorded as frame sequence 1, frame sequence 2, and frame sequence 3.
Then, for each group of video frame sequences, the Pooling Conv model (the first deep learning model) is used to extract the motion information features and picture content features corresponding to that group, and a score for the group is calculated from the extracted features. The Pooling Conv model yields a score of 0.3 for frame sequence 1, 0.9 for frame sequence 2, and 0.72 for frame sequence 3 (the first scores). The Tencent video website server determines that the maximum of the scores for the 3 groups is 0.9 and judges whether 0.9 is greater than the first preset threshold (0.97); it is not. It then judges whether 0.9 is smaller than the second preset threshold (0.3); it is not, so video A is considered a suspected pornographic video and further judgment is needed. The server first determines the target video frame sequence with the first score of 0.9, namely frame sequence 2, extracts the motion information features and picture content features corresponding to frame sequence 2 through the Conv3D model (the second deep learning model), and calculates a score of 0.94 for frame sequence 2 (the second score). The server judges whether 0.94 is greater than the third preset threshold (0.92); since it is, it determines that video A is a pornographic video and deletes video A from the website.
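In terms of the cascade sketch above, this scenario corresponds roughly to the following call (the model objects are assumed to be already loaded; the values come from the example):

```python
result = classify_video("video_a.mp4", pooling_conv_model, conv3d_model,
                        t1=0.97, t2=0.3, t3=0.92)
# max first score 0.9: not > 0.97, not < 0.3 -> suspected, re-score frame sequence 2
# second score 0.94 > 0.92 -> result == "pornographic"
```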
Referring to fig. 4, the pornographic video detecting method according to the embodiment of the present invention is described above, and the detecting apparatus according to the embodiment of the present invention is described below, where another embodiment of the detecting apparatus according to the embodiment of the present invention includes:
an extraction module 401, configured to extract a plurality of sets of video frame sequences from a target video;
a first calculating module 402, configured to extract, through a first deep learning model, first motion information features and/or first image content features corresponding to each group of video frame sequences, and calculate, according to the first motion information features and/or the first image content features, first scores corresponding to each group of video frame sequences;
a first judging module 403, configured to determine the maximum value among the first scores and judge whether the maximum value is greater than a first preset threshold;
a first determining module 404, configured to determine that the target video is a pornographic video when the first judging module 403 determines that the maximum value is greater than the first preset threshold.
In the embodiment of the present invention, the first calculating module 402 extracts, through a deep learning model, the motion information features and/or picture content features of a plurality of groups of video frame sequences in a target video and calculates scores corresponding to each group of video frame sequences according to those features; the first judging module 403 judges whether the maximum value of the scores is greater than a preset threshold, and if so, the first determining module 404 determines that the target video is a pornographic video. Because the deep learning model is obtained through training on a large amount of video data, the recognition accuracy is high. Pornographic videos can therefore be rapidly identified, and the identification accuracy is improved.
Based on the embodiment corresponding to fig. 4, please refer to fig. 5, in another embodiment of the detection apparatus provided in the embodiment of the present invention, the detection apparatus may further include:
a second judging module 405, configured to, when the first judging module 403 determines that the maximum value is not greater than the first preset threshold, judge whether the maximum value is smaller than a second preset threshold, where the second preset threshold is smaller than the first preset threshold;
a second determining module 406, configured to determine, when the second judging module 405 determines that the maximum value is not smaller than the second preset threshold, that the video frame sequence corresponding to the maximum value is the target video frame sequence;
the second calculating module 407 is configured to extract, by using a second deep learning model, a second motion information feature and/or a second picture content feature corresponding to the target video frame sequence, and calculate a second score corresponding to the target video frame sequence according to the second motion information feature and/or the second picture content feature, where the calculation accuracy of the second deep learning model is greater than that of the first deep learning model;
a third judging module 408, configured to judge whether the second score is greater than a third preset threshold;
a third determining module 409, configured to determine that the target video is a pornographic video when the third judging module 408 determines that the second score is greater than the third preset threshold.
Optionally, in this embodiment of the present invention, the detecting device may further include:
a fourth determining module 410, configured to determine that the target video is a common video when the second judging module 405 determines that the maximum value is smaller than the second preset threshold, or when the third judging module 408 determines that the second score is not greater than the third preset threshold.
In the embodiment of the present invention, when it is determined that the maximum value of the scores corresponding to the extracted groups of video frame sequences is not greater than the first preset threshold, the second calculation module 407 may further identify the target video through the second deep learning model with higher calculation accuracy, so as to further improve the accuracy of the scheme.
Based on the embodiment corresponding to fig. 4 or fig. 5, referring to fig. 6, in another embodiment of the detection apparatus provided in the embodiment of the present invention, the detection apparatus may further include:
a fifth determining module 411, configured to determine the model to be trained and the first model parameter corresponding to the model to be trained;
a sixth determining module 412, configured to take the first model parameter as the target model parameter and then trigger the first sending module 413 and the second sending module 414;
the first sending module 413 is configured to send the model to be trained and the target model parameter to the first computing server, so that the first computing server obtains a first training sample corresponding to the target model parameter, calculates a first gradient value corresponding to the first training sample according to the model to be trained and the target model parameter, and updates the model to be trained according to the first gradient value to output a second model parameter, where the first training sample includes pornographic video data and normal video data;
a second sending module 414, configured to send the model to be trained and the target model parameter to a second computing server, so that the second computing server obtains a second training sample corresponding to the target model parameter, calculates a second gradient value corresponding to the second training sample according to the model to be trained and the target model parameter, and updates the model to be trained according to the second gradient value to output a third model parameter, where the second training sample includes pornographic video data and normal video data;
the first processing module 415 is configured to, when receiving the second model parameter sent by the first computing server, perform difference processing according to the second model parameter and the target model parameter to obtain a fourth model parameter, use the fourth model parameter as the target model parameter, and trigger the first sending module 413 until the model to be trained converges;
a second processing module 416, configured to, when receiving the third model parameters sent by the second computing server, perform difference processing according to the third model parameters and the target model parameters to obtain fifth model parameters, take the fifth model parameters as the target model parameters, and trigger the second sending module 414 until the model to be trained converges;
a seventh determining module 417, configured to determine that the converged model to be trained obtained after the processing by the first processing module or the second processing module is the first deep learning model.
The embodiment of the invention provides a mode for acquiring the first deep learning model by the detection device, and the realizability of the scheme is improved.
Based on any one of the embodiments corresponding to fig. 4 to fig. 6, in another embodiment of the detection apparatus provided in the embodiment of the present invention, the extraction module may include:
The extraction unit is configured to extract a plurality of groups of video frame sequences according to the time length of the target video, where each group of video frame sequences comprises a preset number of video frames and the time intervals between any two adjacent video frames in each group are equal. The embodiment of the invention thus provides a specific way for the extraction module to extract video frame sequences, improving the realizability of the scheme.
The detection apparatus in the embodiment of the present invention is described above from the perspective of functional modules, and it is described below from the perspective of the hardware entity. The detection apparatus in the embodiment of the present invention is applicable to any computer device; for example, the computer device may be a server for implementing video sharing, or another device with data processing capability. Taking a server as an example, please refer to fig. 7, which is a schematic structural diagram of a server according to an embodiment of the present invention. The server 700 may vary considerably with configuration or performance, and may include one or more Central Processing Units (CPUs) 722 (e.g., one or more processors), a memory 732, and one or more storage media 730 (e.g., one or more mass storage devices) storing an application program 742 or data 744. The memory 732 and the storage medium 730 may be transient storage or persistent storage. The program stored in the storage medium 730 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the central processing unit 722 may be configured to communicate with the storage medium 730 and execute, on the server 700, the series of instruction operations in the storage medium 730.
The server 700 may also include one or more power supplies 726, one or more wired or wireless network interfaces 750, one or more input/output interfaces 758, and/or one or more operating systems 741, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps performed by the detection apparatus in the above-described embodiment may be based on the server structure shown in fig. 7.
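As an illustration of the detection steps themselves (the cascaded method recited in claims 1 to 3 below), the following minimal sketch assumes that coarse_score and fine_score stand for scoring a video frame sequence with the first and second deep learning models, and that the three threshold values are illustrative presets:

def detect_pornographic(sequences, coarse_score, fine_score,
                        t1=0.9, t2=0.5, t3=0.8):
    # Cascade: score every group with the fast first model; escalate only
    # the highest-scoring group to the slower, more accurate second model.
    scores = [coarse_score(seq) for seq in sequences]   # first scores
    best = max(scores)
    if best > t1:                   # first preset threshold
        return True                 # pornographic video
    if best < t2:                   # second preset threshold (t2 < t1)
        return False                # normal video
    target_seq = sequences[scores.index(best)]   # target video frame sequence
    return fine_score(target_seq) > t3           # third preset threshold

Only videos whose best coarse score falls between the two thresholds reach the second model, which is what keeps the cascade fast on clearly normal or clearly pornographic videos while preserving accuracy on borderline cases.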
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A pornographic video detection method is characterized by comprising the following steps:
extracting a plurality of groups of video frame sequences from a target video;
extracting first motion information characteristics and/or first picture content characteristics corresponding to each group of video frame sequences through a first deep learning model, and calculating first scores corresponding to each group of video frame sequences respectively according to the first motion information characteristics and/or the first picture content characteristics;
determining the maximum value among the first scores, and judging whether the maximum value is greater than a first preset threshold;
if the maximum value is greater than the first preset threshold, determining that the target video is a pornographic video;
wherein the first deep learning model is obtained by the following steps:
determining a model to be trained and a first model parameter corresponding to the model to be trained;
taking the first model parameter as a target model parameter, and then performing the following steps 1) and 2) respectively:
1) sending the model to be trained and the target model parameter to a first computing server, so that the first computing server obtains a first training sample corresponding to the target model parameter, calculates a first gradient value corresponding to the first training sample according to the model to be trained and the target model parameter, updates the model to be trained according to the first gradient value, outputs a second model parameter, and then executes step 3), wherein the first training sample comprises pornographic video data and normal video data;
2) sending the model to be trained and the target model parameter to a second computing server, so that the second computing server obtains a second training sample corresponding to the target model parameter, calculates a second gradient value corresponding to the second training sample according to the model to be trained and the target model parameter, updates the model to be trained according to the second gradient value, outputs a third model parameter, and then executes step 4), wherein the second training sample comprises pornographic video data and normal video data;
3) when a second model parameter sent by the first computing server is received, performing difference processing on the second model parameter and the target model parameter to obtain a fourth model parameter, taking the fourth model parameter as the target model parameter, and executing step 4) and/or step 1) until the model to be trained converges;
4) when a third model parameter sent by the second computing server is received, performing difference processing on the third model parameter and the target model parameter to obtain a fifth model parameter, taking the fifth model parameter as the target model parameter, and executing step 3) and/or step 2) until the model to be trained converges;
and determining the converged model to be trained obtained in step 3) or step 4) as the first deep learning model.
2. The method of claim 1, further comprising:
when the maximum value is determined to be not larger than the first preset threshold value, judging whether the maximum value is smaller than a second preset threshold value, wherein the second preset threshold value is smaller than the first preset threshold value;
if the maximum value is not less than the second preset threshold value, determining that the video frame sequence corresponding to the maximum value is a target video frame sequence;
extracting a second motion information feature and/or a second picture content feature corresponding to the target video frame sequence through a second deep learning model, and calculating a second score corresponding to the target video frame sequence according to the second motion information feature and/or the second picture content feature, wherein the calculation precision of the second deep learning model is greater than that of the first deep learning model;
judging whether the second score is larger than a third preset threshold value or not;
and if the second score is greater than the third preset threshold, determining that the target video is a pornographic video.
3. The method of claim 2, further comprising:
and when it is determined that the maximum value is smaller than the second preset threshold, or that the second score is not greater than the third preset threshold, determining that the target video is a normal video.
4. The method of any one of claims 1-3, wherein the extracting of the plurality of groups of video frame sequences from the target video comprises:
and extracting a plurality of groups of video frame sequences according to the time length of the target video, wherein each group of video frame sequences comprises video frames with preset frame numbers, and the time intervals of any two adjacent video frames in each group of video frame sequences are equal.
5. A detection device, comprising:
the extraction module is used for extracting a plurality of groups of video frame sequences from the target video;
the first calculation module is used for extracting a first motion information characteristic and/or a first image content characteristic corresponding to each group of video frame sequences through a first deep learning model and calculating first scores respectively corresponding to each group of video frame sequences according to the first motion information characteristic and/or the first image content characteristic;
the first judging module is used for determining the maximum value among the first scores and judging whether the maximum value is greater than a first preset threshold;
the first determining module is used for determining that the target video is a pornographic video when the first judging module determines that the maximum value is greater than the first preset threshold;
the device further comprises:
the fifth determining module is used for determining a model to be trained and a first model parameter corresponding to the model to be trained;
a sixth determining module, configured to use the first model parameter as a target model parameter, and then trigger the first sending module and the second sending module respectively:
the first sending module is used for sending the model to be trained and the target model parameter to a first computing server so that the first computing server obtains a first training sample corresponding to the target model parameter, calculates a first gradient value corresponding to the first training sample according to the model to be trained and the target model parameter, updates the model to be trained according to the first gradient value and outputs a second model parameter, and the first training sample comprises pornographic video data and normal video data;
the second sending module is used for sending the model to be trained and the target model parameter to a second computing server so that the second computing server obtains a second training sample corresponding to the target model parameter, calculates a second gradient value corresponding to the second training sample according to the model to be trained and the target model parameter, updates the model to be trained according to the second gradient value and outputs a third model parameter, wherein the second training sample comprises pornographic video data and normal video data;
the first processing module is used for, when receiving the second model parameter sent by the first computing server, performing difference processing on the second model parameter and the target model parameter to obtain a fourth model parameter, taking the fourth model parameter as the target model parameter, and triggering the first sending module again until the model to be trained converges;
the second processing module is used for, when receiving the third model parameter sent by the second computing server, performing difference processing on the third model parameter and the target model parameter to obtain a fifth model parameter, taking the fifth model parameter as the target model parameter, and triggering the second sending module again until the model to be trained converges;
and the seventh determining module is used for determining the converged model to be trained, which is obtained after the processing of the first processing module or the second processing module, as the first deep learning model.
6. The apparatus of claim 5, further comprising:
the second judging module is used for judging, when the first judging module determines that the maximum value is not greater than the first preset threshold, whether the maximum value is smaller than a second preset threshold, where the second preset threshold is smaller than the first preset threshold;
a second determining module, configured to determine, when the second determining module determines that the maximum value is not smaller than the second preset threshold, that the video frame sequence corresponding to the maximum value is a target video frame sequence;
the second calculation module is used for extracting a second motion information feature and/or a second picture content feature corresponding to the target video frame sequence through a second deep learning model, and calculating a second score corresponding to the target video frame sequence according to the second motion information feature and/or the second picture content feature, wherein the calculation precision of the second deep learning model is greater than that of the first deep learning model;
the third judging module is used for judging whether the second score is larger than a third preset threshold value;
and the third determining module is used for determining that the target video is the pornographic video when the third judging module determines that the second score is greater than the third preset threshold value.
7. The apparatus of claim 6, further comprising:
a fourth determining module, configured to determine that the target video is a normal video when the second determining module determines that the maximum value is smaller than a second preset threshold, or the third determining module determines that the second score is not greater than a third preset threshold.
8. The apparatus of any one of claims 5 to 7, wherein the extraction module comprises:
and the extraction unit is used for extracting a plurality of groups of video frame sequences according to the time length of the target video, each group of video frame sequences comprises video frames with preset frame numbers, and the time intervals of any two adjacent video frames in each group of video frame sequences are equal.
9. A server, comprising: a memory and a processor;
the memory is used for storing a computer program;
the processor is configured to execute a computer program stored in the memory;
the computer program is for performing the pornographic video detection method of any one of claims 1-4.
10. A computer-readable storage medium, having stored thereon a computer-executable program which, when loaded and executed by a processor, implements the pornographic video detection method according to any one of claims 1-4.
CN201611200177.1A 2016-12-22 2016-12-22 Pornographic video detection method and device Active CN108229262B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611200177.1A CN108229262B (en) 2016-12-22 2016-12-22 Pornographic video detection method and device

Publications (2)

Publication Number Publication Date
CN108229262A CN108229262A (en) 2018-06-29
CN108229262B true CN108229262B (en) 2021-10-15

Family

ID=62656384

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611200177.1A Active CN108229262B (en) 2016-12-22 2016-12-22 Pornographic video detection method and device

Country Status (1)

Country Link
CN (1) CN108229262B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108810620B (en) 2018-07-18 2021-08-17 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for identifying key time points in video
CN109308490B (en) * 2018-09-07 2020-03-17 北京字节跳动网络技术有限公司 Method and apparatus for generating information
CN111126108B (en) * 2018-10-31 2024-05-21 北京市商汤科技开发有限公司 Training and image detection method and device for image detection model
CN109451349A (en) * 2018-10-31 2019-03-08 维沃移动通信有限公司 A kind of video broadcasting method, device and mobile terminal
CN111382605B (en) * 2018-12-28 2023-08-18 广州市百果园信息技术有限公司 Video content auditing method, device, storage medium and computer equipment
CN111385602B (en) * 2018-12-29 2022-08-09 广州市百果园信息技术有限公司 Video auditing method, medium and computer equipment based on multi-level and multi-model
US11961300B2 (en) 2019-04-29 2024-04-16 Ecole Polytechnique Federale De Lausanne (Epfl) Dynamic media content categorization method
CN110414471B (en) * 2019-08-06 2022-02-01 福建省趋普物联科技有限公司 Video identification method and system based on double models
CN111291610B (en) * 2019-12-12 2024-05-28 深信服科技股份有限公司 Video detection method, device, equipment and computer readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5045371B2 (en) * 2007-10-30 2012-10-10 Kddi株式会社 Foreground / background classification apparatus, method, and program for each pixel of moving image
US8358837B2 (en) * 2008-05-01 2013-01-22 Yahoo! Inc. Apparatus and methods for detecting adult videos
US9355406B2 (en) * 2013-07-18 2016-05-31 GumGum, Inc. Systems and methods for determining image safety

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930553A (en) * 2011-08-10 2013-02-13 中国移动通信集团上海有限公司 Method and device for identifying objectionable video content
CN103763515A (en) * 2013-12-24 2014-04-30 浙江工业大学 Video anomaly detection method based on machine learning
CN106503610A (en) * 2015-09-08 2017-03-15 阿里巴巴集团控股有限公司 Video frequency identifying method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Accelerated face detection algorithm with a dual-threshold cascade classifier; Wang Yan et al.; Journal of Computer Applications; 2011-07-31; Vol. 31, No. 7, pp. 1822-1824 *

Also Published As

Publication number Publication date
CN108229262A (en) 2018-06-29

Similar Documents

Publication Publication Date Title
CN108229262B (en) Pornographic video detection method and device
CN110070029B (en) Gait recognition method and device
CN110339569B (en) Method and device for controlling virtual role in game scene
CN110633745A (en) Image classification training method and device based on artificial intelligence and storage medium
CN111611873A (en) Face replacement detection method and device, electronic equipment and computer storage medium
CN110298230A (en) Silent biopsy method, device, computer equipment and storage medium
CN109847366B (en) Data processing method and device for game
CN114331829A (en) Countermeasure sample generation method, device, equipment and readable storage medium
CN113822254B (en) Model training method and related device
CN111949702B (en) Abnormal transaction data identification method, device and equipment
CN110096938A (en) A kind for the treatment of method and apparatus of action behavior in video
CN111539290A (en) Video motion recognition method and device, electronic equipment and storage medium
CN110287848A (en) The generation method and device of video
CN112052816A (en) Human behavior prediction method and system based on adaptive graph convolution countermeasure network
CN110688319B (en) Application keep-alive capability test method and related device
JP2019105871A (en) Abnormality candidate extraction program, abnormality candidate extraction method and abnormality candidate extraction apparatus
CN110688878B (en) Living body identification detection method, living body identification detection device, living body identification detection medium, and electronic device
CN112800923A (en) Human body image quality detection method and device, electronic equipment and storage medium
CN112269937A (en) Method, system and device for calculating user similarity
CN111354013A (en) Target detection method and device, equipment and storage medium
CN111104952A (en) Method, system and device for identifying food types and refrigerator
CN109587248A (en) User identification method, device, server and storage medium
CN110210215A (en) A kind of method and relevant apparatus of viral diagnosis
CN113919488A (en) Method and device for generating countermeasure sample and server
CN114360002A (en) Face recognition model training method and device based on federal learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant