KR20160104826A

KR20160104826A - Method and apparatus for detecting obscene video

Info

Publication number: KR20160104826A
Application number: KR1020150027427A
Authority: KR
Inventors: 유정재; 정치윤; 최수길; 최진우; 한승완
Original assignee: 한국전자통신연구원
Priority date: 2015-02-26
Filing date: 2015-02-26
Publication date: 2016-09-06

Abstract

The present invention relates to a method and an apparatus for determining a harmful video. A method for determining a harmful video according to an embodiment of the present invention includes: detecting shot changes in a video; dividing the video into processing sections on the time axis, based on the detected shot changes, for analyzing the harmfulness of the video; extracting spatiotemporal features in each processing section and calculating a harmfulness probability for the processing section; and determining the harmfulness of the video based on the harmfulness probability calculated for each processing section.

Description

METHOD AND APPARATUS FOR DETECTING OBSCENE VIDEO

The present invention relates to a method and an apparatus for determining a harmful video, and more particularly, to a method and an apparatus for determining a harmful video that improve classification accuracy by using shot length information.

Recently, with the development of communication network technology and the spread of PCs and mobile devices, downloading and viewing video content regardless of time and place has become part of daily life. However, as entertainment has become more convenient, the risk that children and adolescents are exposed to harmful content such as obscene video has also increased. Accordingly, there is growing demand for technology that automatically analyzes video content, determines whether it is harmful, and blocks harmful content.

Recent approaches to identifying and blocking harmful content fall into several types. In the following description, the term 'harmful video' means video content that shows exposed female breasts or sexually explicit acts such as intercourse and caressing.

The first type identifies harmfulness by using feature information that a person can intuitively perceive in harmful images. One example extracts a region with a specific color distribution, such as a skin region, from a frame image extracted from a video, and calculates from the set of pixels in the skin region a feature vector representing the center of gravity and the distribution pattern of the region. A recognizer such as a Multi-Layer Perceptron (MLP) or a Support Vector Machine (SVM), which takes the calculated feature vector as input, is trained to determine whether an input image is harmful. This approach relies on the prior knowledge that harmful content basically depicts sexual behavior with a high degree of exposure, and thus uses characteristics that a person perceives intuitively.

The second type determines whether a frame image extracted from a video is harmful by using statistical characteristics that a machine extracts automatically. It applies to the image recognition problem the BoVW (Bag of Visual Words) model, which adapts a model studied for the automatic classification of document content: a visual vocabulary is constructed from natural features detected automatically in images, and the category of an input image is identified automatically, just as the category of a document is recognized from its content. This method has been reported in academia for classifying pornographic images in which male or female genitalia are exposed.

The third type determines the harmfulness of an input video by using image change information on the time axis rather than the characteristics of individual frame images. Such information may include difference images between frames, the tendency of feature point trajectories detected in the video, and the distribution characteristics of motion vectors. This approach can be used to reduce the errors that can occur when the harmfulness of an entire input video is determined using only features extracted from frame-level still images, as in the first and second types, and thereby to improve the accuracy of the determination result.

SUMMARY OF THE INVENTION It is an object of the present invention to provide a method and an apparatus for judging a harmful video which improves classification accuracy using shot length information.

According to an aspect of the present invention, there is provided a method for determining a harmful video, the method comprising: detecting shot changes in a video; dividing the video into processing sections on the time axis, based on the detected shot changes, for analyzing the harmfulness of the video; extracting spatiotemporal features in each processing section and calculating a harmfulness probability for the processing section; and determining the harmfulness of the video based on the harmfulness probability calculated for each processing section.

According to the present invention, classification accuracy can be improved when a harmful moving picture is discriminated.

FIG. 1 is a flowchart illustrating a method for determining a harmful moving picture according to an embodiment of the present invention.
FIG. 2 is a flowchart showing a harmfulness calculation method according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating shot length information according to an embodiment of the present invention.
FIG. 4 is a diagram showing the distribution of the harmfulness probability in the harmful moving image group and the harmless moving image group acquired in the learning step.
FIG. 5 is a block diagram showing the internal structure of a harmful moving picture determining apparatus according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. Note that, in the drawings, the same components are denoted by the same reference symbols wherever possible. Further, detailed descriptions of well-known functions and constructions that may obscure the gist of the present invention will be omitted.

Throughout the specification, when a part is referred to as being "connected" to another part, this includes not only being "directly connected" but also being "indirectly connected" with another device in between. Likewise, when a part is referred to as "comprising" an element, this means that it may further include other elements rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a flowchart illustrating a method for determining a harmful moving picture according to an embodiment of the present invention.

Referring to FIG. 1, first, in step 110, shot changes in the video are detected. Shot change detection is performed over the whole of the video whose harmfulness is to be determined. According to an embodiment of the present invention, the change in color distribution between frames is calculated from the start frame to the end frame of the video, the frames at which a new shot starts are detected, and these frames can be selected and stored as shot division frames.
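The shot change detection of step 110 can be sketched as follows. This is a minimal illustration, not the method of the embodiment: it assumes each frame is a flat list of 0 to 255 intensity values, uses a coarse intensity histogram as a stand-in for the color distribution, and marks a shot division frame when the L1 distance between consecutive histograms exceeds a threshold. The function names, the bin count, and the threshold value are hypothetical.

```python
def color_histogram(frame, bins=8):
    """Normalized histogram of pixel intensities (0-255) for one frame."""
    hist = [0] * bins
    for value in frame:
        hist[min(value * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [count / total for count in hist]


def detect_shot_changes(frames, threshold=0.5):
    """Return indices of frames at which a new shot is assumed to start.

    A frame is marked as a shot division frame when the L1 distance between
    its histogram and the previous frame's histogram exceeds `threshold`
    (a hypothetical value that would need tuning on real data).
    """
    if not frames:
        return []
    boundaries = []
    previous = color_histogram(frames[0])
    for index in range(1, len(frames)):
        current = color_histogram(frames[index])
        distance = sum(abs(a - b) for a, b in zip(previous, current))
        if distance > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


# Three dark frames followed by three bright frames (toy data).
dark = [10] * 16
bright = [240] * 16
print(detect_shot_changes([dark, dark, dark, bright, bright, bright]))
```

The returned indices play the role of the stored shot division frames used by the later steps.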

Then, in step 120, the video is divided into processing sections on the time axis for analyzing its harmfulness, based on the detected shot changes. That is, the processing sections are divided in consideration of the shot division frames stored in step 110. For ease of harmfulness analysis, the sections are divided so that different shots are not included in the same section, and the section length may be either a fixed number of frames or variable.
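A sketch of the section division of step 120, under the fixed-length option. It assumes frames are indexed from 0, that sections are half-open `(start, end)` ranges, and that the last section of a shot may be shorter than the fixed length so that no section crosses a shot division frame. The function name and the default length of 30 frames are hypothetical.

```python
def split_into_sections(num_frames, shot_starts, section_length=30):
    """Split frames [0, num_frames) into (start, end) sections, end exclusive.

    No section crosses a shot division frame. Within a shot, sections have
    a fixed length; the last section of each shot may be shorter.
    """
    cuts = sorted(set(s for s in shot_starts if 0 < s < num_frames))
    shot_edges = [0] + cuts + [num_frames]
    sections = []
    for shot_start, shot_end in zip(shot_edges, shot_edges[1:]):
        start = shot_start
        while start < shot_end:
            end = min(start + section_length, shot_end)
            sections.append((start, end))
            start = end
    return sections


# A 100-frame video with new shots starting at frames 40 and 70.
print(split_into_sections(100, [40, 70], section_length=30))
```

A variable-length variant could, for example, treat each whole shot as one section; the text does not specify which variable-length rule is intended.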

Next, in step 130, spatiotemporal features are extracted in each processing section, and the harmfulness probability of the section is calculated. Here, spatiotemporal features cover both spatial features, such as the color and texture of still images extracted frame by frame, and temporal features describing image change, such as difference images, motion distribution, and feature trajectory tendency. According to an embodiment of the present invention, in a learning step performed in advance, model information for the harmful and harmless video groups can be obtained by applying a machine learning method such as a Support Vector Machine (SVM) to feature vectors calculated from the spatiotemporal features of a large amount of harmful and harmless training data. Then, at determination time, a feature vector is calculated for each section of the input video in the same manner as in the learning step, and a harmfulness probability with a value in the interval [0.0, 1.0] can be calculated using the model information acquired in the learning step.

FIG. 2 shows the process of calculating the harmfulness probability in more detail.

FIG. 2 is a flowchart showing a harmfulness calculation method according to an embodiment of the present invention.

Referring to FIG. 2, in calculating the harmfulness probability, first, in step 210, shot length information for the current processing section is estimated. The shot length information will be described with reference to FIG. 3.

FIG. 3 is a diagram illustrating shot length information according to an embodiment of the present invention.

FIG. 3 shows the shot length information defined for the current processing section. In FIG. 3, each rectangle 340 filled with black denotes a shot division frame selected during shot change detection, and the dotted box 310 denotes the processing section on the time axis that is currently being analyzed. When the shot length information is estimated, the shot length can be defined as the length from a specific reference frame within the processing section back to the most recent preceding shot division frame, as with L1 320 in FIG. 3.
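The L1-style shot length described above can be sketched as a lookup of the most recent shot division frame before a reference frame. The choice of reference frame (for instance the last frame of the section) is an assumption, as is the fallback to frame 0 when no shot division frame precedes the reference frame; the function name is hypothetical.

```python
import bisect


def shot_length(reference_frame, shot_starts):
    """Frames elapsed from the latest shot division frame to the reference frame.

    `shot_starts` must be sorted in ascending order. If no shot division
    frame lies at or before the reference frame, the video start (frame 0)
    is used instead (an assumption of this sketch).
    """
    position = bisect.bisect_right(shot_starts, reference_frame)
    last_start = shot_starts[position - 1] if position > 0 else 0
    return reference_frame - last_start


# Shots start at frames 40 and 70; the reference frame is frame 95.
print(shot_length(95, [40, 70]))
```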

In step 220, the harmfulness of the processing section is determined based on the spatiotemporal features and the shot length information. P(Obj | m, t) and P(Non | m, t), defined by equations (1) and (2), are calculated, and if P(Obj | m, t) is greater than P(Non | m, t), the current processing section can be determined to be a harmful video section. Here, m is the harmfulness probability, with a value from 0.0 to 1.0, and t is the shot length information, an integer number of frames. P(Obj) is the prior probability that the current processing section is harmful video, and P(Non) is the prior probability that it is harmless video.

Equation (1)

Equation (2)

Assuming that the prior probabilities for the harmfulness of a processing section, P(Obj) and P(Non), are equal in equations (1) and (2), P(Obj | m, t) and P(Non | m, t) can be interpreted as values that include the same proportionality constant k, as in equations (3) and (4). As a result, if P(m | Obj) P(t | Obj) is greater than P(m | Non) P(t | Non), the current processing section can be determined to be a harmful video section.

Equation (3)

Equation (4)
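Equations (1) to (4) appear only as placeholders in this text. A form consistent with the surrounding definitions, assuming Bayes' rule and assuming that m and t are conditionally independent given the class, would be the following. This is a reconstruction under those assumptions and may differ in notation from the original equations.

```latex
\begin{align}
P(\mathrm{Obj} \mid m, t) &= \frac{P(m \mid \mathrm{Obj})\, P(t \mid \mathrm{Obj})\, P(\mathrm{Obj})}{P(m, t)} \tag{1} \\
P(\mathrm{Non} \mid m, t) &= \frac{P(m \mid \mathrm{Non})\, P(t \mid \mathrm{Non})\, P(\mathrm{Non})}{P(m, t)} \tag{2} \\
P(\mathrm{Obj} \mid m, t) &= k\, P(m \mid \mathrm{Obj})\, P(t \mid \mathrm{Obj}) \tag{3} \\
P(\mathrm{Non} \mid m, t) &= k\, P(m \mid \mathrm{Non})\, P(t \mid \mathrm{Non}) \tag{4}
\end{align}
```

Under equal priors, the common factor would be k = P(Obj) / P(m, t) = P(Non) / P(m, t), so comparing (3) with (4) reduces to comparing the two likelihood products.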

P(m | Obj), P(t | Obj), P(m | Non) and P(t | Non) are likelihood values for m and t in harmful and harmless video. They are obtained from density functions of the distribution of harmfulness probability values and shot length values calculated for harmful videos and harmless videos. The density function can be estimated as a continuous function, but it can be replaced with approximate density values in the form of a histogram obtained in the learning step, as shown in FIG. 4.

FIG. 4 is a diagram showing the distribution of the harmfulness probability in the harmful moving image group and the harmless moving image group acquired in the learning step.

FIG. 4 is a histogram of the harmfulness probability. The horizontal axis, labeled 0 to 10, records frequency at intervals of 0.1 over harmfulness probabilities from 0.0 to 1.0, and the histogram is normalized so that the total sum is 1. Approximate values of P(m | Obj) and P(m | Non) can be obtained from this histogram. By calculating histograms of the shot length distribution for the harmful and harmless moving image groups in the same manner, approximate values of P(t | Obj) and P(t | Non) can be obtained.
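The histogram approximation and the section-level comparison of the two likelihood products can be sketched as follows. Several details are assumptions of this sketch rather than part of the description: the number of bins, the upper bound on shot length, the small floor that keeps empty bins from producing exact zeros, and all training values, which are invented toy numbers.

```python
def build_histogram(values, num_bins, low, high):
    """Normalized histogram (bins sum to 1) of training values over [low, high]."""
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for value in values:
        index = int((value - low) / width)
        counts[max(0, min(index, num_bins - 1))] += 1
    return [count / len(values) for count in counts]


def likelihood(histogram, value, low, high, floor=1e-6):
    """Look up the approximate density for `value`; `floor` avoids exact zeros."""
    width = (high - low) / len(histogram)
    index = max(0, min(int((value - low) / width), len(histogram) - 1))
    return max(histogram[index], floor)


def is_harmful_section(m, t, model):
    """Compare P(m|Obj) P(t|Obj) with P(m|Non) P(t|Non) using histograms."""
    max_t = model["max_t"]
    obj = likelihood(model["m_obj"], m, 0.0, 1.0) * likelihood(model["t_obj"], t, 0, max_t)
    non = likelihood(model["m_non"], m, 0.0, 1.0) * likelihood(model["t_non"], t, 0, max_t)
    return obj > non


# Hypothetical training values, for illustration only.
MAX_T = 500
model = {
    "max_t": MAX_T,
    "m_obj": build_histogram([0.9, 0.8, 0.85, 0.7], 10, 0.0, 1.0),
    "m_non": build_histogram([0.1, 0.2, 0.3, 0.15], 10, 0.0, 1.0),
    "t_obj": build_histogram([300, 400, 250, 350], 10, 0, MAX_T),
    "t_non": build_histogram([30, 50, 40, 60], 10, 0, MAX_T),
}
print(is_harmful_section(0.85, 320, model))
print(is_harmful_section(0.15, 40, model))
```

Because the priors are assumed equal, the constant k cancels and only the two products need to be compared.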

Referring back to FIG. 1, in step 140, the harmfulness of the video is determined based on the harmfulness probability calculated for each processing section. According to an embodiment of the present invention, if the sections determined to be harmful account for a certain ratio or more of the video, the video is classified as a harmful video.
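The video-level decision of step 140 can be sketched as a simple ratio test. The text does not state the threshold or whether the ratio is measured by section count or by duration; this sketch assumes a count-based ratio and a hypothetical threshold of 0.3.

```python
def is_harmful_video(section_flags, min_ratio=0.3):
    """Classify a video as harmful when enough of its sections are harmful.

    `section_flags` holds one boolean per processing section. `min_ratio`
    is a hypothetical threshold; the ratio is computed by section count.
    """
    if not section_flags:
        return False
    harmful = sum(1 for flag in section_flags if flag)
    return harmful / len(section_flags) >= min_ratio


print(is_harmful_video([True, False, False, True]))
```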

Harmful videos tend to show a specific body region for a long time, or a specific action continuously, in order to give the viewer sexual satisfaction, so long-take shots appear in them frequently. According to the present invention, the accuracy of harmfulness determination can be improved by using the shot length information of a video as feature information.

The embodiments of the invention described so far may be embodied as instructions that are stored in a computer-readable storage medium and executed by a processor. When these instructions are executed by a processor, they may generate means for implementing the functions/operations specified in the flowcharts and/or block diagrams described above. Each block in the flowcharts/block diagrams may represent hardware and/or software modules or logic that implement embodiments of the present invention. Further, the functions referred to in the block diagrams may occur out of the order shown in the drawings, or may occur simultaneously.

The computer-readable medium can include, for example, but is not limited to, nonvolatile memory such as a floppy disk, ROM, flash memory, disk drive memory, CD-ROM, and other persistent storage.

FIG. 5 is a block diagram showing the internal structure of a harmful moving picture determining apparatus according to an embodiment of the present invention.

Referring to FIG. 5, the harmful moving picture determining apparatus 500 according to an embodiment of the present invention may include a moving picture input unit 510, a storage unit 520, and a control unit 530.

The moving picture input unit 510 is an interface through which a video whose harmfulness is to be determined is input to the harmful moving picture determining apparatus 500. The moving picture input unit 510 according to an embodiment of the present invention may be a removable storage medium such as a USB device, or may be a wired/wireless communication unit when a video is input through a network.

The storage unit 520 stores programs and data necessary for the operation of the harmful moving picture determining apparatus 500. The storage unit 520 may be a volatile storage medium, a nonvolatile storage medium, or a combination of both. The volatile storage medium may include semiconductor memory such as RAM, DRAM, and SRAM. The nonvolatile storage medium may include a hard disk and NAND flash memory. According to an embodiment of the present invention, the storage unit 520 may store shot division frame information, training data, and the like.

The control unit 530 is a component that controls the overall operation of the harmful moving picture determining apparatus 500. In the present invention, the control unit 530 controls the overall process by which the harmful moving picture determining apparatus 500 determines the harmfulness of an input video. According to an embodiment of the present invention, the control unit 530 may include a shot change detection unit 531, a video section division unit 533, a section harmfulness probability estimation unit 535, and the like.

The shot change detection unit 531 performs shot division on the entire video when the video is first input. The shot change detection unit 531 calculates the change in color distribution between frames from the start frame to the end frame of the video whose harmfulness is to be determined, detects the frames at which a new shot starts, and selects and stores these frames as shot division frames.

The video section division unit 533 divides the input video into processing sections on the time axis for analyzing its harmfulness, in consideration of the shot division frame information calculated by the shot change detection unit 531. For ease of harmfulness analysis, the sections are divided so that different shots are not included in the same section, and the section length may be either a fixed number of frames or variable.

The section harmfulness probability estimation unit 535 extracts spatiotemporal features from each divided section and calculates the harmfulness probability of the section from the extracted features. Here, spatiotemporal features cover both spatial features, such as the color and texture of still images extracted frame by frame, and temporal features describing image change, such as difference images, motion distribution, and feature trajectory tendency. In a learning step, model information for the harmful and harmless video groups can be obtained by applying a machine learning method such as a Support Vector Machine (SVM) to feature vectors calculated from the spatiotemporal features of a large amount of harmful and harmless training data. Then, at determination time, a feature vector is calculated for each section of the input video in the same manner as in the learning step, and a harmfulness probability with a value in the interval [0.0, 1.0] can be calculated using the model information acquired in the learning step.

The section harmfulness probability estimation unit 535 may further include a section shot length calculation unit 536 and a section harmfulness determination unit 537.

The section shot length calculation unit 536 can estimate the shot length information, defined as in FIG. 3, for the current processing section. In FIG. 3, each rectangle 340 filled with black denotes a shot division frame selected during shot change detection, and the dotted box 310 denotes the processing section on the time axis that is currently being analyzed. When the shot length information is estimated, the shot length can be defined as the length from a specific reference frame within the processing section back to the most recent preceding shot division frame, as with L1 320 in FIG. 3.

The section harmfulness determination unit 537 determines the harmfulness of the current processing section by using the spatiotemporal-feature-based harmfulness probability calculated by the section harmfulness probability estimation unit 535 and the shot length information calculated by the section shot length calculation unit 536. P(Obj | m, t) and P(Non | m, t), defined by equations (1) and (2), are estimated, and if P(Obj | m, t) is greater than P(Non | m, t), the current processing section can be determined to be a harmful video section. Here, m is the harmfulness probability, with a value from 0.0 to 1.0, and t is the shot length information, an integer number of frames. P(Obj) is the prior probability that the current processing section is harmful video, and P(Non) is the prior probability that it is harmless video.

Equation (1)

Equation (2)

Equation (3)

Equation (4)

P(m | Obj), P(t | Obj), P(m | Non) and P(t | Non) are likelihood values for m and t in harmful and harmless video. They are obtained from density functions of the distribution of the harmfulness probability calculated by the section harmfulness probability estimation unit 535 and of the shot length values calculated by the section shot length calculation unit 536, for harmful videos and harmless videos respectively. The density function can be estimated as a continuous function, but it can be replaced with approximate density values in the form of a histogram obtained in the learning step, as shown in FIG. 4.

In the above description, the control unit 530, the shot change detection unit 531, the video section division unit 533, and the section harmfulness probability estimation unit 535 are described as separate blocks that each perform a different function. However, this is merely for convenience of description, and the functions need not be divided in this way. For example, the control unit 530 itself may perform the functions of the shot change detection unit 531, the video section division unit 533, and the section harmfulness probability estimation unit 535.
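The point that the block boundaries are a matter of convenience can be illustrated with a control unit that simply composes interchangeable callables. Everything here is a hypothetical sketch: the class name, the callable signatures, the count-based ratio, and the stub sub-units used in the example are not taken from the description.

```python
class ControlUnit:
    """Composes three sub-units; each is any callable with the stated role."""

    def __init__(self, detect_shots, divide_sections, assess_section, min_ratio=0.3):
        self.detect_shots = detect_shots        # frames -> shot division frame indices
        self.divide_sections = divide_sections  # (frame count, shot starts) -> [(start, end)]
        self.assess_section = assess_section    # (section frames, start, shot starts) -> bool
        self.min_ratio = min_ratio              # hypothetical video-level threshold

    def judge(self, frames):
        """Return True when the ratio of harmful sections reaches `min_ratio`."""
        shot_starts = self.detect_shots(frames)
        sections = self.divide_sections(len(frames), shot_starts)
        flags = [
            bool(self.assess_section(frames[start:end], start, shot_starts))
            for start, end in sections
        ]
        if not flags:
            return False
        return sum(flags) / len(flags) >= self.min_ratio


# Stub sub-units, for illustration only.
unit = ControlUnit(
    detect_shots=lambda frames: [2],
    divide_sections=lambda n, starts: [(0, 2), (2, n)],
    assess_section=lambda frames, start, starts: start >= 2,
)
print(unit.judge(["f0", "f1", "f2", "f3"]))
```

Whether the sub-units are separate modules or methods of the control unit itself does not change the flow of data between them.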

The embodiments of the present invention disclosed in the present specification and drawings are merely specific examples presented to facilitate understanding of the present invention, and are not intended to limit its scope. It will be apparent to those skilled in the art that other modifications based on the technical idea of the present invention are possible in addition to the embodiments disclosed herein.

510: Moving picture input unit
520: Storage unit
530: Control unit

Claims

A method for determining a harmful moving picture, the method comprising:
detecting a shot change of a moving picture;
dividing the moving picture into processing sections on a time axis, based on the detected shot change, for analyzing the harmfulness of the moving picture;
extracting spatiotemporal features in each processing section and calculating a harmfulness probability of the processing section; and
determining the harmfulness of the moving picture based on the harmfulness probability calculated for each of the processing sections.