KR20110056999A

KR20110056999A - Method and apparatus for malicious photo filtering using semantic features of digital photo

Info

Publication number: KR20110056999A
Application number: KR1020090113523A
Authority: KR
Inventors: 노용만; 한승완; 민현석; 김세민; 정병호
Original assignee: 한국전자통신연구원; 한국과학기술원
Priority date: 2009-11-23
Filing date: 2009-11-23
Publication date: 2011-05-31

Abstract

PURPOSE: An apparatus for identifying and blocking harmful images is provided that efficiently models broad harmful concepts through semantic-based features and thereby blocks harmful images efficiently. CONSTITUTION: A pre-processing unit (2000) divides an input image into one or more regions. A visual feature generator (3000) generates visual features for the divided regions. A semantic feature generator (4000) generates semantic features from the per-region visual features. A harmful image blocking unit (5000) decides the final harmfulness of the image. A semantic feature modeling unit (6000) builds the semantic feature models that are stored in a semantic feature model storage (4300).

Description

Method and apparatus for identifying and blocking harmful images using semantic-based features {Method and apparatus for malicious photo filtering using semantic features of digital photo}

The present invention relates to a method and apparatus for identifying and blocking harmful images, and more particularly to a method and apparatus that identify and block harmful images using semantic-based features.

The present invention is derived from research conducted as part of a project of the Ministry of Knowledge Economy [Project management No.: 2009-F-054-01, Project title: Development of harmful multimedia content analysis/blocking technology].

With the recent spread of information spaces such as the Internet, IPTV, and social networks, the distribution and consumption of multimedia content are increasing rapidly. In particular, owing to the development of the Internet and of content production devices for individual users, the distribution of content created by ordinary, non-expert users is growing rapidly alongside content created by professionals. Unlike professionally produced content, user created content (UCC) does not come with reliable metadata (data describing the data, e.g., EPG, PPG) that describes it. The development of the Internet and of content production technology has increased the amount and accessibility of user-generated content, but because information about the content is lacking, harmful content such as pornography and violent material is distributed indiscriminately and cannot be blocked effectively.

To solve this problem, much research has been conducted on blocking harmful content. Existing methods can be classified into IP-based blocking, text-based blocking, and image-content-based blocking. First, IP-based blocking blocks harmful content based on the IP address at which the content resides. However, content on the Internet is not only growing quickly; the IP address of a piece of content is also highly fluid and can change. Blocking harmful content from a limited list of IP addresses is therefore inadequate. Second, text-based blocking uses textual information such as the title, description, and surrounding words of the content. Blocking content that contains words associated with harmful content also has several problems. First, a word can have several meanings: a word associated with harmful content may appear, with a different meaning, alongside ordinary content, so content that is not harmful may be blocked. Second, word-based blocking cannot detect words that are deliberately misspelled to evade blocking. Finally, no sufficiently complete set of words related to harmful meanings exists, which limits how well harmful content can be blocked.

To address these problems, many researchers have studied content-based harmful content blocking, which analyzes the content itself to determine whether harmful material such as nudity is present and blocks it accordingly. Most existing content-based techniques, however, rely on low-level features grounded in visual elements such as the color and texture of human skin. Harmful concepts (violence, nudity, ...) are too ambiguous and complex to model with low-level features alone. Overcoming this limitation of content-based blocking built on low-level features remains an open challenge.

The present invention is intended to solve the problems of the existing harmful image identification and blocking systems described above. Its object is to provide a method and apparatus that obtain, from an image, semantic features for judging harmful meaning and that identify and block harmful content on that basis.

To achieve this object, the harmful image blocking method of the present invention comprises: a preprocessing step of dividing an input image into a plurality of regions; a visual feature generation step of generating visual features for each divided region; a semantic feature modeling step of modeling semantic features using the generated visual features; a semantic feature generation step of generating semantic features of each divided region based on the generated visual features; a semantic feature merging step of merging the generated semantic features; a harmfulness modeling step of modeling harmfulness based on semantic features; a harmfulness determination step of determining whether the input image is harmful based on the merged semantic features; and a harmful image blocking step of blocking the input image determined to be harmful in the harmfulness determination step.

Here, the visual feature generation step may include a visual feature extraction step of extracting visual features for each divided region and a feature normalization step of normalizing the extracted visual features. In the visual feature extraction step, the visual features may be extracted using the color, texture, and shape information of the input image or MPEG-7 descriptors.

In the semantic feature modeling step and the harmfulness modeling step, a support vector machine (SVM) may be used to model the semantics of each divided region and the harmful meaning, respectively. In the semantic feature merging step, it is preferable to merge by taking, for each semantic feature, the largest confidence value among the divided regions. In the harmfulness determination step, the image may be determined to be harmful when the confidence for a harmful concept is greater than or equal to a predetermined threshold.

The object of the present invention can also be achieved by a harmful image blocking apparatus comprising: an image input unit that receives an image; a preprocessing unit that divides the input image into a plurality of regions; a visual feature generation unit that generates visual features for each divided region; a semantic feature modeling unit that models semantic features using the generated visual features; a semantic feature generation unit that generates semantic features of each divided region based on the generated visual features; a semantic feature merging unit that merges the generated semantic features; a harmfulness modeling unit that models harmfulness based on semantic features; a harmfulness determination unit that determines whether the input image is harmful based on the merged semantic features; and a harmful image blocking unit that blocks the input image determined to be harmful by the harmfulness determination unit.

Here, the preprocessing unit may include a region template model storage unit that stores a region template model and a region division unit that divides the input image into a plurality of regions using the region template model. The visual feature generation unit may include a visual feature extraction unit that extracts the visual features of each divided region and a feature normalization unit that normalizes the extracted visual features.

According to the present invention, an ambiguous harmful concept such as nudity, which is difficult to model with content-based visual feature values such as color, texture, and shape, is modeled effectively using semantic-based features of the smaller concepts that make up the larger harmful concept, such as breasts and buttocks, so that harmful images can be blocked more effectively.

Advantages and features of the present invention, and methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in various different forms; the embodiments are provided only to make the disclosure complete and to fully convey the scope of the invention to those of ordinary skill in the art to which the invention pertains, and the invention is defined only by the scope of the claims. The terminology used herein is for describing the embodiments and is not intended to limit the invention. In this specification, the singular form also includes the plural unless specifically stated otherwise. As used herein, "comprises" and/or "comprising" do not exclude the presence or addition of one or more components, steps, operations, and/or elements other than those mentioned.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. FIG. 1 is a block diagram showing the configuration of a harmful image blocking apparatus according to an embodiment of the present invention.

As shown in FIG. 1, the harmful image blocking apparatus according to an embodiment of the present invention includes an image input unit (Image input part, 1000), a preprocessing unit (Pre-processing part, 2000), a visual feature generation unit (Visual feature generation, 3000), a semantic feature generation unit (Semantic feature generation part, 4000), a semantic feature modeling unit (Semantic feature modeling, 6000), a harmful image blocking unit (Malicious image filtering part, 5000), and a harmfulness modeling unit (Malice modeling, 7000).

The image input unit 1000 receives an image whose harmfulness is to be judged. In the embodiment, the image input unit 1000 may receive user-created content over the Internet or content stored on any remote or local storage device. The embodiment assumes widely used content file types such as JPEG, BMP, and GIF, but is not limited to these. Here, harmful images include violent and obscene images.

FIG. 2 is a diagram showing the configuration and function of the preprocessing unit 2000 of the harmful image blocking apparatus according to an embodiment of the present invention.

As shown in FIG. 2, when an image is input through the image input unit 1000, the preprocessing unit 2000 divides it into one or more regions based on a region template model in order to extract the spatial distribution of the semantics within the image. For this purpose, the preprocessing unit 2000 includes a region template model (Region template model, 2200) and a region division unit (Region segmentation, 2100). The segmented regions (Segmented region, 2300) produced by the region division unit 2100 are passed as input to the visual feature generation unit 3000.

FIG. 3 shows the region segmentation templates for an image according to an embodiment of the present invention. In this embodiment, the image can be divided using six templates, as shown in FIG. 3. The six region segmentation templates are expressed by [Equation 1] below.

Here, R(t) denotes the t-th region segmentation template.

If the input image I has width w and height h, the coordinates of each region segmentation template are expressed by [Equation 2] below.

Here, left(t) is the x coordinate of the left edge of the t-th template, top(t) is the y coordinate of its upper-left corner, right(t) is the x coordinate of its right edge, and bottom(t) is the y coordinate of its lower-right corner.

The coordinates of each template in [Equation 2] are given by [Equation 3] below, and FIG. 3 illustrates them graphically.

The input image I divided according to the region segmentation templates can be expressed by [Equation 4] below.
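
The template-based division described above can be sketched roughly as follows. This is only an illustration: the six boxes in `TEMPLATES` are hypothetical placeholders (the coordinate values of [Equation 3] are not reproduced in this text), and `template_boxes` and `crop` are names invented for the example. Each template is stored as (left, top, right, bottom) fractions of the image and converted to pixel coordinates for an image of width w and height h.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

# HYPOTHETICAL templates as fractions of the image size. The patent's actual
# six templates (Equation 3 / FIG. 3) are not available in this text.
TEMPLATES = [
    (0.0, 0.0, 1.0, 1.0),      # whole image
    (0.0, 0.0, 1.0, 0.5),      # top half
    (0.0, 0.5, 1.0, 1.0),      # bottom half
    (0.0, 0.0, 0.5, 1.0),      # left half
    (0.5, 0.0, 1.0, 1.0),      # right half
    (0.25, 0.25, 0.75, 0.75),  # center
]


def template_boxes(w: int, h: int) -> List[Box]:
    """Return the pixel box of every template for a w-by-h image."""
    return [
        (int(l * w), int(t * h), int(r * w), int(b * h))
        for l, t, r, b in TEMPLATES
    ]


def crop(image, box: Box):
    """Crop a row-major 2D list of pixels; right and bottom are exclusive."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]
```

Each cropped region would then be handed to visual feature extraction independently, which is what lets the later stages reason about where in the image a given semantic appears.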

FIG. 4 is a diagram showing the configuration and function of the visual feature generation unit 3000 of the harmful image blocking apparatus according to an embodiment of the present invention.

As shown in FIG. 4, the visual feature generation unit 3000 generates visual features for each of the segmented regions 2300 obtained from the preprocessing unit 2000. For this purpose, it includes a visual feature extraction unit (Visual feature extraction, 3100) and a feature normalization unit (Feature normalization, 3200). That is, visual features are extracted from each segmented region 2300 by the visual feature extraction unit 3100, and the extracted visual features are normalized by the feature normalization unit 3200.

The visual feature values used can be expressed by [Equation 5] below.

Here, the symbol (omitted in this text) denotes the number of feature values used. In the embodiment, visual feature values are extracted primarily from the color, texture, and shape information of the image, and MPEG-7 descriptors are used as the basic extraction method. The extraction method is not limited to MPEG-7 descriptors, however; BoVW (bag of visual words) and other general visual descriptors can of course be used as well. The extracted feature values are also normalized so that, when several visual features with different ranges and dimensions are used together, the result does not depend only on a few dominant features.
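
The text does not say which normalization is applied. As one possible reading, a minimal sketch of per-dimension min-max scaling is shown below; the choice of min-max scaling and the function name `min_max_normalize` are assumptions made for illustration, not the patent's stated method.

```python
def min_max_normalize(vectors):
    """Scale every feature dimension to [0, 1] across the given vectors.

    Assumed scheme (min-max); a constant dimension is mapped to 0.0 so that
    it cannot dominate or cause a division by zero.
    """
    dims = len(vectors[0])
    lows = [min(v[d] for v in vectors) for d in range(dims)]
    highs = [max(v[d] for v in vectors) for d in range(dims)]
    normalized = []
    for v in vectors:
        normalized.append([
            0.0 if highs[d] == lows[d]
            else (v[d] - lows[d]) / (highs[d] - lows[d])
            for d in range(dims)
        ])
    return normalized
```

After scaling, a feature measured in the hundreds and a feature measured in fractions contribute on a comparable scale, which is the stated purpose of the normalization unit.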

The visual feature values extracted from the image region delimited by template R are expressed by [Equation 6] below.

The harmful image blocking apparatus according to an embodiment models, from the given region-based feature values, the semantic features contained in each segmented region of the image. This function is performed by the semantic feature modeling unit 6000.

FIG. 5 shows the configuration and process for modeling semantic features in the harmful image blocking apparatus according to an embodiment of the present invention.

The semantic feature modeling unit 6000 builds the semantic feature models to be stored in the semantic feature model storage unit 4300. As shown in FIG. 5, it generates the semantic feature models (4300: 4310, 4320) in advance by training on sample images through a harmful semantic feature modeling unit 6100 and a non-harmful semantic feature modeling unit 6200. The semantic feature models (4300: 4310, 4320) are pre-trained semantic models for judging harmful content and are used by the semantic feature generation unit 4000.

To generate the semantic feature models 4300, the semantic features relevant to discriminating harmful from harmless meaning are first defined. Denoting the semantic features by L, the harmful and harmless semantic features are expressed by [Equation 7] below.

Here, L(l) denotes the l-th semantic feature, and N_semantic is the total number of semantic features.

The semantic features used to judge harmful meaning are chosen to include both harmful semantic features, which support a harmful meaning, and non-harmful semantic features, which do not. For example, semantic features such as breasts, abdomen, buttocks, and genitals support a harmful meaning, while semantic features such as clothes, swimwear, underwear, buildings, roads, trees, sky, and sea do not.

Next, training sample images are collected for each semantic in order to model the semantic features described above. The visual feature values are extracted from the collected images and learned using a classifier. An SVM (support vector machine) is one example of such a classifier, but the implementation of the present invention is of course not limited to SVMs. Each trained SVM_semantic is expressed by [Equation 8] below.

Here, SVM_L denotes the SVM trained for semantic feature L. The visual feature vector X described above is used as the input to SVM_semantic.

The semantic feature generation unit 4000 generates semantic features from the per-region visual features 3300 obtained from the visual feature generation unit 3000. It consists of the semantic feature model storage unit 4300, which holds the semantic feature models generated by the semantic feature modeling unit 6000 described above, a per-region semantic feature generation unit 4100, and a semantic feature merging unit 4200.

The per-region semantic feature generation unit 4100 generates per-region semantic features from the per-region visual features 3300 obtained from the visual feature generation unit 3000. Once the per-region semantic features have been generated, the semantic feature merging unit 4200 uses them to generate the semantic features that represent the image as a whole.

FIG. 6 schematically shows the process of extracting the harmful/harmless semantic feature vector for each region. For each segmented region R, the semantic feature is obtained by [Equation 9] below.

Here, X_R denotes the visual feature vector of segmented region R, and F_L(R) denotes the confidence of region R for semantic feature L. This value indicates how strongly the segmented region image contains the meaning of the corresponding semantic feature.

In this way, the SVMs are run for every semantic feature selected as harmful or harmless (4310, 4320). From the resulting confidences, a single semantic feature vector 4120 is built for the segmented region R. The semantic feature vector generated in this way can be expressed by [Equation 10] below.

Here, F_semantic(R) denotes the values of all semantic features extracted for segmented region R.
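
The construction of a per-region semantic feature vector can be sketched as below. Trained SVMs are replaced here by simple linear scoring functions so that the example stays self-contained; `linear_scorer`, `region_semantic_vector`, and the labels used in the usage note are hypothetical stand-ins, not the patent's models.

```python
from typing import Callable, Dict, List

# A scorer maps a region's visual feature vector X_R to a confidence F_L(R).
Scorer = Callable[[List[float]], float]


def linear_scorer(weights: List[float], bias: float) -> Scorer:
    """Stand-in for a trained per-semantic classifier (assumption: linear)."""
    def score(x: List[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score


def region_semantic_vector(
    x_r: List[float], scorers: Dict[str, Scorer]
) -> Dict[str, float]:
    """Run every semantic scorer on one region and collect the confidences."""
    return {label: score(x_r) for label, score in scorers.items()}
```

For example, with `scorers = {"sky": linear_scorer([1.0, 0.0], 0.0), "building": linear_scorer([0.0, 1.0], -0.5)}`, calling `region_semantic_vector([0.25, 0.75], scorers)` yields one confidence per semantic label for that region. In a real system each scorer would be the decision function of a classifier trained on labeled sample images.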

As described above, the semantic features measured for each segmented region are merged by the semantic feature merging unit 4200 to obtain the semantic features of the entire image. According to an embodiment, the semantic features of the regions are merged into semantic features for the whole image by selecting, for each semantic feature, the largest confidence value among the regions; this merging process is expressed by [Equation 11] below.

Here, I denotes the input image, and the left-hand side of [Equation 11] denotes the merged semantic feature value of the input image for semantic feature L. This merging process is one embodiment of the present invention. An image may also have several semantic feature values; in that case, a threshold may be set on the confidence value, and a region's semantic feature whose confidence is at or above the threshold may be merged into the semantics of the whole image.
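
[Equation 11] itself is not reproduced in this text. Read literally, the description suggests a merged value of the form F'_L(I) = max over regions R of F_L(R), and the sketch below implements that reading together with the threshold variant mentioned above. The function names and the dictionary representation are assumptions made for illustration.

```python
from typing import Dict, List


def merge_max(region_vectors: List[Dict[str, float]]) -> Dict[str, float]:
    """For each semantic label, keep the largest confidence over all regions."""
    labels = region_vectors[0].keys()
    return {
        label: max(rv[label] for rv in region_vectors) for label in labels
    }


def labels_above(
    region_vectors: List[Dict[str, float]], threshold: float
) -> List[str]:
    """Threshold variant: labels whose best region confidence >= threshold."""
    merged = merge_max(region_vectors)
    return sorted(label for label, v in merged.items() if v >= threshold)
```

Taking the maximum means a semantic that is strongly present in just one region still shows up in the image-level vector, regardless of how the other regions score.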

The harmful image blocking unit 5000 makes the final determination of whether the image is harmful.

FIG. 7 schematically shows the harmful-concept modeling process used when the harmful image blocking unit makes the final determination of whether an image is harmful in the harmful image blocking apparatus according to an embodiment of the present invention.

The harmfulness modeling unit 7000 models the harmful concepts to be stored in the harmful model storage unit 5200. It is trained in advance to generate the harmful concept models. The harmful concepts are expressed by [Equation 12] below.

Here, M(g) denotes the g-th harmful concept, and N_g denotes the number of defined harmful concepts.

The SVM_malicious trained for each harmful concept is expressed by [Equation 13] below.

Here, SVM_M denotes the SVM (5200) trained for harmful concept M, and the input to SVM_malicious is F'_semantic, the set of semantic features extracted from the segmented regions and merged.

The harmful concepts used for judging harmful images may include, for example, upper-body nudity, full-body nudity, rear-view nudity, and genital exposure.

FIG. 8 shows the configuration and function of the harmful image blocking unit 5000 of the harmful image blocking apparatus according to an embodiment of the present invention.

The input to the harmful image blocking unit 5000 is the semantic feature vector (4300) obtained from the semantic feature generation unit 4000. As shown in FIG. 1, the harmful image blocking unit 5000 consists of a harmful model storage unit 5200 and a harmful image blocking unit 5100. Using the harmful concept models stored in the harmful model storage unit 5200, the harmful image blocking unit 5100 classifies harmful content from the input semantic feature vector. Using SVM_malicious (5200), the trained harmful concept model stored in the harmful model storage unit, a confidence value for whether the input image I carries a harmful meaning is obtained through [Equation 14] below.

Here, V_M denotes the confidence for harmful concept M. When this confidence is greater than or equal to a predetermined threshold, the harmful image blocking unit 5100 classifies the image as an image carrying a harmful concept (5300) and blocks it; otherwise, it classifies the image as non-harmful (5400).
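
The final decision rule can be sketched as a simple threshold test over the per-concept confidences. The sketch assumes that an image is blocked as soon as any one concept reaches the threshold; the text describes the comparison per concept, so this aggregation, the default threshold of 0.0, and the names `is_harmful` and `filter_image` are assumptions for illustration only.

```python
from typing import Dict


def is_harmful(concept_confidences: Dict[str, float], threshold: float) -> bool:
    """True if any harmful-concept confidence V_M is >= the threshold."""
    return any(v >= threshold for v in concept_confidences.values())


def filter_image(
    concept_confidences: Dict[str, float], threshold: float = 0.0
) -> str:
    """Return 'blocked' for a harmful image and 'passed' otherwise."""
    return "blocked" if is_harmful(concept_confidences, threshold) else "passed"
```

The threshold trades false blocks against missed harmful images, so in practice it would be tuned on held-out labeled data rather than fixed in advance.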

Meanwhile, the technique according to the present invention for determining whether content is harmful using semantic features is not limited to image content and can be applied to various kinds of digital content. The harmfulness determination technique using the semantic features of the present invention may also be extended to applications such as IPTV and DMB.

The harmful image blocking method and device according to an embodiment of the present invention may be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes all kinds of recording devices in which data readable by a computer system is stored. Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, hard disks, floppy disks, flash memory, and optical data storage devices, and also include implementations in the form of carrier waves (for example, transmission over the Internet). The computer-readable recording medium may also be distributed over computer systems connected through a computer network, so that the computer-readable code is stored and executed in a distributed fashion.

Although the present invention has been described above with reference to preferred embodiments, the harmful image blocking method and device of the present invention are not necessarily limited to the embodiments described above, and various modifications and variations may be made without departing from the spirit and scope of the invention. The appended claims cover such modifications and variations insofar as they fall within the spirit of the present invention.

FIG. 1 is a block diagram showing the configuration of a harmful image blocking device according to an embodiment of the present invention.

FIG. 2 shows the configuration and function of the preprocessing unit of the harmful image blocking device according to an embodiment of the present invention.

FIG. 3 shows a region segmentation template of an image according to an embodiment of the present invention.

FIG. 4 shows the configuration and function of the visual feature generating unit of the harmful image blocking device according to an embodiment of the present invention.

FIG. 5 shows a configuration and process for modeling semantic features in the harmful image blocking device according to an embodiment of the present invention.

FIG. 6 shows the process of extracting a harmful/harmless semantic feature vector for each region.

FIG. 7 shows the harmful meaning modeling process used when the harmful image blocking unit of the harmful image blocking device according to an embodiment of the present invention determines the final harmfulness of an image.

FIG. 8 shows the configuration and function of the harmful image blocking unit of the harmful image blocking device according to an embodiment of the present invention.

Claims

A preprocessing step of dividing the input image into a plurality of regions;

A visual feature generation step of generating visual features of meaning for each divided region;

A semantic feature modeling step of modeling a semantic feature using the generated visual feature;

Generating a semantic feature of each of the divided regions based on the generated visual feature;

A semantic feature merging step of merging the generated semantic features;

A harmfulness modeling step of modeling harmfulness based on semantic features;

A harmfulness determination step of determining whether the input image is harmful based on the merged semantic features; and

A harmful image blocking method comprising a harmful image blocking step of blocking the input image determined to be harmful in the harmfulness determination step.

The method of claim 1, wherein in the preprocessing step,

A region template is received and the input image is divided into a plurality of regions R₁, R₂, R₃, R₄, R₅, and R₆,

The harmful image blocking method in which the coordinates of each template are determined as shown in [Equation 15].

The method of claim 1, wherein the visual feature generation step comprises:

A visual feature extraction step of extracting visual features of meaning for each of the divided regions;

And a feature normalization step of normalizing the extracted visual features.

The method of claim 3, wherein in the visual feature extraction step,

The harmful image blocking method of extracting a visual feature of meaning by using the color, texture, shape information of the input image.

The method of claim 4, wherein in the visual feature extraction step,

The harmful image blocking method of further extracting visual features of meaning using MPEG-7 descriptors.

The method of claim 1, wherein in the semantic feature modeling step,

A harmful image blocking method for modeling the meaning of each divided region using a support vector machine (SVM).

The method of claim 6, wherein in the semantic feature modeling step,

The harmful image blocking method in which the meaning of each divided region is determined by [Equation 16]

(where R is each of the divided regions, X_R is the visual feature vector of region R, and F_L(R) is the confidence of the semantic feature L of region R).

The method of claim 1, wherein in the merging of semantic features,

The harmful image blocking method of merging the semantic features into the semantic features having the largest confidence value among the semantic features of the divided regions.

The method of claim 1, wherein in the harmfulness modeling step,

A harmful image blocking method that models harmful meanings using a support vector machine (SVM).

The method of claim 1, wherein in the harmfulness determination step,

[Equation 17]

(where SVM_M is the SVM trained for harmful concept M, and F'_semantic is the set of semantic features extracted from each of the divided regions and merged)

The harmful image blocking method in which harmfulness is determined according to the above equation.

The method of claim 1, wherein in the harmfulness determination step,

[Equation 18]

(where V_M is the confidence for harmful concept M)

The harmful image blocking method in which the image is determined to be harmful when the confidence according to the above equation is greater than or equal to a predetermined threshold.

An image input unit for receiving an image,

A preprocessor dividing the input image into a plurality of areas;

A visual feature generator for generating a visual feature of meaning for each divided region;

A semantic feature modeling unit for modeling semantic features using the generated visual features;

A semantic feature generator for generating semantic features of each of the divided regions based on the generated visual features;

A semantic feature merging unit for merging the generated semantic features;

A harmfulness modeling unit for modeling harmfulness based on semantic features;

A harmfulness determination unit that determines whether the input image is harmful based on the merged semantic features; and

A harmful image blocking device comprising a harmful image blocking unit for blocking the input image determined to be harmful by the harmfulness determination unit.

The device of claim 12, wherein the preprocessor comprises:

A region template model storage unit for storing a region template model; and

A region dividing unit which divides the input image into a plurality of regions by using the region template model.

The device of claim 13, wherein the region template model comprises a plurality of region templates R₁, R₂, R₃, R₄, R₅, and R₆,

The harmful image blocking device in which the coordinates of each template are determined as shown in [Equation 19].

The device of claim 12, wherein the visual feature generator comprises:

A visual feature extraction unit for extracting visual features of each of the divided regions; and

A harmful image blocking device comprising a feature normalization unit for normalizing the extracted visual features.

The device of claim 12, wherein the semantic feature modeling unit,

A harmful image blocking device modeling the meaning of each divided region by using a support vector machine (SVM).

The device of claim 16, wherein the semantic feature modeling unit,

The harmful image blocking device in which the meaning of each divided region is determined by [Equation 20].

The device of claim 12, wherein the semantic feature merging unit,

A harmful image blocking device merging the semantic features into the semantic features having the largest confidence value among the semantic features of the divided regions.

The device of claim 12, wherein the harmfulness determination unit,

[Equation 21]

The harmful image blocking device in which harmfulness is determined according to the above equation.

The device of claim 12, wherein the harmfulness determination unit,

[Equation 22]

(where V_M is the confidence for harmful concept M)

The harmful image blocking device in which the image is determined to be harmful when the confidence according to the above equation is greater than or equal to a predetermined threshold.