KR102167730B1

KR102167730B1 - Apparatus and method for masking a video

Info

Publication number: KR102167730B1
Application number: KR1020190046553A
Authority: KR
Inventors: 강지홍; 임비
Original assignee: 주식회사 로민
Priority date: 2019-04-22
Filing date: 2019-04-22
Publication date: 2020-10-20
Also published as: KR20200077370A

Abstract

The present application relates to an image masking apparatus and an image masking method. An image masking method according to an embodiment of the present invention may include: decoding an input target video to generate frame images in units of frames; extracting, from the frame images, a frame feature vector corresponding to each frame image, and detecting candidate objects from the frame feature vector; extracting, from the frame feature vector, an object feature vector corresponding to each candidate object by using position information of the detected candidate objects; setting identification information for the candidate objects by using the object feature vectors; tracking candidate objects having the same identification information across consecutive frame images to generate tracking information for the candidate objects for each piece of identification information; and, upon receiving a masking input, using the tracking information to extract the candidate objects whose identification information matches that included in the masking input, and masking the extracted candidate objects.

Description

Image masking apparatus and image masking method {Apparatus and method for masking a video}

The present application relates to an image masking apparatus and an image masking method, and more particularly to an image masking apparatus and an image masking method capable of masking a video to protect personal privacy.

With the recent spread of networks such as the Internet and the increasingly advanced functionality of the devices used by general users (personal computers, smartphones, cameras, and the like), it has become easy to publish personally recorded videos online or to share them with others. However, as the number of videos shared over networks grows, the exposure of personal information and the invasion of privacy are becoming serious problems; for example, videos in which a person's face was captured are published on a network without permission. To address this problem, de-identification techniques have been proposed that prevent the exposure of personal information by applying a mosaic to the faces of people appearing in a video.

However, conventional de-identification techniques are cumbersome in that the user must perform the de-identification processing on the video directly. In particular, when only a specific person among the various people appearing in a video is to be de-identified, every person appearing in the entire video has to be checked manually.

Meanwhile, as a method for treating the masking of a specific person as an exception, Registration No. 10-1215948, "Image information masking method of a surveillance system based on body information and face recognition," has been proposed. That is, when a person's face can be recognized from the image information, the distance between the glabella (the point between the eyebrows) and the philtrum is detected, the face of a specific person stored in a database is identified on that basis, and masking for that person is handled separately.

However, in that technique, the individual techniques used to detect a person in an image, recognize the face, and track the person all operate independently, so each step requires a very large amount of computation. In addition, because the performance of the overall system is determined by the lower bound on the performance of an individual technique, the detection rate and recognition rate for people are lowered. Furthermore, because the distance between the glabella and the philtrum is used as the feature for face recognition, the face recognition rate drops markedly in real systems.

The present application seeks to provide an image masking apparatus and an image masking method capable of masking a video in order to protect personal privacy and the like.

The present application seeks to provide an image masking apparatus and an image masking method capable of distinguishing the objects included in a video and masking them selectively.

The present application seeks to provide an image masking apparatus and an image masking method capable of tracking the objects included in a video and masking them efficiently.

The present application seeks to provide an image masking apparatus and an image masking method whose detection rate and recognition rate, when detecting, tracking, and recognizing objects included in a video, are superior to those of existing techniques.

An image masking method according to an embodiment of the present invention may include: decoding an input target video to generate frame images in units of frames; extracting, from the frame images, a frame feature vector corresponding to each frame image, and detecting candidate objects from the frame feature vector; extracting, from the frame feature vector, an object feature vector corresponding to each candidate object by using position information of the detected candidate objects; setting identification information for the candidate objects by using the object feature vectors; tracking candidate objects having the same identification information across consecutive frame images to generate tracking information for the candidate objects for each piece of identification information; and, upon receiving a masking input, using the tracking information to extract the candidate objects whose identification information matches that included in the masking input, and masking the extracted candidate objects.

Here, the step of setting the identification information may include: searching an identification information database for a registered feature vector corresponding to the object feature vector; if a registered feature vector corresponding to the object feature vector is found, extracting the identification information matched with that registered feature vector from the identification information database and setting it as the identification information of the candidate object; and, if no registered feature vector corresponding to the object feature vector is found, newly generating identification information for the candidate object and newly registering the object feature vector and the identification information in the identification information database.

Here, in the step of searching the identification information database, if the object feature vector matches a registered feature vector within a preset error range, the object feature vector may be determined to correspond to that registered feature vector.
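The lookup-or-register behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `IdentityRegistry` class, the `ID-n` naming scheme, and the use of Euclidean distance as the measure of the "preset error range" are all hypothetical choices that the source does not specify.

```python
import math
from itertools import count


def euclidean(u, v):
    """Distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


class IdentityRegistry:
    """Hypothetical in-memory stand-in for the identification information database."""

    def __init__(self, tolerance):
        self.tolerance = tolerance   # the preset error range
        self.entries = []            # list of (registered_vector, identity)
        self._next_id = count(1)

    def assign(self, object_vector):
        """Return the identity for object_vector, registering it if unknown."""
        best = None
        for registered, identity in self.entries:
            d = euclidean(object_vector, registered)
            if d <= self.tolerance and (best is None or d < best[0]):
                best = (d, identity)
        if best is not None:
            return best[1]           # a registered vector matched
        identity = f"ID-{next(self._next_id)}"
        self.entries.append((list(object_vector), identity))
        return identity


registry = IdentityRegistry(tolerance=0.5)
first = registry.assign([0.10, 0.90])    # unknown, so newly registered
second = registry.assign([0.12, 0.88])   # within tolerance of the first
third = registry.assign([0.90, 0.10])    # far away, so a new identity
```

Choosing the closest match among all registered vectors within tolerance, rather than the first one found, is a design choice of this sketch; the source only requires a match within the preset range.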

Here, in the step of generating the tracking information, the same candidate object may be tracked by using the difference between the object feature vectors of the candidate objects included in the consecutive frame images together with the changes in the position and size of the candidate objects.

Here, the step of generating the tracking information may use Error = (V1 - V2) + a × (d1 - d2) + b × (s1 - s2), where Error is the tracking error value, V1 is the object feature vector of a first candidate object included in a first frame image, V2 is the object feature vector of a second candidate object included in a second frame image, d1 is the distance from a reference point to the center point of the first candidate object, d2 is the distance from the reference point to the center point of the second candidate object, s1 is the area of the first candidate object, s2 is the area of the second candidate object, and a and b are weights. The second candidate object for which the tracking error value is smallest may be determined to be the candidate object having the same identification information as the first candidate object.
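A small sketch may help make the matching rule concrete. The formula as printed subtracts a vector and two scalars directly; to obtain a single non-negative error value, the sketch below assumes the Euclidean norm of V1 - V2 and the absolute values of the distance and area differences. That interpretation, the dictionary layout, and the example weights are assumptions for illustration only.

```python
import math


def tracking_error(first, second, a=1.0, b=1.0):
    """Error = |V1 - V2| + a*|d1 - d2| + b*|s1 - s2| (magnitudes are an assumption)."""
    feature_term = math.sqrt(
        sum((x - y) ** 2 for x, y in zip(first["V"], second["V"]))
    )
    return (feature_term
            + a * abs(first["d"] - second["d"])
            + b * abs(first["s"] - second["s"]))


def match_in_next_frame(first, candidates, a=1.0, b=1.0):
    """Return the candidate in the next frame with the smallest tracking error."""
    return min(candidates, key=lambda c: tracking_error(first, c, a, b))


previous = {"V": [0.2, 0.8], "d": 100.0, "s": 400.0}
next_frame = [
    {"name": "same person", "V": [0.25, 0.75], "d": 104.0, "s": 410.0},
    {"name": "other person", "V": [0.9, 0.1], "d": 300.0, "s": 900.0},
]
best = match_in_next_frame(previous, next_frame, a=0.01, b=0.001)
```

The weights a and b matter because the three terms have different units (feature distance, pixels, pixel area); in this example they are chosen so that no single term dominates.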

Here, the image masking method according to an embodiment of the present invention may further include an error detection step of determining that an error has occurred in setting the identification information for a candidate object when the candidate object having the same identification information is missing from some of the consecutive frame images.
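One way to picture this check is as a search for gaps in the list of frame numbers recorded for one identity. The sketch below is an assumption-laden illustration: the function names and the `allowed_missing` threshold are hypothetical, and the source does not say how a genuine exit and re-entry of an object would be told apart from an identification error.

```python
def missing_frames(frames_with_object):
    """Return frame numbers skipped between the first and last appearance."""
    seen = set(frames_with_object)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


def has_identification_error(frames_with_object, allowed_missing=0):
    """Flag an error when more than allowed_missing frames are skipped."""
    return len(missing_frames(frames_with_object)) > allowed_missing
```

For example, an identity seen in frames 3, 4, 6, and 7 is missing from frame 5, so it would be flagged for review with the default threshold.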

Here, the tracking information may include at least one of the identification information of the candidate object, information on the frames in which the candidate object appears, and the position information and size information of the candidate object within the frame image.

Here, the image masking method according to an embodiment of the present invention may further include generating an object tracking video in which the candidate objects and the identification information of each candidate object are overlaid on the target video by using the tracking information.

Here, the image masking method according to an embodiment of the present invention may further include displaying a masking selection interface that includes an identification information list in which the identification information corresponding to the candidate objects included in the target video is arranged, appearance section information indicating where candidate objects having the same identification information appear in the target video, and a frame image in which the candidate object appears.
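The data behind such an interface can be derived from the tracking information. The following sketch assumes the tracking information has already been reduced to a mapping from identity to frame numbers; the row layout, the key names, and the choice of the first appearance as the representative frame are hypothetical.

```python
def appearance_sections(frame_numbers):
    """Group frame numbers into inclusive (start, end) runs of consecutive frames."""
    sections = []
    for n in sorted(set(frame_numbers)):
        if sections and n == sections[-1][1] + 1:
            sections[-1] = (sections[-1][0], n)   # extend the current run
        else:
            sections.append((n, n))               # start a new run
    return sections


def build_selection_list(tracking):
    """tracking maps identity -> frame numbers; returns rows sorted by identity."""
    return [
        {
            "identity": identity,
            "sections": appearance_sections(frames),
            "thumbnail_frame": min(frames),       # first appearance, as an example
        }
        for identity, frames in sorted(tracking.items())
    ]


rows = build_selection_list({"ID-2": [10, 11, 12], "ID-1": [1, 2, 3, 7, 8]})
```

Each row then carries what the interface needs to display: the identity, the sections of the video in which it appears, and a frame from which a thumbnail could be cropped.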

Here, the masking input may be generated from identification information entered by a user, or generated by extracting the identification information according to a preset selection algorithm.

Here, in the step of performing the masking, the frame images in which the candidate object corresponding to the identification information appears may be extracted by using the tracking information, and a masking area corresponding to the position of the candidate object may be set in each extracted frame image.

Here, in the step of performing the masking, the masking area may be masked by blurring, mosaic processing, or image replacement.
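Of the three options, mosaic processing is the simplest to illustrate without any imaging library. The sketch below pixelates a rectangular masking area of a grayscale frame stored as nested lists; the `(top, left, height, width)` box layout and the `block` size parameter are assumptions made for the example.

```python
def mosaic(image, box, block=2):
    """Return a copy of image with the box region pixelated.

    image is a list of rows of grayscale values; box is (top, left, height, width).
    """
    top, left, height, width = box
    out = [row[:] for row in image]
    for y0 in range(top, top + height, block):
        for x0 in range(left, left + width, block):
            ys = range(y0, min(y0 + block, top + height))
            xs = range(x0, min(x0 + block, left + width))
            cells = [(y, x) for y in ys for x in xs]
            # Replace every pixel of the tile with the tile's integer mean.
            mean = sum(image[y][x] for y, x in cells) // len(cells)
            for y, x in cells:
                out[y][x] = mean
    return out


frame = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
masked = mosaic(frame, box=(0, 0, 2, 2), block=2)
```

Only the pixels inside the masking area change; the rest of the frame and the original `frame` list are left untouched.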

Here, in the step of performing the masking, the masking information for the masking areas set in the target video and the target video itself may be stored separately, and when the target video is played back, it may be masked using the masking information and then played.

Here, in the step of performing the masking, an access right may be requested when the target video is played back, and if the viewer has no access right, the target video may be masked and then played.
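Keeping the original video and the masking information apart means the mask can be applied, or skipped, at playback time. The sketch below illustrates that idea under simple assumptions: frames are nested lists of grayscale values, masking information maps a frame number to a list of `(top, left, height, width)` boxes, and a flat fill stands in for whichever masking method is used. All names are hypothetical.

```python
def fill_region(image, box, value=0):
    """Return a copy of image with the box region replaced by a flat value."""
    top, left, height, width = box
    out = [row[:] for row in image]
    for y in range(top, top + height):
        for x in range(left, left + width):
            out[y][x] = value
    return out


def render_frame(frames, masking_info, frame_number, has_access):
    """Return the frame to display; mask it only for viewers without access."""
    image = frames[frame_number]
    if has_access:
        return image
    for box in masking_info.get(frame_number, []):
        image = fill_region(image, box)
    return image


frames = {1: [[5, 5, 5], [5, 5, 5]]}   # original video, stored unmodified
masking_info = {1: [(0, 1, 2, 2)]}     # frame number -> masking areas
public_view = render_frame(frames, masking_info, 1, has_access=False)
owner_view = render_frame(frames, masking_info, 1, has_access=True)
```

Because the stored frames are never modified, a viewer with the access right still sees the unmasked original.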

According to an embodiment of the present invention, there may be provided a computer program that is stored in a medium and combined with hardware to execute the image masking method described above.

An image masking apparatus according to an embodiment of the present invention may include: a frame input unit that decodes an input target video to generate frame images in units of frames; a feature vector extraction unit that extracts, from the frame images, a frame feature vector corresponding to each frame image, detects candidate objects from the frame feature vector, and extracts, from the frame feature vector, an object feature vector corresponding to each candidate object by using position information of the detected candidate objects; an identification information setting unit that sets identification information for the candidate objects by using the object feature vectors; a tracking information generation unit that tracks candidate objects having the same identification information across consecutive frame images to generate tracking information for the candidate objects for each piece of identification information; and a masking unit that, upon receiving a masking input, uses the tracking information to extract the candidate objects whose identification information matches that included in the masking input, and masks the extracted candidate objects.

An image masking apparatus according to an embodiment of the present invention may include a processor and a memory coupled to the processor, wherein the memory includes one or more modules configured to be executed by the processor, and the one or more modules include instructions for: decoding an input target video to generate frame images in units of frames; extracting, from the frame images, a frame feature vector corresponding to each frame image; detecting candidate objects from the frame feature vector; extracting, from the frame feature vector, an object feature vector corresponding to each candidate object by using position information of the detected candidate objects; setting identification information for the candidate objects by using the object feature vectors; tracking candidate objects having the same identification information across consecutive frame images to generate tracking information for the candidate objects for each piece of identification information; and, upon receiving a masking input, using the tracking information to extract the candidate objects whose identification information matches that included in the masking input, and masking the extracted candidate objects.

The means for solving the problems described above do not enumerate all the features of the present invention. The various features of the present invention, and the advantages and effects that follow from them, can be understood in more detail with reference to the specific embodiments below.

According to the image masking apparatus and the image masking method of an embodiment of the present invention, each object included in a video can be distinguished, so an object selected by the user can be selectively masked and de-identified. That is, rather than de-identifying all the objects in the video at once, only specific selected objects are de-identified, which improves user convenience.

According to the image masking apparatus and the image masking method of an embodiment of the present invention, the position of each object included in the video can be tracked, so a specific object selected by the user can easily be extracted and masked throughout the entire video.

According to the image masking apparatus and the image masking method of an embodiment of the present invention, candidate objects can be detected, tracked, and recognized from a single frame feature vector, which is efficient. Moreover, because detection, tracking, and recognition can be trained to be performed simultaneously using a machine learning algorithm or the like, performance and computation speed can be markedly improved.

According to the image masking apparatus and the image masking method of an embodiment of the present invention, a masked video is provided when an unspecified person without access rights views the video, which prevents the exposure of personal information and the invasion of privacy.

However, the effects achievable by the image masking apparatus and the image masking method according to embodiments of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below by those of ordinary skill in the art to which the present invention pertains.

FIG. 1 is a schematic diagram showing an image masking system according to an embodiment of the present invention.
FIG. 2 and FIG. 3 are block diagrams showing an image masking apparatus according to an embodiment of the present invention.
FIG. 4 is a schematic diagram showing candidate object extraction and identification information setting according to an embodiment of the present invention.
FIG. 5 is a schematic diagram showing a masking selection interface according to an embodiment of the present invention.
FIG. 6 is a flowchart showing an image masking method according to an embodiment of the present invention.

Hereinafter, the embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the drawing in which they appear, and redundant descriptions of them are omitted. The suffixes "module" and "unit" used for components in the following description are assigned or used interchangeably only for ease of drafting the specification, and do not in themselves have distinct meanings or roles. That is, the term "unit" as used in the present invention means software or a hardware component such as an FPGA or an ASIC, and a "unit" performs certain roles. However, a "unit" is not limited to software or hardware. A "unit" may be configured to reside in an addressable storage medium, or may be configured to run on one or more processors. Thus, as an example, a "unit" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functionality provided within the components and "units" may be combined into a smaller number of components and "units", or further separated into additional components and "units".

In addition, in describing the embodiments disclosed in this specification, detailed descriptions of related known technologies are omitted where they are judged likely to obscure the gist of the embodiments. The accompanying drawings are provided only to make the embodiments disclosed in this specification easier to understand; the technical idea disclosed in this specification is not limited by the accompanying drawings, and should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

FIG. 1 is a schematic diagram showing an image masking system according to an embodiment of the present invention.

Referring to FIG. 1, an image masking system according to an embodiment of the present invention may include an image capturing apparatus 1 and an image masking apparatus 100.

Hereinafter, an image masking system according to an embodiment of the present invention is described with reference to FIG. 1.

The image capturing apparatus 1 may generate a video by capturing its surroundings. Here, the image capturing apparatus 1 may be any device capable of recording video, such as a video camera or a camcorder. The image capturing apparatus 1 may stream the captured video in real time or save it as a file, and may provide wired or wireless communication for transmitting the captured video.

The image masking apparatus 100 may receive a target video to be masked and may mask objects included in the received target video. Here, the image masking apparatus 100 may receive the target video captured by the image capturing apparatus 1 as a file or as data, and, depending on the embodiment, may also receive the target video as a real-time stream from the image capturing apparatus 1.

Although FIG. 1 shows the image masking apparatus 100 as being provided separately from the image capturing apparatus 1, depending on the embodiment the image masking apparatus 100 may be built into an image capturing apparatus 1 such as a CCTV camera, a vehicle black box, or a camera, or may be provided in a separate computer, smartphone, or the like.

Meanwhile, the image masking apparatus 100 may distinguish the objects included in the target video and may mask the distinguished objects. That is, to prevent the exposure of personal information or the invasion of privacy, some of the objects included in the video may be masked and thereby de-identified. For example, an image capturing apparatus 1 such as a CCTV camera or a vehicle black box records its configured capture area indiscriminately, so other people's faces or body parts, vehicle license plates, and the like may be captured. If such a video is published over the Internet or the like, personal information such as another person's face is exposed, which may lead to problems such as invasion of privacy. To prevent such problems, the faces of other people, vehicle license plates, and the like included in the target video need to be masked so that they cannot be identified.

Conventionally, the user had to mask the target video manually, which was cumbersome, and even where automatic masking was provided, it was common to mask all the objects included in the video at once. Accordingly, when only a specific person among the various people in a video was to be masked, or everyone except a specific person was to be masked, every person appearing in the entire video had to be checked manually to determine whether they were the specific person before masking could be performed.

In contrast, the image masking apparatus 100 according to an embodiment of the present invention can identify each object appearing in a video, and can track only the objects selected from among those included in the video and mask them automatically.

Hereinafter, an image masking apparatus according to an embodiment of the present invention is described.

FIG. 2 is a block diagram showing an image masking apparatus according to an embodiment of the present invention.

Referring to FIG. 2, the image masking apparatus 100 according to an embodiment of the present invention may include a frame input unit 110, a feature vector extraction unit 120, an identification information setting unit 130, a tracking information generation unit 140, an error detection unit 150, a masking unit 160, an object tracking video generation unit 170, and a masking selection interface display unit 180.

The frame input unit 110 may decode the input target video to generate images in units of frames. Specifically, the frame input unit 110 may receive from the image capturing apparatus 1 a target video encoded in an MPEG (Moving Picture Experts Group) format, decode the compressed MPEG file, and capture each frame of the target video to generate the frame images. Here, the target video may contain N frame images, and each frame image may be assigned a frame number from 1 to N, where N is a natural number.

The feature vector extraction unit 120 may extract, from the frame images, a frame feature vector corresponding to each frame image. The frame feature vector generated by the feature vector extraction unit 120 may subsequently be used to detect the candidate objects included in a frame image and to distinguish and recognize each candidate object, and may also be used to track candidate objects across consecutive frame images.

First, the feature vector extraction unit 120 may generate a frame feature vector using the pixel information of a frame image, such as the position information and pixel value information of each pixel included in the frame image. To extract the frame feature vector from the pixel information corresponding to each frame image, a machine learning algorithm such as a CNN (Convolutional Neural Network) or an RNN (Recurrent Neural Network) may be used, or a method that extracts statistical characteristics of the image, such as HOG (Histogram of Oriented Gradients) or LBP (Local Binary Pattern), may be used. Frame feature vectors may also be generated in various other ways, and the present invention is not limited to the methods described above.

Thereafter, the feature vector extraction unit 120 may detect the candidate objects included in each frame image using the generated frame feature vector. Here, a candidate object is an object that may be subject to masking, and which objects are treated as candidate objects may be set differently depending on the embodiment. For example, a CCTV (closed-circuit television) camera may capture various kinds of objects, including people, animals, and vehicles passing through the capture area, as well as park benches and exercise equipment. However, with regard to personal information protection and the like, the objects that require masking may be people, vehicle license plates, and the like, so the feature vector extraction unit 120 may extract people, vehicle license plates, and the like from the target video captured by the CCTV camera and set them as candidate objects.

Here, features of the candidate objects to be extracted, such as shape and luminance, may be learned in advance using a machine learning algorithm, and the feature vector extraction unit 120 may use the learned features to detect the region corresponding to a candidate object in each frame image. For example, the shapes of various people may be learned repeatedly so that the shape of a person included in a frame image is extracted as a candidate object. In this case, a CNN, an RNN, PCA (Principal Component Analysis), logistic regression, a decision tree, or the like may be used as the machine learning algorithm.

Meanwhile, the feature vector extraction unit 120 may mark each extracted candidate object on the frame image with a bounding box, a segmentation mask, or the like. Here, as shown in FIG. 4(a), the bounding box may be a rectangle drawn around a pedestrian that is a candidate object, and the user can easily distinguish the candidate objects by their bounding boxes. Depending on the embodiment, each bounding box may be specified by four numbers (x, y, w, h), where (x, y) is the position of the upper-left vertex of the bounding box and (w, h) is its width and height.
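
As an illustration only, the (x, y, w, h) representation described above might be held in a small record type such as the following minimal sketch. The class name `BoundingBox` and its helper methods are hypothetical and are not part of the disclosed apparatus; pixel coordinates are assumed to start at the upper-left corner of the frame image.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical record for a bounding box given as (x, y, w, h)."""
    x: int  # column of the upper-left vertex
    y: int  # row of the upper-left vertex
    w: int  # width in pixels
    h: int  # height in pixels

    def contains(self, px: int, py: int) -> bool:
        """Return True if pixel (px, py) lies inside the box."""
        return (self.x <= px < self.x + self.w
                and self.y <= py < self.y + self.h)

    def center(self) -> Tuple[float, float]:
        """Return the center point of the box."""
        return (self.x + self.w / 2, self.y + self.h / 2)

    def area(self) -> int:
        """Return the box area in pixels."""
        return self.w * self.h
```

The center point and area computed here are the kind of quantities that the tracking step described later could draw on.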

In addition, when a segmentation mask is used, the foreground corresponding to a candidate object in the frame image may be separated, pixel by pixel, from the background that makes up the rest of the image. That is, as shown in FIG. 4(b), the pixel values of the pixels corresponding to the background may be set to 0, and the pixel values of the candidate object corresponding to the foreground may be set to 1.
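
The 0/1 encoding described above can be sketched as follows. This is a minimal illustration under stated assumptions: the mask is a nested Python list, foreground pixels are supplied as (row, column) pairs, and the function names `make_segmentation_mask` and `mask_to_box` are hypothetical.

```python
def make_segmentation_mask(height, width, foreground_pixels):
    """Build a binary mask: 1 for candidate-object pixels, 0 for background."""
    mask = [[0] * width for _ in range(height)]
    for row, col in foreground_pixels:
        mask[row][col] = 1
    return mask


def mask_to_box(mask):
    """Return the tight (x, y, w, h) box around the foreground, or None."""
    rows = [r for r, line in enumerate(mask) if any(line)]
    cols = [c for line in mask for c, value in enumerate(line) if value]
    if not rows:
        return None  # the mask contains no foreground pixel
    x, y = min(cols), min(rows)
    return (x, y, max(cols) - x + 1, max(rows) - y + 1)
```

Deriving a box from the mask shows that the two representations can describe the same candidate object at different levels of detail.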

After the candidate objects are detected, the feature vector extraction unit 120 may generate an object feature vector corresponding to each candidate object. The feature vector extraction unit 120 may specify the region of the frame feature vector that corresponds to the candidate object, extract the feature vector values in that region, and set them as the object feature vector.

Here, since the object feature vector differs for each candidate object, the candidate objects can be distinguished using their object feature vectors. For example, when the same candidate object A appears in a plurality of consecutive frame images, the object feature vectors of candidate object A may be identical or very similar across those frame images. For different candidate objects, on the other hand, the object feature vectors differ by at least a set value. Accordingly, candidate objects whose object feature vectors are the same may be determined to be the same object, and candidate objects whose object feature vectors differ may be determined to be different objects.

Meanwhile, the size of a candidate object may change from frame image to frame image depending on the position, direction of movement, and the like of the candidate object, and the size of the object feature vector may accordingly differ for each frame image. Therefore, the feature vector extraction unit 120 may use interpolation to transform each object feature vector to a uniform, predetermined size.
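
The description does not specify which interpolation method is used. As one possible reading, the following minimal sketch resamples a feature vector to a fixed length by linear interpolation. It assumes the object feature vector has been flattened into a one-dimensional list, and the function name `resize_vector` is hypothetical.

```python
def resize_vector(values, target_len):
    """Linearly resample `values` to exactly `target_len` entries."""
    if not values or target_len <= 0:
        raise ValueError("need a non-empty vector and a positive length")
    if len(values) == 1 or target_len == 1:
        return [float(values[0])] * target_len
    # Map output index i onto a fractional position in the input vector.
    scale = (len(values) - 1) / (target_len - 1)
    resized = []
    for i in range(target_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        resized.append(values[lo] * (1 - frac) + values[hi] * frac)
    return resized
```

With every object feature vector brought to the same length, vectors from different frame images can be compared element by element.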

As described above, the feature vector extraction unit 120 according to an embodiment of the present invention may use the frame feature vector both to detect candidate objects and to generate the object feature vectors of the candidate objects. That is, because a frame feature vector extracted once is reused for both candidate object detection and object feature vector generation, computation is efficient and processing speed can be improved.

The identification information setting unit 130 may set identification information for the candidate objects using the object feature vectors. That is, the identification information setting unit 130 may distinguish the candidate objects from one another using their object feature vectors, and may assign identification information to each distinguished candidate object and display it. For example, FIG. 4(a) includes a plurality of candidate objects, each of which is a different object. Accordingly, the identification information setting unit 130 may distinguish the candidate objects and set the "ID" of each distinguished candidate object to "138", "147", "128", and "153", respectively, as identification information. In this case, the identification information setting unit 130 may set the identification information for each candidate object using an identification information database.

Specifically, the identification information setting unit 130 may search the identification information database d for a registered feature vector corresponding to the object feature vector of each candidate object extracted from each frame image. Here, the identification information database d may store registered feature vectors, each matched with its corresponding identification information. Accordingly, the identification information setting unit 130 may search the identification information database d for the identification information corresponding to the object feature vector of a given candidate object.

When a registered feature vector corresponding to the object feature vector is found, the identification information matched with that registered feature vector may be retrieved from the identification information database d and set as the identification information of the candidate object. On the other hand, there may be cases in which no registered feature vector corresponding to the object feature vector is found, which means that the object feature vector has appeared in the target video for the first time. In that case, the identification information setting unit 130 may newly generate identification information corresponding to the object feature vector and register the newly generated identification information and the object feature vector in the identification information database d, thereby updating the identification information database d.

Here, if the object feature vector matches a registered feature vector within a preset error range, the identification information setting unit 130 may determine that the object feature vector corresponds to that registered feature vector. That is, even for the same candidate object, the object feature vector may contain some error from frame image to frame image, so identity may be determined with the error range taken into account.
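
The lookup-or-register behavior described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the class name `IdentificationDatabase`, the use of Euclidean distance as the measure of mismatch, and sequential integer identifiers are all assumptions.

```python
import math


class IdentificationDatabase:
    """Hypothetical in-memory store of (registered vector, identifier) pairs."""

    def __init__(self, tolerance):
        self.tolerance = tolerance  # preset error range (assumed Euclidean)
        self._entries = []          # list of (registered_vector, identifier)
        self._next_id = 0

    def assign(self, feature):
        """Return the identifier matching `feature`, registering it if new."""
        best_id, best_dist = None, None
        for registered, identifier in self._entries:
            dist = math.dist(registered, feature)
            within = dist <= self.tolerance
            if within and (best_dist is None or dist < best_dist):
                best_id, best_dist = identifier, dist
        if best_id is not None:
            return best_id
        # No registered vector is close enough: first appearance of the object.
        new_id = self._next_id
        self._next_id += 1
        self._entries.append((list(feature), new_id))
        return new_id
```

Under these assumptions, a candidate object that leaves the scene and later returns with a similar feature vector would receive its earlier identifier again.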

Because the identification information setting unit 130 sets the identification information of each candidate object by referring to the identification information database d, all candidate objects in the target video that have the same object feature vector can be given the same identification information. For example, if a candidate object having object feature vector A appears in frame images 3 to 10 and then appears again in frame images 20 to 26, that candidate object may be given the same identification information b in frame images 3 to 10 and in frame images 20 to 26.

Additionally, an identification information database d may be provided for each target video, but depending on the embodiment, a single identification information database d may be provided for all target videos received by the image masking apparatus 100. In this case, if an object feature vector in a newly received target video is the same as an object feature vector in a previous target video, the identification information set for the previous video may be reused. That is, the same identification information can be set for the same candidate object even across different target videos.

The tracking information generation unit 140 may track candidate objects having the same identification information across consecutive frame images and generate tracking information for the candidate objects for each piece of identification information. That is, tracking information may be generated by combining the position information and identification information of the candidate objects extracted from each frame image, and the tracking information may be used to follow changes in the position of a candidate object as it moves through consecutive frame images. Here, the tracking information may include the identification information of the candidate object, frame information such as the frame numbers in which the candidate object appears, and the position and size information of the candidate object within each frame image. Depending on the embodiment, the tracking information may further include information on the pose of the candidate object, facial feature points, clothing feature points, and the like. One target video may contain a plurality of pieces of tracking information, and each piece of tracking information may be generated per candidate object identification.

Depending on the embodiment, the tracking information generation unit 140 may also track the same candidate object using the difference between the object feature vectors of the candidate objects included in consecutive frame images together with the changes in the positions and sizes of the candidate objects. For example, to track a first candidate object included in a first frame image, a tracking error value with respect to the first candidate object may be computed for each of the plurality of candidate objects included in a second frame image that follows the first frame image. The candidate object with the smallest tracking error value may then be determined to be the same candidate object as the first candidate object.

Specifically, the tracking error value may be calculated as Error = (V1 - V2) + a × (d1 - d2) + b × (s1 - s2). Here, Error is the tracking error value, V1 is the object feature vector of the first candidate object included in the first frame image, V2 is the object feature vector of the second candidate object included in the second frame image, d1 is the distance from a reference point to the center point of the first candidate object, d2 is the distance from the reference point to the center point of the second candidate object, s1 is the area of the first candidate object, s2 is the area of the second candidate object, and a and b are weights that are arbitrary constants.

In general, the time difference between consecutive frame images is very short, so the same candidate object is unlikely to move a large distance or change sharply in area between consecutive frame images. Accordingly, the smaller the difference between the object feature vectors and the smaller the changes in distance and area, the more likely it is that the two detections are the same candidate object. The tracking error value described above can therefore be used to track a candidate object across consecutive frame images. In addition, depending on the embodiment, if V1 - V2 is equal to or greater than a set limit error, the second candidate object may be determined to be a different candidate object without the tracking error value being calculated. That is, because the object feature vector falls outside the set error range, the second candidate object may be determined to differ from the first candidate object.
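
The matching rule described above can be sketched as follows. Several points are assumptions made for this sketch and are not stated in the description: the feature difference V1 - V2 is reduced to a scalar with the Euclidean norm, the distance and area terms use absolute differences, the reference point is the image origin, and each candidate is a dictionary with the hypothetical keys `feature`, `center`, and `area`.

```python
import math


def tracking_error(first, second, a=1.0, b=1.0, reference=(0.0, 0.0)):
    """Scalar tracking error between two detections (assumed norm form)."""
    feature_diff = math.dist(first["feature"], second["feature"])
    d1 = math.dist(reference, first["center"])
    d2 = math.dist(reference, second["center"])
    area_diff = abs(first["area"] - second["area"])
    return feature_diff + a * abs(d1 - d2) + b * area_diff


def match_candidate(first, candidates, limit, a=1.0, b=1.0):
    """Return the index of the best match in `candidates`, or None."""
    best_index, best_error = None, None
    for index, candidate in enumerate(candidates):
        # Skip candidates whose feature difference reaches the limit error.
        if math.dist(first["feature"], candidate["feature"]) >= limit:
            continue
        error = tracking_error(first, candidate, a, b)
        if best_error is None or error < best_error:
            best_index, best_error = index, error
    return best_index
```

The weights `a` and `b` balance quantities with very different scales (feature distance, pixels, and pixel area), so in practice they would need to be tuned for the feature extractor and frame size in use.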

The error detection unit 150 may detect errors in the setting of identification information for candidate objects. For example, when a candidate object having a given piece of identification information is missing from some of a run of consecutive frame images, the error detection unit 150 may determine that an error has occurred in setting the identification information for that candidate object.

In general, the frame images of a target video are captured at very short time intervals, so candidate objects that are present in adjacent frame images and whose positions in those frame images are close to each other can be expected to have the same identification information. Accordingly, when a candidate object that was previously present suddenly disappears in an adjacent frame image, it may be concluded not that the candidate object actually moved but that an error occurred in, for example, setting the identification information for the candidate object.

For example, suppose there are a first frame image at time t-1, a second frame image at time t, and a third frame image at time t+1, and a candidate object with {id=0} is present at position (x1, y1) in both the first frame image and the third frame image. If no candidate object with {id=0} is present at (x1, y1) in the second frame image, or if an object with {id=1} suddenly appears at (x1, y1), it may be determined that an error has occurred in setting the identification information or the like. The error detection unit 150 may then indicate that an error has occurred and notify the user or the like.
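
The three-frame check in the example above can be sketched as follows. This is a simplified illustration: each frame is assumed to be a dictionary mapping an exact position to an identifier, whereas a practical check would compare nearby positions rather than identical ones. The function name `find_id_errors` is hypothetical.

```python
def find_id_errors(frames):
    """Flag frames where an ID seen before and after is missing or replaced.

    `frames` is a time-ordered list of dicts mapping position -> identifier.
    Returns a list of (frame_index, position, expected_id, observed_id).
    """
    errors = []
    for t in range(1, len(frames) - 1):
        previous, current, following = frames[t - 1], frames[t], frames[t + 1]
        for position, identifier in previous.items():
            same_before_and_after = following.get(position) == identifier
            if same_before_and_after and current.get(position) != identifier:
                errors.append(
                    (t, position, identifier, current.get(position))
                )
    return errors
```

An observed value of `None` in the result corresponds to the case in which the candidate object is absent from the middle frame, and any other value corresponds to the case in which a different identifier appears at the same position.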

Upon receiving a masking input, the masking unit 160 may use the tracking information to extract the candidate objects whose identification information matches the identification information included in the masking input, and may mask the extracted candidate objects. That is, the masking unit 160 may selectively mask the selected candidate objects, and the candidate objects to be masked may be specified by their identification information.

Here, the masking input may be generated from identification information entered by a user, or from identification information extracted according to a preset selection algorithm or the like. For example, the image masking apparatus 100 may provide the user with a separate masking selection interface, and the user may use the masking selection interface to select the identification information corresponding to the candidate objects to be masked. In this case, the masking unit 160 may receive the masking input through the masking selection interface and mask the candidate objects corresponding to the received masking input.

In an embodiment in which the candidate objects to be masked are extracted automatically using a separate selection algorithm set in the image masking apparatus 100, identification information may be extracted so that all candidate objects included in a specific section of the target video are masked, or identification information corresponding to specific candidate objects included in the target video may be extracted and masked. For example, the selection algorithm may be used to extract, from among the candidate objects included in the target video, the identification information corresponding to a specific gender or age group.

Meanwhile, the masking unit 160 may use the tracking information to extract the frame images in which the candidate object corresponding to the selected identification information appears, and may set a masking region corresponding to the position of the candidate object in each extracted frame image. The set masking region may then be masked so that it is de-identified and users cannot recognize it. Here, the masking region may be limited to, for example, the region of the candidate object that corresponds to the face.
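
The selection step described above can be sketched as follows. The data layout is an assumption made for illustration: the tracking information is a dictionary from identifier to a list of (frame number, bounding box) pairs, and the masking input is a collection of identifiers. The function name `masking_regions` is hypothetical.

```python
def masking_regions(tracking_info, selected_ids):
    """Collect, per frame number, the boxes of the selected identifiers.

    `tracking_info` maps identifier -> list of (frame_number, (x, y, w, h)).
    Returns a dict mapping frame_number -> list of boxes to mask.
    """
    regions = {}
    for identifier in selected_ids:
        for frame_number, box in tracking_info.get(identifier, []):
            regions.setdefault(frame_number, []).append(box)
    return regions
```

For example, with tracking information for IDs 138 and 147 and a masking input containing only 138, the result lists only the frames and boxes in which 138 appears, so 147 is left unmasked.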

In this case, the masking unit 160 may mask the masking region by blurring it, applying a mosaic, or the like. Here, blurring may be implemented using a low-pass filter, and depending on the embodiment, masking may also be performed by image substitution, in which the masking region is covered with a single color, a specific pattern, a separate image or animation, a character, or the like.
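
As one illustration of the mosaic option, the following minimal sketch pixelates a rectangular region of a grayscale image stored as a nested list. It assumes the box lies entirely inside the image; the function name `mosaic` and the `block` parameter are hypothetical, and block averaging is only one of several ways a mosaic could be produced.

```python
def mosaic(image, box, block=2):
    """Return a copy of `image` with the `box` region pixelated.

    `image` is a list of rows of integer gray levels; `box` is (x, y, w, h).
    Each block-by-block tile inside the box is replaced by its mean value.
    """
    x, y, w, h = box
    result = [row[:] for row in image]  # leave the original untouched
    for top in range(y, y + h, block):
        for left in range(x, x + w, block):
            rows = range(top, min(top + block, y + h))
            cols = range(left, min(left + block, x + w))
            tile = [image[r][c] for r in rows for c in cols]
            mean = sum(tile) // len(tile)
            for r in rows:
                for c in cols:
                    result[r][c] = mean
    return result
```

Because the function returns a copy, the original frame image stays available, which fits embodiments that keep the unmasked source separately.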

Additionally, the masking unit 160 may encode the video into which the masking has been composited, and may cause the masked video to be output through a video output unit (not shown) or the like. Meanwhile, depending on the embodiment, the masking unit 160 may store the masking information for the masking regions set in the target video separately from the target video itself. That is, the original file of the target video may be stored separately, and the target video may be masked using the masking information when it is played back.

In addition, depending on the embodiment, access rights to the original file may be required when the original file of the target video is to be played, and when the access rights are absent, the target video may be provided with masking applied. Here, the access rights may be verified through various kinds of authentication, such as a password, fingerprint recognition, or iris recognition.

Additionally, the image masking apparatus 100 according to an embodiment of the present invention may allow the user to select the candidate objects to be masked, and in this case may further include components that make the selection more convenient for the user.

Specifically, the object tracking image generation unit 170 may generate an object tracking image by using the tracking information to overlay the candidate objects and the identification information of each candidate object on the target video. That is, as shown in FIG. 4(a), the object tracking image may be generated by displaying the bounding boxes that indicate the candidate objects, together with their identification information, on the target video.

In this case, the user may view the object tracking image to check the candidate objects being tracked and the identification information of each candidate object. The user may therefore refer to the object tracking image to choose, from among the plurality of candidate objects, the objects to be masked.

In addition, the masking selection interface display unit 180 may provide a masking selection interface so that the user can select the candidate objects to be masked. Specifically, the masking selection interface may be implemented as shown in FIG. 5. That is, the masking selection interface may display an identification information list in which the identification information corresponding to the candidate objects included in the target video is arranged, along with appearance section information indicating where candidate objects having the same identification information appear in the target video, the number of frame images in which they appear, and the like. The interface may further include a frame image in which each candidate object appears, so that the user can check which candidate object corresponds to each piece of identification information.
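
The appearance section information and frame count shown for each entry of the list could be derived from the frame numbers in the tracking information, for example as in the following minimal sketch. The function name `appearance_summary` is hypothetical, and the treatment of a section as a run of consecutive frame numbers is an assumption.

```python
def appearance_summary(frame_numbers):
    """Summarize where an identifier appears in the target video.

    Returns (sections, frame_count), where `sections` is a list of
    (first_frame, last_frame) runs of consecutive frame numbers.
    """
    frames = sorted(set(frame_numbers))
    sections = []
    for number in frames:
        if sections and number == sections[-1][1] + 1:
            # Extend the current run of consecutive frames.
            sections[-1] = (sections[-1][0], number)
        else:
            sections.append((number, number))
    return sections, len(frames)
```

For a candidate object that appears in frame images 3 to 10 and again in frame images 20 to 26, this yields two sections and a total of 15 frame images.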

Accordingly, the user may select the candidate objects to be masked in the target video through the masking selection interface, for example by ticking the checkboxes displayed next to the entries of the identification information list. That is, rather than masking all objects included in the target video at once, the user can selectively set the candidate objects to be masked.

Meanwhile, as shown in FIG. 3, the image masking apparatus 100 according to an embodiment of the present invention may include physical components such as a processor 10 and a memory 40, and the memory 40 may contain one or more modules configured to be executed by the processor 10. Specifically, the one or more modules may include a frame input module, a feature vector extraction module, an identification information setting module, a tracking information generation module, an error detection module, a masking module, an object tracking image generation module, and a masking selection interface display module.

The processor 10 may execute various software programs and the instruction sets stored in the memory 40 to perform various functions and process data. The peripheral interface unit 30 may connect the input and output peripheral devices of the image masking apparatus 100 to the processor 10 and the memory 40, and the memory controller 20 may control memory access when the processor 10 or another component of the image masking apparatus 100 accesses the memory 40. Depending on the embodiment, the processor 10, the memory controller 20, and the peripheral interface unit 30 may be implemented on a single chip or as separate chips.

The memory 40 may include high-speed random access memory, one or more magnetic disk storage devices, nonvolatile memory such as a flash memory device, and the like. In addition, the memory 40 may further include a storage device located remotely from the processor 10, or a network-attached storage device accessed through a communication network such as the Internet.

Meanwhile, as shown in FIG. 3, the image masking apparatus 100 according to an embodiment of the present invention may hold in the memory 40 an operating system as well as the modules that correspond to application programs: the frame input module, feature vector extraction module, identification information setting module, tracking information generation module, error detection module, masking module, object tracking image generation module, and masking selection interface display module. Here, each module is a set of instructions for performing the functions described above and may be stored in the memory 40.

Accordingly, in the image masking apparatus 100 according to an embodiment of the present invention, the processor 10 may access the memory 40 and execute the instructions corresponding to each module. The frame input module, feature vector extraction module, identification information setting module, tracking information generation module, error detection module, masking module, object tracking image generation module, and masking selection interface display module correspond respectively to the frame input unit, feature vector extraction unit, identification information setting unit, tracking information generation unit, error detection unit, masking unit, object tracking image generation unit, and masking selection interface display unit described above, so a detailed description is omitted here.

FIG. 6 is a flowchart illustrating an image masking method according to an embodiment of the present invention.

Referring to FIG. 6, the image masking method according to an embodiment of the present invention may include a frame input step (S10), a candidate object detection step (S20), an object feature vector extraction step (S30), an identification information setting step (S40), a tracking information generation step (S50), and a masking step (S60). Each of these steps may be performed by the image masking apparatus according to an embodiment of the present invention.

Hereinafter, the image masking method according to an embodiment of the present invention is described with reference to FIG. 6.

In the frame input step (S10), the input target video may be decoded to generate frame images in units of frames. Specifically, the image masking apparatus may receive an MPEG-encoded target video from an image capturing device. In this case, after decoding the target video, which is in the form of a compressed MPEG file, it may capture each frame of the target video to generate the frame images.

In the candidate object detection step (S20), a frame feature vector corresponding to each frame image may be extracted from the frame images, and candidate objects may be detected from the frame feature vector. Here, the image masking apparatus may generate the frame feature vector using pixel information of the frame image, such as the position and pixel value of each pixel in the frame image. Depending on the embodiment, it may use machine learning algorithms such as a CNN (Convolutional Neural Network) or RNN (Recurrent Neural Network), or methods that extract statistical characteristics of an image, such as HOG (Histogram of Oriented Gradients) or LBP (Local Binary Pattern).

Thereafter, the image masking apparatus may detect the candidate objects included in each frame image using the generated frame feature vector. Here, a candidate object is an object that can be a target of masking, and characteristics of candidate objects such as shape or luminance may be learned in advance using a machine learning algorithm. That is, learning may be performed using machine learning algorithms such as CNN, RNN, PCA (Principal Component Analysis), logistic regression, or decision trees, and the learned model may be used to detect the region corresponding to a candidate object in each frame image. The detected candidate objects may be marked on the frame image with a bounding box, a segmentation mask, or the like.

In the object feature vector extraction step (S30), an object feature vector corresponding to each candidate object may be extracted from the frame feature vector using the position information of the detected candidate objects. That is, the region corresponding to a candidate object may be located within the frame feature vector, and the feature vector values in that region may be extracted and set as the object feature vector. Because the object feature vector differs from one candidate object to another, candidate objects can be distinguished using their object feature vectors.
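As a rough illustration of the region extraction described above, the following sketch treats the frame feature vector as a 2-D grid of values and cuts out the cells covered by a candidate object's bounding box. The grid layout, the `(x, y, width, height)` box convention, and the name `crop_feature_map` are assumptions made for illustration; the source does not specify them.

```python
def crop_feature_map(feature_map, box):
    """Return the sub-grid of feature_map covered by box.

    feature_map: list of rows, each a list of feature values (assumed layout).
    box: (x, y, width, height) in feature-map cells (assumed convention).
    """
    x, y, w, h = box
    # Slice the rows first, then the columns inside each selected row.
    return [row[x:x + w] for row in feature_map[y:y + h]]


def flatten(grid):
    """Flatten a 2-D grid into a 1-D object feature vector, row by row."""
    return [value for row in grid for value in row]
```

For example, `flatten(crop_feature_map(fm, (1, 0, 2, 2)))` yields the values of a 2 x 2 region starting one column in from the left edge of `fm`.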

Meanwhile, the size of a candidate object may change from frame image to frame image depending on its position, direction of movement, and so on, and the size of the object feature vector may therefore differ from frame image to frame image as well. Accordingly, interpolation may be used to resize each object feature vector to a fixed, predetermined size.
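The resizing step can be sketched as follows for a 1-D vector. The source only says "interpolation"; linear interpolation, the function name `resize_vector`, and the 1-D representation are assumptions chosen to keep the example small.

```python
def resize_vector(values, target_len):
    """Resample a 1-D feature vector to target_len entries (linear interpolation)."""
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if not values:
        raise ValueError("values must not be empty")
    if len(values) == 1 or target_len == 1:
        # Nothing to interpolate between: repeat the first value.
        return [float(values[0])] * target_len
    # Map output index i onto a fractional position in the input vector.
    scale = (len(values) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1.0 - frac) + values[hi] * frac)
    return out
```

With this in place, object feature vectors taken from differently sized bounding boxes can all be brought to one common length before they are compared.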

In the identification information setting step (S40), identification information for the candidate objects may be set using the object feature vectors. That is, the candidate objects may be distinguished from one another using their object feature vectors, and identification information may be assigned to and displayed for each distinguished candidate object. The identification information for each candidate object may be set using an identification information database.

Specifically, the identification information database may be searched for a registered feature vector corresponding to the object feature vector of each candidate object extracted from a frame image. The identification information database may store registered feature vectors together with the identification information matched to each of them. The identification information corresponding to a candidate object's object feature vector can therefore be looked up in the identification information database.

If a registered feature vector corresponding to the object feature vector is found, the identification information matched with that registered feature vector may be retrieved from the identification information database and set as the identification information of the candidate object. If no corresponding registered feature vector is found, the object feature vector is appearing in the target video for the first time. In that case, new identification information may be generated for the object feature vector, and the identification information database may be updated by registering the new identification information together with the object feature vector.

Here, the object feature vector may be determined to correspond to a registered feature vector if the two match within a preset error range. Even for the same candidate object, the object feature vector may vary slightly from frame image to frame image, so identity is judged with this error range taken into account.
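The lookup-or-register behavior of the identification information setting step can be sketched as below. The source does not say how the "error range" is measured; Euclidean distance, the `tolerance` parameter, the `ID-n` label format, and the class name `IdentityDatabase` are all assumptions for illustration.

```python
import math


class IdentityDatabase:
    """Toy identification information database (hypothetical structure)."""

    def __init__(self, tolerance):
        self.tolerance = tolerance   # assumed: maximum Euclidean distance for a match
        self._entries = []           # list of (registered_vector, identity)
        self._next_id = 1

    def lookup_or_register(self, vector):
        """Return the matching identity, or register the vector under a new one."""
        best_id, best_dist = None, None
        for registered, identity in self._entries:
            dist = math.dist(registered, vector)
            if dist <= self.tolerance and (best_dist is None or dist < best_dist):
                best_id, best_dist = identity, dist
        if best_id is not None:
            return best_id
        # No registered vector within tolerance: first appearance in the video.
        identity = f"ID-{self._next_id}"
        self._next_id += 1
        self._entries.append((list(vector), identity))
        return identity
```

Calling `lookup_or_register` once per detected candidate object gives repeated appearances the same identity, while a previously unseen object receives a fresh one. All vectors are assumed to have the same length, which is what the fixed-size resizing step provides.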

In the tracking information generation step (S50), candidate objects having the same identification information may be tracked across consecutive frame images, and tracking information may be generated for the candidate objects for each piece of identification information. That is, tracking information may be generated by combining the position information and identification information of the candidate objects extracted from each frame image, and the tracking information may be used to follow how a candidate object's position changes smoothly across consecutive frame images. The tracking information may include the identification information of the candidate object, frame information such as the numbers of the frames in which the candidate object appears, and the position and size of the candidate object within the frame image. Depending on the embodiment, the tracking information may further include information on the candidate object's pose, facial feature points, clothing feature points, and the like. A single target video may contain multiple pieces of tracking information, each generated per candidate object identification information.
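One possible shape for the tracking information is sketched below: per-frame detections are grouped by identification information and ordered by frame number. The `Detection` record, its field names, and `build_tracks` are hypothetical; the source lists the kinds of information stored but not a concrete data structure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    frame: int        # frame number in which the candidate object appears
    identity: str     # identification information assigned in step S40
    box: tuple        # (x, y, width, height): position and size in the frame


def build_tracks(detections):
    """Group detections by identity, each group sorted by frame number."""
    tracks = defaultdict(list)
    for det in sorted(detections, key=lambda d: d.frame):
        tracks[det.identity].append(det)
    return dict(tracks)
```

Optional fields such as pose or facial feature points could be added to the record in the same way.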

Depending on the embodiment, the same candidate object may also be tracked using the difference between the object feature vectors of candidate objects in consecutive frame images together with the changes in their position and size. Specifically, a tracking error value may be computed as Error = (V1 - V2) + a × (d1 - d2) + b × (s1 - s2). Here, Error is the tracking error value, V1 is the object feature vector of a first candidate object in a first frame image, V2 is the object feature vector of a second candidate object in a second frame image, d1 is the distance from a reference point to the center of the first candidate object, d2 is the distance from the reference point to the center of the second candidate object, s1 is the area of the first candidate object, s2 is the area of the second candidate object, and a and b are weights, which are arbitrary constants.
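The tracking error can be sketched in code as follows. The source writes the terms as plain differences; because V1 - V2 is a vector and signed terms could cancel each other, this sketch takes the Euclidean norm of the feature difference and the absolute values of the distance and area differences. That choice, the default weights, and the function name are assumptions, not something the source states.

```python
import math


def tracking_error(v1, v2, d1, d2, s1, s2, a=1.0, b=1.0):
    """Scalar tracking error between two candidate objects in consecutive frames.

    v1, v2: object feature vectors of equal length
    d1, d2: distances from a reference point to each object's center
    s1, s2: areas of the two objects
    a, b:   weights (arbitrary constants)
    """
    feature_term = math.dist(v1, v2)   # assumed: magnitude of V1 - V2
    return feature_term + a * abs(d1 - d2) + b * abs(s1 - s2)
```

A tracker built on this would typically pair each object in one frame with the object in the next frame that gives the smallest error.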

Meanwhile, although not shown, an error detection step may be further included depending on the embodiment. In the error detection step, an error in setting the identification information of a candidate object may be detected, and when such an error is detected, the user or another party may be notified. For example, a candidate object with a given identification information may be missing from some of a run of consecutive frame images. Because the interval between consecutive frame images is very short, such an omission can be taken to indicate that an error occurred in setting the identification information of the candidate object.
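A minimal sketch of such a check is given below: for one identification information, it reports short runs of frames in which the object is missing between two appearances. The `max_gap` threshold, which treats longer absences as the object genuinely leaving the scene, is an assumption added for illustration, as is the function name.

```python
def find_suspect_gaps(frames_with_object, max_gap=5):
    """Return (first_missing, last_missing) frame ranges that look like ID errors.

    frames_with_object: frame numbers in which one identity was detected.
    max_gap: assumed limit; longer absences are not reported.
    """
    frames = sorted(set(frames_with_object))
    gaps = []
    for prev, cur in zip(frames, frames[1:]):
        missing = cur - prev - 1
        if 0 < missing <= max_gap:
            gaps.append((prev + 1, cur - 1))
    return gaps
```

Any range returned here could be surfaced to the user as a possible identification information setting error.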

In the masking step (S60), when a masking input is received, the tracking information may be used to extract the candidate objects whose identification information matches that included in the masking input, and masking may be performed on the extracted candidate objects. That is, masking may be performed selectively on the selected candidate objects, and the candidate objects to be masked may be specified by their identification information. The masking input may be generated from identification information entered by the user, or from identification information extracted according to a preset selection algorithm.

Meanwhile, in the masking step (S60), the tracking information may be used to extract the frame images in which the candidate object corresponding to the selected identification information appears. A masking area may then be set according to the position of the candidate object in each extracted frame image, and the masking area may be masked so that the object is de-identified and viewers cannot recognize it. The masking area may be limited to a part of the candidate object, such as the region corresponding to the face.
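The selection of masking areas from tracking information can be sketched as follows. Tracking information is represented here as a dictionary from identification information to `(frame, box)` pairs; that representation and the name `masking_regions` are assumptions for illustration.

```python
def masking_regions(tracking_info, selected_ids):
    """Collect the boxes to mask in each frame for the selected identities.

    tracking_info: {identity: [(frame_number, (x, y, width, height)), ...]}
    selected_ids:  identities named in the masking input
    Returns {frame_number: [box, ...]} covering only frames where a selected
    identity appears.
    """
    regions = {}
    for identity in selected_ids:
        for frame, box in tracking_info.get(identity, []):
            regions.setdefault(frame, []).append(box)
    return regions
```

Frames that contain none of the selected identities do not appear in the result and can be passed through unchanged.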

The image masking apparatus may mask the masking area by blurring it, applying a mosaic, or the like. Blurring may be implemented with a low-pass filter. Depending on the embodiment, masking may also be performed by image replacement, in which the masking area is covered with a solid color, a specific pattern, a separate image or animation, a character, or the like.
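As one concrete instance of the mosaic option, the sketch below replaces each small block inside the masking area with the block's mean value. It works on a grayscale image stored as a list of rows of integers; that representation, the block size, and the function name are assumptions, and the box is assumed to lie entirely inside the image.

```python
def mosaic(image, box, block=2):
    """Return a copy of image with the box region pixelated.

    image: list of rows of integer gray values (assumed representation)
    box:   (x, y, width, height), assumed to lie inside the image
    block: side length of each mosaic tile in pixels
    """
    x0, y0, w, h = box
    out = [row[:] for row in image]   # leave the original frame untouched
    for by in range(y0, y0 + h, block):
        for bx in range(x0, x0 + w, block):
            ys = range(by, min(by + block, y0 + h))
            xs = range(bx, min(bx + block, x0 + w))
            pixels = [image[y][x] for y in ys for x in xs]
            mean = sum(pixels) // len(pixels)
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out
```

Averaging over each tile discards fine detail inside the masking area, which is also what a low-pass blur does, but with a blocky rather than smooth result.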

Additionally, the image masking apparatus may encode the video into which the masking has been composited and output the masked video through a video output unit or the like. Depending on the embodiment, the masking information for the masking areas set in the target video may instead be stored separately from the target video. That is, the original file of the target video is stored as it is, and the masking information is applied to mask the target video when it is played.
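Storing the masking information apart from the original video can be sketched as below, with the masking areas serialized to JSON and applied only when a frame is rendered. The JSON layout, the solid-color fill, and all function names are assumptions for illustration; the source does not define a storage format.

```python
import json


def save_masking_info(regions):
    """Serialize {frame_number: [(x, y, w, h), ...]} to a JSON string."""
    return json.dumps(
        {str(frame): [list(box) for box in boxes]
         for frame, boxes in sorted(regions.items())}
    )


def load_masking_info(text):
    """Inverse of save_masking_info."""
    return {int(frame): [tuple(box) for box in boxes]
            for frame, boxes in json.loads(text).items()}


def render_frame(frame_number, image, masking_info, fill=0):
    """Return the frame as it should be shown, masked at playback time."""
    out = [row[:] for row in image]   # the stored original stays unmodified
    for x, y, w, h in masking_info.get(frame_number, []):
        for yy in range(y, y + h):
            for xx in range(x, x + w):
                out[yy][xx] = fill
    return out
```

Because the original frames are never overwritten, the same video can be played masked or unmasked depending on which masking information, if any, is applied.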

Additionally, the image masking method according to an embodiment of the present invention may let the user select the candidate objects to be masked, and in that case may further include features that make the selection more convenient for the user.

Specifically, the method may further include an object tracking image generation step (not shown), in which the tracking information is used to overlay the candidate objects and the identification information of each candidate object on the target video to generate an object tracking image. That is, the object tracking image may be generated by displaying the bounding box of each candidate object and its identification information together with the target video.

The method may also further include a masking selection interface display step (not shown), which provides a masking selection interface through which the user can select the candidate objects to be masked. Specifically, the masking selection interface may display an identification information list in which the identification information corresponding to the candidate objects in the target video is arranged, along with appearance section information indicating where in the target video the candidate objects with each identification information appear, the number of frame images in which they appear, and so on. The interface may further include an image taken from a frame image in which the candidate object appears, so that the user can check which candidate object corresponds to each identification information.
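The data behind such a selection list can be sketched as follows: for each identification information, consecutive frame numbers are merged into appearance sections and the total number of frames is counted. The row layout, the `selected` checkbox flag, and the function names are assumptions for illustration.

```python
def appearance_intervals(frames):
    """Merge frame numbers into (first_frame, last_frame) appearance sections."""
    intervals = []
    for f in sorted(set(frames)):
        if intervals and f == intervals[-1][1] + 1:
            intervals[-1][1] = f          # extends the current section
        else:
            intervals.append([f, f])      # starts a new section
    return [tuple(i) for i in intervals]


def selection_list(frames_by_identity):
    """Build one row per identity for a masking selection list."""
    rows = []
    for identity in sorted(frames_by_identity):
        frames = frames_by_identity[identity]
        rows.append({
            "identity": identity,
            "intervals": appearance_intervals(frames),
            "frame_count": len(set(frames)),
            "selected": False,            # checkbox state, initially unchecked
        })
    return rows
```

The identities whose rows end up with `selected` set to true would then form the masking input.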

The present invention described above can be implemented as computer-readable code on a medium on which a program is recorded. The computer-readable medium may store a computer-executable program persistently, or store it temporarily for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or several pieces of hardware combined; it is not limited to a medium directly connected to a particular computer system and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and other devices configured to store program instructions. Further examples include recording or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute various other software. Accordingly, the above detailed description should not be construed as restrictive in any respect and should be considered illustrative. The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in the scope of the present invention.

The present invention is not limited by the embodiments described above or by the accompanying drawings. It will be apparent to those of ordinary skill in the art to which the present invention pertains that the components according to the present invention can be substituted, modified, and changed without departing from the technical spirit of the present invention.

1: image capturing device 100: image masking apparatus
110: frame input unit 120: feature vector extraction unit
130: identification information setting unit 140: tracking information generation unit
150: error detection unit 160: masking unit
170: object tracking image generation unit 180: masking selection interface display unit
200: identification information database
S10: frame input step S20: candidate object detection step
S30: object feature vector extraction step S40: identification information setting step
S50: tracking information generation step S60: masking step

Claims

1. An image masking method comprising:
generating frame images in units of frames by decoding an input target video;
extracting a frame feature vector corresponding to each frame image from the frame images, and detecting candidate objects from the frame feature vector;
extracting an object feature vector corresponding to each candidate object from the frame feature vector using position information of the detected candidate objects;
setting identification information for the candidate objects using the object feature vector;
tracking candidate objects having the same identification information across consecutive frame images, and generating tracking information for the candidate objects for each piece of identification information; and
upon receiving a masking input, extracting, using the tracking information, the candidate objects whose identification information matches that included in the masking input, and performing masking on the extracted candidate objects,
wherein the step of setting the identification information comprises:
searching an identification information database for a registered feature vector corresponding to the object feature vector;
when a registered feature vector corresponding to the object feature vector is found, extracting the identification information matched with the registered feature vector from the identification information database and setting it as the identification information of the candidate object; and
when no registered feature vector corresponding to the object feature vector is found, newly generating identification information for the candidate object, and newly registering the object feature vector and the identification information in the identification information database, and
wherein the step of performing the masking comprises extracting, using the tracking information, the frame images in which the candidate object corresponding to the identification information appears, and setting the position of the candidate object in the extracted frame images as a masking area.

2. (Deleted)

3. The method of claim 1, wherein the step of searching the identification information database comprises determining that the object feature vector corresponds to the registered feature vector if the object feature vector matches the registered feature vector within a preset error range.

4. The method of claim 1, further comprising an error detection step of determining that an error has occurred in setting the identification information of a candidate object when the candidate object having the same identification information is missing from some of the consecutive frame images.

5. The method of claim 1, wherein the tracking information includes at least one of the identification information of the candidate object, frame information indicating the frames in which the candidate object appears, and position information and size information of the candidate object in the frame image.

6. The method of claim 1, further comprising generating, using the tracking information, an object tracking image in which the candidate objects and the identification information of each candidate object are overlaid on the target video.

7. The method of claim 6, further comprising displaying a masking selection interface that includes an identification information list in which identification information corresponding to the candidate objects in the target video is arranged, appearance section information indicating where candidate objects having the same identification information appear in the target video, and a frame image in which the candidate object appears.

8. The method of claim 1, wherein the masking input is generated using identification information input by a user, or using identification information extracted according to a preset selection algorithm.

9. (Deleted)

10. The method of claim 1, wherein performing the masking comprises masking the masking area using blurring, mosaic processing, or image replacement.

11. The method of claim 1, wherein performing the masking comprises storing the target video and the masking information for the masking area set in the target video separately, and masking the target video using the masking information when the target video is played.

12. The method of claim 11, wherein performing the masking comprises requesting an access right when the target video is played, and masking and playing the target video when the access right is absent.

13. A computer program stored in a medium to execute, in combination with hardware, the image masking method of any one of claims 1, 3 to 8, and 10 to 12.

14. An image masking apparatus comprising:
a frame input unit that decodes a received target video and generates frame images in units of frames;
a feature vector extraction unit that extracts a frame feature vector corresponding to each frame image from the frame images, detects candidate objects from the frame feature vector, and extracts an object feature vector corresponding to each candidate object from the frame feature vector using position information of the detected candidate objects;
an identification information setting unit that sets identification information for the candidate objects using the object feature vector;
a tracking information generation unit that tracks candidate objects having the same identification information across consecutive frame images and generates tracking information for the candidate objects for each piece of identification information; and
a masking unit that, upon receiving a masking input, extracts, using the tracking information, the candidate objects whose identification information matches that included in the masking input, and performs masking on the extracted candidate objects,
wherein the identification information setting unit searches an identification information database for a registered feature vector corresponding to the object feature vector; when a registered feature vector corresponding to the object feature vector is found, extracts the identification information matched with the registered feature vector from the identification information database and sets it as the identification information of the candidate object; and when no registered feature vector corresponding to the object feature vector is found, newly generates identification information for the candidate object and newly registers the object feature vector and the identification information in the identification information database, and
wherein the masking unit extracts, using the tracking information, the frame images in which the candidate object corresponding to the identification information appears, and sets the position of the candidate object in the extracted frame images as a masking area.

15. (Deleted)