KR20010067778A

KR20010067778A - Image/video processing method and image/video object segmentation method

Info

Publication number: KR20010067778A
Application number: KR1020010015242A
Authority: KR
Inventors: 원치선
Original assignee: 원치선
Priority date: 2001-03-23
Filing date: 2001-03-23
Publication date: 2001-07-13

Abstract

PURPOSE: An image/video processing method and an image/video object segmentation method are provided so that an image or video can be segmented automatically, without user intervention. CONSTITUTION: A semantic object is defined in an image. The pixel values of the pixels in the defined semantic object are combined with a predetermined label by watermarking, thereby labeling the semantic object. To define the semantic object, the original image is divided into small non-overlapping blocks (202), and the user groups the blocks that belong to the same object (204). The image data containing the labeled semantic object is then compressed with a predetermined image compression method (210), and the compressed image data is stored in a predetermined database (212).

Description

Image/video processing method and image/video object segmentation method

The present invention relates to an image/video data processing method and, more particularly, to an image/video data processing method that allows objects to be segmented automatically without user intervention.

Extracting semantic objects from still images or video is essential for content-based image processing. For example, video object segmentation is required as a preprocessing step both for encoding in an object-based MPEG-4 encoder and for extracting various MPEG-7 descriptors such as shape descriptors.

However, no method is known to date for segmenting objects fully automatically by computer; that is, the user's help is required when segmenting semantic objects. Because human intervention is needed, content-based image and video processing cannot run fully automatically, which makes it difficult to implement an online system.

To address this problem, methods that minimize or simplify human intervention in image or video segmentation are disclosed in C. Gu and M. C. Lee, "Semiautomatic segmentation and tracking of semantic video objects," IEEE Transactions on Circuits and Systems for Video Technology, vol. 8, no. 5, pp. 572-584, September 1998; in D. K. Park, H. S. Yoon, and C. S. Won, "Fast object tracking in digital video," IEEE Transactions on Consumer Electronics, vol. 46, no. 3, pp. 785-790, August 2000; and in PCT Application No. PCT/US99/22264, "System and method for semantic video object segmentation," filed on September 24, 1999 by The Trustees of Columbia University in the City of New York.

The object segmentation methods disclosed in the above papers and patent document require the user's assistance offline. Moreover, the techniques used in these methods lack storability: the segmentation result is not preserved for later use. Consequently, with the conventional object segmentation methods, human assistance must be guaranteed, the iterative segmentation process is still performed offline, and the same semi-automatic segmentation must be repeated every time semantic object extraction is needed.

A technical object of the present invention is to provide an image processing method that processes an image so that objects in the image can be segmented automatically without user intervention.

Another technical object of the present invention is to provide an image processing apparatus that performs the image processing method.

A further technical object of the present invention is to provide a method of segmenting semantic objects from an image processed by the image processing method.

A further technical object of the present invention is to provide an image segmentation apparatus that performs the image segmentation method.

A further technical object of the present invention is to provide a video processing method that processes a video so that objects in the video can be segmented automatically without user intervention.

A further technical object of the present invention is to provide a method of segmenting semantic objects from a video processed by the video processing method.

FIG. 1 is a block diagram of an image/video processing apparatus according to an embodiment of the present invention.

FIG. 2 is a flowchart of the main steps of an image/video processing method according to an embodiment of the present invention, performed in the image/video processing apparatus of FIG. 1.

FIG. 3A shows an example of an input original image.

FIG. 3B shows an example of an image divided into small non-overlapping blocks.

FIG. 3C shows an example of a labeling result.

FIG. 3D shows an example of a watermarked image.

FIG. 4 is a block diagram of the structure of an image/video segmentation apparatus according to an embodiment of the present invention.

FIG. 5 is a flowchart of the main steps of an image/video segmentation method according to an embodiment of the present invention, performed in the image/video segmentation apparatus of FIG. 4.

FIG. 6A shows an example of a watermarked image.

FIG. 6B shows an example of extracted object labels.

FIG. 6C shows an example of an extracted coarse object.

FIG. 6D shows an example of a fine semantic object.

To achieve the above technical object, an image processing method according to the present invention processes an image so that objects in the image can be segmented automatically without user intervention, and comprises: (a) defining a semantic object in the image; and (b) labeling the semantic object by combining the pixel values of the pixels in the defined semantic object with a predetermined label by watermarking.

Preferably, step (a) comprises: (a-1) dividing the original image into small non-overlapping blocks; and (a-2) grouping the small blocks, with user intervention, into groups belonging to the same object.

Preferably, the method further comprises, after step (b): (c) compressing the image data containing the labeled semantic object using a predetermined image compression method; and (d) storing the compressed image data in a predetermined database.

Preferably, step (a) comprises (a') defining the semantic object from the image with human assistance, using a user interface of a computer.

Preferably, step (b) comprises: (b-1) assigning a label to each pixel of the defined semantic object; and (b-2) performing watermarking by hiding the label assigned to each pixel in the gray level value of that pixel.

Preferably, step (b-2) comprises: (b-2-1) determining the watermarked gray level value of each pixel from the pixel's gray level value by [Equation 1], where L is an integer that makes the watermark robust to changes in the least significant bits (LSBs), K is the total number of segmented objects in the image, and the equation uses truncation to an integer, the absolute value, and the modulo operation "mod"; and (b-2-2) replacing each pixel value with the determined watermarked gray level value.

To achieve another technical object, an image processing apparatus according to the present invention comprises: a segmentation unit that segments objects from an input image with user intervention; a labeling unit that labels the grouped objects with predetermined labels; and a watermarking unit that obtains a watermarked image by combining the original image with the labels by watermarking.

To achieve a further technical object, an image segmentation method according to the present invention comprises: (a) extracting object labels from a watermarked image; (b) obtaining a watermarked gray level from every pixel in the watermarked image; and (c) extracting the object label from the watermarked gray level value.

To achieve a further technical object, an image segmentation apparatus according to the present invention comprises: a label extraction unit that extracts object labels from a watermarked image; a coarse object extraction unit that extracts a coarse object using the extracted object labels; and a fine object extraction unit that obtains a fine semantic object by refining the coarse object.

To achieve a further technical object, a video processing method according to the present invention processes a video so that objects in a video shot can be segmented automatically without user intervention, and comprises: (a) dividing the first frame of an input video shot into small non-overlapping blocks; (b) defining a semantic object by grouping the small blocks, with user intervention, into groups belonging to the same object; and (c) assigning the same object label to the objects that correspond to the manually segmented semantic object while tracking along the temporal domain of the video shot.

To achieve a further technical object, a video segmentation method according to the present invention comprises: (a) extracting object labels from the first frame of a watermarked video shot; (b) extracting a coarse object from the input watermarked video, using the object labels extracted from the first frame, while tracking along the temporal domain of the video shot; and (c) obtaining a fine semantic object by refining the coarse object.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram of an image/video processing apparatus according to an embodiment of the present invention. Referring to FIG. 1, the apparatus includes a segmentation unit 102, a labeling unit 104, a watermarking unit 106, a compression unit 108, and a database 110. FIG. 2 is a flowchart of the main steps of an image/video processing method according to an embodiment of the present invention, performed in the apparatus of FIG. 1, and is referred to throughout the description below.

The operation of the image/video processing apparatus is as follows. The segmentation unit 102 segments objects from the input image semi-automatically or manually, with user intervention. This embodiment applies a block-based semi-automatic object segmentation method. First, the original image is divided into small non-overlapping blocks (step 202). FIG. 3A shows an example of an input original image, and FIG. 3B shows an example of the image divided into small non-overlapping blocks. Next, with user intervention, the small blocks are grouped into groups belonging to the same object (step 204). The user can perform this grouping with operations such as mouse dragging or clicking in a user interface provided by a program on the computer. For simplicity, this embodiment describes the case in which the image contains only one object and a background.
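Steps 202 and 204 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `block_grid` and `group_blocks`, the square block size, and the set `selected` (standing in for the blocks the user drags over or clicks) are all hypothetical and introduced here only for the example.

```python
def block_grid(height, width, block_size):
    """Number of block rows and columns when an image is tiled with
    non-overlapping square blocks (edge blocks may be partial)."""
    rows = -(-height // block_size)  # ceiling division
    cols = -(-width // block_size)
    return rows, cols


def group_blocks(rows, cols, selected):
    """Build a block-level grouping for the one-object-plus-background case.

    `selected` is a hypothetical stand-in for user input: a set of
    (block_row, block_col) pairs the user marked as belonging to the object.
    Marked blocks get 1, all other blocks (background) get 0.
    """
    return [[1 if (r, c) in selected else 0 for c in range(cols)]
            for r in range(rows)]


if __name__ == "__main__":
    rows, cols = block_grid(10, 12, 4)
    print(rows, cols)
    print(group_blocks(rows, cols, {(1, 1), (1, 2)}))
```

Keeping the user's selection at block granularity is what makes the manual step cheap: the user marks a few blocks rather than tracing a pixel-accurate contour.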

Next, the labeling unit 104 labels the grouped objects with predetermined labels (step 206). FIG. 3C shows an example of a labeling result. Referring to FIG. 3C, the blocks belonging to the grouped object are labeled with binary "1" and the blocks belonging to the background are labeled with binary "0". One bit of information is thus assigned to every block. Because each block consists of many pixels, every pixel in a block is assigned the same one-bit label.
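The propagation of a block label to every pixel of that block can be sketched as below. The function name `expand_block_labels` and the list-of-rows representation are assumptions made for this illustration; the patent does not prescribe a data layout.

```python
def expand_block_labels(block_labels, height, width, block_size):
    """Give each pixel the label of the block that contains it.

    `block_labels[r][c]` is the label of block (r, c); pixel (i, j) lies in
    block (i // block_size, j // block_size).
    """
    return [[block_labels[i // block_size][j // block_size]
             for j in range(width)]
            for i in range(height)]


if __name__ == "__main__":
    labels = expand_block_labels([[0, 1], [1, 0]], 4, 4, 2)
    for row in labels:
        print(row)
```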

Next, the watermarking unit 106 obtains a watermarked image by combining the original image with the labels by watermarking (step 208). In this embodiment, the watermarking unit 106 performs the watermarking by hiding the label values in the gray level values of the corresponding pixels. For example, the watermarked gray level value of a pixel is determined from the pixel's gray level value by Equation 1.

[Equation 1]

Each pixel value is replaced with the watermarked gray level value so determined, which embeds the object label in the gray level value of the pixel. Here, L is an integer that makes the watermark robust to changes in the least significant bits (LSBs), and K is the total number of segmented objects in the image. Equation 1 also uses truncation to an integer, the absolute value, and the modulo operation "mod". This watermarking can be understood as hiding the object label in the middle of the gray level value rather than in its lowest bits, so the label survives changes to the LSBs caused by external perturbations.
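Because the formula of Equation 1 is not reproduced in this text, the sketch below uses an assumed embedding rule rather than the patent's own: it shifts the gray level by whole multiples of L until the quantization bin index (the gray level divided by L and truncated) is congruent to the label k modulo K. This rule was chosen only because it agrees with the worked example in the description (gray level 30, k = 2, K = 3, L = 4 gives 34); it does not use the absolute value that the description mentions, so the actual equation may differ. The function name `embed_label`, the 8-bit range, and the overflow handling are likewise assumptions.

```python
def embed_label(gray, k, K, L, max_gray=255):
    """Assumed embedding rule (not the patent's Equation 1 verbatim).

    Returns a gray level whose bin index (value // L) is congruent to the
    label k modulo K. Assumes 0 <= gray <= max_gray and K * L <= max_gray.
    """
    if not 0 <= k < K:
        raise ValueError("label k must satisfy 0 <= k < K")
    current = (gray // L) % K   # label currently implied by this gray level
    shift = (k - current) % K   # number of L-sized steps needed to reach k
    marked = gray + L * shift
    if marked > max_gray:       # assumed overflow handling: go back one full
        marked -= L * K         # cycle of K bins, which preserves the label
    return marked


if __name__ == "__main__":
    print(embed_label(30, 2, 3, 4))
```

Under this assumed rule a pixel changes by at most L × (K - 1) gray levels, which illustrates why a larger L makes the mark more visible.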

This labeling process is performed for every pixel in the image: each pixel is checked to see whether it is the last pixel of the image; if it is not, the next pixel is selected and labeled, and if it is, the object label embedding ends. FIG. 3D shows an example of a watermarked image. As FIG. 3D shows, the watermarked image is indistinguishable from the original image, which is a property of watermarking. Once the embedding is complete, the image with the embedded object labels is compressed (step 210), and the compressed image is stored in the database 110 (step 212).

With the image/video processing method described above, semantic objects are defined in the original image with the help of a user interface, and object labels are embedded in the original image. The image with the embedded object labels is compressed and stored in the database. An image processed and stored in this way can later be segmented automatically, as described below.

Meanwhile, various content-based image and video applications search for and retrieve images on the basis of semantic objects, and this requires image and video segmentation.

FIG. 4 is a block diagram of the structure of an image/video segmentation apparatus according to an embodiment of the present invention. Referring to FIG. 4, the apparatus includes a decompression unit 402, a label extraction unit 404, a coarse object extraction unit 406, and a fine object extraction unit 408. FIG. 5 is a flowchart of the main steps of an image/video segmentation method according to an embodiment of the present invention, performed in the apparatus of FIG. 4, and is referred to throughout the description below.

The operation of the image/video segmentation apparatus is as follows. The decompression unit 402 decompresses a compressed image retrieved from the database 110 to obtain the watermarked image (step 502). FIG. 6A shows an example of a watermarked image. Next, the label extraction unit 404 extracts the object labels from the watermarked image (step 504). FIG. 6B shows an example of extracted object labels. Step 504 obtains the watermarked gray level of every pixel in the watermarked image and extracts the object label from each watermarked gray level value. That is, in this embodiment the object label value is extracted from the label-watermarked gray level value according to the following equation.

[Label extraction equation]

For example, assume that the gray level value of a pixel in the image is 30 and that the pixel belongs to the object labeled x(i,j) = 2, that is, k = 2. With K = 3 and L = 4, Equation 1 gives the watermarked gray level 30 + 4 × 1 = 34. The hidden object label is then extracted from the value 34. Note that the correct object label can be extracted even if the watermarked image undergoes image processing that perturbs the gray levels: a perturbation that leaves the gray level anywhere within 32 to 35 is tolerated and still yields the correct object label. Because this method hides the object label around the middle of the gray level, it is robust to LSB changes. Increasing the integer L makes the label more robust to changes caused by unintentional disturbances such as compression and scaling. However, a larger L embeds the label at a higher bit level, which can cause visually perceptible degradation at some pixels, so L must be chosen appropriately.
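The extraction formula itself is not reproduced in this text. The sketch below therefore uses an inferred rule: divide the watermarked gray level by L, truncate, and take the result modulo K. That rule is an assumption, chosen because it is consistent with the worked example (34 yields label 2 for K = 3 and L = 4) and with the stated tolerance of 32 to 35. The function names `extract_label` and `tolerated_range` are hypothetical.

```python
def extract_label(marked_gray, K, L):
    """Inferred extraction rule: bin index (value // L) taken modulo K."""
    return (marked_gray // L) % K


def tolerated_range(marked_gray, L):
    """Gray levels that fall in the same L-wide bin as `marked_gray`
    and therefore decode to the same label."""
    low = (marked_gray // L) * L
    return low, low + L - 1


if __name__ == "__main__":
    print(extract_label(34, K=3, L=4))
    print(tolerated_range(34, L=4))
    # Every value in the tolerated range decodes to the same label.
    print([extract_label(v, K=3, L=4) for v in range(32, 36)])
```

The width of the tolerated range equals L, which is the sense in which a larger L buys robustness at the cost of a more visible change to the pixel.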

Next, the coarse object extraction unit 406 extracts a coarse object using the extracted object labels (step 506). FIG. 6C shows an example of an extracted coarse object. The fine object extraction unit 408 then refines the coarse object to obtain a fine semantic object (step 508). General pixel-based image segmentation methods, including region growing techniques, can be used in this step. Region growing allows precise, pixel-level segmentation near the object contour. In addition, noise or distortion introduced by unintentional image changes such as compression can be corrected with a low-pass filter or a morphological filter. FIG. 6D shows an example of a fine semantic object.
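As a rough illustration of the noise correction mentioned above, the sketch below applies a 3 × 3 majority vote to a binary label map (one object plus background, as in this embodiment). This is a swapped-in stand-in for the low-pass or morphological filtering the description names, not the patent's refinement step, and region growing is not shown. The function name `majority_filter` and the list-of-rows representation are assumptions.

```python
def majority_filter(labels):
    """Replace each binary label by the majority label in its 3x3
    neighbourhood (clipped at the image border)."""
    height, width = len(labels), len(labels[0])
    cleaned = [row[:] for row in labels]
    for i in range(height):
        for j in range(width):
            window = [labels[a][b]
                      for a in range(max(0, i - 1), min(height, i + 2))
                      for b in range(max(0, j - 1), min(width, j + 2))]
            # Label 1 wins only with a strict majority of the window.
            cleaned[i][j] = 1 if 2 * sum(window) > len(window) else 0
    return cleaned


if __name__ == "__main__":
    noisy = [[1, 1, 1],
             [1, 0, 1],
             [1, 1, 1]]
    for row in majority_filter(noisy):
        print(row)
```

An isolated wrong label, such as one produced when compression pushes a gray level into a neighbouring bin, is outvoted by its neighbours, while large uniform regions are left unchanged.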

In this way, the semantic objects in an image can be extracted automatically. That is, the image/video object segmentation method described above requires no human intervention. Moreover, object segmentation is performed fully automatically, without human intervention, every time semantic object extraction is needed.

With the image/video object segmentation method described above, semantic objects can be segmented fully automatically, without human intervention, from images retrieved from a database, both in content-based image processing applications that use content-based indexing for MPEG-7 and object-based compression such as MPEG-4 and in other computer vision problems.

For convenience, the above embodiments describe only still images as the target of segmentation. As those skilled in the art will understand, however, the methods according to the present invention can also be applied to various kinds of video, including moving pictures.

To explain in more detail: first, objects are segmented semi-automatically or manually, with user intervention, in the first frame of an input video shot. As described with reference to FIG. 2, a block-based semi-automatic object segmentation method can be applied. That is, the first frame of the input video shot is divided into small non-overlapping blocks, and the small blocks are then grouped, with user intervention, into groups belonging to the same object. Next, the grouped objects are labeled with predetermined labels. Then, while tracking along the temporal domain of the video shot, the same object label is assigned to the objects that correspond to the manually segmented semantic object. Finally, a watermarked video is obtained by combining the original video with the labels by watermarking, and the watermarked video is compressed and stored.

In the segmentation process, the compressed video is first retrieved from the database and decompressed to obtain the watermarked video. Next, the object labels are extracted from the first frame of the watermarked video shot. Then, while tracking along the temporal domain of the video shot, the object labels extracted from the first frame are used to assign the same object label to the objects that correspond to the manually segmented semantic object. Finally, a coarse object is extracted using the extracted object labels, and the coarse object is refined to obtain a fine semantic object.

The scope of the present invention is not limited to the above embodiments, which may be altered or modified by those skilled in the art within the scope of the invention as defined by the appended claims. The image/video processing method and the image/video segmentation method according to the present invention can be written as a program that runs on a personal or server-class computer. The program codes and code segments that make up the program can be readily inferred by computer programmers in the field. The program can also be stored on a computer-readable recording medium, which includes magnetic recording media, optical recording media, and propagation media.

As described above, according to the present invention, semantic objects can be segmented fully automatically, without human intervention, from images retrieved from a database, both in content-based image processing applications that use content-based indexing for MPEG-7 and object-based compression such as MPEG-4 and in other computer vision problems.

Claims

A method of processing an image so that objects in the image can be segmented automatically without user intervention, the method comprising:

(a) defining a semantic object in the image; and

(b) labeling the semantic object by combining the pixel values of the pixels in the defined semantic object with a predetermined label by watermarking.

The method of claim 1, wherein step (a) comprises:

(a-1) dividing the original image into small non-overlapping blocks; and

(a-2) grouping the small blocks, with user intervention, into groups belonging to the same object.

The method of claim 1, further comprising, after step (b):

(c) compressing the image data containing the labeled semantic object using a predetermined image compression method; and

(d) storing the compressed image data in a predetermined database.

The method of claim 1, wherein step (a) comprises:

(a') defining the semantic object from the image with human assistance, using a user interface of a computer.

The method of claim 1, wherein step (b) comprises:

(b-1) assigning a label to each pixel of the defined semantic object; and

(b-2) performing watermarking by hiding the label assigned to each pixel in the gray level value of that pixel.

The method of claim 5, wherein step (b-2) comprises:

(b-2-1) determining, for each pixel, a watermarked gray level value from the gray level of the pixel and its assigned label according to a formula (equation not reproduced in this text), where L is an integer that makes the watermark robust to changes in the least significant bits (LSB), K is the total number of segmented objects in the image, and the formula uses truncation to an integer value, an absolute value, and a modulo operation denoted "mod"; and

(b-2-2) replacing each pixel value with the determined watermarked gray level value.
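
The formula of step (b-2-1) is not legible in this text, so the following Python sketch does not reproduce it. It shows one hypothetical quantization-style embedding that uses the same ingredients (an integer L, the object count K, integer truncation, an absolute difference, and a modulo operation): the gray level is moved to the centre of the nearest bin of width L whose index is congruent to the label modulo K. The functions `embed_label` and `extract_label` and the choice L = 4, K = 3 are assumptions.

```python
def embed_label(gray, label, L, K, max_gray=255):
    """Hide `label` (0 <= label < K) in a gray level; illustrative scheme only."""
    if not 0 <= label < K:
        raise ValueError("label must be in range(K)")
    q = gray // L                        # truncated bin index of the original pixel
    max_q = (max_gray - L // 2) // L     # last bin whose centre stays <= max_gray
    base = q - (q % K) + label           # bin in the same K-cycle that carries the label
    candidates = [c for c in (base - K, base, base + K) if 0 <= c <= max_q]
    best = min(candidates, key=lambda c: abs(c - q))  # least visible change
    return best * L + L // 2             # bin centre tolerates small perturbations


def extract_label(gray_w, L, K):
    """Recover the label hidden by embed_label."""
    return (gray_w // L) % K


watermarked = embed_label(100, 2, L=4, K=3)
recovered = extract_label(watermarked, L=4, K=3)
# A small perturbation, such as a low-order bit flip, leaves the label intact.
recovered_after_noise = extract_label(watermarked + 1, L=4, K=3)
```

Under this scheme a larger L tolerates larger gray-level changes at the cost of a more visible modification, which matches the stated role of L as a robustness parameter.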

An apparatus for processing an image so that an object in the image can be automatically segmented without user intervention, the apparatus comprising:

a divider that segments an object from an input image with the user's intervention;

a labeling unit that labels the segmented object with a predetermined label; and

a watermarking unit that obtains a watermarked image by combining the original image and the label based on watermarking.

The apparatus of claim 7, wherein the divider

divides the original image into small blocks that do not overlap each other, and groups, with the user's intervention, the small blocks that belong to the same object.

The apparatus of claim 7, wherein the watermarking unit

determines, for each pixel, a watermarked gray level value from the gray level of the pixel and its assigned label according to a formula (equation not reproduced in this text), where L is an integer that makes the watermark robust to changes in the least significant bits (LSB), K is the total number of segmented objects in the image, and the formula uses truncation to an integer value, an absolute value, and a modulo operation denoted "mod", and replaces each pixel value with the determined watermarked gray level value.

The apparatus of claim 7, further comprising:

a compression unit that compresses the image in which the object label is inserted; and

a database that stores the compressed image.

(a) extracting an object label from the watermarked image;

(b) obtaining a watermarked gray level value from each pixel in the watermarked image; and

(c) extracting the object label from the watermarked gray level value.

The method of claim 11, further comprising, before step (a):

obtaining the watermarked image by decompressing a compressed image loaded from a database.

The method of claim 11, further comprising, after step (c):

(d) extracting a coarse object using the extracted object labels; and

(e) obtaining a precise semantic object by trimming the coarse object.

The method of claim 13, wherein step (e)

is performed based on a region growing technique.
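
The claim names region growing without fixing its details, so the following Python sketch shows only one plausible reading: pixels near the coarse object boundary are treated as undecided and are absorbed, in order of increasing gray-level difference to an already labelled 4-neighbour, by the region that reaches them first. The function name `grow_regions`, the `unknown` marker value, and the use of a priority queue are assumptions.

```python
import heapq


def grow_regions(gray, labels, unknown=-1):
    """Assign every `unknown` pixel the label of the most similar adjacent region."""
    h, w = len(gray), len(gray[0])
    out = [row[:] for row in labels]
    heap = []

    def push_neighbours(y, x):
        # Queue each undecided 4-neighbour with its gray difference to (y, x).
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and out[ny][nx] == unknown:
                cost = abs(gray[ny][nx] - gray[y][x])
                heapq.heappush(heap, (cost, ny, nx, out[y][x]))

    for y in range(h):
        for x in range(w):
            if out[y][x] != unknown:
                push_neighbours(y, x)

    while heap:
        _, y, x, label = heapq.heappop(heap)
        if out[y][x] != unknown:
            continue  # already claimed by a cheaper candidate
        out[y][x] = label
        push_neighbours(y, x)
    return out


# Dark object (label 0) on the left, bright object (label 1) on the right,
# with the three middle columns left undecided by the coarse extraction.
gray = [[10, 10, 12, 200, 200],
        [10, 11, 13, 198, 200]]
coarse = [[0, -1, -1, -1, 1],
          [0, -1, -1, -1, 1]]
refined = grow_regions(gray, coarse)
```

Here the undecided pixels follow the gray-level edge between columns 2 and 3 rather than the block boundary, which is the kind of trimming step (e) describes.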

The method of claim 11, wherein step (c) comprises:

extracting the object label value from the label-watermarked gray level value according to a formula (equation not reproduced in this text).
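
Because the extraction formula is not legible in this text, the sketch below assumes a hypothetical rule in which the label is the truncated quotient of the gray level by L, taken modulo K. It then derives a coarse binary mask for one object from the recovered label map. The function names `extract_label_map` and `coarse_mask` and the values L = 4, K = 3 are assumptions, not the claimed formula.

```python
def extract_label_map(watermarked, L, K):
    """Recover a per-pixel label map, assuming floor(g / L) mod K carries the label."""
    return [[(g // L) % K for g in row] for row in watermarked]


def coarse_mask(label_map, label):
    """Binary mask (1 inside, 0 outside) of the pixels carrying `label`."""
    return [[1 if value == label else 0 for value in row] for row in label_map]


# A tiny 2x2 watermarked image used only for illustration.
watermarked_image = [[106, 98],
                     [107, 101]]
label_map = extract_label_map(watermarked_image, L=4, K=3)
mask_of_object_2 = coarse_mask(label_map, 2)
```

No user input is needed at this stage: the label map comes entirely from the pixel values, which is what allows the segmentation to run automatically on images loaded from the database.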

A label extractor which extracts an object label from a watermarked image;

A coarse object extracting unit which extracts a coarse object using the extracted object labels; and

A precision object extracting unit which obtains a precise semantic object by trimming the coarse object.

The apparatus of claim 16, further comprising

a decompressing unit which obtains the watermarked image by decompressing a compressed image loaded from a database.

A method of processing video so that an object in a video shot can be automatically segmented without user intervention, the method comprising:

(a) dividing the first frame of the input video shot into small blocks that do not overlap each other;

(b) defining semantic objects by grouping, with the user's intervention, the small blocks that belong to the same object; and

(c) assigning, while tracking along the temporal domain of the video shot, the same object labels to the objects that correspond to the manually segmented semantic objects.
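
Step (c) does not specify a tracking technique. As one illustrative possibility, the Python sketch below propagates labels from one frame to the next by block matching: each block of the current frame inherits the labels of the best-matching block (smallest sum of absolute differences) within a search window of the previous frame. The function names `sad` and `propagate_labels`, the block size, and the search radius are assumptions, and the frame dimensions are assumed to be multiples of the block size.

```python
def sad(prev, cur, py, px, cy, cx, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(prev[py + dy][px + dx] - cur[cy + dy][cx + dx])
    return total


def propagate_labels(prev_frame, prev_labels, cur_frame, size, radius):
    """Give each block of cur_frame the labels of its best match in prev_frame."""
    h, w = len(cur_frame), len(cur_frame[0])
    cur_labels = [[0] * w for _ in range(h)]
    for cy in range(0, h - size + 1, size):
        for cx in range(0, w - size + 1, size):
            best = None
            # Search a window of +/- radius pixels, clipped to the frame.
            for py in range(max(0, cy - radius), min(h - size, cy + radius) + 1):
                for px in range(max(0, cx - radius), min(w - size, cx + radius) + 1):
                    cost = sad(prev_frame, cur_frame, py, px, cy, cx, size)
                    if best is None or cost < best[0]:
                        best = (cost, py, px)
            _, py, px = best
            for dy in range(size):
                for dx in range(size):
                    cur_labels[cy + dy][cx + dx] = prev_labels[py + dy][px + dx]
    return cur_labels


# A bright 2x2 object moves two pixels to the right between frames.
frame0 = [[200, 200, 10, 10, 10, 10],
          [200, 200, 10, 10, 10, 10]]
labels0 = [[1, 1, 0, 0, 0, 0],
           [1, 1, 0, 0, 0, 0]]
frame1 = [[10, 10, 200, 200, 10, 10],
          [10, 10, 200, 200, 10, 10]]
labels1 = propagate_labels(frame0, labels0, frame1, size=2, radius=2)
```

Applying the function frame by frame carries the labels defined on the first frame through the rest of the shot, so the user's input is needed only once.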

The method of claim 18, further comprising, after step (c):

(d) obtaining a watermarked video by combining the original video and the labels based on watermarking.

The method of claim 19, further comprising, after step (d):

(e) compressing the watermarked video using a predetermined image compression method; and

(f) storing the compressed video in a predetermined database.

(a) extracting an object label from the first frame of the watermarked video shot;

(b) extracting coarse objects from the input watermarked video, while tracking along the temporal domain of the video shot, using the object labels extracted from the first frame; and

(c) obtaining a precise semantic object by trimming a coarse object.

The method of claim 21, further comprising, before step (a):

obtaining the watermarked video by loading a compressed video from a database and decompressing the loaded compressed video.