KR20180065441A

KR20180065441A - Object extraction method and apparatus

Info

Publication number: KR20180065441A
Application number: KR1020160166299A
Authority: KR
Inventors: 정승원; 이존하
Original assignee: 동국대학교 산학협력단
Priority date: 2016-12-08
Filing date: 2016-12-08
Publication date: 2018-06-18
Also published as: KR101893142B1

Abstract

An object extraction method and an apparatus thereof are disclosed. The object region extraction method includes a step of receiving a color image; a step of receiving joint information on an object; a step of generating skeleton information on the object using the joint information; and a step of extracting an object region using color information and the skeleton information on the object in the color image. It is possible to extract the object region with high accuracy.

Description

Object extraction method and apparatus

The present invention relates to a method and apparatus for separating and extracting an object region from a color image.

Human body segmentation, which separates a person from the background of an image, is an essential technique for producing realistic special effects in broadcasting and film, such as compositing a person onto a background.

Conventional techniques for segmenting a human region in an image can be broadly classified by the information they use.

The first approach uses color information in the image. Examples include finding positions that fall within the color range of a person based on that color information, or extracting feature points (feature vectors) and locating the person by combining and matching them (feature matching).

The second approach uses depth information in addition to color information. Because color information is prone to noise depending on light and illumination conditions, this approach aims to improve accuracy by using depth information, which is not strongly affected by light and illumination.

The first approach, as noted above, is strongly affected by light and illumination. In addition, when the person is surrounded by a background that is complex or similar in color (for example, furniture whose color resembles the person), the accuracy of human region extraction drops significantly. To address this, broadcast and film studios use chroma key. Chroma key, often called green screen or blue screen, is a technique in which footage is shot in front of a single-color background and that color is removed during editing so that the footage can be composited with other images.

However, chroma key is limited to settings such as purpose-built studios or film sets. It cannot be used to recognize and segment a person in footage shot in a real environment.

The second approach additionally uses depth information, so even a complex or similarly colored background can be filtered out to some extent by its distance from the camera. However, depth sensors have relatively low spatial resolution and the acquired depth values are often inaccurate. As a result, the boundary of the segmented human region is imprecise, or the outline of the person flickers in the segmentation result ("flickering").

As described above, color information and depth information, the basic information used in conventional human recognition and segmentation, suffer from reliability problems caused by confusion with the background and by noise arising from sensor characteristics.

An object of the present invention is to provide a method and apparatus for separating and extracting an object region from a color image.

Another object of the present invention is to provide an object region extraction method and apparatus that can extract an object region with high accuracy using color information and skeleton information of the object.

According to one aspect of the present invention, a method of extracting an object region from a color image is provided.

According to one embodiment of the present invention, an object region extraction method may be provided that includes: receiving a color image; receiving joint information on an object; generating skeleton information on the object using the joint information; and extracting an object region using color information on the object in the color image and the skeleton information.

According to another embodiment of the present invention, an object region extraction method may be provided that includes: receiving a color image and depth information; generating joint information using the depth information; generating skeleton information on the object using the joint information; and extracting an object region using color information on the object in the color image and the skeleton information.

Before the step of extracting the object region, the method may further include generating a skeleton weight map for each pixel of the color image using the skeleton information.

The skeleton weight map may be generated such that the skeleton weight of each pixel of the color image attenuates as the pixel moves farther from the skeleton position included in the skeleton information.

The skeleton weight map may be generated by varying the degree to which the skeleton weight of each pixel decreases according to the reliability of each joint of the skeleton included in the skeleton information.

The skeleton weight map may be generated by varying the degree to which the skeleton weight of each pixel decreases according to the depth value of the skeleton included in the skeleton information.

The step of extracting the object region may include: generating a graph structure in which each pixel of the color image is a node, and connecting each node to a foreground node and to a background node; assigning weights to the edges between the nodes, the edges between each node and the foreground node, or the edges between each node and the background node in the graph structure; and extracting the object region by cutting edges in the graph structure using a GraphCut algorithm.

The step of assigning the edge weights may increase the weight of the edge connected to the foreground node depending on whether the color of each pixel is similar to a preset color model of the object region, and may increase the weight of the edge connected to the background node depending on whether the color of each pixel is similar to a preset color model of the background region.

The step of assigning the weights may assign a higher weight to the edge connecting each node to the foreground node as the skeleton weight for the corresponding pixel in the skeleton weight map increases.

According to another aspect of the present invention, an image processing apparatus for extracting an object region from a color image is provided.

According to one embodiment of the present invention, an image processing apparatus may be provided that includes: an input unit that acquires a color image and joint information on an object from a plurality of sensors; a skeleton generation unit that generates skeleton information on the object using the joint information; and an extraction unit that extracts an object region using color information on the object in the color image and the skeleton information.

According to another embodiment of the present invention, an image processing apparatus may be provided that includes: an input unit that acquires a color image and depth information from at least one sensor; a joint information generation unit that generates joint information using the depth information; a skeleton generation unit that generates skeleton information on the object using the joint information; and an extraction unit that extracts an object region using color information on the object in the color image and the skeleton information.

The apparatus may further include a weight map generation unit that generates a skeleton weight map for each pixel of the color image using the skeleton information, in which case the extraction unit may extract the object region using the skeleton weight map.

The extraction unit may generate a graph structure in which each pixel of the color image is a node, assign weights to the edges between each node and the foreground node, which is the target of the object region segmentation, using a color model of the object and the skeleton weight map, and extract the object region by cutting edges in the graph structure using a GraphCut algorithm.

By providing the object region extraction method and apparatus according to an embodiment of the present invention, color information and skeleton information are used together when extracting an object region from a color image, so the object region can be separated with high accuracy.

FIG. 1 is a flowchart illustrating an object region extraction method according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the operation of a Kalman filter according to an embodiment of the present invention.
FIG. 3 is a diagram visualizing joint information on an object according to an embodiment of the present invention.
FIG. 4 is a diagram visualizing a skeleton weight map according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining graph cut according to an embodiment of the present invention.
FIG. 6 is a diagram showing a color image according to an embodiment of the present invention.
FIG. 7 is a diagram showing an image in which skeleton information extracted from a depth image is projected onto a color image according to an embodiment of the present invention.
FIG. 8 is a diagram showing an image in which skeleton information is projected onto a depth image according to an embodiment of the present invention.
FIG. 9 is a diagram showing the result of extracting an object region using only color information, as in the conventional art.
FIGS. 10 and 11 are diagrams showing the result of extracting an object region using only depth information, as in the conventional art.
FIG. 12 is a diagram showing the result of extracting an object region from a color image using color information and skeleton information according to an embodiment of the present invention.
FIG. 13 is a block diagram schematically showing the internal configuration of an image processing apparatus according to an embodiment of the present invention.

As used herein, singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "consists of" or "includes" should not be construed as necessarily including all of the components or steps described in the specification; some of those components or steps may be omitted, or additional components or steps may be included. In addition, terms such as "unit" and "module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, in software, or in a combination of hardware and software.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a flowchart illustrating an object region extraction method according to an embodiment of the present invention, FIG. 2 illustrates the operation of a Kalman filter, FIG. 3 visualizes joint information on an object, FIG. 4 visualizes a skeleton weight map, FIG. 5 explains graph cut, FIG. 6 shows a color image, FIG. 7 shows an image in which skeleton information extracted from a depth image is projected onto a color image, and FIG. 8 shows an image in which skeleton information is projected onto a depth image, each according to an embodiment of the present invention. FIG. 9 shows the result of extracting an object region using only color information as in the conventional art, FIGS. 10 and 11 show the result of extracting an object region using only depth information as in the conventional art, and FIG. 12 shows the result of extracting an object region from a color image using color information and skeleton information according to an embodiment of the present invention.

In step 110, the image processing apparatus 100 receives a color image and depth information.

The image processing apparatus 100 may receive the color image and the depth information through a sensor described later.

For ease of understanding and explanation, this specification assumes that a single sensor generates both the color image and the depth information, but the color image and the depth information may of course be received through separate sensors.

As another example, the image processing apparatus 100 may receive a sequence of color images through an image sensor such as a camera and generate depth information from those color images.

In step 115, the image processing apparatus 100 generates joint information using the depth information.

How to generate joint information from depth information is well known to those skilled in the art, so a separate description is omitted.

As another example, the image processing apparatus 100 may be linked with Kinect and receive the joint information from Kinect. Besides Kinect, most motion recognition devices provide joint information, and the image processing apparatus 100 may receive joint information on the object from such devices.

In step 120, the image processing apparatus 100 corrects the generated joint information using a Kalman filter.

Joint information may not be measured correctly when the object is occluded by another object or by the background, or because of noise in the depth image. The image processing apparatus 100 can therefore correct the joint information using a Kalman filter.

For example, when Kinect or a similar device is used, 25 pieces of joint information are provided for the object, along with a reliability (for example, in three levels) for each joint. Using the Kalman filter, the current position of a low-reliability joint is estimated from the predicted value, while a high-reliability joint uses the current joint position provided by Kinect. The joint information can be corrected in this way. The operation of the Kalman filter is shown in FIG. 2. The Kalman filter itself is well known to those skilled in the art, so further description is omitted.
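The reliability-gated correction described above can be sketched as follows. This is a minimal illustration, not the filter of FIG. 2: the constant-velocity motion model, the noise values `q` and `r`, the numeric confidence levels, and the names `AxisKalman` and `correct_joint` are all assumptions introduced here.

```python
from dataclasses import dataclass

TRACKED = 2  # assumed confidence level meaning "reliably measured"


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate of one joint."""
    pos: float
    vel: float = 0.0
    p00: float = 1.0  # covariance [[p00, p01], [p01, p11]]
    p01: float = 0.0
    p11: float = 1.0
    q: float = 1e-2   # process noise (assumed value)
    r: float = 1.0    # measurement noise (assumed value)

    def predict(self, dt=1.0):
        # x = F x and P = F P F^T + Q, with F = [[1, dt], [0, 1]]
        self.pos += dt * self.vel
        p00 = self.p00 + dt * (2 * self.p01 + dt * self.p11) + self.q
        p01 = self.p01 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos

    def update(self, z):
        # Measurement observes position only: H = [1, 0]
        s = self.p00 + self.r
        k0, k1 = self.p00 / s, self.p01 / s
        residual = z - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos


def correct_joint(kx, ky, measured_xy, confidence):
    """Return the corrected (x, y) of one joint for the current frame."""
    predicted = (kx.predict(), ky.predict())
    if confidence >= TRACKED:
        # Reliable joint: keep the sensor position and let the filter learn from it.
        kx.update(measured_xy[0])
        ky.update(measured_xy[1])
        return measured_xy
    # Unreliable joint: fall back to the position predicted by the motion model.
    return predicted
```

One filter pair would be kept per joint and called once per frame; an occluded joint then coasts along its last estimated velocity instead of jumping to a spurious measurement.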

For ease of understanding and explanation, this specification assumes that a Kalman filter is used to correct the joint information, but techniques other than the Kalman filter may of course be applied.

FIG. 3 visualizes the names of the 25 joints obtainable through Kinect and the corresponding body parts when the object is a person. For ease of understanding and explanation, this specification assumes that the object is a person, but the object is not limited to a person and may be, for example, an animal.

In step 125, the image processing apparatus 100 generates skeleton (bone structure) information using the joint information.

For example, the image processing apparatus 100 can generate 24 skeletons from the 25 pieces of joint information obtained through a Kinect sensor or the like. That is, the image processing apparatus 100 may obtain the straight line passing through the joints and designate every pixel on that line as skeleton. Skeleton information on the object can be generated by connecting the joints in this way.
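Marking every pixel on the line between two joints can be sketched with Bresenham's line algorithm. The source does not say which rasterization is used, so the algorithm choice, the function names, and the joint names in the usage note are assumptions for illustration only.

```python
def line_pixels(p0, p1):
    """Integer pixels on the segment from p0 to p1 (Bresenham's algorithm)."""
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    pixels = []
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:      # step along x
            err += dy
            x0 += sx
        if e2 <= dx:      # step along y
            err += dx
            y0 += sy
    return pixels


def build_skeleton(joints, bones):
    """Union of line pixels over all bones.

    joints: dict mapping a joint name to its (x, y) pixel position.
    bones:  list of (joint_name_a, joint_name_b) pairs to connect.
    """
    skeleton = set()
    for a, b in bones:
        skeleton.update(line_pixels(joints[a], joints[b]))
    return skeleton
```

For instance, `build_skeleton({"head": (0, 0), "neck": (0, 2), "hand": (2, 2)}, [("head", "neck"), ("neck", "hand")])` yields the five pixels along the two segments. With 25 joints arranged as a tree, the bone list would contain 24 pairs.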

In addition, when the skeleton information is generated from a depth image, as with Kinect, the coordinate system of the skeleton information may differ from that of the color image. The skeleton information must therefore be transformed into the coordinate system of the color image.

Depending on the implementation, a preprocessing step may additionally be performed to correct the differences in resolution and viewpoint between the color image and the depth image.

In step 130, the image processing apparatus 100 generates a skeleton weight map using the generated skeleton information.

Based on the skeleton information, the skeleton weight map may be generated so that a pixel of the color image has a higher weight the more likely it is to belong to the object region. FIG. 4 shows a visualized example of a skeleton weight map according to an embodiment of the present invention.

Because the skeleton information describes the bone structure of the object, a pixel of the color image that is closer to the skeleton position is more likely to belong to the object region, and a pixel farther from the skeleton position is less likely to belong to it. Accordingly, to generate the skeleton weight map, the image processing apparatus 100 may project the skeleton information onto the color image. The image processing apparatus 100 may then generate the skeleton weight map so that the weight of each pixel of the color image decreases as the pixel moves farther from the skeleton position included in the skeleton information.

That is, as shown in FIG. 4, the image processing apparatus 100 may generate the skeleton weight map so that pixels closer to the joint positions and the skeleton position have higher weights.

In one embodiment of the present invention, a Gaussian function may be used to generate the skeleton weight map from the skeleton information.

This can be expressed as Equation 1.

Here, (x, y) denotes the image coordinates and (x_s, y_s) denotes the skeleton coordinates near (x, y). In addition, σ denotes the standard deviation of the Gaussian function.
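The image for Equation 1 is not reproduced in this text. A Gaussian falloff consistent with the surrounding definitions might take the following form, where σ is the standard deviation of the Gaussian; the symbol W for the skeleton weight and the exact normalization are assumptions, not taken from the source.

```latex
W(x, y) = \exp\!\left( -\frac{(x - x_s)^2 + (y - y_s)^2}{2\sigma^2} \right)
```

Under this form the weight is 1 on the skeleton itself and decays toward 0 with distance, with σ controlling how wide the high-weight band around the skeleton is.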

In generating the skeleton weight map, the value of σ, which affects the weight of each pixel, may be determined differently according to at least one of the skeleton reliability, the reliability of each joint included in the skeleton information, and the depth value of the skeleton.

For example, the value of σ may be determined differently according to the skeleton reliability. The higher the skeleton reliability, the lower the image processing apparatus 100 may set σ, so that the skeleton information strongly influences the object region segmentation.

Conversely, when the reliability of the skeleton information is low, σ may be set to a high value so that the skeleton information plays only an auxiliary role in the object region segmentation.

As another example, the reliability may differ from joint to joint in the skeleton information. Accordingly, the image processing apparatus 100 may set σ differently according to the reliability of each joint.

As yet another example, because the size of the object region decreases with distance from the camera, the image processing apparatus may set σ, which determines the width, differently according to the depth value of the skeleton position. That is, the image processing apparatus 100 may set σ to a smaller value as the depth value of the skeleton increases, and to a larger value as the depth value of the skeleton decreases.
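A Gaussian weight map with a per-pixel spread chosen from reliability and depth can be sketched as below. The source gives only the direction of each dependence (higher reliability and larger depth both lead to a smaller spread), so the scale factors, `base_sigma`, `ref_depth_m`, and the function names are hypothetical.

```python
import math


def sigma_for(confidence, depth_m, base_sigma=20.0, ref_depth_m=2.0):
    """Pick the Gaussian spread for one skeleton pixel (all constants assumed).

    Higher confidence gives a smaller sigma, and so does a larger depth,
    because a distant object covers fewer pixels.
    """
    confidence_scale = {2: 1.0, 1: 1.5, 0: 2.0}[confidence]
    return base_sigma * confidence_scale * (ref_depth_m / depth_m)


def skeleton_weight_map(width, height, skeleton):
    """skeleton: list of (x_s, y_s, sigma) tuples already in color-image coordinates."""
    weights = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Use the nearest skeleton pixel and its own sigma.
            xs, ys, sigma = min(
                skeleton, key=lambda s: (x - s[0]) ** 2 + (y - s[1]) ** 2
            )
            d2 = (x - xs) ** 2 + (y - ys) ** 2
            weights[y][x] = math.exp(-d2 / (2.0 * sigma ** 2))
    return weights
```

The brute-force nearest-pixel search keeps the sketch short; a distance transform would be the usual choice for full-resolution images.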

In step 135, the image processing apparatus 100 separates (extracts) the object region from the color image using the color image and the skeleton weight map.

In more detail, the image processing apparatus 100 may use the color image to generate a color model for the object region and a color model for the background region. A Gaussian mixture model may be used for this.

The image processing apparatus 100 may also generate the color model of the object region using pixel information of the object region extracted from the previous frame.

As another example, the image processing apparatus 100 may generate the color model of the object region using pixels of the object region extracted from the previous frame together with pixels around the skeleton generated in the current frame.

The color model of the background region may be generated using pixels belonging to all regions other than the object region to be extracted.

As another example, the image processing apparatus 100 may generate the color model of the background region using the pixels inside a bounding box containing the object region, excluding the pixels that belong to the object region.

The image processing apparatus 100 may segment (extract) the object region using the color models of the object region and the background region together with the skeleton weight map.

For example, the image processing apparatus 100 may separate (extract) the object region using a graph-based region segmentation technique.

More specifically, the image processing apparatus 100 may separate (extract) the object region using the GraphCut technique.

A method by which the image processing apparatus 100 separates (extracts) the object region using the GraphCut technique is described with reference to FIG. 5.

Each pixel corresponds to a node of the graph, and neighboring pixels are connected to each other by edges. The graph may also be built in units of segments or superpixels rather than pixels. The weight of an edge may be set using the difference between the connected nodes. More specifically, the difference in color values between the connected nodes may be used to set the edge weight.
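One common way to turn a color difference into a neighbor edge weight is an exponential falloff, sketched below. The source says only that the color difference may be used; the exponential form, the parameter `beta`, and the name `neighbor_weight` are assumptions.

```python
import math


def neighbor_weight(color_a, color_b, beta=0.01):
    """Edge weight between two adjacent pixels (assumed exponential form).

    Similar colors give a weight near 1, so the edge is expensive to cut;
    dissimilar colors give a weight near 0, so a cut tends to follow color boundaries.
    """
    diff2 = sum((a - b) ** 2 for a, b in zip(color_a, color_b))
    return math.exp(-beta * diff2)
```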

각 노드는 전경 노드(s)와 배경 노드(t)와 별도로 연결될 수 있다(도 5의 63). 각 노드가 전경에 가까울수록 전경 노드와 연결된 에지에 높은 가중치가 할당된다. 반면 각 노드가 배경에 가까울수록 배경 노드와 연결된 에지에 높은 가중치가 할당된다.Each node may be separately connected to the foreground node s and the background node t (63 of FIG. 5). As each node is closer to the foreground, a higher weight is assigned to the edge connected to the foreground node. On the other hand, as each node is closer to the background, a higher weight is assigned to the edge connected to the background node.

이미 전술한 바와 같이, 본 발명의 일 실시예에 따르면, 영상 처리 장치(100)는 객체 영역과 배경 영역에 대한 색상 모델을 가지고 있으므로, 영상 처리 장치(100)는 각 노드의 칼라 값이 객체 영역의 색상 모델에 비슷할수록 전경 노드와 연결된 에지의 가중치를 높게 할당할 수 있다. 반면, 영상 처리 장치(100)는 각 노드의 칼라 값이 배경 영역의 색상 모델에 비슷할수록 배경 노드와 연결된 에지의 가중치를 높게 할당할 수 있다.As described above, according to an embodiment of the present invention, the image processing apparatus 100 has a color model for an object region and a background region, The higher the weight of the edge connected to the foreground node, the higher the color model. On the other hand, as the color value of each node is similar to the color model of the background area, the image processing apparatus 100 can assign a higher weight to the edge connected to the background node.

In addition, using the skeleton weight map, the image processing apparatus 100 can assign a higher weight to the edge connected to the foreground node the higher the skeleton weight of the node is.
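A minimal sketch of how the weights of the edges to the foreground node (s) and the background node (t) might combine the color models with the skeleton weight is given below. The Gaussian similarity, the use of a single mean color per model, the additive combination, and the parameters `sigma` and `lam` are all assumptions made for illustration; the specification only requires that the weights increase with color similarity and with the skeleton weight.

```python
import math

def color_similarity(color, model_mean, sigma=40.0):
    """Similarity in (0, 1]; 1 means the color equals the model mean."""
    d2 = sum((a - b) ** 2 for a, b in zip(color, model_mean))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def terminal_weights(color, skeleton_weight, fg_mean, bg_mean, lam=1.0):
    """Return (weight of edge to foreground node s, weight of edge to background node t).

    lam is a hypothetical factor balancing the skeleton term against the color term.
    """
    w_s = color_similarity(color, fg_mean) + lam * skeleton_weight
    w_t = color_similarity(color, bg_mean)
    return w_s, w_t

fg_mean, bg_mean = (200, 50, 50), (20, 20, 220)
pixel = (190, 60, 55)
near = terminal_weights(pixel, skeleton_weight=0.9, fg_mean=fg_mean, bg_mean=bg_mean)
far = terminal_weights(pixel, skeleton_weight=0.0, fg_mean=fg_mean, bg_mean=bg_mean)
```

Two pixels of identical color receive the same background weight, but the one lying near the skeleton (`near`) receives a larger foreground weight than the one far from it (`far`).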

After assigning weights to all edges in this way, the image processing apparatus 100 can separate (extract) the object region according to the procedure specified by the GraphCut scheme (for example, max-flow or min-cut).

The GraphCut scheme operates by cutting edges in the graph structure such that the sum of the weights of the cut edges is minimized (64 in FIG. 5).
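The minimum-cut step can be illustrated on a toy graph. The sketch below uses the Edmonds-Karp max-flow algorithm as a stand-in; the specification does not say which max-flow/min-cut solver is used, and the four-pixel graph and its integer weights are invented for the example.

```python
from collections import deque

def min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow; returns (cut value, set of nodes on the source side)."""
    # Residual capacities, with reverse entries so flow can be undone.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    def reachable():
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable()
        if sink not in parent:
            # Nodes still reachable from the source form the foreground side.
            return flow, set(parent)
        # Walk back from the sink to find the augmenting path and its bottleneck.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        flow += push

# Four pixels in a row: p0 and p1 look like foreground, p2 and p3 like background.
capacity = {
    "s": {"p0": 9, "p1": 8, "p2": 2, "p3": 1},
    "p0": {"t": 1, "p1": 5},
    "p1": {"t": 2, "p0": 5, "p2": 1},
    "p2": {"t": 8, "p1": 1, "p3": 5},
    "p3": {"t": 9, "p2": 5},
    "t": {},
}
cut_value, source_side = min_cut(capacity, "s", "t")
object_pixels = sorted(source_side - {"s"})
```

Here the cheapest cut severs the weak link between p1 and p2 together with the weak terminal edges, so `object_pixels` contains p0 and p1, the nodes that remain attached to the foreground node.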

According to an embodiment of the present invention, because the image processing apparatus 100 sets the edge weights using the skeleton weight map, the object region can be confined to the vicinity of the skeleton.

FIG. 6 shows a color image according to an embodiment of the present invention, FIG. 7 shows an image in which skeleton information extracted from a depth image is projected onto the color image according to an embodiment of the present invention, and FIG. 8 shows an image in which the skeleton information is projected onto the depth image according to an embodiment of the present invention.

When an object is extracted from a complex color image such as that of FIG. 6 using only color information, background areas whose colors resemble those of the object region (for example, a human region) are recognized as part of the same object and segmented together with it, as shown in FIG. 9.

FIGS. 10 and 11 show the regions obtained step by step when the object region (i.e., the human region) is extracted using only depth information. FIG. 11 is a binary image in which the pixels of the depth image at or above a specific threshold are shown in black. The object region is roughly extracted, but the boundary of the extracted region is considerably inaccurate because of the inaccurate information in the depth image.

FIG. 12 shows the result of extracting the object region from a color image using color information and skeleton information according to an embodiment of the present invention. Because skeleton information is used in addition to color information, the object region can be accurately separated (extracted) even in images where the background is complex or similar in color to the object region, or where the depth difference between the foreground and the background is small.

FIG. 13 is a block diagram schematically showing the internal configuration of an image processing apparatus according to an embodiment of the present invention.

Referring to FIG. 13, the image processing apparatus 100 according to an embodiment of the present invention includes an input unit 1310, a joint information generation unit 1315, a correction unit 1320, a skeleton generation unit 1325, a weight map generation unit 1330, a setting unit 1335, an extraction unit 1340, a memory 1345, and a processor 1350.

The input unit 1310 is a means for acquiring a color image and depth information from at least one sensor.

As another example, the input unit 1310 may acquire a color image and joint information on an object from a plurality of sensors. As described above, the image processing apparatus 100 may acquire joint information on the object through a sensor such as a Kinect. When the image processing apparatus 100 acquires the joint information in this way, the joint information generation unit 1315 described below may of course be omitted from the configuration.

The joint information generation unit 1315 is a means for generating joint information on the object using the acquired depth information.

The correction unit 1320 is a means for correcting the joint information generated by the joint information generation unit 1315 or acquired through the input unit 1310. As described above, the correction unit 1320 can correct the joint information using a Kalman filter. The detailed operation of the Kalman filter is well known to those skilled in the art, so a separate description is omitted.
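For readers unfamiliar with the filter, the following is a minimal sketch of a scalar Kalman filter applied to one coordinate of one joint over successive frames. The random-walk state model and the noise variances `process_var` and `meas_var` are assumptions chosen for illustration; the specification does not describe the filter configuration.

```python
def kalman_smooth(measurements, process_var=1e-2, meas_var=1.0):
    """Scalar random-walk Kalman filter over a sequence of noisy coordinates."""
    estimate, error = measurements[0], 1.0
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed to stay put, but uncertainty grows.
        error += process_var
        # Update: blend prediction and measurement according to the Kalman gain.
        gain = error / (error + meas_var)
        estimate += gain * (z - estimate)
        error *= (1.0 - gain)
        smoothed.append(estimate)
    return smoothed

# x-coordinate of one joint over six frames; 140.0 is a tracking glitch.
xs = [100.0, 103.0, 97.0, 140.0, 101.0, 99.0]
smooth = kalman_smooth(xs)
```

The glitch at the fourth frame is pulled much closer to the surrounding values in `smooth` than in the raw sequence, which is the kind of correction the joint positions benefit from before the skeleton is built.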

The skeleton generation unit 1325 is a means for generating skeleton information using the joint information. The skeleton information can be generated by connecting the joints acquired for the object.
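Connecting joints into a skeleton can be sketched as follows. The joint names and the `BONES` connectivity list are hypothetical; an actual sensor such as a Kinect defines its own joint set and hierarchy.

```python
# Hypothetical joint pairs that form bones (not taken from the specification).
BONES = [("head", "neck"), ("neck", "torso"), ("neck", "l_hand"), ("neck", "r_hand")]

def build_skeleton(joints, bones=BONES):
    """Return line segments ((x1, y1), (x2, y2)) for bones whose two joints are both present."""
    return [(joints[a], joints[b]) for a, b in bones if a in joints and b in joints]

# Joint positions projected onto the color image; the right hand was not detected.
joints = {"head": (50, 10), "neck": (50, 25), "torso": (50, 60), "l_hand": (20, 55)}
skeleton = build_skeleton(joints)
```

Bones with a missing endpoint are simply skipped, so `skeleton` holds three segments for the four detected joints.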

The weight map generation unit 1330 is a means for generating a skeleton weight map in which a skeleton weight is set for each pixel of the color image using the skeleton information.

The skeleton weight for each pixel can be set such that the weight decreases the farther the pixel of the color image is from the skeleton position included in the skeleton information.
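A distance-based skeleton weight map can be sketched as below. The Gaussian falloff and the parameter `sigma` are assumptions; the specification only requires that the weight decrease with distance from the skeleton.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length2 = dx * dx + dy * dy
    # Parameter t of the closest point on the segment, clamped to [0, 1].
    t = 0.0 if length2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def skeleton_weight_map(width, height, segments, sigma=5.0):
    """Weight 1.0 on the skeleton, decaying with distance to the nearest bone segment."""
    weights = []
    for y in range(height):
        row = []
        for x in range(width):
            d = min(point_segment_distance((x, y), a, b) for a, b in segments)
            row.append(math.exp(-(d * d) / (2.0 * sigma ** 2)))
        weights.append(row)
    return weights

# One vertical bone from (10, 2) to (10, 12) in a 20x15 image.
segments = [((10, 2), (10, 12))]
wmap = skeleton_weight_map(20, 15, segments)
```

The map is indexed as `wmap[y][x]`: pixels on the bone have weight 1.0, and the weight falls off smoothly on both sides and beyond the bone's endpoints.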

As another example, the skeleton weight for each pixel may be assigned by varying the degree of weight reduction according to the reliability corresponding to each joint of the skeleton included in the skeleton information.

As yet another example, the skeleton weight for each pixel may be assigned by varying the degree of weight reduction according to the depth value of the skeleton included in the skeleton information. This is the same as described above, so a redundant description is omitted.
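The two variations above can be sketched by making the falloff width depend on joint reliability and skeleton depth. The specific rule below, including the direction of the scaling (a wider falloff for more reliable joints and for skeletons closer to the camera), the reference depth `ref_depth_m`, and the Gaussian form are assumptions made purely for illustration.

```python
import math

def falloff_sigma(base_sigma, confidence, depth_m, ref_depth_m=2.0):
    """Hypothetical rule: wider falloff for confident joints and for nearby (larger-looking) bodies."""
    return base_sigma * confidence * (ref_depth_m / depth_m)

def skeleton_weight(distance_px, sigma):
    """Gaussian falloff of the skeleton weight with pixel distance from the skeleton."""
    return math.exp(-(distance_px ** 2) / (2.0 * sigma ** 2))

# The same pixel, 10 px from the skeleton, under three conditions.
near_confident = skeleton_weight(10.0, falloff_sigma(8.0, confidence=1.0, depth_m=2.0))
far_confident = skeleton_weight(10.0, falloff_sigma(8.0, confidence=1.0, depth_m=4.0))
near_uncertain = skeleton_weight(10.0, falloff_sigma(8.0, confidence=0.5, depth_m=2.0))
```

Under this rule a body twice as far away, which appears roughly half as wide in the image, receives a falloff half as wide, so pixels at the same image distance from the skeleton get a smaller weight.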

The setting unit 1335 is a means for setting a color model for the object region and a color model for the background region using the color information of each pixel of the color image.

This is the same as described above, so a redundant description is omitted.
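As a rough illustration of what setting the two color models could look like, the sketch below builds a quantized RGB histogram from sample pixels of each region. The histogram representation, the bin count `bins`, and the `floor` value for unseen colors are assumptions; this passage does not specify the form of the color models, and other representations such as Gaussian mixtures are equally possible.

```python
from collections import Counter

def quantize(color, bins=4):
    """Map an (R, G, B) color with 0-255 channels to a coarse histogram bin."""
    step = 256 // bins
    return tuple(ch // step for ch in color)

def build_color_model(pixels, bins=4):
    """Normalized histogram {bin: probability} over the given sample pixels."""
    counts = Counter(quantize(p, bins) for p in pixels)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}

def likelihood(model, color, bins=4, floor=1e-6):
    """Probability of the color under the model; unseen bins get a small floor value."""
    return model.get(quantize(color, bins), floor)

# Hypothetical sample pixels for each region.
object_model = build_color_model([(200, 40, 40), (210, 50, 45), (190, 60, 30)])
background_model = build_color_model([(20, 20, 220), (30, 25, 200), (200, 200, 200)])

pixel = (205, 45, 42)
looks_like_object = likelihood(object_model, pixel) > likelihood(background_model, pixel)
```

A reddish pixel falls into a bin that is well populated in the object model and absent from the background model, so it is judged more similar to the object region.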

The extraction unit 1340 is a means for extracting the object region from the color image using the color information on the object and the skeleton information.

As described above, the extraction unit 1340 can generate a graph structure in which each pixel of the color image is a node, assign weights to the edges between nodes, the edges between each node and the foreground node (the node for the object), and the edges between each node and the background node, and then extract the object region by cutting edges using the graph cut algorithm.

This is the same as described above with reference to FIG. 1, so a redundant description is omitted.

The memory 1345 is a means for storing the various algorithms required to perform the method of extracting an object region using the color information and skeleton information of a color image according to an embodiment of the present invention, as well as the various data derived in the process.

The processor 1350 is a means for controlling the internal components (for example, the input unit 1310, the joint information generation unit 1315, the correction unit 1320, the skeleton generation unit 1325, the weight map generation unit 1330, the setting unit 1335, the extraction unit 1340, the memory 1345, and the like) of the image processing apparatus according to an embodiment of the present invention.

Meanwhile, the components of the embodiments described above can be readily understood from a process point of view. That is, each component can be understood as a process. Likewise, the processes of the embodiments described above can be readily understood from the point of view of the components of the apparatus.

In addition, the technical content described above may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the embodiments, or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

100: Image processing apparatus
1310: Input unit
1315: Joint information generation unit
1320: Correction unit
1325: Skeleton generation unit
1330: Weight map generation unit
1335: Setting unit
1340: Extraction unit
1345: Memory
1350: Processor

Claims

1. An object region extraction method comprising:
receiving a color image;
receiving joint information on an object;
generating skeleton information on the object using the joint information; and
extracting an object region using color information on the object in the color image and the skeleton information.

2. An object region extraction method comprising:
receiving a color image and depth information;
generating joint information using the depth information;
generating skeleton information on an object using the joint information; and
extracting an object region using color information on the object in the color image and the skeleton information.

3. The method according to claim 1 or 2, further comprising,
before the step of extracting the object region,
generating a skeleton weight map for each pixel of the color image using the skeleton information.

4. The method of claim 3,
wherein the skeleton weight map is generated such that the weight decreases as each pixel of the color image is farther from the skeleton position included in the skeleton information.

5. The method of claim 3,
wherein the skeleton weight map is generated by reducing the weights differently according to the reliability corresponding to each joint of the skeleton included in the skeleton information.

6. The method of claim 3,
wherein the skeleton weight map is generated by reducing the weight according to the depth value of the skeleton included in the skeleton information.

7. The method of claim 3,
wherein the extracting of the object region comprises:
generating a graph structure having each pixel of the color image as a node, and connecting each node to a foreground node and a background node;
assigning weights to the edges between the nodes, the edges between each node and the foreground node, or the edges between each node and the background node in the graph structure; and
extracting the object region by cutting edges in the graph structure using a GraphCut algorithm.

8. The method of claim 7,
wherein the assigning of the weights comprises:
assigning the weight of the edge connected to the foreground node so that it increases as the color of each pixel is more similar to a color model of a predetermined object region, and assigning the weight of the edge connected to the background node so that it increases as the color of each pixel is more similar to a color model of a predetermined background region.

9. The method of claim 7,
wherein the assigning of the weights comprises:
assigning the weight of the edge connecting each node and the foreground node so that it increases as the skeleton weight included in the skeleton weight map for the corresponding pixel increases.

10. A computer-readable recording medium having recorded thereon a program code for performing the method according to any one of claims 1 to 9.

11. An image processing apparatus comprising:
an input unit for acquiring a color image and joint information on an object from a plurality of sensors, respectively;
a skeleton generation unit for generating skeleton information on the object using the joint information; and
an extraction unit for extracting an object region using color information on the object in the color image and the skeleton information.

12. An image processing apparatus comprising:
an input unit for acquiring a color image and depth information from at least one sensor;
a joint information generation unit for generating joint information using the depth information;
a skeleton generation unit for generating skeleton information on an object using the joint information; and
an extraction unit for extracting an object region using color information on the object in the color image and the skeleton information.

13. The apparatus according to claim 11 or 12,
further comprising a weight map generation unit for generating a skeleton weight map for each pixel of the color image using the skeleton information,
wherein the extraction unit extracts the object region using the skeleton weight map.

14. The apparatus of claim 13,
wherein the extraction unit
generates a graph structure having each pixel of the color image as a node, assigns a weight to the edge between each node and a foreground node, which is the target of segmenting the object region, using a color model for the object and the skeleton weight map,
and extracts the object region by cutting edges in the graph structure using a GraphCut algorithm.