KR20190087764A

KR20190087764A - Apparatus and method for encoding Kinect video data

Info

Publication number: KR20190087764A
Application number: KR1020180005957A
Authority: KR
Inventors: 오중선; 김연우; 김태암; 조재형
Original assignee: 한국전력공사; (유)아홉
Priority date: 2018-01-17
Filing date: 2018-01-17
Publication date: 2019-07-25
Also published as: KR102424814B1

Abstract

The present invention relates to an apparatus for encoding Kinect video data and a method thereof. According to one embodiment of the present invention, the apparatus for encoding Kinect video data comprises: an offset correction unit for correcting an offset for a range difference of depth video data input from a Kinect; a normalization processing unit for performing a normalization process to include a depth value of the corrected depth video data in a maximum bit range which can be expressed; and a depth video encoding unit for outputting a depth bit stream by performing an encoding process on the normalized depth video data. According to the present invention, it is possible to compress data without data loss.

Description

The present invention relates to an apparatus and method for encoding Kinect image data, and more particularly, to an apparatus and method for encoding Kinect image data that reduce data loss by separating the Kinect image data acquired from a Kinect (i.e., RGB image data and depth image data) and applying an encoding scheme appropriate to each data type.

3D video is attracting attention as a next-generation multimedia content format and is expected to replace 2D video. For such 3D video, depth information can be obtained directly from objects using a Kinect, which is based on an active sensor.

The term "Kinect" refers to a peripheral device used in connection with the Xbox 360 that allows users to experience games and entertainment using their bodies, without a controller.

Here, the Kinect represents objects in three dimensions with the center point of the infrared camera as the origin. The Z axis is perpendicular to the image plane; the X axis is perpendicular to the Z axis and points from the infrared camera toward the laser projector; the Y axis is perpendicular to both the Z axis and the X axis.

The Kinect consists of an RGB camera, an infrared sensor, an infrared projector, and four microphones. The RGB camera acquires color information, while the infrared sensor and the infrared projector emit infrared light toward objects in front of the device on a per-pixel basis and receive the reflected light to obtain distance information.

The sensors can obtain a color view, a depth view representing the depth information of the image, and a skeleton view representing the skeleton of an object. The sensors detect 47 parts of the human body 30 times per second.

Depth image data represents, for each pixel, the relative distance between the Kinect and the object, and the representation of this data as image-form information is called a depth map. Pixels close to the camera are bright, i.e., have high values, and pixels farther away have lower values.

Referring to FIG. 1, depth image data is represented as 16 bits of data in total: 3 bits are a player index, which is information for detecting a human shape, and 13 bits are depth bits. Of the 13 depth bits, 12 bits carry the depth information of each pixel, and 1 bit indicates whether the depth could not be measured. FIG. 1 is a diagram showing a frame of depth image data.
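The 16-bit layout described above can be illustrated with a minimal Python sketch. The field widths (3 + 12 + 1 bits) come from the description; the bit positions are an assumption made only for illustration (player index in the three least significant bits, the unmeasurable flag in the most significant bit), and the function name `unpack_depth_sample` is hypothetical.

```python
PLAYER_INDEX_BITS = 3   # width stated in the description
DEPTH_VALUE_BITS = 12   # width stated in the description

# Assumed layout (NOT stated in the text), most to least significant bit:
# [1-bit unmeasurable flag | 12-bit depth value | 3-bit player index]


def unpack_depth_sample(raw):
    """Split one 16-bit depth sample into (player_index, depth, unmeasurable)."""
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("raw sample must fit in 16 bits")
    player_index = raw & ((1 << PLAYER_INDEX_BITS) - 1)
    depth_bits = raw >> PLAYER_INDEX_BITS              # the 13 depth bits
    depth = depth_bits & ((1 << DEPTH_VALUE_BITS) - 1)
    unmeasurable = bool(depth_bits >> DEPTH_VALUE_BITS)
    return player_index, depth, unmeasurable


# A sample carrying depth 1500 for player 2, with a valid measurement.
sample = (1500 << PLAYER_INDEX_BITS) | 2
print(unpack_depth_sample(sample))
```

Only the 12-bit depth value is what the encoder described later needs to preserve.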

The depth map plays an important role in 3D video synthesis. Compressing it efficiently saves additional bits and, as a result, can improve quality when video is transmitted, stored, and played back.

However, because 2D video codecs are not designed with algorithms that account for depth image data, a standardized way of encoding/decoding depth image data has not yet been systematically established.

Accordingly, there is a need for a scheme capable of encoding/decoding 3D video that includes depth image data.

Korean Registered Patent No. 10-1603467 (registered on March 8, 2016)

An object of the present invention is to provide an apparatus and method for encoding/decoding Kinect image data that reduce data loss by separating the Kinect image data acquired from a Kinect (i.e., RGB image data and depth image data) and applying an encoding scheme appropriate to each data type.

An apparatus for encoding Kinect image data according to an embodiment of the present invention may include: an offset correction unit for correcting an offset for a range difference of depth image data input from a Kinect; a normalization processing unit for performing normalization so that the depth values of the corrected depth image data fall within the maximum representable bit range; and a depth image encoding unit for encoding the normalized depth image data and outputting a depth bit stream.

According to an embodiment, the apparatus may further include an RGB image encoding unit for encoding RGB image data input from the Kinect.

The depth image encoding unit may perform the encoding process using an H.265/HEVC codec.

The RGB image encoding unit may perform the encoding process using an H.264 codec.

The offset correction unit may set the minimum value of the range of the depth image data to the zero point.

The maximum representable bit range may be 12 bits.

In addition, a method for encoding Kinect image data according to an embodiment of the present invention may include: correcting an offset for a range difference of depth image data input from a Kinect; performing normalization so that the depth values of the corrected depth image data fall within the maximum representable bit range; and encoding the normalized depth image data and outputting a depth bit stream.

According to an embodiment, the method may further include encoding RGB image data input from the Kinect.

The present invention can reduce data loss by separating the Kinect image data acquired from a Kinect (i.e., RGB image data and depth image data) and applying an encoding scheme appropriate to each data type.

In addition, the present invention can compress depth image data without data loss by encoding it with the H.265/HEVC codec.

In addition, the present invention enables 3D image production based on an active sensor, so that high-performance 3D images can be generated and stored, and 3D images can be synthesized from data, at low cost.

In addition, the present invention compresses 2D images using the H.265/HEVC codec and additionally stores depth image data in an extension profile, so that it can be widely used for data encoding in 3D image production.

In addition, the present invention can also be actively used in mobile devices, which are expected to be constrained by capacity overhead during 3D image rendering.

In addition, by adopting Monochrome 12, an extension profile of the H.265/HEVC codec, the present invention can hold depth image data without wasted space.

In addition, the present invention can be used efficiently in systems that process information in a data-centric manner, extracting features and learning from them using machine learning.

In addition, when mainly motion is processed, the present invention can represent three dimensions while reducing the volume of data to be processed, so that data storage and processing can be implemented easily.

In addition, because the H.265/HEVC codec offers a higher compression ratio than other codecs and can process high-resolution image data at a small size, the present invention provides an advantage when processing 3D video that includes depth image data.

FIG. 1 is a diagram showing a frame of depth image data.
FIG. 2 is a diagram showing a Kinect image data encoding apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram showing a Kinect image data decoding apparatus according to an embodiment of the present invention.
FIG. 4 is a diagram showing a Kinect image data encoding/decoding method according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the following description and the accompanying drawings, detailed descriptions of well-known functions or configurations that may obscure the gist of the present invention are omitted. It should also be noted that the same components are denoted by the same reference numerals wherever possible throughout the drawings.

The terms and words used in this specification and the claims described below should not be construed as limited to their ordinary or dictionary meanings; rather, based on the principle that an inventor may appropriately define terms to describe his or her own invention in the best way, they should be construed with meanings and concepts consistent with the technical idea of the present invention.

Therefore, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical ideas, and it should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing of this application.

In the accompanying drawings, some components are exaggerated, omitted, or schematically illustrated, and the size of each component does not entirely reflect its actual size. The present invention is not limited by the relative sizes or spacings drawn in the accompanying drawings.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, when a part is said to be "connected" to another part, this includes not only being "directly connected" but also being "electrically connected" with another element in between.

Singular expressions include plural expressions unless the context clearly indicates otherwise. Terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In addition, the term "unit" as used in the specification refers to software or a hardware component such as an FPGA or ASIC, and a "unit" performs certain roles. However, a "unit" is not limited to software or hardware. A "unit" may be configured to reside in an addressable storage medium or may be configured to run on one or more processors. Thus, as an example, a "unit" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functionality provided within components and "units" may be combined into a smaller number of components and "units" or further separated into additional components and "units".

Embodiments of the present invention are described in detail below with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice them. However, the present invention may be implemented in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted in order to describe the present invention clearly, and similar parts are denoted by similar reference numerals throughout the specification.

Hereinafter, preferred embodiments of the present invention will be described with reference to the accompanying drawings.

FIG. 2 is a diagram showing a Kinect image data encoding apparatus according to an embodiment of the present invention, and FIG. 3 is a diagram showing a Kinect image data decoding apparatus according to an embodiment of the present invention.

As shown in FIG. 2, the Kinect image data encoding apparatus 100a according to an embodiment of the present invention can reduce data loss by separating the Kinect image data acquired from the Kinect 10 (i.e., RGB image data and depth image data) and applying an encoding scheme appropriate to each data type.

As shown in FIG. 1, depth image data can be expressed as a value of up to 12 bits, and encoding (compressing) it with a video codec can be considered as a way to compress it.

The Kinect 10 can generate RGB image data and a depth map of 640×480 pixels. The depth map is expressed as data, and an encoding process is required to deliver it as a stream or signal.
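To give a sense of why compression is needed, the following rough Python calculation estimates the size of the uncompressed depth data. It combines the 640×480 resolution with the 16-bit sample size described for FIG. 1; treating the sensing rate of 30 times per second quoted earlier as the frame rate is an assumption, and the function names are hypothetical.

```python
WIDTH, HEIGHT = 640, 480       # depth map resolution from the description
BITS_PER_DEPTH_SAMPLE = 16     # 3-bit player index + 13 depth bits
FRAMES_PER_SECOND = 30         # assumption: sensing rate taken as frame rate


def raw_depth_bytes_per_frame(width=WIDTH, height=HEIGHT,
                              bits=BITS_PER_DEPTH_SAMPLE):
    """Size of one uncompressed depth frame in bytes."""
    return width * height * bits // 8


def raw_depth_bytes_per_second(fps=FRAMES_PER_SECOND):
    """Uncompressed depth data rate in bytes per second."""
    return raw_depth_bytes_per_frame() * fps


print(raw_depth_bytes_per_frame())    # 614400
print(raw_depth_bytes_per_second())   # 18432000
```

Under these assumptions the depth stream alone is roughly 18 MB per second before encoding.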

In general, a codec is used when encoding video information. A codec is software or hardware that encodes and decodes an arbitrary data stream.

Here, the case in which RGB image data is encoded with the H.264 codec and depth image data is encoded with the H.265/HEVC (High Efficiency Video Coding) codec is described. The invention is not limited thereto, and, for example, MPEG-1, MPEG-2, MPEG-4, H.264/AVC, MVC, SVC, and the like may also be applied.

In particular, the H.265/HEVC codec is a next-generation video compression technology developed jointly by the video coding expert groups of ISO/IEC MPEG and ITU-T, which developed the existing H.264 codec. The H.265/HEVC codec has a number of profiles: version 1 centers on Main, Main 10, and Main Still Picture, and version 2 added 21 extension profiles. The extension profiles cover various elements such as bit depth, 4:2:2/4:4:4 chroma sampling, multi-view video coding (MVC), and scalable video coding (SVC).

Accordingly, the H.265/HEVC codec is used with the version 2 Monochrome 12 profile, or a higher profile, to represent the 12-bit depth image data.

This is because the range of input values of the depth image data can be expressed within a 12-bit range.

In other words, if depth image data is compressed with an H.264 codec that encodes 8 bits per pixel, information loss may occur. Because the readability of depth image data depends on whether it can be decoded without data loss after encoding, a compression scheme without information loss is required. For example, in a sign language recognition system, the ability to decode precise depth image data increases the readability of hand gestures, so a compression scheme that is as lossless as possible is required.
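The loss caused by squeezing 12-bit depth values into 8-bit samples can be illustrated with a small Python sketch. It models only the reduction in bit depth (dropping the four least significant bits), not the behavior of any particular H.264 encoder, and the helper names are hypothetical.

```python
def to_8bit(depth12):
    """Keep only the 8 most significant bits of a 12-bit depth value."""
    return depth12 >> 4


def back_to_12bit(depth8):
    """Expand an 8-bit sample back to the 12-bit scale."""
    return depth8 << 4


original = [1500, 1501, 1507, 1515]
restored = [back_to_12bit(to_8bit(d)) for d in original]
errors = [o - r for o, r in zip(original, restored)]

print(restored)  # [1488, 1488, 1504, 1504]
print(errors)    # [12, 13, 3, 11]
```

Four distinct depth values collapse into two, which is the kind of fine detail a gesture recognizer would lose.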

Accordingly, the depth image data is not encoded with the H.264 codec in the same way as the RGB image data, but is separated from the RGB image data and encoded with the H.265/HEVC codec.

Referring again to FIG. 2, to encode the depth image data with the H.265/HEVC codec, the Kinect image data encoding apparatus 100a needs to convert the data into the format expected by the H.265/HEVC codec. To this end, the Kinect image data encoding apparatus 100a performs a preprocessing process that removes the offset, which depends on the Kinect version, and converts the data to fit within 12 bits.

The Kinect image data encoding apparatus 100a includes an RGB image encoding unit 110a, a depth image preprocessing unit 210a, and a depth image encoding unit 220a.

The RGB image encoding unit 110a encodes the RGB image data input from the Kinect 10 and outputs an RGB bit stream. Here, the RGB image encoding unit 110a encodes the RGB image data using the H.264 codec.

The depth image preprocessing unit 210a performs a preprocessing process so that the depth image data input from the Kinect 10 can be encoded with the H.265/HEVC codec. The depth image preprocessing unit 210a includes an offset correction unit 211 and a normalization processing unit 212.

The offset correction unit 211 corrects the offset for the range difference of the depth image data, which varies with the Kinect version. That is, when the range of the depth image data differs by Kinect version, for example 0 to 4096 or 500 to 4500, the offset correction unit 211 performs offset correction that aligns the offset to the zero point.

When the range of the depth image data is, for example, 500 to 4500, so that the minimum depth value is 500, the offset correction unit 211 performs offset correction as in Equation 1 below to adjust the depth values to a zero-based reference.
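Equation 1 is not reproduced in this text. One plausible reading, consistent with the description that the minimum of the range is aligned to the zero point, is a simple subtraction of the range minimum. The Python sketch below follows that assumption; `correct_offset` and its parameters are hypothetical names, and rejecting values below the minimum is a choice made here, not something the text specifies.

```python
def correct_offset(depth_values, range_min):
    """Shift depth values so that range_min maps to zero.

    Assumed form of the offset correction: d' = d - range_min.
    """
    if any(d < range_min for d in depth_values):
        raise ValueError("depth value below the stated range minimum")
    return [d - range_min for d in depth_values]


# A 500..4500 range becomes 0..4000 after correction.
print(correct_offset([500, 2500, 4500], range_min=500))  # [0, 2000, 4000]
```

With this reading, a 500 to 4500 range spans 4001 values, which already fits in 12 bits.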

After the offset correction unit 211 has performed the offset correction, the normalization processing unit 212 normalizes the corrected depth values so that they fall within the 12-bit range (i.e., 4096 values).

That is, the normalization processing unit 212 uses the values as they are when the maximum value does not exceed the 12-bit range (i.e., 4096 values), and performs normalization when the maximum value exceeds the 12-bit range.

The normalization processing unit 212 normalizes the depth values as in Equation 2 below.
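Equation 2 is likewise not reproduced in this text. The Python sketch below shows one way the described behavior could be realized: values are passed through unchanged when the maximum fits in 12 bits, and are otherwise rescaled linearly onto 0 to 4095. The linear rescaling with rounding is an assumption, and `normalize_to_12bit` is a hypothetical name.

```python
MAX_12BIT = (1 << 12) - 1  # 4095, the largest 12-bit value


def normalize_to_12bit(depth_values, range_max):
    """Fit offset-corrected depth values into the 12-bit range.

    If range_max already fits, the values are returned unchanged.
    Otherwise they are rescaled linearly (assumed form of Equation 2).
    """
    if range_max <= MAX_12BIT:
        return list(depth_values)
    # Integer scaling with round-to-nearest.
    return [(d * MAX_12BIT + range_max // 2) // range_max
            for d in depth_values]


print(normalize_to_12bit([0, 2000, 4000], range_max=4000))  # unchanged
print(normalize_to_12bit([0, 4096, 8190], range_max=8190))  # [0, 2048, 4095]
```

Note that only the pass-through branch preserves every value exactly; once rescaling is applied, neighboring depth values can map to the same 12-bit code.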

The depth image encoding unit 220a encodes the depth image data normalized by the normalization processing unit 212 and outputs a depth bit stream. Here, the depth image encoding unit 220a encodes the depth image data using the H.265/HEVC codec.

The RGB bit stream and the depth bit stream produced in this way can be stored as files or transmitted over a network.

Referring to FIG. 3, the Kinect image data decoding apparatus 100b decodes the RGB bit stream and the depth bit stream. That is, to play back the RGB bit stream and the depth bit stream or to extract information from them, the Kinect image data decoding apparatus 100b performs the process of the Kinect image data encoding apparatus 100a described above in reverse.

The Kinect image data decoding apparatus 100b includes an RGB image decoding unit 110b, a depth image decoding unit 220b, and a depth image post-processing unit 210b.
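The text states only that the decoding side reverses the encoding process. A minimal Python sketch of what the depth image post-processing could look like under that statement is given below: it undoes an assumed linear normalization and then restores the offset. How the range minimum and maximum reach the decoder is not described in the text, so they are simply passed in as parameters here; all names are hypothetical.

```python
MAX_12BIT = (1 << 12) - 1  # 4095


def denormalize(values, span):
    """Undo the assumed linear 12-bit normalization for a range of width span."""
    if span <= MAX_12BIT:
        return list(values)          # nothing was rescaled on the encoder side
    return [(v * span + MAX_12BIT // 2) // MAX_12BIT for v in values]


def restore_offset(values, range_min):
    """Add back the range minimum removed by offset correction."""
    return [v + range_min for v in values]


def postprocess_depth(decoded_values, range_min, range_max):
    """Reverse the preprocessing: denormalize, then restore the offset."""
    span = range_max - range_min
    return restore_offset(denormalize(decoded_values, span), range_min)


print(postprocess_depth([0, 2000, 4000], range_min=500, range_max=4500))
# [500, 2500, 4500]
```

When the range width fits in 12 bits, this round trip returns the original depth values exactly, provided the codec itself is operated losslessly.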

The RGB bit stream can be decoded and played back on a screen, and the depth bit stream can be decoded and rendered as a black-and-white image. In the latter case, each pixel is converted to a value from 0 to 255.
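The per-pixel conversion to the 0 to 255 range for display might look like the following Python sketch. The text does not give the conversion formula, so linear scaling with rounding is assumed here, and `to_display_gray` is a hypothetical name.

```python
MAX_12BIT = 4095
MAX_8BIT = 255


def to_display_gray(depth12):
    """Map a 12-bit depth value onto 0..255 for black-and-white display."""
    if not 0 <= depth12 <= MAX_12BIT:
        raise ValueError("expected a 12-bit depth value")
    return (depth12 * MAX_8BIT + MAX_12BIT // 2) // MAX_12BIT


print([to_display_gray(d) for d in (0, 2048, 4095)])  # [0, 128, 255]
```

This reduction is for viewing only; the decoded 12-bit values remain available for analysis.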

A sign language recognition system can decode the depth bit stream, extract feature vectors, and use them for machine learning.

FIG. 4 is a diagram showing a Kinect image data encoding/decoding method according to an embodiment of the present invention.

The Kinect image data encoding apparatus 100a encodes the RGB image data and outputs an RGB bit stream (S101). Here, the Kinect image data encoding apparatus 100a uses the H.264 codec.

At the same time, the Kinect image data encoding apparatus 100a performs offset correction on the depth image data (S201). Here, the Kinect image data encoding apparatus 100a adjusts the minimum value of the range of the depth image data to the zero point.

Thereafter, the Kinect image data encoding apparatus 100a normalizes the corrected depth image data (S202). Here, the Kinect image data encoding apparatus 100a performs the normalization so that the corrected depth values fall within the maximum representable bit range (i.e., 12 bits).

Then, the Kinect image data encoding apparatus 100a encodes the depth image data and outputs a depth bit stream (S203). Here, the Kinect image data encoding apparatus 100a uses the H.265/HEVC codec.
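Steps S201 to S203 can be strung together as in the following Python sketch. It is an illustration under several assumptions: the offset correction is a subtraction, the normalization is linear, each 12-bit sample is packed into a 16-bit little-endian word before encoding, and the actual H.265/HEVC encoder is represented by a caller-supplied function (`encoder`) rather than by any real library call. All names are hypothetical.

```python
import struct

MAX_12BIT = (1 << 12) - 1  # 4095


def preprocess_depth_frame(depth_values, range_min, range_max):
    """S201 and S202: offset correction followed by normalization."""
    shifted = [d - range_min for d in depth_values]          # S201
    span = range_max - range_min
    if span > MAX_12BIT:                                     # S202
        shifted = [(d * MAX_12BIT + span // 2) // span for d in shifted]
    return shifted


def pack_12bit_samples(samples):
    """Store each 12-bit sample in a 16-bit little-endian word."""
    return struct.pack("<%dH" % len(samples), *samples)


def encode_depth_frame(depth_values, range_min, range_max, encoder):
    """S203: hand the packed samples to an encoder callable.

    `encoder` stands in for an H.265/HEVC encoder and is an assumption.
    """
    samples = preprocess_depth_frame(depth_values, range_min, range_max)
    return encoder(pack_12bit_samples(samples))


# Identity "encoder" used only to show the bytes that would be encoded.
packed = encode_depth_frame([500, 4500], 500, 4500, encoder=lambda b: b)
print(packed)  # b'\x00\x00\xa0\x0f'
```

Passing the encoder in as a parameter keeps the sketch independent of any specific codec implementation.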

The method according to some embodiments may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

Although the above description has focused on the novel features of the present invention as applied to various embodiments, those skilled in the art will understand that various deletions, substitutions, and changes in the form and details of the apparatus and method described above are possible without departing from the scope of the present invention. Accordingly, the scope of the present invention is defined by the appended claims rather than by the above description. All modifications within the range of equivalency of the claims are embraced within the scope of the present invention.

10: Kinect 110a: RGB image encoding unit
210a: depth image preprocessing unit 211: offset correction unit
212: normalization processing unit 220a: depth image encoding unit
110b: RGB image decoding unit 210b: depth image post-processing unit
220b: depth image decoding unit

Claims

1. An apparatus for encoding Kinect image data, comprising:
an offset correction unit for correcting an offset for a range difference of depth image data input from a Kinect;
a normalization processing unit for performing normalization so that the depth values of the corrected depth image data fall within the maximum representable bit range; and
a depth image encoding unit for encoding the normalized depth image data and outputting a depth bit stream.

2. The apparatus according to claim 1,
further comprising an RGB image encoding unit for encoding RGB image data input from the Kinect.

3. The apparatus according to claim 1,
wherein the depth image encoding unit
performs the encoding process using an H.265/HEVC codec.

4. The apparatus according to claim 2,
wherein the RGB image encoding unit
performs the encoding process using an H.264 codec.

5. The apparatus according to claim 1,
wherein the offset correction unit
sets the minimum value of the range of the depth image data to the zero point.

6. The apparatus according to claim 1,
wherein the maximum representable bit range is 12 bits.

7. A method for encoding Kinect image data, comprising:
correcting an offset for a range difference of depth image data input from a Kinect;
performing normalization so that the depth values of the corrected depth image data fall within the maximum representable bit range; and
encoding the normalized depth image data and outputting a depth bit stream.

8. The method according to claim 7,
further comprising encoding RGB image data input from the Kinect.