KR20100066289A

KR20100066289A - Method and apparatus for providing realistic immersive multimedia services

Info

Publication number: KR20100066289A
Application number: KR1020090030316A
Authority: KR
Inventors: 이봉호; 이광순; 이현; 윤국진; 이용주; 허남호; 김진웅; 이수인
Original assignee: 한국전자통신연구원
Priority date: 2008-12-08
Filing date: 2009-04-08
Publication date: 2010-06-17
Also published as: KR101235832B1

Abstract

PURPOSE: A method and an apparatus for providing realistic immersive multimedia services select the stereoscopic image area corresponding to a sound source of stereoscopic audio by using linkage information between the stereoscopic audio and the stereoscopic image, thereby matching the audio with a specific object in the image. CONSTITUTION: Stereoscopic audio and a stereoscopic image linked to the stereoscopic audio are obtained (302). Linkage information between the stereoscopic audio and the stereoscopic image is obtained (304). Using the linkage information, an area corresponding to a predetermined sound source of the stereoscopic audio is selected from the stereoscopic image being played (306). The linkage information includes identification information for each sound source of the stereoscopic audio and identification information for each area included in the stereoscopic image.

Description

METHOD AND APPARATUS FOR PROVIDING REALISTIC IMMERSIVE MULTIMEDIA SERVICES

The present invention relates to a method and apparatus for providing an immersive multimedia service using stereoscopic audio and stereoscopic images, and to a method and apparatus for reproducing immersive multimedia content that includes stereoscopic audio and stereoscopic images.

The present invention is derived from research conducted as part of the IT Source Technology Development Program of the Ministry of Knowledge Economy [Project management number: 2008-F-011-01, Project title: Next-Generation DTV Core Technology Development (Standardization-Linked)].

Techniques for creating three-dimensional (3D) images include a method that uses left and right binocular images as texture images, and a method that uses a texture image together with an image representing depth information. The stereoscopic image is one such technique. A stereoscopic image uses left and right binocular images captured a fixed distance apart, that is, from two viewpoints, and its depth impression depends on the spacing between the cameras and the distance of the object from the cameras.
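The dependence of the depth impression on camera spacing and object distance can be made concrete with the standard pinhole-camera relation. The sketch below assumes a rectified, parallel two-camera rig, which the source does not specify; the symbols $f$ (focal length), $B$ (camera spacing, or baseline), $Z$ (distance of the object from the cameras), and $d$ (horizontal disparity between the left and right images) are introduced here for illustration only.

```latex
d = \frac{f\,B}{Z}
```

Under this assumption, a wider camera spacing $B$ or a nearer object (smaller $Z$) yields a larger disparity and hence a stronger depth impression.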

A viewer watching a stereoscopic image perceives a 3D effect through a sense of distance that 2D images cannot convey, such as objects appearing to pop out toward the viewer or recede into the screen. Stereoscopic imaging is now being applied to broadcasting, and some broadcasters have begun commercial 3DTV services. In mobile communications, terminals equipped with a stereoscopic camera and able to display stereoscopic images have been released, and stereoscopic image services that deliver related content over mobile networks are expected. Stereoscopic video communication is also expected to become possible, and the related technology is currently under development. Such services are expected to be used not only in the fields mentioned above but on every kind of terminal capable of stereoscopic display and representation.

A stereoscopic image can form a three-dimensional space within the screen. When a stereoscopic image object exists in the image, there is a Z axis that conveys depth in addition to the X and Y axes, so the object can be placed not only left, right, up, or down but also toward the front or back. A specific object can therefore be moved from left to right or from top to bottom, and can also be moved to or placed in front of or behind the screen plane.

This capability gives the user a better spatial 3D effect when used with a format such as MPEG-4 BIFS, in which the screen is composed of one or more image objects, including background video, and supports user interaction, than with simple view-only 3D video, for which local interaction is almost impossible. In a service such as BIFS, user interaction can provide effects such as increasing or decreasing the depth of a desired object and adjusting the brightness or color of an object as needed.

In the audio field as well, advances in digital audio signal processing have recently increased interest in 3D sound. 3D sound is sound to which spatial information has been added so that a listener who is not located in the space where the sound source was produced can perceive direction, distance, and spaciousness. With a playback device that supports 3D sound, a listener can experience the sound as if present at the scene. 3D sound can be applied to films, games, virtual reality, and multimedia content. In every field that requires sound, including teleconferencing over communication networks, immersive audio communication, distance education, broadcasting, and defense, 3D sound is recognized as an important enabling technology for increasing realism and immersion.

3D sound technology is a collective term for methods of generating realistic 3D sound. It can be divided into sound image localization, sound field control, effect sound generation, crosstalk cancellation, externalization, and virtual surround techniques.

Sound image localization places a sound source at a desired virtual position, and sound field control virtually creates the space of a sound source, such as a concert hall or a recording studio. Effect sound technology generates effects such as echo or chorus, and virtual surround technology reproduces 4-channel or 5.1-channel sound in a 2-channel speaker environment.

3D sound gives spatial placement information to the sense of hearing and, as a result, allows stereoscopic image information to be identified, from which the sound source position, the surrounding environment, the position of the listener, the movement of objects, the shape of the sound image, and the like can be obtained. Depending on the reproduction method, 3D sound is divided into the multichannel surround method and the binaural two-channel stereo method.

In general, the transmission path of sound is divided into a spatial transfer system, in which reflection, diffraction, and scattering by materials occur, and a head transfer system, in which reflection, diffraction, and resonance are caused by the listener's head and ears. The main factors by which a person perceives the spatiality of sound reaching the ears are the time difference, the level difference, and the spectral difference between the two sounds arriving at the two ears.

The binaural type refers to a signal recorded with microphones placed at both ears of a listener located in the space where the sound source is produced. When this signal is played back through headphones, it can reproduce a sound image like the one heard directly at the scene. Here, the sound source refers to the position of the object that physically produces the sound, and the sound image refers to the sound source as perceived by a human. The spatial characteristics of the sound source and the sound image do not necessarily coincide, and the more closely they match, the better the quality of the 3D sound. A binaural signal contains not only the position and direction of the sound source but also spatial cues related to the space surrounding the source, that is, the sound field.

3D sound reproduction methods can be divided, by the number of reproduced channels, into two-channel and multichannel methods. The two-channel method exploits the fact that humans perceive sound with two ears: 3D sound generated by sound image localization and sound field control is reproduced in a two-channel headphone environment or a two-speaker environment. Most 3D sound technologies developed to date are two-channel methods.

Binaural two-channel 3D sound generation is divided into a recording method and a filtering method, the latter being widely used in recent years. In the recording method, microphones are inserted into both ears of a listener to record the sound at the scene, which is then played back. In the filtering method, a mono sound is passed through filters called the head-related transfer function (HRTF) and the room transfer function (RTF) to reproduce 3D sound. The HRTF is a database, organized by angle, of impulse responses measured in an anechoic chamber: speakers arranged at various angles on a sphere around a dummy head emit a test signal such as white noise, and the response is measured by microphones mounted in both ears of the dummy head. By selecting from this database the HRTF corresponding to a desired position and convolving it with a simple sound source, the sound image can be localized at that position.
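The filtering method described above can be sketched as follows. This is a minimal illustration, not the implementation of any particular system: the impulse responses in `HRIR_DB` are made-up toy values, and the function names and the angle key `90` are hypothetical. A real system would load measured impulse responses for each angle.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binauralize(mono, hrir_left, hrir_right):
    """Return (left, right) ear signals for one source position."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Hypothetical toy impulse responses for a source on the listener's right:
# the right ear receives the sound earlier and louder than the left ear.
HRIR_DB = {
    90: {"left": [0.0, 0.0, 0.4], "right": [1.0, 0.2, 0.0]},
}

mono = [1.0, 0.5]          # simple mono source
entry = HRIR_DB[90]        # select the response for the desired angle
left, right = binauralize(mono, entry["left"], entry["right"])
```

The delayed, attenuated left-ear response reproduces, in miniature, the interaural time and level differences that the listener uses to place the sound image.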

In the multichannel surround method, taking the 5.1 channels of the Dolby Digital system as an example, 3D sound is generally provided with a center channel, front left and right channels, rear left and right channels, and a subwoofer, as shown in FIG. 1. This system uses the psychoacoustic phenomenon known as auditory masking, in a technique referred to as perceptual digital audio coding. It is used mainly in film and is also used for multichannel sound.

One object of the present invention is to provide a method and apparatus that select the stereoscopic image region corresponding to a sound source of stereoscopic audio by using linkage information between the stereoscopic audio and the stereoscopic image, so that the audio is matched with a specific object located in the stereoscopic image rather than merely synchronized with the image.

Another object of the present invention is to provide a method and apparatus that match stereoscopic audio with a specific object in a stereoscopic image and apply to that object a stereoscopic image effect suited to the audio, thereby providing a more realistic stereoscopic audio and image service.

The objects of the present invention are not limited to those mentioned above. Other objects and advantages of the present invention that are not mentioned can be understood from the following description and will be understood more clearly from the embodiments of the present invention. It will also be readily appreciated that the objects and advantages of the present invention can be realized by the means recited in the claims and combinations thereof.

To achieve these objects, the present invention provides a method of reproducing stereoscopic audio and a stereoscopic image, the method comprising: obtaining stereoscopic audio and a stereoscopic image linked to the stereoscopic audio; obtaining linkage information between the stereoscopic audio and the stereoscopic image; and selecting, using the linkage information, a region corresponding to a predetermined sound source of the stereoscopic audio in the stereoscopic image being reproduced, wherein the linkage information includes identification information for each sound source of the stereoscopic audio and identification information for each region included in the stereoscopic image.

The present invention also provides an apparatus for reproducing stereoscopic audio and a stereoscopic image, the apparatus comprising: an audio and stereoscopic image acquisition unit that obtains stereoscopic audio and a stereoscopic image linked to the stereoscopic audio; a linkage information acquisition unit that obtains linkage information between the stereoscopic audio and the stereoscopic image; and a stereoscopic image processing unit that selects, using the linkage information, a region corresponding to a predetermined sound source of the stereoscopic audio in the stereoscopic image being reproduced, wherein the linkage information includes identification information for each sound source of the stereoscopic audio and identification information for each region included in the stereoscopic image.

The present invention also provides a method of providing stereoscopic audio and a stereoscopic image, the method comprising: transmitting stereoscopic audio and a stereoscopic image linked to the stereoscopic audio; and transmitting linkage information between the stereoscopic audio and the stereoscopic image, the linkage information being used to select a region corresponding to a predetermined sound source of the stereoscopic audio when the stereoscopic image is reproduced, wherein the linkage information includes identification information for each sound source of the stereoscopic audio and identification information for each region included in the stereoscopic image.

The present invention also provides an apparatus for providing stereoscopic audio and a stereoscopic image, the apparatus comprising a data transmission unit that transmits stereoscopic audio, a stereoscopic image linked to the stereoscopic audio, and linkage information between the stereoscopic audio and the stereoscopic image, the linkage information being used to select a region corresponding to a predetermined sound source of the stereoscopic audio when the stereoscopic image is reproduced, wherein the linkage information includes identification information for each sound source of the stereoscopic audio and identification information for each region included in the stereoscopic image.

According to the present invention as described above, the stereoscopic image region corresponding to a sound source of stereoscopic audio is selected using linkage information between the stereoscopic audio and the stereoscopic image, so that the audio can be matched with a specific object located in the stereoscopic image rather than merely synchronized with the image.

In addition, by matching stereoscopic audio with a specific object in the stereoscopic image, the present invention can apply to that object a stereoscopic image effect suited to the audio, thereby providing a more realistic stereoscopic audio and image service.

The above objects, features, and advantages are described in detail below with reference to the accompanying drawings, so that a person of ordinary skill in the art to which the present invention pertains can readily practice the technical idea of the invention. In describing the present invention, detailed descriptions of related known technology are omitted where they would unnecessarily obscure the gist of the invention.

The present invention relates to a technique that uses linkage information between stereoscopic audio and a stereoscopic image linked to that audio to select, in the stereoscopic image being reproduced, a region corresponding to a predetermined sound source of the stereoscopic audio. The linkage information includes identification information for each sound source of the stereoscopic audio and identification information for each region included in the stereoscopic image. For example, when a symphony concert performed by an orchestra is watched with stereoscopic audio and the playing of a particular section of the orchestra (for example, the violins) stands out or grows louder, an effect is applied to the corresponding region or object of the stereoscopic image (for example, the violinists), such as a change in color or an increase in perceived depth.

To this end, the linkage information contains identification information for each sound source of the stereoscopic audio and identification information for each region or object included in the stereoscopic image. Through the association between the sound source identification information and the region or object identification information, it can be determined which region or object of the stereoscopic image corresponds to a sound source while that sound source is being played.
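The lookup from a sound source to its image regions can be sketched as a simple table. The identifiers and the function name below are hypothetical and chosen only to mirror the orchestra example; the source does not define a concrete data format at this point.

```python
# Hypothetical linkage table:
# sound-source ID -> IDs of the linked image regions or objects.
LINKAGE = {
    "src_violin": ["obj_violinist_1", "obj_violinist_2"],
    "src_cello": ["obj_cellist"],
}


def regions_for_source(linkage, source_id):
    """Return the region/object IDs linked to a sound source (empty if none)."""
    return list(linkage.get(source_id, []))


selected = regions_for_source(LINKAGE, "src_violin")
```

Returning an empty list for an unknown source lets the player fall back to ordinary playback with no region effect.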

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In the drawings, the same reference numerals denote the same or similar components.

FIG. 1 is a diagram for explaining a method of reproducing and providing stereoscopic audio and a stereoscopic image according to an embodiment of the present invention.

As shown in FIG. 1, a viewer can watch a scene (104) in which an orchestra performs music through a 5.1-channel audio environment (102). The orchestra performance scene (104) may be a stereoscopic image composed of 3D objects.

In FIG. 1, because the performance scene (104) is a stereoscopic image composed of 3D objects, every object shown on the screen, such as a performer or an instrument, has depth. The viewer can therefore perceive the position of each performer in three dimensions, as in an actual concert hall. Through the 5.1-channel audio environment (102), the viewer can also enjoy 3D sound as if present in the hall. This process of reproducing and viewing audio and stereoscopic images is conventional and amounts to no more than playing stereoscopic audio together with the stereoscopic image linked to it.

In an embodiment of the present invention, rather than simply reproducing audio and a stereoscopic image as in the prior art, when a specific sound source of the stereoscopic audio is emphasized, the corresponding stereoscopic image object is highlighted so that the viewer experiences a more realistic scene.

If, during stereoscopic image playback in the 5.1-channel audio environment (102), the right audio channel is emphasized or becomes louder as in FIG. 1, the corresponding region of the performance scene (104), namely the right rear part of the screen, is selected. Once the region is selected, its depth is adjusted so that it stands out more than the other regions. For example, when the performance scene (104) is a stereoscopic image composed of one or more objects and a depth map, the region (or object) can be made more prominent by increasing or decreasing its depth value.
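The selection step in this example can be sketched as follows. The source does not say how emphasis is detected, so the rule used here (the loudest channel must exceed 1.5 times the mean of the others) is an assumption, as are the function names and the `CHANNEL_TO_REGION` table; only the pairing of the right channel with the right rear region comes from the text.

```python
# Hypothetical mapping from 5.1 channels to screen regions. Only the
# "R" -> right rear pairing follows the example in the text.
CHANNEL_TO_REGION = {
    "L": "left_rear",
    "C": "center",
    "R": "right_rear",
}


def emphasized_channel(levels, ratio=1.5):
    """Return the channel whose level exceeds `ratio` times the mean of the
    other channels, or None if no channel stands out (assumed rule)."""
    loudest = max(levels, key=levels.get)
    others = [v for ch, v in levels.items() if ch != loudest]
    if not others:
        return loudest
    mean_others = sum(others) / len(others)
    return loudest if levels[loudest] > ratio * mean_others else None


def select_region(levels):
    """Map the emphasized channel, if any, to its linked screen region."""
    channel = emphasized_channel(levels)
    return CHANNEL_TO_REGION.get(channel) if channel else None


levels = {"L": 0.2, "C": 0.3, "R": 0.9, "Ls": 0.1, "Rs": 0.1}
region = select_region(levels)
```

In practice the channel-to-region pairing would come from the linkage information delivered with the content rather than from a fixed table.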

To provide this effect, another embodiment of the present invention may use a stereoscopic image composed of a texture image and a depth image. The depth image conveys depth in space and generally expresses it with depth values from 0 to 255. These depth values can be adjusted at the terminal by the user or by an application, so the perceived depth can be changed as desired. By providing such depth control in conjunction with the stereoscopic audio, the stereoscopic image region (or object) corresponding to a predetermined sound source of the stereoscopic audio can be made more vivid to the user.

In yet another embodiment of the present invention, instead of adjusting the depth of the selected region, the color or contrast of the region may be adjusted so that the viewer recognizes the region more clearly.
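As an illustration of this alternative, the sketch below brightens a rectangular region of an 8-bit grayscale image held as a list of rows. The rectangular region format `(top, left, bottom, right)`, the gain factor, and the function name are assumptions made for the example; the source does not specify how color or contrast is changed.

```python
def brighten_region(image, region, gain):
    """Return a copy of `image` with pixels inside `region` scaled by `gain`.

    `region` is (top, left, bottom, right) with exclusive bottom/right
    bounds; results are clamped to the 8-bit maximum of 255.
    """
    top, left, bottom, right = region
    out = [row[:] for row in image]  # copy so the input is left unchanged
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = min(255, round(out[r][c] * gain))
    return out


image = [[100, 100],
         [100, 200]]
highlighted = brighten_region(image, (0, 1, 2, 2), 1.5)
```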

In an embodiment of the present invention, the stereoscopic image is composed of 3D objects whose depth or disparity can be adjusted. Here, the 3D objects include 3D video of arbitrary size as well as 3D image objects that support an alpha map (for example, in JPEG, PNG, or MNG format). The 3D objects that make up the stereoscopic image are located in a three-dimensional space on the screen, and each 3D object is linked to a specific audio channel (or sound source).

In an embodiment of the present invention, when the stereoscopic audio is played, the depth map of the 3D object corresponding to a predetermined sound source of the audio is adjusted to increase or decrease the perceived depth of the object, which in turn has the effect of emphasizing the stereoscopic audio as well. Here, adjustment means changing the perceived depth by adjusting the depth map or the disparity value.

Selecting the region of the reproduced stereoscopic image that corresponds to a predetermined sound source of the stereoscopic audio requires linkage information between the image and the audio. In an embodiment of the present invention, meta information relating 3D objects to the stereoscopic audio is used as this linkage information. The meta information may include a meta information identifier, identifiers of the 3D object and the audio channel (or audio object), depth adjustment information, an adjustment method, a depth effect, an event start time, an event duration, and an event end time.
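The fields listed above could be grouped into a record such as the following. This is a sketch only: the class name, field names, types, and example values are hypothetical, the meaning of the depth effect field is not detailed in the source and is kept as free-form text, and the end time is derived from start time plus duration here even though the source lists it as a separate item.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class LinkageMeta:
    meta_id: str        # meta information identifier
    object_id: str      # identifier of the 3D object (image side)
    audio_id: str       # identifier of the audio channel or audio object
    adjust_value: int   # depth adjustment value
    min_depth: int      # lower bound to respect when adjusting depth
    max_depth: int      # upper bound to respect when adjusting depth
    method: str         # adjustment method, e.g. "add" or "multiply"
    effect: str         # depth effect (free-form in this sketch)
    start_time: float   # event start time, in seconds
    duration: float     # event duration, in seconds

    @property
    def end_time(self) -> float:
        """Event end time, derived here as start time plus duration."""
        return self.start_time + self.duration


meta = LinkageMeta(
    meta_id="meta_001", object_id="obj_violinist", audio_id="src_violin",
    adjust_value=40, min_depth=0, max_depth=255, method="add",
    effect="emphasize", start_time=10.0, duration=5.0,
)
```

`asdict(meta)` yields a plain dictionary, which is convenient if the record is to be written to a separate file or embedded in a stream.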

First, the meta information identifier identifies the data that makes up the meta information (a packet, a file, or any characters or codes capable of providing identification).

The object identifier is information for identifying the linked audio and stereoscopic image. When an MPEG-2 system is used, it may be an ES ID; otherwise, the audio and the stereoscopic image can be identified by using a separate CRDI, URL, or URI.

The depth adjustment information may consist of an adjustment value, a minimum value, and a maximum value. The adjustment value is the base value used to adjust the existing depth map. The minimum value is the lowest value that must be guaranteed when the depth value is adjusted, and the maximum value is the highest value that must be guaranteed when the depth value is adjusted.

The depth adjustment method specifies how the depth adjustment value is applied; in general, methods such as addition, multiplication, linear increase, or linear decrease can be used. If the method is addition, the perceived depth of the region is adjusted by adding the depth adjustment value to the values of the existing depth map. If the method is linear increase, the depth value is increased linearly from its existing value; conversely, if the method is linear decrease, the depth value is decreased linearly from its existing value.
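The four adjustment methods, together with the minimum and maximum bounds, can be sketched as a single function. Several details are assumptions: the method names, the reading of the bounds as a clamp, and the `progress` parameter (a fraction from 0 to 1 of the event that has elapsed), which is one possible way to realize a linear ramp.

```python
def adjust_depth(depth, value, method, min_depth=0, max_depth=255, progress=1.0):
    """Apply one depth adjustment to a single depth value.

    `progress` (0.0 to 1.0) is a hypothetical ramp position used only by
    the linear methods. The result is clamped to [min_depth, max_depth].
    """
    if method == "add":
        new = depth + value
    elif method == "multiply":
        new = depth * value
    elif method == "linear_increase":
        new = depth + value * progress
    elif method == "linear_decrease":
        new = depth - value * progress
    else:
        raise ValueError(f"unknown adjustment method: {method}")
    return max(min_depth, min(max_depth, round(new)))


halfway = adjust_depth(100, 40, "linear_increase", progress=0.5)
```

Applying the function to every depth value inside the selected region changes how far that region appears to sit in front of or behind the screen.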

The event start time is the time at which the depth adjustment begins to be applied to the 3D object of the reproduced stereoscopic image that corresponds to a predetermined sound source of the audio. As the time representation, a Coordinated Universal Time (UTC) value may be used for general broadcasting, and Internet standard time may be applied for the Internet. The event duration is the length of time during which the depth adjustment of the 3D object is applied, and it thereby indicates when the depth adjustment of the 3D object ends.
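As a rough illustration, the fields described so far (meta information identifier, object identifier, depth adjustment information, depth adjustment method, event start time, and event duration) could be grouped into a single record. The class name `LinkageMetadata`, its field names, and the split of the object identifier into separate audio and video IDs are hypothetical; the description does not define a concrete syntax.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class LinkageMetadata:
    meta_id: str               # meta information identifier
    audio_object_id: str       # object identifier of the audio (e.g. an ES ID)
    video_object_id: str       # object identifier of the stereoscopic image
    adjust_value: float        # depth adjustment value
    min_value: float           # lower bound guaranteed after adjustment
    max_value: float           # upper bound guaranteed after adjustment
    method: str                # "add", "multiply", "linear_increase", ...
    event_start: datetime      # UTC time at which the adjustment starts
    event_duration: timedelta  # how long the adjustment is applied

    def is_active(self, now: datetime) -> bool:
        """True while `now` falls inside the adjustment window."""
        return self.event_start <= now < self.event_start + self.event_duration

    def progress(self, now: datetime) -> float:
        """Elapsed fraction of the event, clamped to the range 0.0 to 1.0."""
        if self.event_duration <= timedelta(0):
            return 1.0
        fraction = (now - self.event_start) / self.event_duration
        return max(0.0, min(1.0, fraction))


start = datetime(2009, 4, 8, 12, 0, 0, tzinfo=timezone.utc)
meta = LinkageMetadata("meta-1", "audio-1", "video-1", 20.0, 0.0, 255.0,
                       "add", start, timedelta(seconds=10))
print(meta.is_active(start + timedelta(seconds=5)))   # True
print(meta.is_active(start + timedelta(seconds=10)))  # False
```

The window is treated as half-open, so the adjustment is considered finished exactly at start plus duration; that boundary choice is also an assumption.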

Such meta information may be structured and provided in various forms depending on the application. For example, when stored as a file on a storage medium such as a DVD, it may be provided as a separate track. When provided over a broadcast network, it may be composed as dedicated metadata and transmitted as a stream, or it may be attached to the video stream as meta information. In the case of a communication network such as the Internet, it may be provided separately as an individual file, or, as with the broadcast network, it may be provided as separate metadata or embedded in the video or audio stream. That is, in a preferred embodiment of the present invention, the linkage information may be provided together with the stereoscopic audio and the stereoscopic image in a single integrated file format, or it may be provided as a file separate from the stereoscopic audio and the stereoscopic image.

The provision of stereoscopic audio and stereoscopic images according to an embodiment of the present invention has been described above with reference to FIG. 1. Although FIG. 1 describes the embodiment using a specific three-dimensional stereoscopic image for convenience, the present invention is not limited to that specific stereoscopic image.

FIG. 2 is a block diagram showing the configuration of a stereoscopic audio and stereoscopic image reproducing apparatus according to an embodiment of the present invention.

In FIG. 2, the stereoscopic audio and stereoscopic image reproducing apparatus 202 includes an audio and stereoscopic image acquisition unit 204, a linkage information acquisition unit 206, and a stereoscopic image processing unit 208.

The audio and stereoscopic image acquisition unit 204 of the stereoscopic audio and stereoscopic image reproducing apparatus 202 acquires stereoscopic audio and a stereoscopic image linked to that stereoscopic audio. The audio and stereoscopic image may be acquired in real time over a broadcast network or a communication network, or may be acquired from a file stored in advance on a recording medium or the like. That is, the audio and stereoscopic image may be acquired by any method or through any medium.

The linkage information acquisition unit 206 acquires linkage information between the stereoscopic audio and the stereoscopic image acquired by the stereoscopic audio and stereoscopic image reproducing apparatus 202. Like the audio and stereoscopic image, the linkage information may be acquired by various methods or through various media. For example, when the audio and stereoscopic image are stored on a DVD recording medium, the linkage information acquisition unit 206 may acquire linkage information (for example, metadata) stored in a track separate from the audio and stereoscopic image.

The stereoscopic image processing unit 208 uses the linkage information acquired by the linkage information acquisition unit 206 to select, in the stereoscopic image being reproduced, the region corresponding to a predetermined sound source of the stereoscopic audio. As mentioned above, the linkage information includes identification information for each sound source of the stereoscopic audio and identification information for each region (or object) included in the stereoscopic image. It may also include information, stored by a content provider or editor, about which sound source corresponds to which stereoscopic image region. Using this linkage information, the stereoscopic image processing unit 208 selects the region corresponding to the predetermined sound source from the stereoscopic image being reproduced, and adjusts the stereoscopic information (including depth information or disparity information), color, contrast, or the like of the selected region.
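The selection step can be sketched as a lookup from a sound source ID to a region ID, followed by building a mask over the image. This is only an illustration: the per-pixel label map, the dictionary form of the linkage information, and the name `select_region` are assumptions, since the description does not specify how regions are represented.

```python
from typing import Dict, List, Optional


def select_region(label_map: List[List[str]], links: Dict[str, str],
                  source_id: str) -> Optional[List[List[bool]]]:
    """Return a mask of the region linked to `source_id`, or None if unlinked.

    `label_map` is a hypothetical per-pixel map of region IDs, and `links`
    maps sound source IDs to region IDs (the linkage information).
    """
    region_id = links.get(source_id)
    if region_id is None:
        return None
    return [[label == region_id for label in row] for row in label_map]


label_map = [["bg", "car"],
             ["car", "bg"]]
links = {"engine": "car"}
print(select_region(label_map, links, "engine"))
# [[False, True], [True, False]]
```

The resulting mask can then be passed to whatever routine adjusts depth, color, or contrast for the selected region.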

Here, in addition to the identification information of the audio and stereoscopic image mentioned above, the linkage information may include an adjustment value for adjusting the stereoscopic information, color, or contrast of the selected region, and time information for performing that adjustment (adjustment start time, adjustment execution time, and adjustment end time).

Although not shown in FIG. 2, the stereoscopic audio and stereoscopic image must first be provided before they can be reproduced according to an embodiment of the present invention. A stereoscopic audio and stereoscopic image providing apparatus according to an embodiment of the present invention may include a data transmission unit that transmits stereoscopic audio, a stereoscopic image linked to the stereoscopic audio, and linkage information between the stereoscopic audio and the stereoscopic image, the linkage information being used to select the region corresponding to a predetermined sound source of the stereoscopic audio when the stereoscopic image is reproduced.

The stereoscopic audio and stereoscopic image providing apparatus according to an embodiment of the present invention may also generate stereoscopic audio/video scene composition data and transmit it through the data transmission unit. The linkage information according to an embodiment of the present invention relates to one stereoscopic image linked to one stereoscopic audio, but a stereoscopic image usually contains a plurality of objects. It is therefore possible to generate scene composition data for a specific scene using one or more stereoscopic audios, one or more stereoscopic images, and one or more pieces of linkage information. For example, if in a particular scene audio object A is linked to image object A' and audio object B is linked to image object B', then as audio objects A and B are reproduced simultaneously in that scene, the sense of depth of A' and B' can be adjusted simultaneously. In this way, a plurality of stereoscopic audios, stereoscopic images, and pieces of linkage information for a specific scene can be combined into a single set of stereoscopic audio/video scene composition data and transmitted.
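The A/A' and B/B' example can be sketched as a small scene composition structure that holds several links and reports which image objects should be adjusted for the audio objects currently playing. The class names `Link` and `SceneComposition`, and the idea of storing one adjustment value per link, are hypothetical simplifications rather than a format defined by the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Link:
    audio_id: str        # audio object, e.g. "A"
    region_id: str       # linked image object, e.g. "A'"
    adjust_value: float  # depth adjustment for that image object


@dataclass
class SceneComposition:
    scene_id: str
    links: List[Link] = field(default_factory=list)

    def active_adjustments(self, playing: Set[str]) -> Dict[str, float]:
        """Map each linked image object to its adjustment value,
        restricted to audio objects that are currently playing."""
        return {link.region_id: link.adjust_value
                for link in self.links if link.audio_id in playing}


scene = SceneComposition("scene-1", [Link("A", "A'", 20.0),
                                     Link("B", "B'", -10.0)])
# When A and B play at the same time, A' and B' are adjusted together.
print(scene.active_adjustments({"A", "B"}))
# {"A'": 20.0, "B'": -10.0}
```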

FIG. 3 is a flowchart illustrating a stereoscopic audio and stereoscopic image reproducing method according to an embodiment of the present invention.

First, stereoscopic audio and a stereoscopic image linked to the stereoscopic audio are acquired (302). As described above, the stereoscopic audio and stereoscopic image may be acquired by any method or through any medium. Along with the acquisition of the stereoscopic audio and stereoscopic image, linkage information between the stereoscopic audio and the stereoscopic image is acquired (304).

Then, using the acquired linkage information, the region (or object) corresponding to a predetermined sound source of the stereoscopic audio is selected in the stereoscopic image being reproduced (306). By adjusting the stereoscopic information (depth or disparity information), color, or contrast of the selected region, a more realistic stereoscopic image for that sound source can be provided to the viewer. For stereoscopic information, increasing or decreasing the adjustment value of the depth or disparity information can make the region or object appear to protrude further or to recede further. Changing the color can make the region look clearly different from other surrounding regions or objects. The same effect can also be obtained by adjusting the contrast of the region.
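To illustrate the contrast option, the sketch below scales luminance values inside the selected region around a mid-gray pivot. The description does not say how contrast is adjusted, so the formula, the function name `adjust_contrast`, the 0 to 255 value range, and the pivot of 128 are all assumptions.

```python
from typing import List


def adjust_contrast(luma: List[List[float]], mask: List[List[bool]],
                    gain: float, pivot: float = 128.0) -> List[List[float]]:
    """Scale luminance around `pivot` where `mask` is True.

    A gain above 1 increases contrast in the selected region and a gain
    below 1 reduces it. Results are clamped to the assumed range 0 to 255.
    """
    def apply(v: float) -> float:
        out = pivot + (v - pivot) * gain
        return max(0.0, min(255.0, out))

    return [[apply(v) if m else v for v, m in zip(vrow, mrow)]
            for vrow, mrow in zip(luma, mask)]


frame = [[100, 200]]
selected = [[True, False]]
print(adjust_contrast(frame, selected, 1.5))
# [[86.0, 200]]
```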

FIG. 4 is a flowchart illustrating a stereoscopic audio and stereoscopic image providing process according to an embodiment of the present invention.

First, stereoscopic audio and a stereoscopic image linked to the stereoscopic audio are transmitted (402). Next, linkage information between the stereoscopic audio and the stereoscopic image is transmitted (404); this linkage information is used to select the region corresponding to a predetermined sound source of the stereoscopic audio when the stereoscopic image is reproduced. Here, the linkage information includes identification information for each sound source of the stereoscopic audio and identification information for each region included in the stereoscopic image.

Then, stereoscopic audio/video scene composition data is generated using a plurality of stereoscopic audios, stereoscopic images, and pieces of linkage information (406), and the generated stereoscopic audio/video scene composition data is transmitted (408).

According to the present invention, the stereoscopic image region corresponding to a sound source of the stereoscopic audio is selected using the linkage information between the stereoscopic audio and the stereoscopic image. This has the advantage that, rather than merely synchronizing the audio with the stereoscopic image, the audio can be matched to a specific object located within the stereoscopic image.

Further, according to the present invention, matching the stereoscopic audio to a specific object in the stereoscopic image allows a stereoscopic image effect suited to that audio to be applied to the object, which has the advantage of providing a more realistic stereoscopic audio and stereoscopic image service.

The present invention described above is not limited by the foregoing embodiments and the accompanying drawings, because those of ordinary skill in the art to which the present invention pertains can make various substitutions, modifications, and changes without departing from the technical spirit of the present invention.

FIG. 1 is a diagram for explaining a method of reproducing and providing stereoscopic audio and stereoscopic images according to an embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of a stereoscopic audio and stereoscopic image reproducing apparatus according to an embodiment of the present invention.

FIG. 3 is a flowchart illustrating a stereoscopic audio and stereoscopic image reproducing method according to an embodiment of the present invention.

FIG. 4 is a flowchart illustrating a stereoscopic audio and stereoscopic image providing process according to an embodiment of the present invention.

Claims

Obtaining stereoscopic audio and stereoscopic images associated with the stereoscopic audio;

Obtaining linkage information between the stereoscopic audio and the stereoscopic image; And

Selecting an area corresponding to a predetermined sound source of the stereoscopic audio from the stereoscopic image to be reproduced using the linkage information;

Wherein the linkage information includes identification information of each sound source of the stereoscopic audio and identification information of each region included in the stereoscopic image.

The method of claim 1,

Adjusting stereoscopic information of the selected area;

Wherein the stereoscopic information is depth information or disparity information of the stereoscopic image.

The method of claim 1,

Further comprising adjusting the color or contrast of the selected area.

The method according to claim 1 or 2,

Wherein the linkage information includes at least one of an adjustment value for adjusting the stereoscopic information, color, or contrast, and time information for adjusting the stereoscopic information, color, or contrast.

An audio and stereoscopic image acquisition unit for acquiring stereoscopic audio and stereoscopic images associated with the stereoscopic audio;

A linkage information acquisition unit for acquiring linkage information between the stereoscopic audio and the stereoscopic image; And

A stereoscopic image processing unit which selects a region corresponding to a predetermined sound source of the stereoscopic audio from the stereoscopic image to be reproduced using the linkage information;

The apparatus of claim 5,

The stereoscopic image processing unit

Adjust stereoscopic information of the selected area,

The apparatus of claim 5,

Wherein the stereoscopic image processing unit adjusts the color or contrast of the selected area.

The apparatus according to claim 6 or 7,

The linkage information is

Transmitting stereoscopic audio and stereoscopic images associated with the stereoscopic audio; And

Transmitting linking information between the stereoscopic audio and the stereoscopic image for selecting an area corresponding to a predetermined sound source of the stereoscopic audio when the stereoscopic image is reproduced,

The method of claim 9,

Wherein the stereoscopic audio, the stereoscopic image, and the linkage information are configured as a single integrated file.

The method of claim 9,

Wherein the linkage information is configured as a file separate from the stereoscopic audio and the stereoscopic image.

The method of claim 9, further comprising:

Generating stereoscopic audio/video scene composition data using the stereoscopic audio, the stereoscopic image, and the linkage information; and

Transmitting the stereoscopic audio/video scene composition data.

A data transmission unit configured to transmit stereoscopic audio, a stereoscopic image associated with the stereoscopic audio, and linkage information between the stereoscopic audio and the stereoscopic image for selecting a region corresponding to a predetermined sound source of the stereoscopic audio when the stereoscopic image is reproduced,

The apparatus of claim 13,

Further comprising a scene composition unit configured to generate stereoscopic audio/video scene composition data using the stereoscopic audio, the stereoscopic image, and the linkage information,

Wherein the data transmission unit transmits the stereoscopic audio/video scene composition data.