KR20170012466A

KR20170012466A - Apparatus and method for producing and playing back a copy-protected wave field synthesis audio rendition

Info

Publication number: KR20170012466A
Application number: KR1020167036833A
Authority: KR
Inventors: Thomas Sporer; René Rodigast
Original assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Priority date: 2014-06-20
Filing date: 2015-06-12
Publication date: 2017-02-02
Also published as: CN106576212B; JP2017522802A; DE102014211899A1; CN106576212A; US20170150286A1; JP6253816B2; KR101913165B1; EP3158778B1; WO2015193196A1; EP3158778A1

Abstract

Embodiments provide an apparatus for generating a copy-protected wave field synthesis audio representation of an audio scene having a plurality of audio objects, each audio object comprising an audio file and position information. The apparatus includes a watermark embedder for embedding a watermark into the audio file of at least one of the plurality of audio objects in order to generate a modified audio file for the at least one audio object, the watermark specifying a playback room. In addition, the apparatus includes a wave field synthesis processor for generating the copy-protected wave field synthesis audio representation of the audio scene by using the speaker configuration of the specified playback room, the modified audio file, and the position information for the at least one audio object.

Description

APPARATUS AND METHOD FOR PRODUCING AND PLAYING A COPY-PROTECTED WAVE FIELD SYNTHESIS AUDIO RENDITION

Embodiments of the present invention relate to an apparatus and an associated method for generating a copy-protected wave field synthesis audio representation of an audio scene, as well as to an apparatus and an associated method for playing back a copy-protected wave field synthesis audio representation of an audio scene. Further embodiments relate to a computer program for performing these methods.

In wave field synthesis playback systems, the raw data, i.e. audio objects that typically exist as audio files together with metadata, are stored and transmitted, and are rendered depending on the speakers actually present in the playback room and the actual speaker configuration (for example, an array of more than 30 speakers distributed in space). For this purpose, the metadata typically include position information for the associated audio objects. During rendering, the audio files are distributed to a plurality of speaker channels according to the position information and the existing speaker configuration, with the aim of virtually positioning each audio object in the playback room. As a result, the audio file assigned to an audio object is typically output via all speaker channels, but with different scaling (i.e. different loudness) and different delay.

In some situations, the hardware in the playback room is to be reduced to a minimum, so that only a player with a speaker array, without any renderer (referred to below as wave field synthesis processor), needs to be installed in the playback room. In such an approach, the wave field synthesis audio representation of the audio scene is pre-rendered for the exact speaker configuration, and care must be taken that the pre-rendered wave field synthesis audio representation is played back in the correct playback room, since playing the audio representation in the wrong room (i.e. a room with a different speaker array) typically results in a significant reduction in audio quality. For example, in a cinema with several rooms and different speaker setups, operating errors with subsequent quality losses cannot be ruled out with this concept.

In particular, pre-rendered content places additional requirements on rights management: measures must be taken so that specific content can be played back in a playback room only when a license is available. Several approaches to this problem exist in the prior art.

One solution is, for example, to use encryption and to store the key separately, for instance on a dongle (in general: a portable memory medium), particularly with regard to licensing. The dongle is preferably designed so that it is sufficiently difficult to copy. This procedure ensures that playback is enabled only with that dongle. A disadvantage of this method is that, if the dongle is lost, none of the licensed content can be played back any more. In addition, the data rate to be encrypted is comparatively high, which runs counter to the aim of reducing the hardware to the bare essentials.

As an alternative to encrypting the audio file, so-called audio watermarking (the mark is referred to below as an audio watermark) can be used. Here, a signal that is masked by the useful signal, i.e. an inaudible signal, is embedded into the audio signal. To prevent audible interference from the watermark, the watermark can, for example, be embedded into a single channel only. On the playback side, a watermark detector can extract the watermark and refuse playback if the watermark does not match the identification number of the playback system for which the license is available. This watermarking technique is also compatible with pre-rendering, so that the association of a pre-rendered wave field synthesis audio representation with a specific playback room can be fixed in advance on the basis of the watermark.

A fundamental problem of copy protection by audio watermarking is that the watermark can be destroyed deliberately by trial and error. The background is that an "attacker" has access to the watermark and can alter the signal until the watermark is no longer detectable. In particular, in the approach mentioned above, where the watermark is embedded in only a single channel, such as one speaker channel of a pre-rendered wave field synthesis audio representation, a targeted attack is made easier by comparing the correlation of two adjacent channels. Thus, there is a need for an improved approach.

It is an object of the present invention to provide an apparatus and a method that improve copy protection for wave field synthesis audio representations, in particular pre-rendered wave field synthesis audio representations.

This object is achieved by the subject matter of the independent claims.

A first embodiment provides an apparatus for generating a copy-protected wave field synthesis audio representation of an audio scene having a plurality of audio objects, each audio object comprising an audio file and position information. The apparatus includes a watermark embedder for embedding a watermark into the audio file of at least one of the plurality of audio objects in order to generate a modified audio file for the at least one audio object, the watermark specifying a playback room. In addition, the apparatus includes a wave field synthesis processor for generating the copy-protected wave field synthesis audio representation of the audio scene by using the speaker configuration of the specified playback room, the modified audio file, and the position information for the at least one audio object.

A second aspect of the present invention relates to the associated method, comprising the steps of embedding a watermark and generating the copy-protected wave field synthesis audio representation.

These first two aspects of the present invention are thus based on the finding that a watermark can be inserted into the pre-rendered wave field synthesis audio representation to specify the playback room for which the wave field synthesis audio representation is calculated. According to the invention, the watermark is inserted into the unrendered audio files (raw data), i.e. the audio tracks provided before rendering, so that the watermark is linked to at least one audio object (and not to a specific speaker channel). Embedding the watermark in the raw data means that, after rendering, the watermark is distributed across all speaker channels or at least a group of speaker channels. Compared with the prior art, this has the particular advantage that the watermark cannot easily be removed again from the pre-rendered wave field synthesis audio representation. This is further supported by the fact that the watermark changes over time together with its "carrier object", according to the position information of the respective object.

According to a further embodiment, the watermark is embedded into the audio file of the audio object using pre-masking, post-masking, simultaneous masking and/or noise masking, so that the watermark is inaudible, at least from a psychoacoustic point of view.

According to one embodiment, the watermark can be embedded into the audio file of an audio object having a specific characteristic, such as the loudest audio object. Inserting the watermark into the loudest audio object has the advantage that psychoacoustic masking is maximized.

Further embodiments provide (according to a third aspect) an apparatus for playing back a copy-protected wave field synthesis audio representation of an audio scene in a specific playback room. The apparatus includes a watermark detector for detecting, in at least one speaker channel of the copy-protected wave field synthesis audio representation of the audio scene, a watermark that specifies the specific playback room, and a player for playing back the copy-protected wave field synthesis audio representation only if the watermark detector has detected the watermark specifying the specific playback room.
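The playback condition of the third aspect can be sketched as a simple gate. This is only an illustration: the function name `play_if_licensed`, the string form of the watermark, and the `detect` and `play` callables are assumptions introduced here, not part of the described apparatus.

```python
from typing import Callable, List, Optional, Sequence


def play_if_licensed(
    channels: Sequence[List[float]],
    room_watermark: str,
    detect: Callable[[List[float]], Optional[str]],
    play: Callable[[Sequence[List[float]]], None],
) -> bool:
    """Play the pre-rendered channels only if the room's watermark is found.

    `detect` is a hypothetical detector returning the watermark found in one
    speaker channel (or None); `play` is a hypothetical player callback.
    """
    # It is enough that at least one speaker channel carries the watermark.
    if any(detect(ch) == room_watermark for ch in channels):
        play(channels)
        return True
    return False
```

Passing the detector in as a callable keeps the gate independent of the particular watermarking scheme.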

According to a fourth aspect of the present invention, a method for playing back a copy-protected wave field synthesis audio representation of an audio scene is provided, comprising the steps of detecting the watermark and playing back the copy-protected wave field synthesis audio representation.

According to one embodiment, the watermark to be detected (i.e. the watermark of the respective room) can be stored in the watermark detector or read from a data carrier, for example via an interface.

According to a further embodiment, the watermark detector includes a frequency spreader and a correlator, the correlator serving to determine a correlation between the signal of the at least one speaker channel and the watermark to be detected, which is converted into spectral form by means of the frequency spreader.
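A correlation-based detector of this kind might look roughly as follows. The sketch makes several assumptions that the text does not specify: the spreading is done in the time domain with a seeded pseudo-noise sequence of ±1 chips, each watermark bit occupies one block of `chip_len` samples, and the helper names `pn_sequence`, `detect_bits` and `matches_room` are hypothetical.

```python
import random


def pn_sequence(seed, length):
    """Reproducible pseudo-noise sequence of +1/-1 chips (assumed spreading code)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def detect_bits(channel, seed, n_bits, chip_len=64):
    """Correlate each block of one speaker channel with the spreading code."""
    pn = pn_sequence(seed, chip_len)
    bits = []
    for k in range(n_bits):
        block = channel[k * chip_len:(k + 1) * chip_len]
        corr = sum(x * c for x, c in zip(block, pn))
        # The sign of the correlation decides the bit.
        bits.append(1 if corr > 0 else 0)
    return bits


def matches_room(channel, seed, expected_bits, chip_len=64):
    """True if the bits recovered from the channel equal the room's watermark."""
    return detect_bits(channel, seed, len(expected_bits), chip_len) == list(expected_bits)
```

With a real host signal the correlation is noisy, so a practical detector would typically use longer codes or average over several blocks before deciding.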

According to a fifth and a sixth aspect of the present invention, a computer program is provided that allows the steps or sub-steps of the methods described above to be performed.

Embodiments of the present invention are discussed below with reference to the accompanying drawings.
Fig. 1a is a schematic block diagram of an apparatus for generating a copy-protected wave field synthesis audio representation according to a first embodiment.
Fig. 1b is a schematic flow diagram of a method for generating a copy-protected wave field synthesis audio representation according to a further embodiment.
Fig. 2a is a schematic block diagram of an apparatus for playing back a copy-protected wave field synthesis audio representation according to a second embodiment.
Fig. 2b is a schematic flow diagram of a method for playing back a copy-protected wave field synthesis audio representation according to a further embodiment.
Fig. 3 is a schematic block diagram of a wave field synthesis processor, illustrating the steps performed during wave field synthesis rendering.
Fig. 4 is a schematic block diagram of a watermark embedder, illustrating its mode of operation when embedding a watermark into an audio file.

Embodiments of the present invention are discussed in detail below with reference to the accompanying drawings. Note that identical elements and elements having the same function are given the same reference signs in the figures, so that their descriptions are interchangeable and mutually applicable.

Before embodiments of the present invention are discussed in detail with reference to Figs. 1a, 1b, 2a and 2b, a wave field synthesis processor is described on the basis of Fig. 3 and a watermark embedder on the basis of Fig. 4.

Fig. 3 shows a wave field synthesis processor 10 together with a schematically illustrated speaker array 20.

The speaker array 20 typically comprises a plurality of individual speakers that are controlled via the speaker channels LS1-LSn. For example, a speaker array with 40 or 60 speakers can be implemented as a 360° array arranged in a specific playback room 22. The room 22 can be, for example, a cinema auditorium, with the speakers of the speaker array 20 grouped around the viewer 24 or arranged in arrays. The speakers are thus arranged, for example, behind the screen, behind the viewer, and to the left and right of the listener.

Because the listener at point P is surrounded by a plurality of speakers of the speaker array 20, an audio object can be virtually positioned and moved in space by appropriately controlling the speaker array 20 via the speaker channels LS1 and LSn (for example, by unilaterally controlling a subset of the speakers of the speaker array 20). This virtual positioning and virtual movement of an audio object depends strongly on exact knowledge of the speaker configuration (see speaker array 20), so that the individual speaker channels LS1-LSn can only be determined for a specific speaker array 20 of a specific playback room 22. The determination or calculation is performed by the wave field synthesis processor 10 as described below.

The wave field synthesis processor 10 is configured to calculate the plurality of speaker channels LS1-LSn on the basis of a plurality of audio objects AO1-AOn, each comprising an audio file and position information (defined as a position in a Cartesian coordinate system together with its movement over time), by using information I20 on the speaker configuration 20 (number and positions) of the specific playback room 22.
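The inputs of the wave field synthesis processor can be pictured as two small data structures. This is a minimal sketch: the class names `AudioObject` and `SpeakerConfig`, the two-dimensional coordinates, and the example values are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Position = Tuple[float, float]  # Cartesian (x, y) in metres (2-D is an assumption)


@dataclass
class AudioObject:
    samples: List[float]                      # audio file (AD), one separate track
    position_at: Callable[[float], Position]  # position information (PO) over time in seconds


@dataclass
class SpeakerConfig:
    room_id: str                        # hypothetical identifier of the playback room
    speaker_positions: List[Position]   # information I20: number and positions of speakers


# Example: an actor's voice moving from left to right across the screen.
voice = AudioObject(samples=[0.0] * 8, position_at=lambda t: (-2.0 + 2.0 * t, 3.0))
room = SpeakerConfig(room_id="cinema-7",
                     speaker_positions=[(-2.0, 0.0), (0.0, 0.0), (2.0, 0.0)])
```

Modelling the position as a function of time mirrors the description of position information as a trajectory rather than a single point.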

For this purpose, the wave field synthesis processor includes a number of inputs (see AD1-ADn) via which a plurality of audio signals for different audio objects are supplied. An input (see AD1) receives, for example, audio file 1 for the first audio object together with the position information assigned to it. In a cinema setting, audio object 1 would be, for example, the voice of an actor who moves from the left to the right of the screen, and possibly also away from and towards the viewer. Audio file 1 would then be the actual voice of this actor, while the position information represents, as a function of time, the current position of the first actor in the recording setting at a given time. Audio file n, by contrast, would be, for example, the voice of a further actor who moves in the same way as the first actor or differently. The current position of this other actor is provided to the wave field synthesis processor 10 by position information synchronized with audio signal n. In practice, different virtual audio objects exist depending on the recording setting, and the audio file of each audio object is supplied to the wave field synthesis processor 10 as a separate track.

As stated above, the wave field synthesis processor outputs a plurality of speaker channels LS1-LSn, either in directly playable analog form or, preferably, in digital form, which can then be played back directly via the speakers of the speaker array 20. The wave field synthesis processor 10 receives the positions of the individual speakers in the playback setting, such as a cinema auditorium (see listening room 22 and speaker array 20), as input information I20.

In addition, further information, such as the room acoustics, can be read in via this information input I20.

In general, the speaker signal assigned, for example, to speaker channel LS1 is a superposition of component signals of the virtual audio objects, so that the speaker signal for speaker LS1 includes a first component originating from audio object 1, a second component originating from audio object 2, and an n-th component originating from audio object n. After being calculated, the individual component signals are linearly superimposed, i.e. added, in order to reproduce at the listener's ear the linear superposition of sound sources that the listener would perceive in a real setting. Owing to this superposition, the first, second and n-th audio objects are contained in each speaker channel LS1-LSn, with the audio file scaled by different scaling factors and/or delayed by different delay factors for each speaker channel (LS1 and LSn). Note that the scaling for individual speaker channels LS1-LSn can also go down to zero, so that the audio object is no longer audible in that speaker channel.
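The scale, delay and sum structure of the speaker signals can be illustrated with a strongly simplified renderer. It is not an actual wave field synthesis driving function: the sketch assumes static point sources, a delay proportional to the source-to-speaker distance, and a 1/distance gain, and the name `render` and its parameters are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def render(objects, speakers, sample_rate=48000):
    """Return one channel per speaker as a sum of delayed, scaled object tracks.

    objects:  list of (samples, (x, y)) with static positions (simplification)
    speakers: list of (x, y) speaker positions
    """
    paths = []
    for sx, sy in speakers:
        row = []
        for _, (ox, oy) in objects:
            dist = math.hypot(sx - ox, sy - oy)
            delay = round(dist / SPEED_OF_SOUND * sample_rate)  # delay in samples
            gain = 1.0 / max(dist, 1.0)                         # simple distance gain
            row.append((delay, gain))
        paths.append(row)

    out_len = (max(len(s) for s, _ in objects)
               + max(d for row in paths for d, _ in row))
    channels = []
    for row in paths:
        ch = [0.0] * out_len
        for (samples, _), (delay, gain) in zip(objects, row):
            for i, x in enumerate(samples):
                ch[i + delay] += gain * x  # linear superposition of components
        channels.append(ch)
    return channels
```

Every object contributes to every channel, differing only in its delay and gain, which is the property the watermarking approach relies on.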

Fig. 4 shows a watermark embedder 30 for embedding a watermark WS into an audio file AD in order to generate a modified audio file AD'.

The watermark embedder 30 reads in both the audio file AD, which exists, for example, as a PCM signal or as a bit stream of time-discrete audio samples, and the watermark WS to be embedded. These two digital signals AD and WS are then converted into spectral form, i.e. into audio spectral values AD_S and watermark spectral values WS_S (see stage 30a), for example by means of a frequency spreader. WS can be converted into WS_S, for example, by multiplying the data signal WS by a noise signal (white noise) or a pseudo-noise signal. AD can be converted into AD_S directly, for example by means of a fast Fourier transform. Starting from the audio file AD and its spectral form AD_S, a psychoacoustic model can be determined that indicates, among other things, regions suitable for masking (for example, regions with high overall energy of the audio signal) and (temporal) masking thresholds. The masking thresholds indicate how far the audio signal can be changed without the change affecting the resulting auditory perception.

Different mechanisms are available for this, such as temporal masking (post-masking, pre-masking or simultaneous masking) as well as noise masking (masking of noise by a signal or of a signal by noise). Once these masking thresholds and masking regions of AD_S, which can be used to insert the data signal into AD in masked form, are known, AD_S and WS_S are combined in a second stage (see reference sign 30b). Specifically, in the combining step, the audio signal AD_S is superimposed with a weighted version of the data signal WS_S, the weighting taking into account the determined masking thresholds and masking regions. The result of this superposition is the modified audio signal (AD', and WS_S' in the spectral variant). With this procedure, the audio file AD can be modified so that it becomes a carrier for a data signal such as the watermark WS, without any change in audio playback that is audible to humans when the audio file AD' is played.
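The two stages (spreading the watermark, then adding a weighted version to the audio) can be sketched in the time domain as follows. This is a rough stand-in under stated assumptions: the psychoacoustic model is replaced by the RMS level of each block, the spreading uses a seeded ±1 pseudo-noise sequence, and `pn_sequence`, `embed`, `chip_len` and `strength` are hypothetical names and parameters.

```python
import random


def pn_sequence(seed, length):
    """Reproducible pseudo-noise sequence of +1/-1 chips (assumed spreading code)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(audio, bits, seed, chip_len=64, strength=0.05):
    """Add a spread watermark to `audio`, one bit per block of `chip_len` samples."""
    pn = pn_sequence(seed, chip_len)
    out = list(audio)
    for k, bit in enumerate(bits):
        start = k * chip_len
        block = audio[start:start + chip_len]
        if len(block) < chip_len:
            break  # not enough audio left to carry this bit
        # Crude stand-in for a masking threshold: scale with the block's RMS level,
        # so loud passages carry more watermark energy and silence carries none.
        rms = (sum(x * x for x in block) / chip_len) ** 0.5
        sign = 1.0 if bit else -1.0
        for i in range(chip_len):
            out[start + i] += strength * rms * sign * pn[i]
    return out
```

A real embedder would derive the weights from frequency-dependent masking thresholds rather than a single level per block.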

Fig. 1a shows an apparatus 100 for generating a copy-protected wave field synthesis audio representation of an audio scene. The apparatus 100 includes inputs for a plurality of audio objects (see AD1 + PO1 and ADn + POn) and outputs for a plurality of speaker channels LS1-LSn. The apparatus 100 further includes a watermark embedder 102 and a wave field synthesis processor 104. The watermark embedder 102 is arranged on the input side, i.e. at the inputs for the audio objects AD1 + PO1 and ADn + POn. The wave field synthesis processor 104 is provided on the output side, i.e. at the outputs for the speaker channels LS1-LSn. The mode of operation of the apparatus 100 is described below with reference to Fig. 1b, which shows the associated method.

The wave field synthesis audio representation of an audio scene is based on at least a plurality of audio objects (see AD1 + PO1 and ADn + POn). As already explained above, each audio object includes an audio file (AD1 or ADn) and assigned position information (PO1 or POn).

In a first step (see step 120 in Fig. 1b), the apparatus 100 embeds the watermark WS, which is available to the watermark embedder 102 as a digital signal, into at least one audio file, i.e. AD1 or ADn, of the plurality of audio objects. The watermark specifies the specific playback room for which the wave field synthesis audio representation is rendered. The watermark can include an ID or an individual unique ID of the playback room, of the player in the playback room, or generally of a key assigned to that room. Embedding can be performed according to the process described above. The result of the embedding is at least one modified audio file AD1' or ADn' (here AD1').

The watermark embedder 102 thus outputs the modified audio file AD1' together with the position information PO1, and additionally forwards the unmodified audio file ADn together with the position information POn. If, according to further embodiments, the watermark embedder 102 embeds the watermark into several audio files AD1 and ADn, several modified audio files AD1' and ADn' are output together with the position information PO1 and POn. Alternatively, the position information need not be passed through the watermark embedder 102 but can be supplied directly to the wave field synthesis processor 104.

According to further embodiments, the watermark embedder 102 can embed the watermark into only one audio file having a specific characteristic. The characteristic can be, for example, the volume of the audio object relative to other audio objects, or the movement of the audio object relative to other objects. The watermark embedder 102 can also be configured to examine the plurality of audio objects for the characteristic and to select one of them for embedding the watermark.
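Selecting the carrier object by relative volume could be done along these lines. The sketch assumes that "volume" is measured as the RMS level over the whole track; the helper names `rms` and `pick_carrier` are hypothetical.

```python
def rms(samples):
    """Root-mean-square level of a track (0.0 for an empty track)."""
    if not samples:
        return 0.0
    return (sum(x * x for x in samples) / len(samples)) ** 0.5


def pick_carrier(audio_files):
    """Return the index of the loudest audio file, used as the watermark carrier."""
    return max(range(len(audio_files)), key=lambda i: rms(audio_files[i]))
```

For example, `pick_carrier([[0.1, -0.1], [0.8, -0.7], [0.3, 0.3]])` selects the second track (index 1), which has the highest level.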

Although the watermark embedder 102 has been described as having the functionality of the watermark embedder described with reference to Fig. 4, the watermark embedder 102 can also be configured differently and can use other embedding mechanisms for watermarks.

The wave field synthesis processor 104 is the second functional element of the apparatus 100. Starting from the plurality of audio objects (AD1' + PO1 and ADn + POn), at least one of which includes a modified audio file AD1', it calculates the wave field synthesis audio representation for the respective playback room (see step 140 in Fig. 1b), i.e. the scaling of the individual audio objects, in order to output the audio objects in scaled, delayed and summed form via the individual speaker channels LS1-LSn. For this purpose, the wave field synthesis processor also receives the information I20 on the speaker configuration in addition to the audio files AD1'/ADn and position information PO1/POn of the audio objects. The calculation is basically performed as described above. The audio representation of the audio scene is thus output as a plurality of speaker channels LS1-LSn and can be stored on a memory medium such as a hard drive or a Blu-ray disc, the plurality of speaker channels LS1-LSn preferably being stored separately.

As a result, the watermark (audio watermark) is distributed (statically and temporally) across all or at least several loudspeaker channels (LS1-LSn) and has the same acoustic position as the respective audio object. From a psychoacoustic viewpoint, it is therefore optimally inaudible, since the same direction also means maximum masking. Further, it can be ensured that the watermark cannot easily be detected and removed, for example by comparing the individual loudspeaker channels. The reason is that the watermark is distributed across all or at least most of the loudspeaker channels, but with different scaling and delay, so that no correlation between the channels that would allow conclusions about the watermark can be detected.

FIG. 2a shows an apparatus (200) for reproducing a copy-protected wave field synthesis audio rendition of an audio scene. The apparatus (200) includes a watermark detector (202) and a player (204). The apparatus (200) includes a data interface for the loudspeaker channels (LS1-LSn), which can be accessed by both the watermark detector (202) and the player (204). The player (204) is, on the one hand, informationally connected to the watermark detector (202) and, on the other hand, coupled to the loudspeaker array (20), either directly or via an amplifier, by means of a plurality of loudspeaker channels, indicated here as LS1*-LSn*. Below, the mode of operation of the apparatus (200) is discussed together with the associated method on which the apparatus (200) is based (see FIG. 2b).

The wave field synthesis audio rendition, which can be stored, for example, on a mobile data carrier, is read into the apparatus (200) in the form of already rendered loudspeaker channels (LS1-LSn), the individual loudspeaker channels (LS1-LSn) being available to both components (202 and 204) of the apparatus (200).

In a first step (see step 220 in FIG. 2b), the watermark to be detected (SWS), which is stored in the watermark detector (202) or can be read in from outside, is detected. Reading in the watermark to be detected (SWS) can be performed, for example, by means of a dongle or, generally, by means of an external storage medium connected to the apparatus (200). The watermark to be detected (SWS) corresponds to the watermark (WS) discussed with reference to FIG. 1. For detection, the watermark to be detected (SWS) is typically rendered in advance, the rendering being performed essentially analogously to the embedding: the watermark is converted into spectral form, i.e., by means of a noise generator (frequency spreader). This spectral version of the watermark to be detected (SWS) can then be compared with the loudspeaker channels (LS1-LSn) using a correlator. Preferably, the watermark detector (202) is configured to detect the watermark to be detected (SWS) in a plurality of loudspeaker channels (LS1-LSn).
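The interplay of noise generator and correlator can be sketched as follows. This is a time-domain direct-sequence illustration only, whereas the description spreads the watermark in the spectrum; the key, strength and threshold values, as well as all function names, are assumptions made for the example.

```python
import random


def pn_sequence(key, length):
    """Pseudo-noise sequence of +1/-1 values derived from a key (noise generator)."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(samples, key, strength):
    """Add the spread watermark to the host signal at low amplitude."""
    pn = pn_sequence(key, len(samples))
    return [s + strength * p for s, p in zip(samples, pn)]


def correlate(channel, reference):
    """Mean product of the channel and the reference sequence (correlator)."""
    return sum(c * r for c, r in zip(channel, reference)) / len(reference)


def detect(channel, key, threshold):
    """True when the channel correlates with the sequence expected for this key."""
    return correlate(channel, pn_sequence(key, len(channel))) > threshold


# Hypothetical host signal: Gaussian noise standing in for one loudspeaker channel.
host_rng = random.Random(12345)
host = [host_rng.gauss(0.0, 0.3) for _ in range(20000)]

marked = embed(host, key=22, strength=0.1)            # key 22 stands for the room
found = detect(marked, key=22, threshold=0.05)        # correct room key
wrong_room = detect(marked, key=23, threshold=0.05)   # key of a different room
unmarked = detect(host, key=22, threshold=0.05)       # channel without watermark
```

Only the detector that knows the key can regenerate the reference sequence, which is why the watermark to be detected (SWS) has to be stored in the detector or supplied from outside.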

According to a further embodiment, the watermark can be detected in the loudest loudspeaker channel only, for example when the watermark is allocated to the loudest audio object, since the loudest loudspeaker channel typically also includes the loudest object. It should be noted that this does not necessarily apply, in particular when several spatially adjacent audio objects are louder than the individually loudest object.

When the watermark has thus been found in one loudspeaker channel or, preferably, in several loudspeaker channels by means of correlation, an enable signal can be sent to the player (204), which then enables reproduction of the wave field synthesis audio rendition.

Consequently, the player (204) reproduces the audio rendition (see step 240 in FIG. 2b), where the actual reproduction basically amounts only to transmitting the loudspeaker signals (LS1-LSn) to the loudspeaker array (20), for example in amplified form as loudspeaker signals (LS1*-LSn*).
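The enable-and-play logic can be summarised in a short sketch. It assumes that detection in at least one loudspeaker channel is sufficient to enable playback; the detector used here is a deliberately trivial stand-in, and all names are hypothetical.

```python
def play(channels, watermark_found, gain=2.0):
    """Return the amplified loudspeaker signals, or None if playback is not enabled.

    channels        : pre-rendered loudspeaker channels (LS1-LSn)
    watermark_found : callable that reports whether one channel carries the watermark
    """
    enable = any(watermark_found(ch) for ch in channels)  # enable signal
    if not enable:
        return None  # active prevention of reproduction
    # Playback is merely forwarding the (amplified) channels; no renderer is needed.
    return [[gain * s for s in ch] for ch in channels]


def toy_detector(channel):
    """Stand-in for a real correlator: 'detects' the marker value 0.5."""
    return 0.5 in channel


played = play([[0.0, 0.5], [0.25, 0.0]], toy_detector)
blocked = play([[0.0, 0.25]], toy_detector)
```

Because the player only forwards already rendered channels, the gate is the only additional work on the playback side, which is consistent with the low computing power mentioned in the description.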

According to a further embodiment, active prevention of reproduction by the player (204) on the basis of the watermark detector (202) would be possible. This has the advantage that destroying the watermark in the loudspeaker channels (LS1-LSn) would still not result in the loudspeaker channels (LS1-LSn), and thus the wave field synthesis audio rendition, being reproduced.

Overall, the concept described above offers the advantage that no separate renderer is required on the player side, so that the required computing power can be kept low. Owing to this reduced computing power, pre-rendered content protected by the audio watermark can also be played back on low-performance platforms, such as embedded boards or DSPs in connection with a data memory. Such players can then be used as mobile systems, for example as switch boxes, wall boxes, external devices or separate devices.

Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, so that a block or device of an apparatus also corresponds to a respective method step or to a feature of a method step. Analogously, aspects described in the context of a method also represent a description of a corresponding block, detail or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus, such as a microprocessor, a programmable computer or an electronic circuit. In some embodiments, some or several of the most important method steps may be executed by such an apparatus.

An inventive encoded signal, such as an audio signal, a video signal or a transport stream signal, can be stored on a digital memory medium or can be transmitted over a transmission medium such as a wired transmission medium or a wireless transmission medium, for example the Internet.

The inventive encoded audio signal can be stored on a digital memory medium or can be transmitted over a transmission medium such as a wired transmission medium or a wireless transmission medium, for example the Internet.

Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-ray disc, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, a hard drive or another magnetic or optical memory, on which electronically readable control signals are stored that cooperate, or are capable of cooperating, with a programmable computer system such that the respective method is performed. The digital storage medium may therefore be computer-readable.

Some embodiments according to the invention include a data carrier comprising electronically readable control signals that are capable of cooperating with a programmable computer system such that one of the methods described herein is performed.

Generally, embodiments of the present invention can be implemented as a computer program product having program code, the program code being operative to perform one of the methods when the computer program product runs on a computer.

The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise the computer program for performing one of the methods described herein, the computer program being stored on a machine-readable carrier.

In other words, an embodiment of the inventive method is therefore a computer program comprising program code for performing one of the methods described herein when the computer program runs on a computer.

A further embodiment of the inventive methods is therefore a data carrier (for example a digital storage medium or a computer-readable medium) on which the computer program for performing one of the methods described herein is recorded.

A further embodiment of the inventive method is therefore a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises processing means, for example a computer or a programmable logic device, configured or adapted to perform one of the methods described herein.

A further embodiment comprises a computer on which the computer program for performing one of the methods described herein is installed.

A further embodiment according to the invention comprises an apparatus or a system configured to transmit a computer program for performing one of the methods defined herein to a receiver. The transmission can be performed electronically or optically. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.

In some embodiments, a programmable logic device (for example a field-programmable gate array, FPGA) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field-programmable gate array may cooperate with a microprocessor in order to perform one of the methods defined herein. Generally, the methods are preferably performed by any hardware apparatus. This can be universally applicable hardware such as a computer processor (CPU), or hardware specific to the method, such as an ASIC.

The embodiments described above are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art. It is therefore intended that the invention be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.

Claims

1. An apparatus (100) for generating a copy-protected wave field synthesis audio rendition of an audio scene having a plurality of audio objects,
each audio object including an audio file (AD1, AD2, ADn) and position information (PS1, PS2, PSn), the apparatus comprising:
a watermark embedder (102) for embedding a watermark (WS) in at least one of the audio files (AD1, AD2, ADn) of the plurality of audio objects in order to generate a modified audio file (AD1') for at least one audio object, the watermark (WS) specifying the specific reproduction room (22) for which the wave field synthesis audio rendition is rendered in accordance with the loudspeaker configuration (I20) present in the specific reproduction room (22); and
a wave field synthesis processor (104) for generating the copy-protected wave field synthesis audio rendition of the audio scene by using the loudspeaker configuration (I20) of the specific reproduction room (22), the modified audio file (AD1') and the position information (PS1, PS2, PSn) for the at least one audio object.

2. The apparatus (100) according to claim 1,
wherein the watermark embedder (102) is configured to embed the watermark (WS) in the audio file (AD1, AD2, ADn) of that audio object of the plurality of audio objects which has a predetermined characteristic.

3. The apparatus (100) according to claim 2,
wherein the predetermined characteristic comprises a relative loudness of an audio object of the plurality of audio objects with respect to the other audio objects, and/or wherein the predetermined characteristic comprises a relative movement of an audio object of the plurality of audio objects.

4. The apparatus (100) according to any one of claims 1 to 3,
wherein the wave field synthesis processor (104) is configured to calculate a plurality of loudspeaker channels (LS1, LS2, LSn) in order to generate the copy-protected wave field synthesis audio rendition of the audio scene, the plurality of loudspeaker channels (LS1, LS2, LSn) comprising the audio files (AD1, AD2, ADn) of the plurality of audio objects, scaled by different scaling factors and/or delayed by different delay factors in accordance with the position information (PS1, PS2, PSn).

5. The apparatus (100) according to claim 4,
wherein at least two of the plurality of loudspeaker channels (LS1, LS2, LSn) comprise the one modified audio file (AD1') for the at least one audio object with different scalings and/or different delays.

6. The apparatus (100) according to claim 4 or 5,
wherein the plurality of loudspeaker channels (LS1, LS2, LSn) comprise at least 40 channels.

7. The apparatus (100) according to any one of claims 1 to 6,
wherein the watermark embedder (102) is configured to embed the watermark (WS) in the frequency spectrum (S_AD) of the audio file (AD1, AD2, ADn).

8. The apparatus (100) according to any one of claims 1 to 7,
wherein the watermark embedder (102) is configured to embed the watermark (WS) in the audio file (AD1, AD2, ADn) such that the watermark (WS) is masked using post-masking, pre-masking, simultaneous masking and/or noise masking.

9. A method for generating a copy-protected wave field synthesis audio rendition of an audio scene having a plurality of audio objects,
each audio object including an audio file (AD1, AD2, ADn) and position information (PS1, PS2, PSn), the method comprising:
embedding (120) a watermark (WS) in at least one of the audio files (AD1, AD2, ADn) of the plurality of audio objects in order to generate a modified audio file (AD1') for at least one audio object, the watermark (WS) specifying the specific reproduction room (22) for which the wave field synthesis audio rendition is rendered in accordance with the loudspeaker configuration (I20) present in the specific reproduction room (22); and
generating (140) the copy-protected wave field synthesis audio rendition of the audio scene by using the loudspeaker configuration (I20) of the specific reproduction room (22), the modified audio file (AD1') and the position information (PS1, PS2, PSn) for the at least one audio object.

10. An apparatus (200) for reproducing a copy-protected wave field synthesis audio rendition of an audio scene in a specific reproduction room (22), comprising:
a watermark detector (202) for detecting a watermark (WS) specifying the specific reproduction room (22) in at least one loudspeaker channel (LS1, LS2, LSn) of the copy-protected wave field synthesis audio rendition of the audio scene; and
a player (204) for reproducing the copy-protected wave field synthesis audio rendition only when the watermark detector (202) detects the watermark (WS) specifying the specific reproduction room (22) for which the wave field synthesis audio rendition is rendered in accordance with the loudspeaker configuration (I20) present in the specific reproduction room (22).

11. The apparatus (200) according to claim 10,
wherein the player (204) is configured to prevent reproduction when no watermark (WS) matching the watermark to be detected (SWS) is detected by the watermark detector (202).

12. The apparatus (200) according to claim 10 or 11,
wherein the watermark to be detected (SWS) is stored in the watermark detector (202), or wherein the apparatus includes an interface to which a portable data carrier on which the watermark to be detected (SWS) is stored can be connected.

13. The apparatus (200) according to any one of claims 10 to 12,
wherein the watermark detector (202) comprises a frequency spreader and a correlator configured to determine a correlation between the watermark to be detected (SWS) and the signal of the at least one loudspeaker channel (LS1, LS2, LSn).

14. The apparatus (200) according to any one of claims 10 to 12,
wherein the player (204) is connected to a loudspeaker array (20) of the specific reproduction room (22), the loudspeaker array comprising a plurality of loudspeakers, each loudspeaker being assigned a separate loudspeaker channel (LS1, LS2, LSn).

15. A method for reproducing a copy-protected wave field synthesis audio rendition of an audio scene in a specific reproduction room (22), comprising:
detecting (220), in at least one loudspeaker channel (LS1, LS2, LSn) of the copy-protected wave field synthesis audio rendition of the audio scene, a watermark (WS) specifying the specific reproduction room (22) for which the wave field synthesis audio rendition is rendered in accordance with the loudspeaker configuration (I20) present in the specific reproduction room (22); and
reproducing (240) the copy-protected wave field synthesis audio rendition only when the watermark (WS) specifying the specific reproduction room (22) is detected.

16. A computer program having program code for performing the method according to claim 9 or 15 when the program is executed on a computer.