EP1864279A1 - Device and method for producing a data flow and for producing a multi-channel representation

Device and method for producing a data flow and for producing a multi-channel representation

Info

Publication number
EP1864279A1
EP1864279A1 EP06707562A EP06707562A EP1864279A1 EP 1864279 A1 EP1864279 A1 EP 1864279A1 EP 06707562 A EP06707562 A EP 06707562A EP 06707562 A EP06707562 A EP 06707562A EP 1864279 A1 EP1864279 A1 EP 1864279A1
Authority
EP
European Patent Office
Prior art keywords
channel
multi
fingerprint
block
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP06707562A
Other languages
German (de)
French (fr)
Other versions
EP1864279B1 (en
Inventor
Wolfgang Fiesel
Matthias Neusinger
Harald Popp
Stephan Geyersberger
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Original Assignee
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to DE102005014477A (patent DE102005014477A1)
Application filed by Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority to PCT/EP2006/002369 (patent WO2006102991A1)
Publication of EP1864279A1
Application granted
Publication of EP1864279B1
Application status is Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008: Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing

Abstract

The aim of the invention is to temporally synchronise a data flow comprising multi-channel additional data with a data flow comprising data on at least one base channel (3). On the encoding side, fingerprint information is calculated (2) for the at least one base channel (3) and is introduced (4) into a data flow in temporal association with the multi-channel additional data. On the decoding side, fingerprint information is calculated from the at least one base channel and is used, together with the fingerprint information extracted from the data flow, to calculate, for example by means of a correlation, and to compensate a time offset between the data flow comprising the multi-channel additional information and the data flow comprising the at least one base channel, in order to obtain a synchronised multi-channel representation.

Description

Apparatus and method for generating a data stream and for generating a multi-channel representation

The present invention relates to audio signal processing and more particularly to multi-channel processing techniques in which a multi-channel reconstruction of an original multi-channel signal is generated on the basis of at least one base channel or downmix channel and multi-channel additional information.

Technologies currently under development enable ever more efficient transmission of audio signals through data reduction, and also increase listening pleasure through extensions such as multi-channel technology. Examples of such extensions of conventional transmission techniques have become known in recent years under the names Binaural Cue Coding (BCC) and "Spatial Audio Coding", as described in J. Herre, C. Faller, S. Disch, C. Ertel, J. Hilpert, A. Hoelzer, K. Linzmeier, C. Spenger, P. Kroon: "Spatial Audio Coding: Next-generation Efficient and Compatible Coding of Multi-channel Audio", 117th AES Convention, San Francisco 2004, Preprint 6186.

Various techniques for reducing the amount of data required for transmitting a multi-channel audio signal are discussed in detail below.

Such techniques are called joint stereo techniques. For this purpose, reference is made to Fig. 3, which shows a joint stereo device 60. This device can be a device implementing, for example, the intensity stereo (IS) technique or binaural cue coding (BCC). Such a device typically receives at least two channels CH1, CH2, ..., CHn as an input signal and outputs a single carrier channel and parametric multi-channel information. The parametric data are defined such that an approximation of an original channel (CH1, CH2, ..., CHn) can be calculated in a decoder.

Normally, the carrier channel will comprise subband samples, spectral coefficients, time domain samples etc., which provide a relatively fine representation of the underlying signal, while the parametric data do not include such samples or spectral coefficients but include control parameters for controlling a certain reconstruction algorithm, such as weighting by multiplication, time shifting, frequency shifting, etc. The parametric multi-channel information therefore comprises a relatively coarse representation of the signal or the associated channel. Expressed in numbers, the amount of data required by a carrier channel is about 60 to 70 kbit/s, while the amount of data required by parametric side information for one channel is in the range of 1.5 to 2.5 kbit/s. It should be noted that the above numbers apply to compressed data. Of course, a non-compressed CD channel requires data rates about ten times higher. Examples of parametric data are the known scale factors, intensity stereo information or BCC parameters, as set forth below.

The technique of intensity stereo coding is described in the AES preprint 3799, "Intensity Stereo Coding", J. Herre, K. H. Brandenburg, D. Lederer, February 1994, Amsterdam. Generally, the concept of intensity stereo is based on a main axis transformation to be applied to the data of both stereophonic audio channels. If most of the data points are concentrated around the first principal axis, a coding gain can be achieved by rotating both signals by a certain angle before coding takes place. However, this does not always hold for real stereophonic reproduction techniques. Therefore, this technique is modified such that the second orthogonal component is excluded from transmission in the bit stream. Thus, the reconstructed signals for the left and right channels consist of differently weighted or scaled versions of the same transmitted signal. The reconstructed signals differ in amplitude, but they are identical with respect to their phase information. The energy-time envelopes of both original audio channels, however, are preserved by the selective scaling operation, which typically operates in a frequency-selective manner. This corresponds to the human perception of sound at high frequencies, where the dominant spatial cues are determined by the energy envelopes.

In addition, in practical implementations the transmitted signal, i.e. the carrier channel, is generated from the sum signal of the left channel and the right channel instead of rotating both components. Furthermore, this processing, i.e. generating intensity stereo parameters for performing the scaling operations, is performed in a frequency-selective manner, that is, independently for each scale factor band, i.e. for each encoder frequency partition. Preferably, both channels are combined to form a combined or "carrier" channel, and the intensity stereo information is determined in addition to the combined channel. The intensity stereo information depends on the energy of the first channel, the energy of the second channel or the energy of the combined channel.
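
The intensity stereo principle described above can be illustrated with a minimal Python sketch. It is not the coder from the cited preprint: the function names, the band layout (one list of values per scale factor band) and the choice of square-root energy ratios as scale factors are assumptions made here for illustration only.

```python
import math

def intensity_stereo_encode(left_bands, right_bands):
    """Form a sum-signal carrier and one pair of scale factors per band.

    left_bands / right_bands: list of bands, each band a list of values.
    The scale factors are energy ratios (an assumption for this sketch).
    """
    carrier, scales = [], []
    for lb, rb in zip(left_bands, right_bands):
        cb = [l + r for l, r in zip(lb, rb)]      # carrier = sum signal
        e_l = sum(x * x for x in lb)              # energy of first channel
        e_r = sum(x * x for x in rb)              # energy of second channel
        e_c = sum(x * x for x in cb)              # energy of combined channel
        if e_c == 0.0:
            scales.append((0.0, 0.0))             # nothing to scale
        else:
            scales.append((math.sqrt(e_l / e_c), math.sqrt(e_r / e_c)))
        carrier.append(cb)
    return carrier, scales

def intensity_stereo_decode(carrier, scales):
    """Rebuild left/right as differently scaled copies of the same carrier."""
    left = [[g_l * x for x in cb] for cb, (g_l, _) in zip(carrier, scales)]
    right = [[g_r * x for x in cb] for cb, (_, g_r) in zip(carrier, scales)]
    return left, right
```

Both reconstructed channels share the phase of the carrier and differ only in amplitude, while the per-band energies of the original channels are preserved, which matches the behaviour described in the text.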

The BCC technique is described in the AES convention paper 5574 "Binaural Cue Coding Applied to Stereo and Multichannel Audio Compression", C. Faller, F. Baumgarte, May 2002, Munich. In BCC coding, a number of audio input channels are converted to a spectral representation using a DFT-based transform with overlapping windows. The resulting spectrum is divided into non-overlapping partitions, each of which has an index. Each partition has a bandwidth proportional to the equivalent rectangular bandwidth (ERB). The inter-channel level differences (ICLD) and the inter-channel time differences (ICTD) are calculated for each partition and for each frame k. The ICLD and ICTD are quantized and encoded to finally arrive as side information in a BCC bit stream. The inter-channel level differences and the inter-channel time differences are given for each channel relative to a reference channel. The parameters are then calculated in accordance with predetermined formulas, which depend on the specific partitions of the signal to be processed.
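
As a rough illustration of the level-difference part of this analysis, the following Python sketch computes one ICLD value per partition for every channel relative to a reference channel. The function names, the partition edges and the use of a power ratio in dB are assumptions for this sketch; the predetermined formulas of the cited paper are not reproduced here, and the ICTD estimation, quantization and encoding are omitted.

```python
import math

def partition_power(spectrum, edges):
    """Sum |X|^2 of the spectral values inside each partition [edges[b], edges[b+1])."""
    return [sum(abs(x) ** 2 for x in spectrum[edges[b]:edges[b + 1]])
            for b in range(len(edges) - 1)]

def icld_db(spectra, edges, ref=0, eps=1e-12):
    """Level difference in dB of every non-reference channel, per partition.

    spectra: one list of spectral values per channel for the current frame.
    edges:   hypothetical partition boundaries (bin indices).
    eps:     small constant to avoid log(0) and division by zero.
    """
    powers = [partition_power(s, edges) for s in spectra]
    return [
        [10.0 * math.log10((p + eps) / (p_ref + eps))
         for p, p_ref in zip(powers[c], powers[ref])]
        for c in range(len(spectra)) if c != ref
    ]
```

For a frame-wise analysis the function would be called once per frame k, yielding a parameter set per partition and per frame as described above.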

On the decoder side, the decoder typically receives a mono signal and the BCC bit stream. The mono signal is transformed into the frequency domain and input into a spatial synthesis block, which also receives decoded ICLD and ICTD values. In the spatial synthesis block, the BCC parameters (ICLD and ICTD) are used to perform a weighting operation on the mono signal in order to synthesize the multi-channel signals, which, after a frequency/time conversion, represent a reconstruction of the original multi-channel audio signal.

In the case of BCC, the joint stereo module 60 is operative to output the channel side information such that the parametric channel data are quantized and encoded ICLD or ICTD parameters, wherein one of the original channels is used as the reference channel for coding the channel side information. Normally, the carrier signal is formed from the sum of the participating original channels.

Naturally, the above techniques only provide a mono representation for a decoder that can process only the carrier channel but is not able to process the parametric data for generating one or more approximations of more than one input channel.

The BCC technique is also described in the US patent publications US 2003/0219130 A1, US 2003/0026441 A1 and US 2003/0035553 A1. In addition, reference is made to the technical publication "Binaural Cue Coding. Part II: Schemes and Applications", C. Faller and F. Baumgarte, IEEE Trans. on Audio and Speech Proc., Vol. 11, No. 6, Nov. 2003.

Subsequently, a typical BCC scheme for multi-channel audio coding is presented in more detail with reference to Figs. 4 to 6.

Fig. 5 shows such a BCC scheme for coding/transmission of multi-channel audio signals. The multi-channel audio input signal at an input 110 of a BCC encoder 112 is downmixed in a so-called downmix block 114. In this example, the original multi-channel signal at the input 110 is a 5-channel surround signal having a front left channel, a front right channel, a left surround channel, a right surround channel and a center channel. In the preferred embodiment of the present invention, the downmix block 114 produces a sum signal by a simple addition of these five channels into a mono signal.

Other downmixing schemes are known in the art, such that a downmix signal with a single channel is obtained from a multi-channel input signal. This single channel is output on a sum signal line 115. Side information obtained from the BCC analysis block 116 is output on a side information line 117.

In the BCC analysis block, inter-channel level differences (ICLD) and inter-channel time differences (ICTD) are calculated as outlined above. More recently, the BCC analysis block 116 has also become able to calculate inter-channel correlation values (ICC values). The sum signal and the side information are transmitted in a quantized and encoded format to a BCC decoder 120. The BCC decoder decomposes the transmitted sum signal into a number of subbands and performs scalings, delays and other processing steps to provide the subbands of the multi-channel audio channels to be output. This processing is performed such that the ICLD, ICTD and ICC parameters (cues) of a reconstructed multi-channel signal at the output 121 match the respective cues of the original multi-channel signal at the input 110 of the BCC encoder 112. To this end, the BCC decoder 120 includes a BCC synthesis block 122 and a side information processing block 123.

Next, the internal construction of the BCC synthesis block 122 is explained with reference to Fig. 6. The sum signal on line 115 is fed into a time/frequency conversion unit or filter bank FB 125. At the output of block 125, there is a number N of subband signals or, in an extreme case, a block of spectral coefficients when the audio filter bank 125 performs a 1:1 transformation, that is, a transformation that generates N spectral coefficients from N time domain samples.

The BCC synthesis block 122 further comprises a delay stage 126, a level modification stage 127, a correlation processing stage 128 and an inverse filter bank stage IFB 129. At the output of stage 129, the reconstructed multi-channel audio signal, having for example five channels in the case of a 5-channel surround system, can be output to a set of loudspeakers 124, as shown in Fig. 5 or Fig. 4.

The input signal s(n) is converted into the frequency domain or filter bank domain by means of the element 125. The signal output by the element 125 is copied such that several versions of the same signal are obtained, as illustrated by the copy node 130. The number of versions of the original signal is equal to the number of output channels in the output signal. Then, each version of the original signal at node 130 is subjected to a certain delay d1, d2, ..., di, ..., dN. The delay parameters are computed by the side information processing block 123 in Fig. 5 and are derived from the inter-channel time differences as calculated by the BCC analysis block 116 of Fig. 5.

The same is true for the multiplication parameters a1, a2, ..., ai, ..., aN, which are also calculated by the side information processing block 123 based on the inter-channel level differences as calculated by the BCC analysis block 116.

The ICC parameters calculated by the BCC analysis block 116 are used for controlling the functionality of block 128, so that certain correlations between the delayed and level-manipulated signals are obtained at the outputs of block 128. It should be noted here that the order of stages 126, 127, 128 may differ from the order shown in Fig. 6. It should further be noted that, for frame-wise processing of the audio signal, the BCC analysis is also performed frame-wise, i.e. variable over time, and that a frequency-wise BCC analysis is also obtained, as can be seen from the filter bank decomposition of Fig. 6. This means that the BCC parameters are obtained for each spectral band. This further means that, in the case in which the audio filter bank 125 decomposes the input signal into, for example, 32 band-pass signals, the BCC analysis block obtains a set of BCC parameters for each of the 32 bands. Naturally, the BCC synthesis block 122 of Fig. 5, which is shown in detail in Fig. 6, performs a reconstruction that is also based on the 32 bands given by way of example.

A scenario used to determine individual BCC parameters is illustrated below with reference to Fig. 4. Normally, the ICLD, ICTD and ICC parameters can be defined between channel pairs. It is preferred, however, to determine ICLD and ICTD parameters between a reference channel and each other channel. This is illustrated in Fig. 4A.

ICC parameters can be defined in different ways. Generally speaking, one can determine ICC parameters in the encoder between all possible channel pairs, as illustrated in Fig. 4B. However, it has been proposed to calculate only ICC parameters between the two strongest channels at any one time, as shown in Fig. 4C, where an example is shown in which, at one time, an ICC parameter is calculated between channels 1 and 2 and, at a different time, an ICC parameter is calculated between channels 1 and 5. The decoder then synthesizes the inter-channel correlation between the strongest channels and uses certain heuristic rules for computing and synthesizing the inter-channel coherence for the remaining channel pairs. With respect to the calculation of, for example, the multiplication parameters a1, ..., aN based on transmitted ICLD parameters, reference is made to AES convention paper no. 5574. The ICLD parameters represent an energy distribution of an original multi-channel signal. Without loss of generality, it is preferable, as shown in Fig. 4A, to take four ICLD parameters representing the energy difference between the respective channels and the front left channel. In the side information processing block 123, the multiplication parameters a1, ..., aN are derived from the ICLD parameters such that the total energy of all reconstructed output channels is the same as (or proportional to) the energy of the transmitted sum signal.
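
One way to satisfy the energy constraint stated above is sketched below in Python. The exact derivation is given in the cited convention paper and is not reproduced here; this sketch assumes that each ICLD is an amplitude ratio in dB relative to the reference channel and normalizes the gains so that their squares sum to one. The function name is hypothetical.

```python
import math

def gains_from_icld(icld_db):
    """Derive gains a1..aN from N-1 level differences (dB) relative to channel 1.

    Assumption for this sketch: a_i / a_1 = 10 ** (ICLD_i / 20), and the
    gains are normalized so that sum(a_i ** 2) == 1, i.e. the total energy
    of all output channels equals the energy of the sum signal.
    """
    ratios = [10.0 ** (d / 20.0) for d in icld_db]            # a_i / a_1
    a_ref = 1.0 / math.sqrt(1.0 + sum(r * r for r in ratios))  # gain of reference
    return [a_ref] + [a_ref * r for r in ratios]
```

With four ICLD values, as in the 5-channel example of Fig. 4A, the function returns five gains; if all four differences are 0 dB, every channel receives the gain 1/sqrt(5).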

Generally, in such parametric multi-channel coding schemes, at least one base channel and the side information are generated, as is apparent from Fig. 5. Typically, block-based schemes are used in which, as can also be seen in Fig. 5, the original multi-channel signal at the input 110 is subjected to block processing by a block stage 111, such that, from a block of, for example, 1152 samples, the downmix signal or sum signal, i.e. the at least one base channel, is formed for that block, while the corresponding multi-channel parameters are generated at the same time for that block by the BCC analysis. After the downmix, the sum signal is typically encoded again with a block-based encoder, such as an MP3 encoder or an AAC encoder, in order to obtain a further data rate reduction. Similarly, the parameter data are encoded, for example by differential encoding, scaling/quantization and entropy coding.

Then, at the output of the overall encoder, which thus comprises the BCC encoder 112 and a downstream base channel encoder, a common data stream is written in which a block of the at least one base channel follows a previous block of the at least one base channel, and into which the encoded multi-channel additional information is also inserted, for example by a bit stream multiplexer.

This insertion takes place such that the data stream of base channel data and multi-channel additional information always includes a block of base channel data and, in association with this block, a block of multi-channel additional data, which then form, for example, a common transmission frame. This transmission frame is then sent over a transmission link to a decoder.

On the input side, the decoder in turn includes a data stream demultiplexer to split a frame of the data stream into a block of base channel data and a block of associated multi-channel additional information. Then, the block of base channel data is decoded, for example, by an MP3 decoder or an AAC decoder. This block of decoded base channel data is then fed, together with the block of optionally also decoded multi-channel additional information, to the BCC decoder 120.
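
The common transmission frame and its demultiplexing can be sketched in Python as follows. The frame layout (two 32-bit big-endian length fields followed by the two payloads) and the function names are purely hypothetical; the source does not specify a bit stream syntax.

```python
import struct

def mux_frame(base_block: bytes, side_info: bytes) -> bytes:
    """Write one transmission frame: [len(base)][len(side)][base][side]."""
    return struct.pack(">II", len(base_block), len(side_info)) + base_block + side_info

def demux_stream(stream: bytes):
    """Split a stream of frames back into (base_block, side_info) pairs."""
    frames, pos = [], 0
    while pos < len(stream):
        n_base, n_side = struct.unpack_from(">II", stream, pos)
        pos += 8                                   # skip the two length fields
        base = stream[pos:pos + n_base]
        pos += n_base
        side = stream[pos:pos + n_side]
        pos += n_side
        frames.append((base, side))
    return frames
```

Because each frame carries both data types, the pairing of a base channel block with its additional information is given by the frame itself, which is exactly the automatic association discussed in the following paragraph.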

Thus, due to the common transmission of base channel data and additional information, the temporal association of the additional information with the base channel data is set automatically and can readily be restored by a decoder that operates frame by frame. Because the two data types are transmitted together in a single data stream, the decoder automatically finds, so to speak, the additional information associated with a block of base channel data, so that a high-quality multi-channel reconstruction is possible. The problem that the multi-channel additional information has a time offset relative to the base channel data will therefore not occur. If such an offset were present, however, it would lead to a significant loss of quality in the multi-channel reconstruction, since a block of base channel data would then be processed together with multi-channel additional data that do not belong to this block of base channel data but, for example, to an earlier or later block.

Such a scenario, in which the association between multi-channel additional data and base channel data is no longer given, will occur if no common data stream is written, but if there is one separate data stream with the base channel data and another, separate data stream with the multi-channel additional information. Such a situation can arise, for example, in a sequentially operating transmission system such as broadcasting or the Internet. Here, the audio program to be transmitted is divided into audio base data (mono or stereo downmix audio signal) and extension data (multi-channel additional information), which are sent individually or combined. Even if the two data streams are still sent out synchronously by a transmitter, many "surprises" may lurk on the transmission path to the receiver, which lead to the data stream with the multi-channel additional data, which is significantly more compact in terms of the number of bits, being transmitted to a receiver faster, for example, than the data stream with the base channel data.

Further, it is preferred to use encoders/decoders with a non-constant output data rate in order to achieve particularly good bit efficiency. Here, it is impossible to predict how long the decoding of a block of base channel data will take. Furthermore, this processing also depends on the hardware components actually used for decoding, as they must be present, for example, in a PC or a digital receiver. Further, there are system-inherent or algorithm-inherent uncertainties, since, in particular with the bit reservoir technique, a constant output data rate is generated on average, but, viewed locally, bits that are not needed for a block that is particularly easy to code are saved so that they can be taken from the bit reservoir again for another block that is particularly difficult to code, for example because the audio signal is particularly transient.

On the other hand, the separation of the common data stream described above into two separate data streams has particular advantages. For instance, a classic receiver, for example a pure mono or stereo receiver, is able to receive and reproduce the audio base data at any time, regardless of the content and version of the multi-channel additional information. The separation into separate data streams thus ensures the backward compatibility of the whole concept.

By contrast, a receiver of the newer generation can evaluate these multi-channel additional data and combine them with the audio base data such that the full extension, here the multi-channel sound, can be provided to the user.

A particularly interesting application scenario for the separate transmission of audio base data and extension data is digital broadcasting. Here, with the help of the multi-channel additional information, the stereo audio signal broadcast so far can be extended to a multi-channel format, such as 5.1, with little additional transmission effort. The program provider generates the multi-channel additional information on the transmitter side from multi-channel sound sources such as those found, for example, on DVD-Audio/Video. This multi-channel additional information is then transmitted in parallel to the audio stereo signal broadcast as before, which, however, is now not simply a stereo signal but comprises two base channels derived by some downmix from the multi-channel signal. To the listener, however, the stereo signal of the two base channels sounds like a usual stereo signal, since the multi-channel analysis ultimately involves steps similar to those carried out by a sound engineer who has mixed a stereo signal from several tracks.

A major advantage of the separation is the compatibility with existing digital broadcasting systems. A classic receiver that cannot evaluate this additional information will, as before, be able to receive and play back the two-channel audio signal without any restriction in quality. A receiver of newer design, however, can, in addition to the stereo audio signal received so far, evaluate and decode this multi-channel information and reconstruct the original 5.1 multi-channel signal therefrom.

In order to enable the simultaneous transmission of the multi-channel additional information as a supplement to the stereo signal used so far, the multi-channel additional information can, as has already been explained, be combined with the coded downmix audio signal for a digital broadcasting system, so that there is a single data stream, which is then optionally scalable and can also be read by an existing receiver, which, however, ignores the additional data relating to the multi-channel additional information.

The receiver thus sees only one (valid) audio data stream and, if it is a receiver of the newer type, can additionally extract the multi-channel additional information from the data stream synchronously with the associated audio data block via a correspondingly upstream data distributor, decode it and output it as 5.1 multi-channel sound.

The disadvantage of this approach, however, is that the existing infrastructure and the existing data paths have to be extended so that, instead of only the stereo audio signals as before, they can transport the data signals combined from downmix signals and extension data. Thus, if the standard transmission format for stereo data is abandoned, synchronization can be guaranteed by the common data stream even in broadcast transmissions.

However, it is highly problematic for market adoption if the existing broadcast infrastructure needs to be changed, i.e. if the problem exists not only on the decoder side but also on the part of the broadcasters and the standardized transmission protocols. This concept is therefore very disadvantageous because of the difficulty of changing a system once it has been standardized and implemented.

The other alternative is not to couple the multi-channel additional information to the audio coding system used and therefore not to insert it into the actual audio data stream. In this case, the transmission occurs via a separate, but not necessarily time-synchronized, parallel digital additional channel. This situation can occur when the downmix data are routed in unreduced form, for example as PCM data in the AES/EBU data format, through an audio distribution infrastructure that is customary in existing studios. These infrastructures are designed to distribute audio signals digitally between various sources. For this purpose, functional units known as "switchers" are normally used. Alternatively or additionally, audio signals are also processed in the PCM format for purposes of sound control and dynamic range compression. All these steps result in unpredictable delays on the path from the sender to the receiver.

On the other hand, the separate transmission of base channel data and multi-channel additional information is particularly interesting because existing stereo infrastructures do not have to be changed, so the disadvantages of non-conformity with standards described for the first option do not occur here. A broadcasting system only needs to send an additional channel but does not need to change the infrastructure for the existing stereo channel. The additional effort is therefore, in a sense, borne solely on the receiver side, but in such a way that backward compatibility is maintained: a user who has a new receiver gets better sound quality than a user who has an old receiver.

As has already been explained, the magnitude of the time offset cannot be determined from the received audio signal and the additional information. A temporally correct reconstruction and association of the multi-channel signal in the receiver is therefore no longer guaranteed. Another example of such a delay problem arises when an already running two-channel transmission system is to be extended to multi-channel transmission, for example in a receiver of a digital radio. Here it is often the case that the downmix signal is decoded by means of a two-channel audio decoder already present in the receiver, whose delay time is not known and therefore cannot be compensated. In the extreme case, the downmix audio signal may even reach the multi-channel reconstruction audio decoder via a transmission chain containing analog parts, i.e. at one point a digital/analog conversion is carried out and, after further storage/transmission, an analog/digital conversion takes place again. Something like this always takes place in a radio transmission. Here, too, there are initially no clues as to how a suitable delay compensation of the downmix signal relative to the multi-channel additional data can be carried out. If, moreover, the sampling frequency of the A/D conversion and the sampling frequency of the D/A conversion differ slightly from each other, a slow temporal drift of the necessary compensation delay arises, corresponding to the ratio of the two sampling rates.

To synchronize the additional data with the base data, various techniques can be used, which are known as "time synchronization methods". These are based on inserting time stamps into both data streams such that the data belonging together can be correctly associated in the receiver on the basis of these time stamps. However, inserting time stamps also already leads to a change in the normal stereo infrastructure.
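
The slow drift of the compensation delay caused by slightly different D/A and A/D sampling frequencies can be made concrete with a small Python calculation. The sampling frequencies used in the usage note (a 50 ppm mismatch at 48 kHz) are assumed values for illustration and do not come from the source; the block length of 1152 samples is the example block size mentioned earlier in the text. The function name is hypothetical.

```python
def compensation_drift(f_da_hz, f_ad_hz, seconds, block_len=1152):
    """Divergence of the two sample timelines after `seconds` of real time.

    In `seconds`, the D/A converter consumes f_da_hz * seconds source samples
    while the A/D converter produces f_ad_hz * seconds new samples; the
    difference is the amount by which the required delay has drifted.
    Returns the drift in samples and in blocks of `block_len` samples.
    """
    drift_samples = (f_ad_hz - f_da_hz) * seconds
    return drift_samples, drift_samples / block_len
```

Under these assumptions, `compensation_drift(48000.0, 48002.4, 3600.0)` yields a drift of about 8640 samples per hour, i.e. about 7.5 blocks of 1152 samples, so a one-time delay measurement would not stay valid for long.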

The object of the present invention is to provide a concept for generating a data stream and for generating a multi-channel representation by which a synchronization of base channel data and multi-channel additional information can be achieved.

This object is achieved by an apparatus for generating a data stream according to claim 1, an apparatus for generating a multi-channel representation according to claim 17, a method for generating a data stream according to claim 26, a method for generating a multi-channel representation according to claim 27, a computer program according to claim 28 or a data stream representation according to claim 29.

The present invention is based on the recognition that separate transmission and time-synchronous combining of a base channel data stream and a multi-channel additional information data stream are made possible by modifying the multi-channel data stream on the "transmitter side" such that fingerprint information, which represents a time characteristic of the at least one base channel, is introduced into the data stream with the multi-channel additional information, so that a relationship between the multi-channel additional information and the fingerprint information can be derived from the data stream. For example, certain multi-channel additional information belongs to certain base channel data. Precisely this association must also be ensured when separate data streams are transmitted.

According to the invention, the fact that multi-channel additional information belongs to base channel data is signalled by determining fingerprint information from the base channel data, with which the multi-channel additional information belonging to precisely these base channel data is, as it were, marked on the transmitter side. In block-wise data processing, this marking or signalling of the relationship between the multi-channel additional information and the fingerprint information is achieved by assigning to a block of multi-channel additional information, which belongs to exactly one block of base channel data, a block fingerprint of exactly that block of base channel data to which the block of multi-channel additional information under consideration belongs.

In other words, the multi-channel additional information is assigned a fingerprint of exactly the base channel data block that must be processed together with that multi-channel additional information in the reconstruction. In a block-based transmission, the block fingerprint of the block of base channel data can be inserted into the block structure of the multi-channel additional data stream such that each block of multi-channel additional information contains the block fingerprint of the corresponding base channel data. The block fingerprint can be written immediately after a conventional block of multi-channel additional information, before that block, or at any known location within that block, so that the block fingerprint can be read out for synchronization purposes during multi-channel reconstruction. The data stream therefore contains the normal multi-channel additional data with the block fingerprints interspersed accordingly. Alternatively, the data stream could be written such that, for example, all block fingerprints, provided with additional information such as a block counter, are placed at the beginning of the data stream generated according to the invention, so that a first portion of the data stream contains only block fingerprints and a second portion contains the multi-channel additional data written block by block and belonging to the block fingerprint information. This alternative has the disadvantage that reference information is needed; however, the assignment of the block fingerprints to the block-wise multi-channel additional information may also be given implicitly by the order, so that no additional information is needed.

In this case, a large number of block fingerprints could simply be read first in the multi-channel reconstruction for synchronization purposes, in order to obtain the reference fingerprint information. The test fingerprints then arrive gradually until the minimum number of test fingerprints used for the correlation is present. During this period, the set of reference fingerprints could, for example, already be subjected to difference encoding, if the correlation in the multi-channel reconstruction is performed using differences while the data stream contains absolute block fingerprints rather than difference block fingerprints.

Generally speaking, the data stream with the base channel data is processed at the receiver end, that is, for example, first decoded and then supplied to a multi-channel reconstructor. Preferably, this multi-channel reconstructor is configured such that, when it receives no additional information, it simply passes the signal through in order to output the preferably two base channels as a stereo signal. In parallel, the reference fingerprint information is extracted and the test fingerprint information is calculated from the decoded base channel data, and a correlation calculation is then performed to determine the offset of the base channel data relative to the multi-channel additional data. Depending on the implementation, a further correlation calculation can then verify that this offset is indeed the correct offset. This is the case when the offset obtained by the second correlation calculation differs from the offset obtained by the first correlation calculation by no more than a predetermined threshold.

If this is the case, it can be assumed that the offset is correct. The output is then switched from stereo output to multi-channel output once synchronized multi-channel additional information is received.
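The plausibility check described above can be sketched as follows. This is only an illustration: the function names and the default threshold of one block are assumptions, since the text speaks only of "a predetermined threshold".

```python
def offset_is_confirmed(first_offset: int, second_offset: int, threshold: int) -> bool:
    """True if the second estimate deviates from the first by at most `threshold` blocks."""
    return abs(second_offset - first_offset) <= threshold


def select_output_mode(first_offset: int, second_offset: int, threshold: int = 1) -> str:
    """Stay on stereo pass-through until two offset estimates agree.

    The default threshold of 1 block is an assumption made for this sketch.
    """
    if offset_is_confirmed(first_offset, second_offset, threshold):
        return "multichannel"
    return "stereo"
```

With these assumptions, two estimates of 2 and 3 blocks would enable multi-channel output, whereas estimates of 2 and 5 blocks would leave the output in stereo.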

This procedure is preferred when a user should not notice the time required for synchronization. Base channel data are thus processed at the moment they are received, so that, naturally, only stereo data can be output during the period in which the synchronization, i.e. the offset calculation, takes place, since no synchronized multi-channel additional information has been found yet.

In another embodiment, in which the initial delay required to calculate the offset is not important, reproduction can be carried out such that the whole synchronization computation is performed without stereo data already being output in parallel, so that synchronized multi-channel additional information can be provided from the first block of base channel data onward. The listener then has a synchronized 5.1 experience from the very first block. In preferred embodiments of the present invention, the time for synchronization is normally about 5 seconds, because about 200 reference fingerprints are required as reference fingerprint information for an optimum offset calculation. If this delay of about 5 seconds does not matter, as is the case with unidirectional transmissions, 5.1 playback can be started right away, although only after the time required for the offset calculation. For interactive applications, for example dialogues or the like, this delay will be disruptive, so that here playback switches from stereo to multi-channel at some point once synchronization is finished. It has been found that it is better to provide only stereo playback than multi-channel playback with non-synchronized multi-channel additional information.
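The figure of about 5 seconds can be checked with a short calculation. The block length of 1152 samples and the count of 200 fingerprints come from the description; the sampling rate of 44.1 kHz is an assumption made here, since the text does not state one.

```python
BLOCK_LEN = 1152          # samples per block, as given in the description
NUM_FINGERPRINTS = 200    # reference fingerprints needed, as given in the description
SAMPLE_RATE_HZ = 44_100   # assumption: the description does not state a sampling rate


def sync_time_seconds(num_blocks: int = NUM_FINGERPRINTS,
                      block_len: int = BLOCK_LEN,
                      sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Time needed to collect `num_blocks` block fingerprints, one per block."""
    return num_blocks * block_len / sample_rate_hz


print(round(sync_time_seconds(), 2))  # 5.22
```

Under this assumption, one block lasts about 26 ms, and 200 blocks take roughly 5.2 seconds, which is consistent with the stated synchronization time.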

According to the invention, the problem of temporal assignment between base channel data and multi-channel additional data is solved by measures both on the transmitter side and at the receiving end.

On the transmitter side, time-varying and suitable fingerprint information is calculated from the corresponding mono or stereo downmix audio signal. Preferably, this fingerprint information is regularly inserted as a synchronization aid into the transmitted multi-channel additional data stream. This is preferably done as a data field in the midst of the block-wise organized side information, for example spatial audio coding side information, or such that the fingerprint signal is sent as the first or last piece of information of the data block, so that it can easily be added or removed.

On the receiving side, time-varying and suitable fingerprint information is calculated from the corresponding stereo audio signal, i.e. the base channel data, a number of two base channels being preferred according to the invention. Furthermore, the fingerprints are extracted from the multi-channel additional information. The temporal offset between the multi-channel additional information and the received audio signal is then calculated via correlation methods, for example by calculating a cross-correlation between the test fingerprint information and the reference fingerprint information. Alternatively, trial-and-error methods can be performed in which different test fingerprint information, calculated from the base channel data on the basis of different block rasters, is compared with the reference fingerprint information, in order to determine the time offset from the test block raster whose associated test fingerprint information best matches the reference fingerprint information.

Finally, the audio signal of the base channels is synchronized with the multi-channel additional information for the subsequent multi-channel reconstruction by a downstream delay compensation stage. Depending on the implementation, only an initial delay may be compensated. Preferably, however, the offset calculation is carried out in parallel with playback, so that, if the base channel data and the multi-channel additional information drift apart in time despite a compensated initial delay, the offset can be readjusted as needed depending on the result of the correlation calculation. The delay compensation stage can thus be actively controlled.

The present invention is advantageous in that no changes have to be made to the base channel data or to the processing path for the base channel data. The base channel data stream fed into a receiver does not differ from a conventional base channel data stream. Changes are made only on the side of the multi-channel data stream, which is modified such that the fingerprint information is inserted. However, since there is currently no standardized method for the multi-channel data stream anyway, the change to the multi-channel additional data stream does not lead to an undesirable departure from an already standardized, implemented and established solution, as would be the case if the base channel data stream were modified.

The scenario according to the invention provides particular flexibility in distributing multi-channel additional information. In particular, when the multi-channel additional information is parameter information, which is very compact in terms of required data rate or storage capacity, a digital receiver can be supplied with such data completely separately from the stereo signal. For example, a user could obtain multi-channel additional information from a separate provider for stereo recordings he already owns on his solid-state player or on his CDs, and store it on his playback device. This storage is not a problem, because the storage requirement, especially for parametric multi-channel additional information, is not particularly large. When the user then inserts a CD or selects a stereo piece, the corresponding multi-channel additional data stream can be retrieved from the multi-channel additional data storage and synchronized with the stereo signal on the basis of the fingerprint information in the multi-channel additional data stream in order to achieve a multi-channel reconstruction. The inventive solution thus allows multi-channel additional data, which may originate from a different source, to be synchronized with the stereo signal completely independently of where the stereo signal comes from, i.e. regardless of whether it comes from a digital broadcast receiver, a CD, a DVD or, for example, the Internet, the stereo signal then acting as the base channel data on which the multi-channel reconstruction is based.

Preferred embodiments of the present invention are explained below with reference to the accompanying drawings, in which:

Fig. 1 is a block diagram of an inventive apparatus for generating a data stream;

Fig. 2 is a block diagram of an inventive apparatus for generating a multi-channel representation;

Fig. 3 shows a prior art joint stereo encoder for generating channel data and parametric multi-channel information;

Fig. 4 is a representation of a scheme for determining ICLD, ICTD and ICC parameters for BCC encoding/decoding;

Fig. 5 is a block diagram of a BCC encoder/decoder chain;

Fig. 6 is a block diagram of an implementation of the BCC synthesis block of Fig. 5;

Fig. 7a is a schematic representation of an original multi-channel signal as a sequence of blocks;

Fig. 7b is a schematic representation of one or more base channels as a sequence of blocks;

Fig. 7c is a schematic representation of the inventive data stream with multi-channel information and associated block fingerprints;

Fig. 7d is an exemplary representation of a block of the data stream of Fig. 7c;

Fig. 8 is a more detailed representation of the inventive apparatus for generating a multi-channel representation according to a preferred embodiment;

Fig. 9 is a schematic drawing illustrating the offset determination by correlation between the test fingerprint information and the reference fingerprint information;

Fig. 10 is a flow chart of a preferred embodiment of the offset determination in parallel with data output; and

Fig. 11 is a schematic representation of the calculation of the fingerprint information, or encoded fingerprint information, on the encoder side and the decoder side.

Fig. 1 shows an apparatus for generating a data stream for a multi-channel reconstruction of an original multi-channel signal, the multi-channel signal having at least two channels, according to a preferred embodiment of the present invention. The apparatus comprises a fingerprint generator 2, to which at least one base channel derived from the original multi-channel signal is fed via an input line 3. The number of base channels is greater than or equal to 1 and less than the number of channels of the original multi-channel signal. If the original multi-channel signal is merely a stereo signal with only two channels, only a single base channel is present, which is derived from the two stereo channels. If, however, the original multi-channel signal is a signal with three or more channels, the number of base channels can also be equal to 2. This embodiment is preferred, since audio playback can then take place as normal stereo playback without multi-channel additional data. In a preferred embodiment of the present invention, the original multi-channel signal is a surround signal with five channels and one LFE channel (LFE = Low Frequency Enhancement), also referred to as the subwoofer channel. The five channels are a left surround channel Ls, a left channel L, a center channel C, a right channel R and a rear right or right surround channel Rs. The two base channels are then the left base channel and the right base channel. Among experts, the one or more base channels are also referred to as downmix channel or downmix channels.

The fingerprint generator 2 is designed to generate fingerprint information from the at least one base channel, the fingerprint information reproducing a time characteristic of the at least one base channel. Depending on the implementation, the fingerprint information is calculated with more or less effort. In particular, fingerprints calculated at considerable cost on the basis of statistical methods, known under the keyword "audio ID", can be used. Alternatively, however, any other quantity that in some way represents the time characteristic of the one or more base channels could be used.

According to the invention, block-based processing is preferred. Here, the fingerprint information is composed of a sequence of block fingerprints, a block fingerprint being a measure of the energy of the one or more base channels in the block. Alternatively, however, a particular sample of the block, or a combination of sample values of the block, could for example always be used as the block fingerprint, since with a sufficiently high number of block fingerprints as fingerprint information, a reproduction, even if a rough one, of the time characteristic of the at least one base channel arises. Generally speaking, the fingerprint information is thus derived from the sample values of the at least one base channel and reproduces the time characteristic of the at least one base channel with a greater or lesser error, so that, as will be explained later, a correlation with test fingerprint information calculated from the base channel can be performed on the decoder/receiver side in order ultimately to determine the offset between the data stream with the multi-channel additional information and the base channel.
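An energy-based block fingerprint of this kind can be sketched as follows. The choice of the mean squared sample value over all base channels as the "measure of the energy" is an assumption for illustration; the description does not fix a particular formula, and the function name is hypothetical.

```python
from typing import List, Sequence


def block_fingerprints(channels: Sequence[Sequence[float]],
                       block_len: int = 1152) -> List[float]:
    """Return one energy value per complete block, averaged over all base channels.

    `channels` holds the time-domain samples of each base channel
    (one sequence for mono, two for a stereo downmix).
    """
    n_samples = min(len(ch) for ch in channels)
    n_blocks = n_samples // block_len  # an incomplete trailing block is ignored
    fingerprints = []
    for b in range(n_blocks):
        start = b * block_len
        energy = 0.0
        for ch in channels:
            energy += sum(x * x for x in ch[start:start + block_len])
        fingerprints.append(energy / (block_len * len(channels)))
    return fingerprints
```

The same routine could be run on the encoder side (to produce the fingerprints written into the data stream) and on the decoder side (to produce test fingerprints), which is what makes the two sequences comparable.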

On the output side, the fingerprint generator 2 delivers the fingerprint information, which is fed to a data stream generator 4. The data stream generator 4 is designed to generate a data stream from the fingerprint information and the typically time-variable multi-channel additional information, the multi-channel additional information together with the at least one base channel allowing the multi-channel reconstruction of the original multi-channel signal. The data stream generator is designed to generate the data stream at an output 5 such that a connection between the multi-channel additional information and the fingerprint information can be derived from the data stream. According to the invention, the data stream of multi-channel additional information is thus marked with the fingerprint information derived from the at least one base channel, in such a way that the association of certain multi-channel additional information with the base channel data can be determined via the fingerprint information, whose correspondence to the multi-channel additional information is provided by the data stream generator 4.

Fig. 2 shows an inventive apparatus for generating a multi-channel representation of an original multi-channel signal from at least one base channel and a data stream, the data stream having fingerprint information reproducing a time characteristic of the at least one base channel and multi-channel additional information which, together with the at least one base channel, allows the multi-channel reconstruction of the original multi-channel signal, a connection between the multi-channel additional information and the fingerprint information being derivable from the data stream. The at least one base channel is supplied via an input 10 to a fingerprint generator 11 on the receiver or decoder side. The fingerprint generator 11 supplies test fingerprint information via an output 12 to a synchronizer 13. Preferably, the test fingerprint information is derived from the at least one base channel by exactly the same algorithm as is executed in block 2 of Fig. 1. Depending on the implementation, however, the algorithms need not be identical.

For example, the fingerprint generator 2 can generate a block fingerprint in absolute coding, while the fingerprint generator 11 on the decoder side performs a difference fingerprint determination, such that the test block fingerprint assigned to a block is the difference between two absolute fingerprints. In this case, when absolute block fingerprints arrive in the data stream with the fingerprint information, a fingerprint extractor 14 extracts the fingerprint information from the data stream and at the same time forms differences, so that data comparable with the test fingerprint information are supplied to the synchronizer 13 as reference fingerprint information via an output 15.
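The conversion from absolute to difference fingerprints can be illustrated with a short sketch. The function name is hypothetical, and the convention that each difference is "current block minus previous block" is an assumption that matches the later example of F3 being derived from BK3 and BK2.

```python
from typing import List, Sequence


def to_difference_fingerprints(absolute: Sequence[float]) -> List[float]:
    """Turn absolute block fingerprints into differences between neighbouring blocks.

    The result is one element shorter than the input, because the first
    block has no predecessor.
    """
    return [current - previous for previous, current in zip(absolute, absolute[1:])]
```

Applying the same conversion to the extracted reference fingerprints and to the locally computed test fingerprints yields two sequences that can be compared directly.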

Generally speaking, it is preferred that the algorithms used to calculate the test fingerprint information on the decoder side and the algorithms used to calculate the fingerprint information on the encoder side, which is referred to in Fig. 2 as reference fingerprint information, are at least similar enough that the synchronizer 13 can use these two pieces of information to synchronize the multi-channel additional data in the data stream, obtained via an input 16, with the data of the at least one base channel. A synchronized multi-channel representation comprising the base channel data and the multi-channel additional data synchronized to it is obtained as the multi-channel representation at the output of the synchronizer.

For this purpose, it is preferred that the synchronizer 13 determines a time offset between the base channel data and the multi-channel additional data and then delays the multi-channel additional data by this offset. It has been found that the multi-channel additional data usually arrive earlier, i.e. too early, which can be attributed to the much smaller amount of data typically corresponding to the multi-channel additional data compared with the amount of data for the base channel data. Thus, when the multi-channel additional data are delayed, the data of the at least one base channel are supplied from the input 10 via a base channel data line 17 to the synchronizer 13, where they are in effect merely "looped through" and output again at an output 18. The multi-channel additional data received via the input 16 are fed via a multi-channel additional data line 19 into the synchronizer, delayed therein by a certain offset, and supplied at an output 20 of the synchronizer, together with the base channel data, to a multi-channel reconstructor 21, which then performs the actual audio rendering in order to generate on its output side (not shown in Fig. 2), for example, the five audio channels and one low-frequency channel.

The data on lines 18 and 20 thus form the synchronized multi-channel representation, the data stream on line 20 corresponding to the data stream at the input 16, apart from any multi-channel additional data coding, except that the fingerprint information has been removed from the data stream, which, depending on the implementation, can be done in the synchronizer 13 or earlier. Alternatively, the fingerprint removal can also take place in the fingerprint extractor 14, so that there is then no line 19 but a line 19' that leads from the fingerprint extractor 9 directly into the synchronizer 13. In this case, the synchronizer 13 is supplied in parallel by the fingerprint extractor with both the multi-channel additional data and the reference fingerprint information.

The synchronizer is thus designed to synchronize the multi-channel additional information and the at least one base channel using the test fingerprint information and the reference fingerprint information, and using the connection, derived from the data stream, between the multi-channel information and the fingerprint information contained in the data stream. As will be explained below, the temporal relationship between the multi-channel additional information and the fingerprint information is preferably determined simply by whether the fingerprint information is located before a set of multi-channel additional information, after a set of multi-channel additional information, or within a set of multi-channel additional information. Depending on whether the fingerprints are located before, after or in the middle of a set of multi-channel additional information, it is established on the encoder side that precisely this multi-channel information belongs to this fingerprint information.

Preferably, block processing is used. Also preferably, the fingerprints are inserted such that a block of multi-channel additional data is always followed by a block fingerprint, so that blocks of multi-channel additional information alternate with block fingerprints. Alternatively, however, a data stream format could also be used in which the entire fingerprint information is written to a separate part at the beginning of the data stream, followed by the rest of the data stream. Here, block fingerprints and blocks of multi-channel additional information would not alternate. Alternative ways of assigning fingerprints to multi-channel additional information are known to those skilled in the art. According to the invention, it is only necessary that a connection between the multi-channel additional information and the fingerprint information be derivable from the data stream on the decoder side, so that the fingerprint information can be used to synchronize the multi-channel additional information with the base channel data.

A preferred embodiment of the block-wise processing is described below with reference to Figs. 7a to 7d. Fig. 7a shows an original multi-channel signal, for example a 5.1 signal, consisting of a sequence of blocks B1 to B8, with multi-channel data MKi contained in a block Bi in the example shown in Fig. 7a. When starting from a 5-channel signal, a block such as block B1 contains the first, for example, 1152 audio samples of each channel. Such a block size is preferred, for example, in the BCC encoder 112 of Fig. 5, where the block formation, i.e. to some extent the windowing to obtain a sequence of blocks from a continuous signal, is performed by the element 111 in Fig. 5 designated "block".

The at least one base channel is present at the output of the downmix block 114, which is designated "sum signal" in Fig. 5 and bears the reference numeral 115. The base channel data may again be represented as a sequence of blocks B1 to B8, the blocks B1 to B8 in Fig. 7b corresponding to the blocks B1 to B8 of Fig. 7a. However, if one remains in a time-domain representation, a block now no longer contains the original 5.1 signal, but only a mono signal or a stereo signal with two stereo base channels. The block B1 thus again comprises 1152 time samples of both the first stereo base channel and the second stereo base channel, these 1152 samples of each of the left stereo base channel and the right stereo base channel having been calculated by sample-wise addition/subtraction and optionally weighting, that is, by the operation performed, for example, by the downmix block 114 of Fig. 5. Accordingly, the data stream with the multi-channel information again comprises blocks B1 to B8, each block in Fig. 7c corresponding to the respective block of the original multi-channel signal in Fig. 7a and of the one or more base channels in Fig. 7b. In order, for example, to reconstruct the block B1 of the original multi-channel signal, designated MK1, the base channel data in the block B1 of the base channel data stream, designated BK1, must be combined with the multi-channel information P1 of the block B1 in Fig. 7c. In the embodiment shown in Fig. 6, this combination is performed by the BCC synthesis block, which, in order to process the base channel data block by block, again has a block-forming stage at its input.

Thus, as shown in Fig. 7c, P3 designates the multi-channel information that, together with the block of values BK3 of the base channels, allows the block of values MK3 of the original multi-channel signal to be reconstructed.

According to the invention, each block Bi of the data stream of Fig. 7c is now provided with a block fingerprint. For the block B3, this means that the block fingerprint F3 is preferably written following the block P3 of multi-channel information. This block fingerprint is derived from exactly the block B3 of the base channels, i.e. the block of values BK3. Alternatively, the block fingerprint F3 may also be subjected to difference coding, so that the block fingerprint F3 is equal to the difference between the block fingerprint of the block of values BK3 of the base channels and the block fingerprint of the block of values BK2 of the base channels. In a preferred embodiment of the present invention, an energy measure or a difference energy measure is used as the block fingerprint.

In the scenario described above, the data stream with the one or more base channels in Fig. 7b is transmitted to a multi-channel reconstructor separately from the data stream with the multi-channel information and the fingerprint information of Fig. 7c. If nothing were done, the case could occur that the block BK5 is just pending for processing at the multi-channel reconstructor, for example at the BCC synthesis block 122 of Fig. 5. It could further be that, because of some temporal imprecision, the block B7 rather than the block B5 of the multi-channel information is pending. Without further measures, the block of base channel data BK5 would therefore be reconstructed with the multi-channel information P7, which would lead to artifacts. According to the invention, as will be explained below, an offset of two blocks is now calculated, such that the data stream in Fig. 7c is delayed by two blocks, so that a multi-channel representation is obtained from the data stream of Fig. 7b and the data stream of Fig. 7c, which are now synchronized with each other.
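The two-block example above can be sketched as a simple FIFO delay line applied to the side-information blocks. The class name and the use of `None` as a placeholder while the delay line fills are assumptions for illustration, not part of the description.

```python
from collections import deque
from typing import Any, Optional


class TimeShifter:
    """Delays blocks of multi-channel additional information by a fixed number of blocks."""

    def __init__(self, delay_blocks: int) -> None:
        # Pre-fill with placeholders so the first outputs are "no side information yet".
        self._fifo: deque = deque([None] * delay_blocks)

    def push(self, block: Any) -> Optional[Any]:
        """Feed the newest block in and get back the block that arrived `delay_blocks` earlier."""
        self._fifo.append(block)
        return self._fifo.popleft()


shifter = TimeShifter(delay_blocks=2)
outputs = [shifter.push(p) for p in ["P3", "P4", "P5", "P6", "P7"]]
# When P7 arrives, P5 leaves the shifter and can be paired with base channel block BK5.
```

A delay of zero blocks would pass every block straight through, which corresponds to streams that are already aligned.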

Depending on the embodiment and on the design and accuracy of the fingerprint information, the offset calculation according to the invention is not limited to calculating an offset as an integer multiple of a block. With a sufficiently accurate correlation calculation and a sufficiently large number of block fingerprints (which, of course, comes at the expense of the time required to calculate the correlation), an offset accuracy can be achieved that is equal to a fraction of a block and can go down to one sample. However, it has been found that such high accuracy is not necessarily required, and that a synchronization accuracy of +/- half a block (for a block length of 1152 samples) already leads to a multi-channel reconstruction that a listener judges to be artifact-free.

Fig. 7d shows a preferred embodiment of a block Bi, for example the block B3 of the data stream in Fig. 7c. The block begins with a sync word, which can be, for example, one byte long. This is followed by length information, since the length of the multi-channel information P3, which may be parameter information but may also be a waveform signal, for example of a side channel, is not known in advance and therefore needs to be signaled in the data stream, because the multi-channel information is preferably scaled, quantized and entropy-coded after its calculation, as is known in the art. The block fingerprint according to the invention is then inserted at the end of the multi-channel information P3. In the embodiment shown in Fig. 7d, one byte, i.e. 8 bits, has been taken for the block fingerprint. Since only a single energy measure is taken per block, a quantizer with an 8-bit quantizer output is used in an embodiment in which only quantization but no entropy coding is applied. The quantized energy values are therefore entered without further processing into the 8-bit field "block FA" of Fig. 7d. Although not shown in Fig. 7d, a sync byte for the next block of the data stream then follows, again followed by a length byte and then by the multi-channel information P4 for BK4, this block of multi-channel information P4 for the base channel data block BK4 again being followed by the block fingerprint based on the base channel data BK4. As indicated in Fig. 7d, the energy measure entered can be an absolute energy measure or else a difference energy measure. In the latter case, the difference between the energy measure for the base channel data BK3 and the energy measure for the base channel data BK2 would be added to the block B3 of the data stream as the block fingerprint.
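The block layout of Fig. 7d (sync word, length byte, multi-channel information, one-byte block fingerprint) can be sketched as follows. The sync value 0xA5, the function names, and the treatment of the payload as opaque bytes are assumptions made for illustration; the description fixes only the order and the one-byte field sizes of this example.

```python
from typing import List, Tuple

SYNC_WORD = 0xA5  # hypothetical value; the description only says "a sync word"


def pack_block(payload: bytes, fingerprint: int) -> bytes:
    """Build one block: sync byte, length byte, multi-channel info, 8-bit fingerprint."""
    if len(payload) > 255:
        raise ValueError("payload does not fit a one-byte length field")
    if not 0 <= fingerprint <= 255:
        raise ValueError("fingerprint must already be quantized to 8 bits")
    return bytes([SYNC_WORD, len(payload)]) + payload + bytes([fingerprint])


def unpack_blocks(stream: bytes) -> List[Tuple[bytes, int]]:
    """Split a stream into (multi-channel info, block fingerprint) pairs."""
    blocks = []
    pos = 0
    while pos < len(stream):
        if stream[pos] != SYNC_WORD:
            raise ValueError(f"expected sync word at byte {pos}")
        if pos + 1 >= len(stream):
            raise ValueError("truncated block header")
        end = pos + 2 + stream[pos + 1]  # index of the fingerprint byte
        if end >= len(stream):
            raise ValueError("truncated block")
        blocks.append((stream[pos + 2:end], stream[end]))
        pos = end + 1
    return blocks
```

Because the fingerprint always sits at a known position (directly after the payload whose length is signaled), a receiver can read it for synchronization and strip it before passing the payload to the reconstructor.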

Fig. 8 shows a more detailed representation of the synchronizer, the fingerprint generator 11 and the fingerprint extractor 9 of Fig. 2 in cooperation with the multi-channel reconstructor 21. The base channel data are fed into a base channel data buffer 25 and buffered. Correspondingly, the additional information, i.e. the data stream with the additional information and the fingerprint information, is fed to an additional information buffer 26. Generally speaking, both buffers are constructed as FIFO buffers, but the buffer 26 has additional capabilities in that the fingerprint information can be extracted by the reference fingerprint extractor 9 and is furthermore removed from the data stream, so that only multi-channel additional information without inserted fingerprints is output on a buffer output line 27. However, the fingerprints can also be removed from the data stream by a time shifter 28 or any other element, so that the multi-channel reconstructor 21 is not disturbed by fingerprint bytes during the multi-channel reconstruction. If absolute fingerprints are used on both the reference side and the test side, the fingerprint information calculated by the fingerprint generator 11, just like the fingerprint information determined by the fingerprint extractor 9, can be fed directly into a correlator 29 within the synchronizer 13 of Fig. 2. The correlator then computes the offset value and supplies it via an offset line 30 to the time shifter 28. The synchronizer 13 is further configured to drive an enabler 31 once a valid offset value has been generated and supplied to the time shifter 28, so that the enabler 31 closes a switch 32, such that the stream of multi-channel additional data from the buffer 26 is fed via the time shifter 28 and the switch 32 into the multi-channel reconstructor 21.

In the preferred embodiment of the present invention, only a time shift (delay) of the multi-channel additional information is performed. So that a listener does not notice the time needed to calculate the correct offset value, a multi-channel reconstruction is already performed at the output of the multi-channel reconstructor 21 in parallel with the calculation of the offset value. However, this is only a "trivial" multi-channel reconstruction, since preferably the two stereo base channels are simply output by the multi-channel reconstructor 21. Thus, if the switch 32 is open, only a stereo output results. If the switch 32 is closed, however, the multi-channel reconstructor 21 receives the multi-channel additional information in addition to the stereo base channels and can then provide a multi-channel output that is now synchronized. A listener notices this only in that the output changes from stereo quality to the higher-grade multi-channel quality.

In applications where an initial delay does not matter, however, the output of the multi-channel reconstructor 21 can be held back until a valid offset exists. Then even the very first block (BK1 of Fig. 7b) can be supplied to the multi-channel reconstructor 21 together with the now properly delayed multi-channel additional data P1 (Fig. 7c), so that output only starts once multi-channel data is available. In this embodiment, there is no output from the multi-channel reconstructor 21 while the switch is open.

The functionality of the correlator 29 of Fig. 8 is now explained with reference to Fig. 9. At the output of the test fingerprint calculator 11, a sequence of test fingerprint information is supplied, as can be seen in the upper part of Fig. 9. There is thus a block fingerprint for each block of the base channels, the blocks being designated 1, 2, 3, 4, i. Depending on the correlation algorithm, only the sequence of discrete values is required for the correlation. Other correlation algorithms, however, may also receive as input a curve interpolated between the discrete values, as shown in Fig. 9. Correspondingly, the reference fingerprint extractor 9 also generates a sequence of discrete reference fingerprints, which it extracts from the data stream. If, for example, difference-coded fingerprint information is contained in the data stream and the correlator is to operate on the basis of absolute fingerprints, a differential decoder 35 in Fig. 8 is activated. It is preferred, however, that the data stream contains absolute fingerprints as energy measures, because this information on the energy per block can also be used advantageously by the multi-channel reconstructor 21 for level correction purposes. Further, it is preferred to perform the correlation on the basis of difference fingerprints. In this case, block 9 performs a difference processing before the correlator, and block 11 likewise performs a difference processing before the correlator, as already explained.

The correlator 29 now receives the curves or sequences of discrete values shown in the two upper partial images of Fig. 9 and provides a correlation result, which is shown in the lower part of Fig. 9. The offset component of this correlation result gives the exact offset between the two fingerprint information curves. Since the offset is positive here, the multi-channel additional information has to be shifted in the positive time direction, i.e. delayed. It should be noted that the base channel data could of course instead be shifted in the negative time direction, or that the multi-channel additional information could be shifted in the positive direction by one part of the offset and the base channel data in the negative time direction by the remaining part, as long as the multi-channel reconstructor receives a synchronized multi-channel representation at its two inputs.
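The offset search performed by the correlator can be sketched as follows. This is only an illustration under stated assumptions: the patent does not prescribe a particular correlation algorithm, so a plain mean-removed cross-correlation over a bounded lag range is used here, and the names estimate_offset and max_lag are hypothetical. A positive result means that the reference fingerprints from the data stream are ahead of the test fingerprints, so the multi-channel additional information has to be delayed.

```python
import random


def estimate_offset(test_fp, ref_fp, max_lag):
    """Return the lag (in blocks) at which ref_fp best matches test_fp.

    A positive lag means ref_fp[j] lines up with test_fp[j + lag], i.e. the
    side information is ahead and has to be delayed by `lag` blocks.
    """
    t_mean = sum(test_fp) / len(test_fp)
    r_mean = sum(ref_fp) / len(ref_fp)
    t = [v - t_mean for v in test_fp]
    r = [v - r_mean for v in ref_fp]
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc, n = 0.0, 0
        for j, rv in enumerate(r):
            k = j + lag
            if 0 <= k < len(t):
                acc += rv * t[k]
                n += 1
        # Normalize by the overlap length so short overlaps are not penalized.
        if n and acc / n > best_score:
            best_lag, best_score = lag, acc / n
    return best_lag


# Synthetic example: the reference sequence runs three blocks ahead of the test sequence.
rng = random.Random(0)
envelope = [rng.uniform(-1.0, 1.0) for _ in range(220)]
test_fp = envelope[:200]
ref_fp = envelope[3:203]
offset = estimate_offset(test_fp, ref_fp, max_lag=20)
```

The bounded lag range is a design choice of this sketch; it keeps the cost proportional to the number of blocks times the number of candidate lags.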

In the following, a preferred embodiment of the calculation of the offset in parallel with the audio output is described with reference to Fig. 10. The base channel data is buffered so that a fingerprint can be calculated for each block, after which the block for which a test block fingerprint has just been calculated is supplied to the multi-channel reconstructor for multi-channel reconstruction. Then the next block of the base channel data is fed into the buffer 25, so that a test block fingerprint can again be calculated from this block. This is carried out for a number of, e.g., 200 blocks. So that the listener notices no delay, however, these 200 blocks are simply output by the multi-channel reconstructor as stereo output data, in the sense of a "trivial" multi-channel reconstruction.

Depending on the implementation, fewer or more than 200 blocks can be used. According to the invention, it has been found that a number between 100 and 300 blocks, and preferably 200 blocks, provides a reasonable compromise between computation time, correlation computing effort and offset accuracy.

Once block 36 has been executed, a transition is made to a block 37, in which the correlator 29 calculates the correlation between the 200 calculated test block fingerprints and the 200 calculated reference block fingerprints. The offset result obtained there is stored. Then, in a block 38, corresponding to block 36, test block fingerprints are calculated for the next, e.g., 200 blocks of the base channel data. Correspondingly, 200 blocks are again extracted from the data stream with the multi-channel additional information. Thereafter, a correlation is again performed in a block 39, and the offset result obtained there is stored. Then, in a block 40, the deviation between the offset result based on the second 200 blocks and the offset result based on the first 200 blocks is determined. If the deviation is below a predetermined threshold, the offset is supplied by a block 41 via the offset line 30 to the time shifter 28 of Fig. 8, and the switch 32 is closed, so that from this time on the output switches to multi-channel output. A predetermined threshold for the deviation is, for example, a value of one or two blocks. This is based on the consideration that, if the offset does not change by more than one or two blocks from one calculation to the next, no error has occurred in the correlation calculation.
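The plausibility check of blocks 40 and 41 can be expressed as a small sketch. The function name validated_offset and the choice to return the more recent of the two estimates are assumptions made for illustration; the text only states that the offset is released when two consecutive estimates differ by no more than a threshold of, for example, one or two blocks.

```python
def validated_offset(first_offset, second_offset, max_deviation=1):
    """Return an offset to apply, or None if the two estimates disagree.

    first_offset and second_offset are the results of two consecutive
    correlation runs (e.g. over 200 blocks each), measured in blocks.
    """
    if abs(second_offset - first_offset) <= max_deviation:
        # Assumption: the more recent estimate is the one that is applied.
        return second_offset
    return None


# Consistent estimates release the offset; inconsistent ones keep stereo output.
released = validated_offset(3, 4)
rejected = validated_offset(3, 9)
```

While the function returns None, the switch 32 would stay open and the "trivial" stereo output would continue.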

As an alternative to this embodiment, a kind of sliding window with a window length of a number of blocks, e.g. 200, can also be used. For example, a calculation is made with 200 blocks and a result is obtained. Then the window is advanced by one block, i.e. one block is removed from the blocks used for the correlation calculation and the new block is used instead. The result then obtained is stored in a histogram, just like the previous result. This procedure is repeated for a number of correlation calculations, e.g. 100 or 200, so that the histogram fills gradually. The peak of the histogram is then used as the calculated offset, either to provide the initial offset or to obtain an offset for the dynamic tracking.

The offset calculation that takes place in parallel with the output continues to run in a block 42. If it is found that the data stream with the multi-channel information and the data stream with the base channel data have drifted apart, an adaptive or dynamic offset tracking is achieved as required by supplying an updated offset value via the line 30 to the time shifter 28 of Fig. 8. With regard to the adaptive tracking, it should be noted that, depending on the implementation, the offset change may also be smoothed, so that when a deviation of, for example, two blocks has been found, the offset is first incremented by 1 and then, if necessary, incremented again, so that the jumps do not become too large.
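The histogram evaluation and the smoothed offset tracking described above can be sketched as follows. The helper names histogram_peak and step_towards are hypothetical, and limiting each update to one block is just one possible reading of the smoothing mentioned in the text.

```python
from collections import Counter


def histogram_peak(offsets):
    """Return the offset value that occurs most often in a list of estimates."""
    return Counter(offsets).most_common(1)[0][0]


def step_towards(current, target, max_step=1):
    """Move the applied offset towards the target by at most max_step blocks."""
    delta = target - current
    if delta > max_step:
        delta = max_step
    elif delta < -max_step:
        delta = -max_step
    return current + delta


# Offsets from successive sliding-window correlations (synthetic values).
estimates = [3, 3, 4, 3, 2, 3, 4]
initial = histogram_peak(estimates)

# Later the tracked offset drifts to 5; it is approached in single-block steps.
applied = initial
trajectory = []
while applied != 5:
    applied = step_towards(applied, 5)
    trajectory.append(applied)
```

Using the histogram peak rather than a single estimate makes the result robust against occasional outliers from individual correlation runs.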

Subsequently, with reference to Fig. 11, a preferred embodiment is described of the fingerprint generator 2 on the encoder side, as shown in Fig. 1, and of the fingerprint generator 11 of Fig. 2, as used on the decoder side.

Generally, the multi-channel audio signal is divided into blocks of fixed size for obtaining the multi-channel additional data. Per block, simultaneously with obtaining the multi-channel additional data, a fingerprint is calculated which is suitable for characterizing the temporal structure of the signal as uniquely as possible. One embodiment of this is to use the energy content of the current downmix audio signal of the audio block, for example in logarithmic form, i.e. in a decibel-related representation. In this case, the fingerprint is a measure of the temporal envelope of the audio signal. To reduce the amount of information transmitted and to increase the accuracy of the measured value, this synchronization information can also be expressed as the difference from the energy value of the previous block, followed by suitable entropy coding, such as Huffman coding, adaptive scaling and quantization. The fingerprint of the temporal envelope is then calculated as follows:

First, as shown at 1 in Fig. 11, the energy of the downmix audio signal in the current block is calculated, optionally for a stereo signal. Here, e.g., 1152 audio samples from each of the left and right downmix channels are squared and summed. s_left(i) represents a temporal sample of the left base channel at time i, while s_right(i) represents a temporal sample of the right base channel at time i. For a monophonic downmix signal, the summation over the two channels is omitted. Further, it is preferred to remove, before the calculation, the DC components of the downmix audio signal, which are not meaningful for the present invention.

In a step 2, a minimum limitation of the energy is performed for the subsequent logarithmic representation. For a decibel-related evaluation of the energy, it is preferred to use a minimum energy offset, so that a meaningful logarithmic calculation results even in the case of zero energy. For an audio signal resolution of 16 bits, this energy measure in dB covers a numerical range from 0 to 90 (dB).
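Steps 1 and 2 can be illustrated with a short sketch. Two points are assumptions of this example and are not stated in the text: the summed squares are divided by the number of samples (a mean-square energy), which is one way to arrive at roughly 0 to 90 dB for 16-bit samples, and the minimum limitation is implemented as a floor of 1.0 on the energy. The names block_energy and energy_db are hypothetical.

```python
import math


def block_energy(left, right=None):
    """Mean-square energy of one block after DC removal (mono if right is None)."""
    channels = [left] if right is None else [left, right]
    total, count = 0.0, 0
    for samples in channels:
        dc = sum(samples) / len(samples)  # remove the DC component per channel
        total += sum((s - dc) ** 2 for s in samples)
        count += len(samples)
    return total / count  # assumption: normalize by the number of samples


def energy_db(energy, floor=1.0):
    """Logarithmic energy in dB with a minimum limit, so zero energy maps to 0 dB."""
    return 10.0 * math.log10(max(energy, floor))


# A silent block and a full-scale 16-bit square wave, 1152 samples per channel.
silence = [0] * 1152
square = [32767, -32767] * 576
silent_db = energy_db(block_energy(silence, silence))
loud_db = energy_db(block_energy(square, square))
```

With these assumptions, silence yields 0 dB and a full-scale signal yields slightly more than 90 dB, which matches the range given in the text.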

As shown at 3 in Fig. 11, for an exact determination of the time offset between the multi-channel additional information and the received audio signal, it is preferred not to use the absolute energy envelope, but rather the slope (steepness) of the signal envelope. Therefore, only the slope of the energy envelope is used for the correlation measurement. Technically, this derivative signal is calculated by forming the difference between the energy value of the current block and that of the previous block. This step is performed, for example, in the encoder. The fingerprint then consists of differentially encoded values. Alternatively, this step can also be implemented purely in the decoder. In that case, the transmitted fingerprint consists of non-differentially encoded values, and the difference is formed only in the decoder. The latter option has the advantage that the fingerprint contains information on the absolute energy of the downmix signal. However, it typically requires a somewhat larger fingerprint word length.
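The difference formation of step 3 and its inverse can be sketched in a few lines. The function names are hypothetical; the inverse corresponds to what a differential decoder would do when absolute values are needed from differentially encoded fingerprints, assuming the first absolute value is known.

```python
def difference_fingerprints(energies_db):
    """Slope of the energy envelope: difference between consecutive block energies."""
    return [cur - prev for prev, cur in zip(energies_db, energies_db[1:])]


def integrate_fingerprints(first_value, differences):
    """Rebuild absolute values from differences, given the first absolute value."""
    values = [first_value]
    for d in differences:
        values.append(values[-1] + d)
    return values


energies = [40, 42, 39, 39, 55]
slopes = difference_fingerprints(energies)
restored = integrate_fingerprints(energies[0], slopes)
```

The difference sequence is one element shorter than the energy sequence, which has to be taken into account when aligning it with block indices.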

Furthermore, it is preferred to scale the energy (envelope of the signal) for optimal use of the available value range. So that the fingerprint makes maximum use of the numerical range in the subsequent quantization and the resolution at low energy values is improved, it makes sense to introduce an additional scaling (= amplification). This can be realized either as a fixed, static weighting factor or as a dynamic gain control adapted to the envelope signal.

Furthermore, the fingerprint is quantized, as shown at 5 in Fig. 11. To prepare the fingerprint for embedding into the multi-channel additional information, it is quantized to 8 bits. In practice, this reduced fingerprint resolution has proven to be a good compromise between bit requirement and reliability of the delay detection. Overflows above 255 are limited to the maximum value of 255 by a saturation characteristic.
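Scaling and 8-bit quantization with saturation can be sketched as follows. The static gain of 255/90, which maps the 0 to 90 dB range onto 0 to 255, is an assumption chosen for illustration, as is the function name quantize_fingerprint; the text leaves the scaling factor open and also allows dynamic gain control.

```python
def quantize_fingerprint(value_db, gain=255.0 / 90.0):
    """Scale a non-negative dB value and quantize it to 8 bits with saturation."""
    scaled = value_db * gain
    q = int(scaled + 0.5)  # round to the nearest integer for non-negative input
    # Saturation characteristic: clip to the 8-bit range instead of wrapping.
    return max(0, min(255, q))


quiet = quantize_fingerprint(0.0)
medium = quantize_fingerprint(30.0)
full = quantize_fingerprint(90.0)
overflow = quantize_fingerprint(120.0)
```

Saturating instead of wrapping keeps a very loud block from being mistaken for a quiet one after quantization.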

As shown at 6 in Fig. 11, an entropy coding of the fingerprints can additionally be performed. By exploiting statistical properties of the fingerprints, the bit requirement of the quantized fingerprints can be further reduced. Suitable entropy coding methods are, for example, Huffman coding or arithmetic coding. Statistically different frequencies of fingerprint values can be expressed by different code lengths, which reduces the bit requirement of the fingerprint representation on average.
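How different frequencies of fingerprint values translate into different code lengths can be illustrated with a small Huffman sketch. It only computes code lengths, not the actual code words, and it derives the statistics from the sequence itself; a real system would more likely use a fixed, pre-trained table. The function name huffman_code_lengths is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return a dict mapping each symbol to its Huffman code length in bits."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie breaker, {symbol: depth in subtree}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Synthetic quantized fingerprints in which the value 0 dominates.
fingerprints = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
lengths = huffman_code_lengths(fingerprints)
total_bits = sum(lengths[s] for s in fingerprints)
```

In this synthetic example the variable-length code needs fewer bits in total than a fixed 2-bit code for the four distinct values.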

The calculation of the multi-channel additional data is performed per audio block with the aid of the multi-channel audio data. The multi-channel additional information calculated in this way is then extended by the newly added synchronization information by suitably embedding it into the bit stream.

By means of the solution according to the invention, the receiver is now able to detect a time offset between the downmix signal and the additional data and to realize a time-correct adjustment, i.e. a delay compensation between the stereo audio signal and the multi-channel additional information, to within the order of +/- 1/2 audio block. Thus, the multi-channel assignment can be reconstructed almost completely in the receiver, i.e. up to a barely perceptible time difference of +/- 1/2 audio frame, which does not noticeably affect the quality of the reconstructed multi-channel audio signal.

Depending on the circumstances, the inventive methods for generating or decoding can be implemented in hardware or in software. The implementation may be on a digital storage medium, particularly a floppy disk or a CD with electronically readable control signals, which can cooperate with a programmable computer system so that the method is executed. Generally, the invention thus also consists in a computer program product with a program code, stored on a machine-readable carrier, for performing the method when the computer program product runs on a computer. In other words, the invention can be realized as a computer program with a program code for performing the method when the computer program runs on a computer.

Claims

1. An apparatus for generating a data stream for a multi-channel reconstruction of an original multi-channel signal, the multi-channel signal having at least two channels, comprising:
a fingerprint generator (2) for generating fingerprint information from at least one base channel derived from the original multi-channel signal, wherein a number of base channels is greater than or equal to 1 and less than a number of channels of the original multi-channel signal, wherein the fingerprint information represents a time course of the at least one base channel; and
a data stream generator (4) for generating a data stream from the fingerprint information and from time-variable multi-channel additional information which, together with the at least one base channel, enables the multi-channel reconstruction of the original multi-channel signal, wherein the data stream generator (4) is configured to generate the data stream such that a temporal relationship between the multi-channel additional information and the fingerprint information can be derived from the data stream.
2. The apparatus of claim 1,
wherein the fingerprint generator (2) is configured to process the at least one base channel blockwise to obtain the fingerprint information,
wherein the multi-channel additional information is calculated blockwise so that it can be used together with the blocks of the at least one base channel for the multi-channel reconstruction, and
wherein the data stream generator (4) is configured to write the multi-channel additional information and the fingerprint information blockwise into the data stream.
3. The apparatus of claim 2, wherein the fingerprint generator (2) is configured to generate, for a block of the at least one base channel, as fingerprint information a block fingerprint which represents a time course of the base channel in the block,
wherein a block of multi-channel additional information is to be used together with the block of the base channel for the multi-channel reconstruction, and
wherein the data stream generator (4) is configured to write the data stream blockwise such that the block of multi-channel additional information and the block of fingerprint information have a predetermined relationship to each other.
4. The apparatus of claim 2, wherein the fingerprint generator (2) is configured to calculate, for temporally successive blocks of the at least one base channel, a sequence of block fingerprints as fingerprint information,
wherein the multi-channel additional information is given in blocks for the temporally successive blocks of the at least one base channel, and
wherein the data stream generator is configured to write the sequence of block fingerprints in a predetermined relationship to the sequence of blocks of the multi-channel additional information.
5. The apparatus of claim 4, wherein the fingerprint generator (2) is configured to calculate, as a block fingerprint, a difference between two fingerprint values of two blocks of the at least one base channel.
6. The apparatus of any one of the preceding claims, wherein the fingerprint generator (2) is configured to perform a quantization and an entropy coding of fingerprint values to obtain the fingerprint information.
7. The apparatus of claim 6, wherein the fingerprint generator (2) is configured to scale fingerprint values with scaling information and to further write the scaling information into the data stream in association with the fingerprint information.
8. The apparatus of any one of the preceding claims, wherein the fingerprint generator (2) is configured to calculate the fingerprint information blockwise, and
wherein the data stream generator (4) is configured to write the data stream blockwise such that a block of the data stream comprises a block of multi-channel additional information and a block of fingerprint information which is associated with the block of multi-channel additional information and with a block of the at least one base channel.
9. The apparatus of any one of the preceding claims, wherein at least two base channels are present, and wherein the fingerprint generator (2) is configured to add the at least two base channels sample by sample or spectral value by spectral value, or to square them before the addition.
10. The apparatus of any one of the preceding claims, wherein the fingerprint generator (2) is configured to use data on an energy envelope of the at least one base channel as fingerprint information.
11. The apparatus of claim 10, wherein the fingerprint generator (2) is configured to use data on an energy envelope of the at least one base channel as fingerprint information, and
wherein the fingerprint generator (2) is further configured to apply a minimum limitation to the energy and to provide a logarithmic representation of the minimum-limited energy.
12. The apparatus of claim 11, wherein the at least one base channel is transferable to a multi-channel reconstructor in coded form,
wherein the coded form has been generated using a lossy encoder, and
wherein a base channel decoder is further present to provide a decoded form of the at least one base channel as an input signal for the fingerprint generator (2).
13. The apparatus of any one of the preceding claims, wherein the multi-channel additional data are multi-channel parameter data which are each associated blockwise with corresponding blocks of the at least one base channel.
14. The apparatus of claim 13, further comprising:
a multi-channel analyzer (112) for the blockwise generation of both a sequence of blocks of the at least one base channel and a sequence of blocks of the multi-channel additional information,
wherein the fingerprint generator (2) is configured to calculate a block fingerprint value from each block of values of the at least one base channel.
15. The apparatus of claim 14, wherein the data stream generator (4) is configured to write the data stream into a separate data channel which exists in addition to a standard data channel via which the at least one base channel is transferable to a multi-channel reconstructor.
16. The apparatus of claim 15, wherein the standard data channel is a standardized channel for a digital stereo radio signal or a standardized channel for transmission via the Internet.
17. An apparatus for generating a multi-channel representation (18, 20) of an original multi-channel signal from at least one base channel and a data stream which comprises fingerprint information representing a time course of the at least one base channel, and multi-channel additional information which, together with the at least one base channel, enables the multi-channel reconstruction of the original multi-channel signal, wherein a relationship between the multi-channel additional information and the fingerprint information can be derived from the data stream, comprising:
a fingerprint generator (11) for generating test fingerprint information from the at least one base channel;
a fingerprint extractor (9) for extracting the fingerprint information from the data stream to obtain reference fingerprint information; and
a synchronizer (13) for time-synchronizing the multi-channel additional information and the at least one base channel using the test fingerprint information, the reference fingerprint information and a relationship, derived from the data stream, between the multi-channel information and the fingerprint information contained in the data stream, in order to obtain a synchronized multi-channel representation.
18. The apparatus of claim 17, further comprising:
a multi-channel reconstructor (21) for obtaining a reconstruction of the original multi-channel signal using the synchronized multi-channel representation.
19. The apparatus of claim 17 or 18,
wherein the data stream comprises a sequence of blocks of multi-channel additional data in temporal relation to a sequence of reference fingerprint values as reference fingerprint information, wherein the extractor (9) is configured to determine, based on the temporal relation, a fingerprint value associated with a block of multi-channel additional data;
wherein the fingerprint generator (11) is configured to determine, for a sequence of blocks of the at least one base channel, a sequence of test fingerprint values as test fingerprint information; and
wherein the synchronizer (13) is configured to calculate, based on an offset (30) between the sequence of test fingerprint values and the sequence of reference fingerprint values, an offset between the blocks of multi-channel additional data and the blocks of the at least one base channel, and to compensate the offset by delaying (28) the sequence of blocks of the multi-channel additional information by the calculated offset.
20. The apparatus of any one of claims 17 to 19,
wherein the fingerprint generator (11) is configured to perform a quantization of fingerprint values to obtain the test fingerprint information.
21. The apparatus of any one of claims 17 to 20,
wherein the fingerprint generator (11) is configured to scale fingerprint values with scaling information from the data stream.
22. The apparatus of any one of claims 17 to 21,
wherein at least two base channels are present, and wherein the fingerprint generator (11) is configured to add the at least two base channels sample by sample or spectral value by spectral value, or to square them before the addition.
23. The apparatus of any one of claims 17 to 22,
wherein the fingerprint generator (11) is configured to use data on an energy envelope of the at least one base channel as fingerprint information.
24. The apparatus of any one of claims 17 to 23,
wherein the fingerprint generator (11) is configured to use data on an energy envelope of the at least one base channel as fingerprint information, and
wherein the fingerprint generator (11) is further configured to apply a minimum limitation to the energy and to provide a logarithmic representation of the minimum-limited energy.
25. The apparatus of any one of claims 17 to 24, wherein the data stream is organized blockwise, and a block of the data stream contains a block of multi-channel additional information and a block fingerprint,
wherein the fingerprint generator (11) is configured to calculate, as test fingerprint information, a difference between two block fingerprints of the at least one base channel, and
wherein the fingerprint extractor (9) is further configured to calculate a difference of two block fingerprints in the data stream and to supply it as the reference fingerprint information to the synchronizer (13).
26. The apparatus of any one of claims 17 to 25,
wherein the synchronizer (13) is configured to calculate, in parallel with an audio output, an offset between the multi-channel additional data and the at least one base channel, and to compensate the offset adaptively.
27. The apparatus of claim 18, which is further configured to reproduce the at least one base channel when no synchronized multi-channel additional data are present, and to switch (32) from a mono or stereo reproduction of the at least one base channel to a multi-channel reproduction when synchronized multi-channel additional data are present.
28. The apparatus of any one of claims 17 to 27, which is configured to obtain the data stream and the at least one base channel via mutually separate bit streams, which are received via two mutually different logical channels or physical channels, or via the same transmission channel which is, however, active at different times.
29. A method for generating a data stream for a multi-channel reconstruction of an original multi-channel signal, the multi-channel signal having at least two channels, comprising the steps of:
generating (2) fingerprint information from at least one base channel derived from the original multi-channel signal, wherein a number of base channels is greater than or equal to 1 and less than a number of channels of the original multi-channel signal, wherein the fingerprint information represents a time course of the at least one base channel; and
generating (4) a data stream from the fingerprint information and from time-variable multi-channel additional information which, together with the at least one base channel, enables the multi-channel reconstruction of the original multi-channel signal, wherein the data stream is generated such that a temporal relationship between the multi-channel additional information and the fingerprint information can be derived from the data stream.
30. A method for generating a multi-channel representation (18, 20) of an original multi-channel signal from at least one base channel and a data stream which comprises fingerprint information representing a time course of the at least one base channel, and multi-channel additional information which, together with the at least one base channel, enables the multi-channel reconstruction of the original multi-channel signal, wherein a relationship between the multi-channel additional information and the fingerprint information can be derived from the data stream, comprising the steps of:
generating (11) test fingerprint information from the at least one base channel;
extracting (9) the fingerprint information from the data stream to obtain reference fingerprint information; and
synchronizing (13) the multi-channel additional information and the at least one base channel using the test fingerprint information, the reference fingerprint information and a relationship, derived from the data stream, between the multi-channel information and the fingerprint information contained in the data stream, in order to obtain a synchronized multi-channel representation.
31. A computer program with a program code for performing the method according to claim 29 or claim 30 when the computer program runs on a computer.
32. A data stream having fingerprint information which represents a time course of at least one base channel derived from an original multi-channel signal, wherein a number of base channels is greater than or equal to 1 and less than a number of channels of the original multi-channel signal, and having multi-channel additional information which, together with the at least one base channel, enables the multi-channel reconstruction of the original multi-channel signal, wherein a relationship between the multi-channel additional information and the fingerprint information can be derived from the data stream.
33. The data stream of claim 32, which has control signals for generating a synchronized multi-channel representation of the original multi-channel signal when the data stream is fed into the apparatus of claim 17.
EP06707562A 2005-03-30 2006-03-15 Device and method for producing a data flow and for producing a multi-channel representation Active EP1864279B1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
DE102005014477A DE102005014477A1 (en) 2005-03-30 2005-03-30 Apparatus and method for generating a data stream and generating a multi-channel representation
PCT/EP2006/002369 WO2006102991A1 (en) 2005-03-30 2006-03-15 Device and method for producing a data flow and for producing a multi-channel representation

Publications (2)

Publication Number Publication Date
EP1864279A1 true EP1864279A1 (en) 2007-12-12
EP1864279B1 EP1864279B1 (en) 2009-06-17

Family

ID=36598142

Family Applications (1)

Application Number Title Priority Date Filing Date
EP06707562A Active EP1864279B1 (en) 2005-03-30 2006-03-15 Device and method for producing a data flow and for producing a multi-channel representation

Country Status (12)

Country Link
US (1) US7903751B2 (en)
EP (1) EP1864279B1 (en)
JP (1) JP5273858B2 (en)
CN (1) CN101189661B (en)
AT (1) AT434253T (en)
AU (1) AU2006228821B2 (en)
CA (1) CA2603027C (en)
DE (2) DE102005014477A1 (en)
HK (1) HK1111259A1 (en)
MY (1) MY139836A (en)
TW (1) TWI318845B (en)
WO (1) WO2006102991A1 (en)

Families Citing this family (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2339329A3 (en) 2007-02-21 2012-04-04 Agfa HealthCare N.V. System and method for optical coherence tomography
US8612237B2 (en) * 2007-04-04 2013-12-17 Apple Inc. Method and apparatus for determining audio spatial quality
CN101911634A (en) * 2007-12-03 2010-12-08 诺基亚公司 A packet generator
DE102008009024A1 (en) * 2008-02-14 2009-08-27 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for synchronizing multichannel extension data with an audio signal and for processing the audio signal
DE102008009025A1 (en) * 2008-02-14 2009-08-27 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for calculating a fingerprint of an audio signal, apparatus and method for synchronizing and apparatus and method for characterizing a test audio signal
RU2495503C2 (en) * 2008-07-29 2013-10-10 Панасоник Корпорэйшн Sound encoding device, sound decoding device, sound encoding and decoding device and teleconferencing system
JP5602138B2 (en) * 2008-08-21 2014-10-08 ドルビー ラボラトリーズ ライセンシング コーポレイション Feature optimization and reliability prediction for audio and video signature generation and detection
CN103177725B (en) * 2008-10-06 2017-01-18 爱立信电话股份有限公司 Method and device for transmitting aligned multichannel audio frequency
US8538764B2 (en) * 2008-10-06 2013-09-17 Telefonaktiebolaget L M Ericsson (Publ) Method and apparatus for delivery of aligned multi-channel audio
KR20110138367A (en) * 2009-03-13 2011-12-27 코닌클리케 필립스 일렉트로닉스 엔.브이. Embedding and extracting ancillary data
GB2470201A (en) * 2009-05-12 2010-11-17 Nokia Corp Synchronising audio and image data
US8436939B2 (en) * 2009-10-25 2013-05-07 Tektronix, Inc. AV delay measurement and correction via signature curves
US9426574B2 (en) * 2010-03-19 2016-08-23 Bose Corporation Automatic audio source switching
EP2458890B1 (en) * 2010-11-29 2019-01-23 Nagravision S.A. Method to trace video content processed by a decoder
US9075806B2 (en) * 2011-02-22 2015-07-07 Dolby Laboratories Licensing Corporation Alignment and re-association of metadata for media streams within a computing device
KR101742135B1 (en) 2011-03-18 2017-05-31 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Frame element positioning in frames of a bitstream representing audio content
US8639921B1 (en) 2011-06-30 2014-01-28 Amazon Technologies, Inc. Storage gateway security model
US8832039B1 (en) * 2011-06-30 2014-09-09 Amazon Technologies, Inc. Methods and apparatus for data restore and recovery from a remote data store
US8706834B2 (en) 2011-06-30 2014-04-22 Amazon Technologies, Inc. Methods and apparatus for remotely updating executing processes
US8806588B2 (en) 2011-06-30 2014-08-12 Amazon Technologies, Inc. Storage gateway activation process
US9294564B2 (en) 2011-06-30 2016-03-22 Amazon Technologies, Inc. Shadowing storage gateway
US8639989B1 (en) 2011-06-30 2014-01-28 Amazon Technologies, Inc. Methods and apparatus for remote gateway monitoring and diagnostics
US8793343B1 (en) 2011-08-18 2014-07-29 Amazon Technologies, Inc. Redundant storage gateways
US8789208B1 (en) 2011-10-04 2014-07-22 Amazon Technologies, Inc. Methods and apparatus for controlling snapshot exports
US9635132B1 (en) 2011-12-15 2017-04-25 Amazon Technologies, Inc. Service and APIs for remote volume-based block storage
KR20130101629A (en) * 2012-02-16 2013-09-16 Samsung Electronics Co., Ltd. Method and apparatus for outputting content in a portable device supporting secure execution environment
EP2670157B1 (en) * 2012-06-01 2019-10-02 Koninklijke KPN N.V. Fingerprint-based inter-destination media synchronization
CN102820964B (en) * 2012-07-12 2015-03-18 武汉滨湖电子有限责任公司 Method for aligning multichannel data based on system synchronizing and reference channel
EP2693392A1 (en) 2012-08-01 2014-02-05 Thomson Licensing A second screen system and method for rendering second screen information on a second screen
CN102937938B (en) * 2012-11-29 2015-05-13 北京天诚盛业科技有限公司 Fingerprint processing device as well as control method and device thereof
JP6349977B2 (en) 2013-10-21 2018-07-04 Sony Corporation Information processing apparatus and method, and program
US20150302086A1 (en) * 2014-04-22 2015-10-22 Gracenote, Inc. Audio identification during performance
US20160344902A1 (en) * 2015-05-20 2016-11-24 Gwangju Institute Of Science And Technology Streaming reproduction device, audio reproduction device, and audio reproduction method
EP3249646B1 (en) * 2016-05-24 2019-04-17 Dolby Laboratories Licensing Corp. Measurement and verification of time alignment of multiple audio channels and associated metadata
US10015612B2 (en) 2016-05-25 2018-07-03 Dolby Laboratories Licensing Corporation Measurement, verification and correction of time alignment of multiple audio channels and associated metadata

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000155598A (en) * 1998-11-19 2000-06-06 Matsushita Electric Ind Co Ltd Coding/decoding method and device for multiple-channel audio signal
CA2859333A1 (en) * 1999-04-07 2000-10-12 Dolby Laboratories Licensing Corporation Matrix improvements to lossless encoding and decoding
US6990453B2 (en) * 2000-07-31 2006-01-24 Landmark Digital Services Llc System and methods for recognizing sound and music signals in high noise and distortion
TW510144B (en) 2000-12-27 2002-11-11 C Media Electronics Inc Method and structure to output four-channel analog signal using two channel audio hardware
US7461002B2 (en) * 2001-04-13 2008-12-02 Dolby Laboratories Licensing Corporation Method for time aligning audio signals using characterizations based on auditory events
US7116787B2 (en) 2001-05-04 2006-10-03 Agere Systems Inc. Perceptual synthesis of auditory scenes
US20030035553A1 (en) 2001-08-10 2003-02-20 Frank Baumgarte Backwards-compatible perceptual coding of spatial cues
AU2003230993A1 (en) * 2002-04-25 2003-11-10 Shazam Entertainment, Ltd. Robust and invariant audio pattern matching
EP1506550A2 (en) * 2002-05-16 2005-02-16 Philips Electronics N.V. Signal processing method and arrangement
US7006636B2 (en) 2002-05-24 2006-02-28 Agere Systems Inc. Coherence-based audio coding and synthesis
US7292901B2 (en) * 2002-06-24 2007-11-06 Agere Systems Inc. Hybrid multi-channel/cue coding/decoding of audio signals
JP2006528859A (en) * 2003-07-25 2006-12-21 Koninklijke Philips Electronics N.V. Fingerprint generation and detection method and apparatus for synchronizing audio and video
US7013301B2 (en) * 2003-09-23 2006-03-14 Predixis Corporation Audio fingerprinting system and method
CA2992089C (en) 2004-03-01 2018-08-21 Dolby Laboratories Licensing Corporation Reconstructing audio signals with multiple decorrelation techniques and differentially coded parameters
DE102004046746B4 (en) * 2004-09-27 2007-03-01 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method for synchronizing additional data and basic data
US7567899B2 (en) * 2004-12-30 2009-07-28 All Media Guide, Llc Methods and apparatus for audio recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2006102991A1 *

Also Published As

Publication number Publication date
AU2006228821A1 (en) 2006-10-05
MY139836A (en) 2009-10-30
HK1111259A1 (en) 2009-11-20
DE102005014477A1 (en) 2006-10-12
TWI318845B (en) 2009-12-21
AU2006228821B2 (en) 2009-07-23
CN101189661B (en) 2011-10-26
CA2603027C (en) 2012-09-11
CA2603027A1 (en) 2006-10-05
US20080013614A1 (en) 2008-01-17
CN101189661A (en) 2008-05-28
JP5273858B2 (en) 2013-08-28
JP2008538239A (en) 2008-10-16
US7903751B2 (en) 2011-03-08
WO2006102991A1 (en) 2006-10-05
AT434253T (en) 2009-07-15
DE502006003997D1 (en) 2009-07-30
TW200644704A (en) 2006-12-16
EP1864279B1 (en) 2009-06-17

Similar Documents

Publication Publication Date Title
JP6196249B2 (en) Apparatus and method for encoding an audio signal having multiple channels
JP2019074743A (en) Transcoding apparatus
KR20180115652A (en) Method and apparatus for encoding and decoding successive frames of an ambisonics representation of a 2- or 3-dimensional sound field
US9792918B2 (en) Methods and apparatuses for encoding and decoding object-based audio signals
US9361896B2 (en) Temporal and spatial shaping of multi-channel audio signal
JP5638037B2 (en) Parametric joint coding of audio sources
JP5311597B2 (en) Multi-channel encoder
JP5292498B2 (en) Time envelope shaping for spatial audio coding using frequency domain Wiener filters
KR101698442B1 (en) Mdct-based complex prediction stereo coding
JP5678048B2 (en) Audio signal decoder using cascaded audio object processing stages, method for decoding audio signal, and computer program
RU2491658C2 (en) Audio signal synthesiser and audio signal encoder
JP5185340B2 (en) Apparatus and method for displaying a multi-channel audio signal
US8756066B2 (en) Methods and apparatuses for encoding and decoding object-based audio signals
KR101358700B1 (en) Audio encoding and decoding
ES2362920T3 (en) Improved method for signal conformation in multichannel audio reconstruction.
RU2381570C2 (en) Stereophonic compatible multichannel sound encoding
RU2325046C2 (en) Audio coding
ES2394768T3 (en) Method and receiver for high frequency reconstruction of a stereo audio signal
KR100954179B1 (en) Near-transparent or transparent multi-channel encoder/decoder scheme
RU2491657C2 (en) Efficient use of stepwise transmitted information in audio encoding and decoding
CA2583146C (en) Diffuse sound envelope shaping for binaural cue coding schemes and the like
KR100878367B1 (en) Multi-Channel Hierarchical Audio Coding with Compact Side-Information
JP4943418B2 (en) Scalable multi-channel speech coding method
KR101100221B1 (en) A method and an apparatus for decoding an audio signal
JP4335917B2 (en) Fidelity optimized variable frame length coding

Legal Events

Date Code Title Description
17P Request for examination filed

Effective date: 20070913

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

DAX Request for extension of the european patent (to any country) (deleted)
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1111259

Country of ref document: HK

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

Free format text: LANGUAGE OF EP DOCUMENT: GERMAN

REF Corresponds to:

Ref document number: 502006003997

Country of ref document: DE

Date of ref document: 20090730

Kind code of ref document: P

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1111259

Country of ref document: HK

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090917

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act
PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20091017

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090928

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20091017

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090917

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

26N No opposition filed

Effective date: 20100318

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090918

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20091218

PG25 Lapsed in a contracting state [announced from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 11

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 12

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 13

PGFP Annual fee paid to national office [announced from national office to epo]

Ref country code: CH

Payment date: 20190325

Year of fee payment: 14

Ref country code: DE

Payment date: 20190325

Year of fee payment: 14

Ref country code: MC

Payment date: 20190320

Year of fee payment: 14

Ref country code: IE

Payment date: 20190326

Year of fee payment: 14

Ref country code: GB

Payment date: 20190325

Year of fee payment: 14

Ref country code: FR

Payment date: 20190326

Year of fee payment: 14

Ref country code: LU

Payment date: 20190321

Year of fee payment: 14

PGFP Annual fee paid to national office [announced from national office to epo]

Ref country code: AT

Payment date: 20190319

Year of fee payment: 14

Ref country code: BE

Payment date: 20190321

Year of fee payment: 14