EP1864279A1 - Device and method for generating a data stream and for generating a multi-channel representation

Info

Publication number
EP1864279A1
EP1864279A1 EP06707562A EP06707562A EP1864279A1 EP 1864279 A1 EP1864279 A1 EP 1864279A1 EP 06707562 A EP06707562 A EP 06707562A EP 06707562 A EP06707562 A EP 06707562A EP 1864279 A1 EP1864279 A1 EP 1864279A1
Authority
EP
European Patent Office
Prior art keywords
channel
fingerprint
block
information
multichannel
Prior art date
Legal status
Granted
Application number
EP06707562A
Other languages
German (de)
English (en)
Other versions
EP1864279B1 (fr
Inventor
Wolfgang Fiesel
Matthias Neusinger
Harald Popp
Stephan Geyersberger
Current Assignee
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Original Assignee
Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008: Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing

Definitions

  • the present invention relates to audio signal processing, and more particularly to multi-channel
  • Binaural Cue Coding (BCC) and Spatial Audio Coding, as described in J. Herre, C. Faller, S. Disch, C. Ertel, J. Hilpert, A. Hoelzer, K. Linzmeier, C. Spenger, P. Kroon: "Spatial Audio Coding: Next-Generation Efficient and Compatible Coding of Multi-Channel Audio", 117th AES Convention, San Francisco 2004, Preprint 6186.
  • FIG. 3 shows a joint stereo device 60.
  • This device may be a device that implements, for example, the intensity stereo (IS) technique or the Binaural Cue Coding (BCC) technique.
  • IS intensity stereo
  • BCC Binaural Cue coding technique
  • Such a device typically receives at least two channels CH1, CH2, ..., CHn as input and outputs a single carrier channel as well as parametric multichannel information.
  • the parametric data is defined so that an approximation of an original channel (CH1, CH2, ..., CHn) can be calculated in a decoder.
  • the carrier channel will include subband samples, spectral coefficients, time domain samples, etc. that provide a relatively fine representation of the underlying signal, while the parametric data does not include such samples or spectral coefficients, but rather control parameters for controlling a particular reconstruction algorithm, such as weighting by multiplication, time shifting, frequency shifting, etc.
  • the parametric multi-channel information therefore comprises a relatively rough representation of the signal or the associated channel.
  • the amount of data needed by a carrier channel is about 60 to 70 kbps, while the amount of data required by the parametric side information for one channel is in the range of 1.5 to 2.5 kbps.
  • the above figures apply to compressed data.
  • a non-compressed CD channel requires data rates that are higher by a factor of about ten.
  • Examples of parametric data are the known scale factors, intensity stereo information, or BCC parameters, as set forth below.
  • the reconstructed signals differ in their amplitude, but they are identical in terms of their phase information.
  • the energy-time envelopes of both original audio channels are preserved by the selective scaling operation, which typically operates in a frequency-selective manner. This corresponds to the human perception of sound at high frequencies, where the dominant spatial information is determined by the energy envelopes.
  • the transmitted signal, i.e., the carrier channel
  • this processing, i.e., the generation of intensity stereo parameters for performing the scaling operations
  • both channels are combined to form a combined or "carrier" channel M, and the intensity stereo information is provided in addition to the combined channel.
  • the intensity stereo information depends on the energy of the first channel, the energy of the second channel or the energy of the combined channel.
  • the BCC technique is described in AES convention paper 5574, "Binaural Cue Coding applied to stereo and multi-channel audio compression", C. Faller, F. Baumgarte, May 2002, Munich.
  • in BCC coding, a number of audio input channels are converted to a spectral representation using a DFT-based transform with overlapping windows. The resulting spectrum is divided into non-overlapping partitions, each of which has an index. Each partition has a bandwidth proportional to the equivalent rectangular bandwidth (ERB). The inter-channel level differences (ICLD) and the inter-channel time differences (ICTD) are determined for each partition and for each frame k.
  • the ICLD and ICTD are quantized and encoded to finally arrive as side information in a BCC bit stream. The inter-channel time differences are given for each channel relative to a reference channel. The parameters are then calculated according to predetermined formulas that depend on the particular partitions of the signal being processed.
  • on the decoder side, the decoder typically receives a mono signal and the BCC bit stream.
  • the mono signal is transformed into the frequency domain and input to a spatial synthesis block, which also receives decoded ICLD and ICTD values.
  • the BCC parameters ICLD and ICTD are used to perform a mono signal weighting operation to synthesize the multichannel signals representing, after frequency / time conversion, a reconstruction of the original multichannel audio signal.
  • the joint stereo module 60 operates to output the channel side information such that the parametric channel data are quantized and encoded ICLD or ICTD parameters, with one of the original channels being used as the reference channel for encoding the channel side information.
  • the carrier signal is formed from the sum of the participating source channels.
  • the above techniques provide only a mono representation for a decoder that can process only the carrier channel but is unable to process the parametric data to produce one or more approximations of more than one input channel.
  • in the following, a typical BCC scheme for multi-channel audio decoding is shown in greater detail with reference to FIGS. 4 to 6.
  • Fig. 5 shows such a BCC scheme for coding / transmission of multichannel audio signals.
  • the multi-channel audio input signal at an input 110 of a BCC encoder 112 is downmixed in a so-called downmix block 114.
  • the original multi-channel signal at the input 110 is a 5-channel surround signal having a front left channel, a front right channel, a left surround channel, a right surround channel and a center channel.
  • the downmix block 114 generates a sum signal by simply adding these five channels into a mono signal.
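
The summation performed by the downmix block can be sketched in a few lines of Python. This is a minimal illustration only: the function name `downmix_to_mono` and the toy sample values are assumptions, and a practical downmix would usually also scale the sum to avoid clipping, which the text does not discuss.

```python
def downmix_to_mono(channels):
    """Add equally long channels sample by sample into one sum signal."""
    if len({len(ch) for ch in channels}) != 1:
        raise ValueError("all channels must have the same number of samples")
    return [sum(frame) for frame in zip(*channels)]


# Toy 5-channel block (hypothetical values): front left, front right,
# left surround, right surround, center.
five_channels = [
    [0.1, 0.2, 0.3],
    [0.1, 0.0, -0.1],
    [0.0, 0.1, 0.0],
    [0.0, -0.1, 0.0],
    [0.2, 0.2, 0.2],
]
sum_signal = downmix_to_mono(five_channels)
```

The result has one sample per input frame, so the carrier channel keeps the block length of the original channels.
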
  • inter-channel level differences ICLD
  • inter-channel time differences ICTD
  • the BCC analysis block 116 is also capable of calculating inter-channel correlation (ICC) values.
  • the sum signal and the side information are transmitted in a quantized and encoded format to a BCC decoder 120.
  • the BCC decoder splits the transmitted sum signal into a number of subbands and performs scaling, delays and other processing to provide the subbands of the multichannel audio channels to be output. This processing is performed so that the ICLD, ICTD and ICC parameters (cues) of a reconstructed multichannel signal at output 121 match the corresponding cues for the original multichannel signal at input 110 in BCC encoder 112.
  • the BCC decoder 120 includes a BCC synthesis block 122 and a side information processing block 123.
  • the sum signal on line 115 is fed to a time / frequency conversion unit or filter bank FB 125.
  • at the output of the block 125 there is a number N of subband signals or, in an extreme case, a block of spectral coefficients when the audio filter bank 125 performs a 1:1 transform, i.e. a transform that generates N spectral coefficients from N time-domain samples.
  • the BCC synthesis block 122 further comprises a delay stage 126, a level modification stage 127 and a correlator. At the output of stage 129, the reconstructed multichannel audio signal, with for example five channels in the case of a 5-channel surround system, can be output to a set of loudspeakers 124, as shown in FIG. 5.
  • the input signal sn is converted into the frequency domain or the filter bank domain by means of the element 125.
  • the signal output by element 125 is copied so as to obtain multiple versions of the same signal, as represented by copy node 130.
  • the number of versions of the original signal is equal to the number of output channels in the output signal.
  • each version of the original signal at node 130 undergoes a particular delay d_1, d_2, ..., d_i, ..., d_N.
  • the delay parameters are calculated by the side information processing block 123 in FIG. 5 and derived from the inter-channel time differences as calculated by the BCC analysis block 116.
  • the ICC parameters calculated by the BCC analysis block 116 are used to control the functionality of the block
  • the order of steps 126, 127, 128 may differ from the sequence shown in FIG.
  • the BCC analysis is carried out in frames, that is temporally variable, and that further a frequency-wise BCC analysis is obtained, as can be seen by the filter bank division of FIG. This means that the BCC parameters are obtained for each spectral band.
  • if the audio filter bank 125 decomposes the input signal into, for example, 32 bandpass signals, the BCC analysis block obtains a set of BCC parameters for each of the 32 bands.
  • the BCC synthesis block 122 of Fig. 5, which is detailed in Fig. 6, performs a reconstruction based on the 32 bands exemplified.
  • ICLD, ICTD and ICC parameters can be defined between channel pairs. However, it is preferred to determine the ICLD and ICTD parameters between a reference channel and each other channel. This is shown in Fig. 4A.
  • ICC parameters can be defined in several ways. Generally speaking, one can determine ICC parameters in the encoder between all possible channel pairs, as shown in Fig. 4B. However, it has been proposed to calculate only ICC parameters between the strongest two channels at a time, as shown in Fig. 4C, where an example is shown where at one time an ICC parameter between channels 1 and 2 is calculated, and at another time an ICC parameter between channels 1 and 5 is calculated. The decoder then synthesizes the inter-channel correlation between the strongest channels in the decoder and uses certain heuristic rules to compute and synthesize the inter-channel coherence for the remaining channel pairs.
  • the multiplication parameters a_1, ..., a_N represent an energy distribution of an original multichannel signal. Without loss of generality, it is preferred, as shown in FIG. 4A, to take four ICLD parameters representing the energy difference between the respective channels and the front left channel. In the side information processing block 123, the multiplication parameters a_1, ..., a_N are derived from the ICLD parameters such that the total energy of all reconstructed output channels is the same as (or proportional to) the energy of the transmitted sum signal.
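
As a hedged illustration of this energy normalization, the following Python sketch derives multiplication parameters from ICLD values. It assumes, which the text does not state, that each ICLD is a level difference in dB relative to the reference channel and that the gains are scaled so that their squares sum to 1. The function name `gains_from_icld` and the example values are hypothetical.

```python
import math


def gains_from_icld(icld_db):
    """Return gains [a_1, ..., a_N] for a reference channel plus len(icld_db) others.

    icld_db[i] is the assumed level difference in dB of channel i+2 relative
    to reference channel 1. The gains are scaled so that sum(a_i ** 2) == 1,
    i.e. the output channels together carry the energy of the sum signal.
    """
    ratios = [10.0 ** (d / 20.0) for d in icld_db]  # amplitude ratios vs. reference
    a_ref = 1.0 / math.sqrt(1.0 + sum(r * r for r in ratios))
    return [a_ref] + [a_ref * r for r in ratios]


# Four ICLD values (hypothetical) give gains for five output channels.
gains = gains_from_icld([0.0, -6.0, -6.0, -3.0])
total_energy = sum(a * a for a in gains)
```

With four ICLD values, as in the five-channel case described above, the function returns five gains whose squared sum equals the normalized energy of the sum signal.
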
  • block-based schemes are used in which, as also shown in FIG. 5, the original multichannel signal at input 110 undergoes block processing in a block stage 111 such that, from one block of, for example, 1152 samples, the downmix signal or sum signal or the at least one base channel is formed for this block, while at the same time the corresponding multichannel parameters are generated for this block by the BCC analysis.
  • the sum signal is typically encoded again with a block based encoder, such as an MP3 encoder or an AAC encoder, to obtain a further data rate reduction.
  • the parameter data is coded, for example by differential coding, scaling / quantization and entropy coding.
  • a common data stream is written in which a block of the at least one base channel follows an earlier block of the at least one base channel, and into which the encoded multichannel additional information is also inserted, for example by a bit stream multiplexer.
  • This insertion takes place in such a way that the data stream of base channel data and multichannel additional information always comprises one block of base channel data and, associated with this block, one block of multichannel additional data, which then, e.g., together form a common transmission frame. This transmission frame is then sent over a transmission link to a decoder.
  • the decoder in turn comprises a data stream demultiplexer on the input side for splitting a frame of the data stream into a block of base channel data and a block of associated multichannel additional information. The block of base channel data is then decoded, e.g., by an MP3 decoder or an AAC decoder. This block of decoded base channel data is then supplied to the BCC decoder 120 together with the block of multichannel additional information, which may optionally also have been decoded.
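
The common transmission frame and its demultiplexing can be sketched as follows. The frame layout used here (two big-endian 16-bit length fields followed by the two payloads) is purely an assumption for illustration, as are the names `mux_frame` and `demux_stream`; the text does not specify any frame format.

```python
import struct

# Hypothetical frame header: length of the base channel block and length of
# the multichannel additional information block, each as a 16-bit integer.
HEADER = struct.Struct(">HH")


def mux_frame(base_block: bytes, side_block: bytes) -> bytes:
    """Pack one base channel block and its associated side-info block."""
    return HEADER.pack(len(base_block), len(side_block)) + base_block + side_block


def demux_stream(stream: bytes):
    """Split a concatenation of frames back into (base, side) block pairs."""
    frames = []
    pos = 0
    while pos < len(stream):
        n_base, n_side = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        base = stream[pos:pos + n_base]
        pos += n_base
        side = stream[pos:pos + n_side]
        pos += n_side
        frames.append((base, side))
    return frames


stream = mux_frame(b"base-0", b"side-0") + mux_frame(b"base-1", b"side-1")
frames = demux_stream(stream)
```

Because each frame carries both blocks, the pairing of base channel data and additional information survives transmission without any further synchronization step.
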
  • the temporal assignment of the additional information to the basic channel data is automatically determined and readily re-established by a decoder which operates on a frame-by-frame basis.
  • the decoder will, in a sense, automatically find the additional information associated with a block of base channel data, so that a high-quality multichannel reconstruction is possible. There is thus no problem of the multichannel additional information having a time offset relative to the base channel data.
  • a situation may arise, for example, in a sequentially operating transmission system, such as broadcasting or the Internet.
  • the audio program to be transmitted is divided into audio base data (mono or stereo downmix audio signal) and extension data (multichannel additional information), which are broadcast individually or in combination.
  • coders / decoders with non-constant output data rate in order to achieve a particularly good bit efficiency.
  • this processing also depends on the actually used hardware components for decoding, as they must be present for example in a PC or digital receiver.
  • a systemic or algorithm-inherent uncertainty, since, in particular with the bit reservoir technique, a constant output data rate is generated on average, while locally, bits that are not needed for a block that is particularly easy to code are saved so that they can be taken from the bit reservoir again for another block that is particularly difficult to code, e.g. because the audio signal is particularly transient.
  • the separation of the common data stream described above into two individual data streams has particular advantages. A classic receiver, e.g. a pure mono or stereo receiver, is thus able at any time to receive and reproduce the audio base data, regardless of the content and version of the multichannel additional information. The separation into separate data streams thus ensures the backward compatibility of the entire concept.
  • a receiver of the newer generation can evaluate this multi-channel additional data and combine it with the audio base data in such a way that the user can be provided with the complete extension, here the multi-channel sound.
  • a particularly interesting application scenario of separate transmission of audio base data and extension data is in digital broadcasting.
  • the previously broadcast stereo audio signal can be extended by a small additional transmission effort to a multi-channel format, such as 5.1.
  • the program provider generates on the transmitter side from multi-channel sound sources, such as those found on DVD-Audio / Video, the multi-channel additional information.
  • this multichannel additional information is transmitted in parallel to the audio stereo signal broadcast so far, which is now not simply a stereo signal but comprises two base channels derived from the multichannel signal by some downmix.
  • the stereo signal of the two base channels sounds like a normal stereo signal because multichannel analysis ultimately takes similar steps as those made by a sound engineer who mixed a stereo signal out of multiple tracks.
  • a major advantage of the separation is the compatibility with the existing digital broadcasting systems.
  • a classic receiver that cannot evaluate this additional information will continue to receive and reproduce the two-channel sound signal without any loss of quality.
  • a receiver of a newer design can, in addition to the previously received stereo sound signal, evaluate and decode this multichannel information and reconstruct the original 5.1 multichannel signal therefrom.
  • multi-channel additional information as a supplement to the previously used stereo signal
  • the receiver therefore sees only one (valid) audio data stream and, if it is a receiver of the newer type, can extract the multichannel additional information from the data stream via a corresponding upstream data distributor, again synchronously with the associated audio data block, decode it and output 5.1 multichannel sound.
  • a disadvantage of this approach is that the existing infrastructure or the existing data paths must be extended so that, instead of only the stereo audio signals as before, they can transport the signals combined from downmix signals and extension signals. So, if one departs from the standard transmission format for stereo data, synchronicity can be ensured even in broadcast transmissions by the common data stream.
  • the other alternative is not to couple the multichannel additional information to the audio coding system used and therefore not to insert it into the actual audio data stream.
  • the transmission takes place via a separate, but not necessarily synchronized, parallel digital additional channel.
  • This situation can occur if the downmix data are passed in unreduced form, for example as PCM data in the AES/EBU data format, through a standard audio distribution infrastructure in studios. These infrastructures are designed to distribute audio signals digitally between diverse sources. For this purpose, functional units known as "crossbars" are normally used.
  • audio signals are processed in the PCM format for purposes of sound control and dynamic compression. All these steps lead to incalculable delays on the way from the transmitter to the receiver.
  • the separate transmission of base channel data and multi-channel additional information is particularly interesting since existing stereo infrastructures do not need to be changed, i.e. the disadvantages of non-conformity with the standard described for the first possibility do not occur.
  • a broadcasting system only needs to broadcast one additional channel, but not change the infrastructure for the existing stereo channel.
  • the additional effort is therefore effectively incurred solely on the receiver side, while backward compatibility is preserved, so that a user who has a new receiver gets better sound quality than a user who has an old receiver.
  • the magnitude of the time shift can no longer be determined from the received audio signal and the additional information.
  • a timely correct reconstruction and assignment of the multi-channel signal in the receiver is no longer guaranteed.
  • Another example of such a delay problem is when an already-running two-channel transmission system is to be extended to multi-channel transmission, for example in a receiver of a digital radio.
  • the downmix signal is decoded by a two-channel audio decoder already present in the receiver, whose delay time is not known and thus cannot be compensated.
  • the downmix audio signal may even reach the multichannel reconstruction audio decoder via a transmission chain containing analog parts, i.e. a digital/analog conversion takes place at one point, followed, after further storage or transmission, by an analog/digital conversion. Something like this always takes place in a radio transmission. Here too, no clues are initially available as to how an appropriate delay equalization of the downmix signal relative to the multi-channel additional data can be performed. Even if the sampling frequency for the A/D conversion and the sampling frequency for the D/A conversion differ only slightly, the result is a slow time drift of the necessary compensation delay corresponding to the ratio of the two sampling rates to one another.
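
The size of such a drift can be estimated with a short calculation. The sketch below is only a back-of-the-envelope model: it assumes that the D/A converter consumes samples at `f_da_hz` and the A/D converter produces samples at `f_ad_hz` over the same wall-clock interval. The function name and the example mismatch of 50 parts per million are assumptions, not values from the text.

```python
def drift_in_samples(n_in: float, f_da_hz: float, f_ad_hz: float) -> float:
    """Extra samples produced by the A/D stage for n_in samples fed to the D/A stage.

    Playing n_in samples takes n_in / f_da_hz seconds; during that time the
    A/D stage produces n_in * f_ad_hz / f_da_hz samples.
    """
    return n_in * (f_ad_hz / f_da_hz - 1.0)


F_DA = 48_000.0                # assumed D/A sampling rate in Hz
F_AD = 48_000.0 * (1 + 50e-6)  # assumed A/D rate, 50 ppm faster
one_hour_of_samples = F_DA * 3600

drift = drift_in_samples(one_hour_of_samples, F_DA, F_AD)
drift_seconds = drift / F_DA
```

Under these assumptions the compensation delay shifts by roughly 8640 samples, about 0.18 s, per hour, which is far more than one block and illustrates why a one-time delay compensation may not be sufficient.
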
  • to synchronize the additional data to the base data, various techniques can be used, which are known by the term "time synchronization methods". These are based on inserting time stamps into both data streams in such a way that a correct assignment of the associated data can be made in the receiver on the basis of these time stamps. However, inserting time stamps also alters the normal stereo infrastructure.
  • the object of the present invention is to provide a concept for generating a data stream or for generating a multi-channel representation, by means of which a synchronization of basic channel data and multi-channel additional information can be achieved.
  • this object is achieved by a device for generating a data stream according to claim 1, a device for generating a multi-channel representation according to claim 17, a method for generating a data stream according to claim 26, a method for generating a multi-channel representation according to claim 27, a computer program according to claim 28 or a data stream representation according to claim 29.
  • the present invention is based on the finding that a separate transmission and time-synchronous merging of a base channel data stream and a multichannel additional information data stream is made possible by modifying the multichannel data stream on the sender side such that fingerprint information, which reproduces a time profile of the at least one base channel, is introduced into the data stream containing the multichannel additional information in such a way that a relationship between the multichannel additional information and the fingerprint information can be derived from the data stream. Specific multichannel additional information belongs to specific base channel data. Exactly this assignment must also be preserved when separate data streams are transmitted.
  • the affiliation of multichannel additional information to basic channel data is signaled on the sender side by the fact that fingerprint information is determined from the basic channel data with which the multichannel additional information which belongs to precisely this basic channel data is as it were marked.
  • This marking of the relationship between the multichannel additional information and the fingerprint information is achieved in block-wise data processing in that a block of multichannel additional information that corresponds exactly to a block of base channel data is associated with a block fingerprint of that block of base channel data to which the block of multichannel additional information under consideration belongs.
  • the block fingerprint of the block of base channel data may be inserted into the block structure of the multichannel additional data stream such that each block of multichannel additional information contains the block fingerprint of the associated base channel data.
  • the block fingerprint may be written immediately after a block of multichannel additional information as used so far, or before that block, or at any known location within that block, such that the block fingerprint is readable for synchronization purposes in the multichannel reconstruction.
  • the data stream therefore contains normal multichannel additional data as well as the block fingerprints interspersed accordingly. Alternatively, the data stream could also be written such that, e.g.,
  • all block fingerprints, provided with additional information, are at the beginning of the data stream generated in accordance with the present invention, so that a first portion of the data stream contains only block fingerprints and a second portion of the data stream contains the multichannel additional data written block-wise and belonging to the block fingerprint information.
  • additional information such as a block counter
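
The two ways of arranging block fingerprints in the data stream can be sketched with a small data structure. Everything named here (`SideInfoBlock`, `interleaved_layout`, `fingerprints_first_layout`, the integer fingerprints and the byte payloads) is a hypothetical illustration; the text does not prescribe a concrete representation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SideInfoBlock:
    block_counter: int  # optional extra information, e.g. a block counter
    fingerprint: int    # block fingerprint of the associated base channel block
    params: bytes       # multichannel additional information for this block


def interleaved_layout(blocks: List[SideInfoBlock]) -> List[Tuple[int, bytes]]:
    """Each block of additional information carries its own block fingerprint."""
    return [(b.fingerprint, b.params) for b in blocks]


def fingerprints_first_layout(blocks: List[SideInfoBlock]):
    """All fingerprints (with block counters) first, then the parameter blocks."""
    first_part = [(b.block_counter, b.fingerprint) for b in blocks]
    second_part = [(b.block_counter, b.params) for b in blocks]
    return first_part, second_part


blocks = [SideInfoBlock(i, fp, bytes([i])) for i, fp in enumerate([17, 42, 5])]
interleaved = interleaved_layout(blocks)
head, tail = fingerprints_first_layout(blocks)
```

In the second layout the block counter is what ties a fingerprint in the first portion to its parameter block in the second portion, which is why some such additional information is needed there.
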
  • a large number of block fingerprints could simply be read in first to obtain the reference fingerprint information.
  • the test fingerprints are added until there is a minimum number of test fingerprints used for a correlation.
  • the set of reference fingerprints could, e.g., already be subjected to differential coding if the correlation in the multi-channel reconstruction is performed using differences while the data stream contains absolute block fingerprints rather than difference block fingerprints.
  • the data stream is processed on the receiver side with the basic channel data, that is to say initially decoded, for example, and then supplied to a multichannel reconstructor.
  • this multi-channel reconstructor is designed such that it, if it does not receive additional information, simply makes a through connection to output the preferably two base channels as a stereo signal.
  • In parallel, the reference fingerprint information is extracted and the test fingerprint information is calculated from the decoded base channel data, in order then to perform a correlation calculation to determine the offset of the base channel data relative to the multi-channel additional data.
  • this offset is also the correct offset. This will be the case if the offset obtained by the second correlation calculation does not deviate more than a predetermined threshold from the offset obtained by the first correlation calculation.
  • Base channel data is thus processed at the moment it is received, so that, of course, only stereo data can be output during the period in which the synchronization, i.e. the offset calculation, takes place, since no synchronized multichannel additional information has yet been found.
  • the rendering may be performed so that the entire synchronization calculation is carried out without stereo data being output in parallel, and then, from the first block onward, the base channel data are supplied together with synchronized multi-channel additional information, so that the listener has a synchronized 5.1 experience from the first block.
  • the time for synchronization is normally about 5 seconds, since about 200 reference fingerprints are needed as reference fingerprint information for optimal offset calculation. If this delay of about 5 seconds is irrelevant, as is the case with unidirectional transmissions, for example, you can begin with a 5.1 playback - but only after the time required for the offset calculation.
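
The figure of about 5 seconds can be checked against the block length of 1152 samples given earlier as an example. The calculation below assumes a sampling rate of 44.1 kHz, which the text does not state, and one fingerprint per block.

```python
SAMPLES_PER_BLOCK = 1152   # example block length from the text
SAMPLE_RATE_HZ = 44_100    # assumption: CD sampling rate, not given in the text
NUM_FINGERPRINTS = 200     # reference fingerprints named in the text

block_duration_s = SAMPLES_PER_BLOCK / SAMPLE_RATE_HZ
sync_time_s = NUM_FINGERPRINTS * block_duration_s
```

Under these assumptions one block lasts about 26 ms and 200 blocks last a little over 5 seconds, which is consistent with the stated synchronization time.
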
  • time-varying and suitable fingerprint information is calculated from the corresponding mono or stereo downmix audio signal.
  • this fingerprint information is regularly inserted into the transmitted multi-channel additional data stream as a synchronization aid. This is preferably done as a data field in the middle of the block-organized, e.g.
  • time-varying and suitable fingerprint information is calculated from the corresponding stereo audio signal, i.e. the base channel data, a number of two base channels being preferred according to the invention. Furthermore, the fingerprints are extracted from the multi-channel additional information. Thereafter, the time offset between the multichannel additional information and the received audio signal is calculated via correlation methods, such as calculating a cross-correlation between the test fingerprint information and the reference fingerprint information. Alternatively, trial-and-error methods can also be carried out in which different sets of test fingerprint information, calculated from the base channel data on the basis of different block rasters, are compared with the reference fingerprint information, and the temporal offset is determined from the test block raster whose associated test fingerprint information best matches the reference fingerprint information.
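
The offset determination by cross-correlation can be sketched as follows. This is a minimal illustration under stated assumptions: each fingerprint is taken to be a single number per block, the correlation is a mean-removed, normalized correlation evaluated over a limited range of lags, and the names `normalized_correlation` and `best_offset` are hypothetical. The text does not fix any of these details.

```python
import math
import random


def normalized_correlation(a, b):
    """Mean-removed, normalized correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def best_offset(test_fp, ref_fp, max_lag):
    """Return the lag (in blocks) for which test_fp[i] best matches ref_fp[i + lag]."""
    best_lag, best_score = 0, -2.0
    for lag in range(-max_lag, max_lag + 1):
        pairs = [(test_fp[i], ref_fp[i + lag])
                 for i in range(len(test_fp)) if 0 <= i + lag < len(ref_fp)]
        if len(pairs) < 2:
            continue
        a, b = zip(*pairs)
        score = normalized_correlation(a, b)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic data: the test fingerprints are the reference fingerprints
# delayed by 7 blocks (values are random stand-ins for real fingerprints).
rng = random.Random(0)
ref_fp = [rng.random() for _ in range(100)]
test_fp = ref_fp[7:47]
offset = best_offset(test_fp, ref_fp, max_lag=20)
```

In this synthetic case the returned lag is 7 blocks, the delay that was built into the test data. A real implementation would also have to cope with fingerprints that differ slightly because of coding or analog transmission, which this sketch does not model.
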
  • the audio signal of the base channels is synchronized with the multichannel additional information for the subsequent multichannel reconstruction by a downstream delay equalization stage.
  • depending on the implementation, only an initial delay may be compensated by the downstream delay equalization stage.
  • the offset computation is performed parallel to the reproduction in order to be able to readjust the offset as required and according to the result of the correlation calculation in the event of a drifting apart of the basic channel data and the multichannel additional information despite a compensated initial delay.
  • the delay equalization stage can thus also be actively regulated.
  • the present invention is advantageous in that there is no need to make any changes to the base channel data or to the basic channel data processing path.
  • the base channel data stream fed to a receiver is no different from a common base channel data stream. Changes are made only to the multi-channel data stream. It is modified such that the fingerprint information is inserted.
  • changing the multichannel additional data stream does not lead to an unwanted departure from an already standardized, implemented and established solution, as would be the case if the base channel data stream were modified.
  • the scenario according to the invention provides a particular flexibility for the distribution of multichannel additional information.
  • the multichannel additional information is parameter information that is very compact in terms of the required data rate or storage capacity
  • a digital receiver can also be supplied with such data completely separately from the stereo signal.
  • a user could obtain multi-channel additional information from a separate provider for stereo recordings that already exist on his solid-state player or on his CDs and save them on his playback device.
  • This storage is not a problem because the memory requirement, especially for parametric multi-channel additional information, is not particularly large.
  • the corresponding multi-channel additional data stream can be retrieved from the multi-channel additional data memory and synchronized with the stereo signal on the basis of the fingerprint information in that data stream, in order to achieve a multi-channel reconstruction.
  • the solution according to the invention thus works completely independently of the origin of the stereo signal, that is, regardless of whether it comes from a digital radio receiver, whether it comes from a CD, whether it comes from a DVD or whether it is, e.g.,
  • multi-channel additional data, which may come from an entirely different source, can be synchronized with the stereo signal; the stereo signal then acts as the base channel data on the basis of which the multichannel reconstruction is performed.
  • FIG. 1 shows a block diagram of a device according to the invention for generating a data stream;
  • FIG. 2 shows a block diagram of a device according to the invention for generating a multi-channel representation;
  • FIG. 3 shows a known joint stereo encoder for generating channel data and parametric multi-channel information;
  • FIG. 4 is an illustration of a scheme for determining ICLD, ICTD, and ICC parameters for BCC encoding/decoding;
  • FIG. 5 is a block diagram of a BCC encoder/decoder chain;
  • FIG. 6 is a block diagram of one implementation of the BCC synthesis block of FIG. 5;
  • FIG. 7a is a schematic representation of an original multi-channel signal as a sequence of blocks;
  • FIG. 7b is a schematic representation of one or more base channels as a sequence of blocks;
  • FIG. 7c shows a schematic representation of the data stream according to the invention with multi-channel information and associated block fingerprints;
  • FIG. 7d is an exemplary diagram for a block of the data stream of FIG. 7c;
  • FIG. 8 shows a more detailed representation of the device according to the invention for generating a multi-channel representation according to a preferred embodiment;
  • FIG. 9 shows a schematic representation for clarifying the offset determination by correlation between the test fingerprint information and the reference fingerprint information;
  • FIG. 11 shows a schematic representation of the calculation of the fingerprint information or coded fingerprint information on the encoder and decoder side.
  • the device comprises a fingerprint generator 2, to which at least one base channel derived from the original multi-channel signal can be supplied via an input line 3.
  • the number of base channels is greater than or equal to 1 and less than a number of channels of the original multi-channel signal. If the original multi-channel signal is just a stereo signal with only two channels, then there is only a single base channel derived from the two stereo channels. However, if the original multichannel signal is a signal with three or more channels, the number of base channels may be equal to two.
  • the original multi-channel signal is a surround signal with five channels and one LFE channel (LFE - Low Frequency Enhancement), this channel also being called a subwoofer.
  • the five channels are a left surround channel Ls, a left channel L, a center channel C, a right channel R, and a right rear surround channel Rs.
  • the two base channels are then the left base channel and the right base channel.
  • the one or more base channels are also referred to as the downmix channel or downmix channels.
  • the fingerprint generator 2 is designed to generate fingerprint information from the at least one base channel, the fingerprint information representing a time profile of the at least one base channel.
  • the fingerprint information can be calculated with more or less effort.
  • very elaborate fingerprints, known under the heading "audio ID" and based in particular on statistical methods, can be used here. Alternatively, any other quantity could be used that in some way represents the time course of the one or more base channels.
  • the fingerprint information is composed of a series of block fingerprints, where a block fingerprint is a measure of the energy of the one or more base channels in the block.
  • the fingerprint information is thus derived from the sample data of the at least one base channel and reproduces the time course of the at least one base channel with a greater or lesser error, so that, as will be explained later, a correlation with test fingerprint information calculated from the base channel can be performed on the decoder/receiver side in order ultimately to determine the offset between the data stream with the multi-channel additional information and the base channel.
  • the fingerprint generator 2 supplies, on the output side, the fingerprint information which is supplied to a data stream generator 4.
  • the data stream generator 4 is configured to generate a data stream from the fingerprint information and the typically time-varying multi-channel additional information, wherein the multi-channel additional information together with the at least one base channel enables the multichannel reconstruction of the original multi-channel signal.
  • the data stream generator is designed to generate the data stream at an output 5 such that a relationship between the multichannel additional information and the fingerprint information can be derived from the data stream.
  • the data stream of multichannel additional information is thus marked with the fingerprint information derived from the at least one base channel, so that the association of particular multi-channel additional information with the base channel data can be determined via the fingerprint information, whose assignment to the multichannel additional information is provided by the data stream generator 4.
  • FIG. 2 shows a device according to the invention for generating a multi-channel representation of an original multichannel signal from at least one base channel and a data stream, which has fingerprint information representing a time profile of the at least one base channel and multi-channel additional information which, together with the at least one base channel, enables the multi-channel reconstruction of the original multi-channel signal, wherein a relationship between the multi-channel additional information and the fingerprint information is derivable from the data stream.
  • the at least one base channel is fed via an input 10 to a receiver or decoder-side fingerprint generator 11.
  • the fingerprint generator 11 provides, on the output side, test fingerprint information via an output 12 to a synchronizer 13.
  • the test fingerprint information is preferably derived from the at least one base channel by the same algorithm as is executed in block 2 of FIG. 1. Depending on the implementation, however, the algorithms do not necessarily have to be identical.
  • the fingerprint generator 2 may generate a block fingerprint in absolute coding, while the fingerprint generator 11 performs a differential fingerprint determination on the decoder side, such that the test block fingerprint associated with a block is the difference between two absolute fingerprints.
  • in this case, a fingerprint extractor 14 will extract the fingerprint information from the data stream and at the same time form differences, so that data comparable to the test fingerprint information is supplied to the synchronizer 13 via an output 15.
  • it is generally preferred that the decoder-side test fingerprint calculation algorithm and the encoder-side fingerprint calculation algorithm, whose result may also be referred to as reference fingerprint information in FIG. 2, be at least similar, so that the synchronizer 13, using these two pieces of information, can synchronize the multichannel additional data in the data stream obtained via an input 16 with the data of the at least one base channel.
  • a synchronized multichannel representation is obtained which comprises the basic channel data and synchronously the multichannel additional data.
  • the synchronizer 13 determines a time offset between the base channel data and the multi-channel additional data and then delays the multi-channel additional data by this offset. It has been found that the multichannel additional data typically arrives too early, which can be attributed to the significantly smaller amount of data that the multichannel additional data typically represents compared with the amount of data for the base channel data. If, therefore, the multi-channel additional data is delayed, the data of the at least one base channel is fed from the input 10 via a base channel data line 17 to the synchronizer 13, is in effect merely "looped through" by it, and is output again at an output 18.
  • the data on lines 18 and 20 thus form the synchronized multi-channel representation, the data stream on line 20 corresponding to the data stream at input 16, apart from any multichannel additional data coding, except that the fingerprint information has been removed from the data stream, which, depending on the implementation, can happen in the synchronizer 13 or even before it.
  • the fingerprint removal can also take place in the fingerprint extractor 14, so that there is no line 19 but a line 19', which goes directly from the fingerprint extractor 9 into the synchronizer 13.
  • the synchronizer 13 is therefore supplied in parallel by the fingerprint extractor with both the multi-channel additional data and with the reference fingerprint information.
  • the synchronizer is thus configured to synchronize the multichannel additional information and the at least one base channel using the test fingerprint information and the reference fingerprint information, and using the relationship, derived from the data stream, between the multichannel additional information and the fingerprint information contained in the data stream.
  • the timing relationship between the multichannel additional information and the fingerprint information is preferably determined simply by whether the fingerprint information precedes a set of multichannel additional information, follows a set of multichannel additional information, or lies within a set of multichannel additional information. Depending on whether the fingerprints are in front of, behind, or in the midst of a set of multichannel additional information, it is determined on the encoder side that this multichannel information belongs to that fingerprint information.
  • block processing is used.
  • the keying of the fingerprints is made so that a block of multi-channel additional data always follows a block fingerprint, so that a block of multi-channel additional information alternates with a block fingerprint and vice versa.
  • a data stream format could also be used in which the entire fingerprint information is placed in a separate part at the beginning of the data stream, after which the whole data stream follows. In this case, block fingerprints and blocks of multichannel additional information would not alternate.
  • Alternative ways of assigning fingerprints to multichannel additional information are known to those skilled in the art. According to the invention, it is only necessary that a relationship between the multichannel additional information and the fingerprint information be derivable from the data stream on the decoder side, so that the fingerprint information can be used to synchronize the multichannel additional information with the base channel data.
  • FIG. 7a shows an original multi-channel signal, for example a 5.1 signal, which consists of a sequence of blocks B1 to B8, each block in the example shown in FIG. 7a containing multi-channel information MKi.
  • a block, such as the block B1, contains, for example, the first 1152 audio samples of each channel.
  • Such a block size is preferred, for example, in the BCC encoder 112 of FIG. 5, where the block formation, that is, the windowing used to obtain a sequence of blocks from a continuous signal, is achieved by the element 111 labeled "block" in FIG. 5.
  • the at least one base channel is present at the output of the downmix block 114, which is denoted "sum signal" in FIG. 5 and has the reference numeral 115.
  • the base channel data can again be represented as a sequence of blocks B1 to B8, where the blocks B1 to B8 in FIG. 7b correspond to the blocks B1 to B8 in FIG. 7a. However, a block now no longer contains the original 5.1 signal (if one stays in a time-domain representation) but only a mono signal or a stereo signal with two stereo base channels.
  • the block B1 therefore again comprises the 1152 time samples of both the first stereo base channel and the second stereo base channel, these 1152 samples of both the left stereo base channel and the right stereo base channel each being calculated by sample-wise addition/subtraction and optional weighting.
  • the data stream with multichannel information again comprises blocks B1 to B8, each block in FIG. 7c corresponding to the corresponding block of the original multichannel signal in FIG. 7a and of the one or more base channels in FIG. 7b.
  • the base channel data in the block B1 of the base channel data stream, labeled BK1, must be combined with the multi-channel information P1 of the block B1 in FIG. 7c. In the embodiment shown in FIG. 6, this combination is performed by the BCC synthesis block, which again has a blocking stage at its input in order to process the base channel data block by block.
  • P3 designates the multichannel information which, together with the block of values BK3 of the base channels, enables a reconstruction of the block of values MK3 of the original multichannel signal.
  • each block Bi of the data stream of FIG. 7c is now provided with a block fingerprint.
  • For the block B3, for example, this block fingerprint is derived exactly from the block of values BK3.
  • the block fingerprint F3 could also be subjected to differential coding, so that the block fingerprint F3 is equal to the difference between the block fingerprint of the block of values BK3 of the base channels and the block fingerprint of the block of values BK2 of the base channels.
  • an energy measure or a differential energy measure is used as the block fingerprint.
  • the data stream with the one or more base channels in FIG. 7b is transmitted to a multichannel reconstructor separately from the data stream with the multichannel information and the fingerprint information from FIG. 7c. If nothing else were done, the case could arise that the block BK5 is currently pending for processing at the multichannel reconstructor, for example at the BCC synthesis block 122 of FIG. 5, while, owing to some timing uncertainty, the block B7 of the multichannel information is present instead of the block B5. Without further action, the block of base channel data BK5 would therefore be reconstructed with the multi-channel information P7, which would lead to artifacts.
  • an offset of two blocks is now calculated, and the data stream in FIG. 7c is delayed by two blocks, so that a multi-channel representation is obtained from the data stream of FIG. 7b and the data stream of FIG. 7c, which are now synchronized with each other.
  • the offset determination according to the invention is not limited to calculating an offset as an integer multiple of a block. If the correlation calculation is sufficiently accurate and a sufficiently large number of block fingerprints is used (which, of course, comes at the expense of the time needed to calculate the correlation), it can also achieve an offset accuracy that is equal to a fraction of a block and can go down to one sample.
  • it has been found, however, that such a high accuracy is not absolutely necessary, and that a synchronization accuracy of +/- half a block (with a block length of 1152 samples) already leads to a multichannel reconstruction that a listener judges to be artifact-free.
  • Fig. 7d shows a preferred embodiment for a block Bi, for example for the block B3 of the data stream in Fig. 7c.
  • the block is initiated with a sync word, which may be one byte long, for example.
  • length information follows, since it is preferred to scale, quantize and entropy-encode the multichannel information P3 after its calculation, as known in the art, so that the length of the multichannel information, which may be parameter information, for example, but also a waveform signal, e.g. of the side channel, is not known from the outset and must therefore be signaled in the data stream.
  • the block fingerprint according to the invention is then inserted.
  • As shown in Fig. 7d, an absolute energy measure or a differential energy measure can be introduced as the block fingerprint. In the latter case, the difference between the energy measure for the base channel data BK3 and the energy measure for the base channel data BK2 would be added to the block B3 of the data stream as the block fingerprint.
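  • The block layout of Fig. 7d can be sketched in Python as follows. This is only an illustration under stated assumptions: the sync word value, the two-byte length field, the one-byte fingerprint and the position of the fingerprint directly after the length information are all hypothetical, since the text only says that a sync word, length information, the multichannel information and a block fingerprint are present.

```python
import struct

SYNC_WORD = 0xA5  # hypothetical one-byte sync word; the source does not give its value


def pack_block(fingerprint: int, payload: bytes) -> bytes:
    """Serialize one block: sync word, payload length, block fingerprint, payload."""
    if not 0 <= fingerprint <= 255:
        raise ValueError("fingerprint must fit in 8 bits")
    # ">BHB": 1-byte sync word, 2-byte big-endian length, 1-byte fingerprint
    # (field widths and order are assumptions made for this sketch)
    return struct.pack(">BHB", SYNC_WORD, len(payload), fingerprint) + payload


def unpack_block(data: bytes, pos: int = 0):
    """Parse one block starting at pos; return (fingerprint, payload, next_pos)."""
    sync, length, fingerprint = struct.unpack_from(">BHB", data, pos)
    if sync != SYNC_WORD:
        raise ValueError("sync word not found at position %d" % pos)
    start = pos + 4            # header is 4 bytes long
    end = start + length       # length field signals the variable payload size
    if end > len(data):
        raise ValueError("truncated block")
    return fingerprint, data[start:end], end
```

  • Because the entropy-coded multichannel information has a variable length, the parser relies on the length field to find the start of the next block, which is the reason the text gives for signaling the length in the data stream.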
  • FIG. 8 shows a more detailed representation of the synchronizer, the fingerprint generator 11 and the fingerprint extractor 9 of FIG. 2 in cooperation with the multi-channel reconstructor 21.
  • the basic channel data is fed to a base channel data buffer 25 and buffered. Accordingly, the additional information or the data stream with the additional information and the fingerprint information is supplied to an additional information buffer 26.
  • Both buffers are generally constructed in the form of a FIFO buffer, but the buffer 26 has additional capabilities in that the fingerprint information can be extracted by the reference fingerprint extractor 9 and also removed from the data stream, so that only multi-channel additional information, without keyed-in fingerprints, is output on a buffer output line 27.
  • the removal of the fingerprints in the data stream may also be performed by a time shifter 28 or any other element such that the multi-channel reconstructor 21 is not disturbed by fingerprint bytes in the multi-channel reconstruction.
  • the fingerprint information calculated by the fingerprint generator 11, as well as the fingerprint information obtained by the fingerprint extractor 9, can be fed directly into a correlator 29 within the synchronizer 13 of FIG. 2.
  • the correlator then calculates the offset value and provides it to the time shifter 28 via an offset line 30.
  • the synchronizer 13 is further configured, when a valid offset value has been generated and supplied to the time shifter 28, to drive an enable 31 so that the enable 31 closes a switch 32, such that the stream of multi-channel additional data from the buffer 26 is fed to the multichannel reconstructor 21 via the time shifter 28 and the switch 32.
  • a time delay of the multichannel additional information is applied.
  • a multi-channel reconstruction is already performed in parallel to the calculation of the correct offset value.
  • this multichannel reconstruction is merely a "trivial" multichannel reconstruction, since the preferably two stereo base channels are simply output from the multi-channel reconstructor 21. If the switch 32 is open, therefore, only a stereo output takes place. If the switch 32 is closed, however, the multichannel reconstructor 21 receives the multichannel additional information in addition to the stereo base channels and can now perform a synchronized multichannel output. A listener notices this only as a switch from stereo quality to multichannel quality.
  • the output of the multichannel reconstructor 21 may be held back until there is a valid offset. Then the very first block (BK1 of FIG. 7b) with the now correctly delayed multi-channel additional data P1 (FIG. 7c) can already be supplied to the multichannel reconstructor 21, so that the output is started only when multichannel data is present. An output of the multichannel reconstructor 21 with the switch open will not exist in this embodiment.
  • the functionality of the correlator 29 of FIG. 8 will now be described with reference to FIG. 9. At the output of the test fingerprint calculator 11, a sequence of test fingerprint information is provided, as seen in the top-most field of FIG. 9.
  • a block fingerprint is present.
  • the reference fingerprint determiner 9 also generates a sequence of discrete reference fingerprints which it extracts from the data stream. If, for example, differential-coded fingerprint information is contained in the data stream, and if the correlator is to work on the basis of absolute fingerprints, a differential decoder 35 in FIG. 8 is activated. However, it is preferred that absolute fingerprints be used in the data stream.
  • the block 9 will perform difference processing before the correlator, and also the block 11 will perform difference processing before the correlator, as already stated.
  • the correlator 29 will now receive the discrete value series shown in the two upper sub-images of FIG. 9 and provide a correlation result shown in the lower part of FIG. 9.
  • the result is a correlation result whose offset component provides exactly the offset between the two fingerprint information curves. Since the offset is positive, the multichannel additional information must be shifted in the positive time direction, that is, delayed. It should be noted that, of course, the base channel data could instead be shifted in the negative time direction, or that the multi-channel additional information could be shifted by part of the offset in the positive time direction and the base channel data by the remaining part in the negative time direction, so long as the multichannel reconstructor receives a synchronized multi-channel representation at its two inputs.
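  • The offset determination by correlation can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names, the use of a normalized (Pearson) correlation over the overlapping part of the two sequences, and the search range `max_lag` are assumptions chosen for the example. A positive result means that the reference fingerprints from the data stream run ahead of the test fingerprints, so the multichannel additional information has to be delayed by that many blocks.

```python
from math import sqrt


def _pearson(a, b):
    """Normalized correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant sequence carries no timing information
    return cov / sqrt(var_a * var_b)


def estimate_offset(test_fp, ref_fp, max_lag):
    """Return (lag, score) such that test_fp[n + lag] best matches ref_fp[n]."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = test_fp[lag:], ref_fp
        else:
            a, b = test_fp, ref_fp[-lag:]
        n = min(len(a), len(b))  # compare only the overlapping part
        if n < 2:
            continue
        score = _pearson(a[:n], b[:n])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

  • For the situation of FIG. 7b and FIG. 7c, in which the block BK5 is pending while the multichannel information of block B7 is present, such a search would be expected to return a lag of two blocks.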
  • the basic channel data is buffered to calculate one fingerprint at a time, after which the block from which a test block fingerprint has just been calculated is fed to the multichannel reconstructor for multichannel reconstruction. Thereafter, the next block of the base channel data is again fed to the buffer 25 so that a block test fingerprint can be calculated from this block again.
  • fewer than 200 blocks or more than 200 blocks may be used. According to the invention, it has been found that a number between 100 and 300 blocks, and preferably 200 blocks, provides results that provide a reasonable compromise between computation time, correlation computation, and offset accuracy.
  • a block 37 is entered in which the correlation between the 200 calculated test block fingerprints and the 200 calculated reference block fingerprints is performed by the correlator 29.
  • the offset result obtained there is then stored.
  • in a block 38, corresponding to the block 36, block fingerprints are calculated for a number of the next, e.g., 200 blocks of the base channel data. Accordingly, 200 blocks are again extracted from the data stream with the multi-channel additional information. Thereafter, in a block 39, a correlation is again performed, and the offset result obtained there is stored. Then, in a block 40, a deviation between the offset result due to the second 200 blocks and the offset result due to the first 200 blocks is detected.
  • the offset is supplied via the offset line 30 to the time shifter 28 of FIG. 8 by a block 41, and the switch 32 is closed so that multi-channel output takes over from that point in time.
  • a predetermined value for the deviation threshold is, for example, a value of one or two blocks. This is because when an offset from one calculation to the next calculation does not change more than one or two blocks, no error has been made in the correlation calculation.
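  • The plausibility check on two successive offset results can be sketched as follows. The function name, the default threshold of one block and the choice to return the more recent of the two results are assumptions; the text only states that a deviation of no more than one or two blocks indicates an error-free correlation calculation.

```python
def validated_offset(first_offset, second_offset, max_deviation=1):
    """Return the offset to apply, or None if the two estimates disagree.

    first_offset, second_offset: offsets (in blocks) from two successive
    correlation runs; max_deviation: allowed difference in blocks.
    """
    if abs(second_offset - first_offset) <= max_deviation:
        return second_offset  # estimates agree, accept the newer one (assumed choice)
    return None               # estimates disagree, keep the switch open
```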
  • alternatively, a sliding window of, e.g., 200 blocks is used. For example, a calculation is made with 200 blocks and a result is obtained. The window is then advanced by one block: one block is removed from the set of blocks used for the correlation calculation and the new block is used in its place. The result obtained is then stored in a histogram, as was the previous result. This procedure is used for a number of correlation calculations, such as 100 or 200, so that the histogram gradually fills. The peak of the histogram is then used as the calculated offset, either to provide the initial offset or to obtain dynamic offset tracking.
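  • The histogram evaluation can be sketched as follows, assuming that each correlation run has already produced an integer offset in blocks. The function name and the additional return value (the share of runs that voted for the peak) are illustrative additions and not taken from the text.

```python
from collections import Counter


def histogram_offset(offset_estimates):
    """Return the most frequent offset and its share of all estimates."""
    if not offset_estimates:
        raise ValueError("need at least one offset estimate")
    histogram = Counter(offset_estimates)       # offset value -> number of runs
    peak_offset, count = histogram.most_common(1)[0]
    return peak_offset, count / len(offset_estimates)
```

  • Using the peak rather than the mean makes the result insensitive to isolated wrong correlation results, which would otherwise pull the offset away from the correct value.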
  • the offset calculation taking place in parallel with the output will run in a block 42, and adaptive or dynamic offset tracking is achieved as required, when a drift between the data stream with the multichannel information and the data stream with the base channel data has been detected, by supplying an updated offset value via line 30 to the time shifter 28 of FIG. 8.
  • the offset change can also be smoothed, so that if a deviation of, for example, two blocks has been determined, the offset is first incremented by 1 and then, if necessary, incremented again, so that the jumps are not too large.
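  • The smoothing of the offset change can be sketched as a step limiter. The function name and the default step of one block per update are assumptions for this illustration.

```python
def step_towards(current_offset, target_offset, max_step=1):
    """Move current_offset towards target_offset by at most max_step blocks."""
    delta = target_offset - current_offset
    if delta > max_step:
        delta = max_step
    elif delta < -max_step:
        delta = -max_step
    return current_offset + delta
```

  • Called once per update, a detected deviation of two blocks is then applied as two successive changes of one block each.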
  • FIG. 11 shows a preferred embodiment of the encoder-side fingerprint generator 2 shown in FIG. 1 and of the fingerprint generator 11 of FIG. 2 as used on the decoder side.
  • the multichannel audio signal for obtaining the multichannel overhead data is divided into fixed size blocks.
  • at the same time as the multichannel additional data is obtained, a fingerprint is calculated for each block that is suitable for characterizing the temporal structure of the signal as uniquely as possible.
  • One exemplary embodiment of this is to use the energy content of the current downmix audio signal of the audio block, for example in logarithmic form, i.e. in a decibel-related representation.
  • the fingerprint is a measure of the temporal envelope of the audio signal.
  • this synchronization information can also be expressed as the difference from the energy value of the previous block, followed by suitable entropy coding (for example Huffman coding), adaptive scaling and quantization.
  • an energy calculation of the downmix audio signal in the current block is performed, where applicable for a stereo signal.
  • For this purpose, e.g., 1152 audio samples from both the left and the right downmix channel are squared and summed.
  • s_left(i) in this case represents a temporal sample of the left base channel at time i, while s_right(i) represents a temporal sample of the right base channel at time i.
  • With a monophonic downmix signal the summation is omitted.
  • a minimum limitation of the energy is carried out for the purpose of subsequent logarithmic representation.
  • it is preferred to use a minimum energy offset so that the logarithm can be calculated meaningfully in the case of zero energy.
  • This energy metric in dB covers a range of 0 to 90 (dB) with an audio signal resolution of 16 bits.
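  • The energy measure described above can be sketched in Python as follows. The original formula is not preserved in this text, so the details are assumptions: the summed squares are divided by the number of samples so that 16-bit input lands in roughly the stated range of 0 to 90 dB, and the minimum energy of 1.0 is a hypothetical value that maps silence to 0 dB.

```python
import math


def block_energy_db(left, right=None, min_energy=1.0):
    """Energy measure of one downmix block in dB.

    left, right: sequences of integer PCM samples (right=None for mono).
    min_energy: lower bound applied before the logarithm (assumed value).
    """
    energy = sum(s * s for s in left)
    count = len(left)
    if right is not None:
        # Stereo downmix: add the squared samples of the right channel.
        energy += sum(s * s for s in right)
        count += len(right)
    # Normalizing by the sample count is an assumption of this sketch.
    mean_energy = energy / count if count else 0.0
    # The minimum limitation avoids log10(0) for a silent block.
    return 10.0 * math.log10(max(mean_energy, min_energy))
```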
  • this signal derivative is calculated by forming the difference between the energy value of the current block and that of the previous block.
  • This step is carried out, for example, in the encoder.
  • the fingerprint then consists of differentially encoded values.
  • this step may also be implemented purely on the decoder side.
  • the transmitted fingerprint then consists of non-differentially encoded values, and the difference is formed only in the decoder. The latter possibility has the advantage that the fingerprint contains information about the absolute energy of the downmix signal. However, typically a slightly higher fingerprint word length is needed.
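  • The two variants (difference formed in the encoder or only in the decoder) use the same operation, which can be sketched as follows. The function names and the convention of keeping the first value absolute are assumptions made for this illustration.

```python
def to_differential(absolute_fp):
    """Replace each absolute value by its difference from the previous block."""
    if not absolute_fp:
        return []
    diffs = [absolute_fp[0]]  # first value kept absolute (assumed convention)
    for prev, cur in zip(absolute_fp, absolute_fp[1:]):
        diffs.append(cur - prev)
    return diffs


def from_differential(diff_fp):
    """Inverse of to_differential: rebuild the absolute values by accumulation."""
    absolute, total = [], 0
    for i, d in enumerate(diff_fp):
        total = d if i == 0 else total + d
        absolute.append(total)
    return absolute
```

  • If absolute values are transmitted, the decoder can apply `to_differential` to both the test and the reference fingerprints before correlating them, so that both sequences are comparable.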
  • quantization of the fingerprint is performed. To prepare the fingerprint for keying into the multichannel additional information, it is quantized to 8 bits. In practice, this reduced fingerprint resolution has proven to be a good compromise with regard to bit requirements and reliability of delay detection. Values that overflow beyond 255 are limited to the maximum value of 255 by a saturation characteristic.
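  • The 8-bit quantization with saturation can be sketched as follows. The quantization step of 0.5 dB and the clamping of negative inputs to 0 are assumptions; the text only specifies the 8-bit word length and the saturation at 255. Differentially coded values can be negative, so a real implementation would additionally need an offset or a signed representation.

```python
def quantize_fingerprint(value_db, step_db=0.5):
    """Quantize an energy value in dB to an 8-bit code with saturation.

    step_db is a hypothetical quantization step, not taken from the source.
    """
    code = int(round(value_db / step_db))
    # Saturation characteristic: overflows are limited to 255, underflows to 0.
    return max(0, min(255, code))
```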
  • optimal entropy coding of the fingerprint can still be performed.
  • the bit requirement of the quantized fingerprint can be further reduced.
  • a suitable entropy method is, for example, Huffman coding or arithmetic coding. Statistically different frequencies of fingerprint values can be expressed by different code lengths, thereby reducing on average the bit requirement of the fingerprint representation.
  • the calculation of the multi-channel additional data is performed using the multi-channel audio data.
  • the calculated multichannel additional information is then extended by the newly added synchronization information through suitable embedding in the bit stream.
  • the receiver is now able to detect a time offset between the downmix signal and the additional data and to carry out a time-correct adaptation, i.e. a delay compensation between the stereo audio signal and the multichannel additional information, to within the order of +/- 1/2 audio block.
  • the inventive method for generating or decoding can be implemented in hardware or in software.
  • the implementation may be on a digital storage medium, in particular a floppy disk or CD with electronically readable control signals, which may interact with a programmable computer system such that the method is performed.
  • the invention thus also consists in a computer program product with a program code stored on a machine-readable carrier for carrying out the method when the computer program product runs on a computer.
  • the invention can thus be realized as a computer program with a program code for carrying out the method when the computer program runs on a computer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Stereophonic System (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Time-Division Multiplex Systems (AREA)
  • Synchronisation In Digital Transmission Systems (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)
  • Television Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Studio Circuits (AREA)

Abstract

The aim of the invention is to achieve time synchronization between a data stream containing multi-channel additional data and a data stream containing data of at least one base channel (3). To this end, fingerprint information is calculated on the encoder side for the base channel (3) and inserted into a data stream (4) in a temporal relationship to the multi-channel additional data. On the decoder side, fingerprint information is calculated from the base channel and used together with the fingerprint information extracted from the data stream, for example to calculate, by means of a correlation, and compensate for the offset between the data stream containing the multi-channel additional information and the data stream containing the at least one base channel, so that a synchronous multi-channel representation is obtained.
EP06707562A 2005-03-30 2006-03-15 Device and method for generating a data stream and for generating a multi-channel representation Active EP1864279B1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102005014477A DE102005014477A1 (de) 2005-03-30 2005-03-30 Device and method for generating a data stream and for generating a multi-channel representation
PCT/EP2006/002369 WO2006102991A1 (fr) 2005-03-30 2006-03-15 Device and method for generating a data stream and for generating a multi-channel representation

Publications (2)

Publication Number Publication Date
EP1864279A1 true EP1864279A1 (fr) 2007-12-12
EP1864279B1 EP1864279B1 (fr) 2009-06-17

Family

ID=36598142

Family Applications (1)

Application Number Title Priority Date Filing Date
EP06707562A Active EP1864279B1 (fr) 2005-03-30 2006-03-15 Device and method for generating a data stream and for generating a multi-channel representation

Country Status (12)

Country Link
US (1) US7903751B2 (fr)
EP (1) EP1864279B1 (fr)
JP (1) JP5273858B2 (fr)
CN (1) CN101189661B (fr)
AT (1) ATE434253T1 (fr)
AU (1) AU2006228821B2 (fr)
CA (1) CA2603027C (fr)
DE (2) DE102005014477A1 (fr)
HK (1) HK1111259A1 (fr)
MY (1) MY139836A (fr)
TW (1) TWI318845B (fr)
WO (1) WO2006102991A1 (fr)

Families Citing this family (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1962082A1 (fr) 2007-02-21 2008-08-27 Agfa HealthCare N.V. Système et procédé destinés à la tomographie par cohérence optique
US8612237B2 (en) * 2007-04-04 2013-12-17 Apple Inc. Method and apparatus for determining audio spatial quality
EP2215797A1 (fr) 2007-12-03 2010-08-11 Nokia Corporation Générateur de paquets
DE102008009025A1 (de) * 2008-02-14 2009-08-27 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Vorrichtung und Verfahren zum Berechnen eines Fingerabdrucks eines Audiosignals, Vorrichtung und Verfahren zum Synchronisieren und Vorrichtung und Verfahren zum Charakterisieren eines Testaudiosignals
DE102008009024A1 (de) * 2008-02-14 2009-08-27 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Vorrichtung und Verfahren zum synchronisieren von Mehrkanalerweiterungsdaten mit einem Audiosignal und zum Verarbeiten des Audiosignals
WO2010013450A1 (fr) * 2008-07-29 2010-02-04 パナソニック株式会社 Dispositif de codage de son, dispositif de décodage de son, dispositif de codage/décodage de son et système de conférence
EP2327213B1 (fr) * 2008-08-21 2014-10-08 Dolby Laboratories Licensing Corporation Calcul d'erreurs de synchronisation audio video base sur des caracteristiques audio-visuelles
HUE041788T2 (hu) * 2008-10-06 2019-05-28 Ericsson Telefon Ab L M Eljárás és berendezés igazított többcsatornás hang szállítására
CN103177725B (zh) * 2008-10-06 2017-01-18 爱立信电话股份有限公司 用于输送对齐的多通道音频的方法和设备
WO2010103442A1 (fr) * 2009-03-13 2010-09-16 Koninklijke Philips Electronics N.V. Incorporation et extraction de métadonnées
GB2470201A (en) * 2009-05-12 2010-11-17 Nokia Corp Synchronising audio and image data
US8436939B2 (en) * 2009-10-25 2013-05-07 Tektronix, Inc. AV delay measurement and correction via signature curves
US9426574B2 (en) * 2010-03-19 2016-08-23 Bose Corporation Automatic audio source switching
EP2458890B1 (fr) * 2010-11-29 2019-01-23 Nagravision S.A. Procédé de suivi de contenu vidéo traité par un décodeur
US9075806B2 (en) * 2011-02-22 2015-07-07 Dolby Laboratories Licensing Corporation Alignment and re-association of metadata for media streams within a computing device
RU2589399C2 (ru) * 2011-03-18 2016-07-10 Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. Расположение элемента кадра в кадрах потока битов, представляющего аудио содержимое
US10754813B1 (en) 2011-06-30 2020-08-25 Amazon Technologies, Inc. Methods and apparatus for block storage I/O operations in a storage gateway
US8806588B2 (en) 2011-06-30 2014-08-12 Amazon Technologies, Inc. Storage gateway activation process
US8706834B2 (en) 2011-06-30 2014-04-22 Amazon Technologies, Inc. Methods and apparatus for remotely updating executing processes
US9294564B2 (en) 2011-06-30 2016-03-22 Amazon Technologies, Inc. Shadowing storage gateway
US8639921B1 (en) 2011-06-30 2014-01-28 Amazon Technologies, Inc. Storage gateway security model
US8832039B1 (en) 2011-06-30 2014-09-09 Amazon Technologies, Inc. Methods and apparatus for data restore and recovery from a remote data store
US8639989B1 (en) 2011-06-30 2014-01-28 Amazon Technologies, Inc. Methods and apparatus for remote gateway monitoring and diagnostics
US8793343B1 (en) 2011-08-18 2014-07-29 Amazon Technologies, Inc. Redundant storage gateways
US8789208B1 (en) 2011-10-04 2014-07-22 Amazon Technologies, Inc. Methods and apparatus for controlling snapshot exports
US9635132B1 (en) 2011-12-15 2017-04-25 Amazon Technologies, Inc. Service and APIs for remote volume-based block storage
KR20130101629A (ko) * 2012-02-16 2013-09-16 삼성전자주식회사 보안 실행 환경 지원 휴대단말에서 컨텐츠 출력 방법 및 장치
US9553756B2 (en) * 2012-06-01 2017-01-24 Koninklijke Kpn N.V. Fingerprint-based inter-destination media synchronization
CN102820964B (zh) * 2012-07-12 2015-03-18 武汉滨湖电子有限责任公司 一种基于系统同步与参考通道的多通道数据对齐的方法
EP2693392A1 (fr) 2012-08-01 2014-02-05 Thomson Licensing Système deuxième écran et un procédé pour rendre les informations sur un deuxième écran
CN102937938B (zh) * 2012-11-29 2015-05-13 北京天诚盛业科技有限公司 指纹处理装置及其控制方法和控制装置
TWI557727B (zh) 2013-04-05 2016-11-11 杜比國際公司 音訊處理系統、多媒體處理系統、處理音訊位元流的方法以及電腦程式產品
JP6349977B2 (ja) * 2013-10-21 2018-07-04 ソニー株式会社 情報処理装置および方法、並びにプログラム
US20150302086A1 (en) 2014-04-22 2015-10-22 Gracenote, Inc. Audio identification during performance
US20160344902A1 (en) * 2015-05-20 2016-11-24 Gwangju Institute Of Science And Technology Streaming reproduction device, audio reproduction device, and audio reproduction method
EP3115932A1 (fr) * 2015-07-07 2017-01-11 Idex Asa Reconstruction d'image
JP6807033B2 (ja) * 2015-11-09 2021-01-06 ソニー株式会社 デコード装置、デコード方法、およびプログラム
EP3249646B1 (fr) * 2016-05-24 2019-04-17 Dolby Laboratories Licensing Corp. Measurement and verification of time alignment of multiple audio channels and associated metadata
US10015612B2 (en) 2016-05-25 2018-07-03 Dolby Laboratories Licensing Corporation Measurement, verification and correction of time alignment of multiple audio channels and associated metadata
EP3324407A1 (fr) * 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Appareil et procédé de décomposition d'un signal audio en utilisant un rapport comme caractéristique de séparation
EP3324406A1 (fr) 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Appareil et procédé destinés à décomposer un signal audio au moyen d'un seuil variable
CN112986963B (zh) * 2021-02-08 2024-05-03 武汉徕得智能技术有限公司 一种激光脉冲测距回波信号多路缩放结果选择控制方法
CN112995708A (zh) * 2021-04-21 2021-06-18 湖南快乐阳光互动娱乐传媒有限公司 一种多视频同步方法及装置
CN114003546B (zh) * 2022-01-04 2022-04-12 之江实验室 一种多通道开关量复合编码设计方法和装置

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000155598A (ja) * 1998-11-19 2000-06-06 Matsushita Electric Ind Co Ltd 多チャンネル・オーディオ信号の符号化/復号化方法と装置
EP1173925B1 (fr) * 1999-04-07 2003-12-03 Dolby Laboratories Licensing Corporation Perfectionnements matriciels de codage et de decodage sans perte
US7013301B2 (en) * 2003-09-23 2006-03-14 Predixis Corporation Audio fingerprinting system and method
US6990453B2 (en) * 2000-07-31 2006-01-24 Landmark Digital Services Llc System and methods for recognizing sound and music signals in high noise and distortion
TW510144B (en) * 2000-12-27 2002-11-11 C Media Electronics Inc Method and structure to output four-channel analog signal using two channel audio hardware
US7461002B2 (en) * 2001-04-13 2008-12-02 Dolby Laboratories Licensing Corporation Method for time aligning audio signals using characterizations based on auditory events
US7116787B2 (en) * 2001-05-04 2006-10-03 Agere Systems Inc. Perceptual synthesis of auditory scenes
US7292901B2 (en) * 2002-06-24 2007-11-06 Agere Systems Inc. Hybrid multi-channel/cue coding/decoding of audio signals
US7006636B2 (en) * 2002-05-24 2006-02-28 Agere Systems Inc. Coherence-based audio coding and synthesis
US20030035553A1 (en) * 2001-08-10 2003-02-20 Frank Baumgarte Backwards-compatible perceptual coding of spatial cues
AU2003230993A1 (en) * 2002-04-25 2003-11-10 Shazam Entertainment, Ltd. Robust and invariant audio pattern matching
AU2003219438A1 (en) * 2002-05-16 2003-12-02 Koninklijke Philips Electronics N.V. Signal processing method and arrangement
WO2005011281A1 (fr) * 2003-07-25 2005-02-03 Koninklijke Philips Electronics N.V. Procede et dispositif de generation et de detection d'empreintes pour la synchronisation audio et video
ES2324926T3 (es) 2004-03-01 2009-08-19 Dolby Laboratories Licensing Corporation Descodificacion de audio multicanal.
DE102004046746B4 (de) * 2004-09-27 2007-03-01 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Verfahren zum Synchronisieren von Zusatzdaten und Basisdaten
US7567899B2 (en) * 2004-12-30 2009-07-28 All Media Guide, Llc Methods and apparatus for audio recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2006102991A1 *

Also Published As

Publication number Publication date
AU2006228821A1 (en) 2006-10-05
JP2008538239A (ja) 2008-10-16
US7903751B2 (en) 2011-03-08
TWI318845B (en) 2009-12-21
MY139836A (en) 2009-10-30
CN101189661A (zh) 2008-05-28
DE502006003997D1 (de) 2009-07-30
DE102005014477A1 (de) 2006-10-12
CA2603027C (fr) 2012-09-11
EP1864279B1 (fr) 2009-06-17
TW200644704A (en) 2006-12-16
WO2006102991A1 (fr) 2006-10-05
US20080013614A1 (en) 2008-01-17
CN101189661B (zh) 2011-10-26
ATE434253T1 (de) 2009-07-15
HK1111259A1 (en) 2008-08-01
CA2603027A1 (fr) 2006-10-05
AU2006228821B2 (en) 2009-07-23
JP5273858B2 (ja) 2013-08-28

Similar Documents

Publication Publication Date Title
EP1864279B1 (fr) Device and method for producing a data stream and for producing a multichannel representation
EP2240928B1 (fr) Dispositif et procédé pour calculer l'empreinte digitale d'un signal audio
EP2240929B1 (fr) Dispositif et procédé pour synchroniser des données d'extension à plusieurs canaux avec un signal audio et pour traiter le signal audio
EP1687809B1 (fr) Appareil et procede pour la reconstitution d'un signal audio multi-canaux et pour generer un enregistrement des parametres correspondants
EP1794564B1 (fr) Dispositif et procede pour synchroniser des donnees supplementaires et des donnees de base
DE602005006424T2 (de) Stereokompatible mehrkanal-audiokodierung
DE602004008613T2 (de) Treueoptimierte kodierung mit variabler rahmenlänge
EP1763870B1 (fr) Production d'un signal multicanal code, et decodage d'un signal multicanal code
DE69731677T2 (de) Verbessertes Kombinationsstereokodierverfahren mit zeitlicher Hüllkurvenformgebung
DE602004004168T2 (de) Kompatible mehrkanal-codierung/-decodierung
DE60000412T2 (de) Datenrahmen strukturierung für adaptive blocklängenkodierung
EP0954909B1 (fr) Procede de codage d'un signal audio
DE69927505T2 (de) Verfahren zum einfügen von zusatzdaten in einen audiodatenstrom
DE60002483T2 (de) Skalierbares kodierungsverfahren für hochqualitätsaudio
DE69431230T2 (de) Wahrnehmungsgebundene Mehrkanal-Audiokodierung mit adaptiver Bitverteilung
EP1854334A1 (fr) Dispositif et procede de production d'un signal stereo code d'un morceau audio ou d'un flux de donnees audio
JP2017532603A (ja) オーディオ信号のエンコードおよびデコード
DE102007029381A1 (de) Digitalsignal-Verarbeitungsvorrichtung, Digitalsignal-Verarbeitungsverfahren, Digitalsignal-Verarbeitungsprogramm, Digitalsignal-Wiedergabevorrichtung und Digitalsignal-Wiedergabeverfahren
DE602004006401T2 (de) Aktualisieren eines verborgenen datenkanals
DE202004003000U1 (de) Vorrichtung zum Beschreiben einer Audio-CD und Audio-CD

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20070913

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

DAX Request for extension of the european patent (deleted)
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1111259

Country of ref document: HK

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

Free format text: LANGUAGE OF EP DOCUMENT: GERMAN

REF Corresponds to:

Ref document number: 502006003997

Country of ref document: DE

Date of ref document: 20090730

Kind code of ref document: P

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1111259

Country of ref document: HK

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090917

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act
PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20091017

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090928

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20091017

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090917

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

26N No opposition filed

Effective date: 20100318

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090918

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20091218

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20090617

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 11

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 12

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 13

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: LU

Payment date: 20240321

Year of fee payment: 19

Ref country code: IE

Payment date: 20240319

Year of fee payment: 19

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: AT

Payment date: 20240318

Year of fee payment: 19

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: MC

Payment date: 20240320

Year of fee payment: 19

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20240321

Year of fee payment: 19

Ref country code: GB

Payment date: 20240322

Year of fee payment: 19

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20240320

Year of fee payment: 19

Ref country code: BE

Payment date: 20240320

Year of fee payment: 19

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: CH

Payment date: 20240401

Year of fee payment: 19