CN101223579A - Method of encoding and decoding an audio signal - Google Patents
- Publication number
- CN101223579A (application number CN200680026312A)
- Authority
- CN
- China
- Prior art keywords
- supplementary
- signal
- embedded
- sound
- spatial information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Signal Processing For Digital Recording And Reproducing (AREA)
Abstract
An apparatus for encoding and decoding an audio signal and method thereof are disclosed, by which compatibility with a player of a general mono or stereo audio signal can be provided in coding an audio signal and by which spatial information for a multi-channel audio signal can be stored or transmitted without a presence of an auxiliary data area. The present invention includes extracting side information embedded in non-recognizable component of audio signal components and decoding the audio signal using the extracted side information.
Description
Technical field
The present invention relates to a method of encoding and decoding an audio signal.
Background art
Recently, much research and development has gone into various coding schemes and methods for digital audio signals, and many products associated with these coding schemes and methods have been manufactured.
In addition, coding schemes have been developed that use the spatial information of a multi-channel audio signal to convert the multi-channel audio signal into a mono or stereo audio signal.
However, when an audio signal is stored on certain recording media, no auxiliary data area exists for storing the spatial information. In that case only the mono or stereo audio signal is stored or transmitted, so only the mono or stereo audio signal is reproduced, and the sound quality is comparatively monotonous.
Moreover, when the spatial information is stored or transmitted separately, there is a compatibility problem with players of general mono or stereo audio signals.
Summary of the invention
Accordingly, the present invention is directed to an apparatus for encoding and decoding an audio signal, and a method thereof, that substantially obviate one or more problems due to limitations and disadvantages of the related art.
An object of the present invention is to provide an apparatus for encoding and decoding an audio signal, and a method thereof, by which compatibility with a player of a general mono or stereo audio signal can be provided in coding an audio signal.
Another object of the present invention is to provide an apparatus for encoding and decoding an audio signal, and a method thereof, by which spatial information for a multi-channel audio signal can be stored or transmitted without the presence of an auxiliary data area.
Additional features and advantages of the invention will be set forth in the description that follows, and in part will be apparent from the description or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims as well as the appended drawings.
To achieve these and other advantages and in accordance with the purpose of the present invention, a method of decoding an audio signal according to the present invention includes: (a) extracting side information that is embedded, by being dispersed, in an audio signal having at least one channel; and (b) decoding the audio signal using the side information.
To further achieve these and other advantages and in accordance with the purpose of the present invention, a method of encoding an audio signal according to the present invention includes: (a) generating side information required for decoding the audio signal; and (b) embedding the side information, by dispersing it, in the audio signal having at least one channel.
To further achieve these and other advantages and in accordance with the purpose of the present invention, a data structure according to the present invention includes: an audio signal; and side information, required for decoding the audio signal, that is embedded by being dispersed in a non-recognizable component of the audio signal having at least one channel.
To further achieve these and other advantages and in accordance with the purpose of the present invention, an apparatus for encoding an audio signal according to the present invention includes: a side information generating unit for generating side information required for decoding the audio signal; and an embedding unit for embedding the side information, by dispersing it, in the audio signal having at least one channel.
To further achieve these and other advantages and in accordance with the purpose of the present invention, an apparatus for decoding an audio signal according to the present invention includes: an embedded signal decoding unit for extracting side information that is embedded, by being dispersed, in an audio signal having at least one channel; and a multi-channel generating unit for decoding the audio signal using the side information.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.
Brief description of the drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
In the accompanying drawings:
Fig. 1 is a diagram for explaining how a human recognizes spatial information of an audio signal according to the present invention;
Fig. 2 is a block diagram of a spatial encoder according to the present invention;
Fig. 3 is a detailed block diagram of the embedding unit of the spatial encoder shown in Fig. 2 according to the present invention;
Fig. 4 is a diagram of a first method of rearranging a spatial information bitstream according to the present invention;
Fig. 5 is a diagram of a second method of rearranging a spatial information bitstream according to the present invention;
Fig. 6A is a diagram of a reshaped spatial information bitstream according to the present invention;
Fig. 6B is a detailed diagram of the configuration of the spatial information bitstream shown in Fig. 6A;
Fig. 7 is a block diagram of a spatial decoder according to the present invention;
Fig. 8 is a detailed block diagram of the embedded signal decoding unit included in the spatial decoder according to the present invention;
Fig. 9 is a diagram for explaining a case in which a general PCM decoder reproduces an audio signal according to the present invention;
Fig. 10 is a flowchart of an encoding method for embedding spatial information in a downmix signal according to the present invention;
Fig. 11 is a flowchart of a method of decoding spatial information embedded in a downmix signal according to the present invention;
Fig. 12 is a diagram of the frame size of a spatial information bitstream embedded in a downmix signal according to the present invention;
Fig. 13 is a diagram of a spatial information bitstream embedded in a downmix signal with a fixed size according to the present invention;
Fig. 14A is a diagram explaining a first method of solving the time alignment problem of a spatial information bitstream embedded with a fixed size;
Fig. 14B is a diagram explaining a second method of solving the time alignment problem of a spatial information bitstream embedded with a fixed size;
Fig. 15 is a diagram of a method of attaching a spatial information bitstream to a downmix signal according to the present invention;
Fig. 16 is a flowchart of a method of encoding a spatial information bitstream embedded in a downmix signal with various sizes according to the present invention;
Fig. 17 is a flowchart of a method of encoding a spatial information bitstream embedded in a downmix signal with a fixed size according to the present invention;
Fig. 18 is a diagram of a first method of embedding a spatial information bitstream in an audio signal downmixed onto at least one channel according to the present invention;
Fig. 19 is a diagram of a second method of embedding a spatial information bitstream in an audio signal downmixed onto at least one channel according to the present invention;
Fig. 20 is a diagram of a third method of embedding a spatial information bitstream in an audio signal downmixed onto at least one channel according to the present invention;
Fig. 21 is a diagram of a fourth method of embedding a spatial information bitstream in an audio signal downmixed onto at least one channel according to the present invention;
Fig. 22 is a diagram of a fifth method of embedding a spatial information bitstream in an audio signal downmixed onto at least one channel according to the present invention;
Fig. 23 is a diagram of a sixth method of embedding a spatial information bitstream in an audio signal downmixed onto at least one channel according to the present invention;
Fig. 24 is a diagram of a seventh method of embedding a spatial information bitstream in an audio signal downmixed onto at least one channel according to the present invention;
Fig. 25 is a flowchart of a method of encoding a spatial information bitstream to be embedded in an audio signal downmixed onto at least one channel according to the present invention; and
Fig. 26 is a flowchart of a method of decoding a spatial information bitstream embedded in an audio signal downmixed onto at least one channel according to the present invention.
Detailed description of the embodiments
Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings.
First of all, the present invention relates to an apparatus for embedding, in an audio signal, the side information required for decoding that audio signal, and a method thereof. For convenience of explanation, the audio signal and the side information are referred to in the following description as a downmix signal and spatial information, respectively, which does not put any limitation on the present invention. In this case, the audio signal includes a PCM signal.
Fig. 1 is a diagram for explaining how a human recognizes spatial information of an audio signal according to the present invention.
Referring to Fig. 1, based on the fact that a human can recognize an audio signal three-dimensionally, a coding scheme for a multi-channel audio signal exploits the fact that the audio signal can be represented as three-dimensional spatial information via a set of parameters.
Spatial parameters for representing the spatial information of a multi-channel audio signal include CLD (channel level difference), ICC (inter-channel coherence), CTD (channel time difference), and the like. The CLD is the energy difference between two channels, the ICC is the correlation between two channels, and the CTD is the time difference between two channels.
How a human recognizes an audio signal spatially, and how the concept of the spatial parameters arises, is explained with reference to Fig. 1.
A direct sound wave 103 arrives at the left ear of the human from a remote sound source 101, while another direct sound wave 102 is diffracted around the head to reach the right ear 106 of the human.
The two sound waves 102 and 103 differ from each other in arrival time and energy level. The CTD and CLD parameters are generated from these differences.
If reflected sound waves 104 and 105 arrive at the two ears respectively, or if the sound source is dispersed, sound waves having no correlation with each other arrive at the two ears respectively, which gives rise to the ICC parameter.
Using the spatial parameters generated according to the above principle, a multi-channel audio signal can be transmitted as a mono or stereo signal and output as a multi-channel signal.
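The three spatial parameters described above can be illustrated with a minimal sketch. The text only states what each parameter measures; the decibel scale for the CLD, the zero-lag normalized correlation for the ICC, and the lag search for the CTD are assumptions made for illustration, and all function names are hypothetical.

```python
import math

def channel_energy(x):
    """Sum of squared samples of one channel."""
    return sum(s * s for s in x)

def cld_db(x1, x2):
    """Energy difference between two channels, expressed in dB (assumed scale)."""
    return 10.0 * math.log10(channel_energy(x1) / channel_energy(x2))

def icc(x1, x2):
    """Correlation between two channels, normalized to [-1, 1] at zero lag."""
    num = sum(a * b for a, b in zip(x1, x2))
    return num / math.sqrt(channel_energy(x1) * channel_energy(x2))

def ctd_samples(x1, x2, max_lag):
    """Delay of x2 relative to x1, in samples, that maximizes the cross-correlation."""
    def xcorr(lag):
        return sum(x1[n] * x2[n + lag]
                   for n in range(len(x1)) if 0 <= n + lag < len(x2))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

# A pulse that reaches the second channel two samples later at half amplitude.
left = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0]
print(round(cld_db(left, right), 2), ctd_samples(left, right, 3))
```

In a real coder these quantities would normally be computed per time/frequency tile rather than over a whole signal; the sketch ignores that for brevity.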
The present invention provides a method of embedding spatial information, i.e., spatial parameters, in a mono or stereo audio signal, transmitting the embedded signal, and reproducing the transmitted signal as a multi-channel audio signal. The present invention is not limited to multi-channel audio signals; the multi-channel audio signal is used in the following description for convenience of explanation.
Fig. 2 is a block diagram of an encoding apparatus according to the present invention.
Referring to Fig. 2, the encoding apparatus according to the present invention receives a multi-channel audio signal 201. Here, 'n' denotes the number of input channels.
The spatial information of the multi-channel audio signal, i.e., the spatial parameters, is generated from the multi-channel audio signal 201 by a side information generating unit 204. In the present invention, the spatial information is the information about the audio signal channels that is used when a downmix signal 205, generated by downmixing a multi-channel signal (e.g., left, right, center, left surround, right surround, etc.), is transmitted and the transmitted downmix signal is upmixed into the multi-channel audio signal again. Optionally, the downmix signal 205 can be generated using a downmix signal provided directly from outside, such as an artistic downmix signal 202.
The spatial information generated by the side information generating unit 204 is encoded into a spatial information bitstream by a side information encoding unit 206 for transmission and storage.
The spatial information bitstream is appropriately reshaped so that an embedding unit 207 can insert it directly into the audio signal to be transmitted, i.e., the downmix signal 205. In doing so, a 'digital audio embedding method' can be used.
For example, when the downmix signal 205 is a raw PCM audio signal that is to be stored on a storage medium in which spatial information is difficult to store (e.g., a stereo compact disc) or transmitted via SPDIF (Sony/Philips Digital Interface), there is no auxiliary data field for storing the spatial information, unlike the case of compression coding by AAC or the like.
In this case, if the 'digital audio embedding method' is used, the spatial information can be embedded in the raw PCM audio signal without sound quality distortion. Moreover, to a general decoder, the audio signal in which the spatial information is embedded is indistinguishable from the original signal. That is, to a general PCM decoder, the output signal Lo'/Ro' 208 in which the spatial information is embedded can be regarded as the same signal as the input signal Lo/Ro 205.
'Digital audio embedding methods' include a 'bit replacement coding method', an 'echo hiding method', a 'spread-spectrum based method', and the like.
The bit replacement coding method inserts specific information by modifying the lower bits of quantized audio samples. In an audio signal, modification of the lower bits has almost no influence on the quality of the audio signal.
The echo hiding method inserts into an audio signal an echo small enough not to be perceived by the human ear.
The spread-spectrum based method transforms an audio signal into the frequency domain via a discrete cosine transform, a discrete Fourier transform, or the like, spreads specific binary information with a PN (pseudo-noise) sequence, and adds the result to the audio signal transformed into the frequency domain.
The following description mainly explains the bit replacement coding method. However, the present invention is not limited to the bit replacement coding method.
Fig. 3 is a detailed block diagram of the embedding unit of the spatial encoder shown in Fig. 2 according to the present invention.
Referring to Fig. 3, when spatial information is embedded in a non-recognizable component of the downmix signal components by the bit replacement coding method, the insertion bit length (hereinafter the 'K value') for embedding the spatial information can use K bits (K > 0), determined according to a predetermined method, instead of only the lowest 1 bit. The K bits can be the lower bits of the downmix signal but are not limited to the lower bits. Here, the predetermined method is, for example, a method of finding a masking threshold according to a psychoacoustic model and allocating a suitable number of bits according to the masking threshold.
As shown in the figure, a downmix signal Lo/Ro 301 is transferred to an audio signal encoding unit 306 via a buffer 303 in the embedding unit.
A masking threshold computing unit 304 segments the input audio signal into predetermined sections (e.g., blocks) and then finds a masking threshold for each section.
The masking threshold computing unit 304 finds, according to the masking threshold, the insertion bit length (i.e., the K value) of the downmix signal that can be modified without causing audible distortion. That is, the number of bits used in embedding the spatial information in the downmix signal is allocated per block.
In the description of the present invention, a block is a data unit, present within a frame, into which data is inserted using one insertion bit length (i.e., K value).
At least one block can exist in one frame. If the frame length is fixed, the block length decreases as the number of blocks increases.
Once the K value is determined, it can be included in the spatial information bitstream. That is, a bitstream reshaping unit 305 can reshape the spatial information bitstream so that it includes the K value. In this case, a sync word, an error detection code, an error correction code, and the like can also be included in the spatial information bitstream.
The reshaped spatial information bitstream can be rearranged into an embeddable form. The rearranged spatial information bitstream is embedded in the downmix signal by the audio signal encoding unit 306 and is then output as an audio signal Lo'/Ro' 307 in which the spatial information bitstream is embedded. Here, the spatial information bitstream can be embedded in K bits of the downmix signal. The K value can have one fixed value within a block. In any case, the K value is inserted into the spatial information bitstream during the reshaping or rearranging process and is then transferred to the decoding apparatus, which can use the K value to extract the spatial information bitstream.
As mentioned in the foregoing description, the spatial information bitstream is embedded in the downmix signal block by block. This process can be carried out by one of various methods.
The first method simply replaces the lower K bits of the downmix signal with zeros and adds the reshaped spatial information bitstream data. For example, if the K value is 3, the sample data of the downmix signal is '11101101', and the spatial information bitstream data to embed is '111', the lower 3 bits of '11101101' are replaced with zeros to give '11101000'. The spatial information bitstream data '111' is then added to '11101000' to give '11101111'.
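The first method can be sketched in a few lines. This is a minimal illustration that treats each sample as a plain integer; the function names are hypothetical, and the 8-bit sample width is taken only from the worked example above.

```python
def embed_replace(sample, data, k):
    """Zero the lower k bits of the sample, then add the k data bits."""
    mask = (1 << k) - 1
    if not 0 <= data <= mask:
        raise ValueError("data does not fit in k bits")
    return (sample & ~mask) + data

def extract_low_bits(sample, k):
    """Decoder side: read the embedded data back from the lower k bits."""
    return sample & ((1 << k) - 1)

embedded = embed_replace(0b11101101, 0b111, 3)
print(format(embedded, "08b"))  # 11101111, as in the example above
```

Because the upper bits are untouched, a player that is unaware of the embedding still reproduces a signal that differs from the original by less than 2^K quantization steps.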
The second method uses dithering. First, the rearranged spatial information bitstream data is subtracted from the insertion area of the downmix signal. The downmix signal is then re-quantized based on the K value, and the rearranged spatial information bitstream data is added to the re-quantized downmix signal. For example, if the K value is 3, the sample data of the downmix signal is '11101101', and the spatial information bitstream data to embed is '111', then '111' is subtracted from '11101101' to give '11100110'. The lower 3 bits are then re-quantized (by rounding) to give '11101000', and '111' is added to '11101000' to give '11101111'.
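The subtract, re-quantize, and add steps of the second method can be sketched as follows. The rounding rule (round to the nearest multiple of 2^K, ties upward) is an assumption; the text only says the re-quantization is done by rounding. Clipping at the ends of the sample range is not handled, and the function name is hypothetical.

```python
def embed_dither(sample, data, k):
    """Subtract the data, round to a multiple of 2**k, then add the data back."""
    step = 1 << k
    diff = sample - data                           # 11101101 - 111 = 11100110
    requantized = ((diff + step // 2) >> k) << k   # rounds to 11101000
    return requantized + data                      # 11101000 + 111 = 11101111

print(format(embed_dither(0b11101101, 0b111, 3), "08b"))  # 11101111
```

Since the re-quantized value is a multiple of 2^K, the lower K bits of the result always equal the embedded data, so the decoder extracts them exactly as in the first method. The benefit over plain replacement is a smaller error: under this rounding rule the output never differs from the original sample by more than 2^(K-1).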
Because the spatial information bitstream embedded in the downmix signal is an arbitrary bit sequence, it may not have white-noise characteristics. Since adding a white-noise-like signal to the downmix signal is advantageous for sound quality, the spatial information bitstream undergoes a whitening process before being added to the downmix signal. The whitening process is applied to the spatial information bitstream except for the sync word.
In the present invention, 'whitening' means making a random signal have an equal or nearly equal level in all regions of the frequency domain.
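The text does not say how the whitening is performed. One common way to make a bit sequence noise-like is to XOR it with a pseudo-noise sequence from a linear feedback shift register; the sketch below uses that technique purely as an assumed stand-in. The 16-bit register, its taps, the seed, and the function names are all illustrative choices, not details from the source.

```python
def pn_sequence(n, seed=0xACE1):
    """n pseudo-noise bits from a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    state = seed
    bits = []
    for _ in range(n):
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)
        bits.append(state & 1)
    return bits

def whiten(bits, seed=0xACE1):
    """XOR the payload bits with the PN sequence; applying it twice undoes it."""
    return [b ^ p for b, p in zip(bits, pn_sequence(len(bits), seed))]

payload = [0] * 64            # a highly non-random payload (sync word excluded)
scrambled = whiten(payload)   # no longer a constant run of zeros
restored = whiten(scrambled)  # inverse whitening at the decoder
```

With an XOR scrambler the inverse operation is the same function with the same seed, which is why encoder and decoder would need to agree on the seed in such a scheme.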
In addition, when the spatial information bitstream is embedded in the downmix signal, audible distortion can be minimized by applying a noise shaping method to the spatial information bitstream.
In the present invention, the 'noise shaping method' means either a process of modifying the noise characteristics so that the energy of the quantization noise generated by quantization moves to a high-frequency band above the audible frequency range, or a process of generating a time-varying filter corresponding to the masking threshold obtained from the corresponding audio signal and modifying the characteristics of the quantization noise with the generated filter.
Fig. 4 is a diagram of a first method of rearranging a spatial information bitstream according to the present invention.
Referring to Fig. 4, as mentioned in the foregoing description, the spatial information bitstream can be rearranged into an embeddable form using the K value. The rearranged spatial information bitstream can be embedded in the downmix signal in various ways. Fig. 4 shows a method of embedding the spatial information in sample plane order.
The first method rearranges the spatial information bitstream by dispersing the spatial information bitstream of the corresponding block in units of K bits and embedding the dispersed spatial information bitstream sequentially.
If the K value is 4 and a block 405 consists of N samples 403, the spatial information bitstream 401 can be rearranged so as to be embedded sequentially in the lower 4 bits of each sample.
As mentioned in the foregoing description, the present invention is not limited to embedding the spatial information bitstream in the lower 4 bits of each sample.
Within the lower K bits of each sample, as shown in the figure, the spatial information bitstream can be embedded MSB (most significant bit) first or LSB (least significant bit) first.
In Fig. 4, an arrow 404 indicates the embedding direction, and the numerals in parentheses indicate the data rearrangement order.
A bit plane is a specific bit layer made up of a plurality of bits.
If the number of bits of the spatial information bitstream to be embedded is smaller than the number of embeddable bits in the insertion area in which the bitstream is to be embedded, the remaining bits 406 are padded with zeros, filled with a random signal, or replaced with the original downmix signal.
For example, if the number of samples (N) constituting one block is 100 and the K value is 4, the number of embeddable bits (W) in the block is W = N × K = 100 × 4 = 400.
If the number of bits (V) of the spatial information bitstream to embed is 390 (i.e., V < W), the remaining 10 bits are padded with zeros, filled with a random signal, replaced with the original downmix signal, filled with a tail sequence indicating the end of the data, or filled with a combination of these. The tail sequence is a bit sequence indicating the end of the spatial information bitstream in the corresponding block. Although Fig. 4 shows the remaining bits being padded per block, the present invention also covers padding the remaining bits per insertion frame in the same manner.
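The sample-plane ordering and the padding arithmetic above can be sketched as follows. The sketch picks two of the options the text allows, MSB-first packing within each sample and zero padding of the remaining bits, and uses 8-bit sample values only for readability; the function names are hypothetical.

```python
def embed_sample_order(samples, bits, k):
    """Embed bits k at a time into the lower k bits of consecutive samples."""
    capacity = len(samples) * k                  # W = N * K
    if len(bits) > capacity:
        raise ValueError("bitstream exceeds block capacity")
    padded = list(bits) + [0] * (capacity - len(bits))   # zero-pad W - V bits
    mask = (1 << k) - 1
    out = []
    for i, sample in enumerate(samples):
        value = 0
        for bit in padded[i * k:(i + 1) * k]:    # MSB-first within the k bits
            value = (value << 1) | bit
        out.append((sample & ~mask) | value)
    return out

def extract_sample_order(samples, k, n_bits):
    """Read the lower k bits of each sample back, MSB first."""
    bits = []
    for sample in samples:
        for shift in range(k - 1, -1, -1):
            bits.append((sample >> shift) & 1)
    return bits[:n_bits]

block = [0xFF] * 100                             # N = 100 samples
stream = [i % 2 for i in range(390)]             # V = 390 bits, K = 4, W = 400
embedded_block = embed_sample_order(block, stream, 4)
```

Here the last two samples and half of the third-to-last carry only padding, which is where a tail sequence or the original downmix bits could be placed instead.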
Fig. 5 is a diagram of a second method of rearranging a spatial information bitstream according to the present invention.
Referring to Fig. 5, the second method rearranges the spatial information bitstream 501 in bit plane 502 order. In this case, the spatial information bitstream can be embedded block by block, starting from the lower bits of the downmix signal, which of course does not put any limitation on the present invention.
For example, if the number of samples (N) constituting one block is 100 and the K value is 4, the 100 least significant bits constituting bit plane 0 (502) are filled first, and then the 100 bits constituting bit plane 1 (502) can be filled.
In Fig. 5, an arrow 505 indicates the embedding direction, and the numerals in parentheses indicate the data rearrangement order.
The second method is particularly advantageous for extracting a sync word at a random position. When searching the rearranged and encoded signal for the sync word of the inserted spatial information bitstream, only the LSBs need to be extracted to search for the sync word.
Moreover, the second method can be expected to use only the minimum number of LSBs, depending on the number of bits (V) of the spatial information bitstream to embed. In this case, if the number of bits (V) of the spatial information bitstream to embed is smaller than the number of embeddable bits (W) in the insertion area in which the bitstream is to be embedded, the remaining bits 506 are padded with zeros, filled with a random signal, replaced with the original downmix signal, filled with a tail sequence indicating the end of the data, or filled with a combination of these. In particular, using the original downmix signal is advantageous. Although Fig. 5 shows an example of padding the remaining bits per block, the present invention also covers the case of padding the remaining bits per insertion frame in the same manner.
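The bit-plane ordering can be sketched in the same style. In this sketch, positions that receive no bitstream data simply keep the original downmix bits, which is the option the text singles out as advantageous; the function names and the small 4-bit sample values are hypothetical.

```python
def embed_bitplane_order(samples, bits, k):
    """Fill bit plane 0 (the LSBs) across all samples, then plane 1, and so on."""
    n = len(samples)
    if len(bits) > n * k:
        raise ValueError("bitstream exceeds block capacity")
    out = list(samples)
    for index, bit in enumerate(bits):
        plane, pos = divmod(index, n)            # which plane, which sample
        out[pos] = (out[pos] & ~(1 << plane)) | (bit << plane)
    return out                                   # untouched bits stay original

def extract_bitplane_order(samples, n_bits):
    """Read the bits back plane by plane."""
    n = len(samples)
    return [(samples[i % n] >> (i // n)) & 1 for i in range(n_bits)]

block = [0b1111] * 4                             # N = 4 samples
stream = [1, 0, 1, 0, 0, 1]                      # V = 6 bits, K = 2, W = 8
embedded_block = embed_bitplane_order(block, stream, 2)
print([format(s, "04b") for s in embedded_block])
```

Only as many planes as the bitstream needs are modified, so a short bitstream disturbs nothing above the LSBs.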
Fig. 6A shows the structure of a bitstream for embedding a spatial information bitstream in a downmix signal according to the present invention.
Referring to Fig. 6A, a spatial information bitstream 607 can be rearranged by the bitstream reshaping unit 305 to include a sync word 603 and a K value 604 for the spatial information bitstream.
In the reshaping process, at least one error detection code or error correction code 606 or 608 (the following description refers to the error detection code) can be included in the reshaped spatial information bitstream. The error detection code makes it possible to determine whether the spatial information bitstream 607 was distorted in transmission or storage.
The error detection code includes a CRC (cyclic redundancy check). The error detection code can be included in two separate stages: error detection code 1 for a header 601 containing the K value, and error detection code 2 for the frame data 602 of the spatial information bitstream, can be included in the spatial information bitstream separately. In addition, remaining information 605 can be included separately in the spatial information bitstream, and information about the rearrangement method of the spatial information bitstream and the like can be included in the remaining information 605.
Fig. 6B is a detailed view of the configuration of the spatial information bitstream shown in Fig. 6A. Fig. 6B shows an embodiment in which one frame of the spatial information bitstream 601 includes two blocks (the present invention is not limited to this).
Referring to Fig. 6B, the spatial information bitstream includes a sync word 612, K values (K1, K2, K3, K4) 613 to 616, remaining information 617, and error detection codes 618 and 623.
The spatial information bitstream 610 includes a pair of blocks. In the case of a stereo signal, block 1 can consist of blocks 619 and 620 for the left and right channels, respectively, and block 2 can consist of blocks 621 and 622 for the left and right channels, respectively.
Although a stereo signal is shown in Fig. 6B, the present invention is not limited to stereo signals.
The insertion bit lengths (K values) of the blocks are included in the header part.
K1 613 indicates the insertion bit length of the left channel of block 1. K2 614 indicates the insertion bit length of the right channel of block 1. K3 615 indicates the insertion bit length of the left channel of block 2. K4 616 indicates the insertion bit length of the right channel of block 2.
The error detection code can be included in two separate stages. For example, error detection code 1 (618) for a header 609 containing the K values and error detection code 2 for the frame data 611 of the spatial information bitstream can be included separately.
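The two-stage error detection described above can be sketched with a simple byte-oriented frame. The text fixes neither field widths nor a CRC polynomial, so everything concrete here is an assumption: the two-byte sync word value, the one-byte K fields, the count byte, the use of CRC-32 from Python's `zlib` module, and the function names. The remaining information field is omitted.

```python
import struct
import zlib

SYNC_WORD = b"\xA5\x5A"   # hypothetical sync word value

def pack_frame(k_values, frame_data):
    """Header (sync word + K values) and frame data each get their own CRC."""
    header = SYNC_WORD + bytes([len(k_values)]) + bytes(k_values)
    header_crc = struct.pack(">I", zlib.crc32(header))       # error detection code 1
    data_crc = struct.pack(">I", zlib.crc32(frame_data))     # error detection code 2
    return header + header_crc + frame_data + data_crc

def unpack_frame(frame):
    """Verify both CRCs and return the K values and the frame data."""
    if frame[:2] != SYNC_WORD:
        raise ValueError("sync word not found")
    header_end = 3 + frame[2]
    header = frame[:header_end]
    (header_crc,) = struct.unpack(">I", frame[header_end:header_end + 4])
    if zlib.crc32(header) != header_crc:
        raise ValueError("header CRC mismatch")
    frame_data = frame[header_end + 4:-4]
    (data_crc,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(frame_data) != data_crc:
        raise ValueError("frame data CRC mismatch")
    return list(header[3:]), frame_data

# K1..K4 for the left/right channels of two blocks, as in Fig. 6B.
frame = pack_frame([3, 3, 2, 4], b"spatial-parameters")
k_values, data = unpack_frame(frame)
```

Checking the header separately lets a decoder trust the K values, which it needs in order to locate the embedded bits, before it examines the frame data at all.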
Fig. 7 is a block diagram of a decoding apparatus according to the present invention.
Referring to Fig. 7, the decoding apparatus according to the present invention receives an audio signal Lo'/Ro' 701 in which a spatial information bitstream is embedded.
The audio signal in which the spatial information bitstream is embedded can be a mono, stereo, or multi-channel signal. For convenience of explanation, a stereo signal is used as an example, which does not put any limitation on the present invention.
An embedded signal decoding unit 702 can extract the spatial information bitstream from the audio signal 701.
The spatial information bitstream extracted by the embedded signal decoding unit 702 is an encoded spatial information bitstream, which can be the input to a spatial information decoding unit 703.
The spatial information decoding unit 703 decodes the encoded spatial information bitstream and then outputs the decoded spatial information bitstream to a multi-channel generating unit 704.
Fig. 8 is a detailed block diagram of the embedded signal decoding unit 702 of the decoding apparatus according to the present invention.
Referring to Fig. 8, an audio signal Lo'/Ro' in which spatial information is embedded is input to the embedded signal decoding unit 702. A sync word searching unit 802 detects a sync word from the audio signal 801. The sync word can be detected from one channel of the audio signal.
After detecting synchronization character, header decoding unit 803 is decoded the header district.In this case, the information extraction of predetermined length from this header district and data inverse can contrary albefaction scheme be applied to the header district information except that synchronization character in institute's information extraction to revising unit 804.
Then, can be from the length information in this header district of information acquisition, header district of having used contrary albefaction scheme thereon etc.
And data inverse is revised unit 804 can will be applied to remaining spatial information bit stream against the albefaction scheme.Information such as K value etc. can obtain by the header decoding.The raw spatial information bit stream can obtain arranging once more through the spatial information bit stream of resetting by using such as information such as K values.In addition, can obtain to arrange down the sync bit information of the frame of mixed signal and spatial information bit stream, promptly the frame arrangement information 806.
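The sync word search and inverse whitening steps can be illustrated with a short sketch. The whitening scheme itself is not specified in this description, so the example assumes XOR with a pseudo-random sequence from a 16-bit LFSR (an assumption, including the seed and taps); with XOR whitening, applying the same operation again undoes it. The function names are hypothetical.

```python
def find_sync(bits, sync):
    """Return the index of the first occurrence of the sync word, or -1."""
    n = len(sync)
    for i in range(len(bits) - n + 1):
        if bits[i:i + n] == sync:
            return i
    return -1


def lfsr_sequence(length, state=0xACE1):
    """Pseudo-random bits from a 16-bit LFSR (assumed whitening source)."""
    out = []
    for _ in range(length):
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        out.append(state & 1)
    return out


def whiten(bits, seed=0xACE1):
    """XOR with the LFSR sequence; the same call also performs inverse whitening."""
    return [b ^ p for b, p in zip(bits, lfsr_sequence(len(bits), seed))]
```

In this sketch the sync word is left unwhitened so that it can be found first, matching the statement that inverse whitening is applied to the header information other than the sync word.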
Fig. 9 is a diagram explaining the case in which a general PCM decoding device reproduces an audio signal according to the present invention.
Referring to Fig. 9, the audio signal Lo'/Ro' in which the spatial information bit stream is embedded is used as the input to a general PCM decoding device.
The general PCM decoding device recognizes the audio signal Lo'/Ro' in which the spatial information bit stream is embedded as a normal stereo audio signal and reproduces sound from it. In terms of sound quality, the reproduced sound is indistinguishable from the audio signal 902 before the spatial information was embedded.
Therefore, an audio signal in which spatial information is embedded according to the present invention is compatible with normal stereo reproduction in a general PCM decoding device and has the advantage of providing a multi-channel audio signal in a decoding device capable of multi-channel decoding.
Fig. 10 is a flowchart of a method of encoding spatial information embedded in a downmix signal according to the present invention.
Referring to Fig. 10, an audio signal is downmixed from a multi-channel signal (1001, 1002). In this case, the downmix signal can be a mono, stereo, or multi-channel signal.
Then, spatial information is extracted from the multi-channel signal (1003), and a spatial information bit stream is generated using the spatial information (1004).
The spatial information bit stream is embedded in the downmix signal (1005).
The whole bit stream, including the downmix signal in which the spatial information bit stream is embedded, is transferred to the decoding device (1006).
In particular, the present invention uses the downmix signal to find the insertion bit length (i.e., the K value) of the insertion area in which the spatial information bit stream is to be inserted, and can embed the spatial information bit stream in this insertion area.
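The embedding step can be sketched as overwriting the K lower bits of each PCM sample with payload bits. This is a minimal illustration, assuming integer PCM samples and a K value that has already been chosen (for example from a masking threshold); the function names are hypothetical and zero padding of a short payload is an assumption.

```python
def embed_bits(samples, bits, k):
    """Overwrite the k least significant bits of each sample with payload bits."""
    mask = (1 << k) - 1
    out = []
    pos = 0
    for s in samples:
        chunk = 0
        for _ in range(k):
            bit = bits[pos] if pos < len(bits) else 0  # pad with zeros (assumption)
            chunk = (chunk << 1) | bit
            pos += 1
        out.append((s & ~mask) | chunk)
    return out


def extract_bits(samples, k, count):
    """Read back the first `count` payload bits from the k lower bits of each sample."""
    mask = (1 << k) - 1
    bits = []
    for s in samples:
        chunk = s & mask
        for shift in range(k - 1, -1, -1):
            bits.append((chunk >> shift) & 1)
    return bits[:count]
```

Because only the K lower bits change, each sample moves by at most 2^K - 1 quantization steps, which is why a general PCM player can still reproduce the signal normally.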
Fig. 11 is a flowchart of a method of decoding spatial information embedded in a downmix signal according to the present invention.
Referring to Fig. 11, the decoding device receives the whole bit stream, including the downmix signal in which the spatial information bit stream is embedded (1101), and extracts the downmix signal from this bit stream (1102).
The decoding device extracts the spatial information bit stream from the whole bit stream and decodes it (1103).
The decoding device extracts the spatial information by decoding (1104) and then decodes the downmix signal using the extracted spatial information (1105). In this case, the downmix signal can be decoded into two channels or multiple channels.
In particular, the present invention can extract information on the embedding method of the spatial information bit stream and information on the K value, and can decode the spatial information bit stream using the extracted embedding method and K value.
Fig. 12 is a diagram of the frame length of a spatial information bit stream embedded in a downmix signal according to the present invention.
Referring to Fig. 12, a 'frame' denotes a unit that has one header and allows independent decoding over a predetermined length. In the description that follows, 'frame' denotes the 'insertion frame' introduced below. In the present invention, an 'insertion frame' denotes the unit by which the spatial information bit stream is embedded in the downmix signal.
The length of the insertion frame can be defined per frame, or a predetermined length can be used.
For example, the insertion frame length (N) can be made equal to the frame length (S) of the unit by which the spatial information is decoded and applied (hereinafter 'decoded frame length') (see Fig. 12(a)), N can be a multiple of S (see Fig. 12(b)), or S can be a multiple of N (see Fig. 12(c)).
In the case of N = S, as shown in Fig. 12(a), the decoded frame length (S, 1201) coincides with the insertion frame length (N, 1202), which facilitates decoding.
In the case of N > S, as shown in Fig. 12(b), a plurality of decoded frames (1203) are concatenated and transmitted as one insertion frame (N, 1204), which reduces the number of bits added by the header, the error detection code (e.g., CRC), and the like.
In the case of N < S, as shown in Fig. 12(c), one decoded frame (S, 1205) can be configured by concatenating several insertion frames (N, 1206).
The insertion frame header can carry information on the insertion bit length used for embedding the spatial information, information on the insertion frame length (N), information on the number of subframes included in the insertion frame, and the like.
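The effect of the three N/S relationships on header overhead can be shown with a small calculation. The function below is a hypothetical helper, and the figure of 64 header bits in the usage note is an arbitrary illustrative value, not taken from the source.

```python
def overhead_per_decoded_frame(n, s, header_bits):
    """Header/CRC bits charged to each decoded frame of length s (in samples)
    when insertion frames have length n (in samples)."""
    if n >= s:
        if n % s:
            raise ValueError("N must be a multiple of S")
        return header_bits / (n // s)  # one header shared by n // s decoded frames
    if s % n:
        raise ValueError("S must be a multiple of N")
    return header_bits * (s // n)      # s // n insertion frames, each with a header
```

With a 64-bit header, N = S costs 64 bits per decoded frame, N = 4S costs 16, and N = S/2 costs 128, which matches the remark that concatenating decoded frames into one insertion frame reduces the added bits.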
Fig. 13 is a diagram of a spatial information bit stream embedded in a downmix signal by insertion frame unit according to the present invention.
First, in each of the cases shown in Figs. 12(a), 12(b), and 12(c), the insertion frame and the decoded frame are configured to be multiples of each other.
Referring to Fig. 13, a bit stream of fixed length, for example a packet 1303 in transport stream (TS) format, can be configured for transmission.
In particular, the spatial information bit stream 1301 can be delimited into packet units of a predetermined length regardless of the decoded frame length of the spatial information bit stream. A packet in which information such as a TS header 1302 is inserted is transferred to the decoding device. The length of the insertion frame can be defined per frame, or a predetermined length can be used instead of a per-frame definition.
This method is needed to vary the data rate of the spatial information bit stream, given that the masking threshold of each block differs according to the characteristics of the downmix signal, so the maximum number of bits (K_max) that can be allocated without degrading the quality of the downmix signal differs from block to block.
For example, if K_max is not enough to fully represent the spatial information bit stream needed for the corresponding block, data up to K_max is transmitted and the remaining data is transmitted later through another block.
If K_max is sufficient, the spatial information bit stream for the next block can be loaded in advance.
In this case, each TS packet has an independent header. The header can include a sync word, TS packet length information, information on the number of subframes included in the TS packet, information on the insertion bit length allocated within the packet, and the like.
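The carry-over and pre-loading behaviour described above can be sketched as a greedy fill of per-block capacities. This is a simplified model under assumptions: the spatial information is treated as one continuous stream, `needed` is the number of bits each block's own spatial data requires, and `capacity` is the number of bits each block can carry (derived from its K_max). All names and numbers are hypothetical.

```python
def fill_blocks(needed, capacity):
    """For each block, return (bits carried, lead), where lead > 0 means data for
    later blocks was pre-loaded and lead < 0 means data was deferred."""
    total = sum(needed)
    sent_so_far = 0
    due_so_far = 0
    report = []
    for need, cap in zip(needed, capacity):
        due_so_far += need
        carried = min(cap, total - sent_so_far)  # never send more than remains
        sent_so_far += carried
        report.append((carried, sent_so_far - due_so_far))
    return report
```

For example, with 100 bits needed per block and capacities of 60, 150, and 120 bits, the first block defers 40 bits, the second catches up and pre-loads 10 bits of the third, and the third finishes exactly.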
Fig. 14A is a diagram explaining a first method of solving the time alignment problem of a spatial information bit stream embedded by insertion frame unit.
Referring to Fig. 14A, the length of the insertion frame is defined per frame, or a predetermined length can be used.
Embedding by insertion frame unit can cause a time alignment problem between the start position of an insertion frame of the embedded spatial information bit stream and the frame of the downmix signal. Therefore, a solution to the time alignment problem is needed.
In the first method, shown in Fig. 14A, the header 1402 of the decoded frame 1403 of the spatial information (hereinafter 'decoded frame header') is placed separately.
Identification information indicating whether position information of the audio signal to which the spatial information is to be applied exists can be included in the decoded frame header 1402.
For example, in the case of TS packets 1404 and 1405, identification information 1408 (e.g., a flag) indicating whether the decoded frame header 1402 exists is included in the TS packet header 1404.
If the identification information 1408 is 1, i.e., if the decoded frame header 1402 exists, identification information indicating whether position information of the downmix signal to which the spatial information bit stream is to be applied exists can be extracted from the decoded frame header.
Then, according to the extracted identification information, position information 1409 (e.g., delay information) of the downmix signal to which the spatial information bit stream is to be applied can be extracted from the decoded frame header 1402.
If the identification information 1411 is 0, the header of the TS packet may not include position information.
In general, the spatial information bit stream 1403 preferably appears ahead of the corresponding downmix signal 1401. Therefore, the position information 1409 can be a sample value for the delay.
Meanwhile, to prevent the amount of information needed to represent the sample value from growing excessively because of a large delay, a sample group unit (e.g., a granule unit) that represents a group of samples is defined. The position information can therefore be expressed in this sample group unit.
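The saving from expressing the delay in sample group units can be illustrated numerically. The group size of 1152 samples in the usage note is an assumed example value, and both helper names are hypothetical.

```python
def position_in_groups(delay_samples, group_size):
    """Express a delay as a whole number of sample groups."""
    groups, remainder = divmod(delay_samples, group_size)
    if remainder:
        raise ValueError("delay must be a whole number of sample groups")
    return groups


def bits_needed(value):
    """Minimum number of bits for an unsigned integer value."""
    return max(1, value.bit_length())
```

A delay of 36864 samples needs 16 bits as a raw sample count, but only 6 bits when expressed as 32 groups of 1152 samples.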
As mentioned above, a TS sync word 1406, an insertion bit length 1407, identification information indicating whether the decoded frame header exists, and remaining information 140 can be included in the TS header.
Fig. 14B is a diagram explaining a second method of solving the time alignment problem of a spatial information bit stream embedded by insertion frames whose length is defined per frame.
Referring to Fig. 14B, in the case of TS packets, for example, the second method is implemented by matching the start point 1413 of the decoded frame, the start point of the TS packet, and the start point of the corresponding downmix signal 1412.
For the matched part, identification information 1420 or 1422 (e.g., a flag) indicating whether these three kinds of start points are aligned can be included in the header 1415 of the TS packet.
Fig. 14B shows the three kinds of start points matching at the n-th frame 1412 of the downmix signal. In this case, the identification information 1422 can have the value 1.
If the three kinds of start points do not match, the identification information 1420 can have the value 0.
To match the three kinds of start points, a specific part 1417 following the previous TS packet is padded with zeros, filled with a random signal, replaced with the original downmix audio signal, or filled with a combination of these.
As mentioned above, a TS sync word 1418, an insertion bit length 1419, and remaining information 1421 can be included in the TS packet header 1415.
Fig. 15 is a diagram of a method of attaching a spatial information bit stream to a downmix signal according to the present invention.
Referring to Fig. 15, the length of the frame to which the spatial information bit stream is attached (hereinafter 'additional frame') can be a length unit defined per frame or a predetermined length unit not defined per frame.
For example, as shown in the figure, the insertion frame length can be obtained by multiplying or dividing the decoded frame length 1504 of the spatial information by N (where N is a positive integer), or the insertion frame length can be a fixed-length unit.
If the decoded frame length 1504 differs from the insertion frame length, it is possible, for example, to generate an insertion frame of the same length as the decoded frame length 1504 without segmenting the spatial information bit stream, rather than cutting the spatial information bit stream arbitrarily to fit the insertion frame.
In this case, the spatial information bit stream can be configured to be embedded in the downmix signal, or can be configured to be attached to the downmix signal rather than embedded in it.
For a signal converted from an analog signal to a digital signal, such as a PCM signal (hereinafter 'first audio signal'), the spatial information bit stream can be configured to be embedded in the first audio signal.
For a further compressed digital signal, such as an MP3 signal (hereinafter 'second audio signal'), the spatial information bit stream can be configured to be appended to the second audio signal.
For example, when the second audio signal is used, the downmix signal is represented as a bit stream in compressed format. Therefore, as shown in the figure, the downmix signal bit stream 1502 exists in compressed format, and spatial information of the decoded frame length 1504 is appended to the downmix signal bit stream 1502.
Therefore, the spatial information bit stream can be transmitted in bursts.
A header 1503 can be present in the decoded frame. Position information of the downmix signal to which the spatial information is applied is included in this header 1503.
Meanwhile, the present invention includes the case in which the spatial information bit stream is configured as an additional frame in compressed format (e.g., a TS bit stream 1506) and this additional frame is appended to the downmix signal bit stream 1502 in compressed format.
In this case, a TS header 1505 of the TS bit stream 1506 can be present. The additional frame header (e.g., the TS header 1505) can include at least one of additional frame sync information 1507, identification information 1508 indicating whether a decoded frame header exists in the additional frame, information on the number of subframes included in the additional frame, and remaining information 1509. Identification information indicating whether the start point of the additional frame matches the start point of the decoded frame can also be included in the additional frame.
If a decoded frame header is present in the additional frame, identification information indicating whether position information of the downmix signal to which the spatial information is applied exists is extracted from the decoded frame header.
Then, the position information of the downmix signal to which the spatial information is applied can be extracted according to the identification information.
Fig. 16 is a flowchart of a method of encoding a spatial information bit stream embedded in a downmix signal by insertion frames of various sizes according to the present invention.
Referring to Fig. 16, an audio signal is downmixed from a multi-channel audio signal (1601, 1602). In this case, the downmix signal can be a mono, stereo, or multi-channel audio signal.
Spatial information is extracted from the multi-channel audio signal (1601, 1603).
A spatial information bit stream is then generated using the extracted spatial information (1604). The generated spatial information bit stream can be embedded in the downmix signal by an insertion frame unit whose length corresponds to an integer multiple of the decoded frame length of each frame.
If the decoded frame length (S) is greater than the insertion frame length (N) (1605), a plurality of N are concatenated to equal one S (1607).
If the decoded frame length (S) is less than the insertion frame length (N) (1606), a plurality of S are concatenated to equal one N (1608).
If the decoded frame length (S) equals the insertion frame length (N), the insertion frame length (N) is configured to equal the decoded frame length (S) (1609).
The spatial information bit stream configured in the above manner is embedded in the downmix signal (1610).
Finally, the whole bit stream, including the downmix signal in which the spatial information bit stream is embedded, is transmitted (1611).
In addition, in the present invention, information on the insertion frame length of the spatial information bit stream can be embedded in the whole bit stream.
Fig. 17 is a flowchart of a method of encoding a spatial information bit stream embedded in a downmix signal by fixed length according to the present invention.
Referring to Fig. 17, an audio signal is downmixed from a multi-channel audio signal (1701, 1702). In this case, the downmix signal can be a mono, stereo, or multi-channel audio signal.
Spatial information is extracted from the multi-channel audio signal (1701, 1703).
A spatial information bit stream is then generated using the extracted spatial information (1704).
After the spatial information bit stream is delimited into a bit stream of fixed length (packet unit), for example a transport stream (TS) (1705), the fixed-length spatial information bit stream is embedded in the downmix signal (1706).
Then, the whole bit stream, including the downmix signal in which the spatial information bit stream is embedded, is transmitted (1707).
In addition, in the present invention, the insertion bit length (i.e., the K value) of the insertion area in which the spatial information bit stream is embedded is obtained using the downmix signal, and the spatial information bit stream can be embedded in the insertion area.
Fig. 18 is a diagram of a first method of embedding a spatial information bit stream in an audio signal downmixed to at least one channel according to the present invention.
When the downmix signal is configured with at least one channel, the spatial information is regarded as data shared by the at least one channel. Therefore, a method of embedding the spatial information by distributing it over the at least one channel is needed.
Fig. 18 shows a method of embedding spatial information in one channel of a downmix signal having at least one channel.
Referring to Fig. 18, the spatial information is embedded in K bits of the downmix signal. In particular, the spatial information is embedded in only one channel and not in the other channel. The K value can differ for each block or channel.
As mentioned above, the bits corresponding to the K value can correspond to the lower bits of the downmix signal, but the present invention is not limited to this. In this case, the spatial information bit stream can be inserted into the one channel in bit-plane order starting from the LSB or in sample-plane order.
Fig. 19 is a diagram of a second method of embedding a spatial information bit stream in an audio signal downmixed to at least one channel according to the present invention. For convenience of explanation, Fig. 19 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 19, the second method is implemented by embedding the spatial information in block-n of one channel (e.g., the left channel), then block-n of the other channel (e.g., the right channel), then block-(n+1) of the former channel (the left channel), and so on. In this case, sync information can be embedded in only one channel.
Although the spatial information bit stream can be embedded in the downmix signal block by block, it can be extracted block by block or frame by frame in the decoding process.
Because the signal characteristics of the two channels of the downmix signal differ from each other, K values can be allocated to the two channels separately by finding the masking threshold of each channel. In particular, as shown in the figure, K1 and K2 are allocated to the two channels, respectively.
In this case, the spatial information can be embedded in each channel in bit-plane order starting from the LSB or in sample-plane order.
Fig. 20 is a diagram of a third method of embedding a spatial information bit stream in an audio signal downmixed to at least one channel according to the present invention. Fig. 20 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 20, the third method is implemented by embedding the spatial information distributed over two channels. In particular, the spatial information is embedded such that the embedding order alternates between the two channels by sample unit.
Because the signal characteristics of the two channels of the downmix signal differ from each other, K values can be allocated differently to the two channels by finding the masking threshold of each channel individually. Specifically, as shown in the figure, K1 and K2 are allocated to the two channels, respectively.
The K value of each block can differ. For example, the spatial information is placed in turn in the K1 lower bits of sample-1 of one channel (e.g., the left channel), the K2 lower bits of sample-1 of the other channel (e.g., the right channel), the K1 lower bits of sample-2 of the former channel (e.g., the left channel), and the K2 lower bits of sample-2 of the latter channel (e.g., the right channel).
In the figure, the numbers in parentheses indicate the order in which the spatial information bit stream is filled. Although Fig. 20 shows the spatial information bit stream being filled from the MSB, it can also be filled from the LSB.
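The sample-interleaved order of the third method can be sketched as follows. This is an illustration only: samples are assumed to be plain integers, each K-bit field is filled MSB first (as Fig. 20 is described), a short payload is padded with zeros, and the function name is hypothetical.

```python
def embed_sample_interleaved(left, right, bits, k1, k2):
    """Fill K1 lower bits of L sample-1, K2 lower bits of R sample-1,
    then L sample-2, R sample-2, and so on."""
    left, right = list(left), list(right)
    pos = 0

    def take(k):
        # Consume the next k payload bits, MSB first, padding with zeros.
        nonlocal pos
        value = 0
        for _ in range(k):
            bit = bits[pos] if pos < len(bits) else 0
            value = (value << 1) | bit
            pos += 1
        return value

    for i in range(len(left)):
        left[i] = (left[i] & ~((1 << k1) - 1)) | take(k1)
        right[i] = (right[i] & ~((1 << k2) - 1)) | take(k2)
    return left, right
```

Because the payload follows the same L/R sample order as an interleaved stereo stream, a decoder can recover it while reading samples in arrival order.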
Fig. 21 is a diagram of a fourth method of embedding a spatial information bit stream in an audio signal downmixed to at least one channel according to the present invention. Fig. 21 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 21, the fourth method is implemented by embedding the spatial information distributed over at least one channel. Specifically, the spatial information is embedded such that the embedding order alternates between the two channels by bit-plane unit, starting from the LSB.
Because the signal characteristics of the two channels of the downmix signal differ from each other, K values (K1 and K2) can be allocated differently to the two channels by finding the masking threshold of each channel individually. Specifically, as shown in the figure, K1 and K2 can be allocated to the two channels, respectively.
The K value of each block can differ. For example, the spatial information is placed in turn in the 1 least significant bit of sample-1 of one channel (e.g., the left channel), the 1 least significant bit of sample-1 of the other channel (e.g., the right channel), the 1 least significant bit of sample-2 of the former channel (e.g., the left channel), and the 1 least significant bit of sample-2 of the latter channel (e.g., the right channel). In the figure, the numbers in the blocks indicate the order in which the spatial information is filled.
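The bit-plane order of the fourth method can be sketched in the same style. The handling of unequal K1 and K2 (a channel simply stops receiving bits once its own K planes are full) is an assumption made for this illustration, as are the function names and the zero padding of a short payload.

```python
def set_bit(value, plane, bit):
    """Return value with the bit at position `plane` replaced by `bit`."""
    return (value & ~(1 << plane)) | (bit << plane)


def embed_bitplane_interleaved(left, right, bits, k1, k2):
    """Fill the LSB plane across L/R samples first, then the next plane, and so on."""
    left, right = list(left), list(right)
    stream = iter(bits)
    for plane in range(max(k1, k2)):
        for i in range(len(left)):
            if plane < k1:
                left[i] = set_bit(left[i], plane, next(stream, 0))
            if plane < k2:
                right[i] = set_bit(right[i], plane, next(stream, 0))
    return left, right
```

Compared with the sample-interleaved sketch of the third method, the earliest payload bits here all land in the least significant plane, so they cause the smallest change to the audio samples.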
When the audio signal is stored on a storage medium that has no auxiliary data area (e.g., a stereo CD) or is transmitted over SPDIF or the like, the L/R channels are interleaved by sample unit. Thus, if the audio signal is stored by the third or fourth method, it is advantageous for the decoder to process the audio signal in the order received.
The fourth method is applicable when the spatial information bit stream is stored after being rearranged by bit-plane unit.
As mentioned above, when the spatial information bit stream is embedded distributed over two channels, K values can be allocated differently to the respective channels. In this case, the K value of each channel can be transmitted separately in the bit stream. When a plurality of K values are transmitted, differential coding can be applied to encode the K values.
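Differential coding of a list of K values can be sketched as sending the first value followed by the differences between neighbours. The description does not specify the exact differential scheme, so this first-order form and the function names are assumptions.

```python
from itertools import accumulate


def diff_encode(k_values):
    """Replace each K value by its difference from the previous one (first vs. 0)."""
    deltas = []
    prev = 0
    for k in k_values:
        deltas.append(k - prev)
        prev = k
    return deltas


def diff_decode(deltas):
    """Rebuild the K values as running sums of the differences."""
    return list(accumulate(deltas))
```

When neighbouring channels or blocks have similar masking thresholds, the differences are small numbers that can be coded with fewer bits than the K values themselves.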
Fig. 22 is a diagram of a fifth method of embedding a spatial information bit stream in an audio signal downmixed to at least one channel according to the present invention. Fig. 22 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 22, the fifth method is implemented by embedding the spatial information distributed over two channels. Specifically, the fifth method is implemented by repeatedly inserting the same value in each of the two channels.
In this case, values with the same sign can be inserted in each of the at least two channels, or values with different signs can be inserted in the at least two channels, respectively.
For example, the value 1 is inserted in each of the two channels, or the values 1 and -1 are inserted alternately in the two channels.
The advantage of the fifth method is that transmission errors are easy to check by comparing the least significant insertion bits (e.g., K bits) of the at least one channel.
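The error check enabled by the fifth method can be sketched as follows, for the same-sign case only. The payload is assumed to be given as one K-bit value per sample, and both function names are hypothetical.

```python
def embed_duplicate(left, right, values, k):
    """Write the same k-bit value into the lower bits of both channels."""
    mask = (1 << k) - 1
    new_left = [(s & ~mask) | v for s, v in zip(left, values)]
    new_right = [(s & ~mask) | v for s, v in zip(right, values)]
    return new_left, new_right


def mismatched_samples(left, right, k):
    """Indices where the k lower bits of the two channels disagree."""
    mask = (1 << k) - 1
    return [i for i, (l, r) in enumerate(zip(left, right)) if (l & mask) != (r & mask)]
```

Any index returned by `mismatched_samples` marks a sample where at least one channel was corrupted in its insertion bits; an empty list means no disagreement was observed.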
In particular, when a mono audio signal is delivered on a stereo medium such as a CD, channel-L (left channel) and channel-R (right channel) of the downmix signal are equal to each other, so robustness and the like can be improved by making the inserted spatial information equal as well. In this case, the spatial information is embedded in each channel in bit-plane order starting from the LSB or in sample-plane order.
Fig. 23 is a diagram of a sixth method of embedding a spatial information bit stream in an audio signal downmixed to at least one channel according to the present invention.
The sixth method relates to a method of inserting spatial information in a downmix signal having at least one channel when the frame of each channel includes a plurality of blocks (of length B).
Referring to Fig. 23, the insertion bit length (i.e., the K value) can have a different value for each channel and block, or each channel and block can have the same value.
The insertion bit lengths (e.g., K1, K2, K3, and K4) can be stored in a frame header that is transmitted once per whole frame. The frame header can be located in the LSB. In this case, the header can be inserted by bit-plane unit, and the spatial information data can be inserted alternately by sample unit or block unit. In Fig. 23, the number of blocks in a frame is 2. Therefore, the block length (B) is N/2. In this case, the number of bits inserted in the frame is (K1+K2+K3+K4) * B.
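The bit count for a frame can be worked through with a small helper. The function name is hypothetical, and the frame length and K values in the usage note are illustrative numbers rather than values from the source.

```python
def inserted_bits_per_frame(k_values, frame_length, blocks_per_frame):
    """Bits inserted in one frame: the sum of the per-block, per-channel
    K values multiplied by the block length B = N / blocks_per_frame."""
    block_length = frame_length // blocks_per_frame
    return sum(k_values) * block_length
```

For a frame of N = 2048 samples split into two blocks (B = 1024) with K1 to K4 equal to 2, 3, 1, and 2, the frame carries (2+3+1+2) * 1024 = 8192 inserted bits.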
Fig. 24 is a diagram of a seventh method of embedding a spatial information bit stream in an audio signal downmixed to at least one channel according to the present invention. Fig. 24 shows a downmix signal having two channels, but the present invention is not limited to this.
Referring to Fig. 24, the seventh method is implemented by embedding the spatial information distributed over two channels. Specifically, the seventh method is characterized by mixing a method of inserting the spatial information alternately in the two channels in bit-plane order starting from the LSB or MSB with a method of inserting the spatial information alternately in the two channels in sample-plane order.
This method can be carried out by frame unit or by block unit.
The shaded parts 1 to C shown in Fig. 24 correspond to the header and can be inserted in the LSB or MSB in bit-plane order so that the insertion frame sync word is easy to search for.
The other (unshaded) parts, C+1 and higher, correspond to the parts other than the header and can be inserted alternately in the two channels by sample unit so that the spatial information data is easy to extract. The insertion bit size (e.g., the K value) can have the same or a different value for each channel and block. All insertion bit lengths can be included in the header.
Figure 25 is a flowchart of a method of encoding spatial information to be embedded in a downmix signal having at least one channel according to the present invention.
Referring to Figure 25, an audio signal is downmixed from a multi-channel audio signal to one channel (2501, 2502). In addition, spatial information is extracted from the multi-channel audio signal (2501, 2503).
A spatial information bitstream is then generated using the extracted spatial information (2504).
The spatial information bitstream is embedded in the downmix signal having at least one channel (2505). In this case, any one of the seven methods of embedding the spatial information bitstream in at least one channel can be used.
Then, the whole stream, including the downmix signal in which the spatial information bitstream is embedded, is transmitted (2506). In this case, the present invention uses the downmix signal to find the K value and embeds the spatial information bitstream in the K bits.
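Steps 2502 to 2505 can be sketched end to end as follows. Every concrete choice here is a stand-in assumption, since this section does not define them: the downmix is a plain average, the "spatial information" is a toy 4-bit level per channel, and K is chosen by a placeholder peak-level rule. All function names are hypothetical.

```python
def downmix(channels):
    """Average all channels sample by sample (step 2502, simplified)."""
    n = len(channels)
    return [sum(s) // n for s in zip(*channels)]


def channel_levels(channels):
    """Toy 'spatial information': one 4-bit level per channel (step 2503)."""
    peak = max(max(abs(x) for x in ch) for ch in channels) or 1
    return [min(15, 15 * max(abs(x) for x in ch) // peak) for ch in channels]


def to_bits(values, width):
    """Serialise integers MSB first into a flat bit list (step 2504)."""
    return [(v >> (width - 1 - b)) & 1 for v in values for b in range(width)]


def choose_k(signal, max_k=4):
    """Placeholder rule (an assumption): louder frames get a larger K."""
    peak = max(abs(x) for x in signal)
    return max(1, min(max_k, peak.bit_length() - 8))


def embed_lsb(signal, bits, k):
    """Overwrite the k LSBs of successive samples with payload bits (2505)."""
    out = list(signal)
    mask = (1 << k) - 1
    for j in range(0, len(bits), k):
        group = bits[j:j + k]
        group += [0] * (k - len(group))   # zero-pad the last group
        value = 0
        for b in group:
            value = (value << 1) | b
        out[j // k] = (out[j // k] & ~mask) | value
    return out


def encode(channels):
    """Run the simplified pipeline and return (stream, k)."""
    mono = downmix(channels)
    bits = to_bits(channel_levels(channels), 4)
    k = choose_k(mono)
    return embed_lsb(mono, bits, k), k
```

The point of the sketch is the ordering: K is derived from the downmix signal before embedding, so the decoder must be told K (for example in a header) or be able to derive it in the same way.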
Figure 26 is a flowchart of a method of decoding a spatial information bitstream embedded in a downmix signal having at least one channel according to the present invention.
Referring to Figure 26, a spatial decoder receives a bitstream that includes a downmix signal in which a spatial information bitstream is embedded (2601).
The downmix signal is detected from the received bitstream (2602).
The spatial information bitstream embedded in the downmix signal having at least one channel is extracted from the received bitstream and decoded (2603).
Then, the downmix signal is converted to a multi-channel signal using the spatial information obtained by the decoding (2604).
The present invention extracts identification information on the order in which the spatial information bitstream was embedded, and can extract and decode the spatial information bitstream using this identification information.
In addition, the present invention extracts information on the K value from the spatial information bitstream and can decode the spatial information bitstream using this K value.
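The decoder-side use of the K value can be sketched as follows. The header format is an assumption made only for this example: K is taken to be a 4-bit number carried, MSB first, in the LSBs of the first four downmix samples, with the payload in the K LSBs of the samples that follow. `HEADER_BITS`, `read_k` and `extract_spatial_bits` are hypothetical names.

```python
HEADER_BITS = 4  # assumed: K is sent as a 4-bit number in the LSB plane


def read_k(downmix):
    """Read K from the LSBs of the first HEADER_BITS samples, MSB first."""
    k = 0
    for sample in downmix[:HEADER_BITS]:
        k = (k << 1) | (sample & 1)
    return k


def extract_spatial_bits(downmix, n_bits):
    """Return (K, payload bits) taken from the K LSBs after the header."""
    k = read_k(downmix)
    if k == 0:
        raise ValueError("header reports K = 0: nothing is embedded")
    mask = (1 << k) - 1
    bits = []
    for sample in downmix[HEADER_BITS:]:
        value = sample & mask
        # Unpack the k-bit group MSB first.
        bits.extend((value >> (k - 1 - b)) & 1 for b in range(k))
        if len(bits) >= n_bits:
            break
    return k, bits[:n_bits]


# Header LSBs 0, 0, 1, 0 give K = 2; payload samples 11, 4, 6 carry
# the 2-bit groups 11, 00, 10.
stream = [8, 8, 9, 8, 11, 4, 6]
print(extract_spatial_bits(stream, 5))  # (2, [1, 1, 0, 0, 1])
```

A player that ignores the embedded data simply plays `stream` as ordinary PCM, which is the compatibility property the description relies on.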
Industrial Applicability
Accordingly, the present invention provides the following effects or advantages.
First, when a multi-channel audio signal is encoded according to the present invention, the spatial information is embedded in the downmix signal. Hence, the multi-channel audio signal can be stored in and reproduced from a storage medium that has no auxiliary data area (e.g., a stereo CD) or an audio format that has no auxiliary data area.
Second, the spatial information can be embedded in the downmix signal with various frame lengths or with a fixed frame length. The spatial information can also be embedded in a downmix signal having at least one channel. Hence, the present invention improves encoding and decoding efficiency.
While the present invention has been described and illustrated herein with reference to its preferred embodiments, it will be apparent to those skilled in the art that various modifications and variations can be made therein without departing from the spirit and scope of the invention. Thus, the present invention is intended to cover all modifications and variations that fall within the scope of the appended claims and their equivalents.
Claims (23)
1. A method of decoding an audio signal, comprising:
extracting side information embedded in the audio signal, wherein the side information is dispersed over at least one channel of the audio signal; and
decoding the audio signal using the side information.
2. The method of claim 1, wherein the side information is embedded in an insertion area of the audio signal by block unit.
3. The method of claim 2, wherein the side information in the insertion area is embedded in sample-plane order or bit-plane order.
4. The method of claim 3, wherein the side information in the insertion area is embedded starting from the MSB (most significant bit) or the LSB (least significant bit).
5. The method of claim 2, wherein the side information is embedded in the insertion area by alternating channels.
6. The method of claim 1, further comprising extracting sync information of the side information from one channel of the audio signal.
7. The method of claim 1, wherein extracting the side information comprises extracting the side information by sample unit until an insertion frame of the side information ends.
8. The method of claim 8, wherein the side information is repeatedly inserted in the audio signal having at least two channels with the same value or with values of opposite sign.
9. The method of claim 1, wherein a header of the side information is embedded in bit-plane order in the audio signal having at least one channel, and wherein the area other than the header is embedded in sample-plane order.
10. The method of claim 1, further comprising extracting an insertion bit length of the side information from a header of the side information.
11. The method of claim 1, wherein the audio signal comprises a downmix audio signal of a multi-channel signal.
12. The method of claim 1, wherein the side information comprises spatial information of a multi-channel signal.
13. A method of encoding an audio signal, comprising:
generating side information required for decoding the audio signal; and
embedding the side information in the audio signal having at least one channel by dispersing the side information.
14. The method of claim 13, wherein embedding the side information comprises embedding the side information by block unit in the audio signal having a plurality of blocks.
15. The method of claim 14, wherein an insertion bit length of the embedded side information is obtained per block.
16. The method of claim 14, wherein the insertion bit length of the embedded side information is obtained per channel of the audio signal.
17. The method of claim 13, wherein embedding the side information further comprises, if the number of bits required to embed the side information in a frame of the audio signal is less than the number of bits allowed for embedding the side information in the audio signal, replacing the remaining bits of the insertion frame with zeros, a random signal, the original audio signal, a tail sequence, or a combination of at least two of the zeros, the random signal, the original audio signal and the tail sequence.
18. A data structure, comprising:
an audio signal; and
side information required for decoding the audio signal, embedded in a dispersed manner in a non-recognizable component of the audio signal having at least one channel.
19. The data structure of claim 18, wherein the side information is embedded in the audio signal in sample-plane order.
20. The data structure of claim 18, wherein the side information is embedded in the audio signal in bit-plane order.
21. The data structure of claim 18, wherein a header of the side information is embedded in the audio signal in bit-plane order, and wherein the area other than the header is embedded in sample-plane order.
22. An apparatus for encoding an audio signal, comprising:
a side information generating unit for generating side information required for decoding the audio signal; and
an embedding unit for embedding the side information in the audio signal having at least one channel by dispersing the side information.
23. An apparatus for decoding an audio signal, comprising:
an embedded signal decoding unit for extracting side information embedded in a dispersed manner in the audio signal having at least one channel; and
a multi-channel generating unit for decoding the audio signal using the side information.
Applications Claiming Priority (19)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US68457805P | 2005-05-26 | 2005-05-26 | |
US60/684,578 | 2005-05-26 | ||
US75860806P | 2006-01-13 | 2006-01-13 | |
US60/758,608 | 2006-01-13 | ||
US78717206P | 2006-03-30 | 2006-03-30 | |
US60/787,172 | 2006-03-30 | ||
KR1020060030658 | 2006-04-04 | ||
KR1020060030661A KR20060122694A (en) | 2005-05-26 | 2006-04-04 | Method of inserting spatial bitstream in at least two channel down-mix audio signal |
KR10-2006-0030660 | 2006-04-04 | ||
KR1020060030661 | 2006-04-04 | ||
KR10-2006-0030658 | 2006-04-04 | ||
KR1020060030660 | 2006-04-04 | ||
KR1020060030658A KR20060122692A (en) | 2005-05-26 | 2006-04-04 | Method of encoding and decoding down-mix audio signal embeded with spatial bitstream |
KR10-2006-0030661 | 2006-04-04 | ||
KR1020060030660A KR20060122693A (en) | 2005-05-26 | 2006-04-04 | Modulation for insertion length of saptial bitstream into down-mix audio signal |
KR1020060046972 | 2006-05-25 | ||
KR10-2006-0046972 | 2006-05-25 | ||
KR1020060046972A KR20060122734A (en) | 2005-05-26 | 2006-05-25 | Encoding and decoding method of audio signal with selectable transmission method of spatial bitstream |
PCT/KR2006/002020 WO2006126858A2 (en) | 2005-05-26 | 2006-05-26 | Method of encoding and decoding an audio signal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101223579A true CN101223579A (en) | 2008-07-16 |
CN101223579B CN101223579B (en) | 2013-02-06 |
Family
ID=39406062
Family Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2006800263104A Active CN101253550B (en) | 2005-05-26 | 2006-05-26 | Method of encoding and decoding an audio signal |
CN2006800263119A Active CN101258538B (en) | 2005-05-26 | 2006-05-26 | Method of encoding and decoding an audio signal |
CN2006800263123A Active CN101223579B (en) | 2005-05-26 | 2006-05-26 | Method of encoding and decoding an audio signal |
CN200680018078XA Active CN101180674B (en) | 2005-05-26 | 2006-05-26 | Method of encoding and decoding an audio signal |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2006800263104A Active CN101253550B (en) | 2005-05-26 | 2006-05-26 | Method of encoding and decoding an audio signal |
CN2006800263119A Active CN101258538B (en) | 2005-05-26 | 2006-05-26 | Method of encoding and decoding an audio signal |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200680018078XA Active CN101180674B (en) | 2005-05-26 | 2006-05-26 | Method of encoding and decoding an audio signal |
Country Status (1)
Country | Link |
---|---|
CN (4) | CN101253550B (en) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ATE538469T1 (en) | 2008-07-01 | 2012-01-15 | Nokia Corp | APPARATUS AND METHOD FOR ADJUSTING SPATIAL INFORMATION IN A MULTI-CHANNEL AUDIO SIGNAL |
WO2010008200A2 (en) | 2008-07-15 | 2010-01-21 | Lg Electronics Inc. | A method and an apparatus for processing an audio signal |
JP5258967B2 (en) | 2008-07-15 | 2013-08-07 | エルジー エレクトロニクス インコーポレイティド | Audio signal processing method and apparatus |
CN101662688B (en) * | 2008-08-13 | 2012-10-03 | 韩国电子通信研究院 | Method and device for encoding and decoding audio signal |
CN101340191B (en) * | 2008-08-19 | 2013-07-31 | 无锡中星微电子有限公司 | Decoder and decoding method |
US9514768B2 (en) | 2010-08-06 | 2016-12-06 | Samsung Electronics Co., Ltd. | Audio reproducing method, audio reproducing apparatus therefor, and information storage medium |
KR20150032649A (en) * | 2012-07-02 | 2015-03-27 | 소니 주식회사 | Decoding device and method, encoding device and method, and program |
TWI517142B (en) | 2012-07-02 | 2016-01-11 | Sony Corp | Audio decoding apparatus and method, audio coding apparatus and method, and program |
AU2013284705B2 (en) | 2012-07-02 | 2018-11-29 | Sony Corporation | Decoding device and method, encoding device and method, and program |
CN104488026A (en) * | 2012-07-12 | 2015-04-01 | 杜比实验室特许公司 | Embedding data in stereo audio using saturation parameter modulation |
US9445197B2 (en) * | 2013-05-07 | 2016-09-13 | Bose Corporation | Signal processing for a headrest-based audio system |
EP3005353B1 (en) | 2013-05-24 | 2017-08-16 | Dolby International AB | Efficient coding of audio scenes comprising audio objects |
GB2515539A (en) | 2013-06-27 | 2014-12-31 | Samsung Electronics Co Ltd | Data structure for physical layer encapsulation |
US10149086B2 (en) * | 2014-03-28 | 2018-12-04 | Samsung Electronics Co., Ltd. | Method and apparatus for rendering acoustic signal, and computer-readable recording medium |
CN106716525B (en) * | 2014-09-25 | 2020-10-23 | 杜比实验室特许公司 | Sound object insertion in a downmix audio signal |
WO2016204581A1 (en) * | 2015-06-17 | 2016-12-22 | 삼성전자 주식회사 | Method and device for processing internal channels for low complexity format conversion |
CN107782977A (en) * | 2017-08-31 | 2018-03-09 | 苏州知声声学科技有限公司 | Multiple usb data capture card input signal Time delay measurement devices and measuring method |
JP7463278B2 (en) * | 2018-08-30 | 2024-04-08 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
CN109785849B (en) * | 2019-01-17 | 2020-11-27 | 福建歌航电子信息科技有限公司 | Method for inserting unidirectional control information into pcm audio stream based on iis transmission |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
NL8901032A (en) * | 1988-11-10 | 1990-06-01 | Philips Nv | CODER FOR INCLUDING ADDITIONAL INFORMATION IN A DIGITAL AUDIO SIGNAL WITH A PREFERRED FORMAT, A DECODER FOR DERIVING THIS ADDITIONAL INFORMATION FROM THIS DIGITAL SIGNAL, AN APPARATUS FOR RECORDING A DIGITAL SIGNAL ON A CODE OF RECORD. OBTAINED A RECORD CARRIER WITH THIS DEVICE. |
DD289172A5 (en) * | 1988-11-29 | 1991-04-18 | N. V. Philips' Gloeilampenfabrieken,Nl | ARRANGEMENT FOR THE PROCESSING OF INFORMATION AND RECORDING RECEIVED BY THIS ARRANGEMENT |
NL9000338A (en) * | 1989-06-02 | 1991-01-02 | Koninkl Philips Electronics Nv | DIGITAL TRANSMISSION SYSTEM, TRANSMITTER AND RECEIVER FOR USE IN THE TRANSMISSION SYSTEM AND RECORD CARRIED OUT WITH THE TRANSMITTER IN THE FORM OF A RECORDING DEVICE. |
CA2323561C (en) * | 1999-01-13 | 2013-03-26 | Koninklijke Philips Electronics N.V. | Embedding supplemental data in an encoded signal |
CN1129114C (en) * | 1999-03-19 | 2003-11-26 | 索尼公司 | Additional information embedding method and its device, and additional information decoding method and its decoding device |
WO2003034627A1 (en) * | 2001-10-17 | 2003-04-24 | Koninklijke Philips Electronics N.V. | System for encoding auxiliary information within a signal |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107077861A (en) * | 2014-10-01 | 2017-08-18 | 杜比国际公司 | Audio coder and decoder |
CN107077861B (en) * | 2014-10-01 | 2020-12-18 | 杜比国际公司 | Audio encoder and decoder |
Also Published As
Publication number | Publication date |
---|---|
CN101180674B (en) | 2012-01-04 |
CN101180674A (en) | 2008-05-14 |
CN101253550B (en) | 2013-03-27 |
CN101223579B (en) | 2013-02-06 |
CN101258538B (en) | 2013-06-12 |
CN101253550A (en) | 2008-08-27 |
CN101258538A (en) | 2008-09-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101223579B (en) | Method of encoding and decoding an audio signal | |
US8214220B2 (en) | Method and apparatus for embedding spatial information and reproducing embedded signal for an audio signal | |
CN101542596B (en) | For the method and apparatus of the object-based audio signal of Code And Decode | |
NO20180980A1 (en) | Compatible multichannel encoding / decoding | |
CN101151659A (en) | Scalable multi-channel audio coding | |
CN1930914B (en) | Frequency-based coding of audio channels in parametric multi-channel coding systems | |
CN101911181A (en) | The method and apparatus that is used for audio signal | |
CN105637582A (en) | Audio encoding device and audio decoding device | |
RU2008132156A (en) | PERSONALIZED DECODING OF MULTI-CHANNEL VOLUME SOUND | |
CN101292428A (en) | Method and apparatus for encoding/decoding | |
US20200388291A1 (en) | Audio encoding method, to which brir/rir parameterization is applied, and method and device for reproducing audio by using parameterized brir/rir information | |
KR20060122694A (en) | Method of inserting spatial bitstream in at least two channel down-mix audio signal | |
TWI501220B (en) | Embedding and extracting ancillary data | |
WO2023173941A1 (en) | Multi-channel signal encoding and decoding methods, encoding and decoding devices, and terminal device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |