CN103733256A - Audio signal processing method, audio encoding apparatus, audio decoding apparatus, and terminal adopting the same - Google Patents


Info

Publication number: CN103733256A
Application number: CN201280038627.5A
Authority: CN (China)
Other languages: Chinese (zh)
Inventor: 李男淑
Original assignee: Samsung Electronics Co Ltd
Current assignee: Samsung Electronics Co Ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application filed by Samsung Electronics Co Ltd
Publication of CN103733256A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S 3/00: Systems employing more than two channels, e.g. quadraphonic
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 19/00: Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L 19/008: Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing

Abstract

An audio signal processing method includes the steps of: when a first plurality of input channels are down-mixed to a second plurality of output channels, comparing locations of the first plurality of input channels with locations of the second plurality of output channels; down-mixing channels, from among the first plurality of input channels, that have the same locations as the second plurality of output channels to the channels at those locations from among the second plurality of output channels; searching for at least one adjacent channel with respect to each of the remaining channels from among the first plurality of input channels; determining a weight for each found adjacent channel by considering at least one of a distance between channels, a signal correlation, and an error in restoration; and down-mixing each of the remaining channels of the first plurality of input channels to the adjacent channel on the basis of the determined weight.

Description

Audio signal processing method, audio encoding apparatus, audio decoding apparatus, and terminal adopting the same
Technical field
Apparatuses and methods consistent with exemplary embodiments relate to audio encoding and decoding, and more particularly, to an audio signal processing method capable of minimizing deterioration of sound quality when a multi-channel audio signal is restored, an audio encoding apparatus, an audio decoding apparatus, and a terminal adopting the method.
Background Art
Recently, along with the spread of multimedia content, user demand for experiencing relatively realistic and rich sound-source environments has increased. To meet this demand, research into multi-channel audio has been actively carried out.
A multi-channel audio signal requires an efficient data compression rate according to a transmission environment. In particular, spatial parameters are used to restore the multi-channel audio signal. In the process of extracting the spatial parameters, distortion may occur due to the influence of a reverberation signal. As a result, when the multi-channel audio signal is restored, deterioration of sound quality may occur.
Therefore, a multi-channel audio codec technique capable of reducing or removing the deterioration of sound quality that may occur when a multi-channel audio signal is restored using spatial parameters is needed.
Summary of the invention
Technical Problem
Aspects of one or more exemplary embodiments provide an audio signal processing method capable of minimizing deterioration of sound quality when a multi-channel audio signal is restored, an audio encoding apparatus, an audio decoding apparatus, and a terminal adopting the method.
Solution
According to an aspect of one or more exemplary embodiments, there is provided an audio signal processing method including: when a first plurality of input channels are down-mixed to a second plurality of output channels, comparing the locations of the first plurality of input channels with the locations of the second plurality of output channels; down-mixing channels, from among the first plurality of input channels, having the same locations as those of the second plurality of output channels to the channels at the same locations from among the second plurality of output channels; searching for at least one adjacent channel with respect to each of the remaining channels from among the first plurality of input channels; determining a weighting factor for each found adjacent channel by considering at least one of a distance between channels, a correlation between signals, and an error in restoration; and down-mixing each of the remaining channels of the first plurality of input channels to the adjacent channel based on the determined weighting factor.
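The steps of the method above can be sketched as follows. This is a minimal illustration, not the patented implementation; the channel names, the angular positions, and the inverse-distance weighting over the two nearest output channels are assumptions chosen for the example (the method allows distance, correlation, or restoration error as criteria):

```python
def down_mix(inputs, input_pos, output_pos):
    """inputs: {channel: list of samples}; positions are angles in degrees.
    Channels co-located with an output channel are assigned directly;
    the remaining channels are distributed to the two nearest output
    channels, weighted by inverse angular distance."""
    n = len(next(iter(inputs.values())))
    out = {name: [0.0] * n for name in output_pos}
    by_pos = {p: name for name, p in output_pos.items()}
    for name, sig in inputs.items():
        pos = input_pos[name]
        if pos in by_pos:
            # identical location: down-mix directly to the co-located output channel
            tgt = by_pos[pos]
            out[tgt] = [a + s for a, s in zip(out[tgt], sig)]
        else:
            # remaining channel: search adjacent channels, weight, and distribute
            dists = sorted((abs(p - pos), o) for o, p in output_pos.items())[:2]
            inv = [1.0 / max(d, 1e-9) for d, _ in dists]
            total = sum(inv)
            for (d, o), w in zip(dists, (i / total for i in inv)):
                out[o] = [a + w * s for a, s in zip(out[o], sig)]
    return out

# Example: inputs at 0, 30, and 110 degrees down-mixed to outputs at 30 and 110
inputs = {"C": [1.0, 1.0], "L": [2.0, 0.0], "Ls": [0.0, 4.0]}
mix = down_mix(inputs, {"C": 0, "L": 30, "Ls": 110}, {"FL": 30, "RL": 110})
```

Here "L" and "Ls" land directly on "FL" and "RL", while "C" is spread over both with weights 11/14 and 3/14; the per-sample weights sum to 1, so no signal energy is dropped.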
Accompanying drawing explanation
Fig. 1 is a block diagram of an audio signal processing apparatus according to an exemplary embodiment;
Fig. 2 is a block diagram of an audio encoding apparatus according to an exemplary embodiment;
Fig. 3 is a block diagram of an audio decoding apparatus according to an exemplary embodiment;
Fig. 4 illustrates channel matching between a 10.2-channel audio signal and a 5.1-channel audio signal according to an exemplary embodiment;
Fig. 5 is a flowchart of a down-mixing method according to an exemplary embodiment;
Fig. 6 is a flowchart of an up-mixing method according to an exemplary embodiment;
Fig. 7 is a block diagram of a spatial parameter encoding apparatus according to an exemplary embodiment;
Figs. 8A and 8B illustrate variable quantization steps according to energy values in the frequency bands of each frame of a down-mixed channel;
Fig. 9 is a graph illustrating an energy distribution according to frequency bands for the spectral data of all channels;
Figs. 10A to 10C are graphs illustrating a total bit rate adjusted by changing a threshold;
Fig. 11 is a flowchart of a method of generating spatial parameters according to an exemplary embodiment;
Fig. 12 is a flowchart of a method of generating spatial parameters according to another exemplary embodiment;
Fig. 13 is a flowchart of an audio signal processing method according to an exemplary embodiment;
Figs. 14A to 14C illustrate an example for describing operation 1110 of Fig. 11 or operation 1330 of Fig. 13;
Fig. 15 illustrates another example for describing operation 1110 of Fig. 11 or operation 1330 of Fig. 13;
Figs. 16A to 16D illustrate another example for describing operation 1110 of Fig. 11 or operation 1330 of Fig. 13;
Fig. 17 is a graph illustrating a summation of angle parameters;
Fig. 18 is a diagram for describing calculation of angle parameters according to an exemplary embodiment;
Fig. 19 is a block diagram of an audio signal processing apparatus in which a multi-channel codec and a core codec are integrated, according to an exemplary embodiment;
Fig. 20 is a block diagram of an audio encoding apparatus according to an exemplary embodiment;
Fig. 21 is a block diagram of an audio decoding apparatus according to an exemplary embodiment.
Embodiment
The present invention may allow various changes or modifications and various changes in form, and specific embodiments will be illustrated in the drawings and described in detail in the specification. However, it should be understood that the specific embodiments do not limit the present invention to a specific disclosed form but include every modified, equivalent, or replaced embodiment within the spirit and technical scope of the present invention. In the following description, well-known functions or constructions are not described in detail so as not to obscure the invention with unnecessary detail.
Although terms such as "first" and "second" may be used to describe various elements, the elements are not limited by the terms. The terms may be used to distinguish a certain element from another element.
The terminology used in the present invention is used only to describe specific embodiments and does not have any intention to limit the present invention. Although general terms as currently and widely used as possible are selected as the terms used in the present invention while considering their functions, they may vary according to the intention of those of ordinary skill in the art, judicial precedents, or the appearance of new technology. In addition, in specific cases, terms intentionally selected by the applicant may be used, and in this case, the meaning of the terms will be disclosed in the corresponding description of the invention. Accordingly, the terms used in the present invention should be defined not by their simple names but by the meaning of the terms and the content throughout the present invention.
Unless the context clearly indicates otherwise, singular forms are intended to include plural forms as well. In the present invention, it should be understood that terms such as "include" and "have" are used to indicate the existence of implemented features, numbers, steps, operations, elements, parts, or combinations thereof, and do not preclude in advance the possibility of the existence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.
The present invention will now be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown. Like reference numerals in the drawings denote like elements, and thus their repetitive description will be omitted.
Fig. 1 is a block diagram of an audio signal processing apparatus 100 according to an exemplary embodiment. The audio signal processing apparatus 100 corresponds to a multimedia device and may include a voice-communication dedicated terminal such as a telephone or a mobile phone, a broadcasting- or music-dedicated terminal such as a TV or an MP3 player, or a hybrid of a voice-communication dedicated terminal and a broadcasting- or music-dedicated terminal, but is not limited thereto. The audio signal processing apparatus 100 may be used as a client, a server, or a transducer disposed between a client and a server.
Referring to Fig. 1, the audio signal processing apparatus 100 includes an encoding apparatus 110 and a decoding apparatus 120. According to an exemplary embodiment, the audio signal processing apparatus 100 may include both the encoding apparatus 110 and the decoding apparatus 120; according to another exemplary embodiment, the audio signal processing apparatus 100 may include either the encoding apparatus 110 or the decoding apparatus 120.
The encoding apparatus 110 receives an original signal formed using a plurality of channels, i.e., a multi-channel audio signal, and generates a down-mixed audio signal by down-mixing the original signal. The encoding apparatus 110 generates prediction parameters and encodes the prediction parameters. A prediction parameter is applied to restore the original signal from the down-mixed audio signal. In detail, a prediction parameter is a value associated with a down-mixing matrix used to down-mix the original signal, such as each coefficient value included in the down-mixing matrix. For example, the prediction parameters may include spatial parameters. The prediction parameters may vary according to a product specification, a design specification, or the like of the encoding apparatus 110 or the decoding apparatus 120 and may be set to values optimized through experiments. Here, a channel may indicate a loudspeaker.
The decoding apparatus 120 generates a restored signal corresponding to the original signal, i.e., the multi-channel audio signal, by up-mixing the down-mixed audio signal using the prediction parameters.
Fig. 2 is a block diagram of an audio encoding apparatus 200 according to an exemplary embodiment.
Referring to Fig. 2, the audio encoding apparatus 200 may include a down-mixing unit 210, a side information generation unit 220, and an encoding unit 230. These components may be integrated into at least one module and implemented as at least one processor (not shown).
The down-mixing unit 210 receives an N-channel audio signal and down-mixes the received N-channel audio signal. The down-mixing unit 210 may generate a mono audio signal or an M-channel audio signal (M < N) by down-mixing the N-channel audio signal. For example, the down-mixing unit 210 may generate a three-channel audio signal or a six-channel audio signal, corresponding to a 2.1-channel audio signal or a 5.1-channel audio signal, by down-mixing a 10.2-channel audio signal.
According to an exemplary embodiment, the down-mixing unit 210 generates a first mono channel by selecting two channels from among the N channels and down-mixing the two selected channels, and generates a second mono channel by down-mixing the generated first mono channel and another channel. A final mono audio signal or M-channel audio signal can be generated by repeating this process of down-mixing a mono channel generated as a down-mixing result with another channel.
To minimize entropy when the N-channel audio signal is down-mixed, it is preferable to down-mix similar channels. Therefore, the down-mixing unit 210 can down-mix the multi-channel audio signal with a relatively high compression rate by down-mixing channels having a high inter-channel correlation.
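Selecting the most similar pair of channels to down-mix first can be sketched with a normalized zero-lag cross-correlation; this is one illustrative similarity measure, not necessarily the one used in the embodiment, and the function names are assumptions:

```python
def correlation(x, y):
    """Normalized cross-correlation at lag 0 between two channel signals."""
    num = sum(a * b for a, b in zip(x, y))
    den = (sum(a * a for a in x) * sum(b * b for b in y)) ** 0.5
    return num / den if den else 0.0

def most_correlated_pair(channels):
    """Return the pair of channel names with the highest correlation,
    i.e. the pair a correlation-ordered down-mix would combine first."""
    names = list(channels)
    best = max(((correlation(channels[a], channels[b]), a, b)
                for i, a in enumerate(names) for b in names[i + 1:]),
               key=lambda t: t[0])
    return best[1], best[2]

channels = {"L": [1.0, 2.0, 3.0], "R": [1.0, 2.0, 3.1], "C": [3.0, -1.0, 0.0]}
pair = most_correlated_pair(channels)  # "L" and "R" are nearly identical
```

Combining the most correlated pair first is what makes the iterative down-mix compress well: the mono sum of two similar channels carries almost all of their joint information.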
The side information generation unit 220 generates the side information required to restore the multiple channels from the down-mixed channels. Each time the down-mixing unit 210 sequentially performs a down-mix, the side information generation unit 220 generates the side information required to restore the multiple channels from the down-mixed channel. At this time, the side information generation unit 220 may generate information for determining the intensities of the two down-mixed channels and information for determining the phases of the two channels.
In addition, when down-mixing is performed, the side information generation unit 220 generates information indicating which channels are down-mixed. When channels are down-mixed in an order based on correlation calculation rather than in a fixed order, the side information generation unit 220 may generate the down-mixing order of the channels as side information.
Whenever down-mixing is performed, the side information generation unit 220 repeatedly generates the information required to restore a down-mixed channel to mono channels. For example, if a mono channel is generated by sequentially down-mixing 12 channels 11 times, the information about the down-mixing order, the information for determining the intensities of channels, and the information for determining the phases of channels are each generated 11 times. According to an exemplary embodiment, when the information for determining the intensities of channels and the information for determining the phases of channels are generated for each of a plurality of frequency bands, if the number of frequency bands is k, 11 × k pieces of information for determining the intensities of channels and 11 × k pieces of information for determining the phases of channels may be generated.
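The parameter bookkeeping in the preceding example (12 channels, 11 sequential down-mixes, k frequency bands) can be checked with a short sketch; the function and key names are illustrative:

```python
def side_info_counts(num_channels, num_bands):
    """For sequentially down-mixing `num_channels` channels to mono,
    (num_channels - 1) pairwise down-mixes are performed; intensity and
    phase information are produced once per down-mix and per band."""
    steps = num_channels - 1
    return {
        "order_entries": steps,                 # down-mixing order, once per step
        "intensity_entries": steps * num_bands, # 11 x k in the text's example
        "phase_entries": steps * num_bands,
    }

counts = side_info_counts(12, 20)  # 12 channels, k = 20 bands
```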
The encoding unit 230 may encode the mono audio signal or the M-channel audio signal generated by down-mixing in the down-mixing unit 210. If the audio signal output from the down-mixing unit 210 is an analog signal, the analog signal is converted to a digital signal, and the digital signal is encoded according to a predetermined algorithm. The encoding algorithm is not limited, and every algorithm capable of generating a bitstream by encoding an audio signal can be used by the encoding unit 230. In addition, the encoding unit 230 may encode the side information, generated by the side information generation unit 220, for restoring the multi-channel audio signal from the mono audio signal.
Fig. 3 is a block diagram of an audio decoding apparatus 300 according to an exemplary embodiment.
Referring to Fig. 3, the audio decoding apparatus 300 may include an extraction unit 310, a decoding unit 320, and an up-mixing unit 330. These components may be integrated into at least one module and implemented as at least one processor (not shown).
The extraction unit 310 extracts encoded audio and encoded side information from received audio data, i.e., a bitstream. The encoded audio may be generated by down-mixing N channels to a mono channel or M channels (M < N) and encoding the down-mixed audio signal according to a predetermined algorithm.
The decoding unit 320 decodes the encoded audio and the encoded side information extracted by the extraction unit 310. In this case, the decoding unit 320 decodes the encoded audio and the encoded side information by using the same algorithm as was used for the encoding. As a result of the audio decoding, the mono audio signal or the M-channel audio signal is restored.
The up-mixing unit 330 restores the N-channel audio signal that existed before the down-mixing by up-mixing the audio signal decoded by the decoding unit 320. At this time, the up-mixing unit 330 restores the N-channel audio signal based on the side information decoded by the decoding unit 320.
That is, the up-mixing unit 330 up-mixes the down-mixed audio signal to the multi-channel audio signal by performing the down-mixing process in reverse with reference to the side information, i.e., the spatial parameters. At this time, channels are sequentially separated from the mono channel with reference to the side information including the information about the down-mixing order of the channels. The intensities and phases of the down-mixed channels can be determined according to the information for determining the intensities and phases of the down-mixed channels, and the channels can thereby be sequentially separated from the mono channel.
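A toy version of one such down-mix/up-mix round trip, with a single intensity parameter and no phase handling, can look as follows. This is a sketch of the general idea only; the actual parameters of the embodiment are not specified here, and the re-split is exact only when the two original channels are proportional:

```python
import math

def down_mix_pair(x, y):
    """Down-mix two channels and record an intensity parameter describing
    how the pair's energy is split between the two channels."""
    mono = [a + b for a, b in zip(x, y)]
    ex = sum(a * a for a in x)
    ey = sum(b * b for b in y)
    k = math.sqrt(ex / (ex + ey)) if (ex + ey) else 0.0
    return mono, k

def up_mix_pair(mono, k):
    """Reverse the down-mix using the intensity parameter: split the mono
    signal so the restored channels keep the recorded intensity relation."""
    k2 = math.sqrt(max(0.0, 1.0 - k * k))
    gx, gy = k / (k + k2), k2 / (k + k2)
    return [gx * s for s in mono], [gy * s for s in mono]

# Two proportional channels: the round trip restores them exactly
mono, k = down_mix_pair([2.0, 4.0], [1.0, 2.0])
x_rest, y_rest = up_mix_pair(mono, k)
```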
Fig. 4 illustrates channel matching between a 10.2-channel audio signal 410 and a 5.1-channel audio signal 420 according to an exemplary embodiment.
When the input multi-channel audio signal is a 10.2-channel audio signal, a multi-channel audio signal having a smaller number of channels than 10.2 channels, such as a 7.1-channel audio signal, a 5.1-channel audio signal, or a 2.0-channel audio signal, can be obtained by down-mixing and used as an output multi-channel audio signal.
As shown in Fig. 4, when the 10.2-channel audio signal 410 is down-mixed to the 5.1-channel audio signal 420, if the FL channel and the RL channel among the 5.1 channels are determined to be channels adjacent to the LW channel among the 10.2 channels, weighting factors for the FL channel and the RL channel may be determined by considering positions, correlations, or errors in restoration. According to an exemplary embodiment, if the weighting factor of the FL channel is determined to be 0 and the weighting factor of the RL channel is determined to be 1, the channel signal of the LW channel among the 10.2 channels may be down-mixed to the RL channel among the 5.1 channels.
In addition, the L channel and the Ls channel among the 10.2 channels may be respectively assigned to the FL channel and the RL channel, which are at the same positions, among the 5.1 channels.
Fig. 5 is a flowchart of a down-mixing method according to an exemplary embodiment.
Referring to Fig. 5, in operation 510, the number and positions of the input channels are checked from first layout information. For example, the first layout information is IC(1), IC(2), ..., IC(N), and the positions of the N input channels can be checked from the first layout information.
In operation 520, the number and positions of the down-mixed channels, i.e., the output channels, are checked from second layout information. For example, the second layout information is DC(1), DC(2), ..., DC(M), and the positions of the M output channels (M < N) can be checked from the second layout information.
In operation 530, it is determined, starting from the first channel IC(1) of the input channels, whether a channel having the same output position exists among the input channels and the output channels.
In operation 540, if a channel having the same output position exists among the input channels and the output channels, the channel signal of the corresponding input channel is assigned to the output channel at the same position. For example, if the output position of an input channel IC(n) is identical to that of an output channel DC(m), DC(m) can be updated as DC(m) = DC(m) + IC(n).
In operation 550, if a channel having the same output position does not exist among the input channels and the output channels, it is determined, starting from the first channel IC(1) of the input channels, whether channels adjacent to the input channel IC(n) exist among the output channels.
In operation 560, if it is determined in operation 550 that a plurality of adjacent channels exist, the channel signal of the input channel IC(n) is distributed to each of the plurality of adjacent channels by using a predetermined weighting factor corresponding to each of the plurality of adjacent channels. For example, if it is determined that DC(i), DC(j), and DC(k) of the output channels are channels adjacent to the input channel IC(n), weighting factors w_i, w_j, and w_k can be set for the input channel IC(n) and the output channel DC(i), the input channel IC(n) and the output channel DC(j), and the input channel IC(n) and the output channel DC(k), respectively. The channel signal of the input channel IC(n) can be distributed by using the set weighting factors w_i, w_j, and w_k as DC(i) = DC(i) + w_i × IC(n), DC(j) = DC(j) + w_j × IC(n), and DC(k) = DC(k) + w_k × IC(n).
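The distribution formulas of operation 560 can be written directly as code; the channel names and the weight values below are purely illustrative:

```python
def distribute(ic_n, adjacent_weights, dc):
    """Operation 560 as a formula: DC(i) = DC(i) + w_i * IC(n) for each
    adjacent output channel i. `dc` maps output-channel names to samples."""
    for name, w in adjacent_weights.items():
        dc[name] = [d + w * s for d, s in zip(dc[name], ic_n)]
    return dc

# IC(n) = [2.0, 2.0] distributed to three adjacent output channels
dc = {"DCi": [1.0, 0.0], "DCj": [0.0, 0.0], "DCk": [0.0, 1.0]}
dc = distribute([2.0, 2.0], {"DCi": 0.5, "DCj": 0.3, "DCk": 0.2}, dc)
```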
The weighting factors can be set by the following methods.
According to an exemplary embodiment, the weighting factors can be determined according to relations between the plurality of adjacent channels and the input channel IC(n). As the relations between the plurality of adjacent channels and the input channel IC(n), at least one of the distance between each of the plurality of adjacent channels and the input channel IC(n), the correlation between the channel signal of each of the plurality of adjacent channels and the channel signal of the input channel IC(n), and the errors in restoration of the plurality of adjacent channels can be applied.
According to another exemplary embodiment, a weighting factor can be determined to be 0 or 1 according to the relations between the plurality of adjacent channels and the input channel IC(n). For example, the weighting factor of the adjacent channel closest to the input channel IC(n) among the plurality of adjacent channels may be determined to be 1, and the weighting factors of the remaining adjacent channels may be determined to be 0. Alternatively, the weighting factor of the adjacent channel whose channel signal has the highest correlation with the channel signal of the input channel IC(n) among the plurality of adjacent channels may be determined to be 1, and the weighting factors of the remaining adjacent channels may be determined to be 0. Alternatively, the weighting factor of the adjacent channel having the least error in restoration among the plurality of adjacent channels may be determined to be 1, and the weighting factors of the remaining adjacent channels may be determined to be 0.
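The 0-or-1 weighting described above can be sketched with a hypothetical helper that gives the best-scoring adjacent channel weight 1 and all others weight 0; correlation is used as the example score (distance or restoration error would use the lowest-score variant):

```python
def binary_weights(candidates, scores, prefer_high=True):
    """Give the best-scoring adjacent channel a weighting factor of 1 and
    all other candidates 0. prefer_high=True suits correlation scores;
    prefer_high=False suits distance or restoration-error scores."""
    pick = (max if prefer_high else min)(candidates, key=lambda c: scores[c])
    return {c: 1 if c == pick else 0 for c in candidates}

# Correlation-based: RL correlates best with the input channel, so RL gets 1
w = binary_weights(["FL", "RL"], {"FL": 0.2, "RL": 0.9})
```

With these weights, the whole input-channel signal is routed to a single adjacent output channel, as in the Fig. 4 example where the LW channel is down-mixed entirely to the RL channel.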
In operation 570, it is determined whether all the input channels have been checked, and if not all the input channels have been checked, the method proceeds back to operation 530 to repeat operations 530 to 560.
In operation 580, if all the input channels have been checked, the configuration information of the down-mixed channels having the signals distributed in operations 540 and 560, together with the corresponding spatial parameters, is finally generated.
The down-mixing method according to an exemplary embodiment can be performed in units of a channel, a frame, a frequency band, or a spectrum, and therefore the accuracy of the performance can be adjusted according to the environment. Here, a frequency band is a unit in which the samples of an audio spectrum are grouped, and may have a uniform or non-uniform length reflecting the critical bands. In the non-uniform case, a frame may be set so that the number of samples included in each frequency band gradually increases from the first sample to the last sample. If a plurality of bit rates are supported, the number of samples included in each frequency band corresponding to the different bit rates can be set to be identical. The number of samples included in one frame or one frequency band can be predetermined.
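Non-uniform band grouping in which the sample count grows toward the last band can be sketched as follows; the geometric growth factor is an assumption made for illustration, not the grouping rule of the embodiment:

```python
def band_boundaries(num_samples, num_bands, growth=1.5):
    """Split a frame's spectrum into bands whose sample counts increase
    toward high frequencies, mimicking critical-band-like grouping."""
    raw = [growth ** i for i in range(num_bands)]
    scale = num_samples / sum(raw)
    sizes = [max(1, round(r * scale)) for r in raw]
    sizes[-1] += num_samples - sum(sizes)  # absorb rounding in the last band
    bounds, start = [], 0
    for s in sizes:
        bounds.append((start, start + s))
        start += s
    return bounds

bands = band_boundaries(64, 5)  # e.g. a 64-sample frame split into 5 bands
```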
In the down-mixing method according to an exemplary embodiment, the weighting factors for channel down-mixing can be determined in correspondence with the layout of the channels to be down-mixed and the layout of the input channels. Therefore, the down-mixing method adaptively handles various layouts, and the weighting factors can be determined by considering the positions of the channels, the correlations between the channel signals, or the errors in restoration, thereby improving sound quality. In addition, since the down-mixed channels are configured considering the positions of the channels, the correlations between the channel signals, or the errors in restoration, if an audio decoding apparatus has the same number of channels as the down-mixed channels, even when a user only hears the down-mixed channels without a separate up-mixing process, the user cannot perceive a subjective deterioration of sound quality.
Fig. 6 is a flowchart of an up-mixing method according to an exemplary embodiment.
Referring to Fig. 6, in operation 610, the configuration information of the down-mixed channels generated by the process shown in Fig. 5, and the corresponding spatial parameters, are received.
In operation 620, the input-channel audio signal is restored by up-mixing the down-mixed channels using the configuration information of the down-mixed channels and the corresponding spatial parameters received in operation 610.
Fig. 7 is a block diagram of a spatial parameter encoding apparatus 700, which may be included in the encoding unit 230 of Fig. 2, according to an exemplary embodiment.
Referring to Fig. 7, the spatial parameter encoding apparatus 700 may include an energy calculation unit 710, a quantization step determination unit 720, a quantization unit 730, and a multiplexing unit 740. The shown components may be integrated into at least one module and implemented as at least one processor (not shown).
The energy calculation unit 710 receives a down-mixed channel signal provided from the down-mixing unit 210 of Fig. 2 and calculates energy values in units of a channel, a frame, a frequency band, or a spectrum. Here, an example of the energy value may be a norm value.
The quantization step determination unit 720 determines a quantization step by using the energy values, calculated in units of a channel, a frame, a frequency band, or a spectrum, provided from the energy calculation unit 710. For example, for a channel, frame, frequency band, or spectrum having a large energy value, the quantization step may be small, and for a channel, frame, frequency band, or spectrum having a small energy value, the quantization step may be large. In this case, two quantization steps may be set, and one of the two quantization steps may be selected according to a result of comparing an energy value with a predetermined threshold. When the quantization steps are adaptively allocated in correspondence with the distribution of energy values, a quantization step matching the distribution of energy values can be selected. Accordingly, the bits used for quantization can be allocated based on perceptual importance, thereby improving sound quality. According to an exemplary embodiment, the total bit rate can be adjusted by variably changing the threshold frequency while maintaining the weighting factors allocated according to the energy distribution of each down-mixed channel.
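The two-step selection described above can be sketched as follows; the step values and the threshold are illustrative, not values from the embodiment:

```python
def choose_steps(band_energies, threshold, small_step=0.01, large_step=0.05):
    """Two quantization steps: bands with energy at or above the threshold
    get the small (fine) step, the rest get the large (coarse) step."""
    return [small_step if e >= threshold else large_step for e in band_energies]

def quantize(value, step):
    """Uniform quantization of a spatial parameter to the chosen step."""
    return round(value / step) * step

# High-energy bands (10.0, 7.5) get the fine step; low-energy bands do not
steps = choose_steps([10.0, 0.3, 7.5, 0.1], threshold=1.0)
```

Raising the threshold would shrink the set of finely quantized bands and thus lower the total bit rate, which is the mechanism Figs. 10A to 10C illustrate.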
The quantization and lossless encoding unit 730 quantizes the spatial parameters in units of a channel, a frame, a frequency band, or a spectrum by using the quantization step determined by the quantization step determination unit 720, and losslessly encodes the quantized spatial parameters.
The multiplexing unit 740 generates a bitstream by multiplexing the losslessly encoded spatial parameters and the losslessly encoded down-mixed audio signal.
Fig. 8 a and Fig. 8 b illustrate according to the variable quantization step-length of the energy value in the frequency band of each frame of the mixed sound channel of contracting, and wherein, sound channel 1 and sound channel 2 are mixed by contracting, and sound channel 3 and sound channel 4 are mixed by contracting.In Fig. 8 a and Fig. 8 b, d0 represents the energy value of the mixed sound channel of the contracting of sound channel 1 and sound channel 2, and d1 represents the energy value of the mixed sound channel of the contracting of sound channel 3 and sound channel 4.
Fig. 8 a and Fig. 8 b indication are provided with two quantization steps, and dash area is corresponding with the frequency band with the energy value that is equal to or greater than predetermined threshold, therefore, small quantization step is provided for to dash area.
Fig. 9 is a graph illustrating the energy distribution per frequency band of the spectral data of all channels, and Figs. 10a to 10c are graphs illustrating the total bitrate adjusted by changing the threshold frequency in consideration of the energy distribution, in a case where weighting factors are allocated according to the energy value of each channel.
Fig. 10a illustrates an example in which, based on an initial threshold frequency 100a, the small quantization step is set for the left-hand portions (that is, the low-frequency regions 110a, 120a, and 130a below the threshold frequency 100a) and the large quantization step is set for the right-hand portions (that is, the high-frequency regions 110b, 120b, and 130b above the initial threshold frequency 100a). Fig. 10b illustrates an example in which a threshold frequency 100b higher than the initial threshold frequency 100a enlarges the regions 140a, 150a, and 160a for which the small quantization step is set, thereby increasing the total bitrate. Fig. 10c illustrates an example in which a threshold frequency 100c lower than the initial threshold frequency 100a reduces the regions 170a, 180a, and 190a for which the small quantization step is set, thereby decreasing the total bitrate.
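The bitrate trade-off of Figs. 10a to 10c can be sketched numerically. This is a rough illustration under assumed per-band bit costs (the costs and band center frequencies below are not taken from the embodiment): bands below the threshold frequency use the small step and therefore cost more bits, bands above it use the large step and cost fewer.

```python
# Moving the threshold frequency changes how many bands are finely
# quantized, and hence the total bitrate.

BITS_FINE = 8     # assumed bit cost of a band using the small quantization step
BITS_COARSE = 3   # assumed bit cost of a band using the large quantization step

def total_bits(band_center_freqs, threshold_freq):
    return sum(BITS_FINE if f < threshold_freq else BITS_COARSE
               for f in band_center_freqs)

freqs = [100, 300, 900, 2700, 8100]   # Hz, illustrative band centers
base = total_bits(freqs, 1000)        # initial threshold (as in Fig. 10a)
more = total_bits(freqs, 5000)        # raised threshold -> higher total bitrate (Fig. 10b)
less = total_bits(freqs, 500)         # lowered threshold -> lower total bitrate (Fig. 10c)
```

Raising the threshold enlarges the finely quantized region and increases the total bitrate; lowering it does the opposite, which matches the behavior described for Figs. 10b and 10c.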
Fig. 11 is a flowchart of a method of generating spatial parameters that may be performed by the encoding apparatus 200 of Fig. 2, according to an exemplary embodiment.
Referring to Fig. 11, in operation 1110, N angle parameters are generated.
In operation 1120, (N-1) angle parameters among the N angle parameters are independently encoded.
In operation 1130, the one remaining angle parameter is predicted from the (N-1) angle parameters.
In operation 1140, residual coding is performed on the predicted angle parameter to generate a residual of the one remaining angle parameter.
Fig. 12 is a flowchart of a method of generating a spatial parameter that may be performed by the decoding apparatus 300 of Fig. 3, according to another exemplary embodiment.
Referring to Fig. 12, in operation 1210, (N-1) angle parameters among N angle parameters are received.
In operation 1220, the one remaining angle parameter is predicted from the (N-1) angle parameters.
In operation 1230, the one remaining angle parameter is generated by compensating the predicted angle parameter with a residual.
Fig. 13 is a flowchart of an audio signal processing method according to an exemplary embodiment.
Referring to Fig. 13, in operation 1310, first to n-th channel signals ch1 to chn, which constitute a multi-channel signal, are downmixed. In detail, the first to n-th channel signals ch1 to chn may be downmixed into one mono signal DM. Operation 1310 may be performed by the downmixing unit 210.
In operation 1320, (n-1) channel signals among the first to n-th channel signals ch1 to chn are added, or the first to n-th channel signals ch1 to chn are added. In detail, the channel signals other than a reference channel signal among the first to n-th channel signals ch1 to chn may be added, and the added signal becomes a first addition signal. Alternatively, the first to n-th channel signals ch1 to chn may be added, and the added signal becomes a second addition signal.
In operation 1330, a first spatial parameter may be generated by using a correlation between the reference channel signal and the first addition signal, which is the signal generated in operation 1320. Alternatively, in operation 1330, instead of generating the first spatial parameter, a second spatial parameter may be generated by using a correlation between the reference channel signal and the second addition signal, which is the signal generated in operation 1320.
The reference channel signal may be each of the first to n-th channel signals ch1 to chn. Therefore, the number of reference channel signals may be n, and n spatial parameters corresponding to the n reference channel signals may be generated.
Accordingly, operation 1330 may also include generating first to n-th spatial parameters by setting each of the first to n-th channel signals ch1 to chn as the reference channel signal.
Operations 1320 and 1330 may be performed by the downmixing unit 210.
In operation 1340, the spatial parameter SP generated in operation 1330 is encoded and transmitted to the decoding apparatus 300 (see Fig. 3). In addition, the mono signal DM generated in operation 1310 is encoded and transmitted to the decoding apparatus 300 (see Fig. 3). In detail, the encoded spatial parameter SP and the encoded mono signal DM may be included in a transport stream TS and transmitted to the decoding apparatus 300 (see Fig. 3). The spatial parameter SP included in the transport stream TS indicates a spatial parameter set including the first to n-th spatial parameters.
Operation 1340 may be performed by the encoding apparatus 200 (see Fig. 2).
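The construction of the first addition signals in operations 1320 and 1330 can be sketched as follows: each channel in turn serves as the reference, and the remaining channels are summed. The channels here are short sample lists for illustration; the concrete form of the spatial parameter derived from each (reference, addition-signal) pair is treated later in the document, so only the addition step is shown.

```python
# For each reference channel k, the first addition signal is the sum of
# all channels other than chk, giving n addition signals for n channels.

def first_addition_signals(channels):
    """Return, for each reference channel k, the sum of all other channels."""
    sums = []
    for k in range(len(channels)):
        other = [0.0] * len(channels[k])
        for j, ch in enumerate(channels):
            if j != k:
                other = [a + b for a, b in zip(other, ch)]
        sums.append(other)
    return sums

ch1, ch2, ch3 = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
adds = first_addition_signals([ch1, ch2, ch3])
# adds[0] is ch2 + ch3, adds[1] is ch1 + ch3, adds[2] is ch1 + ch2
```

One addition signal, and hence one spatial parameter, is produced per reference channel, matching the statement that n spatial parameters are generated for n channels.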
Figs. 14a to 14c illustrate examples for describing operation 1110 of Fig. 11 or operation 1330 of Fig. 13. Hereinafter, an operation of generating the first addition signal and the first spatial parameter is described in detail with reference to Figs. 14a to 14c. Figs. 14a to 14c illustrate a case where the multi-channel signal includes first to third channel signals ch1, ch2, and ch3. In addition, Figs. 14a to 14c illustrate the signals and vectors of the signals, where the vectors indicate how the signals are downmixed; various downmixing methods may be used instead of the vector method.
Figs. 14a to 14c illustrate cases where the reference channel signal is the first channel signal ch1, the second channel signal ch2, and the third channel signal ch3, respectively.
Referring to Fig. 14a, when the reference channel signal is the first channel signal ch1, the side information generation unit 220 (see Fig. 2) generates an addition signal 1410 by adding the second channel signal ch2 and the third channel signal ch3, which are the channel signals other than the reference channel signal (ch2+ch3). Thereafter, the side information generation unit 220 (see Fig. 2) generates a spatial parameter by using a correlation (ch1, ch2+ch3) between the first channel signal ch1, which is the reference channel signal, and the addition signal 1410. The spatial parameter includes information indicating the correlation between the reference channel signal and the addition signal 1410 and information indicating the relative signal amplitudes of the reference channel signal and the addition signal 1410.
Referring to Fig. 14b, when the reference channel signal is the second channel signal ch2, the side information generation unit 220 (see Fig. 2) generates an addition signal 1420 by adding the first channel signal ch1 and the third channel signal ch3, which are the channel signals other than the reference channel signal (ch1+ch3). Thereafter, the side information generation unit 220 (see Fig. 2) generates a spatial parameter by using a correlation (ch2, ch1+ch3) between the second channel signal ch2, which is the reference channel signal, and the addition signal 1420.
Referring to Fig. 14c, when the reference channel signal is the third channel signal ch3, the side information generation unit 220 (see Fig. 2) generates an addition signal 1430 by adding the first channel signal ch1 and the second channel signal ch2, which are the channel signals other than the reference channel signal (ch1+ch2). Thereafter, the side information generation unit 220 (see Fig. 2) generates a spatial parameter by using a correlation (ch3, ch1+ch2) between the third channel signal ch3, which is the reference channel signal, and the addition signal 1430.
When the multi-channel signal includes three channel signals, the number of reference channel signals is 3, and three spatial parameters may be generated. The generated spatial parameters are encoded by the encoding apparatus 200 (see Fig. 2) and transmitted to the decoding apparatus 300 (see Fig. 3) via a network (not shown).
The mono signal DM obtained by downmixing the first to third channel signals ch1, ch2, and ch3 is identical to the addition signal of the first to third channel signals ch1, ch2, and ch3, and may be represented by DM=ch1+ch2+ch3. Therefore, the relationship ch1=DM-(ch2+ch3) holds.
The decoding apparatus 300 receives and decodes the first spatial parameters, which are the spatial parameters described with reference to Figs. 14a to 14c. The decoding apparatus 300 (see Fig. 3) restores the original channel signals by using the decoded mono signal and the decoded spatial parameters. As described above, the relationship ch1=DM-(ch2+ch3) holds, and the spatial parameter generated with reference to Fig. 14a may include a parameter indicating the relative amplitudes of the first channel signal ch1 and the addition signal 1410 (ch2+ch3) and a parameter indicating the similarity between the first channel signal ch1 and the addition signal 1410 (ch2+ch3); therefore, the first channel signal ch1 and the addition signal 1410 (ch2+ch3) can be restored by using the spatial parameter generated with reference to Fig. 14a and the mono signal DM. In the same manner, the second channel signal ch2 and the addition signal 1420 (ch1+ch3), and the third channel signal ch3 and the addition signal 1430 (ch1+ch2), can be restored by using the spatial parameters generated with reference to Figs. 14b and 14c, respectively. That is, the upmixing unit 330 (see Fig. 3) can restore all of the first to third channel signals ch1, ch2, and ch3.
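The restoration relationship DM=ch1+ch2+ch3, ch1=DM-(ch2+ch3) can be sketched numerically. As a simplifying assumption (the embodiment carries both correlation and relative-amplitude information), the spatial parameter is reduced here to a single amplitude ratio between the reference channel and its addition signal; the ratio alone is then enough to split the mono signal into the two parts.

```python
# Splitting the mono signal DM into (reference channel, addition signal),
# assuming the spatial parameter conveys their amplitude ratio.

def split_mono(dm, ratio_ref_to_sum):
    """Split DM into the reference channel and the sum of the others."""
    r = ratio_ref_to_sum
    ref = dm * r / (1.0 + r)    # ch_k = DM * r / (1 + r)
    others = dm - ref           # remaining channels = DM - ch_k
    return ref, others

ch1, ch2, ch3 = 2.0, 3.0, 5.0
dm = ch1 + ch2 + ch3               # downmixed mono signal, DM = 10.0
ratio = ch1 / (ch2 + ch3)          # conveyed via the spatial parameter
ref, rest = split_mono(dm, ratio)  # ref recovers ch1, rest recovers ch2 + ch3
```

Applying the same split with the ratios for ch2 and ch3 recovers the other two channels, which is the sense in which the upmixing unit can restore all three channels from one mono signal and three spatial parameters.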
Fig. 15 illustrates another example for describing operation 1110 of Fig. 11 or operation 1330 of Fig. 13. Hereinafter, an operation of generating the second addition signal and the second spatial parameter is described in detail with reference to Fig. 15. Fig. 15 illustrates a case where the multi-channel signal includes the first to third channel signals ch1, ch2, and ch3. In addition, Fig. 15 illustrates the signals and vectors of the signals.
Referring to Fig. 15, the second addition signal is the signal obtained by adding the first to third channel signals ch1, ch2, and ch3; therefore, a signal 1520 (ch1+ch2+ch3), obtained by adding the third channel signal ch3 and a signal 1510 (the signal 1510 being obtained by adding the first channel signal ch1 and the second channel signal ch2), is the second addition signal.
First, in a case where the first channel signal ch1 is the reference channel signal, a spatial parameter between the first channel signal ch1 and the second addition signal 1520 is generated. In detail, a spatial parameter including at least one of the first parameter and the second parameter may be generated by using a correlation (ch1, ch1+ch2+ch3) between the first channel signal ch1 and the second addition signal 1520.
Next, in a case where the second channel signal ch2 is the reference channel signal, a spatial parameter is generated by using a correlation (ch2, ch1+ch2+ch3) between the second channel signal ch2 and the second addition signal 1520. Finally, in a case where the third channel signal ch3 is the reference channel signal, a spatial parameter is generated by using a correlation (ch3, ch1+ch2+ch3) between the third channel signal ch3 and the second addition signal 1520.
The decoding apparatus 300 (see Fig. 3) receives and decodes the second spatial parameters, which are the spatial parameters described with reference to Fig. 15. Thereafter, the decoding apparatus 300 (see Fig. 3) restores the original channel signals by using the decoded mono signal and the decoded spatial parameters. The decoded mono signal corresponds to the signal (ch1+ch2+ch3) obtained by adding the plurality of channel signals.
Therefore, the first channel signal ch1 can be restored by using the decoded mono signal and the spatial parameter generated by using the correlation (ch1, ch1+ch2+ch3) between the first channel signal ch1 and the second addition signal 1520. Similarly, the second channel signal ch2 can be restored by using the spatial parameter generated by using the correlation (ch2, ch1+ch2+ch3) between the second channel signal ch2 and the second addition signal 1520. In addition, the third channel signal ch3 can be restored by using the spatial parameter generated by using the correlation (ch3, ch1+ch2+ch3) between the third channel signal ch3 and the second addition signal 1520.
Figs. 16a to 16d illustrate another example for describing operation 1110 of Fig. 11 or operation 1330 of Fig. 13.
First, in the encoding apparatus 200 of Fig. 2, the spatial parameter generated by the side information generation unit 220 may include an angle parameter as the first parameter. The angle parameter indicates a signal amplitude correlation as a predetermined angle value, where the signal amplitude correlation is the correlation between a reference channel signal, which is any one of the first to n-th channel signals ch1 to chn, and the remaining channel signals other than the reference channel signal among the first to n-th channel signals ch1 to chn. The angle parameter may be referred to as a global vector angle (GVA). In addition, the angle parameter may be a parameter in which the relative amplitudes of the reference channel signal and the first addition signal are expressed as an angle value.
The side information generation unit 220 may generate first to n-th angle parameters by setting each of the first to n-th channel signals ch1 to chn as the reference channel signal. Hereinafter, the angle parameter generated in a case where the k-th channel signal chk is the reference channel signal is referred to as the k-th angle parameter.
Fig. 16a illustrates a case where the multi-channel signal received by the encoding apparatus includes the first to third channel signals ch1, ch2, and ch3. Figs. 16b, 16c, and 16d illustrate cases where the reference channel signal is the first channel signal ch1, the second channel signal ch2, and the third channel signal ch3, respectively.
Referring to Fig. 16b, when the reference channel signal is the first channel signal ch1, the side information generation unit 220 (see Fig. 2) adds the second channel signal ch2 and the third channel signal ch3, which are the remaining channel signals other than the reference channel signal (ch2+ch3), and obtains a first angle parameter angle1 1622, which is the angle parameter between the addition signal 1620 and the first channel signal ch1.
In detail, the first angle parameter angle1 1622 can be obtained from the arctangent of the value obtained by dividing the absolute value of the addition signal (ch2+ch3) 1620 by the absolute value of the first channel signal ch1.
Referring to Fig. 16c, a second angle parameter angle2 1632, for the case where the second channel signal ch2 is the reference channel signal, can be obtained from the arctangent of the value obtained by dividing the absolute value of the addition signal (ch1+ch3) 1630 by the absolute value of the second channel signal ch2.
Referring to Fig. 16d, a third angle parameter angle3 1642, for the case where the third channel signal ch3 is the reference channel signal, can be obtained from the arctangent of the value obtained by dividing the absolute value of the addition signal (ch1+ch2) 1640 by the absolute value of the third channel signal ch3.
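The angle-parameter (GVA) computation of Figs. 16b to 16d can be sketched as follows. Scalar channel amplitudes are used for simplicity (the embodiment operates on signal vectors), and the silent case, where both the reference channel and the addition signal are zero, is taken as 90 degrees, consistent with the silent-channel discussion later in the document.

```python
# angle_k = arctan(|sum of the other channels| / |ch_k|), one angle per
# channel taken in turn as the reference channel.
import math

def angle_parameters(amplitudes):
    """Return one angle parameter (in degrees) per reference channel."""
    angles = []
    total = sum(amplitudes)
    for a in amplitudes:
        others = total - a
        if a == 0.0 and others == 0.0:
            angles.append(90.0)   # silent case: arctan(0/0) taken as 90 degrees
        else:
            # atan2 also handles a == 0 (silent reference channel) as 90 degrees
            angles.append(math.degrees(math.atan2(others, a)))
    return angles

angles = angle_parameters([1.0, 1.0, 1.0])
# each angle is arctan(2/1), roughly 63.4 degrees
```

For typical amplitude mixes the three angles sum to a value near 180 degrees, and for three silent channels to 270 degrees, in line with the convergence behavior described for Fig. 17.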
Fig. 17 is a graph illustrating the sum of the angle parameters, where the x-axis indicates the angle value and the y-axis indicates the distribution probability. One unit of the angle value corresponds to 6 degrees; for example, the value 30 on the x-axis indicates 180 degrees.
In detail, the sum of the n angle parameters, calculated by setting each of the first to n-th channel signals as the reference channel signal, converges to a predetermined value. The predetermined value of convergence may vary according to the value of n and may be optimized through simulation or experiment. For example, when n is 3, the predetermined value of convergence may be 180 degrees.
Referring to Fig. 17, when n is 3, the sum of the three angle parameters converges to approximately 30 units (that is, approximately 180 degrees 1710). The graph of Fig. 17 is obtained through simulation or experiment.
Exceptionally, the sum of the three angle parameters may converge to approximately 45 units (that is, approximately 270 degrees 1720). When all three channel signals are silent and each angle parameter accordingly has a value of 90 degrees, the sum may converge to approximately 270 degrees 1720. In this particular case, if the value of any one of the three angle parameters is changed to 0, the sum of the three angle parameters converges to approximately 180 degrees 1710. When all three channel signals are silent, the downmixed mono signal also has the value 0, and even if the mono signal is upmix-decoded, the result is also 0. Therefore, even if the value of one angle parameter is changed to 0, the result of the upmix decoding is not changed; accordingly, it is harmless to change any one of the three angle parameters to 0.
Fig. 18 is for describing the calculation of angle parameters according to an exemplary embodiment, where the multi-channel signal includes the first to third channel signals ch1, ch2, and ch3. According to an exemplary embodiment, a spatial parameter may be generated that includes the angle parameters other than the k-th angle parameter among the first to n-th angle parameters, and a residual of the k-th angle parameter used to calculate the k-th angle parameter.
Referring to Fig. 18, when the first channel signal ch1 is the reference channel signal, the first angle parameter is calculated and encoded, and the encoded first angle parameter is included in a predetermined bit region 1810 and transmitted to the decoding apparatus 300 (see Fig. 3). When the second channel signal ch2 is the reference channel signal, the second angle parameter is calculated and encoded, and the encoded second angle parameter is included in a predetermined bit region 1830 and transmitted to the decoding apparatus 300 (see Fig. 3).
When the third angle parameter is the k-th angle parameter described above, the residual of the k-th angle parameter can be obtained as follows.
Because the sum of the n angle parameters converges to a predetermined value, the value of the k-th angle parameter can be obtained by subtracting the values of the (n-1) angle parameters other than the k-th angle parameter from the predetermined value. In detail, when n is 3, and unless all three channel signals are silent, the sum of the three angle parameters converges to approximately 180 degrees. Therefore, the value of the third angle parameter = 180 degrees - (the value of the first angle parameter + the value of the second angle parameter). That is, the third angle parameter can be predicted from the correlation among the first to third angle parameters.
In detail, the side information generation unit 220 (see Fig. 2) predicts the value of the k-th angle parameter among the first to n-th angle parameters. A predetermined bit region 1870 indicates a data region including the predicted value of the k-th angle parameter.
Thereafter, the side information generation unit 220 (see Fig. 2) arranges the predicted value of the k-th angle parameter alongside the original value of the k-th angle parameter. A predetermined bit region 1850 indicates a data region including the value of the third angle parameter angle3 1642 calculated with reference to Fig. 16d.
Thereafter, the side information generation unit 220 (see Fig. 2) generates the difference between the predicted value 1870 and the original value 1850 of the k-th angle parameter as the residual of the k-th angle parameter. A predetermined bit region 1890 indicates a data region including the residual of the k-th angle parameter.
The encoding apparatus 200 (see Fig. 2) encodes the spatial parameter and transmits the encoded spatial parameter to the decoding apparatus 300 (see Fig. 3), where the spatial parameter includes the angle parameters other than the k-th angle parameter among the first to n-th angle parameters (the parameters included in the data regions 1810 and 1830) and the residual of the k-th angle parameter (the parameter included in the data region 1890).
The decoding apparatus 300 (see Fig. 3) receives the spatial parameter, which includes the angle parameters other than the k-th angle parameter among the first to n-th angle parameters and the residual of the k-th angle parameter.
The decoding apparatus 300 (see Fig. 3) restores the k-th angle parameter in the decoding unit 320 (see Fig. 3) by using the received spatial parameter and the predetermined value.
In detail, the decoding unit 320 (see Fig. 3) can generate the k-th angle parameter by subtracting the values of the angle parameters other than the k-th angle parameter among the first to n-th angle parameters from the predetermined value, and compensating the result of the subtraction with the residual of the k-th angle parameter.
The residual of the k-th angle parameter has a smaller data size than the value of the k-th angle parameter itself. Therefore, when a spatial parameter including the angle parameters other than the k-th angle parameter among the first to n-th angle parameters and the residual of the k-th angle parameter is transmitted to the decoding apparatus 300 (see Fig. 3), the amount of data transmitted and received between the encoding apparatus 200 (see Fig. 2) and the decoding apparatus 300 (see Fig. 3) can be reduced.
When angle parameters are generated for, for example, three channels, the angle parameter of the channel to be residual-coded can be identified by using the values 0, 1, and 2. That is, when all three channels are encoded independently, 2 bits × 3 = 6 bits are needed, but according to the following method only 5 bits may be needed.
When D = A + B × 3 + C × 9 (the range of D: 0 to 26), if the value of D is known at decoding time, A, B, and C can be obtained by C = floor(D/9), D' = mod(D, 9), B = floor(D'/3), and A = mod(D', 3).
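The 5-bit joint code above can be sketched directly. Three symbols A, B, C, each in {0, 1, 2}, are packed into a single value D in 0..26, which fits in 5 bits, whereas coding them independently costs 6 bits; the block verifies the packing is lossless over all 27 combinations.

```python
# Joint coding of three ternary symbols: D = A + 3B + 9C.

def pack(a, b, c):
    return a + b * 3 + c * 9

def unpack(d):
    c = d // 9      # C  = floor(D / 9)
    d2 = d % 9      # D' = mod(D, 9)
    b = d2 // 3     # B  = floor(D' / 3)
    a = d2 % 3      # A  = mod(D', 3)
    return a, b, c

# Exhaustive check: every (A, B, C) maps to a distinct D in 0..26 (< 2**5)
# and unpacks back to the same triple.
for a in range(3):
    for b in range(3):
        for c in range(3):
            d = pack(a, b, c)
            assert 0 <= d <= 26 and d < 2 ** 5
            assert unpack(d) == (a, b, c)
```

This is simply base-3 positional coding; the saving comes from 3^3 = 27 combinations fitting under 2^5 = 32.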
Fig. 19 is a block diagram of an audio signal processing system 1900 in which a multichannel codec and a core codec are integrated, according to an exemplary embodiment.
The audio signal processing system 1900 shown in Fig. 19 includes an encoding apparatus 1910 and a decoding apparatus 1940. According to an exemplary embodiment, the audio signal processing system 1900 may include both the encoding apparatus 1910 and the decoding apparatus 1940; according to another exemplary embodiment, the audio signal processing system 1900 may include either one of the encoding apparatus 1910 and the decoding apparatus 1940.
The encoding apparatus 1910 may include a multi-channel encoder 1920 and a core encoder 1930, and the decoding apparatus 1940 may include a core decoder 1950 and a multi-channel decoder 1960.
Examples of the codec algorithm used in the core encoder 1930 and the core decoder 1950 include AC-3, Enhanced AC-3, and AAC using the modified discrete cosine transform (MDCT), but are not limited thereto.
Fig. 20 is a block diagram of an audio encoding apparatus 2000 according to an exemplary embodiment, in which a multi-channel encoder 2010 and a core encoder 2040 are integrated.
The audio encoding apparatus 2000 shown in Fig. 20 includes the multi-channel encoder 2010 and the core encoder 2040, where the multi-channel encoder 2010 may include a transform unit 2020 and a downmixing unit 2030, and the core encoder 2040 may include an envelope coding unit 2050, a bit allocation unit 2060, a quantization unit 2070, and a bitstream formatting unit 2080. The components may be integrated into at least one module and implemented as at least one processor (not shown).
Referring to Fig. 20, the transform unit 2020 transforms a time-domain PCM input into frequency-domain spectral data. In this case, the modified odd discrete Fourier transform (MODFT) may be applied. Because the MDCT component is obtained according to MODFT = MDCT + jMDST, an existing inverse transform part and an existing analysis filter bank part are not necessary. In addition, because the MODFT has complex values, levels, phases, and correlations can be obtained more accurately than with the MDCT.
The downmixing unit 2030 extracts spatial parameters from the spectral data provided from the transform unit 2020, and generates a downmixed spectrum by downmixing the spectral data. The extracted spatial parameters are provided to the bitstream formatting unit 2080.
The envelope coding unit 2050 obtains envelope values in units of predetermined frequency bands from the MDCT transform coefficients of the downmixed spectrum provided from the downmixing unit 2030, and performs lossless coding on the envelope values. Here, any one of the power, average amplitude, norm value, and average energy obtained in units of predetermined frequency bands may constitute the envelope.
The bit allocation unit 2060 generates the bit allocation information needed to encode the transform coefficients by using the envelope values obtained in units of frequency bands, and normalizes the MDCT transform coefficients. In this case, the envelope values quantized and losslessly coded in units of frequency bands may be included in the bitstream and transmitted to the decoding apparatus 2100 (see Fig. 21). In relation to the bit allocation using the envelope value of each frequency band, the inversely quantized envelope values may be used, so that the encoding apparatus 2000 and the decoding apparatus 2100 (see Fig. 21) use the same processing. When a norm value is used as the envelope value, a masking threshold may be calculated by using the norm value in units of frequency bands, and the perceptually required number of bits may be predicted by using the masking threshold.
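A key point above is that the encoder and decoder must derive identical bit allocations from the same (inverse-quantized) envelope values. This can be sketched as follows; the allocation rule itself (bits proportional to log-energy) is an illustrative assumption, not the rule of the embodiment.

```python
# Envelope-driven bit allocation: both ends run the same deterministic rule
# on the same envelope values, so no allocation table needs to be transmitted.
import math

def allocate_bits(envelopes, total_bits):
    """Distribute total_bits over bands, weighted by log band energy."""
    weights = [math.log2(1.0 + e) for e in envelopes]
    wsum = sum(weights)
    return [int(total_bits * w / wsum) for w in weights]

envelopes = [64.0, 16.0, 4.0, 1.0]      # per-band envelope values (illustrative)
enc = allocate_bits(envelopes, total_bits=32)  # allocation at the encoder
dec = allocate_bits(envelopes, total_bits=32)  # same allocation at the decoder
```

Because the rule is a pure function of the transmitted envelopes, the decoder's bit allocation unit 2140 can mirror the encoder's unit 2060 exactly, as stated for Fig. 21.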
The quantization unit 2070 quantizes the MDCT transform coefficients of the downmixed spectrum based on the bit allocation information provided from the bit allocation unit 2060, and generates quantization indices.
The bitstream formatting unit 2080 generates a bitstream by formatting the coded spectral envelope, the quantization indices of the downmixed spectrum, and the spatial parameters.
Fig. 21 is a block diagram of an audio decoding apparatus 2100 according to an exemplary embodiment, in which a core decoder 2110 and a multi-channel decoder 2160 are integrated.
The audio decoding apparatus 2100 shown in Fig. 21 includes the core decoder 2110 and the multi-channel decoder 2160, where the core decoder 2110 may include a bitstream parsing unit 2120, an envelope decoding unit 2130, a bit allocation unit 2140, and an inverse quantization unit 2150, and the multi-channel decoder 2160 may include an upmixing unit 2170 and an inverse transform unit 2180. The components may be integrated into at least one module and implemented as at least one processor (not shown).
Referring to Fig. 21, the bitstream parsing unit 2120 extracts the coded spectral envelope, the quantization indices of the downmixed spectrum, and the spatial parameters by parsing the bitstream transmitted via a network (not shown).
The envelope decoding unit 2130 performs lossless decoding on the coded spectral envelope provided from the bitstream parsing unit 2120.
The bit allocation unit 2140 allocates the bits needed to decode the transform coefficients by using the coded spectral envelope provided in units of frequency bands from the bitstream parsing unit 2120. The bit allocation unit 2140 may operate in the same manner as the bit allocation unit 2060 of the audio encoding apparatus 2000 of Fig. 20.
The inverse quantization unit 2150 generates spectral data of the MDCT component by performing inverse quantization on the quantization indices of the downmixed spectrum provided from the bitstream parsing unit 2120, based on the bit allocation information provided from the bit allocation unit 2140.
The upmixing unit 2170 upmixes the spectral data of the MDCT component provided from the inverse quantization unit 2150 by using the spatial parameters provided from the bitstream parsing unit 2120, and denormalizes the upmixed spectrum by using the decoded spectral envelope provided from the envelope decoding unit 2130.
The inverse transform unit 2180 generates a time-domain pulse code modulation (PCM) output by inversely transforming the upmixed spectrum provided from the upmixing unit 2170. In this case, an inverse MODFT may be applied to correspond to the transform unit 2020 (see Fig. 20). To this end, spectral data of a modified discrete sine transform (MDST) component may be generated or predicted from the spectral data of the MDCT component. The inverse MODFT may then be applied by generating spectral data of the MODFT component from the spectral data of the MDCT component and the generated or predicted spectral data of the MDST component. Alternatively, the inverse transform unit 2180 may apply an inverse MDCT to the spectral data of the MDCT component; to this end, a parameter for compensating for an error generated during the upmixing in the MDCT domain may be transmitted from the audio encoding apparatus 2000 (see Fig. 20).
According to an exemplary embodiment, multi-channel decoding may be performed in the MDCT domain during stationary-signal periods. During non-stationary periods, that is, transient-signal periods, the MODFT component may be generated by generating or predicting the MDST component from the MDCT component, and multi-channel decoding may be performed in the MODFT domain.
Whether the current signal corresponds to a stationary-signal period or a non-stationary-signal period may be checked by using flag information or window information added to the bitstream in units of predetermined frequency bands or frames. For example, when a short window is applied, the current signal may correspond to a non-stationary-signal period, and when a long window is applied, the current signal may correspond to a stationary-signal period.
In more detail, when the Enhanced AC-3 algorithm is applied to the core codec, the characteristics of the current signal can be checked by using the blksw and AHT flag information; when the AC-3 algorithm is applied to the core codec, the characteristics of the current signal can be checked by using the blksw flag information.
According to Figs. 20 and 21, by using the MODFT for the time/frequency domain transform, the complexity of the decoding end can be reduced even when a multichannel codec and a core codec using different transform schemes are integrated. In addition, even when a multichannel codec and a core codec using different transform schemes are integrated, an existing synthesis filter bank part and an existing transform part are not necessary; therefore, overlap-add can be omitted, thereby preventing additional delay.
According to the method for exemplary embodiment, can be written as computer executable program, and be implemented in universal digital computer, wherein, universal digital computer is by usability computer readable recording medium storing program for performing executive routine.In addition, spendable data structure, programmed instruction or data file can be recorded in computer readable recording medium storing program for performing in every way in an embodiment of the present invention.Computer readable recording medium storing program for performing can comprise all types of memory storages of the storage computer system-readable data of getting.The example of computer readable recording medium storing program for performing comprises: magnetic medium (such as, hard disk, floppy disk and tape), optical record medium (such as, CD-ROM, DVD), magnet-optical medium (such as, CD) and be configured to specially storage and the hardware unit of execution of program instructions (such as, ROM (read-only memory) (ROM), random-access memory (ram) and flash memory).In addition, computer readable recording medium storing program for performing can be the transmission medium of the signal for transmitting designated program instruction, data structure etc.The example of programmed instruction not only can comprise the machine language code by compiler-creating, also can comprise and by computer system, be used the executable higher-level language code such as interpreter.
Although the exemplary embodiments of the present invention have been described in detail with reference to the accompanying drawings, the present invention is not limited to these embodiments. It is clear that various changes or modifications can be made by those of ordinary skill in the art within the scope of the technical spirit disclosed in the claims, and it should be understood that such changes or modifications belong to the technical scope of the present invention.

Claims (1)

1. An audio signal processing method, comprising:
when a first plurality of input channels is downmixed to a second plurality of output channels, comparing positions of the first plurality of input channels with positions of the second plurality of output channels;
downmixing a channel, from among the first plurality of input channels, having the same position as a position of the second plurality of output channels, to the channel at the same position among the second plurality of output channels;
searching for at least one neighboring channel of each of the remaining channels among the first plurality of input channels;
determining a weighting factor of the found neighboring channel in consideration of at least one of a distance between channels, a correlation between signals, and an error during reconstruction; and
downmixing each of the remaining channels among the first plurality of input channels to the neighboring channel, based on the determined weighting factor.
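The steps of the claim above can be sketched in code. This is an illustrative reading only: it assumes channel positions are 2-D coordinates, that the "neighboring channels" are the nearest output positions, and that the weighting factor uses inter-channel distance alone (the correlation and reconstruction-error terms of the claim are omitted):

```python
import math

def downmix(inputs, in_pos, out_pos, n_neighbors=2):
    """Illustrative sketch of the claimed downmix (assumed conventions)."""
    frame_len = len(next(iter(inputs.values())))
    outs = {p: [0.0] * frame_len for p in out_pos}
    remaining = []
    for ch, pos in in_pos.items():
        if pos in outs:
            # Claimed step: an input channel whose position matches an output
            # position is downmixed straight into that output channel.
            outs[pos] = [a + b for a, b in zip(outs[pos], inputs[ch])]
        else:
            remaining.append(ch)
    for ch in remaining:
        # Claimed step: search for neighboring channels of each remaining one.
        neighbors = sorted(out_pos,
                           key=lambda p: math.dist(p, in_pos[ch]))[:n_neighbors]
        # Claimed step: determine weighting factors; here inverse distance,
        # normalized to sum to one (an assumption for this sketch).
        w = [1.0 / (1e-9 + math.dist(p, in_pos[ch])) for p in neighbors]
        total = sum(w)
        for p, wi in zip(neighbors, w):
            # Claimed step: weighted downmix into each neighboring channel.
            outs[p] = [a + (wi / total) * b
                       for a, b in zip(outs[p], inputs[ch])]
    return outs
```

For example, downmixing a left channel at (-1, 0) and a center channel at (0, 0) to output positions [(-1, 0), (1, 0)] mixes the left channel directly and splits the center channel equally between the two equidistant outputs.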
CN201280038627.5A 2011-06-07 2012-06-07 Audio signal processing method, audio encoding apparatus, audio decoding apparatus, and terminal adopting the same Pending CN103733256A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201161494050P 2011-06-07 2011-06-07
US61/494,050 2011-06-07
PCT/KR2012/004508 WO2012169808A2 (en) 2011-06-07 2012-06-07 Audio signal processing method, audio encoding apparatus, audio decoding apparatus, and terminal adopting the same

Publications (1)

Publication Number Publication Date
CN103733256A true CN103733256A (en) 2014-04-16

Family

ID=47296608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280038627.5A Pending CN103733256A (en) 2011-06-07 2012-06-07 Audio signal processing method, audio encoding apparatus, audio decoding apparatus, and terminal adopting the same

Country Status (4)

Country Link
EP (1) EP2720223A2 (en)
KR (1) KR20140037118A (en)
CN (1) CN103733256A (en)
WO (1) WO2012169808A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107749299A (en) * 2017-09-28 2018-03-02 福州瑞芯微电子股份有限公司 Multi-audio output method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1783728A (en) * 2004-12-01 2006-06-07 三星电子株式会社 Apparatus and method for processing multi-channel audio signal using space information
CN101053017A (en) * 2004-11-04 2007-10-10 皇家飞利浦电子股份有限公司 Encoding and decoding a set of signals
CN101401151A (en) * 2006-03-15 2009-04-01 法国电信公司 Device and method for graduated encoding of a multichannel audio signal based on a principal component analysis
CN101460997A (en) * 2006-06-02 2009-06-17 杜比瑞典公司 Binaural multi-channel decoder in the context of non-energy-conserving upmix rules
CN101594186A (en) * 2008-05-28 2009-12-02 华为技术有限公司 Method and apparatus for generating a single-channel signal in dual-channel signal coding

Also Published As

Publication number Publication date
WO2012169808A2 (en) 2012-12-13
EP2720223A2 (en) 2014-04-16
KR20140037118A (en) 2014-03-26
WO2012169808A3 (en) 2013-03-07

Similar Documents

Publication Publication Date Title
RU2422987C2 (en) Complex-transform channel coding with extended-band frequency coding
JP6789365B2 (en) Voice coding device and method
CN101223582B (en) Audio frequency coding method, audio frequency decoding method and audio frequency encoder
US8069052B2 (en) Quantization and inverse quantization for audio
KR100949232B1 (en) Encoding device, decoding device and methods thereof
KR101425155B1 (en) Audio encoder, audio decoder and related methods for processing multi-channel audio signals using complex prediction
US7801735B2 (en) Compressing and decompressing weight factors using temporal prediction for audio data
CN101518083B (en) Method, medium, and system encoding and/or decoding audio signals by using bandwidth extension and stereo coding
US9659568B2 (en) Method and an apparatus for processing an audio signal
US20080077412A1 (en) Method, medium, and system encoding and/or decoding audio signals by using bandwidth extension and stereo coding
CN102656628B (en) Optimized low-throughput parametric coding/decoding
JP4272897B2 (en) Encoding apparatus, decoding apparatus and method thereof
KR102296067B1 (en) Method and apparatus for decoding a compressed hoa representation, and method and apparatus for encoding a compressed hoa representation
CN101223570A (en) Frequency segmentation to obtain bands for efficient coding of digital media
CA2840785A1 (en) Encoding device and method, decoding device and method, and program
KR20110021803A (en) Factorization of overlapping transforms into two block transforms
CN101162584A (en) Method and apparatus to encode and decode audio signal by using bandwidth extension technique
US9230551B2 (en) Audio encoder or decoder apparatus
WO2016001355A1 (en) Method and apparatus for encoding/decoding of directions of dominant directional signals within subbands of a hoa signal representation
US20080071550A1 (en) Method and apparatus to encode and decode audio signal by using bandwidth extension technique
CN105745703A (en) Signal encoding method and apparatus and signal decoding method and apparatus
KR102433192B1 (en) Method and apparatus for decoding a compressed hoa representation, and method and apparatus for encoding a compressed hoa representation
JP5949270B2 (en) Audio decoding apparatus, audio decoding method, and audio decoding computer program
CN103733256A (en) Audio signal processing method, audio encoding apparatus, audio decoding apparatus, and terminal adopting the same
EP3164866A1 (en) Method and apparatus for encoding/decoding of directions of dominant directional signals within subbands of a hoa signal representation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140416