CN101185117B

CN101185117B - Method and apparatus for decoding an audio signal

Info

Publication number: CN101185117B
Application number: CN2006800182380A
Authority: CN
Inventors: 吴贤午; 郑亮源; 房熙锡; 金东秀; 林宰显
Original assignee: LG Electronics Inc
Current assignee: LG Electronics Inc
Priority date: 2005-05-26
Filing date: 2006-05-25
Publication date: 2012-09-26
Anticipated expiration: 2026-05-25
Also published as: CN101185119B; CN101185118A; CN101185119A; CN101185117A; CN101185118B

Abstract

Method and apparatus for processing audio signals are provided. The method for decoding an audio signal includes extracting a downmix signal and spatial information from a received audio signal, generating surround converting information using the spatial information and rendering the downmix signal to generate a pseudo-surround signal in a previously set rendering domain, using the surround converting information. The apparatus for decoding an audio signal includes a demultiplexing part extracting a downmix signal and spatial information from a received audio signal, an information converting part generating surround converting information using the spatial information and a pseudo-surround generating part rendering the downmix signal to generate a pseudo-surround signal in a previously set rendering domain, using the surround converting information.

Description

Method and apparatus for decoding an audio signal

Technical field

The present invention relates to audio signal processing and, more specifically, to a method and apparatus for processing an audio signal that can generate a pseudo-surround signal.

Background technology

Recently, various techniques and methods for coding digital audio signals have been developed, and related products have been manufactured. In addition, many methods have been developed in which audio signals having multiple channels are encoded using a psychoacoustic model.

The psychoacoustic model is a method that uses the principles of human sound perception to reduce the amount of data effectively by removing unnecessary signals during encoding. For example, the human ear cannot perceive a quiet sound immediately after a loud sound, and can hear only sounds whose frequencies lie between 20 Hz and 20,000 Hz.

Although the above techniques and methods have been developed, there is no known method for processing an audio signal to generate a pseudo-surround signal from an audio bitstream that includes spatial information.

Summary of the invention

The present invention provides a method and apparatus for decoding an audio signal, and a data structure therefor, which can provide a pseudo-surround effect in an audio system.

According to one aspect of the present invention, there is provided a method for decoding an audio signal, the method including: extracting a downmix signal and spatial information from a received audio signal; generating surround converting information using the spatial information; and rendering the downmix signal in a previously set rendering domain, using the surround converting information, to generate a pseudo-surround signal.

According to another aspect of the present invention, there is provided an apparatus for decoding an audio signal, the apparatus including: a demultiplexing part extracting a downmix signal and spatial information from a received audio signal; an information converting part generating surround converting information using the spatial information; and a pseudo-surround generating part rendering the downmix signal in a previously set rendering domain, using the surround converting information, to generate a pseudo-surround signal.

According to another aspect of the present invention, there is provided a data structure of an audio signal, the data structure including: a downmix signal generated by downmixing an audio signal having a plurality of channels; and spatial information generated while the downmix signal is generated, wherein the spatial information is converted into surround converting information, and the downmix signal is rendered in a previously set rendering domain, using the surround converting information, to be converted into a pseudo-surround signal.

According to another aspect of the present invention, there is provided a medium storing an audio signal and having a data structure, wherein the data structure includes: a downmix signal generated by downmixing an audio signal having a plurality of channels; and spatial information generated while the downmix signal is generated, wherein the spatial information is converted into surround converting information, and the downmix signal is rendered in a previously set rendering domain, using the surround converting information, to be converted into a pseudo-surround signal.

Description of drawings

The accompanying drawings, which are included to provide a further understanding of the invention, illustrate embodiments of the invention and, together with the description, serve to explain the principle of the invention.

In the accompanying drawings:

Fig. 1 illustrates a signal processing system according to one embodiment of the invention;

Fig. 2 illustrates a schematic block diagram of a pseudo-surround generating part according to one embodiment of the invention;

Fig. 3 illustrates a schematic block diagram of an information converting part according to one embodiment of the invention;

Fig. 4 illustrates a schematic block diagram for describing a pseudo-surround rendering procedure and a spatial information converting procedure according to one embodiment of the invention;

Fig. 5 illustrates a schematic block diagram for describing a pseudo-surround rendering procedure and a spatial information converting procedure according to another embodiment of the invention;

Fig. 6 and Fig. 7 illustrate schematic block diagrams for describing a channel mapping procedure according to one embodiment of the invention;

Fig. 8 illustrates a schematic view for describing filter coefficients by channel according to one embodiment of the invention; and

Fig. 9 to Fig. 11 illustrate schematic block diagrams for describing procedures for generating surround converting information according to embodiments of the invention.

Embodiment

Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings.

First, the present invention is described using terms that are in general use in the related art. Some terms, however, are defined herein in order to describe the invention clearly. The present invention must therefore be understood on the basis of the terms defined in the following description.

"Spatial information" in the present invention denotes the information required to generate multiple channels by upmixing a downmixed signal. Although the present invention is described on the assumption that the spatial information is spatial parameters, it will be readily understood that the spatial information is not limited to spatial parameters. Here, the spatial parameters include a channel level difference (CLD), an inter-channel coherence (ICC), a channel prediction coefficient (CPC), and so on. The CLD indicates the energy difference between two channels. The ICC indicates the cross-correlation between two channels. The CPC indicates a prediction coefficient used to predict three channels from two channels.

"Core codec" in the present invention denotes a codec for coding an audio signal. The core codec does not code spatial information. The present invention is described on the assumption that the downmix audio signal is an audio signal coded by the core codec. The core codec may include Moving Picture Experts Group (MPEG) Layer-II, MPEG Audio Layer-III (MP3), AC-3, Ogg Vorbis, DTS, Windows Media Audio (WMA), Advanced Audio Coding (AAC), or High-Efficiency AAC (HE-AAC). The core codec may, however, be omitted, in which case an uncompressed PCM signal is used. The codec may be an existing codec or a codec developed in the future.

"Channel splitting part" denotes a splitting part that can divide a given number of input channels into another given number of output channels, where the number of output channels differs from the number of input channels. The channel splitting part includes a two-to-three (TTT) box, which converts two input channels into three output channels. The channel splitting part also includes a one-to-two (OTT) box, which converts one input channel into two output channels. The channel splitting part of the present invention is not limited to the TTT and OTT boxes; it will be readily understood that the channel splitting part can be used in systems with arbitrary numbers of input and output channels.

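The CLD and ICC definitions above, and the role of an OTT box, can be illustrated with a minimal sketch. It assumes full-band, real-valued sample lists, a CLD expressed in dB, and an OTT split that uses only the CLD; the function names `cld_db`, `icc`, and `ott_split` are hypothetical, and the per-band processing, decorrelation, and CPC/TTT handling of an actual codec are not shown.

```python
import math

def energy(x):
    """Sum of squared samples."""
    return sum(s * s for s in x)

def cld_db(a, b, eps=1e-12):
    """Channel level difference: energy ratio of two channels, in dB."""
    return 10.0 * math.log10((energy(a) + eps) / (energy(b) + eps))

def icc(a, b, eps=1e-12):
    """Inter-channel coherence: normalized zero-lag cross-correlation."""
    cross = sum(x * y for x, y in zip(a, b))
    return cross / math.sqrt(energy(a) * energy(b) + eps)

def ott_split(mono, cld):
    """One-to-two split: scale one input so the two outputs differ by `cld` dB."""
    ratio = 10.0 ** (cld / 10.0)               # energy(out1) / energy(out2)
    g1 = math.sqrt(ratio / (1.0 + ratio))
    g2 = math.sqrt(1.0 / (1.0 + ratio))
    return [g1 * s for s in mono], [g2 * s for s in mono]

left = [1.0, -1.0, 1.0, -1.0]
right = [0.5, -0.5, 0.5, -0.5]                 # same waveform at half amplitude
level_difference = cld_db(left, right)         # about 6.02 dB
coherence = icc(left, right)                   # close to 1.0 (fully correlated)

# Downmix to one channel, then split it again using only the CLD.
mono = [l + r for l, r in zip(left, right)]
out1, out2 = ott_split(mono, level_difference)
```

The two gains are chosen so that their squares sum to one, which keeps the total energy of the two outputs equal to that of the input; this is a design choice of the sketch, not something stated in the description.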
Fig. 1 illustrates a signal processing system according to one embodiment of the invention. As shown in Fig. 1, the signal processing system includes an encoding device 100 and a decoding device 150. Although the present invention is described on the basis of an audio signal, it will be readily understood that the signal processing system of the present invention can process all signals as well as audio signals.

The encoding device 100 includes a downmixing part 110, a core encoding part 120, and a multiplexing part 130. The downmixing part 110 includes a channel downmixing part 111 and a spatial information estimating part 112.

When N multi-channel audio signals X_1, X_2, ..., X_N are input to the downmixing part 110, the audio signals are downmixed according to a certain downmixing method or an arbitrary downmixing method. Here, the number of audio signals output from the downmixing part 110 to the core encoding part 120 is less than the number "N" of input multi-channel audio signals. The spatial information estimating part 112 extracts spatial information from the input multi-channel audio signals and then transmits the extracted spatial information to the multiplexing part 130. Here, the number of downmix channels may be one or two, or may be a particular number according to a downmix command. The number of downmix channels may be set. In addition, an arbitrary downmix signal may optionally be used as the downmix audio signal.
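As a minimal sketch of the channel downmixing step, the following reduces N input channels to a single downmix channel. Plain sample-wise averaging is an assumption chosen for illustration; the description leaves the downmixing method open, and the function name `downmix_to_mono` is hypothetical.

```python
def downmix_to_mono(channels):
    """Average N equal-length channels sample by sample into one downmix channel."""
    n = len(channels)
    return [sum(samples) / n for samples in zip(*channels)]

multi = [
    [1.0, 0.0, -1.0],   # X_1
    [0.5, 0.5, 0.5],    # X_2
    [0.0, 1.0, 0.5],    # X_3
]
downmix = downmix_to_mono(multi)   # one channel, fewer than N = 3
```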

The core encoding part 120 encodes the downmix audio signal, which is transmitted through the downmix channel. The encoded downmix audio signal is input to the multiplexing part 130.

The multiplexing part 130 multiplexes the encoded downmix audio signal and the spatial information to generate a bitstream, and then transmits the generated bitstream to the decoding device 150. Here, the bitstream may include a core codec bitstream and a spatial information bitstream.

The decoding device 150 includes a demultiplexing part 160, a core decoding part 170, and a pseudo-surround decoding part 180. The pseudo-surround decoding part 180 may include a pseudo-surround generating part 200 and an information converting part 300. The decoding device 150 may further include a spatial information decoding part 190. The demultiplexing part 160 receives the bitstream and demultiplexes the received bitstream into the core codec bitstream and the spatial information bitstream. The demultiplexing part 160 thereby extracts the downmix signal and the spatial information from the received bitstream.

The core decoding part 170 receives the core codec bitstream from the demultiplexing part 160, decodes it, and outputs the decoding result to the pseudo-surround decoding part 180 as the decoded downmix signal. For example, when the encoding device 100 downmixes the multi-channel signal into a mono signal or a stereo signal, the decoded downmix signal may be a mono signal or a stereo signal. Although the embodiments of the present invention are described on the basis of a mono channel or a stereo channel being used as the downmix channel, it will be readily understood that the present invention is not limited by the number of downmix channels.

The spatial information decoding part 190 receives the spatial information bitstream from the demultiplexing part 160, decodes the spatial information bitstream, and outputs the decoding result as the spatial information.

The pseudo-surround decoding part 180 serves to generate a pseudo-surround signal from the downmix signal using the spatial information. The following is a description of the pseudo-surround generating part 200 and the information converting part 300, which are included in the pseudo-surround decoding part 180.

The information converting part 300 receives the spatial information and filter information. The information converting part 300 generates surround converting information using the spatial information and the filter information. Here, the generated surround converting information has a form suitable for generating the pseudo-surround signal. The surround converting information represents filter coefficients in the case where the pseudo-surround generating part 200 is a particular filter. Although the present invention is described on the basis of filter coefficients being used as the surround converting information, it will be readily understood that the surround converting information is not limited to filter coefficients. Likewise, although the filter information is assumed to be a head-related transfer function (HRTF), it will be readily understood that the filter information is not limited to the HRTF.

In the present invention, the filter coefficients described above denote the coefficients of a particular filter. For example, the filter coefficients may be defined as follows. A prototype HRTF filter coefficient denotes an original filter coefficient of a particular HRTF filter, and may be expressed as GL_L and so on. A converted HRTF filter coefficient denotes a filter coefficient converted from the prototype HRTF filter coefficient, and may be expressed as GL_L' and so on. A spatialized HRTF filter coefficient is a filter coefficient obtained by spatializing the prototype HRTF filter coefficient to generate the pseudo-surround signal, and may be expressed as FL_L1 and so on. A master rendering coefficient denotes a filter coefficient that is necessary to perform rendering, and may be expressed as HL_L and so on. An interpolated master rendering coefficient denotes a filter coefficient obtained by interpolating and/or blurring the master rendering coefficient, and may be expressed as HL_L' and so on. It will be readily understood that, according to the present invention, the filter coefficients are not limited to those described above.

The pseudo-surround generating part 200 receives the decoded downmix signal from the core decoding part 170 and the surround converting information from the information converting part 300, and generates the pseudo-surround signal using the decoded downmix signal and the surround converting information. For example, the pseudo-surround signal serves to provide a virtual multi-channel (or surround) sound in a stereo audio system. It will be readily understood that, according to the present invention, the pseudo-surround signal plays the above role in any device as well as in a stereo audio system. The pseudo-surround generating part 200 may perform various types of rendering according to the set mode.

Assume that the encoding device 100 transmits a mono or stereo downmix signal instead of the multi-channel audio signal, and that the downmix signal is transmitted together with the spatial information of the multi-channel audio signal. In this case, the decoding device 150 including the pseudo-surround decoding part 180 can give users a virtual stereophonic listening experience, even though the output channel of the device 150 is a stereo channel rather than a multi-channel.

The following is a description of an audio signal structure 140 according to one embodiment of the invention, as shown in Fig. 1. When the audio signal is transmitted on a payload basis, it may be received through each channel or through a single channel. An audio payload of one frame consists of a coded audio data field and an ancillary data field. Here, the ancillary data field may include coded spatial information. For example, if the data rate of the audio payload is 48 to 128 kbps, the data rate of the spatial information may be 5 to 32 kbps. Such examples do not limit the scope of the present invention.
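The one-frame payload layout described above can be sketched as a small data structure. The class name `AudioPayloadFrame`, its field names, and the byte counts are hypothetical and chosen only to show the two fields; the description does not specify field sizes or an on-wire syntax.

```python
from dataclasses import dataclass

@dataclass
class AudioPayloadFrame:
    coded_audio: bytes   # coded audio data field (core codec output)
    ancillary: bytes     # ancillary data field, may carry coded spatial information

    def ancillary_share(self) -> float:
        """Fraction of the frame occupied by the ancillary data field."""
        total = len(self.coded_audio) + len(self.ancillary)
        return len(self.ancillary) / total if total else 0.0

# Arbitrary sizes for illustration only.
frame = AudioPayloadFrame(coded_audio=bytes(96), ancillary=bytes(32))
share = frame.ancillary_share()
```

Because the spatial information travels in the ancillary field, a decoder that ignores that field could still decode the coded audio data field as an ordinary downmix under this layout.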

Fig. 2 illustrates a schematic block diagram of the pseudo-surround generating part 200 according to one embodiment of the invention.

The domains described in the present invention include a downmix domain, in which the downmix signal is decoded; a spatial information domain, in which the spatial information is processed to generate the surround converting information; a rendering domain, in which the downmix signal is rendered using the spatial information; and an output domain, in which the time-domain pseudo-surround signal is output. Here, an audio signal in the output domain can be heard by humans. The output domain refers to the time domain. The pseudo-surround generating part 200 includes a rendering part 220 and an output domain converting part 230. The pseudo-surround generating part 200 may further include a rendering domain converting part 210, which converts the downmix domain into the rendering domain when the downmix domain differs from the rendering domain.

The following is a description of three domain conversion methods, performed respectively by three domain converting parts included in the rendering domain converting part 210. Although the following embodiments are described on the assumption that the rendering domain is set to the subband domain, it will be readily understood that the rendering domain can be set to any domain. According to the first domain conversion method, when the downmix domain is the time domain, the time domain is converted into the rendering domain. According to the second domain conversion method, when the downmix domain is a discrete frequency domain, the discrete frequency domain is converted into the rendering domain. According to the third domain conversion method, when the downmix domain is a discrete frequency domain, the discrete frequency domain is converted into the time domain, and the converted time domain is then converted into the rendering domain.
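The three domain conversion methods can be summarized as a choice of conversion path. The sketch below is illustrative only: the `Domain` enumeration, the function `conversion_path`, and the `via_time` flag are hypothetical names, and the actual transforms (for example a QMF analysis filterbank) are not implemented.

```python
from enum import Enum

class Domain(Enum):
    TIME = "time"
    DISCRETE_FREQUENCY = "discrete frequency"
    SUBBAND = "subband"

def conversion_path(downmix_domain, rendering_domain, via_time=False):
    """Return the sequence of domains the downmix signal passes through."""
    if downmix_domain == rendering_domain:
        return [downmix_domain]                       # no conversion needed
    if (downmix_domain == Domain.DISCRETE_FREQUENCY and via_time
            and rendering_domain != Domain.TIME):
        # Third method: discrete frequency -> time -> rendering domain.
        return [downmix_domain, Domain.TIME, rendering_domain]
    # First and second methods: convert directly into the rendering domain.
    return [downmix_domain, rendering_domain]

path = conversion_path(Domain.DISCRETE_FREQUENCY, Domain.SUBBAND, via_time=True)
```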

The rendering part 220 performs pseudo-surround rendering on the downmix signal using the surround converting information to generate the pseudo-surround signal. Here, the pseudo-surround signal output from the pseudo-surround decoding part 180 through stereo output channels becomes a pseudo-surround stereo output with a virtual surround sound. In addition, because the pseudo-surround signal output from the rendering part 220 is a signal in the rendering domain, domain conversion is needed when the rendering domain is not the time domain. Although the present invention is described for the case of stereo output channels, it will be readily understood that the present invention can be applied regardless of the number of output channels.

For example, the pseudo-surround rendering method can be implemented by an HRTF filtering method, in which an input signal passes through a set of HRTF filters. Here, the spatial information may be a value that can be used in the hybrid filterbank domain, which is defined in MPEG Surround. The pseudo-surround rendering method can be implemented as in the following embodiments, according to the types of the downmix domain and the spatial information domain. To this end, the downmix domain and the spatial information domain are made to coincide with the rendering domain.

According to one embodiment of the pseudo-surround rendering method, pseudo-surround rendering of the downmix signal is performed in a subband domain (QMF). The subband domain includes a simple subband domain and a hybrid domain. For example, when the downmix signal is a PCM signal and the downmix domain is not the subband domain, the rendering domain converting part 210 converts the downmix domain into the subband domain. On the other hand, when the downmix domain is the subband domain, the downmix domain does not need to be converted. In some cases, the downmix signal or the spatial information needs to be delayed in order to synchronize the downmix signal with the spatial information. Here, when the spatial information domain is the subband domain, the spatial information domain does not need to be converted. In addition, to generate the pseudo-surround signal in the time domain, the output domain converting part 230 converts the rendering domain into the time domain.

According to another embodiment of the pseudo-surround rendering method, pseudo-surround rendering of the downmix signal is performed in a discrete frequency domain. Here, the discrete frequency domain denotes a frequency domain other than the subband domain. That is, the frequency domain may include at least one of the discrete frequency domain and the subband domain. For example, when the downmix domain is not the discrete frequency domain, the rendering domain converting part 210 converts the downmix domain into the discrete frequency domain. Here, when the spatial information domain is the subband domain, the spatial information domain needs to be converted into the discrete frequency domain. This method replaces filtering in the time domain with operations in the discrete frequency domain, so that the operations can be performed relatively quickly. In addition, to generate the pseudo-surround signal in the time domain, the output domain converting part 230 may convert the rendering domain into the time domain.
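The equivalence that this embodiment relies on, namely that filtering in the time domain can be replaced by a bin-wise product in a discrete frequency domain, can be checked with a short sketch. The naive O(N^2) transform below is used only to keep the example self-contained; a speed advantage would require a fast transform such as an FFT. The function names and the short impulse response are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demonstration)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def convolve(x, h):
    """Direct time-domain linear convolution."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def convolve_via_dft(x, h):
    """Same result, computed as a bin-wise product in the discrete frequency domain."""
    n = len(x) + len(h) - 1                      # zero-pad to avoid circular wrap-around
    X = dft(list(x) + [0.0] * (n - len(x)))
    H = dft(list(h) + [0.0] * (n - len(h)))
    return [v.real for v in dft([a * b for a, b in zip(X, H)], inverse=True)]

signal = [1.0, 2.0, 3.0]
impulse_response = [0.5, 0.25]                   # hypothetical short filter
time_result = convolve(signal, impulse_response)
freq_result = convolve_via_dft(signal, impulse_response)
```

The two results agree up to floating-point rounding.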

According to still another embodiment of the pseudo-surround rendering method, pseudo-surround rendering of the downmix signal is performed in the time domain. For example, when the downmix domain is not the time domain, the rendering domain converting part 210 converts the downmix domain into the time domain. Here, when the spatial information domain is the subband domain, the spatial information domain is also converted into the time domain. In this case, because the rendering domain is the time domain, the output domain converting part 230 does not need to convert the rendering domain into the time domain.

Fig. 3 illustrates a schematic block diagram of the information converting part 300 according to one embodiment of the invention. As shown in Fig. 3, the information converting part 300 includes a channel mapping part 310, a coefficient generating part 320, and an integrating part 330. The information converting part 300 may further include an additional processing part (not shown) for additionally processing the filter coefficients and/or a rendering domain converting part 340.

The channel mapping part 310 performs channel mapping so that the input spatial information is mapped to at least one channel signal of the multi-channel signal, and then generates channel mapping output values as channel mapping information.

The coefficient generating part 320 generates channel coefficient information. The channel coefficient information may include coefficient information by channel or inter-channel coefficient information. Here, the coefficient information by channel indicates at least one of size information, energy information, and so on, and the inter-channel coefficient information indicates inter-channel correlation information, which is calculated using the filter coefficients and the channel mapping output values. The coefficient generating part 320 may include a plurality of per-channel coefficient generating parts. The coefficient generating part 320 generates the channel coefficient information using the filter information and the channel mapping output values. Here, the channel may include at least one of a multi-channel, a downmix channel, and an output channel. Hereinafter, the channel is described as a multi-channel, and the coefficient information by channel is described as size information. Although the channel and the coefficient information are described on the basis of the above embodiment, it will be readily understood that many modifications of the embodiment are possible. In addition, the coefficient generating part 320 may generate the channel coefficient information according to the number of channels or other characteristics.

The integrating part 330 receives the coefficient information by channel and integrates or sums it to generate integrated coefficient information. The integrating part 330 generates the filter coefficients using the integrated coefficients of the integrated coefficient information. The integrating part 330 may generate the integrated coefficients by further integrating additional information with the coefficients of the channels. The integrating part 330 may integrate the coefficients over at least one channel according to the characteristics of the channel coefficient information. For example, according to the characteristics of the channel coefficient information, the integrating part 330 may perform integration by downmix channel, by output channel, by one channel combined with the output channels, or by a combination of the listed channels. In addition, the integrating part 330 may generate additionally processed coefficient information by additionally processing the integrated coefficients. That is, the integrating part 330 may generate the filter coefficients through additional processing. For example, the integrating part 330 may generate the filter coefficients by additionally processing the integrated coefficients, such as by applying a particular function to the integrated coefficients or by combining a plurality of integrated coefficients. Here, the integrated coefficient information is at least one of output channel magnitude information, output channel energy information, and output channel correlation information.

When the spatial information domain differs from the rendering domain, the rendering domain converting part 340 can make the spatial information domain coincide with the rendering domain. The rendering domain converting part 340 can convert the domain of the filter coefficients for pseudo-surround rendering into the rendering domain.

Because the integrating part 330 serves to reduce the computational load of pseudo-surround rendering, it may be omitted. In the case of a stereo downmix signal, a set of coefficients to be applied to the left and right downmix signals is generated in the process of generating the coefficient information by channel. Here, the set of filter coefficients may include filter coefficients that are transmitted from each channel to its own channel and filter coefficients that are transmitted from each channel to its opposite channel.

Fig. 4 illustrates a schematic block diagram for describing a pseudo-surround rendering procedure and a spatial information converting procedure according to one embodiment of the invention. This embodiment illustrates the case where a decoded stereo downmix signal is received by a pseudo-surround generating part 410.

An information converting part 400 may generate coefficients that are transmitted to a channel's own side in the pseudo-surround generating part 410 and coefficients that are transmitted to the opposite side in the pseudo-surround generating part 410. The information converting part 400 generates a coefficient HL_L and a coefficient HL_R, and outputs the generated coefficients HL_L and HL_R to a first rendering part 413. Here, the coefficient HL_L is transmitted to the left output side of the pseudo-surround generating part 410, and the coefficient HL_R is transmitted to the right output side of the pseudo-surround generating part 410. The information converting part 400 also generates coefficients HR_R and HR_L, and outputs the generated coefficients HR_R and HR_L to a second rendering part 414. Here, the coefficient HR_R is transmitted to the right output side of the pseudo-surround generating part 410, and the coefficient HR_L is transmitted to the left output side of the pseudo-surround generating part 410.

The pseudo-surround generating part 410 includes the first rendering part 413, the second rendering part 414, and adders 415 and 416. The pseudo-surround generating part 410 may further include domain converting parts 411 and 412, which make the downmix domain coincide with the rendering domain when the two domains differ from each other, for example, when the downmix domain is not the subband domain and the rendering domain is the subband domain. The pseudo-surround generating part 410 may further include inverse domain converting parts 417 and 418, which convert the rendering domain, for example the subband domain, into the time domain. Users can therefore hear audio with a virtual multi-channel sound through earphones or the like having stereo channels.

The first and second rendering parts 413 and 414 receive the stereo downmix signal and a set of filter coefficients. The set of filter coefficients is applied to the left and right downmix signals, respectively, and is output from an integrating part 403.

For example, the first and second rendering parts 413 and 414 perform rendering using four filter coefficients HL_L, HL_R, HR_L, and HR_R to generate the pseudo-surround signal from the downmix signal.

More specifically, the first rendering part 413 may perform rendering using the filter coefficients HL_L and HL_R, where the filter coefficient HL_L is transmitted to its own channel and the filter coefficient HL_R is transmitted to the channel opposite to its own channel. The first rendering part 413 may include sub-rendering parts (not shown) 1-1 and 1-2. Here, the sub-rendering part 1-1 performs rendering using the filter coefficient HL_L, which is transmitted to the left output side of the pseudo-surround generating part 410, and the sub-rendering part 1-2 performs rendering using the filter coefficient HL_R, which is transmitted to the right output side of the pseudo-surround generating part 410. Likewise, the second rendering part 414 performs rendering using the filter coefficients HR_R and HR_L, where the filter coefficient HR_R is transmitted to its own channel and the filter coefficient HR_L is transmitted to the channel opposite to its own channel. The second rendering part 414 may include sub-rendering parts (not shown) 2-1 and 2-2. Here, the sub-rendering part 2-1 performs rendering using the filter coefficient HR_R, which is transmitted to the right output side of the pseudo-surround generating part 410, and the sub-rendering part 2-2 performs rendering using the filter coefficient HR_L, which is transmitted to the left output side of the pseudo-surround generating part 410. The outputs rendered with HL_R and HR_R are added in the adder 416, and the outputs rendered with HL_L and HR_L are added in the adder 415. Here, if necessary, HL_R and HR_L are set to zero, which means that the coefficients of the cross terms are zero. When HL_R and HR_L are zero, the two paths do not affect each other.
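The structure of the two rendering parts and the two adders can be sketched as follows. For simplicity the sketch assumes that each of the four coefficients HL_L, HL_R, HR_L, and HR_R is a short FIR impulse response applied in the time domain, whereas the embodiment above renders in the subband domain; the helper `fir` and the function `render_stereo` are hypothetical names.

```python
def fir(signal, coeffs):
    """Apply an FIR filter, truncating the output to the input length."""
    return [sum(coeffs[k] * signal[n - k] for k in range(len(coeffs)) if n - k >= 0)
            for n in range(len(signal))]

def render_stereo(left_in, right_in, HL_L, HL_R, HR_L, HR_R):
    """Render a stereo downmix with four filters and two adders."""
    # First rendering part (413): filters fed by the left downmix channel.
    l_to_l, l_to_r = fir(left_in, HL_L), fir(left_in, HL_R)
    # Second rendering part (414): filters fed by the right downmix channel.
    r_to_r, r_to_l = fir(right_in, HR_R), fir(right_in, HR_L)
    left_out = [a + b for a, b in zip(l_to_l, r_to_l)]    # adder 415
    right_out = [a + b for a, b in zip(l_to_r, r_to_r)]   # adder 416
    return left_out, right_out

Li, Ri = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]

# Cross terms set to zero: the two paths do not affect each other.
Lo, Ro = render_stereo(Li, Ri, HL_L=[1.0], HL_R=[0.0], HR_L=[0.0], HR_R=[1.0])

# Non-zero cross terms: each output receives a share of the opposite input.
Lo_x, Ro_x = render_stereo(Li, Ri, HL_L=[1.0], HL_R=[0.5], HR_L=[0.5], HR_R=[1.0])
```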

On the other hand, when the downmix signal is mono, rendering can be performed by an embodiment with a structure similar to that of Fig. 4. More specifically, the original mono input is referred to as a first channel signal, and the signal obtained by decorrelating the first channel signal is referred to as a second channel signal. In this case, the first and second rendering parts 413 and 414 can receive the first and second channel signals and render them.

With reference to Fig. 4, let the input stereo downmix signal be denoted by "x", the channel mapping coefficients obtained by mapping the spatial information to channels by "D", the externally supplied prototype HRTF filter coefficients by "G", the temporary multi-channel signal by "p", and the rendered output signal by "y". The symbols "x", "D", "G", "p" and "y" can be expressed in matrix form as in formula 1. Formula 1 is written in terms of the prototype HRTF filter coefficients. When modified HRTF filter coefficients are used instead, G must be replaced with G' in the following formulas.

[formula 1]

x = \begin{bmatrix} Li \\ Ri \end{bmatrix},

p = \begin{bmatrix} L \\ Ls \\ R \\ Rs \\ C \\ LFE \end{bmatrix},

D = \begin{bmatrix} D_{L1} & D_{L2} \\ D_{Ls1} & D_{Ls2} \\ D_{R1} & D_{R2} \\ D_{Rs1} & D_{Rs2} \\ D_{C1} & D_{C2} \\ D_{LFE1} & D_{LFE2} \end{bmatrix},

G = \begin{bmatrix} GL_L & GLs_L & GR_L & GRs_L & GC_L & GLFE_L \\ GL_R & GLs_R & GR_R & GRs_R & GC_R & GLFE_R \end{bmatrix},

y = \begin{bmatrix} Lo \\ Ro \end{bmatrix}

Here, when each coefficient is a frequency-domain value, the temporary multi-channel signal "p" can be expressed as the product of the channel mapping coefficients "D" and the stereo downmix signal "x", as in formula 2.

[formula 2]

p = D \cdot x,

\begin{bmatrix} L \\ Ls \\ R \\ Rs \\ C \\ LFE \end{bmatrix} = \begin{bmatrix} D_{L1} & D_{L2} \\ D_{Ls1} & D_{Ls2} \\ D_{R1} & D_{R2} \\ D_{Rs1} & D_{Rs2} \\ D_{C1} & D_{C2} \\ D_{LFE1} & D_{LFE2} \end{bmatrix} \begin{bmatrix} Li \\ Ri \end{bmatrix}

Then, when the temporary multi-channel signal "p" is rendered using the prototype HRTF filter coefficients "G", the output signal "y" can be expressed by formula 3.

[formula 3]

y = G \cdot p

Then, substituting p = D·x, "y" can be expressed by formula 4.

[formula 4]

y = G D x

Here, if H = GD is defined, the output signal "y" and the stereo downmix signal "x" are related as in formula 5.

[formula 5]

H = \begin{bmatrix} HL_L & HR_L \\ HL_R & HR_R \end{bmatrix}, \quad y = Hx

Thus "H" is obtained as the product of the filter coefficients "G" and the channel mapping coefficients "D". The output signal "y" can then be obtained by multiplying the stereo downmix signal "x" by "H".

The coefficients F (FL_L1, FL_L2, ...), which will be described later, can be obtained from formula 6.

[formula 6]

H = GD = \begin{bmatrix} GL_L & GLs_L & GR_L & GRs_L & GC_L & GLFE_L \\ GL_R & GLs_R & GR_R & GRs_R & GC_R & GLFE_R \end{bmatrix} \begin{bmatrix} D_{L1} & D_{L2} \\ D_{Ls1} & D_{Ls2} \\ D_{R1} & D_{R2} \\ D_{Rs1} & D_{Rs2} \\ D_{C1} & D_{C2} \\ D_{LFE1} & D_{LFE2} \end{bmatrix}
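The relationship between formulas 2 to 6 can be checked numerically with a small sketch. It assumes a single frequency band with real-valued coefficients. The helper `matmul` and all numeric values of G, D and x are hypothetical and are not taken from the embodiments. The sketch shows that rendering in one step with H = GD gives the same output as first forming the temporary multi-channel signal p and then applying G.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Column/row order follows formula 1: L, Ls, R, Rs, C, LFE.
# G is 2 x 6: row 0 feeds the left output, row 1 the right output.
G = [[1.0, 0.5, 0.25, 0.125, 0.5, 0.25],
     [0.25, 0.125, 1.0, 0.5, 0.5, 0.25]]
# D is 6 x 2: column 0 weights Li, column 1 weights Ri.
D = [[1.0, 0.0], [0.5, 0.0], [0.0, 1.0],
     [0.0, 0.5], [0.5, 0.5], [0.25, 0.25]]
x = [[2.0], [4.0]]              # stereo downmix [Li, Ri]

H = matmul(G, D)                # formula 6: 2 x 2 filter coefficient set
y_direct = matmul(H, x)         # formula 5: y = Hx
p = matmul(D, x)                # formula 2: temporary multi-channel signal
y_two_step = matmul(G, p)       # formula 3: y = Gp

print(H, y_direct, y_two_step)
```

The one-step form needs only a 2 x 2 multiplication per band at rendering time, which is the practical reason for precomputing H.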

Fig. 5 is a schematic block diagram illustrating a pseudo-surround rendering process and a spatial information converting process according to another embodiment of the present invention. This embodiment illustrates the case where a decoded mono downmix signal is received by the pseudo-surround generating part 510. As shown in the figure, the information converting part 500 includes a channel mapping part 501, a coefficient generating part 502 and an integrating part 503. Because these elements of the information converting part 500 perform the same functions as those of the information converting part 400 of Fig. 4, their detailed description is omitted. Here, the information converting part 500 can generate final filter coefficients whose domain coincides with the rendering domain in which the pseudo-surround rendering is performed. When the decoded downmix signal is a mono downmix signal, the filter coefficient set can include the filter coefficients HM_L and HM_R. The filter coefficient HM_L is used to render the mono downmix signal and output the rendering result to the left channel of the pseudo-surround generating part 510. The filter coefficient HM_R is used to render the mono downmix signal and output the rendering result to the right channel of the pseudo-surround generating part 510.

The pseudo-surround generating part 510 includes a third rendering part 512. In addition, the pseudo-surround generating part 510 may further include a domain converting part 511 and inverse domain converting parts 513 and 514. The elements of the pseudo-surround generating part 510 differ from those of the pseudo-surround generating part 410 of Fig. 4 because the decoded downmix signal in Fig. 5 is a mono downmix signal, so the pseudo-surround generating part 510 includes a single third rendering part 512 that performs the pseudo-surround rendering and a single domain converting part 511. The third rendering part 512 receives the filter coefficients HM_L and HM_R from the integrating part 503 and can use them to perform pseudo-surround rendering of the mono downmix signal and generate a pseudo-surround signal.

Meanwhile, when the downmix signal is a mono signal, a stereo downmix output can be obtained by performing pseudo-surround rendering of the mono downmix signal according to either of the following two methods.

According to the first method, the third rendering part 512 (for example, an HRTF filter) does not use filter coefficients for pseudo-surround sound but instead uses the values used when processing a stereo downmix. Here, the values used when processing a stereo downmix can be coefficients such as (left front = 1, right front = 0, ...), where the coefficient "left front" is for the left output and the coefficient "right front" is for the right output.

According to the second method, the stereo downmix output having the desired number of channels is obtained in the middle of the decoding process that generates a multi-channel signal from the downmix signal using the spatial information.

With reference to Fig. 5, let the input mono downmix signal be denoted by "x", the channel mapping coefficients by "D", the externally supplied prototype HRTF filter coefficients by "G", the temporary multi-channel signal by "p", and the rendered output signal by "y". The symbols "x", "D", "G", "p" and "y" can be expressed in matrix form as in formula 7.

[formula 7]

x = \begin{bmatrix} Mi \end{bmatrix},

p = \begin{bmatrix} L \\ Ls \\ R \\ Rs \\ C \\ LFE \end{bmatrix},

D = \begin{bmatrix} D_{L} \\ D_{Ls} \\ D_{R} \\ D_{Rs} \\ D_{C} \\ D_{LFE} \end{bmatrix},

G = \begin{bmatrix} GL_L & GLs_L & GR_L & GRs_L & GC_L & GLFE_L \\ GL_R & GLs_R & GR_R & GRs_R & GC_R & GLFE_R \end{bmatrix},

y = \begin{bmatrix} Lo \\ Ro \end{bmatrix}

The relationships between the matrices in formula 7 are the same as those described for Fig. 4, so their description is omitted here. Fig. 4 illustrates the case where a stereo downmix signal is received, whereas Fig. 5 illustrates the case where a mono downmix signal is received.

Fig. 6 and Fig. 7 are schematic block diagrams illustrating a channel mapping process according to an embodiment of the present invention. The channel mapping process is a process in which at least one channel mapping output value is generated by mapping the received spatial information to at least one channel of the multi-channel signal, so as to be compatible with the pseudo-surround generating part. The channel mapping process is performed in the channel mapping parts 401 and 501. Here, the spatial information, for example energy, can be mapped to at least two of the plurality of channels. Here, the LFE channel and the center channel C need not be split. In that case, the process does not require the channel splitting part 604 or 705, which simplifies the calculation.

For example, when a mono downmix signal is received, the channel mapping output values can be generated using coefficients such as CLD1 to CLD5 and ICC1 to ICC5. The channel mapping output values can be D_L, D_R, D_C, D_LFE, D_Ls, D_Rs, etc. Because the channel mapping output values are obtained using the spatial information, various kinds of channel mapping output values can be obtained from different formulas. Here, the generation of the channel mapping output values can vary according to the tree structure of the spatial information received by the decoding device 150 and the range of the spatial information used in the decoding device 150.

Fig. 6 and Fig. 7 are schematic block diagrams illustrating channel mapping structures according to an embodiment of the present invention. Here, a channel mapping structure can include at least one channel splitting part, represented as an OTT box. The channel structure of Fig. 6 is a 5151 structure.

With reference to Fig. 6, the multi-channel signals L, R, C, LFE, Ls and Rs can be generated from the downmix signal "m" using the OTT boxes 601, 602, 603, 604 and 605 and spatial information such as CLD₀, CLD₁, CLD₂, CLD₃, CLD₄, ICC₀, ICC₁, ICC₂ and ICC₃. For example, when the tree structure is the 5151 structure shown in Fig. 6, the channel mapping output values can be obtained using only the CLDs, as shown in formula 8.

[formula 8]

\begin{bmatrix} L \\ R \\ C \\ LFE \\ Ls \\ Rs \end{bmatrix} = \begin{bmatrix} D_{L} \\ D_{R} \\ D_{C} \\ D_{LFE} \\ D_{Ls} \\ D_{Rs} \end{bmatrix} m = \begin{bmatrix} c_{1,OTT_3} c_{1,OTT_1} c_{1,OTT_0} \\ c_{2,OTT_3} c_{1,OTT_1} c_{1,OTT_0} \\ c_{1,OTT_4} c_{2,OTT_1} c_{1,OTT_0} \\ c_{2,OTT_4} c_{2,OTT_1} c_{1,OTT_0} \\ c_{1,OTT_2} c_{2,OTT_0} \\ c_{2,OTT_2} c_{2,OTT_0} \end{bmatrix} m

where

c_{1,OTT_x}^{l,m} = \sqrt{\frac{10^{CLD_x^{l,m}/10}}{1 + 10^{CLD_x^{l,m}/10}}},

c_{2,OTT_x}^{l,m} = \sqrt{\frac{1}{1 + 10^{CLD_x^{l,m}/10}}}
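The mapping of formula 8 can be sketched as follows for one parameter band and time slot. This is a hedged illustration: the function names `ott_gains` and `channel_mapping_5151` and the CLD values are hypothetical, and the sketch assumes that Ls takes the first output of OTT2 and Rs the second, so that the two gains of every OTT box are complementary.

```python
import math


def ott_gains(cld_db):
    """Return (c1, c2) for one OTT box from its CLD value in dB."""
    ratio = 10.0 ** (cld_db / 10.0)
    c1 = math.sqrt(ratio / (1.0 + ratio))
    c2 = math.sqrt(1.0 / (1.0 + ratio))
    return c1, c2


def channel_mapping_5151(cld):
    """Map the CLDs of OTT0..OTT4 to the channel mapping output values."""
    c = [ott_gains(v) for v in cld]   # c[x] = (c1, c2) of OTT box x
    return {
        "L":   c[3][0] * c[1][0] * c[0][0],
        "R":   c[3][1] * c[1][0] * c[0][0],
        "C":   c[4][0] * c[1][1] * c[0][0],
        "LFE": c[4][1] * c[1][1] * c[0][0],
        "Ls":  c[2][0] * c[0][1],     # assumed: first output of OTT2
        "Rs":  c[2][1] * c[0][1],
    }


# Hypothetical CLD values in dB for OTT0..OTT4.
D = channel_mapping_5151([3.0, 0.0, -2.0, 1.5, 10.0])
total_energy = sum(g * g for g in D.values())
print(D, total_energy)
```

Because c1 and c2 of each box satisfy c1² + c2² = 1, the squared channel mapping output values in this sketch sum to 1, so the mapping only distributes the downmix energy among the channels.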

With reference to Fig. 7, the multi-channel signals L, Ls, R, Rs, C and LFE can be generated from the downmix signal "m" using the OTT boxes 701, 702, 703, 704 and 705 and spatial information such as CLD₀, CLD₁, CLD₂, CLD₃, CLD₄, ICC₀, ICC₁, ICC₃ and ICC₄.

For example, when the tree structure is the 5152 structure shown in Fig. 7, the channel mapping output values can be obtained using only the CLDs, as shown in formula 9.

[formula 9]

\begin{bmatrix} L \\ Ls \\ R \\ Rs \\ C \\ LFE \end{bmatrix} = \begin{bmatrix} D_{L} \\ D_{Ls} \\ D_{R} \\ D_{Rs} \\ D_{C} \\ D_{LFE} \end{bmatrix} m = \begin{bmatrix} c_{1,OTT_3} c_{1,OTT_1} c_{1,OTT_0} \\ c_{2,OTT_3} c_{1,OTT_1} c_{1,OTT_0} \\ c_{1,OTT_4} c_{2,OTT_1} c_{1,OTT_0} \\ c_{2,OTT_4} c_{2,OTT_1} c_{1,OTT_0} \\ c_{1,OTT_2} c_{2,OTT_0} \\ c_{2,OTT_2} c_{2,OTT_0} \end{bmatrix} m

The channel mapping output values can vary with the frequency band, the parameter band and/or the transmitted time slot. If the difference in channel mapping output values between adjacent bands, or between time slots that form a boundary, becomes large, distortion may occur when pseudo-surround rendering is performed. To prevent this distortion, smoothing of the channel mapping output values in the frequency and time domains may be needed. More specifically, the distortion can be prevented as follows. First, frequency smoothing and time smoothing can be applied, or any other technique suitable for pseudo-surround rendering can be used. In addition, the distortion can be prevented by multiplying each channel mapping output value by a specific gain.

Fig. 8 is a schematic diagram illustrating the filter coefficients for each channel according to an embodiment of the present invention. For example, the filter coefficients can be HRTF coefficients.

To perform pseudo-surround rendering, the signal from the left channel source "L" 810 is filtered by a filter having the filter coefficient GL_L, and the filtered result L*GL_L is delivered as a left output. In addition, the signal from the left channel source "L" 810 is filtered by a filter having the filter coefficient GL_R, and the filtered result L*GL_R is delivered as a right output. For example, the left and right outputs can reach the user's left ear and right ear, respectively. In this way, left and right outputs are obtained for every channel. The obtained left outputs are then summed to produce the final left output (for example, Lo), and the obtained right outputs are summed to produce the final right output (for example, Ro). Therefore, the final left and right outputs after pseudo-surround rendering can be expressed by formula 10.

[formula 10]

Lo = L*GL_L + C*GC_L + R*GR_L + Ls*GLs_L + Rs*GRs_L

Ro = L*GL_R + C*GC_R + R*GR_R + Ls*GLs_R + Rs*GRs_R

According to embodiments of the present invention, L (810), C (800), R (820), Ls (830) and Rs (840) can be obtained as follows. First, they can be obtained by a decoding method that generates a multi-channel signal using the downmix signal and the spatial information. For example, the multi-channel signal can be generated by an MPEG Surround decoding method. Second, they can be obtained by formulas that relate only to the spatial information.

Fig. 9 to Fig. 11 are schematic block diagrams illustrating processes for generating surround converting information according to embodiments of the present invention.

Fig. 9 is a schematic block diagram illustrating a process for generating surround converting information according to an embodiment of the present invention. As shown in Fig. 9, the information converting part, apart from the channel mapping part, can include a coefficient generating part 900 and an integrating part 910. Here, the coefficient generating part 900 includes at least one sub coefficient generating part (coef_1 generating part 900_1, coef_2 generating part 900_2, ..., coef_N generating part 900_N). The information converting part may further include an interpolating part 920 and a domain converting part 930 to further process the filter coefficients.

The coefficient generating part 900 generates coefficients using the spatial information and the filter information. The following describes coefficient generation in a specific sub coefficient generating part, for example the coef_1 generating part 900_1 (referred to as the first sub coefficient generating part).

For example, when a mono downmix signal is input, the first sub coefficient generating part 900_1 uses the value D_L generated from the spatial information to generate the coefficients FL_L and FL_R for the left channel of the multi-channel signal. The generated coefficients FL_L and FL_R can be expressed by formula 11.

[formula 11]

FL_L = D_L*GL_L (coefficient for producing the left output from the input mono downmix signal)

FL_R = D_L*GL_R (coefficient for producing the right output from the input mono downmix signal)

Here, D_L is the channel mapping output value generated from the spatial information in the channel mapping process. The procedure for obtaining D_L can vary according to the tree structure information that the encoding device transmits and the decoding device receives. Similarly, if the coef_2 generating part 900_2 is referred to as the second sub coefficient generating part and the coef_3 generating part 900_3 as the third sub coefficient generating part, the second sub coefficient generating part 900_2 can generate the coefficients FR_L and FR_R, the third sub coefficient generating part 900_3 can generate FC_L and FC_R, and so on.

For example, when a stereo downmix signal is input, the first sub coefficient generating part 900_1 uses the values D_L1 and D_L2 generated from the spatial information to generate the coefficients FL_L1, FL_L2, FL_R1 and FL_R2 for the left channel of the multi-channel signal. The generated coefficients FL_L1, FL_L2, FL_R1 and FL_R2 can be expressed by formula 12.

[formula 12]

FL_L1 = D_L1*GL_L (coefficient for producing the left output from the left downmix signal of the input stereo downmix signal)

FL_L2 = D_L2*GL_L (coefficient for producing the left output from the right downmix signal of the input stereo downmix signal)

FL_R1 = D_L1*GL_R (coefficient for producing the right output from the left downmix signal of the input stereo downmix signal)

FL_R2 = D_L2*GL_R (coefficient for producing the right output from the right downmix signal of the input stereo downmix signal)

Here, as in the case of the mono downmix input, when a stereo downmix signal is input, a plurality of coefficients can be generated by at least one of the coefficient generating parts 900_1 to 900_N.

The integrating part 910 generates the filter coefficients by integrating the coefficients generated for each channel. The integration performed by the integrating part 910 for mono and stereo downmix inputs can be expressed by formula 13.

[formula 13]

For a mono downmix input:

HM_L = FL_L + FR_L + FC_L + FLS_L + FRS_L + FLFE_L

HM_R = FL_R + FR_R + FC_R + FLS_R + FRS_R + FLFE_R

For a stereo downmix input:

HL_L = FL_L1 + FR_L1 + FC_L1 + FLS_L1 + FRS_L1 + FLFE_L1

HR_L = FL_L2 + FR_L2 + FC_L2 + FLS_L2 + FRS_L2 + FLFE_L2

HL_R = FL_R1 + FR_R1 + FC_R1 + FLS_R1 + FRS_R1 + FLFE_R1

HR_R = FL_R2 + FR_R2 + FC_R2 + FLS_R2 + FRS_R2 + FLFE_R2
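Formulas 11 to 13 can be combined in a short sketch: each per-channel coefficient F is the product of a channel mapping output value and a prototype filter coefficient, and the integrating step sums these products over the channels. The function names, the dictionary layout and all numeric values below are assumptions made for illustration, and a single band with real-valued coefficients is assumed.

```python
CHANNELS = ["L", "R", "C", "LS", "RS", "LFE"]


def integrate_mono(D, G_left, G_right):
    """Mono case: F = D * G per channel (formula 11), summed (formula 13)."""
    HM_L = sum(D[ch] * G_left[ch] for ch in CHANNELS)
    HM_R = sum(D[ch] * G_right[ch] for ch in CHANNELS)
    return HM_L, HM_R


def integrate_stereo(D1, D2, G_left, G_right):
    """Stereo case: D1 and D2 weight the left and right downmix inputs."""
    HL_L = sum(D1[ch] * G_left[ch] for ch in CHANNELS)   # left in  -> left out
    HR_L = sum(D2[ch] * G_left[ch] for ch in CHANNELS)   # right in -> left out
    HL_R = sum(D1[ch] * G_right[ch] for ch in CHANNELS)  # left in  -> right out
    HR_R = sum(D2[ch] * G_right[ch] for ch in CHANNELS)  # right in -> right out
    return HL_L, HR_L, HL_R, HR_R


# Hypothetical prototype filter coefficients for one band.
G_left = {"L": 1.0, "R": 0.25, "C": 0.5, "LS": 0.5, "RS": 0.125, "LFE": 0.25}
G_right = {"L": 0.25, "R": 1.0, "C": 0.5, "LS": 0.125, "RS": 0.5, "LFE": 0.25}

# Hypothetical channel mapping output values.
D_mono = {ch: 0.5 for ch in CHANNELS}
D1 = {"L": 1.0, "R": 0.0, "C": 0.5, "LS": 1.0, "RS": 0.0, "LFE": 0.5}
D2 = {"L": 0.0, "R": 1.0, "C": 0.5, "LS": 0.0, "RS": 1.0, "LFE": 0.5}

print(integrate_mono(D_mono, G_left, G_right))
print(integrate_stereo(D1, D2, G_left, G_right))
```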

Here, HM_L and HM_R denote the filter coefficients used for pseudo-surround rendering when a mono downmix signal is input. HL_L, HR_L, HL_R and HR_R denote the filter coefficients used for pseudo-surround rendering when a stereo downmix signal is input.

The interpolating part 920 can interpolate the filter coefficients. In addition, time smoothing of the filter coefficients can be performed as post-processing. The time smoothing can be performed in a time smoothing part (not shown). When the transmitted and generated spatial information has wide intervals on the time axis, the interpolating part 920 interpolates the filter coefficients to obtain spatial information that does not exist between the transmitted and generated spatial information. For example, when spatial information exists in the n-th parameter time slot and the (n+k)-th parameter time slot (k > 1), an embodiment of linear interpolation can be expressed by formula 14. In the embodiment of formula 14, the generated filter coefficients, for example HL_L, HR_L, HL_R and HR_R, can be used to obtain the spatial information for parameter time slots in which none was transmitted. It should be understood that the interpolating part 920 can interpolate the filter coefficients by various methods.

[formula 14]

For a mono downmix input:

HM_L(n+j) = HM_L(n)*a + HM_L(n+k)*(1-a)

HM_R(n+j) = HM_R(n)*a + HM_R(n+k)*(1-a)

For a stereo downmix input:

HL_L(n+j) = HL_L(n)*a + HL_L(n+k)*(1-a)

HR_L(n+j) = HR_L(n)*a + HR_L(n+k)*(1-a)

HL_R(n+j) = HL_R(n)*a + HL_R(n+k)*(1-a)

HR_R(n+j) = HR_R(n)*a + HR_R(n+k)*(1-a)

Here, HM_L(n+j) and HM_R(n+j) denote the coefficients obtained by interpolating the filter coefficients for pseudo-surround rendering when a mono downmix signal is input. Likewise, HL_L(n+j), HR_L(n+j), HL_R(n+j) and HR_R(n+j) denote the coefficients obtained by interpolating the filter coefficients for pseudo-surround rendering when a stereo downmix signal is input. Here, "j" and "k" are integers with 0 < j < k, and "a" is a real number (0 < a < 1) given by formula 15.

[formula 15]

a = (k - j)/k

With the linear interpolation of formula 14, the spatial information for a parameter time slot in which none was transmitted, lying between the n-th and (n+k)-th parameter time slots, can be obtained from the spatial information in the n-th and (n+k)-th parameter time slots. That is, the unknown value of the spatial information can be obtained, according to formula 15, on the straight line connecting the values of the spatial information in the two parameter time slots.
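A minimal sketch of this interpolation for a single filter coefficient is given below. It assumes that the weight "a" applied to the slot-n value is (k - j)/k, so that the result moves along a straight line from the slot-n value toward the slot-(n+k) value as j grows. The function name `interpolate` and the sample values are hypothetical.

```python
def interpolate(h_n, h_nk, j, k):
    """Linearly interpolate a filter coefficient at slot n+j, with 0 < j < k."""
    if not 0 < j < k:
        raise ValueError("j must satisfy 0 < j < k")
    a = (k - j) / k               # assumed weight of the earlier slot n
    return h_n * a + h_nk * (1 - a)


# Coefficient known at slot n (1.0) and at slot n+4 (3.0).
for j in (1, 2, 3):
    print(j, interpolate(1.0, 3.0, j, 4))
```

The same function would be applied independently to each of HM_L and HM_R, or to each of HL_L, HR_L, HL_R and HR_R.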

When the coefficient values change abruptly between adjacent blocks in the time domain, discontinuities can arise. Time smoothing can then be performed by the time smoothing part to prevent distortion caused by the discontinuities. The time smoothing operation can be performed in parallel with the interpolation operation. In addition, the time smoothing and the interpolation can be processed differently depending on the order in which they are performed.

For a mono downmix channel, the time smoothing of the filter coefficients can be expressed by formula 16.

[formula 16]

HM_L(n)′ = HM_L(n)*b + HM_L(n-1)′*(1-b)

HM_R(n)′ = HM_R(n)*b + HM_R(n-1)′*(1-b)

Formula 16 describes smoothing by a one-pole IIR filter, where the smoothing result is obtained as follows. The filter coefficients HM_L(n) and HM_R(n) of the current block (n) are each multiplied by "b". The smoothed filter coefficients HM_L(n-1)′ and HM_R(n-1)′ of the previous block (n-1) are each multiplied by (1-b). The products are then added, as shown in formula 16. Here, "b" is a constant (0 < b < 1). The smaller the value of "b", the stronger the smoothing effect; conversely, the larger the value of "b", the weaker the smoothing effect. The remaining filter coefficients can be smoothed in the same way.
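The recursion of formula 16 can be sketched for one coefficient track as follows. The function name `smooth` and the input sequence are hypothetical, and the initial state is an assumption: the text does not say how the first block is initialised, so the sketch starts from the first unsmoothed value.

```python
def smooth(coeffs, b):
    """One-pole IIR smoothing of a sequence of filter coefficients."""
    if not 0 < b < 1:
        raise ValueError("b must satisfy 0 < b < 1")
    out = []
    prev = coeffs[0]              # assumed initial state
    for h in coeffs:
        prev = h * b + prev * (1 - b)   # H(n)' = H(n)*b + H(n-1)'*(1-b)
        out.append(prev)
    return out


# A step from 0 to 1 is spread over several blocks.
print(smooth([0.0, 1.0, 1.0, 1.0], 0.5))
print(smooth([0.0, 1.0, 1.0, 1.0], 0.25))   # smaller b: slower, smoother
```

With a smaller "b" the output approaches the new value more slowly, which matches the statement that a smaller "b" gives a stronger smoothing effect.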

Using formula 16 for the time smoothing, the interpolation and the smoothing can be expressed together by formula 17.

[formula 17]

HM_L(n+j)′ = (HM_L(n)*a + HM_L(n+k)*(1-a))*b + HM_L(n+j-1)′*(1-b)

HM_R(n+j)′ = (HM_R(n)*a + HM_R(n+k)*(1-a))*b + HM_R(n+j-1)′*(1-b)

On the other hand, when the interpolation and the time smoothing are performed in the interpolating part 920 and/or the time smoothing part, respectively, the resulting filter coefficients may have an energy value different from that of the original filter coefficients. In that case, energy normalization may further be needed to prevent this problem. When the rendering domain does not coincide with the spatial information domain, the domain converting part 930 converts the spatial information domain into the rendering domain. If the rendering domain coincides with the spatial information domain, this domain conversion is not needed. Here, when the spatial information domain is a subband domain and the rendering domain is a frequency domain, the domain conversion can involve expanding or reducing the coefficients to match the frequency range and time range of each subband.

Fig. 10 is a schematic block diagram illustrating a process for generating surround converting information according to another embodiment of the present invention. As shown in Fig. 10, the information converting part, apart from the channel mapping part, can include a coefficient generating part 1000 and an integrating part 1020. Here, the coefficient generating part 1000 includes at least one sub coefficient generating part (coef_1 generating part 1000_1, coef_2 generating part 1000_2, ..., coef_N generating part 1000_N). The information converting part may further include an interpolating part 1010 and a domain converting part 1030 to further process the filter coefficients. Here, the interpolating part 1010 includes at least one of the sub interpolating parts 1010_1, 1010_2, ..., 1010_N. Unlike the embodiment of Fig. 9, in the embodiment of Fig. 10 the interpolating part 1010 interpolates, for each channel, the corresponding coefficients generated by the coefficient generating part 1000. For example, the coefficient generating part 1000 generates the coefficients FL_L and FL_R for a mono downmix channel, and the coefficients FL_L1, FL_L2, FL_R1 and FL_R2 for a stereo downmix channel.

Fig. 11 is a schematic block diagram illustrating a process for generating surround converting information according to another embodiment of the present invention. Unlike the embodiments of Figs. 9 and 10, in the embodiment of Fig. 11 an interpolating part 1100 interpolates each channel mapping output value, and a coefficient generating part 1110 then generates the coefficients for each channel using the interpolation results.

In the embodiments of Fig. 9 to Fig. 11, processing such as filter coefficient generation is described as being performed in the frequency domain, because the channel mapping output values are in the frequency domain (for example, each parameter band unit has a single value). In addition, when the pseudo-surround rendering is performed in the subband domain, the domain converting part 930 or 1030 does not perform domain conversion but can bypass the subband-domain filter coefficients, or can perform a conversion to adjust the frequency resolution and then output the conversion result.

As described above, the present invention can provide an audio signal having a pseudo-surround sound in a decoding device that receives an audio bitstream containing a downmix signal and spatial information of a multi-channel signal, even in an environment where the decoding device cannot generate a multi-channel signal.

It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention. Thus, the present invention is intended to cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents.

Claims

1. A method for decoding an audio signal, the method comprising:

extracting a downmix signal and spatial information from a received audio signal;

generating surround converting information using the spatial information; and

rendering the downmix signal in a preset rendering domain, using the surround converting information, to generate a pseudo-surround signal,

wherein the pseudo-surround signal is used to provide surround sound in a stereo audio system.

2. The method of claim 1, further comprising converting the pseudo-surround signal of the rendering domain into a pseudo-surround signal of an output domain.

3. The method of claim 1, further comprising:

decoding the downmix signal, wherein when the downmix domain of the decoded downmix signal differs from the preset rendering domain, the downmix domain is converted into the preset rendering domain using a preset domain conversion method.

4. The method of claim 3, wherein converting the downmix domain comprises at least one of the following operations:

when the downmix domain is a time domain, converting the time domain into the preset rendering domain;

when the downmix domain is a discrete frequency domain, converting the discrete frequency domain into the preset rendering domain; and

when the downmix domain is a discrete frequency domain, converting the discrete frequency domain into a time domain and then converting the converted time domain into the preset rendering domain.

5. The method of claim 1, wherein the preset rendering domain is a subband domain, and the rendering step comprises:

applying the surround converting information to the downmix signal; and

adding the results of the applying.

6. The method of claim 1, wherein the surround converting information is generated using the spatial information and filter information.

7. The method of claim 1, further comprising:

receiving the audio signal including the downmix signal and the spatial information,

wherein the downmix signal and the spatial information are extracted from the audio signal.

8. The method of claim 1, wherein the spatial information includes at least one of a channel level difference and an inter-channel coherence.

9. An apparatus for decoding an audio signal, the apparatus comprising:

a demultiplexing part extracting a downmix signal and spatial information from a received bitstream;

an information converting part generating surround converting information using the spatial information; and

a pseudo-surround generating part rendering the downmix signal in a preset rendering domain, using the surround converting information, to generate a pseudo-surround signal.

10. The apparatus of claim 9, wherein the pseudo-surround generating part comprises an output domain converting part converting the pseudo-surround signal of the preset rendering domain into a pseudo-surround signal of an output domain.

11. The apparatus of claim 9, wherein the pseudo-surround generating part comprises:

a rendering domain converting part converting the downmix domain into the preset rendering domain when the downmix domain differs from the preset rendering domain.

12. The apparatus of claim 11, wherein the rendering domain converting part comprises at least one of the following:

a first domain converting part converting the time domain into the preset rendering domain when the downmix domain is a time domain;

a second domain converting part converting the discrete frequency domain into the preset rendering domain when the downmix domain is a discrete frequency domain; and

a third domain converting part converting the discrete frequency domain into a time domain, and then converting the converted time domain into the preset rendering domain, when the downmix domain is a discrete frequency domain.

13. The apparatus of claim 9, wherein the preset rendering domain is a subband domain, the downmix signal comprises a first signal and a second signal, and

the pseudo-surround generating part applies the surround converting information to the first signal, applies the surround converting information to the second signal, and adds the first signal to the second signal.

14. The apparatus of claim 9, wherein the surround converting information is generated using the spatial information and filter information.

15. The apparatus of claim 9, wherein the demultiplexing part receives the audio signal including the downmix signal and the spatial information, and the downmix signal and the spatial information are extracted from the audio signal.

16. The apparatus of claim 9, wherein the spatial information includes at least one of a channel level difference and an inter-channel coherence.