CN1402593A

CN1402593A - Signal processing method for headphone reproduction of 5.1-channel surround sound

Info

Publication number: CN1402593A
Application number: CN 02134416
Authority: CN
Inventors: 谢菠荪; 王杰; 管善群; 李长滨
Original assignee: South China University of Technology SCUT; TCL King Electronics Shenzhen Co Ltd
Current assignee: South China University of Technology SCUT; TCL King Electronics Shenzhen Co Ltd
Priority date: 2002-07-23
Filing date: 2002-07-23
Publication date: 2003-03-12
Anticipated expiration: 2022-07-23
Also published as: CN1219415C

Abstract

First, the original time-domain signals l, r, c, ls, rs, lfe of 5.1-channel surround sound are input. Then: (1) the left and right surround signals ls, rs are decorrelated and processed to simulate multiple surround loudspeakers, giving the signals ls' and rs'; (2) the signals ls', rs'; l, r; c, lfe are convolved with functions derived from head-related impulse responses and mixed to obtain the signals eL, eR; (3) the signals eL, eR are fed to a pair of headphones. The invention eliminates the in-head localization effect of headphone listening while reproducing 5.1-channel surround sound. It simulates the surround effect without simulating the reflected sound of a listening room.

Description

A signal processing method for headphone reproduction of 5.1-channel surround sound

(1) Technical field

The present invention relates to the field of electroacoustics, and specifically to a signal processing method for headphone reproduction of 5.1-channel surround sound.

(2) Background art

5.1-channel surround sound is widely used in film, DVD, and home theater. It has five independent full-band channels (front left l, center c, front right r, left surround ls, and right surround rs) plus one low-frequency effects channel lfe. (Note: in this document lowercase letters denote time-domain signals and the corresponding uppercase letters denote frequency-domain signals, e.g. l and L, r and R.) However, when 5.1-channel signals are reproduced directly over headphones, in-head localization occurs: the sound image tends to concentrate inside the head, which gives an unnatural listening impression. To solve this problem, Dolby Laboratories proposed a technique called Dolby Headphone, which uses signal processing to eliminate in-head localization in headphone reproduction. However, Dolby Headphone processing inappropriately introduces a room acoustics model of the listening room to simulate room reflections, so while it eliminates in-head localization it also brings new defects.

5.1-channel surround sound has five independent full-band signals and one optional low-frequency effects channel. In common products such as home theaters, the five signals (from a DVD or similar source) are fed to five corresponding loudspeakers. In the existing international standard, the azimuths of the five loudspeakers are (coordinate convention: -180° < θ ≤ 180°, θ = 0° is straight ahead, θ = 90° is directly to the left): θ_L = 30°, θ_R = -30°, θ_C = 0°, θ_LS = 110° (±10°), θ_RS = -110° (±10°). The positions of the left and right surround loudspeakers are thus not strictly fixed. The lfe channel is reproduced by a subwoofer. Because the lfe channel is processed in the same way as the center channel c, it is omitted from the analysis below. If the 5.1-channel signals are reproduced directly over headphones (omitting lfe), the headphone signals can be written as:

$$e_L = l + ls + 0.707\,c, \qquad e_R = r + rs + 0.707\,c \tag{1}$$

Here l and ls are summed directly and fed to the left earphone, and r and rs are summed directly and fed to the right earphone. The c signal is attenuated by 3 dB (a factor of 0.707) and fed to both earphones; the 3 dB attenuation keeps the total signal power constant.
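The direct downmix described above can be sketched as follows. This is a minimal illustration, not part of the patent: the function names are hypothetical, signals are plain Python lists of equal length, and the centre gain is taken as 1/sqrt(2), which the text rounds to 0.707.

```python
import math

# -3 dB as an amplitude factor; the text quotes the rounded value 0.707.
C_GAIN = math.sqrt(0.5)


def direct_downmix(l, r, c, ls, rs):
    """Return (e_L, e_R) for plain headphone playback (lfe omitted)."""
    e_left = [a + b + C_GAIN * m for a, b, m in zip(l, ls, c)]
    e_right = [a + b + C_GAIN * m for a, b, m in zip(r, rs, c)]
    return e_left, e_right


def center_power(c_sample):
    """Total power a centre sample contributes over both earphones."""
    return 2 * (C_GAIN * c_sample) ** 2
```

With this gain a centre sample of amplitude 1 contributes a total power of 1 across the two earphones, which is the power-conservation argument made in the text.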

However, direct headphone reproduction produces in-head localization: the sound image appears inside the listener's head, which gives an unnatural listening impression. Psychoacoustic studies show that in-head localization has two causes. (1) When the 5.1-channel signals are reproduced by loudspeakers, the sound wave from each loudspeaker is scattered by the head, the pinnae, and so on, and the waves superpose at the two ears, which creates a sound image distribution over a certain range of angles. In headphone reproduction the left and right channel signals are fed directly to the entrances of the ear canals. There is neither the cross-feed between left and right channels nor the scattering by the head and pinnae, so the spatial information of the original sound field is destroyed. (2) In sound reproduction, room reflections and late reverberation are important for correct externalization of the image, and headphone reproduction makes no use of this information.

Therefore, if these two factors are simulated artificially by signal processing, in-head localization can be eliminated in headphone reproduction. Following this idea, Dolby Laboratories in the United States invented a technique called Dolby Headphone. In a 5.1-channel home theater, let the time-domain impulse responses (i.e. head-related impulse responses, HRIRs) from the left and right loudspeakers to the listener's two ears be h_LL, h_RL, h_LR, h_RR, those from the center loudspeaker be h_LC, h_RC, and those from the left and right surround loudspeakers be h_LLS, h_RLS, h_LRS, h_RRS. The first subscript denotes the ear and the remaining subscript the loudspeaker. Take the l signal as an example: convolving the time-domain signal l with the impulse responses from the left loudspeaker to the two ears gives e_L = h_LL * l and e_R = h_RL * l. When these are reproduced over headphones, the sound at the two ears is the same as with a real L loudspeaker, so the L loudspeaker has been virtualized in headphone reproduction. The input signals l, c, r, ls, rs are therefore each processed in this way and then summed:

$$\begin{aligned} e_L &= h_{LL} * l + h_{LC} * c + h_{LR} * r + h_{LLS} * ls + h_{LRS} * rs \\ e_R &= h_{RL} * l + h_{RC} * c + h_{RR} * r + h_{RLS} * ls + h_{RRS} * rs \end{aligned} \tag{2}$$

Reproducing e_L, e_R over headphones virtualizes all five loudspeakers of the 5.1-channel system at once. The sound pressure at the two ears then equals (or is proportional to) the sound pressure that loudspeaker reproduction would produce at the listener's ears, which achieves the goal of virtualizing 5.1-channel surround sound over headphones.
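The free-field virtualization above (one convolution per loudspeaker-ear path, summed per ear) can be sketched as follows. The dictionary layout, the function names, and the use of direct time-domain convolution are assumptions made for illustration; real HRIRs would come from measurement.

```python
def convolve(x, h):
    """Direct (full) linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def virtualize(signals, hrirs):
    """Sum HRIR-filtered loudspeaker feeds for each ear.

    signals: dict mapping a loudspeaker name to its samples
             (all signals must have the same length).
    hrirs:   dict mapping (ear, loudspeaker) to an impulse response
             (all responses must have the same length), ear in {"L", "R"}.
    """
    ears = {}
    for ear in ("L", "R"):
        total = None
        for speaker, samples in signals.items():
            path = convolve(samples, hrirs[(ear, speaker)])
            total = path if total is None else [a + b for a, b in zip(total, path)]
        ears[ear] = total
    return ears["L"], ears["R"]
```

For the five full-band channels this needs ten convolutions per block, which is the count the text later reduces by exploiting symmetry.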

Above signal processing has only been considered the direct sound wave of loud speaker to ears, and free field situation about retransmitting is not just considered the reflected sound of listening room.Because reflected sound has certain effect to eliminating to locate in the head, therefore, in existing signal processing method, introduce the room acoustics model of listening room, simulated the reflected sound in room, but but brought following defective: when (1) was retransmitted at the surround sound of reality, the reflected sound of listening room can destroy the acoustic space information of the original sound field (as music hall) of surround sound signal.Therefore, listening room generally all adopts the design (reverberation time of past listening room generally got 0.3～0.4 second, got the lower reverberation time in recent years) of sound absorption.And in the signal processing of earphone repeat, simulated the reflected sound of listening room, this simulation is to be difficult to control.The intensity of simulated reflections sound excessively a little less than, to eliminating the DeGrain of location in the head, otherwise, cross the original sound field spatial information that can destroy surround sound signal by force again, cause a kind of new factitious auditory effect.In the particularly existing signal processing method, the signal of l, the c of 5.1 path surround sounds, r, ls, five full range bands of rs path has all been made the reflected sound of simulation listening room and handled.And in the 5.1 paths recording of reality, the c path is normally recorded (as the dialogue in the film) of speech signal, and speech signal is added the definition that excessive reflected sound can destroy language.(2) be the T room of second for a reverberation time, its impulse response length is about the T order of magnitude of second.Thereby, in order to simulate the reflected sound of listening room fully, need be with voice signal and room impulse 
response convolution.If the signals sampling frequency is 48kHz, T=0.3 second, then impulse response length will be 48000 * 0.3=14400 point.In the signal processing of reality, it is very difficult carrying out long like this impulse response convolution in real time.Therefore, in existing signal processing, the former secondary reflection sound in the listening room have only been simulated.This simulation also can influence the repeating transmission effect.
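The arithmetic behind this cost argument can be sketched as follows. The function names are hypothetical, and the cost model (one multiply-accumulate per tap per output sample for direct time-domain convolution) is a simplifying assumption that ignores fast-convolution methods.

```python
def impulse_response_taps(sample_rate_hz, reverb_time_s):
    """Number of samples in an impulse response lasting reverb_time_s."""
    return round(sample_rate_hz * reverb_time_s)


def macs_per_second(taps, sample_rate_hz):
    """Multiply-accumulates per second for one direct FIR convolution."""
    return taps * sample_rate_hz


room_taps = impulse_response_taps(48000, 0.3)      # room response from the text
room_cost = macs_per_second(room_taps, 48000)      # one channel to one ear
hrir_cost = macs_per_second(128, 48000)            # 128-point HRIR used later
```

Under this model a 0.3 second room response costs 112.5 times as much per path as a 128-point head-related impulse response.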

(3) Summary of the invention

The present invention addresses the defects of the prior art described above by providing a signal processing method for headphone reproduction of 5.1-channel surround sound. The method eliminates the in-head localization of headphone reproduction without simulating the reflected sound of a listening room, and thereby improves the reproduced sound.

The signal processing method for headphone reproduction of 5.1-channel surround sound according to the invention is characterized by the following steps and processing conditions:

Step 1: input the original 5.1-channel time-domain signals, namely the left channel signal l, right channel signal r, center channel signal c, left surround signal ls, right surround signal rs, and low-frequency effects signal lfe;

Step 2: delay the original left and right surround signals ls, rs in the time domain to obtain three pairs of decorrelated surround signals ls, ls1, ls2, rs, rs1, and rs2; using the stereo pan-pot method, mix the five signals ls, ls1, ls2, rs1, and rs2 in proportion to obtain the signal ls', and mix the five signals rs, ls1, ls2, rs1, and rs2 in proportion to obtain the signal rs';

Step 3: form the sum signal (ls'+rs') and the difference signal (ls'-rs') from ls' and rs', and virtualize them to obtain the signals (ls'+rs')*σ₂ and (ls'-rs')*δ₂;

Step 4: form the sum signal (l+r) and the difference signal (l-r) from the left and right channel signals l, r; add the center channel signal c and the low-frequency effects signal lfe, amplify the result by +3 dB to obtain the signal 1.414(c+lfe), and add it to the sum signal (l+r) to obtain the signal [l+r+1.414(c+lfe)];

Step 5: virtualize the signals [l+r+1.414(c+lfe)] and (l-r) to obtain the two signals [l+r+1.414(c+lfe)]*σ₁ and (l-r)*δ₁;

Step 6: add the two signals [l+r+1.414(c+lfe)]*σ₁ and (ls'+rs')*σ₂, and separately add the two signals (l-r)*δ₁ and (ls'-rs')*δ₂; form the sum and the difference of the two resulting signals and attenuate each by 6 dB to obtain the signals e_L, e_R;

Step 7: feed the signals e_L, e_R to a pair of headphones for reproduction.

In step 2, the mixing proportions of the five signals ls, ls1, ls2, rs1, and rs2 are 1, 0.999, 0.966, 0.101, and 0.259, and the mixing proportions of the five signals rs, ls1, ls2, rs1, and rs2 are 1, 0.101, 0.259, 0.999, and 0.966. In step 3, virtualizing the sum signal (ls'+rs') and the difference signal (ls'-rs') means convolving them respectively with the two functions σ₂, δ₂ obtained from the head-related impulse responses for ±120°. In step 5, virtualizing the signals [l+r+1.414(c+lfe)] and (l-r) means convolving them respectively with the two functions σ₁, δ₁ obtained from the head-related impulse responses for ±30°.
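Steps 3 to 6 above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, signals are equal-length Python lists, the four filters have equal length, direct convolution stands in for whatever filter implementation is actually used, and the 6 dB attenuation is taken as a factor of 0.5.

```python
import math


def convolve(x, h):
    """Direct (full) linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def headphone_mix(l, r, c, lfe, ls_p, rs_p, sigma1, delta1, sigma2, delta2):
    """Steps 3-6: sum/difference, four convolutions, sum/difference again."""
    gain = math.sqrt(2.0)  # +3 dB, quoted as 1.414 in the text
    front_sum = [a + b + gain * (m + s) for a, b, m, s in zip(l, r, c, lfe)]
    front_diff = [a - b for a, b in zip(l, r)]
    surr_sum = [a + b for a, b in zip(ls_p, rs_p)]
    surr_diff = [a - b for a, b in zip(ls_p, rs_p)]
    # Sum path uses the sigma filters, difference path the delta filters.
    m_sig = [a + b for a, b in zip(convolve(front_sum, sigma1),
                                   convolve(surr_sum, sigma2))]
    s_sig = [a + b for a, b in zip(convolve(front_diff, delta1),
                                   convolve(surr_diff, delta2))]
    e_left = [0.5 * (a + b) for a, b in zip(m_sig, s_sig)]
    e_right = [0.5 * (a - b) for a, b in zip(m_sig, s_sig)]
    return e_left, e_right
```

With unit-impulse filters (no cross-feed) a left-only input comes out on the left earphone only, and a centre-only input appears equally in both earphones at about 0.707.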

The principle of the invention is as follows. In 5.1-channel surround recordings, the left and right surround signals ls, rs generally already contain the reflected-sound information of the original sound field, which is where 5.1-channel surround differs from ordinary two-channel stereo. In headphone reproduction, therefore, making good use of the ls, rs signals is enough to eliminate in-head localization fairly well, and there is no need to deliberately introduce listening-room reflections. This applies especially to the front c channel: in sound reproduction accompanying a picture, it mainly carries the dialogue of the film or television program. Because the listener's attention is then focused on the picture, there is no need to introduce artificial reflections to eliminate in-head localization.

On the other hand, in an ordinary home theater, practical constraints make it unsuitable to install many loudspeakers, so the left and right surround signals ls, rs are reproduced directly by a single pair of surround loudspeakers. In Dolby Headphone processing, ls and rs are reproduced by one pair of virtual surround loudspeakers (obtained by convolution with head-related impulse responses). In public cinemas with 5.1-channel surround sound, however, the ls, rs signals are reproduced by a series of surround loudspeakers arranged along the sides and rear of the auditorium. To improve the subjective sense of envelopment, the ls, rs signals may also be decorrelated before they are fed to the surround loudspeaker array. If headphone reproduction of 5.1-channel surround sound borrows the cinema approach, decorrelating ls and rs and then reproducing them through a series of virtual loudspeakers, a better result can be obtained.

In summary, the invention proposes the following. In the signal processing for headphone reproduction of 5.1-channel surround sound, free-field processing (time-domain convolution with head-related impulse responses) is sufficient as the basic processing for the front left l, center c, and right r channels, and there is no need to introduce listening-room reflections. The left and right surround signals ls, rs are decorrelated and then reproduced through six virtual loudspeakers LS, LS1, LS2, RS, RS1, RS2 (more virtual loudspeakers can of course be used). In practice, decorrelation can be offered as an optional function, and the listener can choose two or more virtual surround loudspeakers according to the nature of the program.

Compared with the prior art, the invention has the following advantages and beneficial effects:

1. The invention processes the 5+1 independent original signals of multichannel surround sound (including 5.1-channel and Dolby Surround) from a DVD or similar source and reproduces them over headphones. It eliminates the in-head localization of headphone reproduction and reproduces the 5.1-channel surround effect without simulating listening-room reflections, so it introduces no new unnatural listening impression.

2. In headphone reproduction the invention can simulate the multiple surround loudspeakers of a public cinema, rather than the single pair of surround loudspeakers of a home theater, which further improves the reproduced sound.

3. The invention can be implemented with general-purpose or dedicated DSP hardware, or in software written in a programming language (such as VC++) on a multimedia computer.

4. The invention can be used as a dedicated hardware circuit for sound reproduction in DVD players, televisions (including DTV and HDTV), and home theaters, and as hardware or software for sound reproduction on multimedia computers.

(4) Description of the drawings

Fig. 1 is a block diagram of the invention;

Fig. 2 shows the loudspeaker layout of the 5.1-channel system and the transfer functions to the two ears;

Fig. 3 shows the directions of the surround loudspeaker array in a cinema;

Fig. 4 shows the directions of the multiple virtual surround loudspeakers;

Fig. 5 is a schematic of the decorrelation and the virtual multiple surround loudspeakers;

Fig. 6 is a flow chart of the headphone surround signal processing;

Fig. 7 is a flow chart of the signal processing software;

Fig. 8 is a schematic of the virtual sound image directions.

(5) Embodiments

The invention is described in further detail below with reference to the drawings and embodiments.

The system block diagram of the invention is shown in Fig. 1: the input 5.1-channel surround signals are processed and then reproduced over headphones. Fig. 2 shows the loudspeaker layout of a typical 5.1-channel home theater and the transmission paths to the two ears.

The layout of the surround loudspeaker array in a cinema is shown in Fig. 3. Borrowing the cinema approach, the invention decorrelates the left and right surround signals and feeds them to multiple virtual surround loudspeakers. According to signal processing theory, decorrelation can be carried out in the time domain or the frequency domain, but time-domain processing is simpler. The simplest method is to delay the signal. For example, delaying the ls signal by t1 and t2 gives the signals ls1 and ls2. If t1 and t2 are chosen to exceed the autocorrelation time of the signal, then ls, ls1, and ls2 are three decorrelated signals. Likewise, delaying the rs signal by t3 and t4 gives rs1 and rs2, and rs, rs1, rs2 are decorrelated signals.

The choice of the delays t1, t2 and t3, t4 should take psychoacoustic factors into account:

(1) to remove the correlation between the signals more effectively, t1/t2 and t3/t4 should be non-integer;

(2) to prevent the delayed signals from being heard as echoes, t1, t2, t3, t4 should lie within the range of the Haas (precedence) effect;

(3) surround signals mostly carry the reflected sound of the original sound field, and in that case the lower the correlation among the six signals ls, ls1, ls2, rs, rs1, rs2, the better the subjective sense of envelopment. However, surround signals sometimes also carry specific sound effects that must be localized (for example a helicopter circling the listener). In that case the two original surround signals ls, rs are correlated, and the decorrelation processing must allow for this. The invention proposes left-right symmetric decorrelation, i.e. t1 = t3 and t2 = t4. When the original ls and rs are mutually uncorrelated, the six signals ls, ls1, ls2, rs, rs1, rs2 obtained by decorrelation are also mutually uncorrelated. When the original ls and rs are correlated, the signals within each of the three pairs (ls and rs, ls1 and rs1, ls2 and rs2) remain correlated, while signals from different pairs are uncorrelated. The localization of the surround sound effects is therefore not affected.

Accordingly, the processing of the invention can use t1 = t3 = 7 ms and t2 = t4 = 11 ms. In practice t1, t2, t3, t4 can be made adjustable within a certain range; a reasonable range is 5 to 30 ms.
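The delay-based decorrelation and the constraints on the delays can be sketched as follows. The function names are hypothetical, and two choices are assumptions rather than statements from the text: the "non-integer ratio" condition is checked as the larger delay divided by the smaller one, and the 5 to 30 ms adjustment range is used as a stand-in for the precedence-effect range.

```python
def delay_in_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000)


def delayed(x, n):
    """Delay x by n samples, keeping the original length (zero history)."""
    return ([0.0] * n + list(x))[:len(x)]


def delays_look_reasonable(t1_ms, t2_ms, lo_ms=5.0, hi_ms=30.0):
    """Check the range condition and the non-integer ratio condition."""
    in_range = lo_ms <= t1_ms <= hi_ms and lo_ms <= t2_ms <= hi_ms
    ratio = max(t1_ms, t2_ms) / min(t1_ms, t2_ms)
    return in_range and abs(ratio - round(ratio)) > 1e-9
```

At 48 kHz the suggested 7 ms and 11 ms correspond to 336 and 528 samples. Because the same delays are applied on both sides (t1 = t3, t2 = t4), identical ls and rs inputs yield identical delayed pairs, which is what keeps a panned surround effect localizable.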

After decorrelation, the six signals ls, ls1, ls2, rs, rs1, rs2 can be sent to the corresponding virtual loudspeakers for reproduction. The most direct method is to virtualize each of the six surround loudspeakers shown in Fig. 4 with its own head-related impulse responses. The headphone signals are then:

$$\begin{aligned} e_L &= h_{LL} * l + h_{LC} * c + h_{LR} * r + h_{LLS} * ls + h_{LLS1} * ls1 + h_{LLS2} * ls2 + h_{LRS} * rs + h_{LRS1} * rs1 + h_{LRS2} * rs2 \\ e_R &= h_{RL} * l + h_{RC} * c + h_{RR} * r + h_{RLS} * ls + h_{RLS1} * ls1 + h_{RLS2} * ls2 + h_{RRS} * rs + h_{RRS1} * rs1 + h_{RRS2} * rs2 \end{aligned} \tag{4}$$

Here h_LL, h_RL, h_LC, h_RC, h_LR, h_RR are the head-related impulse responses from the front left, center, and right loudspeakers to the listener's two ears, and h_LLS, h_RLS, h_LRS, h_RRS, h_LLS1, h_RLS1, h_LRS1, h_RRS1, h_LLS2, h_RLS2, h_LRS2, h_RRS2 are the head-related impulse responses from the six left and right surround loudspeakers to the listener's two ears. This processing requires one time-domain convolution for every loudspeaker-ear path and is quite complex, so further simplification is needed.

In level-difference stereo (surround) loudspeaker reproduction, producing a sound image at a given position does not require a loudspeaker at that position; the image can be produced by changing the signal feeds (for example the level difference) of loudspeakers at other positions. For example, in Fig. 2, take the two rear surround loudspeakers (with azimuths of ±120°). When they are fed with the signals ls and rs simultaneously, the surround-sound image localization formula gives the position θ_I of the reproduced image as:

$$\sin\theta_I = \frac{ls - rs}{ls + rs}\,\sin 120^\circ \tag{5}$$

By continuously varying ls/rs, a sound image can therefore be produced anywhere in the rear region between the ±120° loudspeakers. This method can be borrowed for headphone reproduction of decorrelated 5.1-channel surround signals: only the two surround loudspeakers at ±120° need to be virtualized with head-related impulse responses, and ls1, ls2, rs1, rs2 are mixed in suitable proportions into the input signals of the ±120° virtual loudspeakers. Denoting the mixed surround loudspeaker inputs by ls', rs':

$$\begin{aligned} ls' &= ls + \alpha_1\,ls1 + \alpha_2\,ls2 + \alpha_3\,rs1 + \alpha_4\,rs2 \\ rs' &= rs + \beta_1\,ls1 + \beta_2\,ls2 + \beta_3\,rs1 + \beta_4\,rs2 \end{aligned} \tag{6}$$

For example, when ls = rs = ls2 = rs1 = rs2 = 0, the ls1 signal is mixed into ls' and rs' in the proportions

$$ls' = \alpha_1\,ls1, \qquad rs' = \beta_1\,ls1 \tag{7}$$

Substituting these ls', rs' for ls, rs in formula (5), setting θ₁ = 135°, and using |ls'|² + |rs'|² = |ls1|² (conservation of total power) gives:

α₁ = 0.999, β₁ = 0.101

Processing the three signals ls2, rs1, rs2 in the same way gives:

α₁ = β₃ = 0.999, β₁ = α₃ = 0.101

α₂ = β₄ = 0.966, β₂ = α₄ = 0.259

This yields a simplified processing method for headphone reproduction of decorrelated 5.1-channel surround signals.
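The way such mixing proportions follow from a localization law plus power conservation can be sketched as below. This rests on an assumption: the localization formula referred to as (5) is taken to be the sine law sin θ_I = (ls - rs)/(ls + rs) · sin 120°. The function name and the 150° direction for the second virtual loudspeaker are also assumptions, chosen because they reproduce the quoted pair 0.966 and 0.259.

```python
import math


def pan_gains(image_deg, speaker_deg=120.0):
    """Gains (alpha, beta) placing an image at image_deg between
    loudspeakers at +/- speaker_deg, with alpha**2 + beta**2 == 1.

    Assumes the sine law: sin(image) = (a - b) / (a + b) * sin(speaker).
    """
    k = math.sin(math.radians(image_deg)) / math.sin(math.radians(speaker_deg))
    ratio = (1.0 - k) / (1.0 + k)          # beta / alpha from the sine law
    alpha = 1.0 / math.sqrt(1.0 + ratio * ratio)   # power conservation
    return alpha, alpha * ratio


alpha_1, beta_1 = pan_gains(135.0)   # direction stated in the text
alpha_2, beta_2 = pan_gains(150.0)   # assumed direction for ls2
```

Under these assumptions the 150° case gives 0.966 and 0.259, matching the text. The 135° case gives about 0.995 and 0.101, so β₁ matches the quoted value while α₁ differs slightly from the quoted 0.999.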

The original surround signals ls, rs are therefore not processed directly with formula (2). They are first converted into ls', rs' by the processing of Fig. 5 (i.e. formula (6)) and then virtualized. In short, ls', rs' obtained after decorrelation and multi-loudspeaker processing replace the input signals ls, rs in formula (2), that is:

$$\begin{aligned} e_L &= h_{LL} * l + h_{LC} * c + h_{LR} * r + h_{LLS} * ls' + h_{LRS} * rs' \\ e_R &= h_{RL} * l + h_{RC} * c + h_{RR} * r + h_{RLS} * ls' + h_{RRS} * rs' \end{aligned} \tag{8}$$
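The mixing of the delayed surround signals into ls' and rs' can be sketched as follows, using the proportions quoted in the text. The function names and the sample-based delay arguments are assumptions for illustration; signals are equal-length Python lists.

```python
# Gains of ls1, ls2, rs1, rs2 in ls' and in rs' (values quoted in the text).
ALPHA = (0.999, 0.966, 0.101, 0.259)
BETA = (0.101, 0.259, 0.999, 0.966)


def delayed(x, n):
    """Delay x by n samples, keeping the original length (zero history)."""
    return ([0.0] * n + list(x))[:len(x)]


def mix_surround(ls, rs, d1, d2):
    """Return (ls', rs') from ls, rs and two delays given in samples."""
    parts = (delayed(ls, d1), delayed(ls, d2), delayed(rs, d1), delayed(rs, d2))
    ls_p = [x + sum(g * p[i] for g, p in zip(ALPHA, parts))
            for i, x in enumerate(ls)]
    rs_p = [x + sum(g * p[i] for g, p in zip(BETA, parts))
            for i, x in enumerate(rs)]
    return ls_p, rs_p
```

Because BETA is ALPHA with the left and right halves swapped, exchanging the ls and rs inputs exchanges the two outputs, which reflects the left-right symmetry of the processing.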

This signal processing can be simplified. Formula (8) requires ten time-domain convolutions in total. For the c signal, the phantom center channel method of multichannel sound reproduction can be borrowed: the C loudspeaker is omitted, and the c channel signal is attenuated by 3 dB (multiplied by 0.707) and fed to both the L and R loudspeakers. When c ≠ 0 and all other signals are zero, the L and R feeds are both 0.707c, so the sound image is straight ahead. In headphone reproduction the c signal can therefore be multiplied by 0.707, added to the l and r signals, and processed together with them. Formula (8) then reduces to:

$$\begin{aligned} e_L &= h_{LL} * (l + 0.707\,c) + h_{LR} * (r + 0.707\,c) + h_{LLS} * ls' + h_{LRS} * rs' \\ e_R &= h_{RL} * (l + 0.707\,c) + h_{RR} * (r + 0.707\,c) + h_{RLS} * ls' + h_{RRS} * rs' \end{aligned} \tag{9}$$

This formula requires eight convolutions in total. Because of left-right symmetry, one can set h_LL = h_RR = a₁, h_LR = h_RL = b₁, h_LLS = h_RRS = a₂, h_LRS = h_RLS = b₂ and use the symmetry to simplify the processing further. It can be verified that formula (9) is fully equivalent to the following formula:

$$\begin{bmatrix} e_L \\ e_R \end{bmatrix} = \begin{bmatrix} 0.5 & 0.5 \\ 0.5 & -0.5 \end{bmatrix} \left\{ \begin{bmatrix} \sigma_1 & 0 \\ 0 & \delta_1 \end{bmatrix} * \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} l + 0.707\,c \\ r + 0.707\,c \end{bmatrix} + \begin{bmatrix} \sigma_2 & 0 \\ 0 & \delta_2 \end{bmatrix} * \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} ls' \\ rs' \end{bmatrix} \right\} \tag{10}$$

Here σ₁ = a₁ + b₁, δ₁ = a₁ - b₁, σ₂ = a₂ + b₂, δ₂ = a₂ - b₂; they are obtained by adding or subtracting head-related impulse responses. The matrix

$$\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

is called the sum-and-difference matrix, i.e. the MS matrix. The symbol * denotes convolution, and products written without * denote ordinary multiplication. Reading formula (10) from right to left, only four frequency-domain multiplications (or four time-domain convolutions) are needed in total. The signal processing is thus simplified, which helps in developing hardware for real-time processing. In practice the convolutions can be implemented by designing suitable finite impulse response (FIR) or infinite impulse response (IIR) digital filters.
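The claimed equivalence between the eight-convolution form and the four-convolution sum-and-difference form can be checked numerically. The sketch below uses hypothetical function names, short made-up test sequences, and direct convolution; it assumes the symmetric responses a₁, b₁, a₂, b₂ all have the same length.

```python
def convolve(x, h):
    """Direct (full) linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def vsum(*vectors):
    """Element-wise sum of equal-length lists."""
    return [sum(vals) for vals in zip(*vectors)]


def vdiff(a, b):
    """Element-wise difference of two equal-length lists."""
    return [x - y for x, y in zip(a, b)]


def front_feeds(l, r, c):
    """Fold the centre channel into left and right at -3 dB (0.707)."""
    return ([x + 0.707 * m for x, m in zip(l, c)],
            [x + 0.707 * m for x, m in zip(r, c)])


def direct_form(l, r, c, ls_p, rs_p, a1, b1, a2, b2):
    """Eight convolutions: one per loudspeaker-ear path."""
    lf, rf = front_feeds(l, r, c)
    e_left = vsum(convolve(lf, a1), convolve(rf, b1),
                  convolve(ls_p, a2), convolve(rs_p, b2))
    e_right = vsum(convolve(lf, b1), convolve(rf, a1),
                   convolve(ls_p, b2), convolve(rs_p, a2))
    return e_left, e_right


def ms_form(l, r, c, ls_p, rs_p, a1, b1, a2, b2):
    """Four convolutions using sum (sigma) and difference (delta) filters."""
    lf, rf = front_feeds(l, r, c)
    sigma1, delta1 = vsum(a1, b1), vdiff(a1, b1)
    sigma2, delta2 = vsum(a2, b2), vdiff(a2, b2)
    m_sig = vsum(convolve(vsum(lf, rf), sigma1),
                 convolve(vsum(ls_p, rs_p), sigma2))
    s_sig = vsum(convolve(vdiff(lf, rf), delta1),
                 convolve(vdiff(ls_p, rs_p), delta2))
    return ([0.5 * (m + s) for m, s in zip(m_sig, s_sig)],
            [0.5 * (m - s) for m, s in zip(m_sig, s_sig)])
```

The saving comes from applying the MS matrix before and after the filters, so each sum or difference signal is filtered once instead of each channel being filtered once per ear.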

The flow chart of the invention is shown in Fig. 6. The invention can be implemented as a hardware circuit built from general-purpose signal processing chips, as a dedicated integrated circuit chip, or as software on a multimedia computer. It can be used in DVD players, televisions, home theaters, multimedia computers, and similar equipment.

Embodiment 1: application to DVD and television

The 5.1-channel surround (digital) signals output by a DVD decoder or obtained from digital television broadcasting are virtualized according to the flow shown in Fig. 6 to obtain the two signals e_L, e_R, which are fed to headphones to reproduce the surround effect. The virtual signal processing can be part of the hardware circuitry of a DVD player or of a television set.

Embodiment 2: application to home theater

The 5.1-channel surround (digital) signals output by the DVD decoder are fed to the home theater amplifier. The virtual signal processing shown in Fig. 6 is a functional circuit within the amplifier; it produces the two signals e_L, e_R, which are fed to headphones for reproduction.

Embodiment 3: application to multimedia computers

The computer's DVD-ROM drive reads the disc, and decoding gives the 5.1-channel surround (digital) signals. Computer software carries out the virtual signal processing shown in Fig. 6 (dedicated hardware can also be used), and the resulting e_L, e_R signals are output through the computer's sound card to headphones for reproduction.

The software implementation on a multimedia computer is described in detail below. The head-related impulse responses (HRIRs) used in the signal processing can be obtained by measurement. The HRIRs used here have a sampling frequency of 48 kHz (44.1 kHz can also be used), a length of 128 points (256 or 512 points can also be used), 16-bit resolution, and diffuse-field equalization.

As shown in Fig. 7, the signal processing reads the original 5.1-channel surround signals (from a DVD or hard disk), virtualizes them, and feeds the result to headphones. Assuming the original 5.1-channel signals are digital:

Step 1: read 128 time-domain samples of each of l, r, c, ls, rs, lfe into a buffer;

Step 2: using the buffer, delay ls and rs by 7 ms and 11 ms to obtain ls1, ls2, rs1, rs2, and combine them linearly to obtain ls' = (ls + α₁ls1 + α₂ls2 + α₃rs1 + α₄rs2) and rs' = (rs + β₁ls1 + β₂ls2 + β₃rs1 + β₄rs2); (note: when pass-through is selected, i.e. no decorrelation and no virtual multi-loudspeaker processing, this step can be skipped)

Step 3: apply the sum-and-difference (MS) operation to the 128 time-domain samples of l, r and of ls', rs' to obtain (l+r), (l-r) and (ls'+rs'), (ls'-rs');

Step 4: add the 128 time-domain samples of c and lfe and multiply by 1.414 to obtain 1.414(c+lfe), then add this to (l+r) to obtain (l+r+1.414c+1.414lfe);

Step 5: using a partitioned fast convolution algorithm, convolve the 128 time-domain samples of (l+r+1.414c+1.414lfe), (l-r), (ls'+rs'), (ls'-rs') with the 128-point σ₁, δ₁, σ₂, δ₂ respectively to obtain (l+r+1.414c+1.414lfe)*σ₁, (l-r)*δ₁, (ls'+rs')*σ₂, (ls'-rs')*δ₂;

Step 6: add the corresponding terms to obtain m' = (l+r+1.414c+1.414lfe)*σ₁ + (ls'+rs')*σ₂ and s' = (l-r)*δ₁ + (ls'-rs')*δ₂;

Step 7: apply the sum-and-difference transform to m', s' and multiply by 0.5 to obtain the time-domain e_L, e_R, which are sent to the headphones after D/A conversion by the sound card;

Step 8: repeat the process above until all the data have been processed.
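The block-by-block filtering in step 5 can be illustrated with a time-domain overlap-add sketch. This is a substitution made for clarity: the text names a partitioned fast convolution algorithm (normally FFT-based), whereas the code below convolves each 128-sample block directly and adds the overlapping tails. The function names are hypothetical.

```python
def convolve(x, h):
    """Direct (full) linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def block_convolve(x, h, block=128):
    """Overlap-add: filter x with h one block at a time.

    Each block produces len(block) + len(h) - 1 output samples; the tail
    of one block overlaps the start of the next and is summed into it.
    """
    out = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        segment = x[start:start + block]
        for k, value in enumerate(convolve(segment, h)):
            out[start + k] += value
    return out
```

The result matches a single long convolution, so the processing loop can run on 128-sample buffers as they arrive instead of waiting for the whole signal.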

The invention can be implemented well as described above.

Experiments verified the ability of the headphone system described above to virtualize 5.1-channel surround sound. The experiments had two parts: a sound image localization experiment and a subjective comparison experiment.

The localization experiment verifies the system's ability to reproduce sound images in different directions in the horizontal plane. First, 5.1-channel surround signals corresponding to different azimuths in the horizontal plane were generated on an audio workstation. Ordinary 5.1-channel surround sound produces spatial sound images by discrete and pairwise signal feeding. Feeding the signal to a single loudspeaker, with the signals of the other loudspeakers set to zero, produces an image in the direction of that loudspeaker. Feeding the signal to a pair of adjacent loudspeakers and adjusting the level difference between the two loudspeaker signals produces an image between them. The same discrete and pairwise method was used here to generate the signals:

(1) For the front region 0° ≤ θ ≤ 30°, the image is produced by the L and C loudspeakers, with r = ls = rs = 0. Likewise, for the front region -30° ≤ θ < 0°, the image is produced by the R and C loudspeakers, with l = ls = rs = 0. The 5.1-channel signal feeds were set to produce signals corresponding to θ_I = 0°, ±15°, ±30°, where θ_I is the position of the reproduced image.

(2) For the rear regions 120° ≤ θ ≤ 180° and -180° < θ ≤ -120°, the image is produced by the LS and RS loudspeakers, with l = r = c = 0. The LS and RS loudspeakers were fed with suitable signals to produce images corresponding to θ_I = ±120°, ±150°, and 180°.

(3) For the lateral region 30° < θ < 120°, the image is produced by the L and LS loudspeakers. Earlier research shows, however, that in 5.1-channel surround reproduction it is difficult to produce a stable lateral image with only the L and LS loudspeakers; this is a defect of 5.1-channel surround sound. For lateral directions it is therefore difficult to determine the signal feeds from the image position. Instead, signals with channel level differences 20lg(ls/l) of -12 dB, -6 dB, 0 dB, 6 dB, and 12 dB were used.
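The lateral test signals are specified by the level difference 20lg(ls/l), so the corresponding amplitude ratios follow directly. The sketch below converts the five values; the function names are hypothetical, and normalising each pair to unit total power is an assumption that the text does not state.

```python
import math


def level_ratio(level_db):
    """Amplitude ratio ls/l for a level difference 20*lg(ls/l) in dB."""
    return 10 ** (level_db / 20)


def pair_gains(level_db):
    """(l gain, ls gain) with the given level difference and unit power.

    The unit-power normalisation is an illustrative assumption.
    """
    ratio = level_ratio(level_db)
    norm = math.sqrt(1.0 + ratio * ratio)
    return 1.0 / norm, ratio / norm


lateral_ratios = {db: level_ratio(db) for db in (-12, -6, 0, 6, 12)}
```

A 6 dB difference is close to a factor of two in amplitude, so the five settings span roughly a quarter to four times the front-channel amplitude.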

The test signal was pink noise. It was processed by the method above to obtain 5.1-channel signals corresponding to different image directions in the horizontal plane (2 seconds of signal for each direction), and artificial reverberation was added to the ls and rs channels with a reverberation time of 0.5 s, a first-reflection delay of 10 ms, and a direct-to-reverberant ratio of +3 dB. These 5.1-channel signals were processed by the flow shown in Fig. 6 and reproduced over headphones. (Note: the artificial reverberation added here should be regarded as reverberation contained in the original 5.1-channel surround signals, not as listening-room reflections added during the virtual signal processing.) The signal processing simulated only one pair of virtual surround loudspeakers and applied no decorrelation to the surround signals; that is, in Fig. 6 the ls and rs signals were passed straight through, without decorrelation and without virtual multi-loudspeaker processing.

During the experiment, the listeners judge the direction and distance of the virtual sound image. To give the listeners a reference for comparison, an ordinary stereo signal produced with a pan-pot, moving from left to right, is played before the experiment. Because listeners localize sound images less accurately in headphone reproduction than in loudspeaker reproduction, the requirements on the listeners are mainly as follows:

(1) For the sound image position, the listener is only required to judge which of the sound image directions shown in Fig. 8 is closest to what is heard. When the listening result coincides with the theoretical result, the experimental result is considered to agree with the theoretical calculation; otherwise it is considered not to agree.

(2) For the distance of the virtual sound image, the listener is only required to judge whether the sound image is outside the head, at the head surface, or inside the head.

A total of 8 listeners took part in the experiment, and their results were then analyzed statistically.

Table 1 gives the statistics of the front and rear localization experiments. For front sound image directions, -30° ≤ θ ≤ 30°, the results of 7 listeners agree with the theoretical result, while one listener judged the front sound image to be at the mirror position at the rear. Many past experiments have in fact shown that a small fraction of people do easily confuse sound images at front and rear mirror positions. It can therefore still be considered that the system is able to virtualize front sound images in the horizontal plane. For rear sound image directions, 120° ≤ θ ≤ 180° and -180° < θ ≤ -120°, all 8 listeners judged correctly.

Table 1 Statistics of experimental results (number of listeners giving each judgment, out of 8)

Theoretical value θ (°)	-30	-15	0	15	30	-120	-150	180	150	120
Correct direction	7	7	7	7	7	8	8	8	8	8
In the head	0	0	0	0	0	0	0	3	0	0
Head surface	2	3	7	5	1	1	5	5	2	0
Outside the head	6	5	1	3	7	7	3	0	6	8

For the sound image distance, except at θ = 180°, where 3 listeners judged the sound image to be inside the head, all listeners judged the sound image to be outside the head or at the head surface; by contrast, when an ordinary two-channel stereo signal was reproduced, all 8 listeners judged the sound image to be inside the head. Thus, after virtual processing, in-head localization can be eliminated or partly eliminated during reproduction, an improvement over reproducing an ordinary two-channel stereo signal directly over headphones.
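As a quick consistency check on Table 1, the counts can be tabulated and turned into rates. The data values are those given in the table; the variable and function names are hypothetical and serve only this illustration.

```python
# Table 1 counts as given in the text: number of listeners (out of 8).
ANGLES = [-30, -15, 0, 15, 30, -120, -150, 180, 150, 120]
CORRECT = [7, 7, 7, 7, 7, 8, 8, 8, 8, 8]
IN_HEAD = [0, 0, 0, 0, 0, 0, 0, 3, 0, 0]
SURFACE = [2, 3, 7, 5, 1, 1, 5, 5, 2, 0]
OUTSIDE = [6, 5, 1, 3, 7, 7, 3, 0, 6, 8]
N_LISTENERS = 8

def summarize():
    """Return one summary record per theoretical direction."""
    rows = []
    for i, angle in enumerate(ANGLES):
        rows.append({
            "angle": angle,
            # The three distance judgments should account for every listener.
            "distance_total": IN_HEAD[i] + SURFACE[i] + OUTSIDE[i],
            "correct_rate": CORRECT[i] / N_LISTENERS,
            "not_in_head_rate": (SURFACE[i] + OUTSIDE[i]) / N_LISTENERS,
        })
    return rows

for row in summarize():
    print(row)
```

In every column the three distance judgments sum to 8, and only the 180° direction has any in-head judgments, which matches the discussion above.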

For the localization of lateral sound images, all listeners could clearly perceive that the sound image lies between the front and the rear of the left ear, but its exact position is uncertain, and listeners generally reported that the sound image jumps when 20lg(ls/l) changes. For the sound image distance, 50% of the listeners judged all sound images to be outside the head, and the remaining 50% judged the sound images to be outside the head or at the head surface; no listener judged a sound image to be inside the head. Because lateral sound image instability already exists in normal 5.1-channel loudspeaker reproduction, the lateral localization result can be considered roughly equivalent to that of normal 5.1-channel reproduction.

These experimental results show that, after HRTF-based virtual processing of the 5.1-channel surround sound signal, good sound image effects can be produced in headphone reproduction, without introducing a room acoustics model of the listening room into the signal processing.

The subjective comparison experiment is intended to verify the effect of the different signal processing methods. The test material is an ordinary two-channel stereo signal recorded in an anechoic chamber (orchestral music, Handel, excerpt from Water Music, length 20 s), which is processed in the following six ways:

(1) Mixed down to a single-channel (mono) signal;

(2) No processing;

(3) Artificial reverberation (reverberation time 1.8 s, initial reflection delay 15 ms, direct-to-reverberant ratio +3 dB) is added to the ordinary two-channel stereo signal; that is, only reflected sound is added, with no convolution with head-related impulse responses;

(4) The two-channel stereo signal is used as the L and R signals of the 5.1 channels, with the LS, C and RS signals all zero, and is then processed by the flow shown in Fig. 7; that is, the signal is only convolved with head-related impulse responses, and no reflected sound is added;

(5) The two-channel stereo signal is used as the L and R signals of the 5.1 channels, and ls and rs signals simulating concert-hall reflections are generated from L and R (the reflection parameters are the same as for the artificial reverberation; note that the reflected sound added here should be regarded as the original reflected sound contained in the 5.1-channel surround sound signal, not as secondary reflections of the listening room added during virtual processing). The signal is processed by the flow shown in Fig. 6 with the pass-through function selected, giving the effect of one virtual pair of surround loudspeakers. In other words, a 5.1-channel surround recording made in a concert hall is first simulated, and virtual processing is then applied;

(6) The same as (5), except that decorrelation is applied by the method shown in Figs. 5 to 7, giving the effect of 6 virtual surround loudspeakers.

The above signals are reproduced over headphones in turn, and the listeners compare which segment gives the best effect (spaciousness, envelopment). A total of 8 listeners took part in the experiment. The results are as follows:

Of the 6 segments, 6 listeners considered the 6th segment to have the best envelopment (spaciousness), and 2 listeners considered the 3rd and 6th segments to be equally best. This shows that after decorrelation and virtual processing, the envelopment and spaciousness of the 5.1-channel surround sound signal are clearly enhanced. It also shows the necessity of making full use of the reflected-sound information contained in the 5.1-channel surround sound signal.

The research behind the present invention was supported by the Ministry of Education "Outstanding Young Teacher Funding Program" and by TCL Kingbrand Electronic (Shenzhen) Co., Ltd.

Claims

1. A signal processing method for headphone reproduction of 5.1-channel surround sound, characterized in that it comprises the following steps and processing conditions:

In the fourth step, the left and right channel signals l and r are mixed into a sum signal (l+r) and a difference signal (l-r); in addition, the center channel signal c and the low-frequency effects channel signal lfe are mixed and added, then amplified by +3 dB to obtain the signal 1.414(c+lfe), which is in turn mixed into the left-right sum signal (l+r) to obtain the signal [l+r+1.414(c+lfe)];

In the fifth step, the signals [l+r+1.414(c+lfe)] and (l-r) undergo virtual processing, giving the two signals [l+r+1.414(c+lfe)] * σ₁ and (l-r) * δ₁;

2. The signal processing method for headphone reproduction of 5.1-channel surround sound according to claim 1, characterized in that, in the second step, the mixing ratios of the five signals ls, ls1, ls2, rs1 and rs2 are 1, 0.999, 0.966, 0.101 and 0.259; and the mixing ratios of the five signals rs, ls1, ls2, rs1 and rs2 are 1, 0.101, 0.259, 0.999 and 0.966.

3. The signal processing method for headphone reproduction of 5.1-channel surround sound according to claim 1, characterized in that the virtual processing of the sum signal (ls'+rs') and the difference signal (ls'-rs') in the third step consists of convolving these two signals respectively with the two functions σ₂ and δ₂ obtained from the ±120° head-related impulse responses.

4. The signal processing method for headphone reproduction of 5.1-channel surround sound according to claim 1, characterized in that the virtual processing of the signals [l+r+1.414(c+lfe)] and (l-r) in the fifth step consists of convolving these two signals respectively with the two functions σ₁ and δ₁ obtained from the ±30° head-related impulse responses.
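For illustration only, the mixing ratios of claim 2 and the fourth and fifth steps of claim 1 can be sketched as follows. The function names are hypothetical. The decorrelated signals ls1, ls2, rs1, rs2 and the filters σ₁, δ₁ are treated as given inputs, because their derivation is not stated in the claims above, and reading the two mixes of claim 2 as the signals ls' and rs' is an assumption based on claim 3.

```python
def convolve(x, h):
    """Direct-form linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def mix_surround(ls, rs, ls1, ls2, rs1, rs2):
    """Weighted sums using the mixing ratios of claim 2.

    Assumption: the two mixes are the signals ls' and rs' used in claim 3.
    All inputs are equal-length sample sequences.
    """
    w_left = (1.0, 0.999, 0.966, 0.101, 0.259)    # ls, ls1, ls2, rs1, rs2
    w_right = (1.0, 0.101, 0.259, 0.999, 0.966)   # rs, ls1, ls2, rs1, rs2
    ls_mix = [sum(w * s for w, s in zip(w_left, frame))
              for frame in zip(ls, ls1, ls2, rs1, rs2)]
    rs_mix = [sum(w * s for w, s in zip(w_right, frame))
              for frame in zip(rs, ls1, ls2, rs1, rs2)]
    return ls_mix, rs_mix

def front_sum_difference(l, r, c, lfe, sigma1, delta1):
    """Fourth and fifth steps of claim 1.

    sigma1 and delta1 are placeholder impulse responses; the claims derive
    them from the +/-30 degree head-related impulse responses.
    """
    gain = 1.414                                  # +3 dB, as written in the claim
    total = [li + ri + gain * (ci + fi)
             for li, ri, ci, fi in zip(l, r, c, lfe)]
    diff = [li - ri for li, ri in zip(l, r)]
    return convolve(total, sigma1), convolve(diff, delta1)
```

With sigma1 and delta1 both set to a unit impulse `[1.0]`, `front_sum_difference` simply returns the mixed sum and difference signals, which is a convenient way to check the mixing stage before real filters are substituted.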