CN112019994B - Method and device for constructing in-vehicle diffusion sound field environment based on virtual loudspeaker - Google Patents


Info

Publication number
CN112019994B
Authority
CN
China
Prior art date
Legal status
Active
Application number
CN202010805149.2A
Other languages
Chinese (zh)
Other versions
CN112019994A
Inventor
刘志恩
魏浩钦
刘惟伊
谢丽萍
胡杰
罗玉兰
刘浩
孙唯
张光
Current Assignee
Wuhan University of Technology WUT
Dongfeng Motor Corp
Original Assignee
Wuhan University of Technology WUT
Dongfeng Motor Corp
Priority date
Filing date
Publication date
Application filed by Wuhan University of Technology WUT, Dongfeng Motor Corp filed Critical Wuhan University of Technology WUT
Priority to CN202010805149.2A priority Critical patent/CN112019994B/en
Publication of CN112019994A publication Critical patent/CN112019994A/en
Application granted granted Critical
Publication of CN112019994B publication Critical patent/CN112019994B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S 7/30 Control circuits for electronic adaptation of the sound field
    • H04S 7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S 7/303 Tracking of listener position or orientation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S 3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S 3/006 Systems employing more than two channels, e.g. quadraphonic, in which a plurality of audio signals are transformed in a combination of audio signals and modulated signals, e.g. CD-4 systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)

Abstract

The invention belongs to the technical field of digital signal processing and discloses a method and a device for constructing an in-vehicle diffuse sound field environment based on virtual loudspeakers. The device comprises an audio decorrelation processing unit, an equalization processing unit, an HRTF filter unit and an audio gain superposition unit. The method addresses the loss of realism that occurs when passengers in the vehicle can perceive the direction of the ASD sound, achieves a surround effect of the in-vehicle acoustic field in acoustic space, and meets the design requirements for in-vehicle sound quality.

Description

Method and device for constructing in-vehicle diffusion sound field environment based on virtual loudspeaker
Technical Field
The invention relates to the technical field of digital signal processing, in particular to a method and a device for constructing an in-vehicle diffuse sound field environment based on a virtual loudspeaker.
Background
With the development of the automotive NVH (Noise, Vibration, Harshness) industry, expectations for the in-vehicle sound environment have risen, and the popularization of electric vehicles has intensified the pursuit of sound quality by OEMs and consumers alike. ASD (Active Sound Design) meets consumers' driving expectations by simulating engine sound, relieving the excessive quietness of the electric-vehicle cabin and improving in-cabin comfort.
To design a more realistic engine sound effect, it is not enough for the sound itself to approach full fidelity; the in-vehicle speaker layout must also be designed so that passengers cannot tell that the engine sound is being emitted by the cabin speakers. The prior art usually achieves this by adding extra loudspeakers, which increases the production cost of the automobile.
Disclosure of Invention
The embodiment of the application provides a method and a device for constructing an in-vehicle diffuse sound field environment based on virtual loudspeakers, solving the problem that the realism of the ASD sound is reduced when passengers in the vehicle can perceive the direction from which it comes.
The embodiment of the application provides a method for constructing an in-vehicle diffusion sound field environment based on a virtual loudspeaker, which comprises the following steps:
performing decorrelation processing on input original audio information to obtain a plurality of different first audio information;
constructing a target diffuse sound field environment, obtaining first HRTF parameter information by artificial dummy-head measurement, and performing frequency equalization processing on the first HRTF parameter information to obtain second HRTF parameter information;
building an HRTF filter according to the second HRTF parameter information, and filtering all the first audio information by using the HRTF filter to obtain a plurality of left audio sub-information and a plurality of right audio sub-information;
superposing all the left audio sub-information to obtain left channel information, and outputting the left channel information to a first real loudspeaker; and superposing all the right audio sub-information to obtain right channel information, and outputting the right channel information to a second real loudspeaker.
Preferably, before the decorrelation processing, the method further includes: performing analog-to-digital conversion processing on the original audio information through an audio processing chip;
before outputting the left channel information to the first real speaker, the method further comprises: performing digital-to-analog conversion processing on the left channel information;
before outputting the right channel information to the second real speaker, the method further comprises: and D/A conversion processing is carried out on the right channel information.
Preferably, the decorrelation processing uses a time-delay method: the original audio information is delayed by different amounts to obtain a plurality of different first audio information, the left- and right-channel audio within each first audio information being identical, expressed as:

e_0L(t) = e_0R(t) = e_0(t)

e_1L(t) = e_1R(t) = e_0(t - τ_1)

…

e_yL(t) = e_yR(t) = e_0(t - τ_y)

where e_0 denotes the original audio information, τ_y the delay corresponding to the y-th first audio information, e_yL(t) the left-channel audio of the y-th first audio information, and e_yR(t) the right-channel audio of the y-th first audio information.
Preferably, in constructing the target diffuse sound field environment, the first HRTF parameter information is obtained by artificial dummy-head measurement as follows:

a target diffuse sound field environment is constructed by arranging the first real loudspeaker and the second real loudspeaker, and the sound of each acquisition loudspeaker is recorded with an artificial dummy head to obtain the first HRTF parameter information, expressed as:

s_L(t) = e_L(t) * h_L(t)

s_R(t) = e_R(t) * h_R(t)

H_L(f) = S_L(f) / E_L(f)

H_R(f) = S_R(f) / E_R(f)

where s_L(t), s_R(t) denote the time-domain signals received at the artificial dummy head's ears; e_L(t) denotes the original left-channel audio time-domain signal, corresponding to e_0L(t), …, e_yL(t); e_R(t) denotes the original right-channel audio time-domain signal, corresponding to e_0R(t), …, e_yR(t); h_L(t), h_R(t) denote the binaural impulse responses; * denotes convolution; S_L(f), S_R(f), E_L(f), E_R(f), H_L(f), H_R(f) are all obtained by Fourier transform; and H_L(f), H_R(f) form the virtual-loudspeaker head-related transfer function, i.e. the first HRTF parameter information.
Preferably, the frequency equalization processing of the first HRTF parameter information uses the following formulas:

H'_L(θ, f) = W · H_L(θ, f) / (H_L(θ, f) + H_R(θ, f))

H'_R(θ, f) = W · H_R(θ, f) / (H_L(θ, f) + H_R(θ, f))

where H_L, H_R denote the first HRTF parameter information, H'_L(θ, f), H'_R(θ, f) denote the second HRTF parameter information, and W is an amplitude normalization constant.
Preferably, the specific implementation manner of building the HRTF filter is as follows: and calculating the IIR filter coefficient of the corresponding HRTF by using the CAPZ model, and constructing an IIR filter model in Sigma Studio by using the obtained IIR filter coefficient.
Preferably, the IIR filter coefficients are calculated using the following procedure:

for the HRTF parameters of a spatial direction θ, the filter system function Ĥ(z, θ) designed with the CAPZ model is:

Ĥ(z, θ) = B(z, θ) / A(z)

A(z) = 1 + a_1 z^(-1) + … + a_P z^(-P)

B(z, θ) = b_0(θ) + b_1(θ) z^(-1) + … + b_Q(θ) z^(-Q)

where the P coefficients of the A(z) part are independent of the sound-source direction and determine the P common poles of Ĥ(z, θ); the coefficients of B(z, θ) are related to the sound-source direction and determine the Q direction-dependent zeros of Ĥ(z, θ); 1/A(z) represents a common transfer function independent of the sound-source direction, and B(z, θ) a direction-dependent transfer function;

ĥ(θ, n) = -Σ_{p=1..P} a_p ĥ(θ, n - p) + Σ_{q=0..Q} b_q(θ) δ(n - q)

where ĥ(θ, n) is the impulse response of the filter corresponding to Ĥ(z, θ), and δ(n - q) is the unit impulse;

assume the original HRIRs for M spatial directions are known, denoted h(θ_i, n), i = 0, 1, …, M - 1, with the HRIR in each direction N points long, i.e. n = 0, 1, …, N - 1; for the i-th direction, the squared error between the impulse response of the filter and the original HRIR is:

ε_i = Σ_n [h(θ_i, n) - ĥ(θ_i, n)]²

in which the known h(θ_i, n - p) is substituted for ĥ(θ_i, n - p), giving:

e(θ_i, n) = h(θ_i, n) + Σ_{p=1..P} a_p h(θ_i, n - p) - Σ_{q=0..Q} b_q(θ_i) δ(n - q)

the sum of the squared errors over all directions is:

ε_all = Σ_{i=0..M-1} Σ_{n=0..N+P-1} e(θ_i, n)²

in which ε_all contains the P coefficients a_p independent of the sound-source direction and the M(Q + 1) coefficients b_q(θ_i) related to the sound-source direction, i.e. P + M(Q + 1) coefficients to be determined;

define a column matrix (vector) e of size [M(N + P)] × 1:

e = [e(θ_0, 0), …, e(θ_0, N + P - 1), …, e(θ_{M-1}, 0), …, e(θ_{M-1}, N + P - 1)]^T

writing all pending coefficients as a column matrix x of size [P + M(Q + 1)] × 1:

x = [a_1, a_2, …, a_P, b_0(θ_0), b_0(θ_1), …, b_0(θ_{M-1}), …, b_Q(θ_0), b_Q(θ_1), …, b_Q(θ_{M-1})]^T

e = h_1 - [A] x

in the formula, h_1 is an [M(N + P)] × 1 column matrix and [A] is an [M(N + P)] × [P + M(Q + 1)] matrix, both obtained from the known h(θ_i, n);

the sum of squared errors can be written as:

ε_all = e⁺ e = (h_1 - [A] x)⁺ (h_1 - [A] x)

in the formula, the symbol "⁺" represents the conjugate transpose of a matrix;

the P + M(Q + 1) coefficients are selected such that ε_all of the above formula is minimum:

∂ε_all / ∂x = 0

the solution that can be obtained for the CAPZ filter coefficients is:

x = {[A]⁺ [A]}^(-1) [A]⁺ h_1

in the formula, x represents the calculated IIR filter coefficients.
On the other hand, the embodiment of the present application provides an apparatus for constructing an in-vehicle diffuse sound field environment based on virtual speakers, including:
the audio decorrelation processing unit is used for performing decorrelation processing on the original audio information;
the equalization processing unit is used for equalizing the first HRTF parameter information obtained by artificial dummy-head measurement to obtain second HRTF parameter information;
the HRTF filter unit is used for building an HRTF filter according to the second HRTF parameter information and filtering all the first audio information;
the audio gain superposition unit is used for superposing all the left audio sub-information to obtain left channel information and superposing all the right audio sub-information to obtain right channel information;
the device for constructing the in-vehicle diffuse sound field environment based on the virtual loudspeaker is used for realizing the steps in the method for constructing the in-vehicle diffuse sound field environment based on the virtual loudspeaker.
Preferably, the apparatus for constructing an in-vehicle diffuse sound field environment based on virtual speakers further comprises:
the audio input unit is used for carrying out analog-to-digital conversion processing on the original audio information before decorrelation processing;
the audio output unit is used for performing digital-to-analog conversion processing on the left channel information before outputting the left channel information to a first real loudspeaker; and the processing module is used for performing digital-to-analog conversion processing on the right channel information before outputting the right channel information to a second real loudspeaker.
One or more technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:
in the embodiment of the application, decorrelation processing is performed on input original audio information to obtain a plurality of different first audio information; a target diffuse sound field environment is constructed, first HRTF parameter information is obtained by artificial dummy-head measurement, and frequency equalization is applied to the first HRTF parameter information to obtain second HRTF parameter information; an HRTF filter is built according to the second HRTF parameter information, and all the first audio information is filtered with the HRTF filter to obtain a plurality of left audio sub-information and a plurality of right audio sub-information; all the left audio sub-information is superposed to obtain left channel information, which is output to a first real loudspeaker; and all the right audio sub-information is superposed to obtain right channel information, which is output to a second real loudspeaker. In short, the invention imports the audio, decorrelates the original audio into several signals, performs frequency equalization according to the measured head-related transfer functions (HRTFs), designs the HRTF filters, virtualizes sounds arriving from different directions by passing the audio through the HRTF filters, sums the resulting multi-direction audio, and finally delivers it to the real loudspeakers. The device corresponding to the method comprises an audio decorrelation processing unit, an equalization processing unit, an HRTF filter unit and an audio gain superposition unit.
Constructing virtual loudspeakers solves the problem that, in an in-vehicle active sound system, passengers can accurately localize the loudspeaker producing the ASD sound, creating a false impression that reduces the realism of the designed ASD sound and degrades the passengers' driving experience.
Drawings
In order to illustrate the technical solution of the present embodiment more clearly, the drawings needed in the description of the embodiment are briefly introduced below. The drawings described below represent one embodiment of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for constructing an in-vehicle diffuse sound field environment based on virtual speakers according to an embodiment of the present invention;
fig. 2 is a schematic model of a method for constructing an in-vehicle diffuse sound field environment based on virtual speakers according to an embodiment of the present invention;
fig. 3 is a structural diagram of a method for constructing an in-vehicle diffuse sound field environment based on virtual speakers according to an embodiment of the present invention;
fig. 4 is a schematic diagram of decorrelation processing in a method for constructing an in-vehicle diffuse sound field environment based on virtual speakers according to an embodiment of the present invention;
fig. 5 is a structural diagram model of an IIR filter in the method for constructing an in-vehicle diffuse sound field environment based on a virtual speaker according to the embodiment of the present invention.
Detailed Description
In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.
The embodiment provides a method for constructing an in-vehicle diffuse sound field environment based on virtual speakers, as shown in fig. 1, fig. 2, and fig. 3, the method mainly includes the following steps:
step 1, performing analog-to-digital conversion processing on original audio information through an audio processing chip.
The original audio e_0 is input into a DSP chip, where analog-to-digital conversion is performed.
Specifically, the original audio e_0 is transmitted to the audio processing chip so that the chip can read it. The input audio may be single-channel or multi-channel.
One specific audio input method is: an audio input model is built in DSP processing software Sigma Studio, an audio connection line is used for connecting the DSP development board with an audio player, audio is input to a DSP chip, and data of each channel is separated so as to facilitate subsequent data processing.
And 2, performing decorrelation processing on the original audio information to obtain a plurality of different first audio information.
Specifically, the decorrelation processing uses a time-delay method: the original audio information is delayed by different amounts to obtain a plurality of different first audio information, the left- and right-channel audio within each first audio information being identical:

e_0L(t) = e_0R(t) = e_0(t)

e_1L(t) = e_1R(t) = e_0(t - τ_1)

…

e_yL(t) = e_yR(t) = e_0(t - τ_y)

where e_0 denotes the original audio information, τ_y the delay corresponding to the y-th first audio information, e_yL(t) the left-channel audio of the y-th first audio information, and e_yR(t) the right-channel audio of the y-th first audio information.

That is, the original audio e_0 is decorrelated to generate a plurality of audio signals e_0, e_1, …, e_y.

Taking the generation of 3 audio signals as an example, the original audio e_0 is decorrelated to produce e_0, e_1, e_2.
The audio decorrelation process includes: the delay time is set, and each path of audio is delayed to obtain a new audio, as shown in fig. 4.
The key technical points of this step are calculating the delay times, building an audio delay-processing model in the DSP processing software Sigma Studio, and delaying the audio to obtain several new audio signals, which removes the correlation between the signals to some extent and enhances the subjective sense of envelopment. In other words, delaying the audio by different amounts reduces the correlation between the generated signals and produces a sense of sound envelopment in subjective hearing.
The delay calculation exploits the precedence effect of hearing: when the relative delay τ between the multiple source signals exceeds a certain lower limit τ_L but does not exceed a certain upper limit τ_H, a spatial auditory effect different from summing localization into a single sound image arises. The left- and right-channel audio of each signal remain identical:

e_0L(t) = e_0R(t) = e_0(t)

e_1L(t) = e_1R(t) = e_0(t - τ_1)

e_2L(t) = e_2R(t) = e_0(t - τ_2)
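The delay-based decorrelation above can be sketched offline as follows (an illustrative NumPy approximation of the Sigma Studio delay model; the sample rate, delay values, and test tone are assumptions for illustration, not taken from the patent):

```python
import numpy as np

def decorrelate_by_delay(e0, delays_samples):
    """Generate decorrelated copies of e0 by pure time delay.

    Index 0 is the undelayed original, matching e_0L(t) = e_0R(t) = e_0(t);
    entry y is e_0(t - tau_y), zero-filled before the delay.
    """
    outs = [e0.copy()]
    for d in delays_samples:
        delayed = np.zeros_like(e0)
        delayed[d:] = e0[:len(e0) - d]  # e_y(t) = e_0(t - tau_y)
        outs.append(delayed)
    return outs

# Example: 48 kHz audio with two delays between an assumed tau_L and tau_H.
fs = 48000
t = np.arange(fs) / fs
e0 = np.sin(2 * np.pi * 120 * t)  # stand-in for the ASD engine tone
signals = decorrelate_by_delay(e0, [int(0.011 * fs), int(0.023 * fs)])
```

Each returned signal feeds both ears identically, matching the equal left/right channels in the formulas above; only the inter-signal delays differ.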
And 3, constructing a target diffuse sound field environment, obtaining first HRTF parameter information by using artificial false head measurement, and performing frequency equalization processing on the first HRTF parameter information to obtain second HRTF parameter information.
An artificial dummy head is used to record the HRTF parameter information of the several loudspeakers forming the target diffuse sound field, and the HRTF parameters are equalized to avoid frequency distortion, changes in the reproduced subjective timbre, and similar problems.
Specifically, a target diffuse sound field environment is constructed by arranging a first real loudspeaker and a second real loudspeaker (for example, the 2 front real loudspeakers in fig. 2), the HRTF parameter information of each loudspeaker (namely, the loudspeakers placed when the head-related transfer function is collected in a semi-anechoic chamber) is recorded using the artificial dummy head, and frequency equalization is applied to the HRTF parameters to avoid reproduced sound-image distortion and frequency distortion.
The HRTF parameter equalization processing includes: measurement of the virtual-loudspeaker head-related transfer function (HRTF) parameters, and HRTF parameter equalization.
Specifically, step 3 mainly comprises the following steps:
(1) An ideal diffuse sound field environment is constructed by arranging real loudspeakers, and the sound of each loudspeaker is recorded with an artificial dummy head to obtain the HRTF parameters H_L(f), H_R(f):

s_L(t) = e_L(t) * h_L(t)

s_R(t) = e_R(t) * h_R(t)

H_L(f) = S_L(f) / E_L(f)

H_R(f) = S_R(f) / E_R(f)

where s_L(t), s_R(t) are the time-domain signals received at the dummy head's ears; e_L(t), e_R(t) are the original audio time-domain signals; h_L(t), h_R(t) are the binaural impulse responses; * denotes convolution; S_L(f), S_R(f), E_L(f), E_R(f), H_L(f), H_R(f) are all obtained by Fourier transform; and H_L(f), H_R(f) form the head-related transfer function, i.e. the first HRTF parameter information. In particular, e_L(t) represents the original left-channel audio time-domain signal, corresponding to e_0L(t), e_1L(t), e_2L(t); e_R(t) represents the original right-channel audio time-domain signal, corresponding to e_0R(t), e_1R(t), e_2R(t).
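The measurement relations above (s = e * h in the time domain, H = S / E in the frequency domain) can be sketched as follows (an illustrative NumPy sketch with synthetic signals; the regularization term `eps` and all numeric values are assumptions, not from the patent):

```python
import numpy as np

def estimate_hrtf(ear_rec, excitation, n_fft=1024, eps=1e-8):
    """Estimate H(f) = S(f) / E(f) from a dummy-head ear recording and the
    known excitation, as in s_L(t) = e_L(t) * h_L(t).

    eps regularizes the division where the excitation spectrum is weak.
    """
    S = np.fft.rfft(ear_rec, n_fft)
    E = np.fft.rfft(excitation, n_fft)
    return S * np.conj(E) / (np.abs(E) ** 2 + eps)

# Sanity check on synthetic data: build the "ear" signal by circular
# convolution of the excitation with a known 64-tap impulse response,
# then recover that response (a real measurement would use a dummy head).
rng = np.random.default_rng(0)
e = rng.standard_normal(1024)
h_true = np.zeros(64)
h_true[3], h_true[20] = 1.0, -0.5
s = np.fft.irfft(np.fft.rfft(e, 1024) * np.fft.rfft(h_true, 1024), 1024)
H = estimate_hrtf(s, e)
h_est = np.fft.irfft(H, 1024)[:64]
```

The recovered impulse response matches the known one, confirming the deconvolution step; a real measurement would replace the synthetic signals with the dummy-head recordings.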
(2) The first HRTF parameter information is equalized using the sum of the left-ear and right-ear HRTFs:

H'_L(θ, f) = W · H_L(θ, f) / (H_L(θ, f) + H_R(θ, f))

H'_R(θ, f) = W · H_R(θ, f) / (H_L(θ, f) + H_R(θ, f))

where H_L, H_R denote the first HRTF parameter information, H'_L(θ, f), H'_R(θ, f) denote the second HRTF parameter information, and W is an amplitude normalization constant. That is, H'_L(θ, f), H'_R(θ, f) replace the original HRTF parameters.
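The equalization by the two-ear sum can be sketched as follows (an illustrative NumPy sketch; the guard `eps` against near-zero denominators and the example bin values are our additions, not part of the patent):

```python
import numpy as np

def equalize_hrtf(HL, HR, W=1.0, eps=1e-12):
    """Frequency-equalize an HRTF pair by the sum of the two ears:
    H'_L = W * H_L / (H_L + H_R) and H'_R = W * H_R / (H_L + H_R)."""
    denom = HL + HR
    denom = np.where(np.abs(denom) < eps, eps, denom)  # guard near-zero bins
    return W * HL / denom, W * HR / denom

# Three illustrative frequency bins (complex values are made up).
HL = np.array([0.8 + 0.2j, 1.0 + 0.0j, 0.3 - 0.4j])
HR = np.array([0.6 - 0.1j, 0.9 + 0.3j, 0.5 + 0.2j])
HpL, HpR = equalize_hrtf(HL, HR)
```

A useful property of this normalization: with W = 1 the equalized pair sums to unity at every bin, which keeps the overall frequency response flat and avoids the timbre change mentioned above.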
And 4, building an HRTF filter according to the second HRTF parameter information, and filtering all the first audio information by using the HRTF filter to obtain a plurality of left audio sub-information and a plurality of right audio sub-information.
Specifically, a Common-Acoustical-Pole and Zero (CAPZ) model is used: the IIR filter coefficients of the corresponding HRTFs are calculated from a set of common poles shared by all directions together with direction-dependent zeros, the IIR filter model is built inside the DSP with Sigma Studio, and the audio acquires spatial information after passing through the filter.
Referring to fig. 5, the specific process of calculating IIR filter coefficients mainly includes:
For the HRTF parameters of a spatial direction θ, the filter system function Ĥ(z, θ) designed with the CAPZ model is:

Ĥ(z, θ) = B(z, θ) / A(z)

A(z) = 1 + a_1 z^(-1) + … + a_P z^(-P)

B(z, θ) = b_0(θ) + b_1(θ) z^(-1) + … + b_Q(θ) z^(-Q)

where the P coefficients of the A(z) part are independent of the sound-source direction and determine the P common poles of Ĥ(z, θ); the coefficients of B(z, θ) are related to the sound-source direction and determine the Q direction-dependent zeros of Ĥ(z, θ); 1/A(z) represents a common transfer function independent of the sound-source direction, and B(z, θ) a direction-dependent transfer function.

ĥ(θ, n) = -Σ_{p=1..P} a_p ĥ(θ, n - p) + Σ_{q=0..Q} b_q(θ) δ(n - q)

where ĥ(θ, n) is the impulse response of the filter corresponding to Ĥ(z, θ), and δ(n - q) is the unit impulse.

Assume the original HRIRs for M spatial directions (M corresponding to the number of virtual loudspeakers) are known, denoted h(θ_i, n), i = 0, 1, …, M - 1, with the HRIR in each direction N points long, i.e. n = 0, 1, …, N - 1. For the i-th direction, the squared error between the impulse response of the filter and the original HRIR is:

ε_i = Σ_n [h(θ_i, n) - ĥ(θ_i, n)]²

in which the known h(θ_i, n - p) is substituted for ĥ(θ_i, n - p), giving:

e(θ_i, n) = h(θ_i, n) + Σ_{p=1..P} a_p h(θ_i, n - p) - Σ_{q=0..Q} b_q(θ_i) δ(n - q)

The sum of the squared errors over all directions is:

ε_all = Σ_{i=0..M-1} Σ_{n=0..N+P-1} e(θ_i, n)²

wherein ε_all contains the P coefficients a_p independent of the sound-source direction and the M(Q + 1) coefficients b_q(θ_i) related to the sound-source direction, i.e. P + M(Q + 1) coefficients to be determined.

Define a column matrix (vector) e of size [M(N + P)] × 1:

e = [e(θ_0, 0), …, e(θ_0, N + P - 1), …, e(θ_{M-1}, 0), …, e(θ_{M-1}, N + P - 1)]^T

Writing all undetermined coefficients as a column matrix x of size [P + M(Q + 1)] × 1, i.e.:

x = [a_1, a_2, …, a_P, b_0(θ_0), b_0(θ_1), …, b_0(θ_{M-1}), …, b_Q(θ_0), b_Q(θ_1), …, b_Q(θ_{M-1})]^T

e = h_1 - [A] x

wherein h_1 is an [M(N + P)] × 1 column matrix (vector) and [A] is an [M(N + P)] × [P + M(Q + 1)] matrix, both obtained from the known h(θ_i, n).

The sum of squared errors can then be written as:

ε_all = e⁺ e = (h_1 - [A] x)⁺ (h_1 - [A] x)

where the symbol "⁺" denotes the conjugate transpose of a matrix.

The P + M(Q + 1) coefficients are selected to minimize ε_all, namely:

∂ε_all / ∂x = 0

The solution that can be obtained for the CAPZ filter coefficients is:

x = {[A]⁺ [A]}^(-1) [A]⁺ h_1
in the formula, x represents the calculated IIR filter coefficient.
An IIR filter model is then built in the DSP's Sigma Studio software using the obtained filter coefficients.
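The least-squares coefficient calculation above can be sketched offline as follows (an illustrative NumPy sketch of the CAPZ equation-error fit; the synthetic one-pole HRIRs used for the sanity check are assumptions, and the closed-form {[A]⁺[A]}⁻¹[A]⁺h_1 step is computed with the numerically safer `lstsq`):

```python
import numpy as np

def capz_fit(hrirs, P, Q):
    """Equation-error least-squares fit of the CAPZ model.

    hrirs: (M, N) array of measured HRIRs h(theta_i, n).
    Returns (a, b): a holds the P common denominator coefficients
    (a_0 = 1 implicit); b is (M, Q+1) direction-dependent numerators,
    minimizing ||h1 - [A] x||^2 as in the derivation above.
    """
    M, N = hrirs.shape
    A = np.zeros((M * (N + P), P + M * (Q + 1)))
    h1 = np.zeros(M * (N + P))
    r = 0
    for i in range(M):
        h = hrirs[i]
        for n in range(N + P):
            h1[r] = h[n] if n < N else 0.0
            for p in range(1, P + 1):
                if 0 <= n - p < N:
                    A[r, p - 1] = -h[n - p]          # a_p columns
            if n <= Q:
                A[r, P + i * (Q + 1) + n] = 1.0      # b_n(theta_i) column
            r += 1
    x, *_ = np.linalg.lstsq(A, h1, rcond=None)       # x = {A'A}^-1 A' h1
    return x[:P], x[P:].reshape(M, Q + 1)

# Sanity check on synthetic HRIRs generated from a known common pole
# a_1 = -0.5 (values illustrative, not measured data).
def gen_hrir(b, N=32, a1=-0.5):
    h = np.zeros(N)
    for n in range(N):
        h[n] = -a1 * (h[n - 1] if n > 0 else 0.0)
        if n < len(b):
            h[n] += b[n]
    return h

hrirs = np.vstack([gen_hrir([1.0, 0.3]), gen_hrir([0.5, -0.2])])
a, b = capz_fit(hrirs, P=1, Q=1)
```

Because the synthetic data exactly follow a common-pole model, the fit recovers both the shared pole coefficient and the per-direction zeros, mirroring the P common poles and M(Q+1) direction-dependent coefficients of the derivation.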
Step 5, superposing all the left audio sub-information to obtain left channel information, and outputting the left channel information to a first real loudspeaker after performing digital-to-analog conversion processing on the left channel information; and superposing all the right audio sub-information to obtain right channel information, and outputting the right channel information to a second real loudspeaker after performing digital-to-analog conversion processing on the right channel information.
Namely, the filtered audio is superposed and then is transmitted to a corresponding loudspeaker through digital-to-analog conversion.
Specifically, the filtered left-channel signals and the filtered right-channel signals are each summed, forming one left-channel and one right-channel signal, which are sent to the corresponding real loudspeakers after digital-to-analog conversion.
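The filter-and-superpose step can be sketched offline as follows (an illustrative NumPy sketch using direct FIR convolution in place of the patent's DSP-side IIR filters; the signals and single-tap "HRIRs" are made-up values for the check):

```python
import numpy as np

def render_virtual_speakers(signals, hrir_pairs):
    """Filter each decorrelated signal with its virtual speaker's HRIR
    pair and superpose the results into one left and one right channel.

    signals:    list of 1-D arrays (the first audio information).
    hrir_pairs: list of (h_left, h_right) arrays, one pair per signal.
    """
    n = max(len(s) + max(len(hl), len(hr)) - 1
            for s, (hl, hr) in zip(signals, hrir_pairs))
    left = np.zeros(n)
    right = np.zeros(n)
    for s, (hl, hr) in zip(signals, hrir_pairs):
        yl = np.convolve(s, hl)   # left audio sub-information
        yr = np.convolve(s, hr)   # right audio sub-information
        left[:len(yl)] += yl      # superpose into the left channel
        right[:len(yr)] += yr     # superpose into the right channel
    return left, right

# Tiny illustrative example: two signals, two virtual speaker positions.
signals = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
hrir_pairs = [(np.array([1.0]), np.array([0.5])),
              (np.array([0.2]), np.array([1.0]))]
left, right = render_virtual_speakers(signals, hrir_pairs)
```

In the actual device the two returned channels would then pass through digital-to-analog conversion before reaching the two real loudspeakers.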
As shown in fig. 2, the invention constructs 5 virtual speakers whose sound is produced through 2 actual speakers, thereby achieving the sound-field diffusion effect of 5 speakers. In this way the playback effect of a plurality of virtual loudspeakers is realized with 2 actual loudspeakers, a subjective sense of envelopment is created, and the passengers' perception of the in-vehicle loudspeakers is confused, thereby improving the realism of the ASD sound.
Furthermore, it is also possible to simulate the 5 front virtual speakers with the 2 front actual speakers and the 5 rear virtual speakers with the 2 rear actual speakers.
It should be noted that the basic solution of the invention constructs the diffuse sound field with only the 2 front actual speakers, because most ASD-equipped vehicles on the market today produce sound through the 2 front in-vehicle speakers. In a specific application, whether the 5 rear virtual speakers need to be constructed can be decided according to the in-vehicle effect actually achieved. If rear virtual loudspeakers are required, the rear in-vehicle loudspeakers are designed to generate a corresponding diffuse sound field by the same method (i.e. repeating steps 3 to 5).
Corresponding to the method, the embodiment further provides an apparatus for constructing an in-vehicle diffuse sound field environment based on the virtual speaker, including:
the audio input unit is used for carrying out analog-to-digital conversion processing on the original audio information before decorrelation processing; i.e. audio input to the DSP chip.
The audio decorrelation processing unit is used for performing decorrelation processing on the original audio information; namely, a plurality of audios with low correlation are generated in the DSP chip, and a sound effect with good subjective auditory sense surrounding sense is generated.
The equalization processing unit is used for equalizing the first HRTF parameter information obtained by artificial dummy-head measurement to obtain the second HRTF parameter information; that is, the measured HRTF parameters are equalized to avoid frequency distortion and timbre change.
The HRTF filter unit is used for building an HRTF filter according to the second HRTF parameter information and filtering all the first audio information; that is, the filter coefficients are calculated from the equalized HRTF parameters with the CAPZ model and an IIR filter is designed in the DSP chip, avoiding the excessive computational load of direct convolution.
The audio gain superposition unit is used for superposing all the left audio sub-information to obtain left channel information and superposing all the right audio sub-information to obtain right channel information; i.e. the filtered sounds are summed in the DSP chip.
The audio output unit is used for performing digital-to-analog conversion on the left channel information before it is output to the first real loudspeaker, and on the right channel information before it is output to the second real loudspeaker; that is, it finally outputs to the real (actual) speakers.
The device for constructing the in-vehicle diffuse sound field environment based on the virtual loudspeaker is used for realizing the steps in the method for constructing the in-vehicle diffuse sound field environment based on the virtual loudspeaker.
In addition, the actual speakers are the loudspeakers used directly in the vehicle; the processing model is built on an ADAU1467 development board, whose outputs are connected to the corresponding loudspeakers through audio cables.
In order to verify the sound field effect, the invention also provides a method for verifying the effectiveness and correctness of the in-vehicle diffuse sound field environment constructed based on the virtual loudspeakers. First, the sound pressure curve of the constructed virtual loudspeaker at the position of the human ear is compared with the sound pressure curve measured at the ear when the corresponding real loudspeaker is actually used, and the construction effect of the virtual loudspeaker is checked according to the comparison result. Then the interaural cross-correlation coefficient (IACC), computed from the sound pressures actually measured at the ears, is used as an index of auditory spatial impression. The IACC is defined as the maximum of the normalized cross-correlation function of the binaural time-domain sound pressures pL(t) and pR(t):

IACC = max |ΦLR(τ)|, over |τ| ≤ 1 ms

ΦLR(τ) = ∫ pL(t)·pR(t + τ) dt / √(ΦLL(0)·ΦRR(0))

where pL(t) denotes the measured sound pressure at the left ear, pR(t) denotes the measured sound pressure at the right ear, ΦLR(τ) denotes the cross-correlation function of the left- and right-ear signals, ΦLL(0) denotes the left-ear autocorrelation function at zero lag, ΦRR(0) denotes the right-ear autocorrelation function at zero lag, and 0 ≤ IACC ≤ 1. The closer the IACC is to 1, the clearer the auditory image and the more precise its localization; the lower the IACC, the more blurred the sound image, until it cannot be localized at all.
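The IACC index described above can be computed from measured binaural recordings roughly as follows. The function name, the sample-rate parameter, and the use of `numpy.correlate` are illustrative assumptions; the lag range is limited to |τ| ≤ 1 ms, as is conventional for this metric.

```python
# Sketch of the IACC metric: the maximum of the normalized interaural
# cross-correlation over lags |tau| <= 1 ms.
import numpy as np

def iacc(p_left, p_right, fs, max_lag_ms=1.0):
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    # Normalization: sqrt of the zero-lag autocorrelations (signal energies).
    norm = np.sqrt(np.sum(p_left**2) * np.sum(p_right**2))
    if norm == 0.0:
        return 0.0
    c = np.correlate(p_left, p_right, mode="full")  # lags -(N-1)..(N-1)
    center = len(p_right) - 1                       # index of zero lag
    lo = max(center - max_lag, 0)
    hi = min(center + max_lag, len(c) - 1)
    return float(np.max(np.abs(c[lo : hi + 1])) / norm)
```

Identical left and right signals give IACC = 1 (a sharply localized image); decorrelated signals push the value toward 0 (a diffuse, non-localizable image), which is the target of the invention.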
The method and the device for constructing the in-vehicle diffused sound field environment based on the virtual loudspeaker provided by the embodiment of the invention at least have the following technical effects:
(1) Constructing virtual loudspeakers solves the problem that, in an in-vehicle active sound design (ASD) system, a passenger can accurately localize the loudspeaker emitting the sound, which produces a false sound impression, reduces the realism of the designed ASD sound, and degrades the driving experience.
(2) The invention decorrelates the audio signals according to the precedence effect (Haas effect) in spatial hearing, weakening the correlation between signals so that the presence of one sound source masks the spatial position information of another — limited to the spatial position information rather than the spatial auditory information — which both blurs sound-source localization and achieves a sense of envelopment.
(3) The invention equalizes the frequency response (timbre) of the measured HRTF parameters and thereby balances the energy across different frequencies, avoiding sound-image distortion of the reproduced sound and timbre changes caused by frequency distortion.
(4) The virtual loudspeakers are constructed using the head-related transfer function from sound source to ear. Convolving the actually measured binaural impulse responses with the audio would also add the spatial effect, but each audio signal would require two convolutions, so five virtual loudspeakers would require ten, and the computational load would be excessive; the IIR filter coefficients of the HRTF are therefore calculated with the CAPZ model instead.
(5) The method of constructing the in-vehicle diffuse sound field environment with virtual loudspeakers solves the problem that in-vehicle loudspeakers occupy a large space and reduces the cost of arranging loudspeakers; meanwhile, the diffuse sound field confuses the passengers' judgment of the ASD sound-source position, improving the realism of the ASD.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention has been described in detail with reference to examples, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, which should be covered by the claims of the present invention.

Claims (9)

1. A method for constructing an in-vehicle diffuse sound field environment based on virtual speakers is characterized by comprising the following steps:
performing decorrelation processing on input original audio information to obtain a plurality of different first audio information;
constructing a target diffuse sound field environment, obtaining first HRTF parameter information by dummy-head measurement, and performing frequency equalization processing on the first HRTF parameter information to obtain second HRTF parameter information;
building an HRTF filter according to the second HRTF parameter information, and filtering all the first audio information by using the HRTF filter to obtain a plurality of left audio sub-information and a plurality of right audio sub-information;
superposing all the left audio sub-information to obtain left channel information, and outputting the left channel information to a first real loudspeaker; superposing all the right audio sub-information to obtain right channel information, and outputting the right channel information to a second real loudspeaker;
the sound emitted by the first real speaker and the second real speaker is used for an active sound design ASD in the vehicle.
2. The method for constructing an in-vehicle diffuse sound field environment based on virtual speakers according to claim 1, wherein before the decorrelation process, further comprising: performing analog-to-digital conversion processing on the original audio information through an audio processing chip;
before outputting the left channel information to the first real speaker, the method further comprises: performing digital-to-analog conversion processing on the left channel information;
before outputting the right channel information to the second real speaker, the method further comprises: and D/A conversion processing is carried out on the right channel information.
3. The method for constructing the in-vehicle diffuse sound field environment based on the virtual speaker according to claim 1, wherein the decorrelation processing is performed by a time-delay method, and the original audio information is respectively subjected to time-delay processing according to different time-delay times to obtain a plurality of different first audio information, wherein the audio of the left channel and the audio of the right channel in each first audio information are the same and are represented as follows:
e0L(t)=e0R(t)=e0(t)
e1L(t)=e1R(t)=e0(t-τ1)
eyL(t)=eyR(t)=e0(t-τy)
in the formula, e0(t) represents the original audio information, τy represents the delay interval corresponding to the y-th first audio information, eyL(t) denotes the audio of the left channel in the y-th first audio information, and eyR(t) denotes the audio of the right channel in the y-th first audio information.
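As an illustrative sketch (not part of the claimed subject matter), the delay-based decorrelation of this claim — one delayed copy of the original signal per virtual speaker, each copy feeding both channels — can be written as:

```python
# Sketch of delay-based decorrelation: each "first audio information" is a
# copy of the original e0 delayed by a distinct tau_y. The delay values used
# below are illustrative; the claim does not fix them.
import numpy as np

def decorrelate_by_delay(e0, delays_samples):
    """Return one delayed copy of e0 per entry in delays_samples.
    Each copy feeds both the left and right channel (e_yL == e_yR)."""
    copies = []
    for d in delays_samples:
        copies.append(np.concatenate([np.zeros(d), e0])[: len(e0)])
    return copies
```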
4. The method for constructing the in-vehicle diffuse sound field environment based on the virtual speaker as claimed in claim 3, wherein constructing the target diffuse sound field environment and obtaining the first HRTF parameter information by dummy-head measurement is implemented as follows:
constructing the target diffuse sound field environment by arranging the first real loudspeaker and the second real loudspeaker, and recording the sound of each loudspeaker with a dummy head to obtain the first HRTF parameter information, expressed as:
sL(t) = eL(t) * hL(t)

sR(t) = eR(t) * hR(t)

HL(f) = SL(f) / EL(f)

HR(f) = SR(f) / ER(f)

in the formula, * denotes convolution; sL(t), sR(t) represent the time-domain signals received by the dummy head; eL(t) represents the original left-channel audio time-domain signal, corresponding to e0L(t), …, eyL(t); eR(t) represents the original right-channel audio time-domain signal, corresponding to e0R(t), …, eyR(t); hL(t), hR(t) denote the binaural impulse responses; SL(f), SR(f), EL(f), ER(f), HL(f), HR(f) are all obtained by Fourier transform; HL(f), HR(f) are the head-related transfer functions of the virtual loudspeaker, i.e., the first HRTF parameter information.
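The frequency-domain relation above, H(f) = S(f)/E(f), can be sketched as follows. This is an illustrative sketch: the function name is assumed, and the regularization constant `eps` is an added assumption to avoid division by zero, not part of the claim.

```python
# Sketch of obtaining the first HRTF parameter information: divide the
# spectrum of the dummy-head recording by the spectrum of the driving signal.
import numpy as np

def measure_hrtf(s_ear, e_drive, eps=1e-12):
    S = np.fft.rfft(s_ear)    # spectrum of signal received at the ear
    E = np.fft.rfft(e_drive)  # spectrum of original driving signal
    return S / (E + eps)      # virtual-speaker head-related transfer function
```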
5. The method for constructing an in-vehicle diffuse sound field environment based on virtual speakers according to claim 4, wherein the frequency equalization processing on the first HRTF parameter information adopts the following formula:
[Equations defining H′L(θ, f) and H′R(θ, f) — the equation images are not reproduced in the source text]

in the formula, HL, HR respectively represent the first HRTF parameter information, H′L(θ, f), H′R(θ, f) respectively represent the second HRTF parameter information, and W is an amplitude normalization constant.
6. The method for constructing the in-vehicle diffuse sound field environment based on the virtual speaker as claimed in claim 1, wherein the specific implementation manner of constructing the HRTF filter is as follows: and calculating the IIR filter coefficient of the corresponding HRTF by using the CAPZ model, and constructing an IIR filter model in Sigma Studio by using the obtained IIR filter coefficient.
7. The method for constructing the in-vehicle diffuse sound field environment based on the virtual speaker as claimed in claim 6, wherein the IIR filter coefficient is calculated by using the following formula:
for the HRTF parameters of a spatial direction θ, the filter system function H(z, θ) designed with the CAPZ model is:

H(z, θ) = B(z, θ) / A(z)

A(z) = 1 + a1·z^(−1) + … + aP·z^(−P)

B(z, θ) = b0(θ) + b1(θ)·z^(−1) + … + bQ(θ)·z^(−Q)

wherein the P coefficients of the A(z) part are independent of the sound-source direction and determine the P common poles of H(z, θ); the Q + 1 coefficients of B(z, θ) are related to the sound-source direction and determine the Q direction-dependent zeros of H(z, θ); 1/A(z) represents the common transfer function independent of the sound-source direction; B(z, θ) represents the direction-dependent transfer function;
the impulse response of the filter corresponding to H(z, θ) satisfies:

ĥ(θ, n) = −Σ_{p=1}^{P} ap·ĥ(θ, n − p) + Σ_{q=0}^{Q} bq(θ)·δ(n − q)

in the formula, ĥ(θ, n) is the impulse response of the filter corresponding to H(z, θ), which approximates the measured HRIR; δ(n − q) is the unit impulse delayed by q samples;
let it be assumed that the original HRIRs for M spatial directions are known, denoted h(θi, n), i = 0, 1, …, M − 1, the HRIR in each direction having a length of N points, i.e., n = 0, 1, …, N − 1; for the i-th direction, the squared error between the impulse response of the filter and the original HRIR is:

εi = Σ_{n} [ ĥ(θi, n) + Σ_{p=1}^{P} ap·ĥ(θi, n − p) − Σ_{q=0}^{Q} bq(θi)·δ(n − q) ]²

in the formula, the known h(θi, n − p) is used in place of the unknown ĥ(θi, n − p), obtaining:

εi = Σ_{n=0}^{N+P−1} [ h(θi, n) + Σ_{p=1}^{P} ap·h(θi, n − p) − Σ_{q=0}^{Q} bq(θi)·δ(n − q) ]²

the sum of the squared errors over all directions is:

εall = Σ_{i=0}^{M−1} εi

in the formula, εall contains the P coefficients ap independent of the sound-source direction and the M(Q + 1) coefficients bq(θi) related to the sound-source direction, i.e., P + M(Q + 1) coefficients to be determined in total;
define an [M(N + P)] × 1 column matrix (vector) e by stacking the errors e(θi, n) of all M directions, and write all the coefficients to be determined as a [P + M(Q + 1)] × 1 column matrix x:

x = [a1, a2, …, aP, b0(θ0), b0(θ1), …, b0(θM−1), …, bQ(θ0), bQ(θ1), …, bQ(θM−1)]^T

e = h1 − [A]x

in the formula, h1 is an [M(N + P)] × 1 column matrix and [A] is an [M(N + P)] × [P + M(Q + 1)] matrix, both obtained from the known h(θi, n);

the sum of squared errors can be written as:

εall = e⁺e = (h1 − [A]x)⁺(h1 − [A]x)

in the formula, the symbol "⁺" represents the conjugate transpose of a matrix;

the P + M(Q + 1) coefficients are selected so that εall of the above formula is minimum, i.e.:

∂εall/∂x = 0

the resulting solution for the CAPZ filter coefficients is:

x = ([A]⁺[A])⁻¹[A]⁺h1

in the formula, x represents the calculated IIR filter coefficients.
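As an illustrative sketch (not part of the claimed subject matter), the least-squares solution x = ([A]⁺[A])⁻¹[A]⁺h1 can be computed with a numerically safer `lstsq` call. The matrix A and vector h1 passed in are assumed to have been assembled from the measured HRIRs h(θi, n) as described above; the toy matrices in the usage are stand-ins.

```python
# Least-squares solve for the CAPZ coefficient vector x (a_p and b_q(theta_i)).
# lstsq is equivalent to the normal-equations solution but better conditioned.
import numpy as np

def solve_capz(A, h1):
    """Least-squares solution of A x ~= h1."""
    x, *_ = np.linalg.lstsq(A, h1, rcond=None)
    return x
```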
8. An apparatus for constructing an in-vehicle diffuse sound field environment based on virtual speakers, comprising:
the audio decorrelation processing unit is used for performing decorrelation processing on the original audio information;
the equalization processing unit is used for performing equalization processing on the first HRTF parameter information obtained by dummy-head measurement to obtain second HRTF parameter information;
the HRTF filter unit is used for building an HRTF filter according to the second HRTF parameter information and filtering all the first audio information;
the audio gain superposition unit is used for superposing all the left audio sub-information to obtain left channel information and superposing all the right audio sub-information to obtain right channel information;
the device for constructing the in-vehicle diffusion sound field environment based on the virtual loudspeaker is used for realizing the steps in the method for constructing the in-vehicle diffusion sound field environment based on the virtual loudspeaker in any one of claims 1 to 7.
9. The apparatus for constructing an in-vehicle diffuse sound field environment based on virtual speakers according to claim 8, further comprising:
the audio input unit is used for carrying out analog-to-digital conversion processing on the original audio information before decorrelation processing;
the audio output unit is used for performing digital-to-analog conversion processing on the left channel information before outputting the left channel information to a first real loudspeaker; and the processing module is used for performing digital-to-analog conversion processing on the right channel information before outputting the right channel information to a second real loudspeaker.
CN202010805149.2A 2020-08-12 2020-08-12 Method and device for constructing in-vehicle diffusion sound field environment based on virtual loudspeaker Active CN112019994B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010805149.2A CN112019994B (en) 2020-08-12 2020-08-12 Method and device for constructing in-vehicle diffusion sound field environment based on virtual loudspeaker

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010805149.2A CN112019994B (en) 2020-08-12 2020-08-12 Method and device for constructing in-vehicle diffusion sound field environment based on virtual loudspeaker

Publications (2)

Publication Number Publication Date
CN112019994A CN112019994A (en) 2020-12-01
CN112019994B true CN112019994B (en) 2022-02-08

Family

ID=73504229

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010805149.2A Active CN112019994B (en) 2020-08-12 2020-08-12 Method and device for constructing in-vehicle diffusion sound field environment based on virtual loudspeaker

Country Status (1)

Country Link
CN (1) CN112019994B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113971013A (en) * 2021-10-25 2022-01-25 北京字节跳动网络技术有限公司 Sound effect playing method and equipment of music
CN114630240B (en) * 2022-03-16 2024-01-16 北京小米移动软件有限公司 Direction filter generation method, audio processing method, device and storage medium
CN116367076A (en) * 2023-03-30 2023-06-30 潍坊歌尔丹拿电子科技有限公司 In-vehicle audio processing method, in-vehicle audio processing device and storage medium
CN116744216B (en) * 2023-08-16 2023-11-03 苏州灵境影音技术有限公司 Automobile space virtual surround sound audio system based on binaural effect and design method
CN117676418B (en) * 2023-12-06 2024-05-24 广州番禺职业技术学院 Sound field equalization method and system for mixed phase system
CN117476026A (en) * 2023-12-26 2024-01-30 芯瞳半导体技术(山东)有限公司 Method, system, device and storage medium for mixing multipath audio data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102395098A (en) * 2005-09-13 2012-03-28 皇家飞利浦电子股份有限公司 Method of and device for generating 3d sound

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5341919B2 (en) * 2008-02-14 2013-11-13 ドルビー ラボラトリーズ ライセンシング コーポレイション Stereo sound widening
CA2732079C (en) * 2008-07-31 2016-09-27 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Signal generation for binaural signals
WO2012016722A2 (en) * 2010-08-04 2012-02-09 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus for generating a drive dependent sound and engine driven vehicle
KR101805110B1 (en) * 2013-12-13 2017-12-05 앰비디오 인코포레이티드 Apparatus and method for sound stage enhancement
DE102014214052A1 (en) * 2014-07-18 2016-01-21 Bayerische Motoren Werke Aktiengesellschaft Virtual masking methods
BR112017017332B1 (en) * 2015-02-18 2022-11-16 Huawei Technologies Co., Ltd AUDIO SIGNAL PROCESSING APPARATUS AND METHOD FOR FILTERING AN AUDIO SIGNAL
CN109155895B (en) * 2016-04-20 2021-03-16 珍尼雷克公司 Active listening headset and method for regularizing inversion thereof
GB201609089D0 (en) * 2016-05-24 2016-07-06 Smyth Stephen M F Improving the sound quality of virtualisation
CN206171449U (en) * 2016-10-14 2017-05-17 武汉理工大学 Vehicle velocity early warning system based on vehicle status road surface condition
WO2018086701A1 (en) * 2016-11-11 2018-05-17 Huawei Technologies Co., Ltd. Apparatus and method for weighting stereo audio signals
US20180190306A1 (en) * 2017-01-04 2018-07-05 2236008 Ontario Inc. Voice interface and vocal entertainment system
US10979844B2 (en) * 2017-03-08 2021-04-13 Dts, Inc. Distributed audio virtualization systems
KR20200075144A (en) * 2018-12-13 2020-06-26 현대자동차주식회사 A control system for making car sound index based engine sound in use with deep-learing and the method of it
JP7470695B2 (en) * 2019-01-08 2024-04-18 テレフオンアクチーボラゲット エルエム エリクソン(パブル) Efficient spatially heterogeneous audio elements for virtual reality

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102395098A (en) * 2005-09-13 2012-03-28 皇家飞利浦电子股份有限公司 Method of and device for generating 3d sound

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Correlation between channel signals and auditory spatial impression in surround sound reproduction; Shi Bei et al.; Acta Acustica (Chinese Edition); 2009-07-15 (No. 04); full text *

Also Published As

Publication number Publication date
CN112019994A (en) 2020-12-01

Similar Documents

Publication Publication Date Title
CN112019994B (en) Method and device for constructing in-vehicle diffusion sound field environment based on virtual loudspeaker
JP4364326B2 (en) 3D sound reproducing apparatus and method for a plurality of listeners
JP3913775B2 (en) Recording and playback system
JP4588945B2 (en) Method and signal processing apparatus for converting left and right channel input signals in two-channel stereo format into left and right channel output signals
JP4584416B2 (en) Multi-channel audio playback apparatus for speaker playback using virtual sound image capable of position adjustment and method thereof
JP4343845B2 (en) Audio data processing method and sound collector for realizing the method
JP4780119B2 (en) Head-related transfer function measurement method, head-related transfer function convolution method, and head-related transfer function convolution device
WO2009046223A2 (en) Spatial audio analysis and synthesis for binaural reproduction and format conversion
KR100647338B1 (en) Method of and apparatus for enlarging listening sweet spot
WO2004103023A1 (en) Method for preparing transfer function table for localizing virtual sound image, recording medium on which the table is recorded, and acoustic signal editing method using the medium
JP2009508158A (en) Method and apparatus for generating and processing parameters representing head related transfer functions
EP2243136B1 (en) Mediaplayer with 3D audio rendering based on individualised HRTF measured in real time using earpiece microphones.
CN113170271A (en) Method and apparatus for processing stereo signals
Jot et al. Binaural simulation of complex acoustic scenes for interactive audio
JP2008502200A (en) Wide stereo playback method and apparatus
CN115226022A (en) Content-based spatial remixing
US10321252B2 (en) Transaural synthesis method for sound spatialization
CN107743713A (en) Handle for the stereophonic signal reproduced in the car to realize the apparatus and method of single three dimensional sound by front loudspeakers
CN103546838A (en) Method for establishing an optimized loudspeaker sound field
Ifergan et al. On the selection of the number of beamformers in beamforming-based binaural reproduction
Krebber et al. Auditory virtual environments: basics and applications for interactive simulations
KR100849030B1 (en) 3D sound Reproduction Apparatus using Virtual Speaker Technique under Plural Channel Speaker Environments
JPH09191500A (en) Method for generating transfer function localizing virtual sound image, recording medium recording transfer function table and acoustic signal edit method using it
JP2000333297A (en) Stereophonic sound generator, method for generating stereophonic sound, and medium storing stereophonic sound
CN109923877A (en) The device and method that stereo audio signal is weighted

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant