CN108632709B

CN108632709B - Immersive broadband 3D sound field playback method

Info

Publication number: CN108632709B
Application number: CN201810352481.0A
Authority: CN
Inventors: 贾懋珅; 张家铭; 鲍长春; 吴宇轩
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2018-04-19
Filing date: 2018-04-19
Publication date: 2021-04-27
Anticipated expiration: 2038-04-19
Also published as: CN108632709A

Abstract

The invention discloses an immersive broadband 3D sound field playback method. First, the acoustic transfer functions from a virtual sound source of scene A, placed at a specified spatial position, to the listening points are calculated, and the function values are taken as the sound pressure values of the sound field radiated by the virtual sound source. Second, the loudspeaker array on one wall of scene B is arranged in a regular, equally spaced rectangular layout, and the acoustic propagation from every loudspeaker to the listening points is modeled with a Green's function based on the wave nature of sound. Third, based on convex optimization theory, the l₁ norm is used as the sparsity-inducing regularizer, the regularized problem is solved with the alternating direction method of multipliers, and the loudspeaker weights are calculated at the center frequencies of eight one-octave bands in order to select the activated loudspeakers. Finally, l₂-norm regularization is used to calculate the weight signals of the activated loudspeakers in the playback system, so that the sound field radiated by the activated loudspeakers is closest, under the least-mean-square criterion, to the sound field radiated by the sound source to be played back.

Description

Immersive broadband 3D sound field playback method

Technical Field

The invention belongs to the technical field of acoustics, and particularly relates to an immersive broadband 3D sound field playback method.

Background

Spatial sound reproduction is the core of three-dimensional audio technology: it aims to reproduce a desired sound scene as accurately as possible, giving the listener a sense of presence and immersion. It is mainly implemented by using a loudspeaker array to reconstruct, in a given space, an acoustic environment consistent with the desired sound field. In existing sound field playback systems the loudspeakers are usually arranged in a ring or on a sphere, which in practice requires large equipment such as a spherical frame and makes the system difficult to set up and inconvenient to use. In recent years, with the development of virtual reality and augmented reality, users' demand for immersive experiences has grown, which creates a major opportunity for immersive spatial sound field playback. The primary objective of immersive playback is to give participants the sense of sharing the same ambience: a sound source that is geographically separated from the user is played back through the loudspeaker array as if the user were in the environment of that source. For individual users, a complex ring or spherical loudspeaker layout is impractical. To address this problem, the invention designs a 3D sound field playback system for immersive broadband speech scenes using a loudspeaker layout based on a planar array.

In immersive interaction, 3D video gives the user the position of sound sources visually, and sound field playback supplements this visual information and deepens the user's sense of presence from the auditory side. Immersion depends on synchronized audio and video, which requires a playback algorithm with high real-time performance and low delay. At present, most of the more accurate sound field reproduction algorithms take the sound pressure within the listening region of a scene as the target sound field information and use a loudspeaker array to recover it. The reconstruction is usually computed with a spherical harmonic expansion in cylindrical or spherical coordinates, with high-order Hankel functions, Bessel functions and associated Legendre functions as auxiliary functions. The computational complexity is therefore high, and application in real scenes is difficult. This design instead uses sound pressure matching to recover the target sound field, and introduces l₁ and l₂ regularization to control the degree of fitting to the original sound pressure information, which improves the robustness of the algorithm. When solving the regularization problem, the alternating direction method of multipliers is introduced to reduce the computational complexity, speed up the algorithm and meet the real-time requirement.

Disclosure of Invention

To address the complicated loudspeaker layout and high algorithmic complexity of existing multi-channel audio playback systems, the invention provides an immersive broadband 3D sound field playback method. The method is suited to playing back the 3D sound field of broadband speech signals in immersive applications, and reconstructs the target sound field by sound pressure matching with a group of loudspeakers arranged on a wall.

To solve the problem that a target sound source is difficult to perceive and play back accurately in a real scene, the method comprises the following steps:

Step 1: Establish a spatial rectangular coordinate system with its origin at the center of the wall on which the loudspeakers are mounted, with the x axis parallel to the floor, the y axis perpendicular to the floor, and the z axis perpendicular to the wall. The coordinate position of the virtual sound source to be played back is designated s, the maximum number of loudspeakers used is set to N, and M positions in the listening area are selected as listening points.

Step 2: Construct the sound field radiated by the virtual sound source. The acoustic propagation from the omnidirectional virtual sound source to the listening points is modeled with an acoustic transfer function, and the function values are taken as the sound pressure values of the sound field radiated by the source in the specified area. Because the target application is broadband speech, the upper frequency limit of the calculation is 16 kHz and the lower limit is 100 Hz. The range from 100 Hz to 16 kHz is divided into 8 octave bands, the center frequency of each band is taken, and 8 single-frequency virtual sound sources are used in the calculation.

Step 3: Construct the sound field radiated by the loudspeakers. The positions of the N loudspeakers are set, and the sound pressure values that the loudspeakers generate at all listening point positions are calculated as the set of sound pressure values of the sound field radiated by the loudspeakers.

Step 4: Solve for the loudspeaker weight vector for each virtual sound source. Using l₁-norm regularization to constrain the sound field radiated by the virtual sound source and that radiated by the loudspeakers, the problem of solving for the loudspeaker weights (also called loudspeaker excitation signals or loudspeaker drive signals, that is, the signals played by the loudspeakers) is converted into a least absolute shrinkage and selection operator (LASSO) problem, which is solved with the alternating direction method of multipliers (ADMM). This yields one loudspeaker weight vector for playing back each virtual single-frequency sound source (8 in total).

Step 5: Determine the number of activated loudspeakers. The loudspeaker weight vectors of the 8 virtual single-frequency sound sources are aggregated. Loudspeakers with non-zero entries are retained as the selected loudspeakers of the playback system (called activated loudspeakers); those with zero entries are removed as unselected (non-activated) loudspeakers.

Step 6: Solve for the weight signals of the activated loudspeakers for the sound source to be played back. For the sound source to be played back and all activated loudspeakers, the sound field radiated by the source and that radiated by the activated loudspeakers are calculated again as in steps 2 and 3. Using l₂-norm regularization, the weight of each loudspeaker is calculated under the least-mean-square criterion and used as that loudspeaker's drive signal during reproduction.
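The coordinate convention of step 1 can be sketched as follows. This is a minimal illustration, assuming a rectangular, equally spaced loudspeaker grid on the wall plane z = 0; the function name `wall_grid`, the grid size and the 0.2 m spacing are hypothetical values chosen for the example, not values given in this document.

```python
def wall_grid(n_cols, n_rows, spacing):
    """Return (x, y, z) positions of an equally spaced rectangular grid.

    The grid lies on the wall plane z = 0 and is centred on the origin,
    with x parallel to the floor and y perpendicular to the floor.
    """
    positions = []
    for row in range(n_rows):
        for col in range(n_cols):
            x = (col - (n_cols - 1) / 2) * spacing
            y = (row - (n_rows - 1) / 2) * spacing
            positions.append((x, y, 0.0))
    return positions


# Hypothetical example: a 4 x 3 array with 0.2 m spacing (N = 12 loudspeakers).
speakers = wall_grid(n_cols=4, n_rows=3, spacing=0.2)
N = len(speakers)
```

Listening point coordinates y_m could be generated the same way on any plane or volume in front of the wall.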

Preferably, step 2 is implemented as follows. A Green's function characterizes the acoustic transfer function from the virtual single-frequency sound source of frequency f_b to the m-th listening point. In it, k_b = 2πf_b/c is the wave number; c is the speed of sound in air, about 340 m/s; b is the frequency index; s is the position (three-dimensional coordinates) of the virtual single-frequency sound source; and y_m is the position of the m-th listening point, m = 1, 2, …, M. To cover a broadband speech sound field, the center frequencies of 8 one-octave bands spanning 100 Hz to 16 kHz are selected for joint optimization (the ratio of the upper to the lower limit frequency of each band is 2, and the starting frequency is 88 Hz). The 8 bands are 88 Hz-177 Hz, 177 Hz-355 Hz, 355 Hz-710 Hz, 710 Hz-1420 Hz, 1420 Hz-2840 Hz, 2840 Hz-5680 Hz, 5680 Hz-11360 Hz and 11360 Hz-22720 Hz, with center frequencies 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz and 16 kHz, so f₁ = 125 Hz, …, f₈ = 16 kHz, that is, b = 1, 2, …, 8.
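The band layout and the transfer function described above can be checked numerically with a short sketch. It assumes the standard free-field Green's function e^{ikr}/(4πr); the sign of the exponent is a convention that the text does not state, and the source and listening positions below are hypothetical.

```python
import cmath
import math

C = 340.0  # speed of sound in air, m/s (value stated in the text)
CENTER_FREQS_HZ = [125, 250, 500, 1000, 2000, 4000, 8000, 16000]


def octave_band_edges(fc):
    """Lower and upper edge of the one-octave band centred on fc."""
    return fc / math.sqrt(2), fc * math.sqrt(2)


def wave_number(f):
    """Wave number k = 2*pi*f / c."""
    return 2 * math.pi * f / C


def green(y, s, k):
    """Free-field Green's function from source s to point y.

    The e^{+ikr} sign convention is an assumption of this sketch.
    """
    r = math.dist(y, s)
    return cmath.exp(1j * k * r) / (4 * math.pi * r)


# Hypothetical positions in metres (not taken from the document).
s = (0.5, 0.0, -2.0)
y_m = (0.0, 0.1, 1.5)
pressures = [green(y_m, s, wave_number(f)) for f in CENTER_FREQS_HZ]
```

The computed edges (about 88.4 Hz and 176.8 Hz for the 125 Hz band) agree with the rounded band limits listed above, and each band's upper edge is exactly twice its lower edge.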

In addition, the virtual sound sources are used only in the system design process, that is, in selecting the activated loudspeakers from all loudspeakers. The activated loudspeakers are then used for sound field reproduction, and the reproduced sound source is at the same position as the virtual sound sources. The virtual sound sources are 8 single-frequency sources with frequencies f₁, f₂, …, f₈. Because a virtual sound source is omnidirectional, the sound pressure of its radiated sound field at each listening point equals the value of the Green's function there. Stacking these values over the M listening points (y_m being the position of the m-th listening point, m = 1, 2, …, M, and T denoting transposition) gives a column vector: the set of sound pressures radiated by the virtual single-frequency source of frequency f_b at the M listening points, which is the sound field that the loudspeaker array must fit.
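Written out, and assuming the standard free-field Green's function with the e^{ikr} sign convention, the transfer function and the stacked pressure vector described above would take the following form. The symbol p_b^{vs} for the virtual-source pressure vector is a name chosen for this sketch.

```latex
\begin{aligned}
G(\mathbf{y}_m \mid \mathbf{s}, k_b) &= \frac{e^{\,i k_b \lVert \mathbf{y}_m - \mathbf{s} \rVert}}{4\pi \lVert \mathbf{y}_m - \mathbf{s} \rVert},
\qquad k_b = \frac{2\pi f_b}{c},\\[4pt]
\mathbf{p}^{\mathrm{vs}}_b &= \bigl[\, G(\mathbf{y}_1 \mid \mathbf{s}, k_b),\; G(\mathbf{y}_2 \mid \mathbf{s}, k_b),\; \dots,\; G(\mathbf{y}_M \mid \mathbf{s}, k_b) \,\bigr]^{T}.
\end{aligned}
```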

Preferably, step 3 finds the set of sound pressure values of the sound field radiated by the loudspeakers. First, the acoustic transfer function from the n-th loudspeaker to the m-th listening point is calculated with a Green's function, where x_n is the coordinate of the n-th loudspeaker in the loudspeaker array, n = 1, 2, …, N. The transfer functions from all loudspeakers to all listening points are then collected into one set.

Let the loudspeaker weight vector to be found (also called the loudspeaker excitation signal vector or loudspeaker drive signal vector) be w_b = [w_b(1), w_b(2), …, w_b(N)]^T, where N is the total number of loudspeakers. The set of sound pressure values of the sound field radiated by the loudspeakers is then expressed in terms of this transfer function set and w_b.
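Under the same free-field assumption, the loudspeaker-side quantities of step 3 would read as follows. The matrix name G_b is chosen by analogy with the G_h used in step 6, and p_b^{ls} for the loudspeaker pressure vector is a name introduced for this sketch.

```latex
\begin{aligned}
G(\mathbf{y}_m \mid \mathbf{x}_n, k_b) &= \frac{e^{\,i k_b \lVert \mathbf{y}_m - \mathbf{x}_n \rVert}}{4\pi \lVert \mathbf{y}_m - \mathbf{x}_n \rVert},\\[4pt]
\mathbf{G}_b &=
\begin{bmatrix}
G(\mathbf{y}_1 \mid \mathbf{x}_1, k_b) & \cdots & G(\mathbf{y}_1 \mid \mathbf{x}_N, k_b)\\
\vdots & \ddots & \vdots\\
G(\mathbf{y}_M \mid \mathbf{x}_1, k_b) & \cdots & G(\mathbf{y}_M \mid \mathbf{x}_N, k_b)
\end{bmatrix} \in \mathbb{C}^{M \times N},\\[4pt]
\mathbf{p}^{\mathrm{ls}}_b &= \mathbf{G}_b\, \mathbf{w}_b .
\end{aligned}
```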

Preferably, step 4 obtains the loudspeaker weight vector w_b required to reconstruct each virtual (single-frequency) sound source. If the sound field radiated by the virtual sound source were simply set equal to the sound field radiated by the loudspeakers and w_b were obtained by matrix inversion, the solution would be ill-conditioned and the model would overfit. l₁ regularization performs variable selection and makes the model sparser, that is, it preserves fitting accuracy while selecting only a small number of loudspeakers. The design therefore uses l₁ regularization to constrain the sound field radiated by the virtual sound source and that radiated by the loudspeakers, converting the loudspeaker weight problem into a least absolute shrinkage and selection operator (LASSO) problem. In that problem, ||·||₁ denotes the l₁ norm, ||·||₂ denotes the l₂ norm, and λ₁ = 2 is the penalty factor; the solution is the value of w_b that minimizes the regularized objective, that is, the loudspeaker weight vector at frequency f_b.
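One common way to write the l₁-regularized problem described above is shown below. Here G_b denotes the M × N matrix of loudspeaker-to-listening-point transfer functions and p_b^{vs} the vector of virtual-source pressures at the listening points; both names, and the factor 1/2 on the squared error, are assumptions of this sketch.

```latex
\hat{\mathbf{w}}_b \;=\; \arg\min_{\mathbf{w}_b}\;
\tfrac{1}{2}\,\bigl\lVert \mathbf{G}_b \mathbf{w}_b - \mathbf{p}^{\mathrm{vs}}_b \bigr\rVert_2^2
\;+\; \lambda_1 \bigl\lVert \mathbf{w}_b \bigr\rVert_1
```

The first term measures the mismatch between the two sound fields at the listening points; the second drives most entries of w_b to exactly zero, which is what allows loudspeakers to be discarded later.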

The design solves this problem with the alternating direction method of multipliers (ADMM). The problem is first written in ADMM form:

minimize f(w_b)+g(z_b)

s.t. w_b - z_b = 0

where f(w_b) is the squared-error fitting term, g(z_b) = λ₁||z_b||₁, and z_b is the variable used to constrain the loudspeaker weight vector w_b. Solving for w_b is an iterative process. The loudspeaker weight variable of the (j+1)-th iteration is obtained from the constraint variable and the dual variable of the j-th iteration, where j + 1 is the iteration number and j starts from 0. The dual variable is also used to judge the stopping condition of the algorithm, and the iteration starts from specified initial values. In the update, I is the identity matrix and ρ is the augmented Lagrangian parameter, which affects the convergence rate of the algorithm and is selected within [0.1, 10].

The constraint variable is then updated by applying S_κ(α), a soft-thresholding function, and the dual variable is updated in turn. After j iterations, the primal residual and the dual residual are evaluated. The iteration stops when both residuals fall below their bounds, where ε^pri is the upper bound of the primal residual and ε^dual is the upper bound of the dual residual.
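The iteration described above matches the standard scaled-form ADMM for an l₁-regularized least-squares problem. Under that assumption, the update, residual and tolerance formulas would read as below. Here G_b is the M × N matrix of loudspeaker-to-listening-point transfer functions, p_b^{vs} is the virtual-source pressure vector and u_b is the scaled dual variable; the names p_b^{vs} and u_b, the factor 1/2 in f, and the zero initialization are assumptions of this sketch rather than values taken from the text.

```latex
\begin{aligned}
f(\mathbf{w}_b) &= \tfrac{1}{2}\lVert \mathbf{G}_b\mathbf{w}_b - \mathbf{p}^{\mathrm{vs}}_b\rVert_2^2,
\qquad g(\mathbf{z}_b) = \lambda_1\lVert \mathbf{z}_b\rVert_1,
\qquad \mathbf{z}_b^{0} = \mathbf{u}_b^{0} = \mathbf{0},\\
\mathbf{w}_b^{j+1} &= \bigl(\mathbf{G}_b^{H}\mathbf{G}_b + \rho\,\mathbf{I}\bigr)^{-1}
\bigl(\mathbf{G}_b^{H}\mathbf{p}^{\mathrm{vs}}_b + \rho\,(\mathbf{z}_b^{j} - \mathbf{u}_b^{j})\bigr),\\
\mathbf{z}_b^{j+1} &= S_{\lambda_1/\rho}\bigl(\mathbf{w}_b^{j+1} + \mathbf{u}_b^{j}\bigr),
\qquad
S_{\kappa}(\alpha) =
\begin{cases}
\alpha\,\bigl(1 - \kappa/\lvert\alpha\rvert\bigr), & \lvert\alpha\rvert > \kappa,\\
0, & \lvert\alpha\rvert \le \kappa,
\end{cases}\\
\mathbf{u}_b^{j+1} &= \mathbf{u}_b^{j} + \mathbf{w}_b^{j+1} - \mathbf{z}_b^{j+1},\\
\mathbf{r}^{j} &= \mathbf{w}_b^{j} - \mathbf{z}_b^{j},
\qquad
\mathbf{s}^{j} = -\rho\,\bigl(\mathbf{z}_b^{j} - \mathbf{z}_b^{j-1}\bigr),\\
\lVert \mathbf{r}^{j}\rVert_2 &\le \varepsilon^{\mathrm{pri}},
\qquad
\lVert \mathbf{s}^{j}\rVert_2 \le \varepsilon^{\mathrm{dual}},\\
\varepsilon^{\mathrm{pri}} &= \sqrt{N}\,\varepsilon^{\mathrm{abs}}
+ \varepsilon^{\mathrm{rel}}\max\bigl\{\lVert \mathbf{w}_b^{j}\rVert_2,\ \lVert \mathbf{z}_b^{j}\rVert_2\bigr\},\\
\varepsilon^{\mathrm{dual}} &= \sqrt{N}\,\varepsilon^{\mathrm{abs}}
+ \varepsilon^{\mathrm{rel}}\,\rho\,\lVert \mathbf{u}_b^{j}\rVert_2 .
\end{aligned}
```

The soft threshold is applied element by element; for complex weights it shrinks the magnitude and keeps the phase.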

The design sets the tolerances to an absolute error ε^abs = 10^-4 and a relative error ε^rel = 10^-2. When the stopping criterion is satisfied, the loudspeaker weight vector w_b for frequency f_b is output. In total, 8 loudspeaker weight vectors w_b, b = 1, 2, …, 8, are calculated, one per frequency.
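The ADMM procedure of step 4 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented implementation: it uses the standard scaled-form ADMM for the objective 0.5·||G w − p||₂² + λ·||w||₁, starts from zero vectors, and all function and variable names (`admm_lasso`, `solve`, `G_demo`, `p_demo`) are hypothetical. The demonstration data are synthetic, not acoustic.

```python
import math


def solve(A, b):
    """Solve A x = b for a small dense complex system by Gaussian elimination."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        acc = aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = acc / aug[i][i]
    return x


def soft_threshold(a, kappa):
    """Shrink the magnitude of a by kappa and keep its phase."""
    mag = abs(a)
    return 0j if mag <= kappa else a * (1 - kappa / mag)


def norm2(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))


def admm_lasso(G, p, lam, rho=1.0, eps_abs=1e-4, eps_rel=1e-2, max_iter=1000):
    """Minimise 0.5*||G w - p||_2^2 + lam*||w||_1 with scaled-form ADMM."""
    M, N = len(G), len(G[0])
    # Precompute G^H G + rho*I and G^H p, which do not change between iterations.
    lhs = [[sum(G[m][i].conjugate() * G[m][j] for m in range(M))
            + (rho if i == j else 0)
            for j in range(N)] for i in range(N)]
    Ghp = [sum(G[m][i].conjugate() * p[m] for m in range(M)) for i in range(N)]
    z = [0j] * N  # constraint variable (assumed zero start)
    u = [0j] * N  # scaled dual variable (assumed zero start)
    for _ in range(max_iter):
        w = solve(lhs, [Ghp[i] + rho * (z[i] - u[i]) for i in range(N)])
        z_old = z
        z = [soft_threshold(w[i] + u[i], lam / rho) for i in range(N)]
        u = [u[i] + w[i] - z[i] for i in range(N)]
        r = norm2([w[i] - z[i] for i in range(N)])              # primal residual
        s = rho * norm2([z[i] - z_old[i] for i in range(N)])    # dual residual
        eps_pri = math.sqrt(N) * eps_abs + eps_rel * max(norm2(w), norm2(z))
        eps_dual = math.sqrt(N) * eps_abs + eps_rel * rho * norm2(u)
        if r <= eps_pri and s <= eps_dual:
            break
    return z  # z holds the exact zeros produced by the soft threshold


# Synthetic check: with G = I the minimiser is the soft threshold of p,
# so the result should approach [2, 0, -1j] for lam = 1.
G_demo = [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j], [0j, 0j, 1 + 0j]]
p_demo = [3 + 0j, 0.5 + 0j, -2j]
w_demo = admm_lasso(G_demo, p_demo, lam=1.0)
```

Returning z rather than w is a design choice of this sketch: z is the output of the soft threshold, so unselected loudspeakers come out as exact zeros, which the non-zero test of step 5 relies on.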

Preferably, step 5 aggregates all the loudspeaker weight vectors obtained in step 4 and retains their non-zero entries. The loudspeaker weight vectors at the 8 frequency points are arranged as a matrix, and its columns are added together to obtain the aggregated loudspeaker weight vector w_∑:

w_∑ = [w_∑(1), w_∑(2), …, w_∑(N)]^T

where w_∑(n) aggregates the weights of the n-th loudspeaker over the 8 frequency points, n = 1, 2, …, N, and N is the total number of loudspeakers. A loudspeaker corresponding to a non-zero element of w_∑ is judged to be an activated loudspeaker, its position is read from the loudspeaker array, and the total number of activated loudspeakers is denoted N_a. This completes the selection of N_a activated loudspeakers from the N loudspeakers; the subsequent sound field reproduction uses these activated loudspeakers.
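The selection rule of step 5 can be illustrated with a few lines of Python. The sketch sums the magnitudes of the weights across frequencies, which is an assumption made here so that complex weights of opposite phase cannot cancel; the function name `select_active`, the tolerance `tol` and the example weights are hypothetical.

```python
def select_active(weight_vectors, tol=0.0):
    """Indices of loudspeakers whose aggregated weight magnitude exceeds tol."""
    n_speakers = len(weight_vectors[0])
    # Aggregate over frequencies: one value per loudspeaker.
    w_sum = [sum(abs(w[n]) for w in weight_vectors) for n in range(n_speakers)]
    return [n for n, value in enumerate(w_sum) if value > tol]


# Hypothetical weights for N = 4 loudspeakers at two frequencies.
weights = [
    [0.8 + 0.1j, 0j, 0j, 0.2j],
    [0.5j, 0j, 0.3 + 0j, 0j],
]
active = select_active(weights)  # loudspeakers 0, 2 and 3 are kept
N_a = len(active)
```

A loudspeaker is kept if it is used at any of the frequencies, so the activated set is the union of the per-frequency selections.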

Preferably, step 6 calculates the drive signals of the N_a activated loudspeakers selected in step 5, which are used to synthesize the sound pressure of the sound source to be played back at the listening points. Let the frequency components contained in the sound source to be played back be f₁, f₂, …, f_h, …, f_H, with corresponding wave numbers k₁, k₂, …, k_h, …, k_H, where k_h = 2πf_h/c, h is the index of an actual frequency of the sound source to be played back, and H is the highest frequency index. The sound source to be played back has richer frequency content than the virtual sound sources (used for loudspeaker selection), that is, H >> 8.

The sound pressure of the target sound field radiated by the sound source to be played back is calculated as in step 2, where k_h = 2πf_h/c; s is the position of the actual single-frequency source to be played back (the same as the position of the virtual sound source); and y_m is the position of the m-th listening point, m = 1, 2, …, M.

The sound pressure of the sound field reconstructed by the activated loudspeakers is calculated as in step 3, where w_h = [w_h(1), …, w_h(N_a)]^T is the weight vector of the activated loudspeakers and G_h is the set of acoustic transfer functions from all activated loudspeakers to all listening points. Each element of G_h is the acoustic transfer function from the n-th activated loudspeaker to the m-th listening point, n = 1, 2, …, N_a, m = 1, 2, …, M.

To minimize the playback error, the design uses l₂ regularization to find the activated loudspeaker weights under the least-mean-square criterion. The solution is the value of w_h that minimizes the regularized squared error, that is, the weight vector of the activated loudspeakers at frequency f_h. λ₂ > 0 is a regularization penalty parameter that controls the total loudspeaker power; it is chosen according to the application so that playback uses as little total power as possible while keeping the error minimal. The problem has a closed-form solution, in which ^H denotes the conjugate transpose and w_h is the drive signal of the N_a activated loudspeakers at frequency f_h.
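The l₂-regularized fit described above is a ridge (Tikhonov) least-squares problem, and its standard form and closed-form solution are given below. G_h is the M × N_a transfer matrix of the activated loudspeakers as defined in the text; p_h, the target pressure vector at frequency f_h, is a name introduced for this sketch.

```latex
\begin{aligned}
\hat{\mathbf{w}}_h &= \arg\min_{\mathbf{w}_h}\;
\bigl\lVert \mathbf{G}_h \mathbf{w}_h - \mathbf{p}_h \bigr\rVert_2^2
+ \lambda_2 \bigl\lVert \mathbf{w}_h \bigr\rVert_2^2,\\[4pt]
\hat{\mathbf{w}}_h &= \bigl(\mathbf{G}_h^{H}\mathbf{G}_h + \lambda_2\,\mathbf{I}\bigr)^{-1}\mathbf{G}_h^{H}\,\mathbf{p}_h .
\end{aligned}
```

Setting the gradient of the objective to zero gives (G_h^H G_h + λ₂ I) w_h = G_h^H p_h, from which the second line follows; the λ₂ I term keeps the matrix invertible even when G_h is ill-conditioned.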

Finally, w₁, w₂, …, w_H are superposed to synthesize the drive signals of the activated loudspeakers for the sound source to be played back over all frequency components.

The invention relates to an immersive broadband 3D sound field playback method based on a loudspeaker array. Its aim is a virtual sound system that accurately plays back, in scene B and through a loudspeaker array, the sound source signal of scene A. Using convex optimization theory, the method first calculates the acoustic transfer functions from a virtual sound source of scene A placed at a specified spatial position (the same position as the sound to be reproduced, but with different frequencies) to each listening point in scene B, and takes the function values as the sound pressure values of the sound field radiated by the virtual sound source. Second, the loudspeaker array on one wall of scene B is arranged in a regular, equally spaced rectangular layout, and the acoustic propagation from every loudspeaker to the listening points is modeled with a Green's function based on the wave nature of sound. Third, the l₁ norm is used as the sparsity-inducing regularizer, the regularized problem is solved with the alternating direction method of multipliers, and the loudspeaker weights are calculated at the center frequencies of eight one-octave bands in order to select the activated loudspeakers. Finally, l₂-norm regularization is used to calculate the weight signals of the activated loudspeakers in the playback system, so that the sound field radiated by the activated loudspeakers is closest, under the least-mean-square criterion, to the sound field radiated by the sound source to be played back.
The method is conceptually clear, has low algorithmic complexity and high real-time performance, and addresses the large number of loudspeakers otherwise required for three-dimensional sound field reproduction. Most importantly, it reproduces sound through loudspeakers in a planar array layout, which is easy to install in practice, so it can be applied to home theaters and online streaming platforms as well as to office environments and audio and video conferences.

Drawings

Fig. 1 is a diagram of an example of speaker array playback.

FIG. 2 is a diagram illustrating an overall design method.

Detailed description of the invention

In the overall design flow, the sound pressure from the virtual sound sources (single-frequency sources at the same position as the sound source to be reproduced) to the listening points is calculated first. The acoustic transfer functions from all loudspeakers to the listening points are then calculated and multiplied by the corresponding loudspeaker weights to give the sound pressure expression of the sound field radiated by the loudspeakers. Combining l₁-norm regularization with the alternating direction method of multipliers, the loudspeaker weights at all frequency points are calculated and aggregated, and the loudspeakers with non-zero weights are retained as activated loudspeakers. Finally, the sound pressures from the sound source to be played back and from the activated loudspeakers to the listening points are calculated, and the loudspeaker drive signals are calculated under the least-mean-square criterion to obtain the signal played by each loudspeaker, completing the overall design. In implementation, the algorithm of the invention is embedded in software so that every stage runs automatically. The invention is further explained below with reference to Fig. 1 and Fig. 2; the specific workflow is as follows:

Step 1: Establish a spatial rectangular coordinate system with its origin at the center of the wall on which the loudspeakers are mounted, with the x axis parallel to the floor, the y axis perpendicular to the floor, and the z axis perpendicular to the wall. The coordinate position of the virtual sound source to be played back is designated s. The number of listening points in the listening area and the maximum number of loudspeakers used are set to M and N respectively, and the spatial position coordinates of the listening points and the loudspeakers are calculated.

Step 2: Construct the sound field radiated by the virtual sound source.

A Green's function characterizes the acoustic transfer function from the virtual single-frequency sound source of frequency f_b to the m-th listening point. The Green's function values at the M listening points are stacked into a column vector, where T denotes transposition and y_m is the position of the m-th listening point, m = 1, 2, …, M.

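Step 3: Construct the sound field radiated by the loudspeakers.

The acoustic transfer function from the n-th loudspeaker to the m-th listening point is calculated first. The set of sound pressure values of the sound field radiated by the loudspeakers is then expressed in terms of these transfer functions and the loudspeaker weight vector.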

Step 4: Solve for the loudspeaker weight vector w_b for each virtual sound source.

The design uses l₁ regularization to constrain the sound field radiated by the virtual sound source and that radiated by the loudspeakers, converting the loudspeaker weight problem into a least absolute shrinkage and selection operator (LASSO) problem, whose solution is the value of w_b that minimizes the regularized objective. In the form used by the alternating direction method of multipliers, the problem is:

minimize f(w_b)+g(z_b)

s.t. w_b - z_b = 0

where g(z_b) = λ₁||z_b||₁ and z_b is the variable used to constrain the loudspeaker weight vector w_b. Solving for w_b is an iterative process: the loudspeaker weight variable of the (j+1)-th iteration is obtained from the constraint variable and the dual variable of the j-th iteration, where j + 1 is the iteration number and j starts from 0. The constraint variable and the dual variable are then updated in turn. After j iterations, the primal residual and the dual residual are evaluated and compared against the stopping criterion of the iteration.

Step 5: Aggregate all the loudspeaker weight vectors obtained in step 4, count and retain the loudspeakers with non-zero entries as activated loudspeakers, and determine the number and positions of the activated loudspeakers.

The loudspeaker weight vectors w at the 8 frequency points are arranged as a matrix and added together to obtain the aggregated loudspeaker weight vector:

w_∑ = [w_∑(1), w_∑(2), …, w_∑(N)]^T

where n = 1, 2, …, N, and N is the total number of loudspeakers. A loudspeaker corresponding to a non-zero element of w_∑ is judged to be an activated loudspeaker, its position is read from the loudspeaker array, and the total number of activated loudspeakers is denoted N_a. This completes the selection of N_a activated loudspeakers from the N loudspeakers; the subsequent sound field reproduction uses these activated loudspeakers.

Step 6: Solve for the weight signals of the activated loudspeakers for the sound source to be played back.

Let the frequency components contained in the sound source to be played back be f₁, f₂, …, f_h, …, f_H, with corresponding wave numbers k₁, k₂, …, k_h, …, k_H, where k_h = 2πf_h/c, h is the index of an actual frequency of the sound source to be played back, and H is the highest frequency index. The sound source to be played back has richer frequency content than the virtual sound sources (used for loudspeaker selection), that is, H >> 8.

The target sound pressure and the sound pressure reconstructed by the activated loudspeakers are calculated as in steps 2 and 3, and the activated loudspeaker weights are obtained by l₂-regularized least-squares fitting, where ^H denotes the conjugate transpose and w_h is the drive signal of the N_a activated loudspeakers at frequency f_h.

The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims

1. An immersive broadband 3D sound field playback method comprising the steps of:

step 1, establishing a spatial rectangular coordinate system with the center of the wall surface on which the loudspeakers are located as the origin

wherein the x axis is parallel to the ground, the y axis is perpendicular to the ground, and the z axis is perpendicular to the wall surface; designating the coordinate position of the virtual sound source to be played back as s, setting the maximum number of loudspeakers used to N, and selecting M positions in the listening area as listening points;

step 2, constructing the sound field radiated by the virtual sound source

establishing the acoustic propagation characteristics from an omnidirectional virtual sound source to a listening point based on a Green's function, and taking the Green's function value at that point as the sound pressure value of the sound field radiated by the sound source in the specified area;

step 3, constructing the sound field radiated by the loudspeakers

setting the positions of the N loudspeakers, and calculating the sound pressure values generated by the loudspeakers at all listening point positions as the set of sound pressure values of the sound field radiated by the loudspeakers;

step 4, solving for the loudspeaker weight vector for the virtual sound source

using l₁-norm regularization to constrain the sound field radiated by the virtual sound source and the sound field radiated by the loudspeakers, converting the problem of solving for the loudspeaker weights into a least absolute shrinkage and selection operator problem, and solving it with the alternating direction method of multipliers to obtain a loudspeaker weight vector for playing back each virtual single-frequency sound source;

step 5, determining the number of activated loudspeakers

aggregating the loudspeaker weight vectors corresponding to all virtual single-frequency sound sources, and retaining the loudspeakers with non-zero entries as the selected loudspeakers of the playback system, while removing those with zero entries, namely the unselected loudspeakers;

step 6, calculating the weight signals of the activated loudspeakers for the sound source to be played back

for the sound source to be played back and all activated loudspeakers, calculating again, according to step 2 and step 3, the sound field radiated by the sound source to be played back and the sound field radiated by the activated loudspeakers, and, using l₂-norm regularization, calculating the weight of each loudspeaker under the least-mean-square criterion as the drive signal of that loudspeaker during reproduction.

2. The immersive broadband 3D sound field playback method of claim 1, wherein: in step 2, a Green's function represents the acoustic transfer function from the virtual single-frequency sound source of frequency f_b to the m-th listening point,

wherein k_b = 2πf_b/c is the wave number; c is the speed of sound in air, about 340 m/s; b is the frequency index; s is the position of the virtual single-frequency sound source; y_m is the position of the m-th listening point, m = 1, 2, …, M; considering a broadband speech sound field, the center frequencies of 8 one-octave bands from 100 Hz to 16 kHz are selected for joint optimization, the 8 bands being 88 Hz-177 Hz, 177 Hz-355 Hz, 355 Hz-710 Hz, 710 Hz-1420 Hz, 1420 Hz-2840 Hz, 2840 Hz-5680 Hz, 5680 Hz-11360 Hz and 11360 Hz-22720 Hz, with corresponding center frequencies 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz and 16 kHz, so that f₁ = 125 Hz, …, f₈ = 16 kHz, that is, b = 1, 2, …, 8;

the virtual sound source is used in the system design process, namely the process of selecting the activated loudspeakers from all loudspeakers; the activated loudspeakers are used for sound field reproduction; the reproduced sound source is at the same position as the virtual sound source; the virtual sound source consists of 8 single-frequency sound sources with frequencies f₁, f₂, …, f₈; because the virtual sound source is omnidirectional, the sound pressure value of its radiated sound field equals the Green's function value and is recorded as a column vector over the M listening points,

wherein ^T denotes the transpose operation and y_m is the position of the m-th listening point, m = 1, 2, …, M.

3. The immersive broadband 3D sound field playback method of claim 1, wherein: the step 3 specifically comprises the following steps:

the acoustic transfer function from the nth loudspeaker to the mth listening point is first calculated as follows:

wherein x is_nThe coordinate of the nth loudspeaker in the loudspeaker array is N, which is 1,2, …, N; the set of acoustic transfer functions from the loudspeaker to all listening points is then:

suppose the loudspeaker weight vector to be solved is w_b = [w_b(1),w_b(2),…,w_b(N)]^T, where N is the total number of loudspeakers; the set of sound pressure values of the sound field radiated by the loudspeakers can then be expressed as:
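By linear superposition, the sound field radiated by the array at the listening points is the product of the transfer-function set and the weight vector. A minimal sketch under that reading, assuming the same exp(ikr)/(4πr) Green function convention for the loudspeaker-to-listening-point transfer function; the array geometry and the names `transfer_matrix`, `G_b` and `p_hat` are hypothetical.

```python
import numpy as np

def transfer_matrix(k, speakers, points):
    """Return the M x N matrix whose (m, n) entry is the assumed Green
    function exp(i*k*r) / (4*pi*r), with r = |y_m - x_n|."""
    speakers = np.asarray(speakers, dtype=float)
    points = np.asarray(points, dtype=float)
    r = np.linalg.norm(points[:, None, :] - speakers[None, :, :], axis=-1)
    return np.exp(1j * k * r) / (4.0 * np.pi * r)

# Hypothetical 3 x 4 wall array in the plane y = 2 with 0.5 m spacing (N = 12).
xs, zs = np.meshgrid(np.arange(4) * 0.5, np.arange(3) * 0.5)
speakers = np.column_stack([xs.ravel(), np.full(xs.size, 2.0), zs.ravel()])
points = np.array([[0.5, 0.0, 0.5], [1.0, 0.0, 0.5]])  # M = 2 listening points

k_b = 2.0 * np.pi * 500.0 / 340.0
G_b = transfer_matrix(k_b, speakers, points)  # shape (M, N)

w_b = np.zeros(len(speakers), dtype=complex)
w_b[5] = 1.0                                  # drive only loudspeaker index 5
p_hat = G_b @ w_b                             # pressure radiated to the listening points
```

With a single unit weight, `p_hat` reduces to the corresponding column of `G_b`, which is a quick sanity check on the matrix layout.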

4. The immersive broadband 3D sound field playback method of claim 1, wherein step 4 specifically comprises: using the l₁ regularization method to constrain the sound field radiated by the virtual sound source and the sound field radiated by the loudspeakers, thereby converting the problem of solving the loudspeaker weights into a least absolute shrinkage and selection operator (LASSO) problem, as follows:

wherein the solution is the value of w_b that minimizes the objective, i.e., the loudspeaker weight vector obtained at frequency f_b;

the above problem is solved with the alternating direction method of multipliers (ADMM); first, the problem is written in ADMM form:

minimize f(w_b)+g(z_b)

s.t. w_b - z_b = 0

wherein g(z_b) = λ₁||z_b||₁; z_b is the variable used to constrain the loudspeaker weight vector w_b; solving for the loudspeaker weights w_b is an iterative process, and the loudspeaker weight variable at the (j+1)th iteration is obtained by the following formula:

in the formula, j+1 is the iteration number, with j starting from 0;

I is the identity matrix; ρ is the augmented Lagrangian parameter, which affects the convergence rate of the algorithm and is selected within [0.1, 10]; the constraint variable is iteratively updated as follows:

the dual variable is iteratively updated as follows:

after j iterations, the primal residual and the dual residual are evaluated, and the stopping criterion for the algorithm iteration is:

the iteration tolerances are set to: absolute error ε^abs = 10^-4 and relative error ε^rel = 10^-2; when the stopping criterion is satisfied, the loudspeaker weight vector w_b corresponding to frequency point f_b is output; in total, 8 loudspeaker weight vectors w_b, b = 1,2,…,8, are calculated, one for each frequency.
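The iteration described in this claim can be sketched as follows. This is a minimal illustration, not the claimed formulas: it assumes the smooth term is f(w) = 0.5·||p − G w||₂² and uses the standard scaled-form ADMM updates for LASSO (a linear solve for w, complex soft-thresholding for z, and a running-sum dual update) together with the usual primal and dual residual tests. The function names, the random test problem and the value of `lam1` are hypothetical.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Complex soft-thresholding: shrink each magnitude by kappa, keep the phase."""
    return np.maximum(np.abs(v) - kappa, 0.0) * np.exp(1j * np.angle(v))

def admm_lasso(G, p, lam1, rho=1.0, eps_abs=1e-4, eps_rel=1e-2, max_iter=5000):
    """Minimise 0.5*||p - G w||_2^2 + lam1*||w||_1 with scaled-form ADMM."""
    n = G.shape[1]
    z = np.zeros(n, dtype=complex)   # constraint variable
    u = np.zeros(n, dtype=complex)   # scaled dual variable
    # The w-update solves the same linear system at every iteration.
    A_inv = np.linalg.inv(G.conj().T @ G + rho * np.eye(n))
    Ghp = G.conj().T @ p
    for _ in range(max_iter):
        w = A_inv @ (Ghp + rho * (z - u))
        z_old = z
        z = soft_threshold(w + u, lam1 / rho)
        u = u + w - z
        r_norm = np.linalg.norm(w - z)              # primal residual
        s_norm = np.linalg.norm(rho * (z - z_old))  # dual residual
        eps_pri = np.sqrt(n) * eps_abs + eps_rel * max(np.linalg.norm(w),
                                                       np.linalg.norm(z))
        eps_dual = np.sqrt(n) * eps_abs + eps_rel * np.linalg.norm(rho * u)
        if r_norm <= eps_pri and s_norm <= eps_dual:
            break
    return z  # z is the exactly sparse iterate

# Hypothetical test problem: 2 of N = 40 loudspeakers explain the target field.
rng = np.random.default_rng(0)
M, N = 30, 40
G = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * M)
w_true = np.zeros(N, dtype=complex)
w_true[[3, 17]] = [1.0, -0.5j]
p = G @ w_true

w_b = admm_lasso(G, p, lam1=0.05)
active = np.flatnonzero(np.abs(w_b) > 0)  # candidate activated loudspeakers
```

Returning `z` rather than `w` is a design choice: the soft-thresholding step sets small entries exactly to zero, which is what the later non-zero test for activated loudspeakers relies on.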

5. The immersive broadband 3D sound field playback method of claim 4, wherein step 5 aggregates all the loudspeaker weight vectors obtained in step 4 and counts and retains the non-zero entries; the aggregate of the loudspeaker weight vectors at the 8 frequency points can be expressed as:

w_∑＝[w_∑(1),w_∑(2),…,w_∑(N)]^T

wherein N is the total number of loudspeakers; the loudspeakers corresponding to the non-zero elements of w_∑ are judged to be activated loudspeakers, their position information can be selected from the loudspeaker array, and the total number of activated loudspeakers is denoted N_a; this completes the selection of N_a activated loudspeakers from the N loudspeakers, and the subsequent sound field reproduction is performed with these activated loudspeakers.
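The selection step can be illustrated with a short sketch. The aggregation rule for w_∑(n) is not reproduced in this text, so the sketch assumes it is the sum over the frequency points of the weight magnitudes |w_b(n)|; the function name, the `tol` parameter and the example weights are hypothetical.

```python
import numpy as np

def select_active_speakers(weights, tol=0.0):
    """weights: (B, N) array whose row b is the loudspeaker weight vector w_b.
    Returns the indices of the activated loudspeakers and the aggregate vector."""
    w_sum = np.sum(np.abs(weights), axis=0)  # assumed aggregation rule
    active = np.flatnonzero(w_sum > tol)     # non-zero entries mark activated loudspeakers
    return active, w_sum

# Hypothetical sparse weights for B = 3 frequencies and N = 6 loudspeakers.
W = np.array([
    [0.0, 0.8 + 0.1j, 0.0, 0.0, 0.0,   0.0],
    [0.0, 0.5,        0.0, 0.0, -0.3j, 0.0],
    [0.0, 0.0,        0.0, 0.0, 0.2,   0.0],
])
active, w_sum = select_active_speakers(W)
N_a = active.size
```

A loudspeaker is kept if it is used at any of the frequency points, so the activated set is the union of the per-frequency supports.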

6. The immersive broadband 3D sound field playback method of claim 5, wherein step 6 calculates the drive signals of the N_a activated loudspeakers selected in step 5 so as to synthesize, at the listening points, the sound pressure values of the sound source to be reproduced; the frequency components contained in the sound source to be reproduced are f₁,f₂,…,f_h,…,f_H, and the corresponding wave numbers are k₁,k₂,…,k_h,…,k_H, where k_h = 2πf_h/c, h is the index of an actual frequency of the sound source to be reproduced, and H is the highest frequency index; the frequency content of the sound source to be reproduced is richer than that of the virtual sound source, i.e., H >> 8;

wherein k_h = 2πf_h/c; s is the position of the actual single-frequency sound source to be reproduced; y_m is the location of the mth listening point, m = 1,2,…,M;

wherein w_h = [w_h(1),…,w_h(N_a)]^T is the weight vector of the activated loudspeakers; G_h is the set of acoustic transfer functions from all activated loudspeakers to all listening points:

wherein each element of G_h represents the acoustic transfer function from the nth activated loudspeaker in the loudspeaker array to the mth listening point, n = 1,2,…,N_a, m = 1,2,…,M;

to minimize the playback error, l₂ regularization is used to find the activated loudspeaker weights under the least mean square criterion:

wherein the solution is the value of w_h that minimizes the objective, i.e., the weight vector of the activated loudspeakers calculated at frequency f_h; λ₂ > 0 is a regularization penalty parameter that controls the total power of the loudspeakers and is chosen according to the practical situation, so that playback is achieved with as little total power as possible while the error is minimized; the above problem is solved as follows:

wherein ^H denotes the conjugate transpose operation, and w_h is the required drive signal of the N_a activated loudspeakers at frequency f_h;
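The l₂-regularized solve can be sketched in a few lines. This minimal example assumes the objective is ||p_h − G_h w_h||₂² + λ₂||w_h||₂², whose closed-form minimizer is (G_h^H G_h + λ₂I)⁻¹ G_h^H p_h; a different scaling of the data term would only rescale λ₂. The function name `ridge_weights` and the random test data are hypothetical.

```python
import numpy as np

def ridge_weights(G, p, lam2):
    """Closed-form minimiser of ||p - G w||_2^2 + lam2 * ||w||_2^2."""
    n_a = G.shape[1]
    A = G.conj().T @ G + lam2 * np.eye(n_a)   # Hermitian positive definite for lam2 > 0
    return np.linalg.solve(A, G.conj().T @ p)

# Hypothetical data: M = 12 listening points, N_a = 5 activated loudspeakers.
rng = np.random.default_rng(1)
M, N_a = 12, 5
G_h = rng.standard_normal((M, N_a)) + 1j * rng.standard_normal((M, N_a))
p_h = rng.standard_normal(M) + 1j * rng.standard_normal(M)

lam2 = 0.1
w_h = ridge_weights(G_h, p_h, lam2)

# Optimality check: the gradient of the objective vanishes at w_h.
grad = G_h.conj().T @ (G_h @ w_h - p_h) + lam2 * w_h
```

Increasing `lam2` shrinks the norm of `w_h`, which reflects the role of λ₂ in limiting the total loudspeaker power at the cost of a larger reproduction error.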