CN108632709B - Immersive broadband 3D sound field playback method - Google Patents
- Publication number
- CN108632709B (CN201810352481.0A)
- Authority
- CN
- China
- Prior art keywords
- loudspeaker
- sound source
- sound
- sound field
- frequency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2430/00—Signal processing covered by H04R, not provided for in its groups
Abstract
The invention discloses an immersive broadband 3D sound field playback method. First, the acoustic transfer functions from a virtual sound source of scene A, placed at a specified spatial position, to each listening point in scene B are calculated, and the function values are taken as the sound pressure values of the sound field radiated by the virtual sound source. Second, the loudspeaker array on one wall of scene B is laid out as a regular, equally spaced rectangular grid, and the acoustic propagation from every loudspeaker to the listening points is modeled with a Green's function based on the wave nature of sound. Third, based on linear convex optimization theory, the $\ell_1$ norm is used as the sparsity-inducing regularizer and the regularized problem is solved with the alternating direction method of multipliers (ADMM); the loudspeaker weights are calculated at the center frequencies of eight one-octave bands, and the activated loudspeakers are selected from them. Finally, $\ell_2$-norm regularization is used to calculate the weight signals of the activated loudspeakers in the playback system, so that the sound field radiated by the activated loudspeakers is closest, under the least-mean-square criterion, to the sound field radiated by the sound source to be played back.
Description
Technical Field
The invention belongs to the technical field of acoustics, and particularly relates to an immersive broadband 3D sound field playback method.
Background
Spatial sound reproduction is the most critical part of three-dimensional audio technology: it aims to reproduce a desired sound scene as accurately as possible, giving the listener a sense of presence and immersion. It is mainly implemented by using a loudspeaker array to reconstruct, in a given space, an acoustic environment consistent with the desired sound field. In existing sound field playback systems the loudspeakers are usually arranged in a ring or on a sphere, and large equipment such as a spherical frame is needed to place them, which makes the system difficult to set up and inconvenient to use. In recent years, with the development of virtual reality and augmented reality, users' demand for immersive experiences has become more urgent, which creates a major opportunity for immersive spatial sound field playback. The primary objective of immersive playback is to give participants the sense of sharing the same ambient atmosphere: a sound source that is geographically separated from the user is played back through the loudspeaker array as if the user were in the environment of that source. For individual users, a complex ring or spherical loudspeaker layout is impractical. To address this problem, the invention designs a 3D sound field playback system for immersive broadband speech scenes using a planar loudspeaker array layout.
In immersive interaction, 3D video gives the user the position of the sound sources visually, and sound field playback supplements this visual information and deepens the user's sense of presence aurally. Immersion depends on synchronized audio and video playback, which requires a playback algorithm with high real-time performance and low delay. At present, most high-precision sound field reproduction algorithms take the sound pressure values in the audible region of the scene as the target sound field information and use a loudspeaker array to recover them. The reconstruction is usually computed with a cylindrical or spherical harmonic expansion, using high-order Hankel functions, Bessel functions and associated Legendre functions as auxiliary functions, so the computational complexity is high and application in real scenes is difficult. This design therefore recovers the target sound field by sound pressure matching and introduces $\ell_1$ and $\ell_2$ regularization to control the degree of fitting to the original sound pressure information, which improves the robustness of the algorithm. The regularization problem is solved with the alternating direction method of multipliers (ADMM) to reduce the computational complexity, speed up the algorithm and meet the real-time requirement.
Disclosure of Invention
The invention provides an immersive broadband 3D sound field playback method aiming at the problems of complicated loudspeaker layout and higher algorithm complexity in the existing multi-channel audio playback system, which is suitable for the playback of a broadband voice signal 3D sound field in an immersive application scene, and a target sound field is reconstructed by utilizing a group of loudspeakers arranged on a wall surface in a sound pressure matching mode.
The invention is used for solving the problem that the accurate perception and playback of a target sound source in an actual scene are difficult, and comprises the following steps:
Step 1. Establish a spatial rectangular coordinate system with the center of the wall on which the loudspeakers are mounted as the origin; the x axis is parallel to the ground, the y axis is perpendicular to the ground, and the z axis is perpendicular to the wall. The coordinate position of the virtual sound source to be played back is designated $s$, the number of loudspeakers used is at most $N$, and $M$ positions in the listening area are selected as listening points.
Step 2. Construct the sound field radiated by the virtual sound source. The acoustic propagation from the omnidirectional virtual sound source to the listening points is modeled with the acoustic transfer function, and the function value is taken as the sound pressure of the sound field radiated by the source in the specified area. Because the application targets broadband speech signals, the frequency range considered is 100 Hz to 16 kHz; it is divided into 8 octave bands, the center frequency of each band is taken, and 8 single-frequency virtual sound sources are used in the calculation.
Step 3. Construct the sound field radiated by the loudspeakers. Set the positions of the $N$ loudspeakers and calculate the sound pressure that each loudspeaker produces at every listening point; these values form the sound pressure set of the sound field radiated by the loudspeakers.
Step 4. Solve the loudspeaker weight vector for the virtual sound source. Using $\ell_1$-norm regularization to constrain the match between the virtual-source sound field and the loudspeaker sound field, the problem of solving the loudspeaker weights (also called loudspeaker excitation signals or driving signals, i.e. the signals played by the loudspeakers) is converted into a least absolute shrinkage and selection operator (LASSO) problem and solved with ADMM, giving one loudspeaker weight vector for each of the 8 virtual single-frequency sound sources.
Step 5. Determine the activated loudspeakers. Aggregate the loudspeaker weight vectors of the 8 virtual single-frequency sound sources; loudspeakers with non-zero entries are kept as the selected loudspeakers of the playback system (activated loudspeakers), and those with zero entries are removed (non-activated loudspeakers).
Step 6. Solve the weight signals of the activated loudspeakers for the sound source to be played back. For the sound source to be played back and all activated loudspeakers, recalculate the source sound field and the activated-loudspeaker sound field as in steps 2 and 3, and use $\ell_2$-norm regularization to calculate each loudspeaker's weight under the least-mean-square criterion; these weights serve as the loudspeaker driving signals during reproduction.
Preferably, step 2 is implemented as follows. The acoustic transfer function from the virtual single-frequency sound source of frequency $f_b$ to the $m$-th listening point is characterized by a Green's function:
where $k_b = 2\pi f_b / c$ is the wave number; $c$ is the speed of sound in air, about 340 m/s; $b$ is the frequency index; $s$ is the position (three-dimensional coordinates) of the virtual single-frequency sound source; and $y_m$ is the position of the $m$-th listening point, $m = 1, 2, \ldots, M$. For a broadband speech sound field, the center frequencies of 8 one-octave bands covering 100 Hz to 16 kHz are selected for joint optimization (the ratio of the upper to the lower limit frequency of each band is 2, and the starting frequency is 88 Hz). The 8 bands are 88 Hz-177 Hz, 177 Hz-355 Hz, 355 Hz-710 Hz, 710 Hz-1420 Hz, 1420 Hz-2840 Hz, 2840 Hz-5680 Hz, 5680 Hz-11360 Hz and 11360 Hz-22720 Hz, with center frequencies 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz and 16 kHz, so $f_1 = 125$ Hz, …, $f_8 = 16$ kHz, i.e. $b = 1, 2, \ldots, 8$.
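The band layout and wave numbers can be checked with a few lines of Python. This is a minimal sketch: the helper names `band_edges` and `wave_number` are hypothetical, and the band edges are assumed to lie at the center frequency divided and multiplied by the square root of 2, which reproduces the rounded edges listed in the text only approximately.

```python
import math

C = 340.0  # speed of sound in air (m/s), the value quoted in the text

# Nominal center frequencies of the eight octave bands (Hz), as listed in the text.
CENTER_FREQS = [125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0, 16000.0]


def band_edges(center_hz):
    """Lower and upper edge of a one-octave band around center_hz (assumed geometric)."""
    return center_hz / math.sqrt(2.0), center_hz * math.sqrt(2.0)


def wave_number(freq_hz, c=C):
    """Wave number k = 2*pi*f / c in rad/m."""
    return 2.0 * math.pi * freq_hz / c


for b, f in enumerate(CENTER_FREQS, start=1):
    lo, hi = band_edges(f)
    print(f"b={b}: {lo:8.1f} Hz - {hi:8.1f} Hz, f_b={f:7.0f} Hz, k_b={wave_number(f):8.3f} rad/m")
```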
In addition, the virtual sound source is used only in the system design stage (the process of selecting the activated loudspeakers from all loudspeakers); the activated loudspeakers are then used for sound field reproduction, and the reproduced sound source is at the same position as the virtual sound source. The virtual sound source consists of 8 single-frequency sources with frequencies $f_1, f_2, \ldots, f_8$. Because the virtual sound source is omnidirectional, the sound pressure of its radiated sound field equals the Green's function value and can be written as:
where $T$ denotes transposition; $y_m$ is the position of the $m$-th listening point, $m = 1, 2, \ldots, M$; and the resulting vector is the set of sound pressures radiated by the virtual single-frequency source of frequency $f_b$ at the $M$ listening points, i.e. the sound field that the loudspeaker array has to fit.
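A minimal sketch of the target sound field follows, assuming the standard free-field Green's function exp(i k r) / (4 pi r) for an omnidirectional point source. The sign of the exponent, the function names `green` and `target_field`, and the example coordinates are assumptions made for illustration.

```python
import cmath
import math


def green(src, pt, k):
    """Free-field Green's function exp(i*k*r) / (4*pi*r) between two 3D points (assumed form)."""
    r = math.dist(src, pt)
    return cmath.exp(1j * k * r) / (4.0 * math.pi * r)


def target_field(src, listening_points, k):
    """Sound pressures of an omnidirectional source at every listening point."""
    return [green(src, y, k) for y in listening_points]


k_b = 2.0 * math.pi * 1000.0 / 340.0        # wave number at 1 kHz
s = (0.0, 0.0, -1.0)                        # hypothetical source position, 1 m behind the wall
ys = [(0.0, 0.0, 1.0), (0.5, 0.0, 1.5)]     # hypothetical listening points
p_b = target_field(s, ys, k_b)
print([abs(p) for p in p_b])                # magnitudes decay as 1 / (4*pi*r)
```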
Preferably, step 3 obtains the set of sound pressure values of the sound field radiated by the loudspeakers. The acoustic transfer function from the $n$-th loudspeaker to the $m$-th listening point is first calculated as follows:
where $x_n$ is the coordinate of the $n$-th loudspeaker in the loudspeaker array, $n = 1, 2, \ldots, N$. The set of acoustic transfer functions from the loudspeakers to all listening points is then:
Let the loudspeaker weight vector to be solved (also called the loudspeaker excitation signal vector or loudspeaker driving signal vector) be $w_b = [w_b(1), w_b(2), \ldots, w_b(N)]^T$, where $N$ is the total number of loudspeakers. The set of sound pressure values of the sound field radiated by the loudspeakers can then be expressed as:
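As an illustration of step 3, the sketch below builds the matrix of transfer functions from every loudspeaker to every listening point and multiplies it by a weight vector. It assumes NumPy, the same free-field Green's function convention as an outgoing wave exp(i k r) / (4 pi r), and hypothetical names (`transfer_matrix`, `G_b`, `p_hat`) and coordinates.

```python
import numpy as np


def transfer_matrix(speakers, points, k):
    """M x N matrix whose (m, n) entry is the Green's function from speaker n to point m."""
    speakers = np.asarray(speakers, dtype=float)   # shape (N, 3)
    points = np.asarray(points, dtype=float)       # shape (M, 3)
    r = np.linalg.norm(points[:, None, :] - speakers[None, :, :], axis=-1)
    return np.exp(1j * k * r) / (4.0 * np.pi * r)


speakers = [(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0)]                 # hypothetical, on the wall z = 0
points = [(0.0, 0.0, 1.0), (0.0, 0.5, 1.0), (0.3, 0.0, 2.0)]   # hypothetical listening points
k_b = 2.0 * np.pi * 500.0 / 340.0
G_b = transfer_matrix(speakers, points, k_b)
w_b = np.array([1.0 + 0.0j, 0.5j])    # example loudspeaker weights
p_hat = G_b @ w_b                     # sound pressures radiated by the array at the M points
print(G_b.shape, p_hat.shape)
```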
Preferably, step 4 obtains the loudspeaker weight vector $w_b$ required to reconstruct each virtual (single-frequency) sound source. If the virtual-source sound field were simply set equal to the loudspeaker sound field and $w_b$ solved by matrix inversion, the solution would be ill-conditioned and the model would overfit. $\ell_1$ regularization performs variable selection and makes the model sparse, so the fitting accuracy is maintained while only a small number of loudspeakers are selected. This design therefore uses $\ell_1$ regularization to constrain the match between the virtual-source sound field and the loudspeaker sound field, and converts the problem of solving the loudspeaker weights into a least absolute shrinkage and selection operator (LASSO) problem, as follows:
where $\|\cdot\|_1$ denotes the $\ell_1$ norm; $\|\cdot\|_2$ denotes the $\ell_2$ norm; $\lambda_1 = 2$ is the penalty factor; and the minimizer of the objective is the loudspeaker weight vector $w_b$ at frequency $f_b$.
The design solves the above problem with the alternating direction method of multipliers (ADMM). The problem is first written in ADMM form:
$\text{minimize} \quad f(w_b) + g(z_b)$
$\text{s.t.} \quad w_b - z_b = 0$
where $f(w_b)$ is the data-fitting term of the problem above, $g(z_b) = \lambda_1 \|z_b\|_1$, and $z_b$ is the variable used to constrain the loudspeaker weight vector $w_b$. The loudspeaker weight $w_b$ is solved iteratively, and the loudspeaker weight variable of the $(j+1)$-th iteration is obtained from the following formula:
where $j+1$ is the iteration count, with $j$ starting from 0; the dual variable of the $j$-th iteration is used to evaluate the stopping condition of the algorithm; the constraint variable of the $j$-th iteration is the value of $z_b$ at that iteration; both start from their initial values at $j = 0$; $I$ is the identity matrix; and $\rho$ is the augmented Lagrangian parameter, which affects the convergence rate of the algorithm and is chosen in the range [0.1, 10]. The constraint variable is updated iteratively as follows:
where $S_\kappa(\alpha)$ is the soft-thresholding function, defined as follows:
After $j$ iterations the primal residual and the dual residual are computed; the stopping criterion of the iteration is:
where $\varepsilon^{\mathrm{pri}}$ is the upper bound of the primal residual and $\varepsilon^{\mathrm{dual}}$ is the upper bound of the dual residual, calculated by:
The design sets the iteration tolerances to an absolute error $\varepsilon^{\mathrm{abs}} = 10^{-4}$ and a relative error $\varepsilon^{\mathrm{rel}} = 10^{-2}$. When the stopping criterion is satisfied, the loudspeaker weight vector $w_b$ for frequency $f_b$ is output. In this way 8 loudspeaker weight vectors $w_b$, $b = 1, 2, \ldots, 8$, are calculated, one per frequency.
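The iteration described above can be sketched with the standard scaled-form ADMM for LASSO. The sketch rests on several assumptions not confirmed by the text: the objective is taken as 0.5*||G w - p||_2^2 + lam*||w||_1, the variables start at zero, the soft threshold is applied to the magnitude of complex entries, and the names `soft_threshold` and `admm_lasso` are hypothetical.

```python
import numpy as np


def soft_threshold(a, kappa):
    """Complex soft threshold: shrink each magnitude by kappa, keep the phase."""
    mag = np.abs(a)
    scale = np.maximum(0.0, 1.0 - kappa / np.maximum(mag, 1e-300))
    return a * scale


def admm_lasso(G, p, lam, rho=1.0, eps_abs=1e-4, eps_rel=1e-2, max_iter=1000):
    """Minimize 0.5*||G w - p||_2^2 + lam*||w||_1 with scaled-form ADMM (assumed objective)."""
    n = G.shape[1]
    z = np.zeros(n, dtype=complex)           # constraint variable, assumed zero start
    u = np.zeros(n, dtype=complex)           # scaled dual variable, assumed zero start
    A = G.conj().T @ G + rho * np.eye(n)     # fixed across iterations
    Gp = G.conj().T @ p
    for _ in range(max_iter):
        w = np.linalg.solve(A, Gp + rho * (z - u))      # weight update
        z_old = z
        z = soft_threshold(w + u, lam / rho)            # constraint-variable update
        u = u + w - z                                   # dual update
        r_norm = np.linalg.norm(w - z)                  # primal residual norm
        s_norm = np.linalg.norm(rho * (z - z_old))      # dual residual norm
        eps_pri = np.sqrt(n) * eps_abs + eps_rel * max(np.linalg.norm(w), np.linalg.norm(z))
        eps_dual = np.sqrt(n) * eps_abs + eps_rel * np.linalg.norm(rho * u)
        if r_norm <= eps_pri and s_norm <= eps_dual:
            break
    return z   # z carries exact zeros, which makes the later selection step simple


# Synthetic example: 50 candidate loudspeakers, only two of which are really needed.
rng = np.random.default_rng(0)
G = rng.standard_normal((20, 50)) + 1j * rng.standard_normal((20, 50))
w_true = np.zeros(50, dtype=complex)
w_true[[3, 17]] = [1.0, -2.0j]
w_hat = admm_lasso(G, G @ w_true, lam=2.0)
print(np.count_nonzero(w_hat), "non-zero weights out of", w_hat.size)
```

Returning the constraint variable rather than the weight variable is a design choice of this sketch: the two agree at convergence, but only the thresholded variable contains exact zeros.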
Preferably, step 5 aggregates all loudspeaker weight vectors obtained in step 4 and retains the non-zero entries. The loudspeaker weight vectors at all 8 frequencies can be collected as:
Adding the matrix column by column gives the aggregated loudspeaker weight vector $w_\Sigma$:
$w_\Sigma = [w_\Sigma(1), w_\Sigma(2), \ldots, w_\Sigma(N)]^T$
where $n = 1, 2, \ldots, N$ and $N$ is the total number of loudspeakers. A loudspeaker corresponding to a non-zero element of $w_\Sigma$ is judged to be an activated loudspeaker; the corresponding loudspeaker positions are then selected from the loudspeaker array, and the total number of activated loudspeakers is denoted $N_a$. This completes the selection of $N_a$ activated loudspeakers from the $N$ loudspeakers, which are used for the subsequent sound field reproduction.
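A short sketch of the selection step, assuming NumPy. The text does not state how complex weights are aggregated, so this sketch sums magnitudes over the frequencies to avoid cancellation; that choice, the optional tolerance `tol`, and the name `select_active` are assumptions.

```python
import numpy as np


def select_active(weights, tol=0.0):
    """weights: (B, N) array with one row of loudspeaker weights per frequency."""
    w_sum = np.abs(np.asarray(weights)).sum(axis=0)   # aggregated magnitude per loudspeaker
    active = np.flatnonzero(w_sum > tol)              # indices of activated loudspeakers
    return active, w_sum


# Two frequencies, four loudspeakers: only loudspeakers 1 and 3 are ever used.
active, w_sum = select_active([[0, 1j, 0, 0], [0, 0, 0, -2]])
print(active, len(active))
```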
Preferably, step 6 calculates the driving signals of the $N_a$ activated loudspeakers selected in step 5, which are used to synthesize the sound pressure of the sound source to be played back at the listening points. Let the frequency components of the sound source to be played back be $f_1, f_2, \ldots, f_h, \ldots, f_H$, with corresponding wave numbers $k_1, k_2, \ldots, k_h, \ldots, k_H$, where $k_h = 2\pi f_h / c$, $h$ is the index of an actual frequency of the sound source to be played back, and $H$ is the highest frequency index. The sound source to be played back has much richer frequency content than the virtual sound source used for loudspeaker selection, i.e. $H \gg 8$.
The sound pressure of the target sound field radiated by the sound source to be played back is calculated as in step 2:
where $k_h = 2\pi f_h / c$; $s$ is the position of the actual single-frequency source to be played back (the same as the position of the virtual sound source); and $y_m$ is the position of the $m$-th listening point, $m = 1, 2, \ldots, M$.
The sound pressure of the sound field reconstructed by the activated loudspeakers is calculated as in step 3:
where $w_h = [w_h(1), \ldots, w_h(N_a)]^T$ is the weight vector of the activated loudspeakers and $G_h$ is the set of acoustic transfer functions from all activated loudspeakers to all listening points:
whose entries are the acoustic transfer functions from the $n$-th activated loudspeaker to the $m$-th listening point, $n = 1, 2, \ldots, N_a$, $m = 1, 2, \ldots, M$.
To minimize the playback error, the design uses $\ell_2$ regularization to obtain the activated loudspeaker weights under the least-mean-square criterion:
where the minimizer of the objective is the weight vector $w_h$ of the activated loudspeakers at frequency $f_h$, and $\lambda_2 > 0$ is the regularization penalty parameter, which controls the total loudspeaker power; it is chosen according to the application so that playback uses as little total power as possible while keeping the error small. The solution of the above problem is:
where $H$ denotes the conjugate transpose and $w_h$ is the driving signal of the $N_a$ activated loudspeakers at frequency $f_h$.
Finally, $w_1, w_2, \ldots, w_H$ are superposed to synthesize the driving signals of the activated loudspeakers for the sound source to be played back over all frequency components.
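The $\ell_2$-regularized least-squares step has a closed-form solution via the normal equations. The sketch below assumes NumPy and the objective ||G_h w - p_h||_2^2 + lam2*||w||_2^2; the function name `ridge_weights` and the random test data are hypothetical.

```python
import numpy as np


def ridge_weights(G_h, p_h, lam2):
    """Minimize ||G_h w - p_h||_2^2 + lam2*||w||_2^2 via the normal equations."""
    n_a = G_h.shape[1]
    A = G_h.conj().T @ G_h + lam2 * np.eye(n_a)
    return np.linalg.solve(A, G_h.conj().T @ p_h)


# Synthetic example: 6 listening points, 3 activated loudspeakers.
rng = np.random.default_rng(1)
G_h = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
p_h = rng.standard_normal(6) + 1j * rng.standard_normal(6)
w_h = ridge_weights(G_h, p_h, lam2=0.1)
print(np.linalg.norm(G_h @ w_h - p_h), np.linalg.norm(w_h))
```

A larger `lam2` lowers the norm of the weights (and hence the total loudspeaker power) at the cost of a larger fitting error, which is the trade-off the text describes.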
The invention relates to an immersive broadband 3D sound field playback method based on a loudspeaker array; the aim is to design a virtual sound system that accurately plays back, in scene B, the sound source signals of scene A through a loudspeaker array. Using convex optimization, the method first calculates the acoustic transfer functions from a virtual sound source of scene A placed at a specified spatial position (at the same position as the sound to be reproduced, but with different frequencies) to each listening point in scene B, and takes the function values as the sound pressure values of the sound field radiated by the virtual sound source. Second, the loudspeaker array on one wall of scene B is laid out as a regular, equally spaced rectangular grid, and the acoustic propagation from every loudspeaker to the listening points is modeled with a Green's function based on the wave nature of sound. Third, based on linear convex optimization theory, the $\ell_1$ norm is used as the sparsity-inducing regularizer and the regularized problem is solved with the alternating direction method of multipliers (ADMM); the loudspeaker weights are calculated at the center frequencies of eight one-octave bands, and the activated loudspeakers are selected from them. Finally, $\ell_2$-norm regularization is used to calculate the weight signals of the activated loudspeakers in the playback system, so that the sound field radiated by the activated loudspeakers is closest, under the least-mean-square criterion, to the sound field radiated by the sound source to be played back.
The method is conceptually clear, has low algorithmic complexity and good real-time performance, and reduces the large number of loudspeakers otherwise required for three-dimensional sound field reproduction. Most importantly, it reproduces sound with loudspeakers in a planar array layout, which is easy to install, and it can be applied to home theaters and online streaming platforms as well as office environments and audio/video conferencing.
Drawings
Fig. 1 is a diagram of an example of speaker array playback.
FIG. 2 is a diagram illustrating an overall design method.
Detailed description of the invention
In the overall flow of the design, the sound pressure from the virtual sound source (a single-frequency source at the same position as the sound source to be reproduced) to the listening points is calculated first. The acoustic transfer functions from all loudspeakers to the listening points are then calculated and multiplied by the corresponding loudspeaker weights, which gives the sound pressure expression of the sound field radiated by the loudspeakers. The loudspeaker weights at all frequency points are calculated with $\ell_1$-norm regularization combined with ADMM and aggregated, and the non-zero entries are kept as the activated loudspeakers. Finally, the sound pressures from the sound source to be played back and from the activated loudspeakers to the listening points are calculated, and the loudspeaker driving signals are computed under the least-mean-square criterion, giving the signal played by each loudspeaker and completing the overall design. In implementation, the algorithm is embedded in software so that every stage runs automatically. The invention is further explained below with reference to Fig. 1 and Fig. 2; the specific workflow is as follows:
Step 1: Establish a spatial rectangular coordinate system with the center of the wall on which the loudspeakers are mounted as the origin, where the x axis is parallel to the ground, the y axis is perpendicular to the ground, and the z axis is perpendicular to the wall. The coordinate position of the virtual sound source to be played back is designated $s$. Set the number of listening points in the listening area and the maximum number of loudspeakers to $M$ and $N$ respectively, and calculate the spatial coordinates of the listening points and the loudspeakers.
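The coordinate setup of step 1 can be sketched as follows. The grid sizes, the spacings, the distance of the listening patch from the wall, and the helper name `planar_grid` are all hypothetical values chosen for illustration; the text only fixes the axes and the regular rectangular layout on the wall.

```python
def planar_grid(n_cols, n_rows, spacing, z=0.0):
    """Equally spaced rectangular grid in the plane z = const, centered on the z axis."""
    x0 = -(n_cols - 1) * spacing / 2.0
    y0 = -(n_rows - 1) * spacing / 2.0
    return [(x0 + i * spacing, y0 + j * spacing, z)
            for j in range(n_rows) for i in range(n_cols)]


# Hypothetical layout: 8 x 4 loudspeakers on the wall (z = 0), 0.25 m apart,
# and a 3 x 3 patch of listening points 2 m in front of the wall.
speakers = planar_grid(8, 4, 0.25)
listening_points = planar_grid(3, 3, 0.1, z=2.0)
N, M = len(speakers), len(listening_points)
print(N, M)
```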
Step 2: Construct the sound field radiated by the virtual sound source.
The acoustic transfer function from the virtual single-frequency sound source of frequency $f_b$ to the $m$-th listening point is characterized by a Green's function:
where $k_b = 2\pi f_b / c$ is the wave number; $c$ is the speed of sound in air, about 340 m/s; $b$ is the frequency index; $s$ is the position (three-dimensional coordinates) of the virtual single-frequency sound source; and $y_m$ is the position of the $m$-th listening point, $m = 1, 2, \ldots, M$. For a broadband speech sound field, the center frequencies of 8 one-octave bands covering 100 Hz to 16 kHz are selected for joint optimization (the ratio of the upper to the lower limit frequency of each band is 2, and the starting frequency is 88 Hz). The 8 bands are 88 Hz-177 Hz, 177 Hz-355 Hz, 355 Hz-710 Hz, 710 Hz-1420 Hz, 1420 Hz-2840 Hz, 2840 Hz-5680 Hz, 5680 Hz-11360 Hz and 11360 Hz-22720 Hz, with center frequencies 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz and 16 kHz, so $f_1 = 125$ Hz, …, $f_8 = 16$ kHz, i.e. $b = 1, 2, \ldots, 8$.
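For reference, the free-field Green's function commonly used for an omnidirectional point source can be written as below. The symbol $G(y_m \mid s, k_b)$ and the positive sign of the exponent (an outgoing wave under an $e^{-i\omega t}$ time convention) are assumptions; the opposite sign convention is equally common.

```latex
G\left(y_m \mid s, k_b\right)
  = \frac{e^{\, i k_b \lVert y_m - s \rVert_2}}{4\pi \lVert y_m - s \rVert_2},
\qquad k_b = \frac{2\pi f_b}{c}
```

Its magnitude decays as the inverse of the distance between source and listening point, and its phase advances by $k_b$ radians per meter.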
In addition, the virtual sound source is used only in the system design stage (the process of selecting the activated loudspeakers from all loudspeakers); the activated loudspeakers are then used for sound field reproduction, and the reproduced sound source is at the same position as the virtual sound source. The virtual sound source consists of 8 single-frequency sources with frequencies $f_1, f_2, \ldots, f_8$. Because the virtual sound source is omnidirectional, the sound pressure of its radiated sound field equals the Green's function value and can be written as:
where $T$ denotes transposition; $y_m$ is the position of the $m$-th listening point, $m = 1, 2, \ldots, M$; and the resulting vector is the set of sound pressures radiated by the virtual single-frequency source of frequency $f_b$ at the $M$ listening points, i.e. the sound field that the loudspeaker array has to fit.
Step 3: Construct the sound field radiated by the loudspeakers.
The acoustic transfer function from the $n$-th loudspeaker to the $m$-th listening point is first calculated:
where $x_n$ is the coordinate of the $n$-th loudspeaker in the loudspeaker array, $n = 1, 2, \ldots, N$. The set of acoustic transfer functions from the loudspeakers to all listening points is then:
Let the loudspeaker weight vector to be solved (also called the loudspeaker excitation signal vector or loudspeaker driving signal vector) be $w_b = [w_b(1), w_b(2), \ldots, w_b(N)]^T$, where $N$ is the total number of loudspeakers. The set of sound pressure values of the sound field radiated by the loudspeakers can then be expressed as:
Step 4: Solve the loudspeaker weight vector $w_b$ for the virtual sound source.
This design uses $\ell_1$ regularization to constrain the match between the virtual-source sound field and the loudspeaker sound field, and converts the problem of solving the loudspeaker weights into a least absolute shrinkage and selection operator (LASSO) problem, as follows:
where $\|\cdot\|_1$ denotes the $\ell_1$ norm; $\|\cdot\|_2$ denotes the $\ell_2$ norm; $\lambda_1 = 2$ is the penalty factor; and the minimizer of the objective is the loudspeaker weight vector $w_b$ at frequency $f_b$.
The design solves the above problem with the alternating direction method of multipliers (ADMM). The problem is first written in ADMM form:
$\text{minimize} \quad f(w_b) + g(z_b)$
$\text{s.t.} \quad w_b - z_b = 0$
where $f(w_b)$ is the data-fitting term of the problem above, $g(z_b) = \lambda_1 \|z_b\|_1$, and $z_b$ is the variable used to constrain the loudspeaker weight vector $w_b$. The loudspeaker weight $w_b$ is solved iteratively, and the loudspeaker weight variable of the $(j+1)$-th iteration is obtained from the following formula:
where $j+1$ is the iteration count, with $j$ starting from 0; the dual variable of the $j$-th iteration is used to evaluate the stopping condition of the algorithm; the constraint variable of the $j$-th iteration is the value of $z_b$ at that iteration; both start from their initial values at $j = 0$; $I$ is the identity matrix; and $\rho$ is the augmented Lagrangian parameter, which affects the convergence rate of the algorithm and is chosen in the range [0.1, 10]. The constraint variable is updated iteratively as follows:
where $S_\kappa(\alpha)$ is the soft-thresholding function, defined as follows:
After $j$ iterations the primal residual and the dual residual are computed; the stopping criterion of the iteration is:
where $\varepsilon^{\mathrm{pri}}$ is the upper bound of the primal residual and $\varepsilon^{\mathrm{dual}}$ is the upper bound of the dual residual, calculated by:
The design sets the iteration tolerances to an absolute error $\varepsilon^{\mathrm{abs}} = 10^{-4}$ and a relative error $\varepsilon^{\mathrm{rel}} = 10^{-2}$. When the stopping criterion is satisfied, the loudspeaker weight vector $w_b$ for frequency $f_b$ is output. In this way 8 loudspeaker weight vectors $w_b$, $b = 1, 2, \ldots, 8$, are calculated, one per frequency.
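For orientation, the textbook scaled-form ADMM for a LASSO problem is summarized below. It is a sketch under assumptions rather than the exact expressions of this design: the factor $\tfrac{1}{2}$ on the data term, the symbols $G_b$ (transfer matrix), $p_b$ (target pressures), $u_b$ (scaled dual variable), $r$ and $s$ (residuals), and the magnitude-based soft threshold for complex values are all assumed here.

```latex
\begin{aligned}
\hat{w}_b &= \arg\min_{w_b}\ \tfrac{1}{2}\lVert G_b w_b - p_b\rVert_2^2 + \lambda_1\lVert w_b\rVert_1 \\
w_b^{j+1} &= \left(G_b^{H} G_b + \rho I\right)^{-1}\left(G_b^{H} p_b + \rho\,(z_b^{j} - u_b^{j})\right) \\
z_b^{j+1} &= S_{\lambda_1/\rho}\!\left(w_b^{j+1} + u_b^{j}\right) \\
u_b^{j+1} &= u_b^{j} + w_b^{j+1} - z_b^{j+1} \\
S_\kappa(\alpha) &= \max\!\left(0,\ 1 - \frac{\kappa}{\lvert\alpha\rvert}\right)\alpha, \qquad S_\kappa(0) = 0 \\
r^{j+1} &= w_b^{j+1} - z_b^{j+1}, \qquad s^{j+1} = -\rho\,(z_b^{j+1} - z_b^{j}) \\
\lVert r^{j+1}\rVert_2 &\le \varepsilon^{\mathrm{pri}}, \qquad \lVert s^{j+1}\rVert_2 \le \varepsilon^{\mathrm{dual}} \\
\varepsilon^{\mathrm{pri}} &= \sqrt{N}\,\varepsilon^{\mathrm{abs}} + \varepsilon^{\mathrm{rel}}\max\{\lVert w_b^{j+1}\rVert_2,\ \lVert z_b^{j+1}\rVert_2\} \\
\varepsilon^{\mathrm{dual}} &= \sqrt{N}\,\varepsilon^{\mathrm{abs}} + \varepsilon^{\mathrm{rel}}\,\lVert \rho\, u_b^{j+1}\rVert_2
\end{aligned}
```

Because the matrix $G_b^{H} G_b + \rho I$ does not change between iterations, it can be factorized once per frequency, which keeps each iteration cheap.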
Step 5: Aggregate all loudspeaker weight vectors obtained in step 4, retain the non-zero entries as the activated loudspeakers, and determine the number and positions of the activated loudspeakers.
The loudspeaker weight vectors at all 8 frequencies can be collected as:
Adding the matrix column by column gives the aggregated loudspeaker weight vector $w_\Sigma$:
$w_\Sigma = [w_\Sigma(1), w_\Sigma(2), \ldots, w_\Sigma(N)]^T$
where $n = 1, 2, \ldots, N$ and $N$ is the total number of loudspeakers. A loudspeaker corresponding to a non-zero element of $w_\Sigma$ is judged to be an activated loudspeaker; the corresponding loudspeaker positions are then selected from the loudspeaker array, and the total number of activated loudspeakers is denoted $N_a$. This completes the selection of $N_a$ activated loudspeakers from the $N$ loudspeakers, which are used for the subsequent sound field reproduction.
Step 6: Solve the weight signals of the activated loudspeakers for the sound source to be played back.
Let the frequency components of the sound source to be played back be $f_1, f_2, \ldots, f_h, \ldots, f_H$, with corresponding wave numbers $k_1, k_2, \ldots, k_h, \ldots, k_H$, where $k_h = 2\pi f_h / c$, $h$ is the index of an actual frequency of the sound source to be played back, and $H$ is the highest frequency index. The sound source to be played back has much richer frequency content than the virtual sound source used for loudspeaker selection, i.e. $H \gg 8$.
The sound pressure of the target sound field radiated by the sound source to be played back is calculated as in step 2:
where $k_h = 2\pi f_h / c$; $s$ is the position of the actual single-frequency source to be played back (the same as the position of the virtual sound source); and $y_m$ is the position of the $m$-th listening point, $m = 1, 2, \ldots, M$.
The sound pressure of the sound field reconstructed by the activated loudspeakers is calculated as in step 3:
where $w_h = [w_h(1), \ldots, w_h(N_a)]^T$ is the weight vector of the activated loudspeakers and $G_h$ is the set of acoustic transfer functions from all activated loudspeakers to all listening points:
whose entries are the acoustic transfer functions from the $n$-th activated loudspeaker to the $m$-th listening point, $n = 1, 2, \ldots, N_a$, $m = 1, 2, \ldots, M$.
To minimize the playback error, the design uses $\ell_2$ regularization to obtain the activated loudspeaker weights under the least-mean-square criterion:
where the minimizer of the objective is the weight vector $w_h$ of the activated loudspeakers at frequency $f_h$, and $\lambda_2 > 0$ is the regularization penalty parameter, which controls the total loudspeaker power; it is chosen according to the application so that playback uses as little total power as possible while keeping the error small. The solution of the above problem is:
where $H$ denotes the conjugate transpose and $w_h$ is the driving signal of the $N_a$ activated loudspeakers at frequency $f_h$.
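The closed form of an $\ell_2$-regularized least-squares problem follows from setting the gradient of the objective to zero. The derivation below is a sketch in assumed notation: $p_h$ stands for the target pressure vector at frequency $f_h$, and the objective is assumed to carry no extra scaling factors.

```latex
\begin{aligned}
\hat{w}_h &= \arg\min_{w_h}\ \lVert G_h w_h - p_h\rVert_2^2 + \lambda_2\lVert w_h\rVert_2^2 \\
0 &= G_h^{H}\left(G_h \hat{w}_h - p_h\right) + \lambda_2\,\hat{w}_h \\
\hat{w}_h &= \left(G_h^{H} G_h + \lambda_2 I\right)^{-1} G_h^{H} p_h
\end{aligned}
```

Since $\lambda_2 > 0$, the matrix $G_h^{H} G_h + \lambda_2 I$ is positive definite, so the inverse always exists even when $G_h$ is ill-conditioned.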
Finally, $w_1, w_2, \ldots, w_H$ are superposed to synthesize the driving signals of the activated loudspeakers for the sound source to be played back over all frequency components.
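One way to read the superposition step is as frequency-domain filtering: each frequency bin of the source signal is scaled by the weight of that bin for each activated loudspeaker, and the result is transformed back to the time domain. The sketch below assumes NumPy, a single block of a real-valued source signal, and one weight row per `rfft` bin; the function name `synthesize_drive_signals` is hypothetical, and this interpretation is an assumption rather than a procedure stated in the text.

```python
import numpy as np


def synthesize_drive_signals(x, weights):
    """x: real source signal of length T.
    weights: (T // 2 + 1, N_a) complex array, row h holding the weights w_h for rfft bin h.
    Note: NumPy synthesizes with exp(+i*2*pi*f*t); weights derived with an exp(+i*k*r)
    Green's function imply the opposite time convention and would need conjugating first.
    """
    X = np.fft.rfft(x)                           # spectrum of the source signal
    Y = weights * X[:, None]                     # per-bin, per-loudspeaker weighting
    return np.fft.irfft(Y, n=len(x), axis=0)     # (T, N_a) time-domain driving signals


# Sanity example: frequency-independent gains simply scale the signal.
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
weights = np.tile([0.5, 2.0], (33, 1)).astype(complex)
y = synthesize_drive_signals(x, weights)
print(y.shape)
```

For long signals a block-based scheme (for example overlap-add) would be needed to avoid circular-convolution artifacts; that is outside the scope of this sketch.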
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.
Claims (6)
1. An immersive broadband 3D sound field playback method comprising the steps of:
step 1, establishing a spatial rectangular coordinate system with its origin at the center of the wall on which the loudspeakers are located,
wherein the x axis is parallel to the ground, the y axis is perpendicular to the ground, and the z axis is perpendicular to the wall; designating the coordinate position of the virtual sound source to be replayed as s, setting the maximum number of loudspeakers used to N, and selecting M positions in the listening area as listening points;
step 2, constructing the radiated sound field of the virtual sound source:
establishing, based on Green's function, the acoustic propagation characteristics from an omnidirectional virtual sound source to a listening point, and taking the Green's function value at that point as the sound pressure value of the sound field radiated by the sound source in the specified area;
step 3, constructing a radiation sound field of the loudspeaker
Setting the positions of N loudspeakers, and respectively calculating the sound pressure values generated by the loudspeakers at all listening point positions as the sound pressure value set of the sound field radiated by the loudspeakers;
step 4, solving the loudspeaker weight vectors for the virtual sound source:
using $l_1$-norm regularization to constrain the radiated sound field of the virtual sound source and that of the loudspeakers, converting the problem of solving the loudspeaker weights into a least absolute shrinkage and selection operator (LASSO) problem, and solving it with the alternating direction method of multipliers (ADMM) to obtain a loudspeaker weight vector for replaying each virtual single-frequency sound source;
step 5, determining the number of activated loudspeakers:
aggregating the loudspeaker weight vectors corresponding to all virtual single-frequency sound sources, and retaining the loudspeakers with non-zero entries as the selected loudspeakers of the replay system, while removing those with zero entries, i.e., the unselected loudspeakers;
step 6, calculating the weight signals of the activated loudspeakers for the sound source to be replayed:
for the sound source to be replayed and all activated loudspeakers, calculating again, according to step 2 and step 3, the radiated sound field of the sound source to be replayed and the radiated sound field of the activated loudspeakers, and using $l_2$-norm regularization to calculate the weight of each loudspeaker under the least-mean-square criterion as the driving signal of that loudspeaker during playback.
2. The immersive broadband 3D sound field playback method of claim 1, wherein: in step 2, Green's function is used to represent the acoustic transfer function from the virtual single-frequency sound source of frequency $f_b$ to the $m$-th listening point:
where $k_b = 2\pi f_b / c$ is the wave number; $c$ is the speed of sound in air, about 340 m/s; $b$ is the frequency index; $s$ is the position of the virtual single-frequency sound source; $y_m$ is the position of the $m$-th listening point, $m = 1, 2, \dots, M$; considering a broadband speech sound field, the center frequencies of the 8 one-octave bands from 100 Hz to 16 kHz are selected for joint optimization, the 8 bands being 88 Hz-177 Hz, 177 Hz-355 Hz, 355 Hz-710 Hz, 710 Hz-1420 Hz, 1420 Hz-2840 Hz, 2840 Hz-5680 Hz, 5680 Hz-11360 Hz and 11360 Hz-22720 Hz, with corresponding center frequencies 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz and 16 kHz, so that $f_1 = 125$ Hz, …, $f_8 = 16$ kHz, i.e., $b = 1, 2, \dots, 8$;
the virtual sound source is used in the system design process, namely the process of selecting, from all loudspeakers, the activated loudspeakers that are then used for sound field reproduction; the sound source to be replayed is at the same position as the virtual sound source; the virtual sound source consists of 8 single-frequency sound sources with frequencies $f_1, f_2, \dots, f_8$; because the virtual sound source is omnidirectional, the sound pressure value of its radiated sound field equals the Green's function value and can be written as:
where $^T$ denotes the transpose; $y_m$ is the position of the $m$-th listening point, $m = 1, 2, \dots, M$; and the resulting vector is the set of radiated sound field values of the virtual single-frequency sound source of frequency $f_b$ at the $M$ listening points, i.e., the sound field to be fitted by the loudspeaker array.
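The target field of claim 2 can be illustrated with a short standard-library Python sketch. The displayed Green's function is not legible in this text, so the free-field form $e^{ikr}/(4\pi r)$ is an assumption (including the sign of the exponent); the source and listening positions are hypothetical, and only the speed of sound and the eight center frequencies come from the claim.

```python
import cmath
import math

C = 340.0  # speed of sound in air (m/s), as stated in the claim

# Center frequencies f_1 ... f_8 of the eight one-octave bands: 125 Hz ... 16 kHz
CENTER_FREQS = [125.0 * 2 ** b for b in range(8)]


def green(source, point, freq):
    """Assumed free-field Green's function exp(i k r) / (4 pi r) between two 3-D points."""
    k = 2.0 * math.pi * freq / C      # wave number k_b = 2 pi f_b / c
    r = math.dist(source, point)      # source-to-listener distance
    return cmath.exp(1j * k * r) / (4.0 * math.pi * r)


def target_field(source, listeners, freq):
    """Pressure values of an omnidirectional source at all M listening points."""
    return [green(source, y, freq) for y in listeners]


# Hypothetical layout: source 1 m behind the wall, three listening points in front
s = (0.0, 0.0, -1.0)
ys = [(-0.5, 0.0, 2.0), (0.0, 0.0, 2.0), (0.5, 0.0, 2.0)]
fields = {f: target_field(s, ys, f) for f in CENTER_FREQS}
```

Each entry of `fields` is one of the eight sound-field vectors that the loudspeaker array is asked to fit during selection.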
3. The immersive broadband 3D sound field playback method of claim 1, wherein: the step 3 specifically comprises the following steps:
the acoustic transfer function from the nth loudspeaker to the mth listening point is first calculated as follows:
where $x_n$ is the coordinate of the $n$-th loudspeaker in the loudspeaker array, $n = 1, 2, \dots, N$; the set of acoustic transfer functions from the loudspeakers to all listening points is then:
assuming the loudspeaker weight vector to be solved is $w_b = [w_b(1), w_b(2), \dots, w_b(N)]^T$, where $N$ is the total number of loudspeakers, the set of sound pressure values of the sound field radiated by the loudspeakers can be expressed as:
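The loudspeaker model of claim 3 can be sketched with NumPy as a transfer matrix multiplied by a weight vector. The sketch assumes the same free-field Green's function form $e^{ikr}/(4\pi r)$ as for the virtual source, an equally spaced rectangular grid on the wall plane $z = 0$ centered on the origin, and hypothetical grid size, spacing and listening points; `wall_grid` and `transfer_matrix` are illustrative names, not terms from the patent.

```python
import numpy as np

C = 340.0  # speed of sound in air (m/s)


def wall_grid(n_cols, n_rows, spacing):
    """Equally spaced rectangular grid on the wall plane z = 0, centered on the origin."""
    xs = (np.arange(n_cols) - (n_cols - 1) / 2.0) * spacing
    ys = (np.arange(n_rows) - (n_rows - 1) / 2.0) * spacing
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])


def transfer_matrix(speakers, listeners, freq):
    """M x N matrix; entry (m, n) is the assumed Green's function from speaker n to listener m."""
    k = 2.0 * np.pi * freq / C
    r = np.linalg.norm(listeners[:, None, :] - speakers[None, :, :], axis=2)
    return np.exp(1j * k * r) / (4.0 * np.pi * r)


speakers = wall_grid(4, 3, 0.25)                           # N = 12 loudspeakers
listeners = np.array([[0.0, 0.0, 2.0], [0.4, 0.1, 2.5]])   # M = 2 listening points
G = transfer_matrix(speakers, listeners, 1000.0)
w = np.ones(len(speakers), dtype=complex)                  # placeholder weight vector w_b
p_speakers = G @ w                                         # radiated pressures at the listening points
```

With this layout, the field radiated by the array is linear in the weights, which is what makes the later LASSO and ridge formulations ordinary linear inverse problems.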
4. The immersive broadband 3D sound field playback method of claim 1, wherein step 4 specifically comprises: using the $l_1$ regularization method to constrain the radiated sound field of the virtual sound source and that of the loudspeakers, and converting the problem of solving the loudspeaker weights into a least absolute shrinkage and selection operator (LASSO) problem, as follows:
where $\|\cdot\|_1$ denotes the $l_1$ norm; $\|\cdot\|_2$ denotes the $l_2$ norm; $\lambda_1$ is a penalty factor; and the solution is the value of $w_b$ that minimizes the objective, i.e., the loudspeaker weight vector at frequency $f_b$;
the above problem is solved by the alternating direction method of multipliers (ADMM); first, it is written in ADMM form:
$$\text{minimize} \quad f(w_b) + g(z_b)$$
$$\text{s.t.} \quad w_b - z_b = 0$$
where $g(z_b) = \lambda_1 \|z_b\|_1$; $z_b$ is the variable used to constrain the loudspeaker weight vector $w_b$; solving for the loudspeaker weights $w_b$ is an iterative process, and the loudspeaker weight variable at the $(j+1)$-th iteration is obtained from:
where $j + 1$ is the iteration number, with $j$ starting from 0; the dual variable of the $j$-th iteration is used to evaluate the stopping condition of the algorithm; the constraint variable of the $j$-th iteration is given an initial value before the first iteration; $I$ is the identity matrix; $\rho$ is the augmented Lagrangian parameter, which affects the convergence rate of the algorithm and is selected within $[0.1, 10]$; the constraint variable is updated iteratively as:
where $S_\kappa(\alpha)$ is the soft-threshold function, defined as follows:
after $j$ iterations, a primal residual and a dual residual are obtained, and the stopping criterion of the iteration is:
where $\epsilon^{pri}$ is the upper bound of the primal residual and $\epsilon^{dual}$ is the upper bound of the dual residual, calculated as:
the iteration tolerances are set to an absolute error $\epsilon^{abs} = 10^{-4}$ and a relative error $\epsilon^{rel} = 10^{-2}$; when the stopping criterion is satisfied, the loudspeaker weight vector $w_b$ corresponding to frequency $f_b$ is output; in total, 8 loudspeaker weight vectors $w_b$, $b = 1, 2, \dots, 8$, at the different frequencies are calculated.
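The iteration of claim 4 can be sketched with NumPy as follows. Because the update and residual formulas are not legible in this text, the sketch follows the standard scaled-form ADMM for the LASSO from the optimization literature, with objective $\tfrac{1}{2}\|Gw - p\|_2^2 + \lambda_1\|z\|_1$ and a magnitude-shrinking soft threshold for complex weights; these forms, the zero initialization, `max_iter`, and the random test matrix are assumptions. The tolerances $10^{-4}$ and $10^{-2}$ and the range of $\rho$ come from the claim.

```python
import numpy as np


def soft_threshold(a, kappa):
    """Shrink each entry's magnitude by kappa while keeping its phase."""
    mag = np.abs(a)
    safe = np.where(mag > 0.0, mag, 1.0)   # avoid dividing by zero
    return a / safe * np.maximum(mag - kappa, 0.0)


def admm_lasso(G, p, lam1, rho=1.0, eps_abs=1e-4, eps_rel=1e-2, max_iter=500):
    """Minimize 0.5*||G w - p||_2^2 + lam1*||z||_1 subject to w - z = 0."""
    n = G.shape[1]
    GH = G.conj().T
    A = GH @ G + rho * np.eye(n)     # system matrix of the w-update
    Gp = GH @ p
    z = np.zeros(n, dtype=complex)   # constraint variable
    u = np.zeros(n, dtype=complex)   # scaled dual variable
    for _ in range(max_iter):
        w = np.linalg.solve(A, Gp + rho * (z - u))
        z_old = z
        z = soft_threshold(w + u, lam1 / rho)
        u = u + w - z
        r_norm = np.linalg.norm(w - z)                # primal residual
        s_norm = np.linalg.norm(rho * (z - z_old))    # dual residual
        eps_pri = np.sqrt(n) * eps_abs + eps_rel * max(np.linalg.norm(w), np.linalg.norm(z))
        eps_dual = np.sqrt(n) * eps_abs + eps_rel * np.linalg.norm(rho * u)
        if r_norm <= eps_pri and s_norm <= eps_dual:
            break
    return z  # exact zeros mark loudspeakers that are not selected


# Hypothetical stand-in for one frequency: 20 listening points, 8 loudspeakers
rng = np.random.default_rng(0)
G = rng.standard_normal((20, 8)) + 1j * rng.standard_normal((20, 8))
w_true = np.zeros(8, dtype=complex)
w_true[[1, 5]] = [1.0 + 0.5j, -0.8j]
p = G @ w_true
w_sparse = admm_lasso(G, p, lam1=1.0)
active = np.flatnonzero(w_sparse)
```

Returning `z` rather than `w` matters for selection: only the soft-thresholded variable contains exact zeros, so it gives an unambiguous set of non-zero loudspeakers.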
5. The immersive broadband 3D sound field playback method of claim 4, wherein step 5 aggregates all the loudspeaker weight vectors obtained in step 4 and counts and retains their non-zero entries; the loudspeaker weight vectors $w$ at all 8 frequencies can be expressed as:
adding the matrix column by column finally yields the aggregated loudspeaker weight vector $w_\Sigma$:
$w_\Sigma = [w_\Sigma(1), w_\Sigma(2), \dots, w_\Sigma(N)]^T$
where $N$ is the total number of loudspeakers; the loudspeakers corresponding to the non-zero elements of $w_\Sigma$ are judged to be activated loudspeakers, their position information can be looked up in the loudspeaker array, and the total number of activated loudspeakers is denoted $N_a$; this completes the process of selecting $N_a$ activated loudspeakers from the $N$ loudspeakers, and these activated loudspeakers are used for the subsequent sound field reproduction.
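The selection rule of claim 5 can be illustrated with a small standard-library Python sketch. The aggregation formula is not legible in this text; the sketch sums weight magnitudes over frequency so that opposite-phase weights cannot cancel, which is an assumption, and the weights and positions shown are made-up values for five loudspeakers at three frequencies.

```python
def select_active(weight_vectors, tol=0.0):
    """Return indices of loudspeakers whose aggregated weight is non-zero.

    weight_vectors holds one length-N sequence of complex weights per frequency.
    Magnitudes are summed (an assumption) to form the aggregated vector w_sum.
    """
    n = len(weight_vectors[0])
    w_sum = [sum(abs(w[i]) for w in weight_vectors) for i in range(n)]
    active = [i for i, v in enumerate(w_sum) if v > tol]
    return active, w_sum


# Hypothetical sparse weights for N = 5 loudspeakers at 3 frequencies
w_by_freq = [
    [0.0, 0.7 + 0.1j, 0.0, 0.0, 0.0],
    [0.0, 0.2j, 0.0, -0.4, 0.0],
    [0.0, 0.0, 0.0, 0.5 - 0.5j, 0.0],
]
speaker_positions = [(-0.5, 0.0, 0.0), (-0.25, 0.0, 0.0), (0.0, 0.0, 0.0),
                     (0.25, 0.0, 0.0), (0.5, 0.0, 0.0)]

active, w_sum = select_active(w_by_freq)
active_positions = [speaker_positions[i] for i in active]  # positions of activated loudspeakers
n_active = len(active)                                     # N_a
```

A small positive `tol` can be used instead of an exact zero test if the weights come from a solver that does not produce exact zeros.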
6. The immersive broadband 3D sound field playback method of claim 5, wherein step 6 calculates the driving signals of the $N_a$ activated loudspeakers selected in step 5, so as to synthesize at the listening points the sound pressure values of the sound source to be replayed; the frequency-domain components of the sound source to be replayed are $f_1, f_2, \dots, f_h, \dots, f_H$, with corresponding wave numbers $k_1, k_2, \dots, k_h, \dots, k_H$, where $k_h = 2\pi f_h / c$, $h$ is the index of an actual frequency of the sound source to be replayed, and $H$ is the highest frequency index; the sound source to be replayed has richer frequency content than the virtual sound source, i.e., $H \gg 8$;
the sound pressure of the target sound field radiated by the sound source to be replayed is calculated according to step 2,
where $k_h = 2\pi f_h / c$; $s$ is the position of the actual single-frequency sound source to be replayed; $y_m$ is the position of the $m$-th listening point, $m = 1, 2, \dots, M$,
the sound pressure of the sound field reconstructed by the activated loudspeakers is calculated according to step 3,
where $w_h = [w_h(1), \dots, w_h(N_a)]^T$ is the weight vector of the activated loudspeakers, and $g_h$ is the set of acoustic transfer functions from all activated loudspeakers to all listening points:
where each entry is the acoustic transfer function from the $n$-th loudspeaker in the loudspeaker array to the $m$-th listening point, $n = 1, 2, \dots, N_a$, $m = 1, 2, \dots, M$,
to minimize the playback error, $l_2$ regularization is used to find the activated-loudspeaker weights under the least-mean-square criterion:
where the solution is the value of $w_h$ that minimizes the regularized objective, i.e., the weight vector of the activated loudspeakers at frequency $f_h$; $\lambda_2 > 0$ is a regularization penalty parameter that controls the total loudspeaker power and is chosen according to the application, so that playback is achieved with as little total power as possible while the error is minimized; the above problem is solved as:
where $^H$ denotes the conjugate transpose, and $w_h$ is the resulting drive signal of the $N_a$ activated loudspeakers at frequency $f_h$;
finally, $w_1, w_2, \dots, w_H$ are superposed to synthesize the driving signals of the activated loudspeakers for the sound source to be replayed over all frequency components.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810352481.0A CN108632709B (en) | 2018-04-19 | 2018-04-19 | Immersive broadband 3D sound field playback method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108632709A (en) | 2018-10-09 |
CN108632709B (en) | 2021-04-27 |
Family
ID=63705604
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810352481.0A (Active; CN108632709B (en)) | 2018-04-19 | 2018-04-19 | Immersive broadband 3D sound field playback method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108632709B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11503422B2 (en) * | 2019-01-22 | 2022-11-15 | Harman International Industries, Incorporated | Mapping virtual sound sources to physical speakers in extended reality applications |
CN113314129B (en) * | 2021-04-30 | 2022-08-05 | 北京大学 | Sound field replay space decoding method adaptive to environment |
CN113395638B (en) * | 2021-05-25 | 2022-07-26 | 西北工业大学 | Indoor sound field loudspeaker replaying method based on equivalent source method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102790931A (en) * | 2011-05-20 | 2012-11-21 | 中国科学院声学研究所 | Distance sense synthetic method in three-dimensional sound field synthesis |
CN102901559A (en) * | 2012-09-27 | 2013-01-30 | 哈尔滨工程大学 | Sound field separating method based on single-surface measurement and local acoustical holography method |
CN103712684A (en) * | 2013-12-25 | 2014-04-09 | 广西科技大学 | Sound field rebuilding method |
CN105181121A (en) * | 2015-05-29 | 2015-12-23 | 合肥工业大学 | High-precision near-field acoustic holography algorithm adopting weighted iteration equivalent source method |
JP2017028494A (en) * | 2015-07-22 | 2017-02-02 | 日本電信電話株式会社 | Acoustic field sound collection and reproduction device, method for the same and program |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100530350C (en) * | 2005-09-30 | 2009-08-19 | 中国科学院声学研究所 | Sound radiant generation method to object |
US20090103753A1 (en) * | 2007-10-19 | 2009-04-23 | Weistech Technology Co., Ltd | Three-dimension array structure of surround-sound speaker |
US20150294041A1 (en) * | 2013-07-11 | 2015-10-15 | The University Of North Carolina At Chapel Hill | Methods, systems, and computer readable media for simulating sound propagation using wave-ray coupling |
CN107566970A (en) * | 2017-07-20 | 2018-01-09 | 西北工业大学 | A kind of medium-high frequency Reconstruction of Sound Field method inside enclosed environment |
CN107566969A (en) * | 2017-07-20 | 2018-01-09 | 西北工业大学 | A kind of enclosed environment internal low-frequency Reconstruction of Sound Field method |
Also Published As
Publication number | Publication date |
---|---|
CN108632709A (en) | 2018-10-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||