CN108632709B - Immersive broadband 3D sound field playback method - Google Patents

Publication number
CN108632709B
Authority
CN
China
Prior art keywords
loudspeaker
sound source
sound
sound field
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810352481.0A
Other languages
Chinese (zh)
Other versions
CN108632709A (en
Inventor
贾懋珅
张家铭
鲍长春
吴宇轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Application filed by Beijing University of Technology
Priority to CN201810352481.0A
Publication of CN108632709A
Application granted
Publication of CN108632709B

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H04R2430/00: Signal processing covered by H04R, not provided for in its groups

Abstract

The invention discloses an immersive broadband 3D sound field playback method. First, the acoustic transfer functions from a virtual sound source of scene A, placed at a specified spatial position, to each listening point are calculated, and the function values are taken as the sound pressure values of the sound field radiated by the virtual sound source. Second, the loudspeaker array on one wall of scene B is laid out as a regular, equally spaced rectangular grid, and the acoustic propagation from every loudspeaker to the listening points is modeled with the Green's function, based on the wave nature of sound. Third, based on linear convex optimization theory, the $\ell_1$ norm is used as the sparsity-inducing regularizer and the regularized problem is solved with the alternating direction method of multipliers (ADMM); the center frequencies of eight one-octave bands are used to calculate the loudspeaker weights and thereby select the activated loudspeakers. Finally, $\ell_2$-norm regularization is used to calculate the weight signals of the activated loudspeakers in the playback system, so that the sound field radiated by the activated loudspeakers is closest, under the least-mean-square criterion, to the sound field radiated by the sound source to be played back.

Description

Immersive broadband 3D sound field playback method
Technical Field
The invention belongs to the technical field of acoustics, and particularly relates to an immersive broadband 3D sound field playback method.
Background
Spatial sound reproduction is the most critical part of three-dimensional audio technology: it aims to reproduce a desired sound scene as accurately as possible, giving the listener a sense of presence and immersion. It is mainly implemented by using a loudspeaker array to reconstruct, in a given space, an acoustic environment consistent with the desired sound field. In existing sound field playback systems the loudspeakers are usually arranged in a circular or spherical array, and actual placement requires a spherical frame or other large equipment, which makes the system difficult to operate and inconvenient for the user. In recent years, with the development of virtual reality and augmented reality, users' demand for immersive experiences has become more urgent, creating a major opportunity for immersive spatial sound field playback. The primary objective of immersive playback is to give participants the sense of sharing the same ambient atmosphere: a sound source that is geographically separated from the user is played back through the loudspeaker array as if the user were in the environment of the sound source, so that the user feels present at the scene. For individual users, a complex circular or spherical loudspeaker array layout is impractical. To address this problem, the invention designs a 3D sound field playback system for immersive broadband speech scenes that uses a planar-array loudspeaker layout.
In immersive interaction, 3D video gives the user the positions of sound sources visually, and sound field playback supplements this visual information and deepens the user's sense of presence aurally. Immersion depends on synchronized audio and video playback, which requires a playback algorithm with high real-time performance and low delay. At present, most higher-precision sound field reproduction algorithms take the sound pressure in the audible region of a scene as the target sound field information and use a loudspeaker array to recover it. The reconstruction is usually computed with a harmonic expansion in cylindrical or spherical coordinates, with high-order Hankel functions, Bessel functions and associated Legendre functions as auxiliary functions. The computational complexity is therefore high, and application in real scenes is difficult. This design instead recovers the target sound field by sound pressure matching and introduces $\ell_1$ and $\ell_2$ regularization to control the degree of fitting to the original sound pressure information, which improves the robustness of the algorithm. The regularization problem is solved with the alternating direction method of multipliers (ADMM), which reduces the computational complexity, speeds up the algorithm and helps meet the real-time requirement.
Disclosure of Invention
To address the complicated loudspeaker layout and high algorithmic complexity of existing multi-channel audio playback systems, the invention provides an immersive broadband 3D sound field playback method. The method is suited to playing back the 3D sound field of a broadband speech signal in immersive applications, and it reconstructs the target sound field by sound pressure matching with a group of loudspeakers arranged on a wall.
The invention addresses the difficulty of accurately perceiving and playing back a target sound source in a real scene, and comprises the following steps:
Step 1: Establish a spatial rectangular coordinate system with its origin at the center of the wall on which the loudspeakers are mounted; the x axis is parallel to the ground, the y axis is perpendicular to the ground, and the z axis is perpendicular to the wall. Denote the coordinates of the virtual sound source to be played back by s, let the maximum number of loudspeakers be N, and select M positions in the listening area as listening points.
Step 2: Construct the sound field radiated by the virtual sound source. Model the acoustic propagation from the omnidirectional virtual sound source to the listening points with the acoustic transfer function, and take the function values as the sound pressure values of the sound field radiated by the source in the specified area. Because the target signal is broadband speech, the frequency range considered is 100 Hz to 16 kHz. This range is divided into 8 one-octave bands, the center frequency of each band is taken, and 8 single-frequency virtual sound sources are used for the calculation.
Step 3: Construct the sound field radiated by the loudspeakers. Set the positions of the N loudspeakers and calculate the sound pressure that the loudspeakers generate at every listening point; these values form the set of sound pressure values of the loudspeaker-radiated sound field.
Step 4: Solve the loudspeaker weight vector for each virtual sound source. Use $\ell_1$-norm regularization to constrain the fit between the virtual-source sound field and the loudspeaker sound field, which turns the problem of solving the loudspeaker weights (also called loudspeaker excitation signals or driving signals, i.e. the signals played by the loudspeakers) into a least absolute shrinkage and selection operator (LASSO) problem. Solve it with the alternating direction method of multipliers (ADMM) to obtain one loudspeaker weight vector for each of the 8 virtual single-frequency sound sources.
Step 5: Determine the activated loudspeakers. Combine the loudspeaker weight vectors of the 8 virtual single-frequency sound sources. Loudspeakers whose entries are nonzero are kept as the selected (activated) loudspeakers of the playback system; loudspeakers whose entries are zero are not selected (non-activated) and are removed.
Step 6: Solve the weight signals of the activated loudspeakers for the sound source to be played back. For this source and the activated loudspeakers, recalculate the source sound field and the loudspeaker sound field as in steps 2 and 3, then use $\ell_2$-norm regularization to calculate the weight of each loudspeaker under the least-mean-square criterion. These weights are the loudspeaker driving signals during reproduction.
Preferably, step 2 is implemented as follows. The Green's function characterizes the acoustic transfer function from the virtual single-frequency sound source of frequency $f_b$ to the m-th listening point:
Figure GDA0002968316800000021
where
Figure GDA0002968316800000022
$k_b = 2\pi f_b / c$ is the wave number; c is the speed of sound in air, about 340 m/s; b is the frequency index; s is the position (three-dimensional coordinates) of the virtual single-frequency sound source; and $y_m$ is the position of the m-th listening point, $m = 1, 2, \ldots, M$. To cover a broadband speech sound field, the center frequencies of 8 one-octave bands spanning roughly 100 Hz to 16 kHz are optimized jointly (the ratio of the upper to the lower limit frequency of each band is 2, and the starting frequency is 88 Hz). The 8 bands are 88 Hz-177 Hz, 177 Hz-355 Hz, 355 Hz-710 Hz, 710 Hz-1420 Hz, 1420 Hz-2840 Hz, 2840 Hz-5680 Hz, 5680 Hz-11360 Hz and 11360 Hz-22720 Hz, with center frequencies 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz and 16 kHz, so $f_1 = 125$ Hz, …, $f_8 = 16$ kHz, i.e. $b = 1, 2, \ldots, 8$.
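The band layout can be checked numerically. The following is a minimal sketch, assuming the exact octave-band edges lie a factor of $\sqrt{2}$ below and above each nominal center frequency; the text rounds these edges to 88 Hz, 177 Hz, 355 Hz and so on. The variable names are illustrative only.

```python
import math

# Nominal octave-band centre frequencies named in the text: 125 Hz ... 16 kHz.
centers_hz = [125 * 2 ** b for b in range(8)]

# Assumed exact band edges: centre / sqrt(2) and centre * sqrt(2),
# so that every band has an upper-to-lower frequency ratio of 2.
bands_hz = [(fc / math.sqrt(2), fc * math.sqrt(2)) for fc in centers_hz]

for b, (fc, (lo, hi)) in enumerate(zip(centers_hz, bands_hz), start=1):
    print(f"b={b}: {lo:8.1f} Hz - {hi:8.1f} Hz, centre {fc} Hz")
```

The first band comes out at roughly 88.4 Hz to 176.8 Hz, which matches the rounded 88 Hz-177 Hz band quoted above.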
Note that the virtual sound source is used only in the system design process (that is, in selecting the activated loudspeakers from all loudspeakers). The activated loudspeakers are then used for sound field reproduction, and the sound source to be reproduced is at the same position as the virtual sound source. The virtual sound source consists of 8 single-frequency sound sources with frequencies $f_1, f_2, \ldots, f_8$. Because the virtual sound source is omnidirectional, the sound pressure of its radiated sound field equals the value of the Green's function and can be written as:
Figure GDA0002968316800000031
where T denotes transposition; $y_m$ is the position of the m-th listening point, $m = 1, 2, \ldots, M$; and
Figure GDA0002968316800000032
is the set of sound pressures radiated by the virtual single-frequency sound source of frequency $f_b$ at the M listening points, i.e. the sound field that the loudspeaker array must fit.
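The target sound field can be sketched in a few lines. This is a minimal sketch, assuming the standard free-field Green's function $e^{ikr}/(4\pi r)$; the equation image is not reproduced in this text, so the sign of the exponent and the $4\pi$ normalization are assumptions. The function names `green` and `target_field` and the geometry are hypothetical.

```python
import cmath
import math

C_SOUND = 340.0  # speed of sound in air, m/s (value given in the text)

def green(src, pt, freq_hz, c=C_SOUND):
    """Assumed free-field Green's function exp(i*k*r) / (4*pi*r) between two 3-D points."""
    k = 2.0 * math.pi * freq_hz / c      # wave number k = 2*pi*f/c
    r = math.dist(src, pt)               # Euclidean distance source -> point
    return cmath.exp(1j * k * r) / (4.0 * math.pi * r)

def target_field(src, listening_points, freq_hz):
    """Sound pressures of an omnidirectional source at all M listening points."""
    return [green(src, y, freq_hz) for y in listening_points]

# Hypothetical geometry: one source and two listening points (coordinates in metres).
s = (0.0, 0.0, -1.0)
ys = [(0.0, 0.0, 1.0), (0.5, 0.0, 1.5)]
p_des = target_field(s, ys, 1000.0)
```

The magnitude of each entry depends only on distance, while the phase carries the frequency dependence through the wave number.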
Preferably, step 3 obtains the set of sound pressure values of the sound field radiated by the loudspeakers. First, the acoustic transfer function from the n-th loudspeaker to the m-th listening point is calculated as follows:
Figure GDA0002968316800000033
where $x_n$ is the coordinate of the n-th loudspeaker in the array, $n = 1, 2, \ldots, N$. The set of acoustic transfer functions from the loudspeakers to all listening points is then:
Figure GDA0002968316800000034
Let the loudspeaker weight vector to be solved (also called the loudspeaker excitation signal vector or driving signal vector) be $w_b = [w_b(1), w_b(2), \ldots, w_b(N)]^T$, where N is the total number of loudspeakers. The set of sound pressure values of the sound field radiated by the loudspeakers,
Figure GDA0002968316800000035
can then be expressed as:
Figure GDA0002968316800000036
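The loudspeaker-radiated field is a matrix-vector product. Below is a minimal NumPy sketch, assuming the free-field Green's function $e^{ikr}/(4\pi r)$ for every loudspeaker-to-listening-point pair and an M-by-N layout for the transfer matrix (both assumptions, since the equation images are not reproduced here). The function name `transfer_matrix`, the 3-by-3 array and the listening points are hypothetical.

```python
import numpy as np

def transfer_matrix(speakers, points, freq_hz, c=340.0):
    """M x N matrix whose (m, n) entry is the assumed Green's function from speaker n to point m."""
    k = 2.0 * np.pi * freq_hz / c
    # Pairwise distances: listening points along rows, loudspeakers along columns.
    r = np.linalg.norm(points[:, None, :] - speakers[None, :, :], axis=2)
    return np.exp(1j * k * r) / (4.0 * np.pi * r)

# Hypothetical 3 x 3 wall array with 0.5 m spacing in the plane z = 0.
gx, gy = np.meshgrid(np.linspace(-0.5, 0.5, 3), np.linspace(-0.5, 0.5, 3))
speakers = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])
points = np.array([[0.0, 0.0, 1.0], [0.3, 0.0, 1.2]])

G = transfer_matrix(speakers, points, 500.0)
w = np.ones(len(speakers), dtype=complex)   # placeholder weights, not solved values
p_spk = G @ w                               # pressures produced at the M listening points
```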
Preferably, step 4 obtains the loudspeaker weight vector $w_b$ required to reconstruct each virtual (single-frequency) sound source. Directly setting the virtual-source sound field
Figure GDA0002968316800000037
equal to the loudspeaker sound field
Figure GDA0002968316800000038
and solving for $w_b$ by matrix inversion leads to ill-conditioned solutions and model overfitting. $\ell_1$ regularization performs variable selection and makes the model sparse, so the fit stays accurate even though only a small number of loudspeakers are selected. This design therefore uses $\ell_1$ regularization to constrain the fit between the virtual-source sound field and the loudspeaker sound field, which turns the loudspeaker weight problem into a least absolute shrinkage and selection operator (LASSO) problem:
Figure GDA0002968316800000041
where $\|\cdot\|_1$ denotes the $\ell_1$ norm; $\|\cdot\|_2$ denotes the $\ell_2$ norm; $\lambda_1 = 2$ is the penalty factor; and
Figure GDA0002968316800000042
is the value of $w_b$ that minimizes
Figure GDA0002968316800000043
that is, the loudspeaker weight vector obtained at frequency $f_b$.
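The equation images are not reproduced in this text. As a hedged reading, assuming the usual LASSO form with a least-squares data term, the problem can be written as below. The symbols $\mathbf{p}_b$ (virtual-source pressures at the M listening points) and $\mathbf{G}_b$ (the M-by-N loudspeaker transfer matrix) are placeholders, and the factor 1/2 is an assumed convention.

```latex
\hat{\mathbf{w}}_b
  = \arg\min_{\mathbf{w}_b}\;
    \tfrac{1}{2}\,\bigl\lVert \mathbf{p}_b - \mathbf{G}_b \mathbf{w}_b \bigr\rVert_2^2
    + \lambda_1 \bigl\lVert \mathbf{w}_b \bigr\rVert_1
```

The first term rewards a close fit to the target pressures; the second drives most entries of $\mathbf{w}_b$ to exactly zero, which is what makes loudspeaker selection possible.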
This design solves the above problem with ADMM. The problem is first written in ADMM form:
minimize $f(w_b) + g(z_b)$
subject to $w_b - z_b = 0$
where
Figure GDA0002968316800000044
and $g(z_b) = \lambda_1 \|z_b\|_1$; $z_b$ is the variable used to constrain the loudspeaker weight vector $w_b$. Solving for $w_b$ is an iterative process; the loudspeaker weight variable at iteration j + 1,
Figure GDA0002968316800000045
is obtained from:
Figure GDA0002968316800000046
where j + 1 is the iteration number and j starts from 0;
Figure GDA0002968316800000047
is the dual variable at iteration j, which is also used to evaluate the stopping condition of the algorithm;
Figure GDA0002968316800000048
is the constraint variable at iteration j; the iteration is initialized with
Figure GDA0002968316800000049
I is the identity matrix; and ρ is the augmented Lagrangian parameter, which affects the convergence rate of the algorithm and is chosen in the range [0.1, 10]. The constraint variable
Figure GDA00029683168000000410
is updated iteratively as follows:
Figure GDA00029683168000000411
where $S_\kappa(\alpha)$ is the soft-thresholding function, defined as:
Figure GDA00029683168000000412
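A soft-thresholding function can be sketched as follows. This is a minimal sketch, assuming the common magnitude-shrinkage form, which also works for complex-valued weights; the definition in the equation image may be written differently (for example piecewise for real arguments). The function name is illustrative.

```python
def soft_threshold(a, kappa):
    """Shrink the magnitude of a (possibly complex) value by kappa, keeping its sign or phase."""
    mag = abs(a)
    if mag <= kappa:
        return 0.0 * a               # entries at or below the threshold become exactly zero
    return a * (1.0 - kappa / mag)   # larger entries move towards zero by kappa

print(soft_threshold(0.5, 1.0))      # below the threshold
print(soft_threshold(-3.0, 1.0))     # real value, magnitude reduced by 1
print(soft_threshold(3 + 4j, 1.0))   # complex value, phase preserved
```

Setting small entries exactly to zero is what allows non-activated loudspeakers to be identified later.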
The dual variable
Figure GDA00029683168000000413
is updated iteratively as follows:
Figure GDA00029683168000000414
After j iterations, the primal residual is
Figure GDA00029683168000000415
and the dual residual is
Figure GDA00029683168000000416
The stopping criterion of the iteration is:
Figure GDA00029683168000000417
Figure GDA00029683168000000418
where $\varepsilon^{\mathrm{pri}}$ is the upper bound of the primal residual and $\varepsilon^{\mathrm{dual}}$ is the upper bound of the dual residual, calculated as:
Figure GDA0002968316800000051
Figure GDA0002968316800000052
This design sets the iteration tolerances to an absolute error $\varepsilon^{\mathrm{abs}} = 10^{-4}$ and a relative error $\varepsilon^{\mathrm{rel}} = 10^{-2}$. When the stopping criterion is satisfied, the loudspeaker weight vector $w_b$ for frequency $f_b$ is output. In this way 8 loudspeaker weight vectors $w_b$, $b = 1, 2, \ldots, 8$, are calculated, one per frequency.
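The iteration described above can be sketched with NumPy. This is a minimal sketch, assuming the standard scaled-form ADMM for LASSO with zero initial values and a 1/2 factor on the least-squares term; the exact update formulas are in equation images that are not reproduced here, so they may differ in detail. The function names and the toy problem are hypothetical.

```python
import numpy as np

def soft_threshold(a, kappa):
    """Shrink the magnitude of each (possibly complex) entry by kappa."""
    mag = np.abs(a)
    scale = np.maximum(1.0 - kappa / np.maximum(mag, 1e-30), 0.0)
    return a * scale

def admm_lasso(G, p, lam, rho=1.0, eps_abs=1e-4, eps_rel=1e-2, max_iter=5000):
    """Approximately minimise 0.5*||p - G w||_2^2 + lam*||w||_1 (assumed scaled-form ADMM)."""
    N = G.shape[1]
    w = np.zeros(N, dtype=complex)
    z = np.zeros(N, dtype=complex)   # constraint variable
    u = np.zeros(N, dtype=complex)   # scaled dual variable
    A = G.conj().T @ G + rho * np.eye(N)
    Gh_p = G.conj().T @ p
    for _ in range(max_iter):
        w = np.linalg.solve(A, Gh_p + rho * (z - u))   # weight update
        z_old = z
        z = soft_threshold(w + u, lam / rho)           # constraint-variable update
        u = u + w - z                                  # dual update
        r = np.linalg.norm(w - z)                      # primal residual norm
        s = np.linalg.norm(rho * (z - z_old))          # dual residual norm
        eps_pri = np.sqrt(N) * eps_abs + eps_rel * max(np.linalg.norm(w), np.linalg.norm(z))
        eps_dual = np.sqrt(N) * eps_abs + eps_rel * np.linalg.norm(rho * u)
        if r <= eps_pri and s <= eps_dual:
            break
    return z   # z holds exact zeros for the entries that were thresholded away

# Toy problem (hypothetical): with G = I the exact solution is soft_threshold(p, lam).
G = np.eye(3, dtype=complex)
p = np.array([5.0, 0.1, 0.0], dtype=complex)
w_hat = admm_lasso(G, p, lam=2.0, rho=1.0)
```

Returning the constraint variable rather than the weight variable is a design choice of this sketch: it gives exactly zero entries, which makes the later nonzero test unambiguous.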
Preferably, step 5 combines all the loudspeaker weight vectors obtained in step 4 and keeps their nonzero entries. The loudspeaker weight vectors at the 8 frequencies can be collected into a matrix w:
Figure GDA0002968316800000053
Adding up the entries of this matrix column by column gives the combined loudspeaker weight vector w:
$w = [w(1), w(2), \ldots, w(N)]^T$
where
Figure GDA0002968316800000054
$n = 1, 2, \ldots, N$, and N is the total number of loudspeakers. The loudspeakers corresponding to the nonzero elements of w are the activated loudspeakers; their positions are read from the loudspeaker array, and their total number is denoted $N_a$. This completes the selection of $N_a$ activated loudspeakers from the N loudspeakers; the subsequent sound field reproduction uses only these activated loudspeakers.
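The selection step can be sketched as follows. This is a minimal sketch, assuming the weights are stored as an 8-by-N array with one row per design frequency, and assuming magnitudes are summed so that complex weights cannot cancel each other; the summation in the equation image may be defined differently. The function name `select_active` and the sample values are hypothetical.

```python
import numpy as np

def select_active(W, tol=0.0):
    """Return indices of loudspeakers whose summed weight magnitude over all frequencies exceeds tol."""
    total = np.abs(W).sum(axis=0)      # one combined value per loudspeaker
    return np.flatnonzero(total > tol)

# Hypothetical weights: 2 frequencies, 4 loudspeakers.
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 2.0j]])
active = select_active(W)
N_a = len(active)                      # number of activated loudspeakers
```

Here loudspeakers 1 and 3 (zero-based) are activated because each has a nonzero weight at one frequency or more.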
Preferably, step 6 calculates the driving signals of the $N_a$ activated loudspeakers selected in step 5, which synthesize the sound pressure of the sound source to be played back at the listening points. Let the frequency components of the sound source to be played back be $f_1, f_2, \ldots, f_h, \ldots, f_H$, with corresponding wave numbers $k_1, k_2, \ldots, k_h, \ldots, k_H$, where $k_h = 2\pi f_h / c$, h is the index of an actual frequency of the sound source to be played back, and H is the highest frequency index. The sound source to be played back contains far more frequency components than the virtual sound source used for loudspeaker selection, i.e. $H \gg 8$.
The sound pressure of the target sound field radiated by the sound source to be played back is calculated as in step 2:
Figure GDA0002968316800000055
Figure GDA0002968316800000056
where
Figure GDA0002968316800000057
$k_h = 2\pi f_h / c$; s is the position of the actual single-frequency source to be played back (the same as the position of the virtual sound source); and $y_m$ is the position of the m-th listening point, $m = 1, 2, \ldots, M$.
The sound pressure of the sound field reconstructed by the activated loudspeakers is calculated as in step 3:
Figure GDA0002968316800000061
Figure GDA0002968316800000062
where $w_h = [w_h(1), \ldots, w_h(N_a)]^T$ is the weight vector of the activated loudspeakers, and $G_h$ is the set of acoustic transfer functions from all activated loudspeakers to all listening points:
Figure GDA0002968316800000063
where
Figure GDA0002968316800000064
is the acoustic transfer function from the n-th activated loudspeaker to the m-th listening point, $n = 1, 2, \ldots, N_a$, $m = 1, 2, \ldots, M$.
To minimize the playback error, this design uses $\ell_2$ regularization to obtain the activated-loudspeaker weights under the least-mean-square criterion:
Figure GDA0002968316800000065
where
Figure GDA0002968316800000066
is the value of $w_h$ that minimizes
Figure GDA0002968316800000067
that is, the weight vector of the activated loudspeakers at frequency $f_h$. $\lambda_2 > 0$ is the regularization penalty parameter, which controls the total loudspeaker power; it is chosen according to the application so that playback uses as little total power as possible while keeping the error small. The solution of the above problem is:
Figure GDA0002968316800000068
where the superscript H denotes the conjugate transpose, and $w_h$ is the resulting driving signal of the $N_a$ activated loudspeakers at frequency $f_h$.
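The closed-form solution can be sketched with NumPy. This is a minimal sketch, assuming the usual regularized least-squares (ridge) form $w_h = (G_h^H G_h + \lambda_2 I)^{-1} G_h^H p_h$; the equation image is not reproduced here, so the exact scaling of $\lambda_2$ is an assumption. The function name `ridge_weights`, the symbol `p_des` and the toy values are hypothetical.

```python
import numpy as np

def ridge_weights(G_act, p_des, lam2):
    """Assumed l2-regularised least-squares weights for the activated loudspeakers."""
    n_active = G_act.shape[1]
    A = G_act.conj().T @ G_act + lam2 * np.eye(n_active)
    # Solving the linear system avoids forming the matrix inverse explicitly.
    return np.linalg.solve(A, G_act.conj().T @ p_des)

# Toy check (hypothetical values): with G = I the weights are p / (1 + lam2).
G_act = np.eye(2, dtype=complex)
p_des = np.array([2.0, 4.0], dtype=complex)
w_h = ridge_weights(G_act, p_des, lam2=1.0)
```

A larger `lam2` shrinks the weights and therefore the total loudspeaker power, at the cost of a larger fitting error.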
Finally, $w_1, w_2, \ldots, w_H$ are superposed to synthesize the driving signals of the activated loudspeakers over all frequency components of the sound source to be played back.
The invention is an immersive broadband 3D sound field playback method based on a loudspeaker array. Its aim is a virtual sound system that accurately plays back, in scene B, the sound source signals of scene A through a loudspeaker array. Using convex optimization theory, the method first calculates the acoustic transfer functions from a virtual sound source of scene A, placed at a specified spatial position (the same position as the sound to be reproduced, but at different frequencies), to each listening point in scene B, and takes the function values as the sound pressure values of the sound field radiated by the virtual sound source. Second, the loudspeaker array on one wall of scene B is laid out as a regular, equally spaced rectangular grid, and the acoustic propagation from every loudspeaker to the listening points is modeled with the Green's function, based on the wave nature of sound. Third, based on linear convex optimization theory, the $\ell_1$ norm is used as the sparsity-inducing regularizer and the regularized problem is solved with ADMM; the center frequencies of eight one-octave bands are used to calculate the loudspeaker weights and thereby select the activated loudspeakers. Finally, $\ell_2$-norm regularization is used to calculate the weight signals of the activated loudspeakers in the playback system, so that the sound field radiated by the activated loudspeakers is closest, under the least-mean-square criterion, to the sound field radiated by the sound source to be played back.
The method is conceptually simple, has low algorithmic complexity and good real-time performance, and reduces the large number of loudspeakers that practical three-dimensional sound field reproduction otherwise requires. Most importantly, it reproduces sound with loudspeakers in a planar array layout, which is easy to install; it can be applied to home theaters and online streaming platforms as well as office environments and audio/video conferencing.
Drawings
Fig. 1 is a diagram of an example of speaker array playback.
Fig. 2 is a diagram of the overall design method.
Detailed description of the invention
In the overall design flow, the sound pressure from the virtual sound source (a single-frequency sound source at the same position as the sound source to be reproduced) to the listening points is calculated first. The acoustic transfer functions from all loudspeakers to the listening points are then calculated and multiplied by the corresponding loudspeaker weights, which gives the sound pressure expression of the loudspeaker-radiated sound field. $\ell_1$-norm regularization combined with ADMM is used to calculate and combine the loudspeaker weights at all frequency points, and the loudspeakers with nonzero weights are kept as the activated loudspeakers. Finally, the sound pressures from the sound source to be played back and from the activated loudspeakers to the listening points are calculated, and the loudspeaker driving signals are calculated under the least-mean-square criterion to obtain the signal played by each loudspeaker, which completes the design. In practice the algorithm is embedded in software so that every stage runs automatically. The invention is further explained below with reference to Fig. 1 and Fig. 2; the specific workflow is as follows:
Step 1: Establish a spatial rectangular coordinate system with its origin at the center of the wall on which the loudspeakers are mounted. The x axis is parallel to the ground, the y axis is perpendicular to the ground, and the z axis is perpendicular to the wall. Denote the coordinates of the virtual sound source to be played back by s. Let the number of listening points in the listening area be M and the maximum number of loudspeakers be N, and calculate the spatial coordinates of the listening points and loudspeakers.
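The geometry of step 1 can be set up as follows. This is a minimal sketch, assuming a rectangular, equally spaced grid in the wall plane z = 0 centered on the origin. The array size, spacing, listening-point grid and source position are all hypothetical values chosen for illustration; the text does not specify them.

```python
import numpy as np

def wall_array(n_cols, n_rows, spacing):
    """Loudspeaker coordinates on the wall plane z = 0, centred on the origin."""
    xs = (np.arange(n_cols) - (n_cols - 1) / 2.0) * spacing   # along x, parallel to the ground
    ys = (np.arange(n_rows) - (n_rows - 1) / 2.0) * spacing   # along y, perpendicular to the ground
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])

speakers = wall_array(n_cols=8, n_rows=4, spacing=0.25)   # N = 32 candidate positions
N = len(speakers)

# Hypothetical listening points: a 3 x 3 grid 2 m away from the wall along z.
lx, ly = np.meshgrid(np.linspace(-0.3, 0.3, 3), np.linspace(-0.3, 0.3, 3))
listening = np.column_stack([lx.ravel(), ly.ravel(), np.full(lx.size, 2.0)])
M = len(listening)

s = np.array([0.5, 0.2, -1.5])   # hypothetical virtual source position
```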
Step 2: Construct the sound field radiated by the virtual sound source.
The Green's function characterizes the acoustic transfer function from the virtual single-frequency sound source of frequency $f_b$ to the m-th listening point:
Figure GDA0002968316800000071
where
Figure GDA0002968316800000072
$k_b = 2\pi f_b / c$ is the wave number; c is the speed of sound in air, about 340 m/s; b is the frequency index; s is the position (three-dimensional coordinates) of the virtual single-frequency sound source; and $y_m$ is the position of the m-th listening point, $m = 1, 2, \ldots, M$. To cover a broadband speech sound field, the center frequencies of 8 one-octave bands spanning roughly 100 Hz to 16 kHz are optimized jointly (the ratio of the upper to the lower limit frequency of each band is 2, and the starting frequency is 88 Hz). The 8 bands are 88 Hz-177 Hz, 177 Hz-355 Hz, 355 Hz-710 Hz, 710 Hz-1420 Hz, 1420 Hz-2840 Hz, 2840 Hz-5680 Hz, 5680 Hz-11360 Hz and 11360 Hz-22720 Hz, with center frequencies 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz and 16 kHz, so $f_1 = 125$ Hz, …, $f_8 = 16$ kHz, i.e. $b = 1, 2, \ldots, 8$.
Note that the virtual sound source is used only in the system design process (that is, in selecting the activated loudspeakers from all loudspeakers). The activated loudspeakers are then used for sound field reproduction, and the sound source to be reproduced is at the same position as the virtual sound source. The virtual sound source consists of 8 single-frequency sound sources with frequencies $f_1, f_2, \ldots, f_8$. Because the virtual sound source is omnidirectional, the sound pressure of its radiated sound field equals the value of the Green's function and can be written as:
Figure GDA0002968316800000081
where T denotes transposition; $y_m$ is the position of the m-th listening point, $m = 1, 2, \ldots, M$; and
Figure GDA0002968316800000082
is the set of sound pressures radiated by the virtual single-frequency sound source of frequency $f_b$ at the M listening points, i.e. the sound field that the loudspeaker array must fit.
Step 3: Construct the sound field radiated by the loudspeakers.
First, the acoustic transfer function from the n-th loudspeaker to the m-th listening point is calculated:
Figure GDA0002968316800000083
where $x_n$ is the coordinate of the n-th loudspeaker in the array, $n = 1, 2, \ldots, N$. The set of acoustic transfer functions from the loudspeakers to all listening points is then:
Figure GDA0002968316800000084
Let the loudspeaker weight vector to be solved (also called the loudspeaker excitation signal vector or driving signal vector) be $w_b = [w_b(1), w_b(2), \ldots, w_b(N)]^T$, where N is the total number of loudspeakers. The set of sound pressure values of the sound field radiated by the loudspeakers,
Figure GDA0002968316800000085
can then be expressed as:
Figure GDA0002968316800000086
Step 4: Solve the loudspeaker weight vector $w_b$ for each virtual sound source.
This design uses $\ell_1$ regularization to constrain the fit between the virtual-source sound field and the loudspeaker sound field, which turns the loudspeaker weight problem into a least absolute shrinkage and selection operator (LASSO) problem:
Figure GDA0002968316800000087
where $\|\cdot\|_1$ denotes the $\ell_1$ norm; $\|\cdot\|_2$ denotes the $\ell_2$ norm; $\lambda_1 = 2$ is the penalty factor; and
Figure GDA0002968316800000088
is the value of $w_b$ that minimizes
Figure GDA0002968316800000089
that is, the loudspeaker weight vector obtained at frequency $f_b$.
This design solves the above problem with ADMM. The problem is first written in ADMM form:
minimize $f(w_b) + g(z_b)$
subject to $w_b - z_b = 0$
where
Figure GDA0002968316800000091
and $g(z_b) = \lambda_1 \|z_b\|_1$; $z_b$ is the variable used to constrain the loudspeaker weight vector $w_b$. Solving for $w_b$ is an iterative process; the loudspeaker weight variable at iteration j + 1,
Figure GDA0002968316800000092
is obtained from:
Figure GDA0002968316800000093
where j + 1 is the iteration number and j starts from 0;
Figure GDA0002968316800000094
is the dual variable at iteration j, which is also used to evaluate the stopping condition of the algorithm;
Figure GDA0002968316800000095
is the constraint variable at iteration j; the iteration is initialized with
Figure GDA0002968316800000096
I is the identity matrix; and ρ is the augmented Lagrangian parameter, which affects the convergence rate of the algorithm and is chosen in the range [0.1, 10]. The constraint variable
Figure GDA0002968316800000097
is updated iteratively as follows:
Figure GDA0002968316800000098
where $S_\kappa(\alpha)$ is the soft-thresholding function, defined as:
Figure GDA0002968316800000099
The dual variable
Figure GDA00029683168000000910
is updated iteratively as follows:
Figure GDA00029683168000000911
After j iterations, the primal residual is
Figure GDA00029683168000000912
and the dual residual is
Figure GDA00029683168000000913
The stopping criterion of the iteration is:
Figure GDA00029683168000000914
Figure GDA00029683168000000915
where $\varepsilon^{\mathrm{pri}}$ is the upper bound of the primal residual and $\varepsilon^{\mathrm{dual}}$ is the upper bound of the dual residual, calculated as:
Figure GDA00029683168000000916
Figure GDA00029683168000000917
This design sets the iteration tolerances to an absolute error $\varepsilon^{\mathrm{abs}} = 10^{-4}$ and a relative error $\varepsilon^{\mathrm{rel}} = 10^{-2}$. When the stopping criterion is satisfied, the loudspeaker weight vector $w_b$ for frequency $f_b$ is output. In this way 8 loudspeaker weight vectors $w_b$, $b = 1, 2, \ldots, 8$, are calculated, one per frequency.
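Because the equation images for step 4 are not reproduced in this text, the following LaTeX block restates the iteration in the standard scaled form of ADMM for LASSO. It is an assumed reading, not a transcription: $\mathbf{G}_b$ (loudspeaker transfer matrix), $\mathbf{p}_b$ (target pressures) and $\mathbf{u}_b$ (scaled dual variable) are placeholder symbols, and the patent's own expressions may differ in scaling or notation.

```latex
\begin{aligned}
\mathbf{w}_b^{\,j+1} &= \bigl(\mathbf{G}_b^{H}\mathbf{G}_b + \rho\,\mathbf{I}\bigr)^{-1}
                        \bigl(\mathbf{G}_b^{H}\mathbf{p}_b + \rho\,(\mathbf{z}_b^{\,j} - \mathbf{u}_b^{\,j})\bigr) \\
\mathbf{z}_b^{\,j+1} &= S_{\lambda_1/\rho}\bigl(\mathbf{w}_b^{\,j+1} + \mathbf{u}_b^{\,j}\bigr),
  \qquad S_\kappa(\alpha) = \alpha\,\max\!\bigl(1 - \kappa/\lvert\alpha\rvert,\; 0\bigr) \\
\mathbf{u}_b^{\,j+1} &= \mathbf{u}_b^{\,j} + \mathbf{w}_b^{\,j+1} - \mathbf{z}_b^{\,j+1} \\
\mathbf{r}^{\,j} &= \mathbf{w}_b^{\,j} - \mathbf{z}_b^{\,j},
  \qquad \mathbf{s}^{\,j} = -\rho\,\bigl(\mathbf{z}_b^{\,j} - \mathbf{z}_b^{\,j-1}\bigr) \\
\lVert \mathbf{r}^{\,j} \rVert_2 &\le \varepsilon^{\mathrm{pri}}
  = \sqrt{N}\,\varepsilon^{\mathrm{abs}}
  + \varepsilon^{\mathrm{rel}} \max\bigl(\lVert \mathbf{w}_b^{\,j} \rVert_2,\ \lVert \mathbf{z}_b^{\,j} \rVert_2\bigr) \\
\lVert \mathbf{s}^{\,j} \rVert_2 &\le \varepsilon^{\mathrm{dual}}
  = \sqrt{N}\,\varepsilon^{\mathrm{abs}}
  + \varepsilon^{\mathrm{rel}}\,\lVert \rho\,\mathbf{u}_b^{\,j} \rVert_2
\end{aligned}
```

Only the first update involves a linear solve, and its matrix does not change between iterations, so it can be factorized once per frequency.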
And 5: and (4) summarizing all the weight vectors of the loudspeakers obtained in the step (4), counting and keeping nonzero items in the weight vectors to be used as activated loudspeakers, and simultaneously determining the number and the position information of the activated loudspeakers.
The loudspeaker weight vectors at all 8 frequency points can be expressed as:
Figure GDA0002968316800000101
Summing this matrix column by column yields the aggregated loudspeaker weight vector w:
w = [w(1), w(2), …, w(N)]^T
where
Figure GDA0002968316800000102
n = 1, 2, …, N, and N is the total number of loudspeakers. The loudspeakers corresponding to the nonzero elements of w are judged to be activated loudspeakers; their position information can be read from the loudspeaker array, and the total number of activated loudspeakers is denoted N_a. This completes the selection of N_a activated loudspeakers from the N loudspeakers; the subsequent sound field reproduction is performed with these activated loudspeakers.
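The selection in step 5 can be sketched as follows. This is an illustration only: it assumes that the per-loudspeaker aggregate is the sum of the weight magnitudes over the frequency points (the exact aggregation formula is given in the figure above) and that "nonzero" is tested against a small tolerance. The function name `select_active_speakers` and the parameter `tol` are hypothetical.

```python
import numpy as np

def select_active_speakers(W, tol=1e-8):
    """W has shape (B, N): one loudspeaker weight vector per frequency point.

    Returns the aggregated weight per loudspeaker and the indices of the
    loudspeakers whose aggregated weight is nonzero (above tol).
    """
    W = np.asarray(W)
    w_agg = np.sum(np.abs(W), axis=0)       # assumed aggregation: sum of magnitudes
    active = np.flatnonzero(w_agg > tol)    # indices of activated loudspeakers
    return w_agg, active
```

With loudspeaker coordinates stored in an array `positions` of shape (N, 3), the activated positions would then be `positions[active]` and N_a would be `len(active)`.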
Step 6: Solve the weight signals of the activated loudspeakers for the sound source to be replayed.
Let the frequency components contained in the sound source to be replayed be f_1, f_2, …, f_h, …, f_H, with corresponding wave numbers k_1, k_2, …, k_h, …, k_H, where k_h = 2πf_h/c, h is the index of an actual frequency of the sound source to be replayed, and H is the highest frequency index. The frequency content of the sound source to be replayed is richer than that of the virtual sound source (which is used only for the loudspeaker selection process), i.e. H >> 8.
Following step 2, calculate the sound pressure of the target sound field radiated by the sound source to be replayed,
Figure GDA0002968316800000103
Figure GDA0002968316800000104
where
Figure GDA0002968316800000105
k_h = 2πf_h/c; s is the position of the actual single-frequency sound source to be replayed (the same as the position of the virtual sound source); y_m is the position of the m-th listening point, m = 1, 2, …, M.
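The target sound pressure at the listening points can be sketched as below. The sketch assumes the free-field three-dimensional Green's function e^{ikr}/(4πr) with the e^{+ikr} sign convention, which may differ from the convention in the figure above, and takes c = 340 m/s as stated in the claims. The function name `green_pressure` is hypothetical.

```python
import numpy as np

C_SOUND = 340.0  # speed of sound in air in m/s (value given in the claims)

def green_pressure(src, pts, f, c=C_SOUND):
    """Sound pressure of an omnidirectional point source at listening points.

    src: (3,) source position s; pts: (M, 3) listening points y_m; f: frequency in Hz.
    Assumes the free-field Green's function exp(1j*k*r) / (4*pi*r).
    """
    k = 2.0 * np.pi * f / c                                   # wave number k = 2*pi*f/c
    diff = np.atleast_2d(np.asarray(pts, dtype=float)) - np.asarray(src, dtype=float)
    r = np.linalg.norm(diff, axis=1)                          # distances ||y_m - s||
    return np.exp(1j * k * r) / (4.0 * np.pi * r)
```

The same function, evaluated from each activated loudspeaker position to every listening point, would give the columns of the transfer-function matrix used in the next part of this step.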
Following step 3, calculate the sound pressure of the sound field reconstructed by the activated loudspeakers,
Figure GDA0002968316800000106
Figure GDA0002968316800000107
where w_h = [w_h(1), …, w_h(N_a)]^T is the weight vector of the activated loudspeakers, and G_h is the set of acoustic transfer functions from all activated loudspeakers to all listening points:
Figure GDA0002968316800000111
where
Figure GDA0002968316800000112
represents the acoustic transfer function from the n-th activated loudspeaker to the m-th listening point, n = 1, 2, …, N_a, m = 1, 2, …, M.
To minimize the playback error, this design uses l2 regularization to find the activated loudspeaker weights under the minimum mean square criterion:
Figure GDA0002968316800000113
where
Figure GDA0002968316800000114
is the value of w_h that minimizes
Figure GDA0002968316800000115
that is, the weight vector of the activated loudspeakers at frequency f_h. λ_2 > 0 is a regularization penalty parameter that controls the total loudspeaker power; it is chosen according to the application so that playback is achieved with as little total power as possible while the error is minimized. The solution of the above problem is:
Figure GDA0002968316800000116
where the superscript H denotes the conjugate transpose, and w_h is the resulting drive signal of the N_a activated loudspeakers at frequency f_h.
Finally, w_1, w_2, …, w_H are superposed to synthesize the drive signals of the activated loudspeakers for the sound source to be replayed over all frequency components.
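The regularized least-squares solve of step 6 can be sketched as follows. It assumes the objective ||G_h w_h − p_h||₂² + λ₂||w_h||₂², whose minimizer is (G_h^H G_h + λ₂ I)⁻¹ G_h^H p_h; the function names `ridge_weights` and `solve_all_frequencies` are hypothetical, and collecting the per-frequency weights into one array is only one possible way to prepare the final superposition.

```python
import numpy as np

def ridge_weights(G, p, lam2):
    """Minimizer of ||G w - p||_2^2 + lam2 * ||w||_2^2 (assumed objective).

    G: (M, Na) transfer functions from activated loudspeakers to listening points.
    p: (M,) target sound pressures at one frequency.
    """
    n = G.shape[1]
    GH = G.conj().T                               # conjugate transpose of G
    return np.linalg.solve(GH @ G + lam2 * np.eye(n), GH @ p)

def solve_all_frequencies(G_list, p_list, lam2):
    """Solve every frequency component independently; result has shape (H, Na)."""
    return np.stack([ridge_weights(G, p, lam2) for G, p in zip(G_list, p_list)])
```

Using `np.linalg.solve` instead of forming the matrix inverse explicitly is a numerical choice of this sketch; a larger `lam2` lowers the total loudspeaker power at the cost of a larger reproduction error.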
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (6)

1. An immersive broadband 3D sound field playback method comprising the steps of:
step 1, establishing a spatial rectangular coordinate system with the center of the wall on which the loudspeakers are mounted as the origin,
wherein the x axis is parallel to the ground, the y axis is perpendicular to the ground, and the z axis is perpendicular to the wall; specifying the coordinate position of the virtual sound source to be replayed as s, setting the maximum number of loudspeakers used to N, and selecting M positions in the listening area as listening points;
step 2, constructing the sound field radiated by the virtual sound source:
establishing, based on the Green's function, the acoustic propagation characteristics from an omnidirectional virtual sound source to each listening point, and taking the Green's function value at that point as the sound pressure value of the sound field radiated by the sound source in the specified area;
step 3, constructing the sound field radiated by the loudspeakers:
setting the positions of the N loudspeakers, and calculating the sound pressure values generated by the loudspeakers at all listening points as the set of sound pressure values of the sound field radiated by the loudspeakers;
step 4, solving the loudspeaker weight vectors for the virtual sound source:
using l1-norm regularization to constrain the virtual sound source radiated sound field and the loudspeaker radiated sound field, converting the problem of solving the loudspeaker weights into a least absolute shrinkage and selection operator (LASSO) problem, and solving it with the alternating direction method of multipliers (ADMM) to obtain a loudspeaker weight vector for replaying each virtual single-frequency sound source;
step 5, determining the number of activated loudspeakers:
aggregating the loudspeaker weight vectors corresponding to all virtual single-frequency sound sources, retaining the nonzero entries as the loudspeakers selected for the replay system, and removing the zero entries, i.e., the unselected loudspeakers;
step 6, calculating the weight signals of the activated loudspeakers for the sound source to be replayed:
for the sound source to be replayed and all activated loudspeakers, calculating again, according to step 2 and step 3, the sound field radiated by the sound source to be replayed and the sound field radiated by the activated loudspeakers, and using l2-norm regularization to calculate the weight of each loudspeaker under the minimum mean square criterion as the drive signal of that loudspeaker during reproduction.
2. The immersive broadband 3D sound field playback method of claim 1, wherein: in step 2, the Green's function is used to represent the acoustic transfer function from the virtual single-frequency sound source of frequency f_b to the m-th listening point:
Figure FDA0002968316790000011
where
Figure FDA0002968316790000012
k_b = 2πf_b/c is the wave number; c is the speed of sound in air, about 340 m/s; b is the frequency index; s is the position of the virtual single-frequency sound source; y_m is the position of the m-th listening point, m = 1, 2, …, M; considering a wideband speech sound field, the center frequencies of the 8 one-octave bands from 100Hz to 16kHz are selected for joint optimization; the 8 bands are: 88Hz-177Hz, 177Hz-355Hz, 355Hz-710Hz, 710Hz-1420Hz, 1420Hz-2840Hz, 2840Hz-5680Hz, 5680Hz-11360Hz and 11360Hz-33720Hz, and the corresponding center frequencies are 125Hz, 250Hz, 500Hz, 1kHz, 2kHz, 4kHz, 8kHz and 16kHz, so f_1 = 125Hz, …, f_8 = 16kHz, i.e., b = 1, 2, …, 8;
the virtual sound source is used during the system design process, that is, the process of selecting the activated loudspeakers from all loudspeakers; the activated loudspeakers are then used for sound field reproduction, and the reproduced sound source has the same position as the virtual sound source; the virtual sound source consists of 8 sound sources with frequencies f_1, f_2, …, f_8; because the virtual sound source is omnidirectional, the sound pressure of its radiated sound field equals the Green's function value and can be written as:
Figure FDA0002968316790000021
where the superscript T denotes the transpose; y_m is the position of the m-th listening point, m = 1, 2, …, M;
Figure FDA0002968316790000022
is the set of sound pressures radiated at the M listening points by the virtual single-frequency sound source of frequency f_b, i.e., the sound field that the loudspeaker array is to fit.
3. The immersive broadband 3D sound field playback method of claim 1, wherein: the step 3 specifically comprises the following steps:
the acoustic transfer function from the nth loudspeaker to the mth listening point is first calculated as follows:
Figure FDA0002968316790000023
where x_n is the coordinate of the n-th loudspeaker in the loudspeaker array, n = 1, 2, …, N; the set of acoustic transfer functions from the loudspeakers to all listening points is then:
Figure FDA0002968316790000024
let the loudspeaker weight vector to be solved be w_b = [w_b(1), w_b(2), …, w_b(N)]^T, where N is the total number of loudspeakers; the set of sound pressure values of the sound field radiated by the loudspeakers,
Figure FDA0002968316790000025
can then be expressed as:
Figure FDA0002968316790000026
4. The immersive broadband 3D sound field playback method of claim 1, wherein step 4 specifically comprises: using the l1 regularization method to constrain the virtual sound source radiated sound field and the loudspeaker radiated sound field, thereby converting the problem of solving the loudspeaker weights into a least absolute shrinkage and selection operator (LASSO) problem, as follows:
Figure FDA0002968316790000027
where ||·||_1 denotes the l1 norm; ||·||_2 denotes the l2 norm; λ_1 = 2 is the penalty factor;
Figure FDA0002968316790000028
is the value of w_b that minimizes
Figure FDA0002968316790000029
that is, the loudspeaker weight vector obtained at frequency f_b;
the above problem is solved with the alternating direction method of multipliers (ADMM); first, it is rewritten in ADMM form:
minimize f(w_b) + g(z_b)
s.t. w_b - z_b = 0
where
Figure FDA0002968316790000031
g(z_b) = λ_1||z_b||_1; z_b is the variable used to constrain the loudspeaker weight vector w_b; solving for the loudspeaker weights w_b is an iterative process, and the loudspeaker weight variable of the (j+1)-th iteration,
Figure FDA0002968316790000032
is obtained from the following formula:
Figure FDA0002968316790000033
where j+1 is the iteration number and j starts from 0;
Figure FDA0002968316790000034
is the dual variable of the j-th iteration, used to evaluate the stopping condition of the algorithm;
Figure FDA0002968316790000035
is the constraint variable of the j-th iteration, and the iteration is initialized with
Figure FDA0002968316790000036
I is the identity matrix; ρ is the augmented Lagrangian parameter, which affects the convergence rate of the algorithm and is chosen in the interval [0.1, 10]; the constraint variable
Figure FDA0002968316790000037
is updated iteratively as follows:
Figure FDA0002968316790000038
where S_κ(α) is the soft-thresholding function, defined as follows:
Figure FDA0002968316790000039
the dual variable
Figure FDA00029683167900000310
is updated iteratively as follows:
Figure FDA00029683167900000311
after j iterations, the primal residual is
Figure FDA00029683167900000312
and the dual residual is
Figure FDA00029683167900000313
the stopping criterion of the iteration is:
Figure FDA00029683167900000314
Figure FDA00029683167900000315
where ε_pri is the upper bound of the primal residual and ε_dual is the upper bound of the dual residual, calculated by:
Figure FDA00029683167900000316
Figure FDA00029683167900000317
the iteration tolerances are set to an absolute error ε_abs = 10^-4 and a relative error ε_rel = 10^-2; when the stopping criterion is satisfied, the loudspeaker weight vector w_b corresponding to frequency point f_b is output; in total, 8 loudspeaker weight vectors w_b, b = 1, 2, …, 8, are calculated, one for each frequency.
5. The immersive broadband 3D sound field playback method of claim 4, wherein: step 5 aggregates all the loudspeaker weight vectors obtained in step 4, and counts and retains their nonzero entries; the loudspeaker weight vectors at all 8 frequency points can be expressed as:
Figure FDA0002968316790000041
summing this matrix column by column yields the aggregated loudspeaker weight vector w:
w = [w(1), w(2), …, w(N)]^T
where
Figure FDA0002968316790000042
N is the total number of loudspeakers; the loudspeakers corresponding to the nonzero elements of w are judged to be activated loudspeakers, their position information can be read from the loudspeaker array, and the total number of activated loudspeakers is denoted N_a; this completes the selection of N_a activated loudspeakers from the N loudspeakers, and the subsequent sound field reproduction is performed with these activated loudspeakers.
6. The immersive broadband 3D sound field playback method of claim 5, wherein step 6 calculates the drive signals with which the N_a activated loudspeakers selected in step 5 synthesize, at the listening points, the sound pressure values of the sound source to be replayed; let the frequency components contained in the sound source to be replayed be f_1, f_2, …, f_h, …, f_H, with corresponding wave numbers k_1, k_2, …, k_h, …, k_H, where k_h = 2πf_h/c, h is the index of an actual frequency of the sound source to be replayed, and H is the highest frequency index; the frequency content of the sound source to be replayed is richer than that of the virtual sound source, i.e. H >> 8;
following step 2, the sound pressure of the target sound field radiated by the sound source to be replayed,
Figure FDA0002968316790000043
Figure FDA0002968316790000044
where
Figure FDA0002968316790000045
k_h = 2πf_h/c; s is the position of the actual single-frequency sound source to be replayed; y_m is the position of the m-th listening point, m = 1, 2, …, M;
following step 3, the sound pressure of the sound field reconstructed by the activated loudspeakers,
Figure FDA0002968316790000046
Figure FDA0002968316790000047
where w_h = [w_h(1), …, w_h(N_a)]^T is the weight vector of the activated loudspeakers, and G_h is the set of acoustic transfer functions from all activated loudspeakers to all listening points:
Figure FDA0002968316790000051
where
Figure FDA0002968316790000052
represents the acoustic transfer function from the n-th activated loudspeaker to the m-th listening point, n = 1, 2, …, N_a, m = 1, 2, …, M;
to minimize the playback error, l2 regularization is used to find the activated loudspeaker weights under the minimum mean square criterion:
Figure FDA0002968316790000053
where
Figure FDA0002968316790000054
is the value of w_h that minimizes
Figure FDA0002968316790000055
that is, the weight vector of the activated loudspeakers at frequency f_h; λ_2 > 0 is a regularization penalty parameter that controls the total loudspeaker power and is chosen according to the application so that playback is achieved with as little total power as possible while the error is minimized; the solution of the above problem is:
Figure FDA0002968316790000056
where the superscript H denotes the conjugate transpose, and w_h is the resulting drive signal of the N_a activated loudspeakers at frequency f_h;
finally, w_1, w_2, …, w_H are superposed to synthesize the drive signals of the activated loudspeakers for the sound source to be replayed over all frequency components.
CN201810352481.0A 2018-04-19 2018-04-19 Immersive broadband 3D sound field playback method Active CN108632709B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810352481.0A CN108632709B (en) 2018-04-19 2018-04-19 Immersive broadband 3D sound field playback method


Publications (2)

Publication Number Publication Date
CN108632709A CN108632709A (en) 2018-10-09
CN108632709B true CN108632709B (en) 2021-04-27

Family

ID=63705604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810352481.0A Active CN108632709B (en) 2018-04-19 2018-04-19 Immersive broadband 3D sound field playback method

Country Status (1)

Country Link
CN (1) CN108632709B (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant