CN111193990B - 3D audio system capable of resisting high-frequency spatial aliasing and implementation method

Publication number
CN111193990B
Authority
CN
China
Prior art keywords
spherical
order
signal
fourier transform
hoa
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010009944.0A
Other languages
Chinese (zh)
Other versions
CN111193990A (en)
Inventor
曲天书
吴玺宏
林晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Application filed by Peking University
Priority to CN202010009944.0A
Publication of CN111193990A
Application granted
Publication of CN111193990B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/305 Electronic adaptation of stereophonic audio signals to reverberation of the listening space
    • H04S7/306 For headphones
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141 Discrete Fourier transforms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R5/00 Stereophonic arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S5/00 Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03 Application of parametric coding in stereophonic audio systems

Abstract

The invention discloses a 3D audio system for resisting high-frequency spatial aliasing and an implementation method. The method comprises the following steps: 1) for a given spherical microphone array, sampling the spherical sound pressure and performing a discrete spherical Fourier transform on the sampled sound pressure, the expansion order of the discrete spherical Fourier transform being no greater than the truncation order N; 2) obtaining a spatial aliasing matrix E from the relationship between the expansion coefficients of the discrete spherical Fourier transform in step 1) and the true coefficients of the spherical sound pressure expansion; 3) solving for a signal s through the formula $\min \|s\|_1$ together with the constraint
Figure DDA0002356771040000011
4) encoding the resulting signal s to the higher order N through the formula $B_N = Y_N s$ to obtain a higher-order HOA signal $B_N$; 5) multiplying the obtained HOA signal by an inverse matrix of the spherical Fourier transform to reconstruct the sound field and obtain 3D audio.

Description

3D audio system capable of resisting high-frequency spatial aliasing and implementation method
Technical Field
The invention belongs to the technical field of 3D audio, and particularly relates to a 3D audio system capable of resisting high-frequency spatial aliasing and an implementation method.
Background
3D audio technology refers mainly to the techniques used to give a listener the corresponding spatial auditory impression during audio playback.
The sound image reconstructed by the commonly used stereo or surround sound systems has freedom only in the horizontal direction and cannot leave the plane of the loudspeakers; it does not even reach the 2D specification and falls far short of 3D spatial audio. Because the development of 3D audio lags behind that of 3D video, mainstream 3D multimedia systems, whether in cinemas or at home, adopt a "3D video + stereo/surround sound" scheme. This leads to inconsistent visual and auditory perception, and hence to insufficient immersion and realism. With growing demands for realistic and immersive sound and the rise of virtual reality technologies, 3D audio playback is gradually gaining importance.
In 3D audio playback, the most direct approach is to use Head-Related Transfer Functions (HRTFs) to simulate human perception of a sound source at any azimuth in space. However, this method can only realize audio playback in specific directions and has side effects such as front-back confusion and in-head localization. Other mainstream methods include Vector-Based Amplitude Panning (VBAP) and Wave Field Synthesis (WFS), while Ambisonics-based 3D audio systems are more promising because of their unique advantages. First, recording is convenient: the recording end and the playback end are independent of each other, so the loudspeaker layout used for playback need not be considered during recording. Second, the system is compatible with existing non-3D spatial audio playback systems such as stereo and 5.1/7.1. Third, it supports several playback modes, using either loudspeakers or headphones. Finally, it enables binaural playback based on head tracking.
Ambisonics has a long history of development: in the early 1970s, Michael Gerzon proposed an implementation of first-order Ambisonics. Because the low spatial resolution of first-order Ambisonics does not meet practical needs, many researchers have turned to Higher Order Ambisonics (HOA). HOA uses the spherical harmonic functions as an orthogonal basis to decompose the sound field into multichannel HOA signals, from which the sound field is analyzed and reconstructed. In theory, the higher the HOA order, the larger the region over which the sound field can be reconstructed accurately; in practice, the order is limited by the number of microphones and loudspeakers, which grows quadratically with the encoding order.
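The quadratic growth follows from the number of spherical harmonic channels up to order N, which is (N+1)^2. A minimal Python sketch of that count (the function name is illustrative and not taken from the patent):

```python
def hoa_channel_count(order: int) -> int:
    """Number of spherical-harmonic (HOA) channels up to and including `order`."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


for n in range(1, 7):
    print(f"order {n}: {hoa_channel_count(n)} channels")
```

Order 4 needs 25 channels, which fits within the 32-microphone array used as the running example in this document.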
Ambisonics-based 3D audio systems can provide users with sufficient realism and immersion, but they face a key problem in practical applications: the usable frequency band is narrow (severe spatial aliasing and directional confusion occur at high frequencies). The upper cut-off frequency of a 3D audio system with 4th-order HOA encoding and 32 microphones is 5.4 kHz, which is unacceptable in application scenarios that require high frequencies (e.g. concert recording).
High-frequency spatial aliasing occurs because the spatial Nyquist sampling theorem is not satisfied with a limited number of spherical microphones. A relatively straightforward remedy is to increase the number of microphones or to reduce the radius of the array. Although increasing the number of microphones relieves spatial aliasing, the number of microphones grows with the square of the cut-off frequency, so the number required rises sharply as the cut-off frequency increases and quickly becomes impractical. Reducing the radius of the array without changing the number of microphones is limited by the manufacturing process and, in addition, raises the frequency below which low-frequency noise is amplified. Multi-radius spherical microphone array structures have also been proposed to widen the usable band, but they require complex and expensive array designs and are of limited practical use. This analysis shows that widening the usable band at the hardware level is costly, so a new anti-spatial-aliasing HOA encoding algorithm is needed that can raise the upper cut-off frequency substantially without changing the hardware, thereby solving the narrow-band problem of Ambisonics-based 3D audio systems.
Disclosure of Invention
The problem to be solved by the invention is that the available frequency band of the current 3D audio system based on Ambisonics is narrow, and the problem limits the application of the system in some scenes with higher requirements on sound, such as concert recording. Aiming at the problem, the invention provides a 3D audio system implementation method for resisting high-frequency spatial aliasing, which utilizes the inherent aliasing mode of the spherical microphone array for generating spatial aliasing and combines a sparse recovery method to achieve the aim of avoiding the influence of spatial aliasing when HOA coding is carried out at high frequency.
The technical scheme of the invention is as follows:
a method for implementing a 3D audio system with high frequency spatial aliasing rejection, comprising the steps of:
1) for a given spherical microphone array, sampling the spherical sound pressure and performing a discrete spherical Fourier transform on the sampled sound pressure, the expansion order of the discrete spherical Fourier transform being no greater than the truncation order N;
2) obtaining a spatial aliasing matrix E from the relationship between the expansion coefficients of the discrete spherical Fourier transform in step 1),
Figure BDA0002356771020000021
and the true coefficients $p_{nm}$ of the spherical sound pressure expansion;
3) solving for a signal s through the formula $\min \|s\|_1$ together with the constraint
Figure BDA0002356771020000022
where $Y_N$ is the spherical Fourier transform matrix of order N, $B'_N$ is the N-order HOA signal (containing aliasing errors) obtained by HOA encoding of the spherical microphone array signals, and $\varepsilon$ is a preset value;
4) encoding the signal s obtained in step 3) to the higher order N through the formula $B_N = Y_N s$ to obtain a higher-order HOA signal $B_N$ free of aliasing errors;
5) multiplying the HOA signal obtained in step 4) by an inverse matrix of the spherical Fourier transform to reconstruct the sound field and obtain 3D audio.
Further, the frequency f of the signal collected by the spherical microphone array satisfies
Figure BDA0002356771020000031
where c is the speed of sound and r is the radius of the spherical microphone array.
Further, the truncation order satisfies $N < (M+1)^2$, where M is the number of spherical microphones in the spherical microphone array.
Further, the spatial aliasing matrix E is a matrix of size
Figure BDA0002356771020000036
whose elements are
Figure BDA0002356771020000032
where
Figure BDA0002356771020000033
is the spherical Fourier expansion order of the spherical sound pressure and Q is the number of spherical microphones.
Further, each loudspeaker signal obtained when reconstructing the sound field is convolved with the head-related impulse response of the corresponding loudspeaker and the results are superimposed to obtain a binaural signal, thereby realizing a headphone-based 3D audio system.
A 3D audio system for resisting high-frequency spatial aliasing, characterized by comprising a high-order HOA signal generation module and a sound field reconstruction module, wherein:
the high-order HOA signal generation module is used for sampling the spherical sound pressure of the spherical microphone array and performing a discrete spherical Fourier transform on the sampled sound pressure, the expansion order of the discrete spherical Fourier transform being no greater than the truncation order N; then obtaining a spatial aliasing matrix E from the relationship between the expansion coefficients of the discrete spherical Fourier transform,
Figure BDA0002356771020000034
and the true coefficients $p_{nm}$ of the spherical sound pressure expansion; then solving for a signal s through the formula $\min \|s\|_1$ together with the constraint
Figure BDA0002356771020000035
where $Y_N$ is the spherical Fourier transform matrix of order N, $B'_N$ is the N-order HOA signal obtained by HOA encoding of the spherical microphone array signals, and $\varepsilon$ is a preset value; then encoding s to order N through the formula $B_N = Y_N s$ to obtain the N-order HOA signal $B_N$;
the sound field reconstruction module is used for multiplying the obtained HOA signal by an inverse matrix of the spherical Fourier transform to reconstruct the sound field and obtain 3D audio.
The invention has the beneficial effects that:
The upper cut-off frequency of a spherical microphone array (32 microphones, 4th-order HOA encoding) is raised from 5.4 kHz to 10 kHz, so the problem of high-frequency spatial aliasing is solved and Ambisonics-based 3D audio systems become applicable across different scenarios.
Drawings
FIG. 1 is the overall scheme of an Ambisonics-based 3D audio system;
FIG. 2 is a flow chart of the anti-spatial-aliasing HOA (Higher Order Ambisonics) encoding;
FIG. 3 is the spatial aliasing pattern of a spherical microphone array (32 microphones, rigid sphere) with a radius of 5 cm;
FIG. 4 shows the spatial orientation maps over frequency for the single-source experiment:
(a) using the ideal HOA signal, (b) the conventional HOA encoding method,
(c) the encoding method of the invention, (d) the optimized encoding method of the invention;
FIG. 5 shows the spatial orientation maps over frequency for the two-source experiment:
(a) using the ideal HOA signal, (b) the conventional HOA encoding method,
(c) the encoding method of the invention, (d) the optimized encoding method of the invention.
Detailed Description
The following describes a method for implementing a 3D audio system for resisting high-frequency spatial aliasing according to the present invention with reference to the accompanying drawings and embodiments.
Fig. 1 is a global scheme of an Ambisonics-based 3D audio system, and specific implementation steps of the system include spatial aliasing matrix solution, anti-spatial aliasing HOA coding, and experimental verification. FIG. 2 is a flow chart of a spatial aliasing matrix solution. The concrete realization of each step is as follows:
1. spatial aliasing matrix solution
For a given spherical microphone array, the pattern in which spatial aliasing occurs is fixed, so the information in this aliasing pattern can be used to counteract spatial aliasing. The spatial aliasing pattern of a spherical microphone array with an approximately uniform distribution is analyzed as follows.
A spherical coordinate system is adopted, in which θ is the elevation angle (ranging from 0 to π) and φ is the horizontal angle (increasing counterclockwise, ranging from 0 to 2π). For a rigid sphere of radius r, the sound pressure on its surface can be expanded in spherical harmonic functions according to equation (1):
Figure BDA0002356771020000041
Here $W_n(kr)$ is the radial function, n is the order of the spherical harmonic expansion of the spherical sound pressure, k is the wavenumber, and r is the radius of the spherical microphone array. Applying the spherical Fourier transform to the spherical sound pressure gives the coefficients $p_{nm}$ (the spherical Fourier transform coefficient of order n and degree m obtained from the continuous sound pressure on the sphere):
Figure BDA0002356771020000042
In practical applications the spherical sound pressure has to be sampled and the expansion order truncated to N, so the spherical sound pressure expansion can be written in matrix form as:
Figure BDA0002356771020000043
The result of the discrete spherical Fourier transform of the sampled spherical sound pressure is denoted by
Figure BDA0002356771020000044
(the discrete spherical Fourier transform coefficient of order n and degree m obtained from the sampled sound pressure):
Figure BDA0002356771020000051
where Q is the number of spherical microphones. The order N of the array is determined by the number of spherical microphones and the sampling scheme, with $N < (M+1)^2$, where M is the number of spherical microphones; in HOA encoding the truncation order is generally equal to the order of the array. When the order of the array is N, the components of order higher than N are folded onto the lower orders in a fixed pattern when the discrete spherical Fourier expansion of the spherical sound pressure is computed, so the low-order components are contaminated and spatial aliasing occurs. For spatial aliasing not to occur, the spherical sound pressure function must be of finite order, less than N. The expansion order of the spherical sound pressure function increases with frequency. Thus, for a known array structure, when the signal frequency f satisfies
Figure BDA0002356771020000052
(c is the speed of sound), the spatial aliasing error can be considered negligible; this bound is referred to here as the upper cut-off frequency. When the signal frequency exceeds the upper cut-off frequency, more severe spatial aliasing occurs, but the aliasing pattern of a fixed array is fixed, so the spatial aliasing problem can be mitigated by analyzing and exploiting this pattern.
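The bound itself is given in the equation image above and is not reproduced here. As a rough illustration only, the sketch below assumes the common rule of thumb that an array of order N is usable while the product kr stays below about N (or N+1); the function name, the speed of sound of 343 m/s and both candidate limits are assumptions, not values taken from the patent.

```python
import math


def frequency_at_kr(kr: float, radius_m: float, c: float = 343.0) -> float:
    """Frequency (Hz) at which the wavenumber-radius product k*r equals `kr`."""
    return kr * c / (2.0 * math.pi * radius_m)


radius = 0.05  # 5 cm rigid sphere, as in the text
order = 4      # encoding order used with 32 microphones

f_kr_n = frequency_at_kr(order, radius)           # assumed limit kr = N
f_kr_n_plus_1 = frequency_at_kr(order + 1, radius)  # assumed limit kr = N + 1
print(round(f_kr_n), round(f_kr_n_plus_1))
```

With these assumptions the two limits come out at roughly 4.4 kHz and 5.5 kHz. The second is close to the 5.4 kHz quoted in this document, which is consistent with, but does not confirm, a bound of that form.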
Let the coefficients of the spherical Fourier expansion of the spherical sound pressure be $p_{nm}$ and the expansion order be
Figure BDA0002356771020000053
The process by which aliasing occurs is analyzed through the relationship between the computed coefficients
Figure BDA0002356771020000054
and the true coefficients $p_{nm}$ of the spherical sound pressure expansion (the true coefficients $p_{nm}$ are obtained from the continuous formula on the sphere):
Figure BDA0002356771020000055
Figure BDA0002356771020000056
where
Figure BDA0002356771020000057
Here $\alpha_q$ is a parameter related to the distribution of the spherical microphones; since the common sampling scheme is approximately uniform, it can be taken as 1. $Y_{n,m}(\theta_q, \phi_q)$ and $Y_{n',m'}(\theta_q, \phi_q)$ are entries of the spherical Fourier transform matrix and give the values of the spherical harmonic functions at each sampling point. E is called the spatial aliasing matrix and reflects the aliasing pattern of the array. The elements of E are visualized in FIG. 3.
As can be seen from equation (5), for spatial aliasing not to occur it is required that, when (n', m') = (n, m),
Figure BDA0002356771020000058
and in all other cases
Figure BDA0002356771020000059
For the array aliasing matrix shown in FIG. 3, E is
Figure BDA00023567710200000510
and its leading $(N+1)^2 \times (N+1)^2$ block is an identity matrix. If the encoding order is fixed at 4, the requirement is met when the expansion order of the spherical sound pressure is less than 5, and no spatial aliasing occurs; if the expansion order of the spherical sound pressure is greater than 5, the obtained coefficients
Figure BDA00023567710200000511
differ from the ideal coefficients by an aliasing error $e_{nm}$ caused by the higher-order components, as shown in equation (6).
Figure BDA0002356771020000061
However, spatial aliasing does not pollute all coefficients in all cases, and as can be seen from fig. 3, when the expansion order of the spherical sound pressure is 6, only the fourth order of the calculated coefficients deviates from the ideal value, and the coefficients of other lower orders are correct. That is, when the signal frequency exceeds the upper cut-off frequency, spatial aliasing errors contaminate the higher order components first, affecting the lower order components gradually as the frequency increases.
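The structure of the aliasing matrix can be illustrated on a much smaller array. The sketch below is not the 32-microphone array of FIG. 3: it assumes six sensors at the vertices of an octahedron, equal weights of 4π/Q (so that the leading block is an identity for orthonormal harmonics), encoding order 1 and a sound field of order 3. The element formula used, a weighted sum of the conjugated harmonic of the encoding index times the harmonic of the field index over the sensors, is an assumption based on the symbols named in the text, and all function names are illustrative.

```python
import cmath
import math


def assoc_legendre(n, m, x):
    """Associated Legendre function P_n^m(x), 0 <= m <= n, Condon-Shortley phase."""
    pmm = 1.0
    if m > 0:
        root = math.sqrt(max(0.0, 1.0 - x * x))
        odd = 1.0
        for _ in range(m):
            pmm *= -odd * root
            odd += 2.0
    if n == m:
        return pmm
    pm1 = x * (2 * m + 1) * pmm            # P_{m+1}^m
    for l in range(m + 2, n + 1):           # upward recurrence in degree
        pmm, pm1 = pm1, ((2 * l - 1) * x * pm1 - (l + m - 1) * pmm) / (l - m)
    return pm1


def sph_harm(n, m, theta, phi):
    """Complex spherical harmonic Y_n^m; theta is the polar angle, phi the azimuth."""
    am = abs(m)
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - am) / math.factorial(n + am))
    y = norm * assoc_legendre(n, am, math.cos(theta)) * cmath.exp(1j * am * phi)
    return (-1) ** am * y.conjugate() if m < 0 else y


def index_pairs(order):
    return [(n, m) for n in range(order + 1) for m in range(-n, n + 1)]


def aliasing_matrix(points, weights, enc_order, field_order):
    """E[(n,m),(n',m')] = sum_q w_q * conj(Y_nm(q)) * Y_n'm'(q)  (assumed form)."""
    rows, cols = index_pairs(enc_order), index_pairs(field_order)
    return [[sum(w * sph_harm(n, m, t, p).conjugate() * sph_harm(n2, m2, t, p)
                 for (t, p), w in zip(points, weights))
             for (n2, m2) in cols]
            for (n, m) in rows]


# Toy array: two poles plus four equatorial points (octahedron vertices).
half_pi = math.pi / 2
points = [(0.0, 0.0), (math.pi, 0.0)] + [(half_pi, k * half_pi) for k in range(4)]
weights = [4 * math.pi / len(points)] * len(points)

E = aliasing_matrix(points, weights, enc_order=1, field_order=3)
cols = index_pairs(3)
for (n, m), row in zip(index_pairs(1), E):
    hit = sorted({n2 for (n2, _), v in zip(cols, row) if abs(v) > 1e-9})
    print(f"row (n={n}, m={m:+d}) receives energy from field orders {hit}")
```

For this toy layout the order-0 row is fed only by order 0, order-2 components do not leak at all, and only the order-1 rows pick up order-3 components. This mirrors the behaviour described above, where aliasing first contaminates the highest encoded order.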
2. Anti-spatial aliasing HOA coding method
When the signal frequency exceeds the upper cut-off frequency, not all orders are contaminated at the beginning, but rather, as the frequency increases, the influence goes from higher orders to lower orders. A more direct idea is then to reject the contaminated higher order components and use only a lower order for encoding, but with the problem of low spatial resolution. In order to make the high frequency have a larger listening area in the reconstruction, it is necessary to process the encoded low-order signal that is not affected by the spatial aliasing. The method of up-scaling to high order with low order HOA signals can be used to partially solve the spatial aliasing problem. Because some low-order components are not affected by the spatial aliasing in a certain frequency range, the high-order components can be recovered by using the correct components, so that the effect caused by the spatial aliasing is eliminated in a certain frequency range. With the HOA signal of lower order N', the up-scaling algorithm is as follows:
Figure BDA0002356771020000062
$B'_{N'} = Y_{N'} s$ (7)
Here $B'_N$ is the N-order HOA signal with aliasing errors, and $B'_{N'}$ denotes $B'_N$ truncated to order N' (N > N'). $Y_{N'}$ is the spherical Fourier transform matrix of order N'. $s = [s_1, s_2, \ldots, s_L]^T$ is the vector of virtual loudspeaker signals, with L virtual loudspeakers placed in space, and $B_{00} \sim B_{NN}$ is the corrected HOA signal. The directions of the virtual loudspeakers are uniformly distributed on a sphere (an approximately uniform distribution can be used), and to raise the N'-order HOA signal to order N, the conditions $L > (N'+1)^2$ and $L > (N+1)^2$ must be satisfied.
Solving for s from equation (7) means solving an underdetermined system with infinitely many solutions. To obtain a more suitable solution, a sound-source sparsity assumption is introduced: if the sound sources are sparse at each time-frequency point, the solution can be constrained by:
$\min \|s\|_1$
Figure BDA0002356771020000063
Here $\|\cdot\|_p$ denotes the p-norm, and $\varepsilon$ is a small-valued parameter that accounts for the fact that the plane-wave dictionary cannot contain all possible sound source directions. This method has some disadvantages. It fails once the signal frequency is so high that even the first-order components contain large aliasing errors. Moreover, when the number of usable orders is fixed, its performance deteriorates rapidly as the number of sound sources increases: the low-order components describe only a coarse part of the sound field, so in some situations the low-order components of several sound sources may match those of a single sound source, and the sparsity constraint then selects the sparser solution and discards the true multi-source solution. Since the aliasing matrix describes the aliasing relationships among the components, its information can be exploited so that more components are used to improve the result. Equation (8) then becomes:
$\min \|s\|_1$
Figure BDA0002356771020000071
where $B'_N$ is the N-order HOA signal obtained by HOA encoding of the spherical microphone array signals, which contains spatial aliasing errors. The HOA coefficients $B_N$ free of spatial aliasing errors are recovered from this HOA signal with spatial aliasing errors.
Unlike equation (8), equation (9) uses all components of the N-order signal, whereas the method of equation (8) uses only the components up to order N' (N > N').
With the more accurate s obtained from either (8) or (9), s can be encoded to order N by equation (10) to obtain the N-order HOA signal:
$B_N = Y_N s$ (10)
where $Y_N$ is the spherical Fourier transform matrix of order N.
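The idea of equations (7), (8) and (10), namely finding sparse virtual loudspeaker signals from low-order components and re-encoding them at a higher order, can be sketched on a toy problem. Several things in the sketch below are assumptions and not the patent's method: it works in 2-D with real circular harmonics instead of spherical harmonics, uses 8 equally spaced virtual loudspeakers, and replaces the constrained problem by its Lagrangian (LASSO) form solved with iterative soft thresholding (ISTA). All names and parameter values are illustrative.

```python
import math


def harmonics(order, phi):
    """Real circular-harmonic vector [1, cos(phi), sin(phi), ..., cos(order*phi), sin(order*phi)]."""
    vec = [1.0]
    for k in range(1, order + 1):
        vec += [math.cos(k * phi), math.sin(k * phi)]
    return vec


def dictionary(order, directions):
    """Matrix Y with one column per virtual-loudspeaker direction."""
    cols = [harmonics(order, phi) for phi in directions]
    return [[col[i] for col in cols] for i in range(2 * order + 1)]


def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def rmatvec(a, r):
    """Transpose product a^T r."""
    return [sum(a[i][j] * r[i] for i in range(len(a))) for j in range(len(a[0]))]


def soft(v, t):
    """Soft-thresholding operator used by ISTA."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def ista(a, b, lam, lipschitz, iters=5000):
    """Minimise 0.5*||a s - b||_2^2 + lam*||s||_1 by iterative soft thresholding."""
    s = [0.0] * len(a[0])
    for _ in range(iters):
        resid = [p - q for p, q in zip(matvec(a, s), b)]
        grad = rmatvec(a, resid)
        s = [soft(sj - gj / lipschitz, lam / lipschitz) for sj, gj in zip(s, grad)]
    return s


L = 8                                   # number of virtual loudspeakers
directions = [2 * math.pi * j / L for j in range(L)]
low, high = 2, 3                        # stand-ins for N' and N
Y_low, Y_high = dictionary(low, directions), dictionary(high, directions)

true_dir = 3
b_low = harmonics(low, directions[true_dir])   # clean low-order signal of one plane wave

# For this equally spaced layout the largest eigenvalue of Y_low^T Y_low equals L.
s = ista(Y_low, b_low, lam=0.3, lipschitz=float(L))
b_high = matvec(Y_high, s)                     # re-encode at the higher order
print("strongest virtual loudspeaker:", max(range(L), key=lambda j: abs(s[j])))
```

The l1 penalty shrinks the recovered amplitude slightly (to about 0.9 here), which is the price of the Lagrangian form; a solver for the constrained problem, and the aliasing matrix E of equation (9), would take the place of these simplifications in a fuller treatment.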
3. Spatial decoding
After obtaining the HOA signal, multiplying the HOA signal by an inverse matrix of the spherical fourier transform, i.e. reconstructing the sound field according to the matrix inversion method, the basic principle is as follows: when the spherical harmonic expanded form of the superimposed sound field produced by the loudspeaker array is equivalent to the spherical harmonic expanded form of the original sound field, the sound field reconstructed by the loudspeaker array is equivalent to the original sound field.
Figure BDA0002356771020000072
$[s_1, s_2, \ldots, s_L]$ are the loudspeaker signals and L is the number of loudspeakers. The loudspeaker signals obtained by matrix inversion can be used for loudspeaker playback, or converted in a further step into a binaural signal for headphone playback.
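The matrix-inversion decoding can be illustrated with a small example. The sketch below assumes a hypothetical layout of six loudspeakers at the vertices of an octahedron and real, orthonormal first-order spherical harmonics; for this regular layout the pseudo-inverse of the encoding matrix reduces to a scaled transpose. The layout, the normalization and the function names are illustrative choices, not taken from the patent.

```python
import math


def real_sh_first_order(x, y, z):
    """Real orthonormal spherical harmonics up to order 1 for a unit vector (x, y, z)."""
    c0 = 1.0 / math.sqrt(4.0 * math.pi)
    c1 = math.sqrt(3.0 / (4.0 * math.pi))
    return [c0, c1 * y, c1 * z, c1 * x]


# Hypothetical layout: six loudspeakers at the vertices of an octahedron.
speakers = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
Y = [real_sh_first_order(*d) for d in speakers]   # one row per loudspeaker


def decode(b):
    """Loudspeaker signals via the pseudo-inverse, which here equals (4*pi/L) * Y."""
    scale = 4.0 * math.pi / len(speakers)
    return [scale * sum(yi * bi for yi, bi in zip(row, b)) for row in Y]


def encode(s):
    """HOA signal reproduced by the loudspeaker signals s."""
    return [sum(Y[l][i] * s[l] for l in range(len(speakers))) for i in range(4)]


# Unit-amplitude plane wave arriving from the +x direction.
b = real_sh_first_order(1.0, 0.0, 0.0)
s = decode(b)
error = max(abs(p - q) for p, q in zip(encode(s), b))
print([round(v, 3) for v in s], error)
```

The loudspeaker facing the source receives the largest signal (2/3), the opposite one a negative signal (-1/3), and the remaining four share 1/6 each; re-encoding the loudspeaker signals returns the original HOA signal up to rounding error.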
Each of the obtained speaker signals is convolved with its corresponding Head Related Impulse Response (HRIR), and then superimposed to obtain a binaural signal.
Figure BDA0002356771020000073
In this way, both loudspeaker-based and headphone-based 3D audio systems can be implemented.
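The binaural step, convolving each loudspeaker signal with the head-related impulse response of its loudspeaker and summing per ear, can be sketched as follows. The impulse responses in the example are made-up short sequences for illustration; real HRIRs would come from a measured data set, and the function names are hypothetical.

```python
def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out


def binauralize(speaker_signals, hrirs_left, hrirs_right):
    """Convolve each loudspeaker signal with its HRIR pair and sum per ear."""
    def render(hrirs):
        parts = [convolve(s, h) for s, h in zip(speaker_signals, hrirs)]
        length = max(len(p) for p in parts)
        return [sum(p[k] for p in parts if k < len(p)) for k in range(length)]
    return render(hrirs_left), render(hrirs_right)


# Made-up two-loudspeaker example.
signals = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
hrir_l = [[1.0, 0.5], [0.25]]
hrir_r = [[0.25], [1.0, 0.5]]
left, right = binauralize(signals, hrir_l, hrir_r)
print(left)
print(right)
```

For signals of realistic length an FFT-based convolution would normally replace the direct double loop.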
THE ADVANTAGES OF THE PRESENT INVENTION
The advantages of the present invention will be described below with reference to practical results.
The invention uses a spherical microphone array consisting of a rigid sphere with a radius of 5 cm and 32 microphones as the spatial audio acquisition device, and computes the spatial direction of the sound source from the HOA signals obtained by spatial encoding. To assess the effectiveness of the method across frequencies, experiments were carried out from 2 kHz to 10 kHz.
The spatial orientation map of the sound source can be calculated by:
$b_N(\Omega) = y^T B_N$, (9)
where $y = [Y_{00}(\Omega), \ldots, Y_{NN}(\Omega)]^T$ is the vector formed by the spherical harmonic functions of each order in the direction Ω, and $B_N$ is the computed HOA signal. From the formula it can be seen that, as Ω goes from 0 to 2π, a spatial orientation map of the horizontal plane can be drawn.
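A minimal sketch of such a horizontal scan is given below. It assumes real, orthonormal spherical harmonics truncated at first order (the experiments in this document use order 4) and an ideal HOA signal of a single plane wave; the function names are illustrative.

```python
import math


def sh_horizontal(phi):
    """Real first-order spherical harmonics for a direction in the horizontal plane."""
    c0 = 1.0 / math.sqrt(4.0 * math.pi)
    c1 = math.sqrt(3.0 / (4.0 * math.pi))
    return [c0, c1 * math.sin(phi), 0.0, c1 * math.cos(phi)]


def orientation_map(b, step_deg=1):
    """Evaluate b(Omega) = y(Omega)^T B for azimuths from 0 to 359 degrees."""
    return [sum(yi * bi for yi, bi in zip(sh_horizontal(math.radians(a)), b))
            for a in range(0, 360, step_deg)]


# Ideal HOA signal of a unit-amplitude plane wave from 50 degrees.
B = sh_horizontal(math.radians(50))
response = orientation_map(B)
peak = max(range(len(response)), key=response.__getitem__)
print("peak azimuth (degrees):", peak)
```

The map peaks at the source direction; at first order the lobe is broad, and higher orders narrow it, which is why the maps in FIG. 4 and FIG. 5 are sensitive to contamination of the higher-order components.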
FIG. 4 shows the experimental results for a single sound source of unit amplitude incident from the 50-degree direction in the horizontal plane; FIG. 5 shows the experimental results for two sound sources of unit amplitude incident from the 50-degree and 310-degree directions in the horizontal plane, respectively. The experimental results show that above 5.4 kHz the conventional HOA encoding method is severely affected by spatial aliasing and the estimated directions become disordered at high frequencies; the method proposed by the invention effectively solves this problem and is almost unaffected by spatial aliasing from 5.4 kHz to 10 kHz.
Although specific embodiments of the invention and the accompanying drawings have been disclosed for illustrative purposes, to aid understanding of the invention, those skilled in the art will appreciate that various substitutions, changes and modifications are possible without departing from the spirit and scope of the present invention and the appended claims. Therefore, the present invention should not be limited to what is disclosed in the preferred embodiments and the accompanying drawings.

Claims (8)

1. A method for implementing a 3D audio system with high frequency spatial aliasing rejection, comprising the steps of:
1) for a given spherical microphone array, sampling the spherical sound pressure and performing a discrete spherical Fourier transform on the sampled sound pressure, the expansion order of the discrete spherical Fourier transform being no greater than the truncation order N;
2) obtaining a spatial aliasing matrix E from the relationship between the expansion coefficients of the discrete spherical Fourier transform in step 1),
Figure FDA0002801794250000011
and the true coefficients $p_{nm}$ of the spherical sound pressure expansion;
3) solving for a signal s through the formula $\min \|s\|_1$ together with the constraint
Figure FDA0002801794250000012
where $Y_N$ is the spherical Fourier transform matrix of order N, $B'_N$ is the N-order HOA signal obtained by HOA encoding of the spherical microphone array signals, and $\varepsilon$ is a preset value;
4) encoding the signal s obtained in step 3) to the higher order N through the formula $B_N = Y_N s$ to obtain a higher-order HOA signal $B_N$;
5) multiplying the HOA signal obtained in step 4) by an inverse matrix of the spherical Fourier transform to reconstruct the sound field and obtain 3D audio.
2. The method of claim 1, wherein the frequency f of the signal collected by the spherical microphone array satisfies
Figure FDA0002801794250000013
where c is the speed of sound and r is the radius of the spherical microphone array.
3. The method according to claim 1 or 2, characterized in that the truncation order satisfies $N < (M+1)^2$, where M is the number of spherical microphones in the spherical microphone array.
4. The method of claim 1, wherein the spatial aliasing matrix E is a matrix of size
Figure FDA0002801794250000014
whose elements are
Figure FDA0002801794250000015
where
Figure FDA0002801794250000016
is the spherical Fourier expansion order of the spherical sound pressure, Q is the number of spherical microphones, $\alpha_q$ is a parameter related to the distribution of the spherical microphones, $Y_{n,m}(\theta_q, \phi_q)$ is the entry of the spherical Fourier transform matrix of order n and degree m, i.e. the value of the spherical harmonic function at the point $(\theta_q, \phi_q)$, $Y_{n',m'}(\theta_q, \phi_q)$ is the entry of order n' and degree m', i.e. the value of the corresponding spherical harmonic function at the point $(\theta_q, \phi_q)$, $\theta_q$ is the elevation angle of point q in the spherical coordinate system, and $\phi_q$ is the horizontal angle of point q in the spherical coordinate system.
5. The method of claim 1, wherein each loudspeaker signal obtained when reconstructing the sound field is convolved with the head-related impulse response of the corresponding loudspeaker, and the results are superimposed to obtain a binaural signal, thereby implementing a headphone-based 3D audio system.
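The binaural rendering described in claim 5 can be sketched as a convolve-and-sum loop. This is a minimal illustration, assuming time-domain signals stored as Python lists, equal-length loudspeaker signals, and one (left, right) impulse-response pair per loudspeaker. The function names and the toy impulse responses are hypothetical and are not measured head-related data.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


def binauralize(speaker_signals, hrirs):
    """Convolve each loudspeaker feed with its HRIR pair and sum per ear.

    hrirs[k] is a (left_ir, right_ir) pair for loudspeaker k. All feeds are
    assumed to share one length and all impulse responses another.
    """
    n_out = len(speaker_signals[0]) + len(hrirs[0][0]) - 1
    left = [0.0] * n_out
    right = [0.0] * n_out
    for feed, (h_left, h_right) in zip(speaker_signals, hrirs):
        for n, v in enumerate(convolve(feed, h_left)):
            left[n] += v
        for n, v in enumerate(convolve(feed, h_right)):
            right[n] += v
    return left, right


# Two virtual loudspeakers with made-up two-tap impulse responses.
feeds = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
toy_hrirs = [([1.0, 0.5], [0.25, 0.0]), ([0.5, 0.0], [1.0, 0.5])]
left_ear, right_ear = binauralize(feeds, toy_hrirs)
```

Real head-related impulse responses are hundreds of taps long, so a practical renderer would normally switch to FFT-based convolution. The direct form is kept here only because it mirrors the wording of the claim.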
6. A 3D audio system resistant to high-frequency spatial aliasing, characterized by comprising a high-order HOA signal generation module and a sound field reconstruction module; wherein
the high-order HOA signal generation module is used for sampling the spherical sound pressure of the spherical microphone array and performing a discrete spherical Fourier transform on the sampled spherical sound pressure, the expansion order of the discrete spherical Fourier transform being not more than the truncation order N; then, based on the relationship between the expansion coefficient of the discrete spherical Fourier transform
Figure FDA0002801794250000021
and the true coefficient $p_{nm}$ of the spherical sound pressure expansion, obtaining a spatial aliasing matrix E; then, through the formula $\min(\|s\|_1)$,
Figure FDA0002801794250000022
solving to obtain a signal s, wherein $Y_N$ is the spherical Fourier transform matrix of order N, $B'_N$ is the N-order HOA signal obtained by HOA encoding of the spherical microphone array signals, and $\varepsilon$ is a set value; then encoding s to the higher order N through the formula $B_N = Y_N s$ to obtain the higher-order HOA signal $B_N$;
and the sound field reconstruction module is used for multiplying the obtained HOA signal by an inverse matrix of the spherical Fourier transform to reconstruct a sound field and obtain 3D audio.
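The encode and reconstruct steps, $B_N = Y_N s$ followed by multiplication with an inverse of the spherical Fourier transform matrix, can be illustrated with a round trip. The sketch assumes real first-order spherical harmonics and a hypothetical four-loudspeaker tetrahedral layout, for which the matrix is square and its inverse is simply a scaled transpose. General layouts would need a pseudo-inverse instead. All names are illustrative.

```python
import math

# Hypothetical tetrahedral loudspeaker directions (unit vectors).
S = 1 / math.sqrt(3)
SPEAKER_DIRS = [(S, S, S), (S, -S, -S), (-S, S, -S), (-S, -S, S)]
ALPHA = 4 * math.pi / len(SPEAKER_DIRS)

C0 = 0.5 / math.sqrt(math.pi)
C1 = math.sqrt(3 / (4 * math.pi))


def sh_order1(x, y, z):
    """Real spherical harmonics up to order 1 at a unit direction."""
    return [C0, C1 * y, C1 * z, C1 * x]


# Y[k][q]: harmonic k evaluated at loudspeaker direction q (4 x 4 here).
Y = [list(row) for row in zip(*(sh_order1(*d) for d in SPEAKER_DIRS))]


def encode(s):
    """HOA encoding B = Y s of per-direction signal amplitudes s."""
    return [sum(Y[k][q] * s[q] for q in range(len(s))) for k in range(len(Y))]


def decode(b):
    """Reconstruction g = Y^{-1} B; for this layout Y^{-1} = ALPHA * Y^T."""
    return [
        ALPHA * sum(Y[k][q] * b[k] for k in range(len(b)))
        for q in range(len(SPEAKER_DIRS))
    ]


amplitudes = [1.0, 0.0, -0.5, 0.25]
recovered = decode(encode(amplitudes))
```

Because the four directions form an exact quadrature for first-order harmonics, decoding the encoded signal returns the original amplitudes up to rounding.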
7. The 3D audio system of claim 6, wherein the truncation order $N < (M+1)^2$, where M is the number of spherical microphones in the spherical microphone array.
8. The 3D audio system of claim 6, wherein the frequency f of the signal collected by the spherical microphone array satisfies
Figure FDA0002801794250000023
where c is the speed of sound and r is the radius of the spherical microphone array.
CN202010009944.0A 2020-01-06 2020-01-06 3D audio system capable of resisting high-frequency spatial aliasing and implementation method Active CN111193990B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010009944.0A CN111193990B (en) 2020-01-06 2020-01-06 3D audio system capable of resisting high-frequency spatial aliasing and implementation method

Publications (2)

Publication Number Publication Date
CN111193990A CN111193990A (en) 2020-05-22
CN111193990B true CN111193990B (en) 2021-01-19

Family

ID=70710604

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010009944.0A Active CN111193990B (en) 2020-01-06 2020-01-06 3D audio system capable of resisting high-frequency spatial aliasing and implementation method

Country Status (1)

Country Link
CN (1) CN111193990B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104854655A (en) * 2012-12-12 2015-08-19 汤姆逊许可公司 Method and apparatus for compressing and decompressing higher order ambisonics representation for sound field

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102333265B (en) * 2011-05-20 2014-02-19 南京大学 Replay method of sound fields in three-dimensional local space based on continuous sound source concept
EP2592846A1 (en) * 2011-11-11 2013-05-15 Thomson Licensing Method and apparatus for processing signals of a spherical microphone array on a rigid sphere used for generating an Ambisonics representation of the sound field
CN106303843B (en) * 2016-07-29 2018-04-03 北京工业大学 A kind of 2.5D playback methods of multizone different phonetic sound source
CN108877817A (en) * 2018-08-23 2018-11-23 深圳市裂石影音科技有限公司 A kind of encoding scheme of audio collecting device and the panorama sound based on this device
CN110133579B (en) * 2019-04-11 2021-02-05 南京航空航天大学 Spherical harmonic order self-adaptive selection method suitable for sound source orientation of spherical microphone array

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant