CN110166927B - Virtual sound image reconstruction method based on positioning correction - Google Patents

Virtual sound image reconstruction method based on positioning correction

Info

Publication number
CN110166927B
Authority
CN
China
Prior art keywords
loudspeaker
gain
azimuth
sound image
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910392966.7A
Other languages
Chinese (zh)
Other versions
CN110166927A (en)
Inventor
涂卫平
翟双星
郑佳玺
余智勇
万言
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201910392966.7A
Publication of CN110166927A
Application granted
Publication of CN110166927B
Active legal status
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S 5/00: Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S 7/00: Indicating arrangements; Control arrangements, e.g. balance control

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

The invention provides a virtual sound image reconstruction method based on positioning correction, which comprises the following steps: first, the loudspeaker azimuths and the azimuth of the target reconstructed sound image are determined, and loudspeaker gains are allocated by a vector base amplitude panning method; a binaural signal is then synthesized and interaural cues are extracted, from which a virtual sound image estimation model estimates the azimuth of the virtual sound image. The estimated azimuth is compared with the target azimuth, and the loudspeaker gain values are adjusted by a bisection method until the deviation between the estimated azimuth and the target azimuth is smaller than the minimum audible angle; the adjusted loudspeaker gains are then output, thereby correcting the vector base amplitude panning method. The invention makes the azimuth of the sound image reconstructed by vector base amplitude panning consistent with the target azimuth.

Description

Virtual sound image reconstruction method based on positioning correction
Technical Field
The invention relates to the technical field of audio, in particular to a virtual sound image reconstruction method based on positioning correction.
Background
In virtual reality, a realistic spatial perception of sound images depends on undistorted reconstruction of virtual sound images, so improving the accuracy of virtual sound image reconstruction has become a research hotspot in the multimedia field at home and abroad. The most widely used method for reconstructing a virtual sound image is amplitude panning (AP). AP techniques include the sine-law panning technique, the tangent-law panning technique, vector base amplitude panning (VBAP), multiple-direction amplitude panning (MDAP), and the like. Virtual sound image reconstruction based on AP adopts a simple geometric model: direction vectors from the listening point to each loudspeaker are established, gains are allocated to each loudspeaker by vector synthesis, and a sound image in the target direction is thereby synthesized.
Although AP is computationally simple, its geometric model of loudspeakers and listening point ignores the filtering effect of the listener's head, torso and so on as sound travels to the ears. This causes the azimuth perceived by the listener to deviate from the estimated azimuth, so the synthesized virtual sound image deviates from the target sound image. For this reason, the vector base amplitude panning technique needs to be corrected.
Disclosure of Invention
The invention provides a virtual sound image reconstruction method based on positioning correction, which corrects the vector base amplitude panning method so that the virtual sound image reconstructed by vector base amplitude panning is more accurate; the method comprises the following steps:
Step 1: determining the loudspeaker azimuths and the target azimuth, wherein the number of loudspeakers is 2 or 3 and the target azimuth is the ideal virtual sound image azimuth expected to be reconstructed;
Step 2: allocating an initial gain to each loudspeaker by a vector base amplitude panning method according to the loudspeaker azimuths and the target azimuth;
Step 3: synthesizing the binaural signal corresponding to the initial virtual sound image through the summing localization criterion according to the loudspeaker gain values, and extracting interaural cues;
Step 4: inputting the interaural cues extracted in step 3 into an existing virtual sound image azimuth estimation model, wherein the estimation model is used for estimating the azimuth represented by a binaural signal;
Step 5: judging whether the azimuth estimated by the virtual sound image azimuth estimation model is consistent with the target azimuth, where consistent means that the difference between the estimated azimuth and the target azimuth is within the minimum audible angle of the target azimuth; if they are consistent, taking the current loudspeaker gains as the corrected gains for vector base amplitude panning;
Step 6: if the estimated azimuth is not consistent with the target azimuth, calculating the loudspeaker gain ratio, dividing the gain ratio interval, determining the median gain ratio by bisection, recalculating the loudspeaker gains, and repeating steps 3-6, wherein the gain ratio is the ratio of the right loudspeaker gain to the left loudspeaker gain.
Preferably, the extraction of the interaural cues in step 3 specifically includes:
Step 3.1: selecting the corresponding HRTF data according to each loudspeaker azimuth and the target azimuth, wherein the HRTF data are stored in an HRTF database in which the left- and right-ear HRTF data corresponding to each spatial position are recorded;
Step 3.2: obtaining each loudspeaker signal by applying each loudspeaker gain to the sound source signal, convolving each loudspeaker signal with the left- and right-ear HRTF data, and summing to obtain the left- and right-ear signals;
Step 3.3: extracting the interaural cues from the left- and right-ear signals, wherein the interaural cues are cues used for localizing the sound source position and comprise binaural cues and monaural cues.
Preferably, determining the median gain ratio by bisection in step 6 successively approximates the corrected loudspeaker gains, and specifically includes:
Step 6.1: calculating the gain ratio from the loudspeaker gains, and dividing the original gain ratio interval into a left interval and a right interval with the gain ratio as the critical point;
Step 6.2: selecting the gain ratio variation interval from the two intervals of step 6.1 according to the deviation of the target azimuth from the estimated azimuth;
Step 6.3: calculating the median gain ratio from the left and right limit values of the gain ratio interval, and solving the gains of the left and right loudspeakers by gain normalization.
Drawings
FIG. 1: spatial positions of the loudspeakers and the head in an embodiment of the invention;
FIG. 2: synthesis of the binaural signal from the left and right loudspeakers;
FIG. 3: structure of the neural network;
FIG. 4: flow chart of the vector base amplitude panning correction of the invention;
FIG. 5: method for adjusting the loudspeaker gains in an embodiment of the invention;
FIG. 6: spatial positions of the three loudspeakers;
FIG. 7: mapping of the estimated sound image onto the plane of loudspeakers 1 and 2;
FIG. 8: mapping of the estimated sound image onto the plane of loudspeakers 2 and 3.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a virtual sound image reconstruction method based on positioning correction, which solves the problem that the azimuth of the reconstructed virtual sound image deviates from the target azimuth because the existing vector base amplitude panning technique neglects the listener's disturbance of the sound field.
The technical scheme in the embodiment of the application has the following general idea:
First, the loudspeaker azimuths and the target azimuth are determined, and a corresponding gain value is allocated to each loudspeaker by the vector base amplitude panning method. A virtual sound image is then synthesized based on an HRTF database and interaural cues are extracted, and the azimuth of the currently synthesized virtual sound image is estimated with a virtual sound image azimuth estimation model. Next, the loudspeaker gains are adjusted by bisection according to the difference between the target azimuth and the estimated azimuth, iterating until this difference is smaller than the minimum audible angle; the loudspeaker gains at that point are recorded as the finally corrected loudspeaker gains.
The method predicts the virtual sound image azimuth in real time and keeps adjusting the loudspeaker gains by bisection, terminating only when the difference between the predicted azimuth and the target azimuth is smaller than the minimum audible angle. Therefore, as long as the prediction error of the virtual sound image azimuth estimation model is small, the method effectively reduces the localization deviation of vector base amplitude panning; most existing virtual sound image azimuth estimation models have good prediction performance.
The technical scheme of the invention is explained in detail in the following by combining the drawings and the embodiment.
The invention provides a virtual sound image synthesis method and device based on positioning correction, which solve the problem that the azimuth deviation of a virtual sound image synthesized by the vector base amplitude panning method is large. The implementation flow of the embodiment comprises the following steps:
Step 1: determining the loudspeaker azimuths and the target azimuth, wherein the number of loudspeakers is 2 or 3 and the target azimuth is the virtual sound image azimuth expected to be reconstructed;
Determining the loudspeaker azimuths and the target azimuth in step 1 specifically comprises:
The vector base amplitude panning method is suitable for two or three loudspeakers. Taking the case of 2 loudspeakers as an example, a coordinate system is established with the head as the origin, and the 2 loudspeakers are located on a circle centered on the listening point (the head). Straight ahead of the head is defined as 0 degrees, and the directions of the left and right ears are -90 and 90 degrees respectively; the angles of the 2 loudspeakers are -θ and θ, and the azimuth of the target sound image is φ_T.
Step 2: calculating the initial gain value of each loudspeaker by vector base amplitude panning according to the loudspeaker azimuths and the target azimuth (the virtual sound image azimuth expected to be reconstructed);
In step 2, the initial gain values g1 and g2 of the loudspeakers are calculated by vector base amplitude panning from the loudspeaker azimuths and the target azimuth.
Specifically, the principle of the vector base amplitude panning method is as follows: given 2 or 3 loudspeakers at the same distance from the listening point, and assuming that the virtual sound image and the loudspeakers lie on a sphere of that radius around the center point, each loudspeaker position and the center point define a unit vector, and the unit vector of the virtual sound image is synthesized from these vectors.
In a specific implementation with 2 loudspeakers, referred to as the left and right loudspeakers according to their relative azimuths, the initial gains follow from the vector base amplitude panning solution for loudspeakers at -θ and θ:
g1 = sin(θ - φ_T) / sin(2θ)
g2 = sin(θ + φ_T) / sin(2θ)
after which the gains may be normalized so that g1² + g2² = 1.
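For illustration, a minimal Python sketch of this two-loudspeaker gain allocation is given below; the function name vbap_gains_2d and the power normalization g1² + g2² = 1 are illustrative assumptions rather than part of the patent text.

    import numpy as np

    def vbap_gains_2d(theta_deg, phi_deg):
        """Initial VBAP gains for loudspeakers at -theta and +theta and a
        target azimuth phi (degrees, right of the listener positive)."""
        theta, phi = np.radians(theta_deg), np.radians(phi_deg)
        g1 = np.sin(theta - phi)       # left loudspeaker
        g2 = np.sin(theta + phi)       # right loudspeaker
        norm = np.hypot(g1, g2)        # power normalization: g1^2 + g2^2 = 1
        return g1 / norm, g2 / norm

    # Example: loudspeakers at +/-45 degrees, target sound image at 20 degrees
    g1, g2 = vbap_gains_2d(45.0, 20.0)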
Step 3: synthesizing the binaural signal corresponding to the initial virtual sound image through the summing localization criterion according to the loudspeaker gain values, and extracting the interaural cues;
In step 3, the binaural signal corresponding to the initial virtual sound image is synthesized through the summing localization criterion according to the loudspeaker gain values, and the interaural cues are extracted, as follows:
The corresponding head-related transfer function (HRTF) is determined according to each loudspeaker position. The HRTFs are stored in an HRTF database in which the left-ear and right-ear HRTFs corresponding to each spatial position are recorded. The corresponding HRTFs are obtained according to the loudspeaker positions and, combined with the initial loudspeaker gain values obtained in step 2, the binaural signal synthesized by the two loudspeakers at the ears is calculated and the interaural cues are extracted.
Specifically, the HRTF underlies a sound localization method: pulse signals are used to record the free-field transmission of sound waves from a sound source to the listener's two ears, including the combined filtering by the listener's head, pinnae, torso and so on, and the results are stored as an HRTF database. Different positions correspond to different HRTFs, and the HRTF depends on individual characteristics. Available HRTF databases include the CIPIC, ARI, PKU and SADIE databases; they differ in data volume and sampling precision, and can be selected as required.
As an optional implementation, the loudspeakers comprise a left loudspeaker and a right loudspeaker, and the selected database is the CIPIC database. In step 3, under the left/right loudspeaker configuration, synthesizing the virtual sound image based on the HRTF database and extracting the interaural cues specifically includes:
Step 3.1: selecting the corresponding HRTFs from the CIPIC database according to each loudspeaker azimuth and the target azimuth; the CIPIC library records the left- and right-ear HRTF data corresponding to each spatial position, covering M = 1250 spatial positions in total;
Step 3.2: according to the left- and right-ear HRTFs corresponding to the left and right loudspeaker positions, combined with the left and right loudspeaker gains, the binaural signal corresponding to the virtual sound image synthesized by the left and right loudspeakers can be calculated;
Specifically, using the CIPIC HRTF database, let s be the sound source signal, g1 the gain of the left loudspeaker and g2 the gain of the right loudspeaker, so that the left loudspeaker signal is sl = s·g1 and the right loudspeaker signal is sr = s·g2. Each loudspeaker signal is convolved with the left-ear HRTF to obtain its left-ear contribution and with the right-ear HRTF to obtain its right-ear contribution. As shown in fig. 2, the left-ear signal is the sum of the signals al and bl transmitted to the left ear by the left and right loudspeakers respectively; the right-ear signal is the sum of the signals ar and br transmitted to the right ear by the left and right loudspeakers respectively. The left- and right-ear signals are obtained by:
xl = s·g1 * hrtf_ll + s·g2 * hrtf_rl
xr = s·g1 * hrtf_lr + s·g2 * hrtf_rr
where * denotes convolution, xl is the left-ear signal and xr the right-ear signal; hrtf_ll is the left-ear HRTF corresponding to the left loudspeaker, hrtf_rl the left-ear HRTF corresponding to the right loudspeaker, hrtf_lr the right-ear HRTF corresponding to the left loudspeaker, and hrtf_rr the right-ear HRTF corresponding to the right loudspeaker.
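A minimal Python sketch of this synthesis step follows, assuming hrtf_ll, hrtf_rl, hrtf_lr and hrtf_rr hold time-domain head-related impulse responses selected from the database for the two loudspeaker positions:

    import numpy as np

    def synthesize_binaural(s, g1, g2, hrtf_ll, hrtf_rl, hrtf_lr, hrtf_rr):
        """Summing localization: each loudspeaker signal is the source s
        scaled by its gain, convolved with the impulse response from that
        loudspeaker to each ear; the per-ear contributions are summed."""
        sl, sr = g1 * s, g2 * s                                    # loudspeaker signals
        xl = np.convolve(sl, hrtf_ll) + np.convolve(sr, hrtf_rl)   # left ear
        xr = np.convolve(sl, hrtf_lr) + np.convolve(sr, hrtf_rr)   # right ear
        return xl, xr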
Step 3.3: extracting the interaural cues from the binaural signal, where the interaural cues are the cues used by the human ears to judge the position of a sound source and comprise binaural cues, monaural cues and the like;
Specifically, the interaural cues include the interaural time difference (ITD), the interaural level difference (ILD), the binaural cross-correlation function (CCF), monaural cues and the like. The monaural cue used here is a monaural spectral cue, represented by the energy values (GFE) of the left- and right-ear signals after passing through a gammatone filter bank. The interaural cues can be selected as required.
In a specific implementation, the binaural signal obtained in step 3.2 is framed and one frame of the signal is taken for calculation.
The ILD is calculated as follows:
ILD = 10·log10( Σ_n |Xr(n)|² / Σ_n |Xl(n)|² )
where Xl is the left-ear signal and Xr the right-ear signal of the current frame.
The CCF is calculated as follows:
CCF(τ) = Σ_{n=1..N} xl(n)·xr(n+τ) / sqrt( Σ_{n=1..N} xl²(n) · Σ_{n=1..N} xr²(n) )
where xl(n) is the left-ear signal, xr(n) the right-ear signal, n the time index, τ the delay of the right-ear signal relative to the left-ear signal, and N the total length of the signals.
The ITD is the delay τ at which the CCF reaches its peak. The GFE values are obtained by passing the left- and right-ear signals through a 20-channel gammatone filter bank and taking the signal energy in each channel, yielding 40 GFE values in total.
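A Python sketch of the ILD, CCF and ITD computations for one frame follows; the ±1 ms lag search range is an illustrative assumption, and the 20-channel gammatone filtering that yields the GFE values is omitted:

    import numpy as np

    def interaural_cues(xl, xr, fs):
        """ILD (dB), normalized cross-correlation and ITD for one frame."""
        ild = 10.0 * np.log10(np.sum(xr**2) / np.sum(xl**2))
        max_lag = int(1e-3 * fs)                # search the ITD within +/-1 ms
        lags = np.arange(-max_lag, max_lag + 1)
        denom = np.sqrt(np.sum(xl**2) * np.sum(xr**2))
        ccf = np.array([np.sum(xl[max(0, -t):len(xl) - max(0, t)] *
                               xr[max(0, t):len(xr) - max(0, -t)])
                        for t in lags]) / denom
        itd = lags[np.argmax(ccf)] / fs         # delay (s) at the CCF peak
        return ild, ccf, itd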
Step 4: estimating the sound image azimuth represented by the binaural signal with the virtual sound image estimation model to obtain the estimated azimuth.
Specifically, the virtual sound image estimation model is a sound image estimation method based on a BP neural network model whose input is the interaural cues and whose output is the corresponding sound image azimuth. The network structure is shown in fig. 3 and comprises an input layer, two hidden layers and an output layer; the input layer contains 75 nodes, each hidden layer contains 151 nodes, and the output layer contains 2 nodes. When training the neural network, the activation function of the hidden layers is set to the sigmoid function, the learning rate is 0.001, and the number of iterations is 350. Verification shows that the average error of the sound image azimuth estimated by the neural network is smaller than the average minimum audible angle, so the localization of the neural network model is considered accurate.
In the specific implementation process, the interaural cues extracted in step 3 are input into the neural network model to obtain the estimated azimuth.
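A sketch of a network with this structure, using scikit-learn's MLPRegressor as an assumed stand-in for the patent's BP network implementation (the random training data are placeholders for real cue/azimuth pairs):

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    model = MLPRegressor(hidden_layer_sizes=(151, 151),  # two hidden layers
                         activation='logistic',          # sigmoid units
                         learning_rate_init=0.001,
                         max_iter=350)

    # Placeholder data: rows are 75-dimensional interaural-cue vectors,
    # targets are the 2-node azimuth encoding used by the model.
    X_train = np.random.randn(200, 75)
    y_train = np.random.randn(200, 2)
    model.fit(X_train, y_train)
    estimated = model.predict(X_train[:1])   # estimated azimuth encoding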
Step 5: judging whether the azimuth estimated by the virtual sound image azimuth estimation model is consistent with the target azimuth, where consistent means that the difference between the estimated azimuth and the target azimuth is smaller than the minimum audible angle of the target azimuth; if they are consistent, the current loudspeaker gains are used as the corrected gains for vector base amplitude panning.
In particular, the difference between the target azimuth φ_T and the estimated azimuth φ_E is defined as Δφ = φ_T - φ_E. If the difference between the estimated azimuth and the target azimuth is smaller than the minimum audible angle (MAA), that is, when
|φ_T - φ_E| < MAA,
the current loudspeaker gains are output.
Step 6: if the estimated azimuth is not consistent with the target azimuth, calculating the loudspeaker gain ratio, dividing the gain ratio interval, determining the median gain ratio by bisection, recalculating the loudspeaker gains, and repeating steps 3-6, where the gain ratio is the ratio of the right loudspeaker gain to the left loudspeaker gain.
Specifically, the loudspeaker gain values are adjusted by bisection until the estimated azimuth output by the neural network is consistent with the target azimuth, and the loudspeaker gains at that point are recorded as the corrected gains for vector base amplitude panning.
In practice, when the estimated azimuth is not consistent with the target azimuth, that is, |φ_T - φ_E| ≥ MAA, the loudspeaker gains are adjusted; the general flow is shown in fig. 5. The adjustment specifically comprises the following steps:
Step 6.1: first calculate the current loudspeaker gain ratio g, set the adjustment interval of the gain ratio to [a, b], and divide the gain ratio interval at g into two intervals [a, g] and [g, b];
Step 6.2: if φ_T - φ_E < 0, i.e. the estimated azimuth lies to the right of the target, choose the gain ratio interval [a, g]; if φ_T - φ_E > 0, choose the gain ratio interval [g, b];
Step 6.3: calculate the median gain ratio from the gain ratio interval, i.e. take the average of the left and right limit values of the interval as the median gain ratio, then solve the gains of the left and right loudspeakers by gain normalization, and repeat steps 3 to 6.
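A Python sketch of this correction loop follows; the MAA value, the interval limits [a, b] and the callable estimate_azimuth, which stands for steps 3-4 (binaural synthesis plus the estimation model), are illustrative assumptions:

    import numpy as np

    MAA = 2.0   # minimum audible angle in degrees (illustrative value)

    def correct_gains(phi_target, estimate_azimuth, g1, g2,
                      a=0.0, b=10.0, max_iter=50):
        """Bisection on the gain ratio g = g2/g1 until the estimated
        azimuth is within the MAA of the target (steps 3-6)."""
        for _ in range(max_iter):
            phi_est = estimate_azimuth(g1, g2)   # steps 3-4: synthesis + model
            diff = phi_target - phi_est
            if abs(diff) < MAA:                  # step 5: consistent, stop
                break
            g = g2 / g1                          # step 6.1: current ratio
            if diff < 0:
                b = g                            # step 6.2: take [a, g]
            else:
                a = g                            # step 6.2: take [g, b]
            g_mid = 0.5 * (a + b)                # step 6.3: median ratio
            g1 = 1.0 / np.hypot(1.0, g_mid)      # normalize g1^2 + g2^2 = 1
            g2 = g_mid * g1
        return g1, g2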
In the implementation with three loudspeakers, as shown in figs. 6 to 8, the target azimuth φ_T is first mapped: its mapping onto the plane formed by the listening point, loudspeaker 1 and loudspeaker 2 is φ_T12, and its mapping onto the plane formed by the listening point, loudspeaker 2 and loudspeaker 3 is φ_T23. Likewise, the mapping of the estimated azimuth φ_E onto the plane formed by the listening point, loudspeaker 1 and loudspeaker 2 is φ_E12, and its mapping onto the plane formed by the listening point, loudspeaker 2 and loudspeaker 3 is φ_E23. The loudspeaker adjusting steps are as follows:
Define Δφ12 = φ_T12 - φ_E12. The specific adjustment follows the bisection method used for two loudspeakers, in the same way as steps 6.1 to 6.3, until the azimuth φ_E12 of the virtual sound image synthesized by loudspeaker 1 and loudspeaker 2 is consistent with φ_T12.
Define Δφ23 = φ_T23 - φ_E23. The specific adjustment likewise follows the bisection method used for two loudspeakers, in the same way as steps 6.1 to 6.3, until the azimuth φ_E23 of the virtual sound image synthesized by loudspeaker 2 and loudspeaker 3 is consistent with φ_T23.
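Under the same assumptions, the three-loudspeaker case can be sketched as two applications of the two-loudspeaker routine correct_gains from the sketch above, one per loudspeaker-pair plane; the per-plane target azimuths and estimators are assumed inputs:

    def correct_three_speakers(phi_t12, phi_t23, estimate_12, estimate_23,
                               g1, g2, g3):
        """Apply the two-loudspeaker bisection within each pair's plane."""
        # Plane of loudspeakers 1 and 2: adjust g2/g1 toward phi_t12
        g1, g2 = correct_gains(phi_t12, estimate_12, g1, g2)
        # Plane of loudspeakers 2 and 3: adjust g3/g2 toward phi_t23
        g2, g3 = correct_gains(phi_t23, estimate_23, g2, g3)
        return g1, g2, g3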
It should be understood that parts of the specification not set forth in detail are well within the prior art.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (3)

1. A virtual sound image reconstruction method based on localization correction is characterized by comprising the following steps:
Step 1: determining loudspeaker azimuths and a target azimuth, wherein the number of loudspeakers is 2 or 3 and the target azimuth is the virtual sound image azimuth expected to be reconstructed;
Step 2: allocating an initial gain to each loudspeaker by a vector base amplitude panning method according to the loudspeaker azimuths and the target azimuth;
Step 3: synthesizing a binaural signal corresponding to the initial virtual sound image through a summing localization criterion according to the loudspeaker gain values, and extracting interaural cues;
Step 4: inputting the interaural cues extracted in step 3 into a virtual sound image azimuth estimation model, wherein the estimation model is used for estimating the azimuth represented by the binaural signal;
Step 5: judging whether the azimuth estimated by the virtual sound image azimuth estimation model is consistent with the target azimuth, wherein consistent means that the difference between the estimated azimuth and the target azimuth is within the minimum audible angle range of the target azimuth, and if they are consistent, taking the current loudspeaker gains as the corrected gains for vector base amplitude panning;
Step 6: if the estimated azimuth is not consistent with the target azimuth, calculating the loudspeaker gain ratio, dividing the gain ratio interval, determining the median gain ratio by a bisection method, calculating the loudspeaker gains, and repeating steps 3-6, wherein the gain ratio is the ratio of the right loudspeaker gain to the left loudspeaker gain.
2. The method of claim 1, wherein the extraction of the interaural cues in step 3 specifically comprises:
Step 3.1: selecting corresponding HRTF data according to each loudspeaker azimuth and the target azimuth, wherein the HRTF data are stored in an HRTF database in which the left- and right-ear HRTF data corresponding to each spatial position are recorded;
Step 3.2: obtaining each loudspeaker signal by applying each loudspeaker gain to the sound source signal, convolving each loudspeaker signal with the left- and right-ear HRTF data, and summing to obtain the left- and right-ear signals;
Step 3.3: extracting the interaural cues from the left- and right-ear signals, wherein the interaural cues are cues used for localizing the sound source position and comprise binaural cues and monaural cues.
3. The method according to claim 1, wherein determining the median gain ratio by the bisection method in step 6 successively approximates the corrected loudspeaker gains, and specifically comprises:
Step 6.1: calculating the gain ratio from the loudspeaker gains, and dividing the original gain ratio interval into a left interval and a right interval with the gain ratio as the critical point;
Step 6.2: selecting the gain ratio variation interval from the two intervals of step 6.1 according to the deviation of the target azimuth from the estimated azimuth;
Step 6.3: calculating the median gain ratio from the left and right limit values of the gain ratio interval, and solving the gains of the left and right loudspeakers by gain normalization.
CN201910392966.7A 2019-05-13 2019-05-13 Virtual sound image reconstruction method based on positioning correction Active CN110166927B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910392966.7A CN110166927B (en) 2019-05-13 2019-05-13 Virtual sound image reconstruction method based on positioning correction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910392966.7A CN110166927B (en) 2019-05-13 2019-05-13 Virtual sound image reconstruction method based on positioning correction

Publications (2)

Publication Number Publication Date
CN110166927A CN110166927A (en) 2019-08-23
CN110166927B true CN110166927B (en) 2020-05-12

Family

ID=67634306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910392966.7A Active CN110166927B (en) 2019-05-13 2019-05-13 Virtual sound image reconstruction method based on positioning correction

Country Status (1)

Country Link
CN (1) CN110166927B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106385660A (en) * 2015-08-07 2017-02-08 杜比实验室特许公司 Audio signal processing based on object
US9648438B1 (en) * 2015-12-16 2017-05-09 Oculus Vr, Llc Head-related transfer function recording using positional tracking
CN109068262A (en) * 2018-08-03 2018-12-21 武汉大学 A kind of acoustic image personalization replay method and device based on loudspeaker

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110268285A1 (en) * 2007-08-20 2011-11-03 Pioneer Corporation Sound image localization estimating device, sound image localization control system, sound image localization estimation method, and sound image localization control method
CN107205207B (en) * 2017-05-17 2019-01-29 华南理工大学 A kind of virtual sound image approximation acquisition methods based on middle vertical plane characteristic

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106385660A (en) * 2015-08-07 2017-02-08 杜比实验室特许公司 Audio signal processing based on object
US9648438B1 (en) * 2015-12-16 2017-05-09 Oculus Vr, Llc Head-related transfer function recording using positional tracking
CN109068262A (en) * 2018-08-03 2018-12-21 武汉大学 A kind of acoustic image personalization replay method and device based on loudspeaker

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A Sound Image Reproduction Model Based on Personalized Weight Vectors; Zheng Jiaxi, Tu Weiping, Zhang Xiong; 19th Pacific-Rim Conference on Multimedia (PCM); 2018-09-22; full text *
Gain Factors Calibration in 3D Sound Reproduction Using VBAP; Hu Ruimin, Zhang Maosheng, Yang Yuhong; 9th International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP); 2013-10-18; full text *
Real-time 3D audio generation technology and implementation; Tu Weiping, Yao Xuechun, Zhang Maosheng, Hu Ruimin, Yang Cheng; Journal of Frontiers of Computer Science and Technology (计算机科学与探索); 2015-02-05; vol. 9, no. 7; full text *

Also Published As

Publication number Publication date
CN110166927A (en) 2019-08-23

Similar Documents

Publication Publication Date Title
US9838825B2 (en) Audio signal processing device and method for reproducing a binaural signal
CN110021306B (en) Method for generating custom spatial audio using head tracking
US20240098445A1 (en) Concept for generating an enhanced sound field description or a modified sound field description using a multi-point sound field description
US9961474B2 (en) Audio signal processing apparatus
US7231054B1 (en) Method and apparatus for three-dimensional audio display
US20190230436A1 (en) Method, systems and apparatus for determining audio representation(s) of one or more audio sources
US8437485B2 (en) Method and device for improved sound field rendering accuracy within a preferred listening area
US20180310114A1 (en) Distributed Audio Capture and Mixing
US20150156599A1 (en) Efficient personalization of head-related transfer functions for improved virtual spatial audio
CN106664501A (en) System, apparatus and method for consistent acoustic scene reproduction based on informed spatial filtering
US20090067636A1 (en) Optimization of Binaural Sound Spatialization Based on Multichannel Encoding
CN107820158B (en) Three-dimensional audio generation device based on head-related impulse response
US10652686B2 (en) Method of improving localization of surround sound
US10966046B2 (en) Spatial repositioning of multiple audio streams
Zhong et al. Head-related transfer functions and virtual auditory display
TW202022853A (en) Method and apparatus for decoding encoded audio signal in ambisonics format for l loudspeakers at known positions and computer readable storage medium
JP2009077379A (en) Stereoscopic sound reproduction equipment, stereophonic sound reproduction method, and computer program
Garí et al. Flexible binaural resynthesis of room impulse responses for augmented reality research
Salvador et al. Design theory for binaural synthesis: Combining microphone array recordings and head-related transfer function datasets
Lopez et al. Elevation in wave-field synthesis using HRTF cues
Breebaart et al. Phantom materialization: A novel method to enhance stereo audio reproduction on headphones
CN110166927B (en) Virtual sound image reconstruction method based on positioning correction
US11388540B2 (en) Method for acoustically rendering the size of a sound source
Koyama Boundary integral approach to sound field transform and reproduction
US20200275232A1 (en) Transfer function dataset generation system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant