CN105682000A - Audio processing method and system - Google Patents
Audio processing method and system
- Publication number: CN105682000A
- Application number: CN201610017000.1A
- Authority: CN (China)
- Prior art keywords: signal, audio, binaural, rotation, format
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
- H04S7/303—Tracking of listener position or orientation
- H04S7/304—For headphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/305—Electronic adaptation of stereophonic audio signals to reverberation of the listening space
- H04S7/306—For headphones
Abstract
The invention relates to a cloud audio processing method, server and system. The method comprises: for audio signals in different formats, binaurally transcoding each of them based on the head rotation angle reported by a client, so as to generate binaural audio signals in the corresponding formats; and superposing those binaural signals to obtain a binaural virtual surround sound output signal. Because the audio processing is carried out on a cloud server, the method fits audio processing and storage based on a cloud architecture, and it alleviates the low quality of virtual surround sound generated on mobile terminals and their heavy computational burden. In addition, to address the delay that server-side processing may introduce, the method further comprises smoothing the rotation angle.
Description
Technical field
The present invention relates to the field of signal processing technology, and in particular to an audio processing method, server and system.
Background art
When a virtual reality headset (head-mounted display, HMD) presents content to a user, virtual 3D audio technology is used and the audio content is played to the user through stereo headphones. One way to improve the sense of presence is to track the user's head movement (head tracking) and process the sound accordingly. For example, if the original sound is perceived as coming from straight ahead, then after the user turns the head 90 degrees to the left, the sound should be processed so that the user perceives it as coming from 90 degrees to the right. The virtual reality device can take many forms, for example a display device with head tracking, or a pair of stereo headphones fitted with a head tracking sensor.
Head tracking can also be implemented in several ways. A common one is to use multiple motion sensors. A motion sensor suite usually includes an accelerometer, a gyroscope and a magnetometer. Each sensor has its own inherent strengths and weaknesses in tracking motion and absolute orientation. Common practice is therefore to use sensor fusion, which combines the signals from all sensors to produce a more accurate motion detection result.
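As an illustration of the sensor fusion idea described above, the following is a minimal sketch of a complementary filter that blends a gyroscope rate with a magnetometer heading to estimate head yaw. The patent does not specify any fusion algorithm; the function names, the blending factor `alpha` and the simulated sensor values are all assumptions made for this example.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def fuse_yaw(prev_yaw, gyro_rate_dps, mag_yaw, dt, alpha=0.98):
    """One complementary-filter update for head yaw, in degrees (hypothetical helper)."""
    # Integrating the gyroscope is smooth and responsive, but drifts over time.
    gyro_yaw = prev_yaw + gyro_rate_dps * dt
    # The magnetometer heading is noisy but does not drift, so it corrects slowly.
    error = wrap_deg(mag_yaw - gyro_yaw)
    return wrap_deg(gyro_yaw + (1.0 - alpha) * error)


# Simulated case: the head is still at 30 degrees, the gyroscope has a 0.5 deg/s bias.
yaw = 0.0
for _ in range(500):
    yaw = fuse_yaw(yaw, gyro_rate_dps=0.5, mag_yaw=30.0, dt=0.01)
print(round(yaw, 2))
```

In this sketch the estimate settles close to the magnetometer heading even though the gyroscope alone would keep drifting, which is the complementary behaviour the paragraph above alludes to.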
Once the head rotation angle has been obtained, the sound must be changed accordingly. One approach is to convert the sound to the Ambisonic domain and then transform the signal with a rotation matrix. An Ambisonic signal usually has more than two channels, whereas common media players support only two-channel stereo, which makes it difficult to play Ambisonic or other multichannel audio signals directly.
In view of this, the art needs an efficient solution for generating and playing high-quality virtual surround sound.
Summary of the invention
To overcome the above drawbacks of the prior art, the object of the present invention is to provide a cloud audio processing method, server and system that generate virtual surround sound efficiently and with high quality, mainly for audio playback over stereo headphones used together with a virtual reality headset. The virtual surround sound is generated on a cloud server, which suits existing cloud-based network architectures: the server generates and stores the virtual surround sound. This solves the problem that existing clients cannot play the various kinds of 360-degree 3D audio, in particular audio for virtual reality applications.
To achieve the above object, the present invention provides a cloud audio processing method comprising the following steps: obtaining the rotation angle of the user's head; obtaining audio signals of different formats and, according to the rotation angle, binaurally transcoding the audio signal of each format to generate a binaural audio signal of the corresponding format; and superposing the binaural signals of the corresponding formats to obtain a binaural virtual surround sound output signal.
Preferably, the audio signals of different formats include a binaural recording signal, an Ambisonic recording signal and an audio object signal.
Preferably, binaurally transcoding the audio signals of different formats to generate the binaural audio signals of the corresponding formats specifically includes:
For the binaural recording signal, interpolating according to the rotation angle to generate a binaural-recording binaural signal;
For the Ambisonic recording signal, adjusting the Ambisonic recording signal according to the rotation angle, and binaurally transcoding the adjusted Ambisonic recording signal to generate an Ambisonic-recording binaural signal;
For the audio object signal, adjusting the audio object signal according to the rotation angle, and binaurally transcoding the adjusted audio object signal to generate an audio-object binaural signal.
Preferably, if higher spatial accuracy is required, the audio object signal is rotated according to the rotation angle, the rotated audio object signal is encoded into a higher-order B-format audio object signal, a higher-order B-format audio-object binaural signal is generated after binaural transcoding, and this signal is superposed with the Ambisonic-recording binaural signal and the binaural-recording binaural signal;
If low complexity and low latency are required, the audio object signal is encoded into a first-order B-format audio object signal and superposed with the other first-order Ambisonic recording signals; the resulting mixed signal is then binaurally transcoded according to the rotation angle to generate a mixed binaural signal of the audio objects and the Ambisonic recording signals, which is superposed with the binaural-recording binaural signal.
Preferably, obtaining the rotation angle of the user's head specifically comprises obtaining the rotation angle and smoothing it.
The present invention also provides a cloud audio processing server, comprising: an acquisition unit, which obtains the rotation angle of the user's head; a collection unit, which collects audio signals of different formats; a binaural transcoding unit, connected to the acquisition unit and the collection unit respectively, which binaurally transcodes the audio signal of each format according to the rotation angle to generate a binaural audio signal of the corresponding format; and a superposition unit, connected to the binaural transcoding unit, which superposes the binaural signals of the corresponding formats to obtain a binaural virtual surround sound output signal.
Preferably, the audio signals of different formats include a binaural recording signal, an Ambisonic recording signal and an audio object signal.
Preferably, the binaural transcoding unit binaurally transcoding the audio signals of different formats to generate the binaural audio signals of the corresponding formats specifically includes:
For the binaural recording signal, interpolating according to the rotation angle to generate a binaural-recording binaural signal;
For the Ambisonic recording signal, adjusting the Ambisonic recording signal according to the rotation angle, and binaurally transcoding the adjusted Ambisonic recording signal to generate an Ambisonic-recording binaural signal;
For the audio object signal, adjusting the audio object signal according to the rotation angle, and binaurally transcoding the adjusted audio object signal to generate an audio-object binaural signal.
Preferably, if higher spatial accuracy is required, the binaural transcoding unit rotates the audio object signal according to the rotation angle, encodes the rotated audio object signal into a higher-order B-format audio object signal, and generates a higher-order B-format audio-object binaural signal after binaural transcoding; the superposition unit superposes the higher-order B-format audio-object binaural signal generated by the binaural transcoding unit with the Ambisonic-recording binaural signal and the binaural-recording binaural signal;
If low complexity and low latency are required, the binaural transcoding unit encodes the audio object signal into a first-order B-format audio object signal, superposes it with the other first-order Ambisonic recording signals, and then binaurally transcodes the resulting mixed signal according to the rotation angle to generate a mixed binaural signal of the audio objects and the Ambisonic recording signals; the superposition unit superposes the mixed binaural signal generated by the binaural transcoding unit with the binaural-recording binaural signal.
Preferably, the cloud server further includes a smoothing unit, connected to the binaural transcoding unit and the acquisition unit respectively; the smoothing unit receives the rotation angle of the user's head from the acquisition unit and smooths it.
The present invention also provides an audio playback system comprising a cloud audio processing server and a client. The client includes a head tracking device, which captures the head rotation angle and uploads it over a network to the cloud audio processing server. The cloud audio processing server receives the rotation angle, generates the binaural virtual surround sound output signal, and transmits it to the client over the network.
The cloud audio processing method, server and system according to the present invention generate virtual surround sound efficiently and with high quality, mainly for audio playback over stereo headphones used together with a virtual reality headset. The virtual surround sound is generated on a cloud server, which suits existing cloud-based network architectures: the cloud server performs the audio processing and storage. This solves the problem that existing clients cannot play the various kinds of 360-degree 3D audio, in particular audio for virtual reality applications.
With the cloud audio processing technique of the present invention, the sense of presence in multi-party voice communication can be greatly improved: a user can turn the head freely to attend to sound from a particular direction, which comes closer to a real multi-person conversation. In streaming scenarios in particular, adjusting the spatial orientation of the audio in real time improves the user's audio experience. When combined with virtual reality video content, the user experience improves further.
Brief description of the drawings
Fig. 1 is a block diagram of one embodiment of the cloud audio processing method of the present invention;
Fig. 2a-c are block diagrams of another embodiment of the cloud audio processing method of the present invention;
Fig. 3 is a structural diagram of an embodiment of the audio processing server of the present invention;
Fig. 4 is a structural diagram of an embodiment of the audio processing system of the present invention.
Detailed description of the invention
Embodiment one: as shown in Fig. 1, processing an audio object includes the following steps:
Obtain the user's head rotation angle from a head tracking device;
According to the rotation angle, encode the audio object into a higher-order (preferably 2nd- or 3rd-order) Ambisonic B-format signal;
Convert the Ambisonic B-format signal into a virtual speaker array signal. Taking a first-order B-format signal $[W_1\ X_1\ Y_1\ Z_1]^T$ as an example, it is converted into the virtual speaker array signal $[L_1\ L_2\ \dots\ L_N]^T$ by the following operation:
$$[L_1\ L_2\ \dots\ L_N]^T = G\,[W_1\ X_1\ Y_1\ Z_1]^T$$
where N is the number of virtual speakers in the virtual speaker layout. The matrix G in the formula above is the Ambisonic decoding matrix, which can be obtained by computing a pseudo-inverse.
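The decoding step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a traditional first-order encoding convention (W weighted by 1/√2, X to the front, Y to the left, Z up), a hypothetical cube layout of eight virtual speakers, and it obtains G as the right pseudo-inverse $C^T (C C^T)^{-1}$ of the speaker encoding matrix C, which is valid only when C has full row rank.

```python
import math


def encode_gains(az_deg, el_deg):
    """First-order gains [W, X, Y, Z] for one direction (assumed convention)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return [1.0 / math.sqrt(2.0),
            math.cos(az) * math.cos(el),
            math.sin(az) * math.cos(el),
            math.sin(el)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def invert(m):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def decoding_matrix(speakers):
    """Return G (N x 4) as the pseudo-inverse of the 4 x N encoding matrix C."""
    c = transpose([encode_gains(az, el) for az, el in speakers])  # 4 x N
    ct = transpose(c)                                             # N x 4
    return matmul(ct, invert(matmul(c, ct)))


# Hypothetical layout: eight virtual speakers at the corners of a cube.
CUBE = [(az, el) for el in (35.26, -35.26) for az in (45, 135, -135, -45)]
G = decoding_matrix(CUBE)


def decode(b_frame):
    """Apply [L1..LN]^T = G [W X Y Z]^T to one sample frame."""
    return [sum(g * b for g, b in zip(row, b_frame)) for row in G]


feeds = decode(encode_gains(30.0, 10.0))  # one unit-amplitude source frame
```

A useful sanity check on such a decoder is that re-encoding the speaker feeds at the speaker directions reproduces the original B-format frame, since C·G is the identity for a full-rank layout.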
Binaurally transcode the virtual speaker array signal of the audio object using binaural room impulse responses (BRIR), usually in three dimensions (that is, including height information), to obtain the binaural virtual surround sound output signal of the audio object. Specifically: take the two-channel stereo BRIR matrix that maps the virtual speaker signals to the headphone signals, and multiply this two-channel matrix by the virtual speaker array signal to obtain the virtual surround sound.
The BRIR matrix is [formula not reproduced in the source text], and the virtual surround sound is then [formula not reproduced in the source text]. There may be one or more such audio signals.
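Since the formulas for this step are not reproduced in the text, the following is only a sketch of one plausible reading: each virtual speaker feed is convolved with a left-ear and a right-ear impulse response, and the results are summed per ear, which is what multiplying a 2 x N matrix of impulse responses by the N speaker signals amounts to. The function names and the toy impulse responses are assumptions for illustration.

```python
def convolve(x, h):
    """Direct-form linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y


def render_binaural(speaker_feeds, brirs):
    """Mix N speaker feeds to two ears.

    speaker_feeds: N equal-length sample lists.
    brirs: N pairs (left_ir, right_ir) of equal-length impulse responses.
    """
    out_len = len(speaker_feeds[0]) + len(brirs[0][0]) - 1
    left, right = [0.0] * out_len, [0.0] * out_len
    for feed, (h_left, h_right) in zip(speaker_feeds, brirs):
        for k, v in enumerate(convolve(feed, h_left)):
            left[k] += v
        for k, v in enumerate(convolve(feed, h_right)):
            right[k] += v
    return left, right


# Toy data: two virtual speakers, two-tap impulse responses.
feeds = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
brirs = [([1.0, 0.5], [0.2, 0.0]),   # speaker 1: mostly reaches the left ear
         ([0.3, 0.0], [1.0, 0.5])]   # speaker 2: mostly reaches the right ear
left, right = render_binaural(feeds, brirs)
```

A practical renderer would use FFT-based or partitioned convolution for long room responses; the direct form is used here only to keep the sketch short.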
The binaural room impulse responses are preferably generated offline, either by real measurement or with dedicated software. Unlike the online generation used in the prior art, this avoids having to store a large number of BRIRs and so reduces memory consumption.
When an audio object is encoded into an Ambisonic B-format signal, the horizontal order is preferably greater than or equal to the vertical order. For example, when the horizontal encoding is 3rd order, the vertical encoding is preferably 2nd or 1st order, denoted H3V2 and H3V1 respectively. Because human perceptual resolution is lower for height than for angles in the horizontal plane, reducing the order in one specific direction in this way reduces the amount of computation without noticeably degrading the user's perception of the sound.
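To make the saving concrete, the sketch below counts channels under the common "#H#V" mixed-order scheme, in which a full-sphere set of order V is extended with the two horizontal-only components of each order from V+1 up to H. That the patent's H3V2 and H3V1 notation follows this scheme is an assumption; the function names are hypothetical.

```python
def full_order_channels(order):
    """Channel count of a full-sphere Ambisonic signal of the given order."""
    return (order + 1) ** 2


def mixed_order_channels(h_order, v_order):
    """Channel count for horizontal order h_order and vertical order v_order."""
    if v_order > h_order:
        raise ValueError("vertical order must not exceed horizontal order")
    # Full-sphere components up to v_order, plus two horizontal-only
    # components for each additional horizontal order.
    return full_order_channels(v_order) + 2 * (h_order - v_order)


for name, count in [("H3V3", full_order_channels(3)),
                    ("H3V2", mixed_order_channels(3, 2)),
                    ("H3V1", mixed_order_channels(3, 1))]:
    print(name, count)
```

Under this assumption, every later stage (rotation, decoding, binaural filtering) scales with the channel count, so dropping the vertical order shrinks the whole processing chain.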
Processing the sound field signal, that is, the ambient sound, includes the following steps:
Convert the ambient sound into its binaural virtual surround sound output signal, then mix the binaural virtual surround sound output signals of the audio object (here the audio object mainly refers to sound content other than the ambient sound) and of the ambient sound into the binaural output. Fig. 1 shows a block diagram of one embodiment of the method. Converting the ambient sound (the sound field signal in Fig. 1) into its binaural virtual surround sound output signal preferably includes the following steps:
Obtain the 1st-order Ambisonic B-format signal of the ambient sound;
According to the rotation angle, rotate the Ambisonic B-format signal of the ambient sound to obtain the rotated Ambisonic B-format signal. Specifically, generate a rotation matrix from the rotation angle, and use it to rotate the Ambisonic B-format signal of the ambient sound (the signal to be adjusted). Rotation here means multiplying the rotation matrix by the signal matrix to be adjusted; it does not change the magnitude of the audio signal components, only their direction. The order of the rotation matrix matches that of the audio signal matrix. For example, when the signal matrix to be adjusted is $[W_2\ X_2\ Y_2]^T$, the rotation matrix is [matrix not reproduced in the source text]; when the signal matrix to be adjusted is $[W_2\ X_2\ Y_2\ Z_2]^T$, the rotation matrix is [matrix not reproduced in the source text].
Convert the rotated Ambisonic B-format signal of the ambient sound into a virtual speaker array signal; binaurally transcode the virtual speaker array signal of the ambient sound using head-related transfer functions (HRTF), usually in two dimensions (that is, without height information), to obtain the binaural virtual surround sound output signal of the ambient sound. The time-domain counterpart of the HRTF is called the HRIR (head-related impulse response).
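The rotation step in the list above can be sketched for a pure yaw (left-right) head turn. The patent's own rotation matrices are not reproduced in the text, so the matrix here is an assumption based on a standard convention: X points to the front, Y to the left, azimuth is positive to the left, and a head turn by α makes every source appear at its original azimuth minus α. W and Z are unaffected by yaw.

```python
import math


def rotate_yaw(b_frame, head_yaw_deg):
    """Compensate a head yaw on one first-order frame [W, X, Y, Z].

    head_yaw_deg is positive when the head turns to the left (assumed convention).
    """
    w, x, y, z = b_frame
    a = math.radians(head_yaw_deg)
    c, s = math.cos(a), math.sin(a)
    # 2-D rotation of the (X, Y) pair by -head_yaw; magnitudes are preserved.
    return [w, c * x + s * y, -s * x + c * y, z]


# A source straight ahead (X = 1, Y = 0); the head turns 90 degrees to the left.
front = [1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0]
rotated = rotate_yaw(front, 90.0)
```

After the rotation the energy sits in negative Y, that is, on the listener's right, which matches the example given in the background section of a head turned 90 degrees to the left.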
Note that either BRIR or HRIR can be used to filter the audio object or the ambient sound, as required. A BRIR generally consists of a room model plus a set of HRIR/HRTF describing the sound directions, so if the input signal already carries room or environment information, HRIR is sufficient.
The method of generating virtual surround sound is preferably computed under the following assumptions: the virtual speaker array is left-right symmetric, the user is on the central axis of the room, and the binaural room impulse responses and head-related transfer functions corresponding to the user are also left-right symmetric. Under these assumptions, a symmetry optimization for higher-order Ambisonic B-format can be used, which greatly reduces the amount of computation and improves efficiency.
The following describes how to encode audio objects into the Ambisonic domain.
An audio object is encoded into a first-order Ambisonic signal as follows: [formula not reproduced in the source text]
$s_i$ is the i-th audio object, $i = 1..k$, where k is the number of audio objects. $\theta_i$ is the angle in the horizontal plane (azimuth) and $\phi_i$ is the angle in the vertical direction. The W channel signal represents the omnidirectional sound wave; the X, Y and Z channel signals represent the sound waves along the three mutually orthogonal spatial directions X, Y and Z respectively.
The first-order Ambisonic B-format signal is expressed as [formula not reproduced in the source text]
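Because the encoding formulas are missing from the text, the sketch below uses the traditional first-order B-format equations as a stand-in: W = Σ sᵢ/√2, X = Σ sᵢ cos θᵢ cos φᵢ, Y = Σ sᵢ sin θᵢ cos φᵢ, Z = Σ sᵢ sin φᵢ. Whether the patent uses exactly this normalisation (in particular the 1/√2 weight on W) is an assumption, as are the function name and the sample data.

```python
import math


def encode_first_order(objects):
    """Encode audio objects into first-order B-format.

    objects: list of (samples, azimuth_deg, elevation_deg); all sample lists
    have the same length. Returns the four channel lists W, X, Y, Z.
    """
    n = len(objects[0][0])
    w, x, y, z = ([0.0] * n for _ in range(4))
    for samples, az_deg, el_deg in objects:
        az, el = math.radians(az_deg), math.radians(el_deg)
        gw = 1.0 / math.sqrt(2.0)            # omnidirectional component
        gx = math.cos(az) * math.cos(el)     # front-back
        gy = math.sin(az) * math.cos(el)     # left-right
        gz = math.sin(el)                    # up-down
        for t, s in enumerate(samples):
            w[t] += gw * s
            x[t] += gx * s
            y[t] += gy * s
            z[t] += gz * s
    return w, x, y, z


# One object directly to the left (azimuth 90 degrees, elevation 0).
W, X, Y, Z = encode_first_order([([1.0, -1.0], 90.0, 0.0)])
```

Several objects are handled simply by summing their contributions per channel, which is why the channel count stays fixed no matter how many objects k are encoded.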
Similarly, encoding an audio object into a 2nd- or 3rd-order Ambisonic B-format signal is preferably done according to the definitions in the following table: [table not reproduced in the source text]
If a trigonometric function in the table above is an even function of the azimuth θ, the corresponding component of the Ambisonic B-format signal is left-right symmetric; if it is an odd function of θ, the corresponding component is left-right antisymmetric. For the first-order Ambisonic B-format signal, in terms of physical meaning and coordinates, w, x and z do not distinguish left from right. So if the listening position is symmetric, and the corresponding HRTF coefficients are assumed to be approximately symmetric as well, the binaural output components for w, x and z are identical in the left and right output channels. The y component, by contrast, is reversed between left and right, so its binaural output component has opposite sign in the two channels. For components with such symmetry, a fast algorithm (the symmetry optimization) can be used in the computation, which reduces the amount of computation further.
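The symmetry argument above can be sketched for the first-order case. Assuming one filter per B-format component for the left ear (`h_w`, `h_x`, `h_y`, `h_z`, all hypothetical names and toy values), the right ear reuses the same four filtered signals with the sign of the y contribution flipped, so four convolutions serve both ears instead of eight.

```python
def convolve(x, h):
    """Direct-form linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y


def binaural_symmetric(w, x, y, z, h_w, h_x, h_y, h_z):
    """Binaural output from first-order channels under left-right symmetry.

    All channels share one length and all filters share one length.
    Y is assumed positive towards the listener's left.
    """
    # Components that do not distinguish left from right are filtered once.
    sym = [a + b + c for a, b, c in
           zip(convolve(w, h_w), convolve(x, h_x), convolve(z, h_z))]
    # The y component is antisymmetric: same magnitude, opposite sign per ear.
    anti = convolve(y, h_y)
    left = [s + a for s, a in zip(sym, anti)]
    right = [s - a for s, a in zip(sym, anti)]
    return left, right


# Toy single-tap filters; a source on the left gives a positive Y sample.
left, right = binaural_symmetric([1.0], [0.0], [1.0], [0.0],
                                 [0.5], [0.3], [0.4], [0.1])
```

With Y set to zero (a source on the median plane) the two ear signals come out identical, which is the behaviour the paragraph above describes for w, x and z.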
In addition, because server-side processing of the audio file may introduce delay, the solution adopted is to obtain the rotation angle of the user's head and smooth it. Small angle changes therefore do not trigger processing for a new rotation direction, which effectively mitigates the server processing delay.
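The patent does not say how the angle is smoothed, so the following is only one possible sketch: a dead band that ignores small changes, combined with exponential smoothing of larger ones, with wrap-around handled at ±180 degrees. The class name, the threshold and the smoothing factor are all assumptions.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


class AngleSmoother:
    """Hypothetical smoother for the head rotation angle sent to the server."""

    def __init__(self, threshold_deg=2.0, alpha=0.5):
        self.threshold_deg = threshold_deg  # changes below this are ignored
        self.alpha = alpha                  # fraction of a larger change applied
        self.current = None                 # angle currently used for rendering

    def update(self, measured_deg):
        if self.current is None:
            self.current = wrap_deg(measured_deg)
            return self.current
        delta = wrap_deg(measured_deg - self.current)
        if abs(delta) < self.threshold_deg:
            # Small change: keep the current rotation, no new processing needed.
            return self.current
        self.current = wrap_deg(self.current + self.alpha * delta)
        return self.current


smoother = AngleSmoother(threshold_deg=2.0, alpha=0.5)
history = [smoother.update(a) for a in (10.0, 11.0, 20.0)]
print(history)
```

The trade-off in such a scheme is between stability and responsiveness: a wider dead band saves more server work but makes the sound field lag further behind small head movements.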
Embodiment two:
Fig. 2a-c describe an embodiment that improves the immersive experience by means of cloud-based multichannel audio transmission. Note that the present invention covers two application scenarios: (1) real-time audio communication (conference scenario), as shown in Fig. 2b; (2) audio download, as shown in Fig. 2c.
In both scenarios the input takes three forms: independent audio objects, sound field input (wxy format), and binaural recording signals.
As shown in Fig. 2b, for the audio download scenario:
The storage server stores binaural recording signals, Ambisonic recording signals (sound field signals) and/or audio objects, and the binaural transcoding server obtains these signals from the storage server. On the binaural transcoding server, the audio objects are converted into an Ambisonic signal, for example a first-order horizontal B-format signal (wxy), and added to the other wxy signals (sound field signals). The binaural transcoding server rotates the wxy signal with a rotation matrix according to the angle sent by the client's head tracking device, converts the wxy signal into two channels, and then superposes it with the binaural-recording binaural signal to generate the audio download file. Compression is usually needed to reduce the transmission bandwidth. The client then downloads the compressed two-channel audio. This approach is more efficient, but its drawback is that if the audio objects use only first-order B-format, the spatial localization resolution is reduced somewhat. Although cloud-based binaural processing is the preferred approach, the binaural processing can instead be placed on the client: the client then downloads the wxy signal from the server, and the rotation does not need to go through the server.
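The low-complexity path above adds the encoded audio objects to the sound field before rotating. Because encoding and rotation are both linear, this gives the same result as rotating each signal separately and mixing afterwards, while needing only one rotation. The sketch below checks this for a single horizontal wxy frame; the helper names, the sign convention for yaw and the sample values are assumptions.

```python
import math


def encode_wxy(sample, az_deg):
    """Encode one sample of an audio object into horizontal first-order wxy."""
    az = math.radians(az_deg)
    return [sample / math.sqrt(2.0), sample * math.cos(az), sample * math.sin(az)]


def rotate_wxy(frame, head_yaw_deg):
    """Compensate a head yaw (positive to the left) on one wxy frame."""
    w, x, y = frame
    a = math.radians(head_yaw_deg)
    c, s = math.cos(a), math.sin(a)
    return [w, c * x + s * y, -s * x + c * y]


def add_frames(*frames):
    """Superpose frames component by component."""
    return [sum(values) for values in zip(*frames)]


obj = encode_wxy(0.8, 30.0)       # one audio object at 30 degrees azimuth
ambient = [0.2, 0.1, -0.05]       # one frame of a recorded wxy sound field
yaw = 40.0

mixed_then_rotated = rotate_wxy(add_frames(obj, ambient), yaw)
rotated_then_mixed = add_frames(rotate_wxy(obj, yaw), rotate_wxy(ambient, yaw))
```

The same reasoning extends to the decoding and binaural filtering stages, which are also linear, so the mixed signal can pass through a single rotate-decode-filter chain.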
If higher spatial accuracy is required, the binaural transcoding server first rotates the audio objects according to the rotation angle, encodes the rotated audio object signals into higher-order B-format (for example, 3rd order), and superposes them with the other B-format signals in the two-channel domain: after binaural transcoding, the higher-order B-format audio-object binaural signal is generated and superposed with the Ambisonic-recording binaural signal and the binaural-recording binaural signal to generate the audio file.
Note that head tracking is only one form of input; other motion parameters, such as swaying, are not excluded, and the present invention applies to them equally.
As shown in Fig. 2c, for real-time audio communication (conference scenario):
The binaural transcoding server obtains signals directly from a binaural recording microphone array, an Ambisonic microphone array, independent sound sources or audio objects, and performs processing similar to that described above on these signals.
Embodiment three:
As shown in Fig. 3, a cloud audio processing server includes: an acquisition unit, which obtains the rotation angle of the user's head sent by the head tracking device in the client; a collection unit, which collects the binaural recording signal, the Ambisonic recording signal and the audio objects respectively; and a binaural transcoding unit, connected to the acquisition unit and the collection unit respectively, which binaurally transcodes the audio signal of each format according to the rotation angle. The binaural recording signal is interpolated according to the rotation angle to generate the binaural-recording binaural signal. If higher spatial accuracy is required, the binaural transcoding unit rotates the audio object signal according to the rotation angle, encodes the rotated audio object signal into a higher-order B-format audio object signal, and generates a higher-order B-format audio-object binaural signal after binaural transcoding; the superposition unit superposes the higher-order B-format audio-object binaural signal generated by the binaural transcoding unit with the Ambisonic-recording binaural signal and the binaural-recording binaural signal;
If low complexity and low latency are required, the binaural transcoding unit encodes the audio object signal into a first-order B-format audio object signal, superposes it with the other first-order Ambisonic recording signals, and then binaurally transcodes the resulting mixed signal according to the rotation angle to generate a mixed binaural signal of the audio objects and the Ambisonic recording signals; the superposition unit superposes the mixed binaural signal generated by the binaural transcoding unit with the binaural-recording binaural signal to obtain the binaural virtual surround sound output signal.
This embodiment uses a cloud server to solve the problem of transmitting and playing multichannel audio with head tracking support.
Embodiment four:
As shown in Fig. 4, an audio processing system of the present invention mainly comprises a client, a storage server and a cloud audio processing server. The client includes a head tracking module, and the storage server holds multi-track audio files stored in a specific way. The client's head tracking module obtains the user's head movement, such as the head rotation angle, and uploads the parameters over the Internet to one or more cloud audio processing servers on the server side, which process the multi-track audio files accordingly: the cloud audio processing server extracts the audio signals of different formats from the storage server, generates the binaural virtual surround sound output signal according to the received rotation angle, and transmits the binaurally transcoded audio file to the client over the network.
The client downloads the processed audio file and preferably plays it in two-channel stereo format.
The preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the present invention is not limited to the specific details of the above embodiments. Within the technical concept of the present invention, a variety of simple variations can be made to its technical solution, and all of these simple variations fall within the scope of protection of the present invention.
It should further be noted that the specific technical features described in the above embodiments can be combined in any suitable manner, provided they do not contradict one another. To avoid unnecessary repetition, the present invention does not describe the various possible combinations separately.
In addition, the different embodiments of the present invention can also be combined with one another in any way; as long as such a combination does not depart from the idea of the present invention, it should likewise be regarded as content disclosed by the present invention.
Claims (12)
1. a high in the clouds audio-frequency processing method, it is characterised in that: described audio-frequency processing method comprises the following steps,
Obtain the anglec of rotation of user's end rotation;
Obtain the audio signal of different-format, according to the described anglec of rotation, respectively the audio signal of described different-format is carried out ears transcoding, generate the binaural audio signal of corresponding format;
Binaural signal superposition to described corresponding format, obtains audio frequency ears output virtual ring around acoustical signal.
2. high in the clouds according to claim 1 audio-frequency processing method, it is characterised in that:
The audio signal of described different-format includes Double-ear type sound-recording signal, Ambisonic recorded audio signals and audio object signal.
3. The cloud audio processing method according to claim 2, characterized in that:
Performing binaural transcoding on the audio signals of different formats to generate the binaural audio signals of the corresponding formats specifically includes:
For the binaural recording signal, interpolating according to the rotation angle to generate a binaural-recording binaural signal;
For the Ambisonic recording signal, adjusting the Ambisonic recording signal according to the rotation angle, and binaurally transcoding the adjusted Ambisonic recording signal to generate an Ambisonic-recording binaural signal;
For the audio object signal, adjusting the audio object signal according to the rotation angle, and binaurally transcoding the adjusted audio object signal to generate an audio-object binaural signal.
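The claim does not say how the Ambisonic signal is adjusted. One common approach for a first-order B-format signal is to counter-rotate the sound field about the vertical axis by the head's yaw angle, which only mixes the X and Y channels. The sketch below assumes that approach, an X-front / Y-left / Z-up channel convention, and counterclockwise-positive yaw; the function name `rotate_foa_yaw` is hypothetical.

```python
import math
from typing import List, Tuple

Channels = Tuple[List[float], List[float], List[float], List[float]]


def rotate_foa_yaw(
    w: List[float],
    x: List[float],
    y: List[float],
    z: List[float],
    head_yaw_deg: float,
) -> Channels:
    """Compensate a head yaw by rotating the sound field the opposite way.

    Assumed convention: X points front, Y points left, Z points up, and a
    positive yaw means the head turns counterclockwise (to the left).
    W (omnidirectional) and Z (vertical) are unaffected by a pure yaw.
    """
    theta = math.radians(head_yaw_deg)
    c, s = math.cos(theta), math.sin(theta)
    # Rotating the field by -theta: a source at azimuth a moves to a - theta.
    x_rot = [c * xi + s * yi for xi, yi in zip(x, y)]
    y_rot = [-s * xi + c * yi for xi, yi in zip(x, y)]
    return list(w), x_rot, y_rot, list(z)
```

For example, a source directly to the listener's left (X = 0, Y = 1) ends up straight ahead (X = 1, Y = 0) after the head turns 90 degrees to the left.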
4. The cloud audio processing method according to claim 3, characterized in that:
If higher spatial accuracy is required, the audio object signal is rotated according to the rotation angle, the rotated audio object signal is encoded into a higher-order B-format audio object signal, and binaural transcoding then generates a higher-order B-format audio-object binaural signal, which is superposed with the Ambisonic-recording binaural signal and the binaural-recording binaural signal;
If low complexity and low latency are required, the audio object signal is encoded into a first-order B-format audio object signal and superposed with the other first-order Ambisonic recording signals; the superposed mixed signal is then binaurally transcoded according to the rotation angle to generate a mixed binaural signal of the audio object and Ambisonic recording signals, which is superposed with the binaural-recording binaural signal.
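The low-complexity branch can be pictured as follows: a mono audio object is panned into first-order B-format and added channel by channel to an existing first-order Ambisonic recording, so that a single binaural transcoding pass handles both. The sketch assumes the traditional B-format gains (W weighted by 1/sqrt(2), X front, Y left, Z up), which the claim does not specify; `encode_object_foa` and `mix_foa` are hypothetical names.

```python
import math
from typing import List, Tuple

Channels = Tuple[List[float], List[float], List[float], List[float]]


def encode_object_foa(
    samples: List[float], azimuth_deg: float, elevation_deg: float = 0.0
) -> Channels:
    """Pan a mono object into first-order B-format (W, X, Y, Z).

    Assumed gains: W = 1/sqrt(2), X = cos(az)cos(el),
    Y = sin(az)cos(el), Z = sin(el), with azimuth measured
    counterclockwise from the front.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    gw = 1.0 / math.sqrt(2.0)
    gx = math.cos(az) * math.cos(el)
    gy = math.sin(az) * math.cos(el)
    gz = math.sin(el)
    return (
        [gw * s for s in samples],
        [gx * s for s in samples],
        [gy * s for s in samples],
        [gz * s for s in samples],
    )


def mix_foa(a: Channels, b: Channels) -> Channels:
    """Superpose two first-order B-format signals channel by channel."""
    mixed = tuple(
        [sa + sb for sa, sb in zip(ch_a, ch_b)] for ch_a, ch_b in zip(a, b)
    )
    return mixed  # type: ignore[return-value]
```

Because rotation and binaural transcoding are applied once to the mixed signal, the cost no longer grows with the number of objects, at the price of first-order spatial resolution for every object.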
5. The cloud audio processing method according to any one of claims 1-4, characterized in that:
Obtaining the rotation angle of the user's head specifically comprises obtaining the rotation angle of the user's head and smoothing the rotation angle.
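The claim requires smoothing but does not name a method. A minimal sketch, assuming simple exponential smoothing with wraparound handling so that a step from 170 degrees to -170 degrees is treated as a 20 degree move rather than a 340 degree one, might look like this. The class name `AngleSmoother` and the `alpha` parameter are hypothetical.

```python
from typing import Optional


def wrap_deg(angle: float) -> float:
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


class AngleSmoother:
    """Exponentially smooth a stream of head-rotation angles (degrees).

    alpha in (0, 1]: larger values follow the tracker more closely,
    smaller values suppress jitter more strongly. This is an assumed
    smoothing scheme, not the one described in the patent.
    """

    def __init__(self, alpha: float = 0.2) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state: Optional[float] = None

    def update(self, measured_deg: float) -> float:
        """Feed one tracker reading and return the smoothed angle."""
        if self._state is None:
            self._state = wrap_deg(measured_deg)
        else:
            # Move along the shortest arc toward the new measurement.
            diff = wrap_deg(measured_deg - self._state)
            self._state = wrap_deg(self._state + self.alpha * diff)
        return self._state
```

The smoothed angle would then be passed to the binaural transcoding step in place of the raw tracker reading.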
6. A cloud audio processing server, characterized in that the server includes:
An acquiring unit, which obtains the rotation angle of the user's head;
A collecting unit, which collects audio signals of different formats;
A binaural transcoding unit, connected to the acquiring unit and the collecting unit respectively, which performs binaural transcoding on the audio signal of each format according to the rotation angle to generate a binaural audio signal of the corresponding format;
A superposition unit, connected to the binaural transcoding unit, which superposes the binaural signals of the corresponding formats to obtain a virtual surround sound signal for binaural audio output.
7. The cloud audio processing server according to claim 6, characterized in that:
The audio signals of different formats include a binaural recording signal, an Ambisonic recording signal, and an audio object signal.
8. The cloud audio processing server according to claim 7, characterized in that:
The binaural transcoding unit performing binaural transcoding on the audio signals of different formats to generate the binaural audio signals of the corresponding formats specifically includes:
For the binaural recording signal, interpolating according to the rotation angle to generate a binaural-recording binaural signal;
For the Ambisonic recording signal, adjusting the Ambisonic recording signal according to the rotation angle, and binaurally transcoding the adjusted Ambisonic recording signal to generate an Ambisonic-recording binaural signal;
For the audio object signal, adjusting the audio object signal according to the rotation angle, and binaurally transcoding the adjusted audio object signal to generate an audio-object binaural signal.
9. The cloud audio processing server according to claim 8, characterized in that:
If higher spatial accuracy is required, the binaural transcoding unit rotates the audio object signal according to the rotation angle, encodes the rotated audio object signal into a higher-order B-format audio object signal, and generates a higher-order B-format audio-object binaural signal after binaural transcoding; the superposition unit superposes the higher-order B-format audio-object binaural signal, the Ambisonic-recording binaural signal, and the binaural-recording binaural signal generated by the binaural transcoding unit;
If low complexity and low latency are required, the binaural transcoding unit encodes the audio object signal into a first-order B-format audio object signal, superposes it with the other first-order Ambisonic recording signals, and then binaurally transcodes the superposed mixed signal according to the rotation angle to generate a mixed binaural signal of the audio object and Ambisonic recording signals; the superposition unit superposes the mixed binaural signal and the binaural-recording binaural signal generated by the binaural transcoding unit.
10. The cloud audio processing server according to any one of claims 6-9, characterized in that:
The cloud server further includes a smoothing unit connected to the binaural transcoding unit and the acquiring unit respectively; the smoothing unit receives the rotation angle of the user's head from the acquiring unit and smooths the rotation angle.
11. An audio playback system, characterized in that: the system includes the cloud audio processing server according to claims 6-10 and a client; the client includes a head tracking device, which captures the head rotation angle and uploads it to the cloud audio processing server over a network; the cloud audio processing server obtains audio signals of different formats, generates the virtual surround sound signal for binaural audio output according to the rotation angle, and transmits it to the client over the network.
12. The audio playback system according to claim 11, characterized in that: the system further includes a storage server that stores the audio signals of different formats; when a user requests a download, the cloud audio processing server extracts the audio signals from the storage server.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610017000.1A CN105682000B (en) | 2016-01-11 | 2016-01-11 | A kind of audio-frequency processing method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610017000.1A CN105682000B (en) | 2016-01-11 | 2016-01-11 | A kind of audio-frequency processing method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105682000A (en) | 2016-06-15 |
CN105682000B (en) | 2017-11-07 |
Family
ID=56300173
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610017000.1A Active CN105682000B (en) | 2016-01-11 | 2016-01-11 | A kind of audio-frequency processing method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105682000B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106210990A (en) * | 2016-07-13 | 2016-12-07 | 北京时代拓灵科技有限公司 | A kind of panorama sound audio processing method |
CN106851482A (en) * | 2017-03-24 | 2017-06-13 | 北京时代拓灵科技有限公司 | A kind of panorama sound loudspeaker body-sensing real-time interaction system and exchange method |
CN108877815A (en) * | 2017-05-16 | 2018-11-23 | 华为技术有限公司 | A kind of processing stereo signals method and device |
WO2019200996A1 (en) * | 2018-04-19 | 2019-10-24 | 北京微播视界科技有限公司 | Multi-voice channel audio processing method and device, and computer readable storage medium |
CN113709654A (en) * | 2021-08-27 | 2021-11-26 | 维沃移动通信(杭州)有限公司 | Recording method, recording apparatus, recording device and readable storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4199658A (en) * | 1977-09-10 | 1980-04-22 | Victor Company Of Japan, Limited | Binaural sound reproduction system |
CN101491116A (en) * | 2006-07-07 | 2009-07-22 | 贺利实公司 | Method and apparatus for creating a multi-dimensional communication space for use in a binaural audio system |
CN102027535A (en) * | 2008-04-11 | 2011-04-20 | 诺基亚公司 | Processing of signals |
CN102435139A (en) * | 2010-09-08 | 2012-05-02 | 哈曼贝克自动系统股份有限公司 | Head tracking system with improved detection of head rotation |
CN105120421A (en) * | 2015-08-21 | 2015-12-02 | 北京时代拓灵科技有限公司 | Method and apparatus of generating virtual surround sound |
CN105376690A (en) * | 2015-11-04 | 2016-03-02 | 北京时代拓灵科技有限公司 | Method and device of generating virtual surround sound |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106210990A (en) * | 2016-07-13 | 2016-12-07 | 北京时代拓灵科技有限公司 | A kind of panorama sound audio processing method |
CN106851482A (en) * | 2017-03-24 | 2017-06-13 | 北京时代拓灵科技有限公司 | A kind of panorama sound loudspeaker body-sensing real-time interaction system and exchange method |
CN108877815A (en) * | 2017-05-16 | 2018-11-23 | 华为技术有限公司 | A kind of processing stereo signals method and device |
CN108877815B (en) * | 2017-05-16 | 2021-02-23 | 华为技术有限公司 | Stereo signal processing method and device |
US11200907B2 (en) | 2017-05-16 | 2021-12-14 | Huawei Technologies Co., Ltd. | Stereo signal processing method and apparatus |
US11763825B2 (en) | 2017-05-16 | 2023-09-19 | Huawei Technologies Co., Ltd. | Stereo signal processing method and apparatus |
WO2019200996A1 (en) * | 2018-04-19 | 2019-10-24 | 北京微播视界科技有限公司 | Multi-voice channel audio processing method and device, and computer readable storage medium |
CN113709654A (en) * | 2021-08-27 | 2021-11-26 | 维沃移动通信(杭州)有限公司 | Recording method, recording apparatus, recording device and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN105682000B (en) | 2017-11-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10674262B2 (en) | Merging audio signals with spatial metadata | |
US10820134B2 (en) | Near-field binaural rendering | |
US10609504B2 (en) | Audio signal processing method and apparatus for binaural rendering using phase response characteristics | |
US10349197B2 (en) | Method and device for generating and playing back audio signal | |
US10924876B2 (en) | Interpolating audio streams | |
JP7038725B2 (en) | Audio signal processing method and equipment | |
CN105682000A (en) | Audio processing method and system | |
CN105376690A (en) | Method and device of generating virtual surround sound | |
US20200145776A1 (en) | Concept for generating an enhanced sound-field description or a modified sound field description using a multi-layer description | |
CN106210990B (en) | A kind of panorama sound audio processing method | |
US11617051B2 (en) | Streaming binaural audio from a cloud spatial audio processing system to a mobile station for playback on a personal audio delivery device | |
US20200168235A1 (en) | Method for conversion, stereophonic encoding, decoding and transcoding of a three-dimensional audio signal | |
Rafaely et al. | Spatial audio signal processing for binaural reproduction of recorded acoustic scenes–review and challenges | |
Shivappa et al. | Efficient, compelling, and immersive vr audio experience using scene based audio/higher order ambisonics | |
KR102114440B1 (en) | Matrix decoder with constant-power pairwise panning | |
TW202133625A (en) | Selecting audio streams based on motion | |
TW202105164A (en) | Audio rendering for low frequency effects | |
Suzuki et al. | 3D spatial sound systems compatible with human's active listening to realize rich high-level kansei information | |
Sun | Immersive audio, capture, transport, and rendering: a review | |
WO2022110723A1 (en) | Audio encoding and decoding method and apparatus | |
WO2022110722A1 (en) | Audio encoding/decoding method and device | |
Gölles et al. | Cat3DA-Camera-Tracked 3D Audio Player | |
Paterson et al. | Producing 3-D audio | |
WO2022262758A1 (en) | Audio rendering system and method and electronic device | |
WO2022262750A1 (en) | Audio rendering system and method, and electronic device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |