CN105682000A - Audio processing method and system - Google Patents

Audio processing method and system

Info

Publication number
CN105682000A
Authority
CN
China
Prior art keywords
signal
audio
ears
rotation
format
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610017000.1A
Other languages
Chinese (zh)
Other versions
CN105682000B (en)
Inventor
张晨
孙学京
刘皓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tuoling Inc
Original Assignee
Beijing Tuoling Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tuoling Inc filed Critical Beijing Tuoling Inc
Priority to CN201610017000.1A priority Critical patent/CN105682000B/en
Publication of CN105682000A publication Critical patent/CN105682000A/en
Application granted granted Critical
Publication of CN105682000B publication Critical patent/CN105682000B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303 Tracking of listener position or orientation
    • H04S7/304 For headphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/305 Electronic adaptation of stereophonic audio signals to reverberation of the listening space
    • H04S7/306 For headphones

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Stereophonic System (AREA)

Abstract

The invention relates to a cloud audio processing method, server and system. For audio signals in different formats, the method performs binaural transcoding on each format separately, based on the head rotation angle reported by the client, to generate a binaural audio signal for each format; the binaural signals are then superposed to obtain a binaural output virtual surround sound signal. Because the audio processing is carried out on a cloud server, the method is well suited to cloud-based audio processing and storage, and it alleviates both the low quality of virtual surround sound generated on mobile terminals and their heavy computational burden. In addition, to address the delay that server-side processing may introduce, the method further comprises smoothing the rotation angle to mitigate the effect of that delay.

Description

Audio processing method and system
Technical field
The present invention relates to the field of signal processing, and in particular to an audio processing method, server and system.
Background
When a virtual reality headset (head-mounted display, HMD) presents content to a user, virtual 3D audio technology is used and the audio content is played to the user through stereo headphones. One way to improve the sense of presence is to track the user's head movements (head tracking) and process the sound accordingly. For example, if the original sound is perceived as coming from directly ahead, then after the user turns the head 90 degrees to the left the sound should be processed so that the user perceives it as coming from 90 degrees to the right. The virtual reality device can be of many types, for example a display device with head tracking, or simply a pair of stereo headphones with a head-tracking sensor.
Head tracking can also be implemented in several ways. A common one is to use multiple motion sensors. A motion sensor suite usually includes an accelerometer, a gyroscope and a magnetometer. For motion tracking and absolute orientation, each sensor has its own inherent strengths and weaknesses. A common practice is therefore sensor fusion, which combines the signals from the individual sensors to produce a more accurate motion estimate.
Once the head rotation angle is obtained, the sound must be transformed accordingly. One approach is to convert the sound to the Ambisonic domain and then transform the signal with a rotation matrix. An Ambisonic signal usually has more than two channels, whereas common media players support only two-channel stereo, which makes it difficult to play Ambisonic or other multichannel audio signals directly.
In view of this, the field needs an effective solution for generating and playing high-quality virtual surround sound.
Summary of the invention
To overcome the above drawbacks of the prior art, the object of the invention is to provide a cloud audio processing method, server and system that generate virtual surround sound effectively and with high quality. It is mainly intended for audio playback over stereo headphones used together with a virtual reality headset. The virtual surround sound is generated on a cloud server, which fits existing cloud-based network architectures well: the server generates and stores the virtual surround sound, which solves the problem that existing clients cannot play the various kinds of 360° 3D audio, in particular audio for virtual reality applications.
To achieve this object, the invention provides a cloud audio processing method comprising the following steps: obtaining the rotation angle of the user's head; obtaining audio signals in different formats and, according to the rotation angle, performing binaural transcoding on each of them to generate a binaural audio signal for each format; and superposing the binaural signals of the respective formats to obtain a binaural output virtual surround sound signal.
Preferably, the audio signals in different formats include a binaural recording signal, an Ambisonic recording signal and an audio object signal.
Preferably, performing binaural transcoding on the audio signals in different formats to generate the binaural audio signal for each format specifically includes:
For the binaural recording signal, interpolating according to the rotation angle to generate a binaural-recording binaural signal;
For the Ambisonic recording signal, adjusting the Ambisonic recording signal according to the rotation angle and binaurally transcoding the adjusted signal to generate an Ambisonic-recording binaural signal;
For the audio object signal, adjusting the audio object signal according to the rotation angle and binaurally transcoding the adjusted signal to generate an audio-object binaural signal.
Preferably, if higher spatial accuracy is required, the audio object signal is rotated according to the rotation angle, the rotated audio object signal is encoded as a higher-order B-format audio object signal, binaural transcoding then generates a higher-order B-format audio-object binaural signal, and this is superposed with the Ambisonic-recording binaural signal and the binaural-recording binaural signal;
If low complexity and low latency are required, the audio object signal is encoded as a first-order B-format audio object signal and superposed with the other first-order Ambisonic recording signals; the resulting mixed signal is then binaurally transcoded according to the rotation angle to generate a mixed binaural signal of the audio objects and the Ambisonic recording signals, which is superposed with the binaural-recording binaural signal.
Preferably, obtaining the rotation angle of the user's head specifically means obtaining the rotation angle of the user's head and smoothing that rotation angle.
The invention also provides a cloud audio processing server, comprising: an acquisition unit, which obtains the rotation angle of the user's head; a collection unit, which collects audio signals in different formats; a binaural transcoding unit, connected to the acquisition unit and the collection unit, which performs binaural transcoding on each of the audio signals in different formats according to the rotation angle to generate a binaural audio signal for each format; and a superposition unit, connected to the binaural transcoding unit, which superposes the binaural signals of the respective formats to obtain a binaural output virtual surround sound signal.
Preferably, the audio signals in different formats include a binaural recording signal, an Ambisonic recording signal and an audio object signal.
Preferably, the binaural transcoding unit performing binaural transcoding on the audio signals in different formats to generate the binaural audio signal for each format specifically includes:
For the binaural recording signal, interpolating according to the rotation angle to generate a binaural-recording binaural signal;
For the Ambisonic recording signal, adjusting the Ambisonic recording signal according to the rotation angle and binaurally transcoding the adjusted signal to generate an Ambisonic-recording binaural signal;
For the audio object signal, adjusting the audio object signal according to the rotation angle and binaurally transcoding the adjusted signal to generate an audio-object binaural signal.
Preferably, if higher spatial accuracy is required, the binaural transcoding unit rotates the audio object signal according to the rotation angle, encodes the rotated audio object signal as a higher-order B-format audio object signal, and generates a higher-order B-format audio-object binaural signal by binaural transcoding; the superposition unit superposes the higher-order B-format audio-object binaural signal, the Ambisonic-recording binaural signal and the binaural-recording binaural signal generated by the binaural transcoding unit;
If low complexity and low latency are required, the binaural transcoding unit encodes the audio object signal as a first-order B-format audio object signal, superposes it with the other first-order Ambisonic recording signals, and then binaurally transcodes the mixed signal according to the rotation angle to generate a mixed binaural signal of the audio objects and the Ambisonic recording signals; the superposition unit superposes the mixed binaural signal and the binaural-recording binaural signal generated by the binaural transcoding unit.
Preferably, the cloud server further comprises a smoothing unit, connected to the binaural transcoding unit and to the acquisition unit; the smoothing unit receives the rotation angle of the user's head from the acquisition unit and smooths it.
The invention also provides an audio playback system comprising a cloud audio processing server and a client. The client includes a head tracking device, which captures the head rotation angle and uploads it over a network to the cloud audio processing server; the cloud audio processing server receives the rotation angle, generates the binaural output virtual surround sound signal, and transmits it to the client over the network.
The cloud audio processing method, server and system according to the invention generate virtual surround sound effectively and with high quality. They are mainly intended for audio playback over stereo headphones used together with a virtual reality headset. The virtual surround sound is generated on a cloud server, which fits existing cloud-based network architectures well: the cloud server performs the audio processing and storage, which solves the problem that existing clients cannot play the various kinds of 360° 3D audio, in particular audio for virtual reality applications.
With the cloud audio processing technique of the invention, the sense of presence in multi-party voice communication can be greatly improved: the user can freely turn the head to attend to sound from a particular direction, which is closer to a real multi-person conversation. In streaming scenarios in particular, adjusting the spatial orientation of the audio in real time improves the user's listening experience. When combined with virtual reality video content, the user experience improves further.
Brief description of the drawings
Fig. 1 is a block diagram of one embodiment of the cloud audio processing method of the invention;
Fig. 2a-c are block diagrams of another embodiment of the cloud audio processing method of the invention;
Fig. 3 is a structural diagram of an embodiment of the audio processing server of the invention;
Fig. 4 is a structural diagram of an embodiment of the audio processing system of the invention.
Detailed description of the embodiments
Embodiment one: as shown in Fig. 1, processing an audio object comprises the following steps:
Obtain the user's head rotation angle from the head tracking device;
According to the rotation angle, encode the audio object as a higher-order (preferably 2nd- or 3rd-order) Ambisonic B-format signal;
Convert the Ambisonic B-format signal into a virtual speaker array signal. Taking a first-order B-format signal $[W_1\ X_1\ Y_1\ Z_1]^T$ as an example, it is converted into the virtual speaker array signal $[L_1\ L_2\ \dots\ L_N]^T$ by the following operation:
$$
\begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_N \end{bmatrix}
=
\begin{bmatrix}
G_{w1} & G_{x1} & G_{y1} & G_{z1} \\
G_{w2} & G_{x2} & G_{y2} & G_{z2} \\
\vdots & \vdots & \vdots & \vdots \\
G_{wN} & G_{xN} & G_{yN} & G_{zN}
\end{bmatrix}
\begin{bmatrix} W_1 \\ X_1 \\ Y_1 \\ Z_1 \end{bmatrix}
= G \begin{bmatrix} W_1 \\ X_1 \\ Y_1 \\ Z_1 \end{bmatrix}.
$$
Here N is the number of virtual speakers in the virtual speaker layout. The matrix G in the formula above is the Ambisonic decoding matrix, which can be obtained by computing a pseudo-inverse.
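The pseudo-inverse mentioned above can be written out as follows. This is a sketch under assumptions the text does not state: virtual speaker $n$ is taken to sit at azimuth $\theta_n$ and elevation $\phi_n$, and the matrix $C$ (a name introduced here for illustration) stacks the first-order encoding gains of the speaker directions, using the same gains as the first-order encoding equations of this description.
$$
C = \begin{bmatrix} c_1 & c_2 & \cdots & c_N \end{bmatrix},
\qquad
c_n = \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ \cos\theta_n \cos\phi_n \\ \sin\theta_n \cos\phi_n \\ \sin\phi_n \end{bmatrix},
\qquad
G = C^{+} = C^{T}\left(C\,C^{T}\right)^{-1}.
$$
The closed form on the right holds when $N \ge 4$ and $C$ has full row rank. In that case $C\,G = I$, so re-encoding the decoded speaker feeds returns the original B-format signal.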
Binaurally transcode the virtual speaker array signal of the audio object using binaural room impulse responses (BRIR), usually in three dimensions (that is, including elevation information), to obtain the binaural output virtual surround sound signal of the audio object. Specifically, the two-channel stereo BRIR matrix that maps the virtual speaker signals to the headphone signals is applied to the virtual speaker array signal by matrix multiplication, which yields the virtual surround sound.
The BRIR matrix is
$$
\begin{bmatrix}
B_1^L & B_2^L & \cdots & B_N^L \\
B_1^R & B_2^R & \cdots & B_N^R
\end{bmatrix},
$$
so the virtual surround sound is
$$
\begin{bmatrix} L \\ R \end{bmatrix}
=
\begin{bmatrix}
B_1^L & B_2^L & \cdots & B_N^L \\
B_1^R & B_2^R & \cdots & B_N^R
\end{bmatrix}
\begin{bmatrix} L_1 \\ L_2 \\ \vdots \\ L_N \end{bmatrix}
=
\begin{bmatrix}
F_W^L & F_X^L & F_Y^L & F_Z^L \\
F_W^R & F_X^R & F_Y^R & F_Z^R
\end{bmatrix}
\begin{bmatrix} W_1 \\ X_1 \\ Y_1 \\ Z_1 \end{bmatrix}.
$$
There may be one or more such audio signals.
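The matrix product above can be sketched in the time domain as follows, assuming each BRIR entry is a finite impulse response and that "multiplication" by it means convolution. The function names and the tiny two-speaker example data are illustrative only and are not taken from the patent.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binauralize(speaker_feeds, brirs_left, brirs_right):
    """Sum each virtual speaker feed filtered by its left and right BRIR.

    Assumes all feeds share one length and all BRIRs share one length.
    """
    n_out = len(speaker_feeds[0]) + len(brirs_left[0]) - 1
    left = [0.0] * n_out
    right = [0.0] * n_out
    for feed, h_left, h_right in zip(speaker_feeds, brirs_left, brirs_right):
        for k, v in enumerate(convolve(feed, h_left)):
            left[k] += v
        for k, v in enumerate(convolve(feed, h_right)):
            right[k] += v
    return left, right


# Two virtual speakers with made-up two-tap impulse responses.
feeds = [[1.0, 0.0], [0.0, 1.0]]
brirs_l = [[1.0, 0.5], [0.25, 0.0]]
brirs_r = [[0.25, 0.0], [1.0, 0.5]]
left, right = binauralize(feeds, brirs_l, brirs_r)
```

A practical renderer would more likely use FFT-based block convolution, since measured BRIRs are typically thousands of taps long; the direct form is used here only to keep the sketch short.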
The binaural room impulse responses are preferably generated offline, either by real measurement or with dedicated software. Unlike the online generation used in the prior art, it is therefore unnecessary to store a large number of BRIRs, which reduces memory consumption.
When an audio object is encoded as an Ambisonic B-format signal, the horizontal order is preferably greater than or equal to the vertical order. For example, when the horizontal encoding is 3rd-order Ambisonic B-format, the vertical encoding is preferably 2nd-order or 1st-order, denoted H3V2 and H3V1 respectively. Because human perception resolves elevation less finely than horizontal angle, reducing the order in one particular direction in this way cuts the amount of computation without significantly degrading the user's perception of the sound.
Processing the sound field signal (ambient sound) comprises the following steps:
Convert the ambient sound into the binaural output virtual surround sound signal of the ambient sound, then mix the binaural output virtual surround sound signals of the audio object (here meaning the sound content other than the ambient sound) and of the ambient sound into the binaural output. Fig. 1 shows a block diagram of one embodiment of this method. Converting the ambient sound (the sound field signal in Fig. 1) into its binaural output virtual surround sound signal preferably includes the following steps:
Obtain the 1st-order Ambisonic B-format signal of the ambient sound;
According to the rotation angle, rotate the Ambisonic B-format signal of the ambient sound to obtain the rotated Ambisonic B-format signal. Specifically, a rotation matrix is generated from the rotation angle, and the Ambisonic B-format signal of the ambient sound (the signal to be adjusted) is rotated with that matrix. Rotation here means multiplying the rotation matrix by the signal matrix to be adjusted; it does not change the magnitude of the audio signal components, only their direction. The order of the rotation matrix matches that of the audio signal matrix. For example, when the signal matrix to be adjusted is $[W_2\ X_2\ Y_2]^T$, the rotation matrix is
$$
\begin{bmatrix}
1 & 0 & 0 \\
0 & \cos\theta & -\sin\theta \\
0 & \sin\theta & \cos\theta
\end{bmatrix};
$$
when the signal matrix to be adjusted is $[W_2\ X_2\ Y_2\ Z_2]^T$, the rotation matrix is
$$
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\theta & -\sin\theta & 0 \\
0 & \sin\theta & \cos\theta & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}.
$$
Convert the rotated Ambisonic B-format signal of the ambient sound into a virtual speaker array signal, then binaurally transcode that virtual speaker array signal using head-related transfer functions (HRTF), usually in two dimensions (that is, without elevation information), to obtain the binaural output virtual surround sound signal of the ambient sound. The time-domain counterpart of the HRTF is the HRIR (head-related impulse response).
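The rotation step above can be sketched per sample as follows. The function name is illustrative; the text does not say whether θ is the head angle itself or its negative, so the sign convention here is an assumption.

```python
import math


def rotate_b_format(frame, theta):
    """Rotate one first-order B-format sample [W, X, Y, Z] about the
    vertical axis by theta radians, using the 4x4 matrix given above."""
    w, x, y, z = frame
    c, s = math.cos(theta), math.sin(theta)
    # W (omnidirectional) and Z (vertical) are unaffected by a yaw rotation.
    return [w, c * x - s * y, s * x + c * y, z]


# A component lying purely on X moves onto Y after a 90 degree rotation.
rotated = rotate_b_format([0.707, 1.0, 0.0, 0.0], math.radians(90))
```

Because the matrix is orthogonal, the energy in the X and Y channels is preserved, which matches the statement that rotation changes only the direction of the components.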
Note that either BRIR or HRIR filtering can be used for the audio object or the ambient sound as required. A BRIR generally consists of a room model plus a set of HRIRs/HRTFs describing the sound direction, so if the input signal already carries room or environment information, HRIR filtering is sufficient.
When computing the virtual surround sound, the method preferably relies on the following assumptions: the virtual speaker array is left-right symmetric, the user is on the central axis of the room, and the user's binaural room impulse responses and head-related transfer functions are also left-right symmetric. Under these assumptions, a symmetry optimization for higher-order Ambisonic B-format can be used, which substantially reduces the amount of computation and improves efficiency.
The following describes how audio objects are encoded into the Ambisonic domain.
Audio objects are encoded as a first-order Ambisonic signal:
$$
W = \frac{1}{k}\sum_{i=1}^{k} s_i \left[\frac{1}{\sqrt{2}}\right];
$$
$$
X = \frac{1}{k}\sum_{i=1}^{k} s_i \left[\cos\theta_i \cos\phi_i\right];
$$
$$
Y = \frac{1}{k}\sum_{i=1}^{k} s_i \left[\sin\theta_i \cos\phi_i\right];
$$
$$
Z = \frac{1}{k}\sum_{i=1}^{k} s_i \left[\sin\phi_i\right];
$$
Here $s_i$ is the i-th audio object, $i = 1, \dots, k$, and $k$ is the number of audio objects. $\theta_i$ is the angle in the horizontal plane (azimuth) and $\phi_i$ is the angle in the vertical direction (elevation). The W channel signal represents the omnidirectional sound wave; the X, Y and Z channel signals represent the sound waves along the three mutually orthogonal spatial directions X, Y and Z.
The first-order Ambisonic B-format signal is written as $[W_1\ X_1\ Y_1\ Z_1]^T$.
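The first-order encoding equations above can be sketched for a single sample instant as follows. Angles are assumed to be in radians, and the function name is illustrative rather than taken from the patent.

```python
import math


def encode_first_order(samples, azimuths, elevations):
    """Encode k audio-object samples into one B-format frame [W, X, Y, Z].

    samples[i] is the current sample of object i; azimuths[i] and
    elevations[i] are its direction in radians.
    """
    k = len(samples)
    objects = list(zip(samples, azimuths, elevations))
    w = sum(s / math.sqrt(2) for s, _, _ in objects) / k
    x = sum(s * math.cos(t) * math.cos(p) for s, t, p in objects) / k
    y = sum(s * math.sin(t) * math.cos(p) for s, t, p in objects) / k
    z = sum(s * math.sin(p) for s, _, p in objects) / k
    return [w, x, y, z]


# One object straight ahead contributes only to W and X.
front = encode_first_order([1.0], [0.0], [0.0])

# Two equal objects at +90 and -90 degrees cancel in Y.
pair = encode_first_order(
    [1.0, 1.0], [math.pi / 2, -math.pi / 2], [0.0, 0.0]
)
```

Running the function once per sample (or vectorizing it over a block) yields the four channel signals that the later rotation and decoding steps consume.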
Similarly, encoding audio objects as 2nd-order or 3rd-order Ambisonic B-format signals preferably follows the definitions in the table below:
If a trigonometric function in the table above is an even function of the azimuth θ, the corresponding component of the Ambisonic B-format signal is left-right symmetric; if it is an odd function of θ, the corresponding component is left-right antisymmetric. For a first-order Ambisonic B-format signal, it follows from the physical meaning and the coordinate system that w, x and z do not distinguish left from right. So if the listening position is symmetric, and the corresponding HRTF coefficients are assumed to be approximately symmetric as well, the binaural output components contributed by w, x and z are identical for the left and right output channels. The y component, in contrast, is reversed between left and right, so its binaural output components have opposite signs in the two channels. For components with this symmetry a fast algorithm, the symmetry optimization, can be used during computation to further reduce the amount of computation.
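The first-order case of the symmetry optimization can be sketched as follows: the w, x and z contributions are filtered once and shared by both ears, and the y contribution is filtered once and added for one ear and subtracted for the other, giving four convolutions instead of eight. The filter names `f_w`, `f_x`, `f_y`, `f_z` stand for the left-ear filters and are placeholders, as is the assignment of the plus sign to the left ear.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_symmetric(w, x, y, z, f_w, f_x, f_y, f_z):
    """Render first-order B-format to two ears using left-right symmetry.

    Assumes all channels share one length and all filters share one length.
    """
    # Symmetric part: identical in both ears, computed once.
    sym = [
        a + b + c
        for a, b, c in zip(convolve(w, f_w), convolve(x, f_x), convolve(z, f_z))
    ]
    # Antisymmetric part: opposite sign in the two ears.
    anti = convolve(y, f_y)
    left = [s + a for s, a in zip(sym, anti)]
    right = [s - a for s, a in zip(sym, anti)]
    return left, right


left, right = binaural_symmetric(
    [1.0], [0.0], [1.0], [0.0], [0.5], [0.5], [0.25], [0.5]
)
```

When the y channel is silent the two ears receive identical signals, which is the expected result for a sound field that is symmetric about the median plane.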
In addition, because processing the audio on the server may introduce delay, the solution adopted is to obtain the rotation angle of the user's head and smooth it. Small angle changes therefore do not trigger a new rotation computation, which effectively mitigates the delay of server-side processing.
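The text does not specify the smoothing method. One possible sketch, given purely as an assumption, combines a dead band (changes below a threshold are ignored) with exponential smoothing along the shortest angular path. The class name and the `alpha` and `deadband_deg` parameters are hypothetical.

```python
class AngleSmoother:
    """Smooths a stream of head yaw angles given in degrees."""

    def __init__(self, alpha=0.2, deadband_deg=2.0):
        self.alpha = alpha                # smoothing factor in (0, 1]
        self.deadband_deg = deadband_deg  # changes below this are ignored
        self.value = None                 # last smoothed angle in [0, 360)

    def update(self, angle_deg):
        if self.value is None:
            self.value = angle_deg % 360.0
            return self.value
        # Shortest signed difference, in the range [-180, 180).
        diff = (angle_deg - self.value + 180.0) % 360.0 - 180.0
        if abs(diff) < self.deadband_deg:
            return self.value  # small change: keep the previous rotation
        self.value = (self.value + self.alpha * diff) % 360.0
        return self.value


smoother = AngleSmoother(alpha=0.5, deadband_deg=2.0)
first = smoother.update(10.0)   # initial angle is taken as-is
second = smoother.update(11.0)  # 1 degree change falls inside the dead band
third = smoother.update(30.0)   # moves halfway from 10 toward 30
```

With this design the server only needs to regenerate the rotation matrix when `update` returns a new value, which is one way to realize the statement that small angle changes do not trigger new processing.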
Embodiment two:
Fig. 2a-c describe an embodiment that improves the immersive experience based on cloud multichannel audio transmission. Note that the invention covers two application scenarios: (1) real-time audio communication (conference scenario), as shown in Fig. 2b; (2) audio download, as shown in Fig. 2c.
In both scenarios the input takes three forms: independent audio objects, sound field input (wxy format), and binaural recording signals.
As shown in Fig. 2b, for the audio download scenario:
The storage server stores binaural recording signals, Ambisonic recording signals (sound field signals) and/or audio objects, and the binaural transcoding server obtains these signals from the storage server. On the binaural transcoding server, the audio objects are converted into an Ambisonic signal, for example a first-order horizontal B-format signal (wxy), and added to the other wxy signals (sound field signals). The binaural transcoding server rotates the wxy signal with a rotation matrix according to the angle sent by the client's head tracking device, converts the wxy signal to two channels, and then superposes it with the binaural-recording binaural signal to produce the audio file for download. Compression is usually required to reduce the transmission bandwidth. The client then downloads the compressed two-channel audio. This approach is more efficient, but its drawback is that if the audio objects use only first-order B-format, the spatial localization resolution drops somewhat. The preferred approach is to perform binaural processing in the cloud service; if binaural processing is instead placed on the client, the client downloads the wxy signal from the server and the rotation no longer needs to pass through the server.
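The efficiency of the path described above rests on linearity: encoding, mixing and rotation are all linear, so rotating the mix once gives the same result as rotating every source separately. The sketch below checks this for one horizontal wxy sample. The function names and numeric values are illustrative only, and a single object is encoded, so the 1/k averaging of the general encoding formula reduces to 1.

```python
import math


def encode_wxy(sample, azimuth):
    """Encode one audio-object sample as horizontal first-order [W, X, Y]."""
    return [
        sample / math.sqrt(2),
        sample * math.cos(azimuth),
        sample * math.sin(azimuth),
    ]


def rotate_wxy(frame, theta):
    """Rotate a [W, X, Y] frame about the vertical axis by theta radians."""
    w, x, y = frame
    c, s = math.cos(theta), math.sin(theta)
    return [w, c * x - s * y, s * x + c * y]


def mix(frames):
    """Channel-wise sum of several [W, X, Y] frames."""
    return [sum(channel) for channel in zip(*frames)]


obj = encode_wxy(1.0, math.radians(30))  # an audio object at 30 degrees
field = [0.2, 0.1, -0.3]                 # one sample of a recorded sound field
theta = 0.7                              # rotation angle in radians

mixed_then_rotated = rotate_wxy(mix([obj, field]), theta)
rotated_then_mixed = mix([rotate_wxy(obj, theta), rotate_wxy(field, theta)])
```

The same argument applies to the binaural transcoding stage, which is why the number of rotations and transcodes in this path does not grow with the number of audio objects.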
If higher spatial accuracy is required, the binaural transcoding server first rotates the audio objects according to the rotation angle, encodes the rotated audio object signal as higher-order B-format (for example 3rd-order), and superposes it with the other B-format signals in the two-channel domain: binaural transcoding generates the higher-order B-format audio-object binaural signal, which is superposed with the Ambisonic-recording binaural signal and the binaural-recording binaural signal to produce the audio file.
Note that head rotation is only one form of tracked motion; other motion parameters, such as swaying, are not excluded, and the invention applies to them equally.
As shown in Fig. 2c, for real-time audio communication (conference scenario):
The binaural transcoding server obtains signals directly from a binaural recording microphone array, an Ambisonic microphone array, and independent sound sources or audio objects, and performs a processing procedure similar to the one above on these signals.
Embodiment three:
As shown in Fig. 3, a cloud audio processing server comprises: an acquisition unit, which obtains the rotation angle of the user's head sent by the head tracking device in the client; a collection unit, which collects the binaural recording signal, the Ambisonic recording signal and the audio objects; and a binaural transcoding unit, connected to the acquisition unit and the collection unit, which performs binaural transcoding on each of the audio signals in different formats according to the rotation angle. For the binaural recording signal, it interpolates according to the rotation angle to generate the binaural-recording binaural signal. If higher spatial accuracy is required, the binaural transcoding unit rotates the audio object signal according to the rotation angle, encodes the rotated audio object signal as a higher-order B-format audio object signal, and generates a higher-order B-format audio-object binaural signal by binaural transcoding; the superposition unit superposes the higher-order B-format audio-object binaural signal, the Ambisonic-recording binaural signal and the binaural-recording binaural signal generated by the binaural transcoding unit;
If low complexity and low latency are required, the binaural transcoding unit encodes the audio object signal as a first-order B-format audio object signal, superposes it with the other first-order Ambisonic recording signals, and then binaurally transcodes the mixed signal according to the rotation angle to generate a mixed binaural signal of the audio objects and the Ambisonic recording signals; the superposition unit superposes the mixed binaural signal and the binaural-recording binaural signal generated by the binaural transcoding unit to obtain the binaural output virtual surround sound signal.
This embodiment uses a cloud server to solve the problem of transmitting and playing multichannel audio with head-tracking support.
Embodiment four:
As shown in Fig. 4, an audio processing system of the invention mainly comprises a client, a storage server and a cloud audio processing server. The client includes a head tracking module, and the storage server holds multitrack audio files stored in a specific way. The client's head tracking module obtains the user's head movement, such as the head rotation angle, and uploads the parameters over the Internet to one or more cloud audio processing servers on the server side, which process the multitrack audio files accordingly: the cloud audio processing server extracts the audio signals in different formats from the storage server, generates the binaural output virtual surround sound signal according to the received rotation angle, and transmits the binaurally transcoded audio file to the client over the network.
The client downloads the processed audio file and preferably plays it in two-channel stereo format.
The preferred embodiments of the invention have been described in detail above with reference to the drawings. However, the invention is not limited to the specific details of these embodiments; within the technical concept of the invention, various simple variations can be made to the technical solution, and these simple variations all fall within the scope of protection of the invention.
It should further be noted that the specific technical features described in the embodiments above can be combined in any suitable manner provided they do not contradict one another. To avoid unnecessary repetition, the invention does not describe the possible combinations separately.
Furthermore, the different embodiments of the invention can also be combined with one another arbitrarily; as long as a combination does not depart from the idea of the invention, it should likewise be regarded as part of the disclosure of the invention.

Claims (12)

1. a high in the clouds audio-frequency processing method, it is characterised in that: described audio-frequency processing method comprises the following steps,
Obtain the anglec of rotation of user's end rotation;
Obtain the audio signal of different-format, according to the described anglec of rotation, respectively the audio signal of described different-format is carried out ears transcoding, generate the binaural audio signal of corresponding format;
Binaural signal superposition to described corresponding format, obtains audio frequency ears output virtual ring around acoustical signal.
2. high in the clouds according to claim 1 audio-frequency processing method, it is characterised in that:
The audio signal of described different-format includes Double-ear type sound-recording signal, Ambisonic recorded audio signals and audio object signal.
3. The cloud audio processing method according to claim 2, characterized in that:
binaurally transcoding the audio signals of the different formats to generate the binaural audio signals of the corresponding formats specifically includes:
for the binaural recording signal, interpolating according to the rotation angle to generate a binaural recording binaural signal;
for the Ambisonic recorded audio signal, adjusting the Ambisonic recorded audio signal according to the rotation angle, and binaurally transcoding the adjusted Ambisonic recorded audio signal to generate an Ambisonic recording binaural signal;
for the audio object signal, adjusting the audio object signal according to the rotation angle, and binaurally transcoding the adjusted audio object signal to generate an audio object binaural signal.
4. The cloud audio processing method according to claim 3, characterized in that:
if higher spatial accuracy is required, the audio object signal is rotated according to the rotation angle, the rotated audio object signal is encoded into a higher-order B-format audio object signal, a higher-order B-format audio object binaural signal is generated after binaural transcoding, and this signal is superimposed with the Ambisonic recording binaural signal and the binaural recording binaural signal;
if low complexity and low latency are required, the audio object signal is encoded into a first-order B-format audio object signal and superimposed with the other first-order Ambisonic recorded audio signals, the resulting mixed signal is then binaurally transcoded according to the rotation angle to generate a mixed binaural signal of the audio object and the Ambisonic recorded audio signals, and this mixed binaural signal is superimposed with the binaural recording binaural signal.
5. The cloud audio processing method according to any one of claims 1-4, characterized in that:
obtaining the rotation angle of the user's head rotation specifically comprises obtaining the rotation angle of the user's head rotation and smoothing the rotation angle.
6. A cloud audio processing server, characterized in that the server includes:
an acquiring unit, which obtains a rotation angle of a user's head rotation;
a collecting unit, which collects audio signals of different formats;
a binaural transcoding unit, connected to the acquiring unit and the collecting unit respectively, which binaurally transcodes the audio signals of the different formats respectively according to the rotation angle to generate binaural audio signals of the corresponding formats;
a superposition unit, connected to the binaural transcoding unit, which superimposes the binaural signals of the corresponding formats to obtain a binaural-output virtual surround sound signal.
7. The cloud audio processing server according to claim 6, characterized in that:
the audio signals of the different formats include binaural recording signals, Ambisonic recorded audio signals, and audio object signals.
8. The cloud audio processing server according to claim 7, characterized in that:
the binaural transcoding unit binaurally transcoding the audio signals of the different formats to generate the binaural audio signals of the corresponding formats specifically includes:
for the binaural recording signal, interpolating according to the rotation angle to generate a binaural recording binaural signal;
for the Ambisonic recorded audio signal, adjusting the Ambisonic recorded audio signal according to the rotation angle, and binaurally transcoding the adjusted Ambisonic recorded audio signal to generate an Ambisonic recording binaural signal;
for the audio object signal, adjusting the audio object signal according to the rotation angle, and binaurally transcoding the adjusted audio object signal to generate an audio object binaural signal.
9. The cloud audio processing server according to claim 8, characterized in that:
if higher spatial accuracy is required, the binaural transcoding unit rotates the audio object signal according to the rotation angle, encodes the rotated audio object signal into a higher-order B-format audio object signal, and generates a higher-order B-format audio object binaural signal after binaural transcoding; the superposition unit superimposes the higher-order B-format audio object binaural signal generated by the binaural transcoding unit with the Ambisonic recording binaural signal and the binaural recording binaural signal;
if low complexity and low latency are required, the binaural transcoding unit encodes the audio object signal into a first-order B-format audio object signal, superimposes it with the other first-order Ambisonic recorded audio signals, and then binaurally transcodes the resulting mixed signal according to the rotation angle to generate a mixed binaural signal of the audio object and the Ambisonic recorded audio signals; the superposition unit superimposes the mixed binaural signal generated by the binaural transcoding unit with the binaural recording binaural signal.
10. The cloud audio processing server according to any one of claims 6-9, characterized in that:
the cloud server further includes a smoothing unit, connected to the binaural transcoding unit and the acquiring unit respectively; the smoothing unit receives the rotation angle of the user's head rotation from the acquiring unit and smooths the rotation angle.
11. An audio playback system, characterized in that: the system includes the cloud audio processing server according to any one of claims 6-10 and a client; the client includes a head tracking device, which captures the head rotation angle and uploads it to the cloud audio processing server over a network; the cloud audio processing server obtains audio signals of different formats, generates the binaural-output virtual surround sound signal according to the rotation angle, and transmits it to the client over the network.
12. The audio playback system according to claim 11, characterized in that: the system further includes a storage server, which stores the audio signals of different formats; when a user requests a download, the cloud audio processing server extracts the audio signals from the storage server.
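
The low-complexity path of claims 4 and 5 (smooth the head angle, encode an audio object to first-order B-format, mix it with an Ambisonic bed, rotate, decode, superimpose) can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: all function names are hypothetical, exponential smoothing is an assumed choice (the claims do not name a smoothing method), the B-format uses an assumed horizontal-only W/X/Y layout with W scaled by 1/√2 and azimuth measured counterclockwise from the front, and the decode step is a plain virtual-microphone stereo stand-in where a real system would apply HRTF filtering.

```python
import math


def smooth_angle(prev_deg, new_deg, alpha=0.5):
    """Exponentially smooth a yaw angle (degrees), assumed smoothing method.

    The difference is wrapped to [-180, 180) so the angle takes the
    shortest path across the 0/360 boundary.
    """
    diff = (new_deg - prev_deg + 180.0) % 360.0 - 180.0
    return (prev_deg + alpha * diff) % 360.0


def encode_first_order(sample, azimuth_deg):
    """Encode one mono object sample into horizontal first-order B-format (W, X, Y).

    Assumed convention: W carries the sample scaled by 1/sqrt(2),
    X points to the front, Y points to the left.
    """
    az = math.radians(azimuth_deg)
    return (sample / math.sqrt(2.0), sample * math.cos(az), sample * math.sin(az))


def mix_b_format(*signals):
    """Superimpose several B-format samples channel by channel."""
    return tuple(sum(channel) for channel in zip(*signals))


def rotate_yaw(wxy, head_yaw_deg):
    """Counter-rotate the sound field by the head yaw so sources stay fixed in the room."""
    w, x, y = wxy
    a = math.radians(head_yaw_deg)
    return (w, x * math.cos(a) + y * math.sin(a), -x * math.sin(a) + y * math.cos(a))


def decode_stereo_placeholder(wxy):
    """Stand-in for binaural transcoding: two virtual cardioids facing left and right.

    A real renderer would convolve with HRTFs; this only keeps the sketch runnable.
    """
    w, _x, y = wxy
    left = 0.5 * (math.sqrt(2.0) * w + y)
    right = 0.5 * (math.sqrt(2.0) * w - y)
    return (left, right)


def superpose(*pairs):
    """Superimpose several (left, right) samples into one output pair."""
    return tuple(sum(channel) for channel in zip(*pairs))


if __name__ == "__main__":
    yaw = smooth_angle(80.0, 100.0)            # smoothed head yaw in degrees
    obj = encode_first_order(1.0, 90.0)        # object located to the listener's left
    bed = (0.0, 0.0, 0.0)                      # silent Ambisonic bed for the demo
    rotated = rotate_yaw(mix_b_format(obj, bed), yaw)
    binaural_recording = (0.1, 0.1)            # already-binaural stream, passed through
    print(superpose(decode_stereo_placeholder(rotated), binaural_recording))
```

Mixing the object into the first-order bed before rotation means a single rotation and a single decode serve every source, which is the trade-off the low-complexity branch of claim 4 describes; the higher-accuracy branch would instead rotate each object and encode it at a higher order.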
CN201610017000.1A 2016-01-11 2016-01-11 Audio processing method and system Active CN105682000B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610017000.1A CN105682000B (en) 2016-01-11 2016-01-11 Audio processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610017000.1A CN105682000B (en) 2016-01-11 2016-01-11 Audio processing method and system

Publications (2)

Publication Number Publication Date
CN105682000A (en) 2016-06-15
CN105682000B (en) 2017-11-07

Family

ID=56300173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610017000.1A Active CN105682000B (en) 2016-01-11 2016-01-11 Audio processing method and system

Country Status (1)

Country Link
CN (1) CN105682000B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106210990A (en) * 2016-07-13 2016-12-07 北京时代拓灵科技有限公司 Panoramic sound audio processing method
CN106851482A (en) * 2017-03-24 2017-06-13 北京时代拓灵科技有限公司 Panoramic sound loudspeaker somatosensory real-time interaction system and interaction method
CN108877815A (en) * 2017-05-16 2018-11-23 华为技术有限公司 Stereo signal processing method and device
WO2019200996A1 (en) * 2018-04-19 2019-10-24 北京微播视界科技有限公司 Multi-voice channel audio processing method and device, and computer readable storage medium
CN113709654A (en) * 2021-08-27 2021-11-26 维沃移动通信(杭州)有限公司 Recording method, recording apparatus, recording device and readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4199658A (en) * 1977-09-10 1980-04-22 Victor Company Of Japan, Limited Binaural sound reproduction system
CN101491116A (en) * 2006-07-07 2009-07-22 贺利实公司 Method and apparatus for creating a multi-dimensional communication space for use in a binaural audio system
CN102027535A (en) * 2008-04-11 2011-04-20 诺基亚公司 Processing of signals
CN102435139A (en) * 2010-09-08 2012-05-02 哈曼贝克自动系统股份有限公司 Head tracking system with improved detection of head rotation
CN105120421A (en) * 2015-08-21 2015-12-02 北京时代拓灵科技有限公司 Method and apparatus of generating virtual surround sound
CN105376690A (en) * 2015-11-04 2016-03-02 北京时代拓灵科技有限公司 Method and device of generating virtual surround sound

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106210990A (en) * 2016-07-13 2016-12-07 北京时代拓灵科技有限公司 Panoramic sound audio processing method
CN106851482A (en) * 2017-03-24 2017-06-13 北京时代拓灵科技有限公司 Panoramic sound loudspeaker somatosensory real-time interaction system and interaction method
CN108877815A (en) * 2017-05-16 2018-11-23 华为技术有限公司 Stereo signal processing method and device
CN108877815B (en) * 2017-05-16 2021-02-23 华为技术有限公司 Stereo signal processing method and device
US11200907B2 (en) 2017-05-16 2021-12-14 Huawei Technologies Co., Ltd. Stereo signal processing method and apparatus
US11763825B2 (en) 2017-05-16 2023-09-19 Huawei Technologies Co., Ltd. Stereo signal processing method and apparatus
WO2019200996A1 (en) * 2018-04-19 2019-10-24 北京微播视界科技有限公司 Multi-voice channel audio processing method and device, and computer readable storage medium
CN113709654A (en) * 2021-08-27 2021-11-26 维沃移动通信(杭州)有限公司 Recording method, recording apparatus, recording device and readable storage medium

Also Published As

Publication number Publication date
CN105682000B (en) 2017-11-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant