CN104143334B - Programmable graphics processor and method for mixing multi-channel audio using the same - Google Patents

Programmable graphics processor and method for mixing multi-channel audio using the same

Info

Publication number
CN104143334B
Authority
CN
China
Prior art keywords
audio
audio data
graphics processor
programmable graphics
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310170251.XA
Other languages
Chinese (zh)
Other versions
CN104143334A (en)
Inventor
屈振华
龙显军
陈珣
万军
马涛
孙健
梅平
贺征
李屹寰
白冰
郭英
江洪
尹梅
刘豪
陈宇华
钟远晖
余冬苹
林涛
许捷翰
张海涛
叶文超
梁铮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN201310170251.XA
Publication of CN104143334A
Application granted
Publication of CN104143334B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a programmable graphics processor and a method for mixing multi-channel audio with it, and relates to the field of digital signal processing. The method comprises: storing the N channels of audio data to be played within a period of time in an input texture buffer, and binding the N channels of input audio data to the texture data of the programmable graphics processor; preprocessing all channels of audio data simultaneously on the programmable graphics processor, where at least one of decoding, resampling, or attenuation is applied to each channel according to that channel's encoding, sample rate, or volume; and superimposing the preprocessed channels into a single audio signal on the programmable graphics processor, storing that signal in an output texture buffer, and outputting it to a playback device. By exploiting the parallel computing capability of the programmable graphics processor, the invention significantly reduces mixing latency.

Description

Programmable graphics processor and method for mixing multi-channel audio using the same
Technical field
The present invention relates to the field of digital signal processing, and more particularly to a method for mixing multi-channel audio using the parallel processing capability of a programmable graphics processor, and to such a programmable graphics processor.
Background
Audio mixing is the process of feeding multiple audio stream signals into a mixing device at the same time and combining them into a single output audio stream. Mixing is generally divided into hardware mixing and software mixing.
Hardware mixing uses dedicated sound-processing hardware, such as professional mixing equipment or a sound card, for preprocessing and mixing. Professional mixing equipment, such as the Pioneer SVM-1000VDJ mixing console, is bulky and expensive; it suits recording studios and broadcast studios but not portable personal playback devices. Sound-card mixing is the approach most commonly used on personal computers (PCs): a custom digital audio processing IC decodes, resamples, and attenuates the multi-channel audio, and more advanced sound cards also support echo suppression, ambient-noise suppression, and speech enhancement on the microphone input. However, sound-card mixing mainly targets audio produced by physical devices in the PC, such as line input, the PC speaker, the microphone, CD playback, and MIDI playback. Audio played simultaneously by several audio applications must instead be mixed in software by the operating system.
Software mixing performs sound-signal processing and mixing in software on the central processing unit (CPU). During mixing, the audio may also undergo processing such as attenuation and resampling. Operating-system mixing is one implementation of software mixing: an audio player calls the audio playback application programming interface (API) provided by the operating system to submit the audio data to be played, and the operating system mixes all submitted audio before outputting it to the sound card or audio playback device. Software mixing relies on the CPU's computing capability to mix multi-channel audio and does not depend on specific hardware. When the operating system is heavily loaded, or when there are too many concurrent audio channels (for example in multiplayer games or multi-party calls), mixing is delayed and the mixing result suffers.
Many software and hardware vendors have proposed techniques to accelerate this mixing process and reduce mixing and sound-processing latency. For example, some high-end discrete sound cards with audio processing chips provide hardware-accelerated mixing, which can noticeably reduce CPU usage. However, this approach significantly increases the manufacturing cost and size of the device, and it does not apply to devices such as current PCs and smartphones, which generally use inexpensive integrated audio or system-on-chip (SoC) sound hardware. Microsoft's DirectSound™ technology provides a set of dedicated APIs that read and write the sound-card driver's audio buffer directly, which reduces memory-access latency and improves mixing efficiency. In essence, however, these techniques are still limited by the computing capability of the CPU.
It is therefore necessary to propose a mixing technique that suits portable devices and reduces mixing latency without depending on the operating system's computing capability.
Summary of the invention
The technical problem to be solved by an embodiment of the present invention is to provide a method for mixing multi-channel audio using the parallel processing capability of a programmable graphics processor. This addresses the problem that current portable devices rely on the operating system's computing capability for mixing, which causes mixing delay when the operating system is heavily loaded or there are too many concurrent audio channels.
One aspect of the embodiments of the present invention provides a method for mixing multi-channel audio based on a programmable graphics processor, comprising: storing the N channels of audio data to be played within a period of time in an input texture buffer, and binding the N channels of input audio data to the texture data of the programmable graphics processor; preprocessing all channels of audio data simultaneously on the programmable graphics processor, where at least one of decoding, resampling, or attenuation is applied to each channel according to that channel's encoding, sample rate, or volume; and superimposing the preprocessed channels into a single audio signal on the programmable graphics processor, storing that signal in an output texture buffer, and outputting it to a playback device.
Another aspect of the embodiments of the present invention provides a programmable graphics processor for mixing multi-channel audio, comprising: a texture data binding module, configured to store the N channels of audio data to be played within a period of time in an input texture buffer and to bind the N channels of input audio data to the texture data of the programmable graphics processor; a preprocessing module, configured to preprocess all channels of audio data simultaneously, applying at least one of decoding, resampling, or attenuation to each channel according to that channel's encoding, sample rate, or volume; and a superposition output module, configured to superimpose the preprocessed channels into a single audio signal, store that signal in an output texture buffer, and output it to a playback device.
By binding the N channels of input audio data to the texture data of the programmable graphics processor, the present invention uses the processor's parallel computing capability to preprocess all channels simultaneously (decoding, resampling, attenuation, and so on), then superimposes the preprocessed channels into a single audio signal and outputs it to the playback device. Because the whole mixing process runs on the programmable graphics processor rather than relying on the operating system's computing capability, mixing can be completed in real time even when the operating system is heavily loaded, which significantly reduces mixing latency and improves the user experience. In addition, the programmable graphics processor can apply complex processing such as decoding, resampling, and attenuation to the audio, so audio quality improves along with real-time performance. The proposed scheme is particularly suitable for portable multimedia playback devices, such as PCs and smartphones, that have a powerful programmable graphics processor but relatively weak integrated sound hardware.
Other features and advantages of the invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the principle by which the present invention mixes multi-channel audio based on a programmable graphics processor.
Fig. 2 is a flow chart of one embodiment of the method of the present invention for mixing multi-channel audio based on a programmable graphics processor.
Fig. 3 is a flow chart of one embodiment in which the programmable graphics processor of the present invention preprocesses each channel of audio data.
Fig. 4 is a structural diagram of one embodiment of the programmable graphics processor of the present invention for mixing multi-channel audio.
Fig. 5 is a structural diagram of another embodiment of the programmable graphics processor of the present invention for mixing multi-channel audio.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application, or its use. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Fig. 1 is a schematic diagram of the principle by which the present invention mixes multi-channel audio based on a programmable graphics processor (GPU for short). As shown in Fig. 1, audio mixing is the process of feeding multiple audio stream signals into a mixing device at the same time and combining them into a single output audio stream.
Most current portable multimedia playback devices, such as PCs and smartphones, are equipped with a relatively powerful programmable graphics processor but have comparatively weak integrated sound hardware. Examples of such processors are graphics chips that support GPU general-purpose computing languages such as CUDA, GLSL, and OpenCL, including NVIDIA's GeForce and Tegra series, ATI's Radeon series, ARM's Mali series, Qualcomm's Adreno series, Vivante's GC series, and Imagination's PowerVR series. For this situation, the present invention proposes a scheme that mixes multi-channel audio on the programmable graphics processor. Its advantage is that the parallel computing capability of the programmable graphics processor removes the dependence on the operating system's computing capability: even when the operating system is heavily loaded, the programmable graphics processor can complete mixing in real time, which greatly reduces mixing latency and improves the user experience. In addition, the programmable graphics processor can apply more complex processing to the audio, such as decoding, resampling, and attenuation, so audio quality improves along with real-time performance. The proposed mixing scheme is described in detail below.
Fig. 2 is a flow chart of one embodiment of the method of the present invention for mixing multi-channel audio based on a programmable graphics processor. As shown in Fig. 2, the mixing method of this embodiment comprises the following steps:
Step 102: store the N channels of audio data to be played within a period of time in an input texture buffer, and bind the N channels of input audio data to the texture data of the programmable graphics processor.
Suppose the $i$-th channel of audio data (possibly A-law or μ-law compressed) is $S_i(j)$, $j = 1 \ldots k_i$. Each channel, of length $1 \times k_i$, is stored as elements 1 through $k_i$ of the $i$-th row of a two-dimensional texture array $B$ of size $N \times K$, where $N$ is the number of audio channels and $K$ is the maximum length over all channels. The texture coordinate of the $n$-th sample of the $i$-th input channel is $(i, n)$, and the texture coordinate of the $n$-th output audio sample is $(1, n)$.
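The row-per-channel layout described above can be modelled on the CPU as follows. This is only an illustrative Python sketch, not GPU code: the function name `pack_tracks` is hypothetical, indices are 0-based rather than the 1-based indices used in the text, and padding short rows with zeros is an assumption, since the description does not say what the unused elements of a row contain.

```python
def pack_tracks(tracks, pad=0):
    """Pack N variable-length channels into an N x K row-major grid.

    K is the length of the longest channel. Shorter rows are padded
    with `pad` (zero padding is an assumption, not stated in the source).
    """
    k = max(len(t) for t in tracks)
    return [list(t) + [pad] * (k - len(t)) for t in tracks]


# Two channels of different length, as in a 44-sample / 48-sample case.
tracks = [[1, 2, 3], [4, 5, 6, 7]]
grid = pack_tracks(tracks)
n_channels, max_len = len(grid), len(grid[0])
```

Here `grid[i][n]` plays the role of the texture element at coordinate $(i, n)$.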
Step 104: the programmable graphics processor preprocesses all channels of audio data simultaneously. During preprocessing, at least one of decoding, resampling, or attenuation is applied to each channel according to that channel's encoding, sample rate, or volume.
Step 106: the programmable graphics processor superimposes the preprocessed channels into a single audio signal, stores that signal in an output texture buffer, and outputs it to a playback device.
Owing to the parallel nature of the programmable graphics processor, all output audio samples are computed in parallel. Therefore, the computation of only one output sample is given here as an example; the other output samples are computed in the same way.
Fig. 3 is a flow chart of one embodiment in which the programmable graphics processor of the present invention preprocesses each channel of audio data. As shown in Fig. 3, the preprocessing of this embodiment comprises the following steps:
Step 202: before preprocessing each channel of audio data, compute the coordinate $(x, y)$ in the input texture buffer of the sample value of the $i$-th input channel that is needed for the $n$-th output audio sample:
$x = i$,
$y = n \times k_i / N$,
where $k_i$ is the number of elements of the $i$-th input channel, $i \in N$, $x$ is an integer, $y$ may be an integer or a fraction, and $N$ denotes the number of audio channels. (In the worked example later in this description, the denominator of $y$ is the number of output samples, out_len.)
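The coordinate mapping of step 202 can be sketched in Python as below. The helper name `source_coord` is hypothetical, and the sketch assumes that the denominator is the number of output samples (`out_len`), which is what the worked example and the GLSL listing later in this description use.

```python
def source_coord(i, n, k_i, out_len):
    """Map output sample n to a coordinate (x, y) in channel i's row.

    x selects the row (channel); y is the possibly fractional position
    along that row, scaled by the ratio of input to output length.
    """
    return i, n * k_i / out_len


# Channel 1 has 44 samples, the output has 48.
x, y = source_coord(1, 12, 44, 48)   # lands exactly on a source sample
x2, y2 = source_coord(1, 1, 44, 48)  # falls between two source samples
```

A fractional `y` is what triggers the interpolation described in step 204.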
Step 204: perform the resampling operation according to the coordinate $(x, y)$.
A texture buffer behaves much like an array: the value stored at a specified coordinate can be accessed (read or written). The difference is that when the coordinate being read is not an integer, the programmable graphics processor interpolates the value at that position from the sample values at the neighbouring integer coordinates.
An exemplary resampling operation proposed by the present invention comprises:
(1) Fetch from the input texture buffer the sample values $s_1$ and $s_2$ located at the coordinates $(x, \lfloor y \rfloor)$ and $(x, \lceil y \rceil)$ respectively, where $\lfloor \cdot \rfloor$ denotes rounding down and $\lceil \cdot \rceil$ denotes rounding up.
One texture mapping method is as follows. Suppose the $i$-th channel of audio data (possibly A-law or μ-law compressed) is $S_i(j)$, $j = 1 \ldots k_i$. Each channel, of length $1 \times k_i$, is stored as elements 1 through $k_i$ of the $i$-th row of a two-dimensional texture array $B$ of size $N \times K$, where $N$ is the number of audio channels and $K$ is the maximum length over all channels. The texture coordinate of the $n$-th sample of the $i$-th input channel is $(i, n)$, and the texture coordinate of the $n$-th output audio sample is $(1, n)$.
(2) If $s_1$ and $s_2$ are not encoded, obtain the sample value $s$ at $(x, y)$ by interpolation according to the formula $s = (y - \lfloor y \rfloor)\, s_2 + (\lceil y \rceil - y)\, s_1$;
(3) If $s_1$ and $s_2$ are encoded, obtain the sample value $s$ at $(x, y)$ by interpolation according to the formula $s = (y - \lfloor y \rfloor)\, s'_2 + (\lceil y \rceil - y)\, s'_1$, where $s'_1$ is the decoded value of $s_1$ and $s'_2$ is the decoded value of $s_2$.
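A CPU-side Python sketch of this two-point linear interpolation follows. The name `resample_at` is hypothetical and indices are 0-based. Note one deliberate deviation: with weights $(y - \lfloor y \rfloor)$ and $(\lceil y \rceil - y)$, both weights are zero when $y$ is an integer, so the sketch uses $w = y - \lfloor y \rfloor$ and $1 - w$ instead. The two forms agree for every non-integer $y$. Clamping the upper neighbour at the end of the row is an additional assumption.

```python
import math


def resample_at(row, y):
    """Linearly interpolate `row` at fractional index y (0-based)."""
    y1 = math.floor(y)
    # Clamp the upper neighbour so y close to the end stays in range
    # (boundary handling is an assumption, not given in the source).
    y2 = min(math.ceil(y), len(row) - 1)
    w = y - y1                      # fractional part of y
    return (1.0 - w) * row[y1] + w * row[y2]


row = [0.0, 10.0, 20.0]
mid = resample_at(row, 0.5)        # halfway between 0.0 and 10.0
exact = resample_at(row, 1.0)      # integer position returns the sample
quarter = resample_at(row, 1.25)   # a quarter of the way from 10.0 to 20.0
```

For encoded input, the same function would be applied to the decoded values $s'_1$ and $s'_2$.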
Step 206: if $s_1$ and $s_2$ are encoded, decode the audio data before resampling.
An exemplary decoding operation proposed by the present invention comprises:
(1) Bind the A-law and/or μ-law decoding look-up table array to the texture data of the programmable graphics processor. The value of this texture at coordinate $(1, u)$ is the decoded output corresponding to $u$, where $u$ ranges over $[0, 255]$. The A-law decoding look-up table is shown in Table 1, which is appended at the end of the description. Similarly, a μ-law decoding look-up table can be constructed from the correspondence between sample values before and after μ-law encoding.
(2) Compute the coordinate $(u_1, v_1)$ of the audio sample value $s_1$ in the texture buffer of the A-law or μ-law decoding look-up table array, with $u_1 = 1$ and $v_1 = s_1$, and decode through the A-law or μ-law decoding look-up table at $(u_1, v_1)$ to obtain the decoded sample value $s'_1$.
(3) Compute the coordinate $(u_2, v_2)$ of the audio sample value $s_2$ in the texture buffer of the A-law or μ-law decoding look-up table array, with $u_2 = 1$ and $v_2 = s_2$, and decode through the A-law or μ-law decoding look-up table at $(u_2, v_2)$ to obtain the decoded sample value $s'_2$.
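The 256-entry A-law table can be generated rather than typed in. The Python sketch below uses the bit-level expansion found in common G.711 reference implementations; that Table 1 was produced this way is an assumption, although the entries spot-checked here agree with the table. The names `alaw_to_linear` and `A_LAW_TABLE` are hypothetical.

```python
def alaw_to_linear(code):
    """Expand one 8-bit A-law code to a signed 16-bit linear value."""
    code ^= 0x55                      # undo the even-bit inversion
    mantissa = (code & 0x0F) << 4     # 4-bit quantisation step
    segment = (code & 0x70) >> 4      # 3-bit segment (exponent)
    if segment == 0:
        value = mantissa + 8
    else:
        value = (mantissa + 0x108) << (segment - 1)
    # After the XOR, a set top bit marks a positive sample.
    return value if code & 0x80 else -value


# Look-up table indexed by the encoded byte u in [0, 255].
A_LAW_TABLE = [alaw_to_linear(u) for u in range(256)]
```

Decoding a sample then reduces to `A_LAW_TABLE[s1]`, which mirrors the texture fetch at coordinate $(1, s_1)$.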
Step 208: attenuate the sample value $s$ at $(x, y)$, specifically:
Multiply the sample value $s$ at $(x, y)$ by the attenuation coefficient $\alpha_i$ of the $i$-th channel to obtain the attenuated audio sample value, and add the attenuated value to the output sample $o$. Written as a computer-program assignment: $o = o + \alpha_i \times s$.
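The attenuate-and-accumulate step amounts to a weighted sum over channels. A minimal Python sketch, with the hypothetical name `mix_sample` and made-up sample values and coefficients:

```python
def mix_sample(samples, coeffs):
    """Accumulate o = o + alpha_i * s over all channels."""
    o = 0.0
    for s, alpha in zip(samples, coeffs):
        o = o + alpha * s
    return o


# Two channels contributing to one output sample.
o = mix_sample([100.0, -50.0], [0.5, 0.25])
```

The sketch does not clip or normalise the sum; the description does not specify any such step.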
The mixing method of the invention is explained further below with a concrete example.
Suppose two audio channels are to be mixed and the buffer holds 10 ms of audio. The first channel has a sample rate of 44 kHz and an attenuation coefficient of 0.4; its buffer array is denoted $S_1(j)$, $j = 1 \ldots 44$. The second channel has a sample rate of 48 kHz and an attenuation coefficient of 0.6; its buffer array is denoted $S_2(j)$, $j = 1 \ldots 48$.
(1) Store the arrays of the two channels in a 2 × 48 two-dimensional buffer array, bind it to the texture data of the programmable graphics processor, and denote it arr2d_audio_tracks. Store the A-law decoding look-up table in a 1 × 256 buffer array, bind it to the texture data of the programmable graphics processor, and denote it arr1d_a_law_lookup_table.
(2) Let i = 1 and process the 1st audio channel.
a) Compute the coordinate (x, y) in the input audio texture that corresponds to the current output sample. The coordinate of the current output sample is determined by the GPU and can be accessed through mPosition.x and mPosition.y. (x, y) is computed as follows:
y = mPosition.y * arr1d_audio_len[i] / out_len
  = 44/48 * mPosition.y
x = i
y1 = floor(y) // round y down
y2 = ceil(y) // round y up
b) Fetch the sample values s1 and s2 at (x, y1) and (x, y2) by texture lookup in arr2d_audio_tracks:
s1 = texture2D(arr2d_audio_tracks, vec2(x, y1));
s2 = texture2D(arr2d_audio_tracks, vec2(x, y2));
Here texture2D obtains s1 and s2 in the manner described in step 204 above.
c) Determine the encoding type of the first channel from arr1d_audio_enc_type[i]. Its value is A_LAW, that is, the channel is A-law encoded, so s1 and s2 are decoded with the A-law look-up table: a texture lookup in arr1d_a_law_lookup_table returns the value at coordinate (1, s1), which is assigned back to s1, and likewise for s2. That is:
s1 = texture2D(arr1d_a_law_lookup_table, vec2(1, s1));
s2 = texture2D(arr1d_a_law_lookup_table, vec2(1, s2));
Here texture2D performs the decoding described in step 206 above.
d) Resample the first channel using linear interpolation. The interpolation result for the first channel is:
$s = (y - y_1)\, s_2 + (y_2 - y)\, s_1$
e) Attenuate s by the attenuation coefficient arr1d_audio_attenuation_coeff[i] of the i-th channel and add it to the output sample value mFragColor:
mFragColor = mFragColor + arr1d_audio_attenuation_coeff[i] * s
(3) Let i = 2 and process the 2nd audio channel by repeating steps a) to e) above.
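To make the 44-to-48 mapping of this example concrete, the Python sketch below tabulates, for each output position n = 1...48, the exact source position y = 44n/48 with its lower and upper neighbours. The function name `source_positions` is hypothetical, and `Fraction` is used only to keep the arithmetic exact. The array lengths 44 and 48 are taken from the example as written.

```python
import math
from fractions import Fraction


def source_positions(in_len, out_len):
    """List (n, y, floor(y), ceil(y)) for n = 1..out_len."""
    rows = []
    for n in range(1, out_len + 1):
        y = Fraction(n * in_len, out_len)   # exact rational position
        rows.append((n, y, math.floor(y), math.ceil(y)))
    return rows


first = source_positions(44, 48)    # first channel: 44 -> 48 samples
second = source_positions(48, 48)   # second channel: no rate change

# Output positions that land exactly on a source sample of channel 1.
exact_hits = [n for n, y, lo, hi in first if lo == hi]
```

For the first channel only every twelfth output sample coincides with a source sample; all others are interpolated. The second channel maps one to one. With the 1-based indexing of the text, the lower neighbour of the first output sample is index 0, which lies outside 1...44; the description does not say how this boundary is handled.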
For the above example, a preprocessing implementation described in GLSL is given here:
// MAX, A_LAW, mPosition and mFragColor are assumed to be declared elsewhere.
uniform sampler2D arr2d_audio_tracks; // 2D array of input audio; each row is one channel
uniform sampler2D arr1d_a_law_lookup_table; // 1D array: A-law decoding look-up table
uniform int out_len; // number of output audio samples
uniform int arr1d_audio_len[MAX]; // 1D array: length of each channel
uniform int arr1d_audio_enc_type[MAX]; // 1D array: encoding of each channel
uniform float arr1d_audio_attenuation_coeff[MAX]; // 1D array: attenuation coefficient of each channel
void main() {
    for (int i = 0; i < MAX; i++) {
        float x, y, y1, y2;
        float s, s1, s2;
        // coordinate (x, y) in the input audio texture for the current output sample
        y = mPosition.y * float(arr1d_audio_len[i]) / float(out_len);
        x = float(i); // row of channel i (x = i, as in step 202)
        y1 = floor(y); // round y down
        y2 = ceil(y); // round y up
        // .r assumes the samples are stored in the red component
        s1 = texture2D(arr2d_audio_tracks, vec2(x, y1)).r;
        s2 = texture2D(arr2d_audio_tracks, vec2(x, y2)).r;
        if (arr1d_audio_enc_type[i] == A_LAW) {
            // A-law decoding
            s1 = texture2D(arr1d_a_law_lookup_table, vec2(1.0, s1)).r;
            s2 = texture2D(arr1d_a_law_lookup_table, vec2(1.0, s2)).r;
        }
        // resampling: w and (1.0 - w) equal (y - y1) and (y2 - y) for
        // non-integer y, and still return s1 when y is an integer
        float w = y - y1;
        s = w * s2 + (1.0 - w) * s1;
        // attenuate and accumulate
        mFragColor = mFragColor + arr1d_audio_attenuation_coeff[i] * s;
    }
}
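For checking results off the GPU, the shader loop above can be mirrored by a small CPU reference model. The Python sketch below is such a model under stated assumptions: the function name `mix`, the tag values `PCM` and `A_LAW`, 0-based indexing, and clamping of the upper neighbour at the end of a row are all illustrative choices, not taken from the source. It decodes before interpolating, as step 206 requires.

```python
import math

PCM = 0     # hypothetical tag: samples are already linear
A_LAW = 1   # hypothetical tag: samples are A-law codes


def mix(tracks, enc_types, coeffs, out_len, decode_table=None):
    """Mix several channels into one list of out_len samples."""
    out = []
    for n in range(out_len):              # one GPU thread per n in the source
        o = 0.0
        for i, row in enumerate(tracks):
            y = n * len(row) / out_len    # source position in channel i
            y1 = math.floor(y)
            y2 = min(math.ceil(y), len(row) - 1)   # clamp (assumption)
            s1, s2 = row[y1], row[y2]
            if enc_types[i] == A_LAW:     # decode before interpolating
                s1, s2 = decode_table[s1], decode_table[s2]
            w = y - y1
            s = (1.0 - w) * s1 + w * s2   # linear interpolation
            o += coeffs[i] * s            # attenuate and accumulate
        out.append(o)
    return out


# Channel 0 is upsampled from 2 to 4 samples; channel 1 is constant.
mixed = mix([[0.0, 10.0], [4.0, 4.0, 4.0, 4.0]],
            [PCM, PCM], [0.5, 0.25], 4)
```

The outer loop over `n` is sequential here only because this is a CPU model; in the scheme described above each output sample is computed in parallel.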
The mixing method described above binds the N channels of input audio data to the texture data of the programmable graphics processor and uses the processor's parallel computing capability to preprocess all channels simultaneously (decoding, resampling, attenuation, and so on). It then superimposes the preprocessed channels into a single audio signal and outputs it to the playback device. Because the whole mixing process runs on the programmable graphics processor rather than relying on the operating system's computing capability, mixing can be completed in real time even when the operating system is heavily loaded, which significantly reduces mixing latency and improves the user experience. In addition, the programmable graphics processor can apply complex processing such as decoding, resampling, and attenuation to the audio, so audio quality improves along with real-time performance. The proposed scheme is particularly suitable for portable multimedia playback devices, such as PCs and smartphones, that have a powerful programmable graphics processor but relatively weak integrated sound hardware.
Fig. 4 is a structural diagram of one embodiment of the programmable graphics processor of the present invention for mixing multi-channel audio. As shown in Fig. 4, the programmable graphics processor of this embodiment comprises:
a texture data binding module 302, configured to store the N channels of audio data to be played within a period of time in an input texture buffer and to bind the N channels of input audio data to the texture data of the programmable graphics processor;
a preprocessing module 304, configured to preprocess all channels of audio data simultaneously, applying at least one of decoding, resampling, or attenuation to each channel according to that channel's encoding, sample rate, or volume;
a superposition output module 306, configured to superimpose the preprocessed channels into a single audio signal, store that signal in an output texture buffer, and output it to a playback device.
Fig. 5 is a structural diagram of another embodiment of the programmable graphics processor of the present invention for mixing multi-channel audio.
As shown in Fig. 5, the programmable graphics processor further comprises a coordinate calculation module 408, configured to compute, before each channel of audio data is preprocessed, the coordinate $(x, y)$ in the input texture buffer of the sample value of the $i$-th input channel that is needed for the $n$-th output audio sample:
$x = i$, $y = n \times k_i / N$,
where $k_i$ is the number of elements of the $i$-th input channel, $i \in N$, $x$ is an integer, and $y$ may be an integer or a fraction.
As shown in Fig. 5, the preprocessing module 304 comprises a resampling unit 3041, configured to fetch from the input texture buffer the sample values $s_1$ and $s_2$ located at the coordinates $(x, \lfloor y \rfloor)$ and $(x, \lceil y \rceil)$ respectively, where $\lfloor \cdot \rfloor$ denotes rounding down and $\lceil \cdot \rceil$ denotes rounding up. If $s_1$ and $s_2$ are not encoded, the sample value $s$ at $(x, y)$ is obtained by interpolation according to the formula $s = (y - \lfloor y \rfloor)\, s_2 + (\lceil y \rceil - y)\, s_1$. If $s_1$ and $s_2$ are encoded, $s$ is obtained according to the formula $s = (y - \lfloor y \rfloor)\, s'_2 + (\lceil y \rceil - y)\, s'_1$, where $s'_1$ is the decoded value of $s_1$ and $s'_2$ is the decoded value of $s_2$.
As shown in Fig. 5, the preprocessing module 304 comprises a decoding unit 3042, configured to perform the following if $s_1$ and $s_2$ are encoded, before the interpolation: bind the A-law and/or μ-law decoding look-up table array to the texture data of the programmable graphics processor; compute the coordinate $(u_1, v_1)$ of the audio sample value $s_1$ in the texture buffer of the A-law or μ-law decoding look-up table array, with $u_1 = 1$ and $v_1 = s_1$, and decode through the look-up table at $(u_1, v_1)$ to obtain the decoded sample value $s'_1$; and compute the coordinate $(u_2, v_2)$ of the audio sample value $s_2$ in the same texture buffer, with $u_2 = 1$ and $v_2 = s_2$, and decode through the look-up table at $(u_2, v_2)$ to obtain the decoded sample value $s'_2$.
As shown in Fig. 5, the preprocessing module 304 comprises an attenuation unit 3043, configured to multiply the sample value $s$ at $(x, y)$ by the attenuation coefficient $\alpha_i$ of the $i$-th channel to obtain the attenuated audio sample value, and to add the attenuated value to the output sample.
The programmable graphics processor for mixing described above binds the N channels of input audio data to its texture data and uses its parallel computing capability to preprocess all channels simultaneously (decoding, resampling, attenuation, and so on). It then superimposes the preprocessed channels into a single audio signal and outputs it to the playback device. Because the whole mixing process runs on the programmable graphics processor rather than relying on the operating system's computing capability, mixing can be completed in real time even when the operating system is heavily loaded, which significantly reduces mixing latency and improves the user experience. In addition, the programmable graphics processor can apply complex processing such as decoding, resampling, and attenuation to the audio, so audio quality improves along with real-time performance. The proposed scheme is particularly suitable for portable multimedia playback devices, such as PCs and smartphones, that have a powerful programmable graphics processor but relatively weak integrated sound hardware.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented in hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Table 1: A-law decoding look-up table (each row lists four pairs of A-law input and decoded output)
Input Output Input Output Input Output Input Output
0 -5504 64 -344 128 5504 192 344
1 -5248 65 -328 129 5248 193 328
2 -6016 66 -376 130 6016 194 376
3 -5760 67 -360 131 5760 195 360
4 -4480 68 -280 132 4480 196 280
5 -4224 69 -264 133 4224 197 264
6 -4992 70 -312 134 4992 198 312
7 -4736 71 -296 135 4736 199 296
8 -7552 72 -472 136 7552 200 472
9 -7296 73 -456 137 7296 201 456
10 -8064 74 -504 138 8064 202 504
11 -7808 75 -488 139 7808 203 488
12 -6528 76 -408 140 6528 204 408
13 -6272 77 -392 141 6272 205 392
14 -7040 78 -440 142 7040 206 440
15 -6784 79 -424 143 6784 207 424
16 -2752 80 -88 144 2752 208 88
17 -2624 81 -72 145 2624 209 72
18 -3008 82 -120 146 3008 210 120
19 -2880 83 -104 147 2880 211 104
20 -2240 84 -24 148 2240 212 24
21 -2112 85 -8 149 2112 213 8
22 -2496 86 -56 150 2496 214 56
23 -2368 87 -40 151 2368 215 40
24 -3776 88 -216 152 3776 216 216
25 -3648 89 -200 153 3648 217 200
26 -4032 90 -248 154 4032 218 248
27 -3904 91 -232 155 3904 219 232
28 -3264 92 -152 156 3264 220 152
29 -3136 93 -136 157 3136 221 136
30 -3520 94 -184 158 3520 222 184
31 -3392 95 -168 159 3392 223 168
32 -22016 96 -1376 160 22016 224 1376
33 -20992 97 -1312 161 20992 225 1312
34 -24064 98 -1504 162 24064 226 1504
35 -23040 99 -1440 163 23040 227 1440
36 -17920 100 -1120 164 17920 228 1120
37 -16896 101 -1056 165 16896 229 1056
38 -19968 102 -1248 166 19968 230 1248
39 -18944 103 -1184 167 18944 231 1184
40 -30208 104 -1888 168 30208 232 1888
41 -29184 105 -1824 169 29184 233 1824
42 -32256 106 -2016 170 32256 234 2016
43 -31232 107 -1952 171 31232 235 1952
44 -26112 108 -1632 172 26112 236 1632
45 -25088 109 -1568 173 25088 237 1568
46 -28160 110 -1760 174 28160 238 1760
47 -27136 111 -1696 175 27136 239 1696
48 -11008 112 -688 176 11008 240 688
49 -10496 113 -656 177 10496 241 656
50 -12032 114 -752 178 12032 242 752
51 -11520 115 -720 179 11520 243 720
52 -8960 116 -560 180 8960 244 560
53 -8448 117 -528 181 8448 245 528
54 -9984 118 -624 182 9984 246 624
55 -9472 119 -592 183 9472 247 592
56 -15104 120 -944 184 15104 248 944
57 -14592 121 -912 185 14592 249 912
58 -16128 122 -1008 186 16128 250 1008
59 -15616 123 -976 187 15616 251 976
60 -13056 124 -816 188 13056 252 816
61 -12544 125 -784 189 12544 253 784
62 -14080 126 -880 190 14080 254 880
63 -13568 127 -848 191 13568 255 848
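
The entries of Table 1 agree with the standard G.711 A-law expansion, so the look-up array can be generated rather than typed in. The sketch below is a common reference formulation of that expansion, not code taken from the patent; the names `alaw_decode` and `ALAW_TABLE` are illustrative.

```python
def alaw_decode(code: int) -> int:
    """Expand one 8-bit A-law code to a 16-bit linear PCM value."""
    code ^= 0x55                      # undo the alternate-bit inversion of G.711
    mantissa = (code & 0x0F) << 4     # 4-bit mantissa, shifted into position
    segment = (code & 0x70) >> 4      # 3-bit segment number
    if segment == 0:
        value = mantissa + 8
    else:
        value = (mantissa + 0x108) << (segment - 1)
    # After the inversion, a set top bit marks a positive sample.
    return value if code & 0x80 else -value

# 256-entry array, suitable for binding as a one-dimensional look-up texture.
ALAW_TABLE = [alaw_decode(c) for c in range(256)]
```

For example, `ALAW_TABLE[0]` is -5504 and `ALAW_TABLE[213]` is 8, matching the corresponding rows of Table 1.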

Claims (10)

1. A method for mixing multi-channel audio based on a programmable graphics processor, comprising:
storing N channels of audio data to be played within a period of time in an input texture buffer, and binding the N input channels of audio data as texture data of the programmable graphics processor;
preprocessing, by the programmable graphics processor, all channels of audio data simultaneously, wherein during preprocessing at least one of decoding, resampling, or attenuation is performed on each channel of audio data according to the encoding scheme, sample rate, or volume of that channel; wherein the programmable graphics processor performs the corresponding decoding operation on each channel according to its encoding scheme, the corresponding resampling operation on each channel according to its sample rate, and the corresponding attenuation operation on each channel according to its volume;
superposing, by the programmable graphics processor, the preprocessed channels of audio data into a single audio signal, storing that audio signal in an output texture buffer, and outputting it to a playback device.
2. The method according to claim 1, characterized in that, before the programmable graphics processor preprocesses all channels of audio data simultaneously, the method further comprises:
calculating the coordinates $(x, y)$, in the input texture buffer, of the sample value of the i-th input audio channel that is needed for the n-th output audio sample point,
$x = i$
$y = n \times k_i / N$
where $k_i$ is the number of elements of the i-th input audio channel, $i \in N$, $x$ is an integer, and $y$ is an integer or a decimal.
3. The method according to claim 2, characterized in that the programmable graphics processor performing the resampling operation on each channel of audio data according to its sample rate comprises:
obtaining from the input texture buffer the sample values $s_1$ and $s_2$ at the two points located at coordinates [expression missing in source] and [expression missing in source], where [symbol missing in source] denotes rounding down and [symbol missing in source] denotes rounding up;
if $s_1$ and $s_2$ are not encoded, performing interpolation according to the formula [formula missing in source] to obtain the sample value $s$ at $(x, y)$;
if $s_1$ and $s_2$ are encoded, performing interpolation according to the formula [formula missing in source] to obtain the sample value $s$ at $(x, y)$, where $s'_1$ is the decoded sample value of $s_1$ and $s'_2$ is the decoded sample value of $s_2$.
4. The method according to claim 3, characterized in that, if $s_1$ and $s_2$ are encoded, before the interpolation is performed, the programmable graphics processor performs the decoding operation on each channel of audio data according to its encoding scheme, which specifically comprises:
binding an A-law and/or μ-law decoding look-up table array as texture data of the programmable graphics processor;
calculating the coordinates $(u_1, v_1)$ of the audio sample value $s_1$ in the texture buffer corresponding to the A-law or μ-law decoding look-up table array, with $u_1 = 1$ and $v_1 = s_1$, and decoding through the A-law or μ-law decoding look-up table at coordinates $(u_1, v_1)$ to obtain the decoded sample value $s'_1$;
calculating the coordinates $(u_2, v_2)$ of the audio sample value $s_2$ in the texture buffer corresponding to the A-law or μ-law decoding look-up table array, with $u_2 = 1$ and $v_2 = s_2$, and decoding through the A-law or μ-law decoding look-up table at coordinates $(u_2, v_2)$ to obtain the decoded sample value $s'_2$.
5. The method according to claim 3, characterized in that the programmable graphics processor performing the attenuation operation on each channel of audio data according to its volume comprises:
multiplying the sample value $s$ at $(x, y)$ by the attenuation factor of the i-th audio channel to obtain the attenuated audio sample value, and adding the attenuated audio sample value to the output sample.
6. A programmable graphics processor for mixing multi-channel audio, comprising:
a texture data binding module, configured to store N channels of audio data to be played within a period of time in an input texture buffer, and to bind the N input channels of audio data as texture data of the programmable graphics processor;
a preprocessing module, configured to preprocess all channels of audio data simultaneously, wherein during preprocessing at least one of decoding, resampling, or attenuation is performed on each channel of audio data according to the encoding scheme, sample rate, or volume of that channel; wherein the programmable graphics processor performs the corresponding decoding operation on each channel according to its encoding scheme, the corresponding resampling operation on each channel according to its sample rate, and the corresponding attenuation operation on each channel according to its volume;
a superposition output module, configured to superpose the preprocessed channels of audio data into a single audio signal, store that audio signal in an output texture buffer, and output it to a playback device.
7. The programmable graphics processor according to claim 6, characterized in that the programmable graphics processor further comprises:
a coordinate calculation module, configured to calculate, before all channels of audio data are preprocessed, the coordinates $(x, y)$, in the input texture buffer, of the sample value of the i-th input audio channel that is needed for the n-th output audio sample point,
$x = i$
$y = n \times k_i / N$
where $k_i$ is the number of elements of the i-th input audio channel, $i \in N$, $x$ is an integer, and $y$ is an integer or a decimal.
8. The programmable graphics processor according to claim 6, characterized in that the preprocessing module comprises a resampling unit, configured to:
obtain from the input texture buffer the sample values $s_1$ and $s_2$ at the two points located at coordinates [expression missing in source] and [expression missing in source], where [symbol missing in source] denotes rounding down and [symbol missing in source] denotes rounding up;
if $s_1$ and $s_2$ are not encoded, perform interpolation according to the formula [formula missing in source] to obtain the sample value $s$ at $(x, y)$;
if $s_1$ and $s_2$ are encoded, perform interpolation according to the formula [formula missing in source] to obtain the sample value $s$ at $(x, y)$, where $s'_1$ is the decoded sample value of $s_1$ and $s'_2$ is the decoded sample value of $s_2$.
9. The programmable graphics processor according to claim 8, characterized in that the preprocessing module comprises a decoding unit, configured to, if $s_1$ and $s_2$ are encoded, before the interpolation is performed, bind an A-law and/or μ-law decoding look-up table array as texture data of the programmable graphics processor;
calculate the coordinates $(u_1, v_1)$ of the audio sample value $s_1$ in the texture buffer corresponding to the A-law or μ-law decoding look-up table array, with $u_1 = 1$ and $v_1 = s_1$, and decode through the A-law or μ-law decoding look-up table at coordinates $(u_1, v_1)$ to obtain the decoded sample value $s'_1$;
calculate the coordinates $(u_2, v_2)$ of the audio sample value $s_2$ in the texture buffer corresponding to the A-law or μ-law decoding look-up table array, with $u_2 = 1$ and $v_2 = s_2$, and decode through the A-law or μ-law decoding look-up table at coordinates $(u_2, v_2)$ to obtain the decoded sample value $s'_2$.
10. The programmable graphics processor according to claim 8, characterized in that the preprocessing module comprises an attenuation unit, configured to multiply the sample value $s$ at $(x, y)$ by the attenuation factor of the i-th audio channel to obtain the attenuated audio sample value, and to add the attenuated audio sample value to the output sample.
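
The per-sample steps recited in the claims (coordinate calculation, fetching two neighboring samples, optional table decoding, interpolation, attenuation, and accumulation) can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the claimed implementation: the divisor `n_out` is assumed to be the number of output samples, the two fetched points are assumed to be the floor and ceiling of the fractional position, the interpolation is assumed to be linear (the source text does not reproduce the formula), and the names `mix_sample`, `gains`, and `luts` are hypothetical.

```python
import math

def mix_sample(n, n_out, channels, gains, luts=None):
    """Compute output sample n (0 <= n < n_out) from all input channels."""
    out = 0.0
    for i, data in enumerate(channels):
        k = len(data)                      # number of elements in channel i
        y = n * k / n_out                  # fractional read position in channel i
        lo = math.floor(y)
        hi = min(math.ceil(y), k - 1)      # assumption: clamp at the channel end
        s1, s2 = data[lo], data[hi]
        lut = luts[i] if luts else None
        if lut is not None:                # encoded channel: decode by table look-up
            s1, s2 = lut[s1], lut[s2]
        frac = y - lo
        s = s1 * (1.0 - frac) + s2 * frac  # assumption: linear interpolation
        out += gains[i] * s                # attenuate and add to the output sample
    return out
```

For instance, with `channels = [[0, 10, 20, 30], [100, 200]]`, `gains = [1.0, 0.5]`, and `n_out = 4`, output sample 1 reads 10 from the first channel and interpolates 150 from the second, giving 85.0. On a graphics processor, each value of `n` would be handled by its own thread or fragment, with the loop over channels performed inside it.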
CN201310170251.XA 2013-05-10 2013-05-10 Programmable graphics processor and its method that audio mixing is carried out to MCVF multichannel voice frequency Active CN104143334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310170251.XA CN104143334B (en) 2013-05-10 2013-05-10 Programmable graphics processor and its method that audio mixing is carried out to MCVF multichannel voice frequency

Publications (2)

Publication Number Publication Date
CN104143334A CN104143334A (en) 2014-11-12
CN104143334B true CN104143334B (en) 2017-06-16

Family

ID=51852492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310170251.XA Active CN104143334B (en) 2013-05-10 2013-05-10 Programmable graphics processor and its method that audio mixing is carried out to MCVF multichannel voice frequency

Country Status (1)

Country Link
CN (1) CN104143334B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant