CN110139206A - Stereo audio processing method and system - Google Patents

Stereo audio processing method and system

Info

Publication number
CN110139206A
CN110139206A
Authority
CN
China
Prior art keywords
signal
frequency
right channel
channel frequency
region signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910349362.4A
Other languages
Chinese (zh)
Other versions
CN110139206B (en)
Inventor
宋冬梅
武剑
王宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING THUNDERSTONE TECHNOLOGY Ltd
Original Assignee
BEIJING THUNDERSTONE TECHNOLOGY Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING THUNDERSTONE TECHNOLOGY Ltd filed Critical BEIJING THUNDERSTONE TECHNOLOGY Ltd
Priority to CN201910349362.4A priority Critical patent/CN110139206B/en
Publication of CN110139206A publication Critical patent/CN110139206A/en
Application granted granted Critical
Publication of CN110139206B publication Critical patent/CN110139206B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 - Details of electrophonic musical instruments
    • G10H1/36 - Accompaniment arrangements
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04S - STEREOPHONIC SYSTEMS
    • H04S3/00 - Systems employing more than two channels, e.g. quadraphonic
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04S - STEREOPHONIC SYSTEMS
    • H04S7/00 - Indicating arrangements; Control arrangements, e.g. balance control
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 - Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/005 - Musical accompaniment, i.e. complete instrumental rhythm synthesis added to a performed melody, e.g. as output by drum machines

Abstract

An embodiment of the present invention provides a stereo audio processing method and system, comprising: S1. comparing, for each frequency bin, the phase difference between the phase in a first left-channel frequency-domain signal and the phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters; S2. adjusting the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result to obtain a second left-channel frequency-domain signal and a second right-channel frequency-domain signal; S3. transforming the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse code modulation data and outputting the data. Compared with existing accompaniment extraction methods, the present invention halves the computational complexity and the algorithm delay, while preserving the music components of the low-frequency region and avoiding excessive vocal residue in the high-frequency region.

Description

Stereo audio processing method and system
Technical field
The present invention relates to the field of audio processing, and in particular to a stereo audio processing method and system.
Background art
Newer songs, such as songs released online and original songs, are often published without an accompanying instrumental track, and for older songs the accompaniment is frequently missing as well. People who want to sing these songs therefore cannot find an accompaniment, which degrades the singing experience. An accompaniment extraction method that removes the vocals from a song to obtain the accompaniment, without depending on a dedicated accompaniment library server, therefore has considerable market demand.
Several accompaniment extraction methods exist:
1. Manual extraction. The vocals are removed by hand, mainly by manually adjusting an equalizer to reduce the gain at the frequency bins occupied by the voice. Because the vocal harmonics are widely distributed, manual adjustment is unsatisfactory in both time cost and result.
2. Time-domain subtraction of the left and right channels of a stereo song. This method places high demands on the synchronization of the two channels, and the vocals remain clearly audible in the processed accompaniment.
3. Vocal removal by frequency-domain cross-correlation. The left-channel and right-channel data are framed, a frequency-domain cross-correlation is computed, the bins with high cross-correlation values are multiplied by a small coefficient, and the result is transformed back to the time domain. The computational complexity is high; the vocal removal improves on the previous methods, but the vocals are still clearly audible in the processed accompaniment.
4. Vocal removal using the frequency-domain phase difference and amplitude ratio. The left and right channels are framed and transformed to the frequency domain, the phase difference and amplitude ratio of corresponding bins are computed, and bins whose phase difference is below a phase threshold and/or whose amplitude ratio is below an amplitude-ratio threshold are set to zero before transforming back to the time domain. Using the amplitude ratio gives poor vocal removal; using the phase difference improves on the previous methods, but the computational complexity is high, low-frequency components such as drums are largely removed because the low band is weakened too much, and considerable vocal residue remains in the high-frequency part, so the accompaniment sounds clearly deficient in the bass while the residual vocals are harsh.
No effective solution to the above problems has yet been proposed.
Summary of the invention
Embodiments of the present invention provide a stereo audio processing method and system that improve on the phase-difference vocal removal method.
In one aspect, an embodiment of the invention provides a stereo audio processing method, comprising:
S1. comparing, for each frequency bin, the phase difference between the phase in a first left-channel frequency-domain signal and the phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters;
S2. adjusting the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result to obtain a second left-channel frequency-domain signal and a second right-channel frequency-domain signal;
S3. transforming the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse code modulation data, and outputting the data.
Further, in step S1,
the preset parameters include: signal processing strength, signal processing precision, audio data sample rate, and signal processing frequency range;
the phase determination curve is:
where
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio data sample rate;
FFTSIZE is the number of FFT points, calculated as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
Further, before step S1 the method comprises:
S01. normalizing a first left-channel time-domain signal and a first right-channel time-domain signal of the stereo audio to obtain a second left-channel time-domain signal and a second right-channel time-domain signal;
S02. splitting the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval;
S03. applying a Fourier transform to the time-domain signal of each left-channel frame and each right-channel frame to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal;
S04. computing, for each frequency bin, the phase difference between the phase in the first left-channel frequency-domain signal and the phase in the first right-channel frequency-domain signal.
Further, step S2 comprises:
S21. if the phase difference is smaller than the curve value and larger than the negative of the absolute value of the curve value, i.e. -|P(FIndex)| < phase difference < |P(FIndex)|, setting the data of that frequency bin to zero in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal, thereby obtaining the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
Further, step S3 comprises:
S31. applying an inverse Fourier transform to the second left-channel frequency-domain signal and the second right-channel frequency-domain signal to obtain third left-channel time-domain signals and third right-channel time-domain signals;
S32. merging the third left-channel time-domain signals and the third right-channel time-domain signals, respectively, to obtain a fourth left-channel time-domain signal and a fourth right-channel time-domain signal;
S33. converting the fourth left-channel time-domain signal and the fourth right-channel time-domain signal into pulse code modulation data, and outputting the data.
In another aspect, an embodiment of the invention provides a stereo audio processing system, comprising:
a comparison module, configured to compare, for each frequency bin, the phase difference between the phase in a first left-channel frequency-domain signal and the phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters;
a signal processing module, configured to adjust the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result to obtain a second left-channel frequency-domain signal and a second right-channel frequency-domain signal;
a signal post-processing module, configured to transform the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse code modulation data and output the data.
Further, the comparison module comprises:
a phase determination curve calculation unit, configured to calculate the phase determination curve from the preset parameters;
the preset parameters include: signal processing strength, signal processing precision, audio data sample rate, and signal processing frequency range;
the phase determination curve is:
where
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio data sample rate;
FFTSIZE is the number of FFT points, calculated as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
Further, the system comprises a signal pre-processing module, which comprises:
a normalization unit, configured to normalize the first left-channel time-domain signal and the first right-channel time-domain signal of the stereo audio to obtain the second left-channel time-domain signal and the second right-channel time-domain signal;
a framing unit, configured to split the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval;
a Fourier transform unit, configured to apply a Fourier transform to the time-domain signal of each left-channel frame and each right-channel frame to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal;
a phase difference calculation unit, configured to compute, for each frequency bin, the phase difference between the phase in the first left-channel frequency-domain signal and the phase in the first right-channel frequency-domain signal.
Further, the signal processing module further comprises:
a processing unit, configured to, when the phase difference is smaller than the curve value and larger than the negative of the absolute value of the curve value, i.e. -|P(FIndex)| < phase difference < |P(FIndex)|, set the data of that frequency bin to zero in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal to obtain the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
Further, the signal post-processing module further comprises:
an inverse Fourier transform unit, configured to apply an inverse Fourier transform to the second left-channel frequency-domain signal and the second right-channel frequency-domain signal to obtain third left-channel time-domain signals and third right-channel time-domain signals;
a merging unit, configured to merge the third left-channel time-domain signals and the third right-channel time-domain signals, respectively, to obtain a fourth left-channel time-domain signal and a fourth right-channel time-domain signal;
a conversion unit, configured to convert the fourth left-channel time-domain signal and the fourth right-channel time-domain signal into pulse code modulation data and output the data.
The above technical solutions have the following beneficial effects:
The present invention does not depend on a dedicated accompaniment library server and can provide the accompaniment of a music audio track for the user. It can also process a song while it is being played, removing the vocals so that the accompaniment is output synchronously. In addition, compared with existing accompaniment extraction methods, the present invention halves the computational complexity and the algorithm delay, while preserving the music components of the low-frequency region and avoiding excessive vocal residue in the high-frequency region.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a stereo audio processing method according to an embodiment of the present invention;
Fig. 2 is a flow chart of the pre-processing step according to an embodiment of the present invention;
Fig. 3 is a flow chart of the data processing step according to an embodiment of the present invention;
Fig. 4 is a flow chart of the post-processing step according to an embodiment of the present invention;
Fig. 5 is a structural block diagram of a stereo audio processing system according to an embodiment of the present invention;
Fig. 6 is a structural block diagram of the signal pre-processing module according to an embodiment of the present invention;
Fig. 7 is a structural block diagram of the signal processing module according to an embodiment of the present invention;
Fig. 8 is a structural block diagram of the signal post-processing module according to an embodiment of the present invention.
Specific embodiment
Following will be combined with the drawings in the embodiments of the present invention, and technical solution in the embodiment of the present invention carries out clear, complete Site preparation description, it is clear that described embodiments are only a part of the embodiments of the present invention, instead of all the embodiments.It is based on Embodiment in the present invention, it is obtained by those of ordinary skill in the art without making creative efforts every other Embodiment shall fall within the protection scope of the present invention.
Embodiment 1:
As shown in Fig. 1 to Fig. 4, a stereo audio processing method comprises:
S1. Comparing, for each frequency bin, the phase difference between the phase in the first left-channel frequency-domain signal and the phase in the first right-channel frequency-domain signal with the value, at that bin, of the phase determination curve computed from the preset parameters.
In one embodiment, the preset parameters include: signal processing strength, signal processing precision, audio data sample rate, and signal processing frequency range;
the phase determination curve is:
where
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio data sample rate;
FFTSIZE is the number of FFT points, calculated as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
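For illustration, the following minimal Python sketch derives the quantities defined above from the two preset parameters; only the formulas stated in the text are used, and the phase determination curve itself, whose formula is not reproduced here, is deliberately left out.

    def derived_parameters(eliminate_precision: int, eliminate_strength: float):
        # FFTSIZE = 1024 x 2^ELIMINATE_PRECISION
        fftsize = 1024 * 2 ** eliminate_precision
        # PH and PL are the two strength-dependent constants used by the curve
        ph = eliminate_strength * 0.1
        pl = eliminate_strength * 0.3
        return fftsize, ph, pl

    # Example: ELIMINATE_PRECISION = 1 and ELIMINATE_STRENGTH = 1.0
    # give FFTSIZE = 2048, PH = 0.1 and PL = 0.3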
In one embodiment, the phase of a frequency bin can be computed as follows. Let the complex value of bin FIndex in the frequency-domain signal be x + yi, and let the resulting phase be r. Then:
1. Take the absolute values of x and y to obtain a and b:
a = |x|, b = |y|
2. If a and b are both 0, the phase is r = 0.
3. If a and b are not both 0:
s = c × c
r = ((-0.0464964749*s + 0.15931422)*s - 0.327622764)*s*c + c
4. The computed value is then mapped into the range -π to π:
4.1 if b > a, the phase r
4.2 if x < 0, the phase r = π - r;
4.3 if y < 0, the phase r = -r.
This yields the phase of the frequency bin in the frequency-domain signal.
The phase difference divP(FIndex) can therefore be calculated as:
divP(FIndex) = PL(FIndex) - PR(FIndex),
where PL(FIndex) denotes the phase of the bin in the first left-channel frequency-domain signal and PR(FIndex) denotes its phase in the first right-channel frequency-domain signal.
After the phase difference divP(FIndex) and the curve value P(FIndex) of the bin have been obtained, divP(FIndex) is compared with P(FIndex).
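The steps above can be expressed as the following Python sketch. Two details are not reproduced in this text and are therefore filled in as assumptions, marked in the comments: the definition of c (taken here as min(a, b) / max(a, b)) and the adjustment applied in step 4.1 (taken here as r = π/2 - r).

    import math

    def approx_phase(x: float, y: float) -> float:
        # Polynomial arctangent approximation following steps 1-4 above.
        a, b = abs(x), abs(y)                     # step 1
        if a == 0.0 and b == 0.0:                 # step 2
            return 0.0
        c = min(a, b) / max(a, b)                 # assumed definition of c (not given in this text)
        s = c * c                                 # step 3
        r = ((-0.0464964749 * s + 0.15931422) * s - 0.327622764) * s * c + c
        if b > a:                                 # step 4.1: assumed form r = pi/2 - r
            r = math.pi / 2 - r
        if x < 0:                                 # step 4.2
            r = math.pi - r
        if y < 0:                                 # step 4.3
            r = -r
        return r

    def phase_difference(left_bin: complex, right_bin: complex) -> float:
        # divP(FIndex) = PL(FIndex) - PR(FIndex)
        return (approx_phase(left_bin.real, left_bin.imag)
                - approx_phase(right_bin.real, right_bin.imag))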
In one embodiment, the method further comprises, before step S1:
S01. Normalizing the first left-channel time-domain signal and the first right-channel time-domain signal of the stereo audio to obtain the second left-channel time-domain signal and the second right-channel time-domain signal.
Normalization maps pulse code modulation audio data of different bit widths to the range -1 to +1, so that the amplitudes represented by audio data of different bit widths lie in the same magnitude range, which simplifies subsequent computation.
The normalization formula is:
where
val is the pulse code modulation audio data value, represented as a fixed-point number;
nval is the normalized audio data value, represented as a floating-point number;
bitnum is the bit width of val.
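A minimal sketch of this normalization step, assuming the usual scaling nval = val / 2^(bitnum - 1), since the formula itself is not reproduced in this text:

    import numpy as np

    def normalize_pcm(val: np.ndarray, bitnum: int) -> np.ndarray:
        # val: fixed-point PCM samples (e.g. dtype int16 when bitnum = 16)
        # returns floating-point samples in the range -1.0 .. +1.0
        return val.astype(np.float64) / float(2 ** (bitnum - 1))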
S02. Splitting the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval.
The normalized second left-channel time-domain signal and second right-channel time-domain signal are each divided into multiple left-channel frames and right-channel frames, and a window function is applied to each left-channel frame and right-channel frame. The window length is the length of each left-channel frame and right-channel frame and is determined by the vocal removal precision. The window function is a periodic Hamming window, and adjacent frames overlap by 75% of their samples, so that the transition between frames is smooth.
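A sketch of the framing and windowing described above (periodic Hamming window, 75% overlap, i.e. a hop of one quarter of the window length); the signal is assumed to be at least one window long:

    import numpy as np

    def frame_signal(signal: np.ndarray, wlen: int) -> np.ndarray:
        hop = wlen // 4                           # 75% overlap between adjacent frames
        window = np.hamming(wlen + 1)[:-1]        # periodic Hamming window of length WLEN
        n_frames = 1 + (len(signal) - wlen) // hop
        return np.stack([signal[i * hop : i * hop + wlen] * window
                         for i in range(n_frames)])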
S03. Applying a Fourier transform to the time-domain signal of each left-channel frame and each right-channel frame to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal.
A Fourier transform is applied to the time-domain signal of each left-channel frame and right-channel frame, yielding left-channel frequency-domain data FFTDATA_L(FIndex) and right-channel frequency-domain data FFTDATA_R(FIndex). When the window length WLEN is smaller than the number of FFT points FFTSIZE, i.e. when a frame does not contain enough samples for the FFT, the data is zero-padded at the end to reach FFTSIZE points. Because the FFT of real data is conjugate-symmetric, only (FFTSIZE/2 + 1) values are used for the calculation, so FIndex ranges from 0 to FFTSIZE/2.
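A sketch of the transform of a single windowed frame, zero-padding to FFTSIZE and keeping only the non-redundant half of the spectrum (bins 0 to FFTSIZE/2):

    import numpy as np

    def frame_to_spectrum(frame: np.ndarray, fftsize: int) -> np.ndarray:
        padded = np.zeros(fftsize)
        padded[:len(frame)] = frame               # zero-pad when WLEN < FFTSIZE
        return np.fft.rfft(padded)                # FFTSIZE/2 + 1 complex bins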
S04. Computing, for each frequency bin, the phase difference between the phase in the first left-channel frequency-domain signal and the phase in the first right-channel frequency-domain signal.
In the low-frequency band, the vocal fundamental frequency overlaps with low-frequency instruments such as drums, and when the accompaniment is superimposed, the human ear can hardly hear the vocals in this band; skipping it therefore does not affect the vocal removal result while preserving more low-frequency energy. Bins below roughly 100 to 200 Hz are thus excluded from vocal removal, and in this embodiment the lower frequency limit is set to 140 Hz. Above 13000 Hz there are essentially no vocal components, so vocal removal is not applied there either.
In summary, the phase and the corresponding left-channel and right-channel frequency-domain data are computed only for bins in the following range:
With this approach, each of the left and right channels requires only about half of the phase computations, while the low-frequency and high-frequency components of the music are fully preserved.
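A sketch of how the processed bin range could be derived from the 140 Hz and 13000 Hz limits; the mapping FIndex = round(f × FFTSIZE / FS) is an assumption here, since the range formula is not reproduced in this text:

    def processing_bin_range(fs: int, fftsize: int,
                             f_low: float = 140.0, f_high: float = 13000.0):
        low_bin = round(f_low * fftsize / fs)
        high_bin = min(round(f_high * fftsize / fs), fftsize // 2)
        return low_bin, high_bin

    # Example: FS = 44100 Hz and FFTSIZE = 2048 give bins 7 .. 604.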
S2. Adjusting the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result to obtain the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
In this embodiment, step S2 comprises:
S21. If the phase difference is smaller than the curve value and larger than the negative of the absolute value of the curve value, i.e. -|P(FIndex)| < phase difference < |P(FIndex)|, setting the data of that frequency bin to zero in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal, thereby obtaining the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
When the phase difference satisfies -|P(FIndex)| < phase difference < |P(FIndex)|, the data of that bin is set to zero in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal, i.e. the left-channel frequency-domain data FFTDATA_L(FIndex) = 0 and the right-channel frequency-domain data FFTDATA_R(FIndex) = 0.
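A sketch of the zeroing rule in S21 for one pair of left/right frames. For brevity the phase difference is taken here with numpy's angle(); the polynomial phase approximation sketched earlier could be substituted:

    import numpy as np

    def suppress_center_bins(fft_l: np.ndarray, fft_r: np.ndarray,
                             curve: np.ndarray, low_bin: int, high_bin: int):
        # fft_l, fft_r: complex half-spectra FFTDATA_L / FFTDATA_R of one frame
        # curve: value P(FIndex) of the phase determination curve for every bin
        for k in range(low_bin, high_bin + 1):
            div_p = np.angle(fft_l[k]) - np.angle(fft_r[k])
            if -abs(curve[k]) < div_p < abs(curve[k]):
                fft_l[k] = 0.0                    # FFTDATA_L(FIndex) = 0
                fft_r[k] = 0.0                    # FFTDATA_R(FIndex) = 0
        return fft_l, fft_r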
S3. Transforming the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse code modulation data, and outputting the data.
In this embodiment, step S3 comprises:
S31. Applying an inverse Fourier transform to the second left-channel frequency-domain signal and the second right-channel frequency-domain signal to obtain third left-channel time-domain signals and third right-channel time-domain signals;
S32. Merging the third left-channel time-domain signals and the third right-channel time-domain signals, respectively, to obtain the fourth left-channel time-domain signal and the fourth right-channel time-domain signal;
S33. Converting the fourth left-channel time-domain signal and the fourth right-channel time-domain signal into pulse code modulation data, and outputting the data.
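A sketch of S31 to S33 for one channel: inverse FFT of each processed frame, merging by overlap-add at the 75% overlap used during framing, and conversion back to 16-bit PCM. Overlap-add is an assumption; the text only states that the frames are merged:

    import numpy as np

    def frames_to_pcm(spectra: list, wlen: int) -> np.ndarray:
        hop = wlen // 4
        out = np.zeros(hop * (len(spectra) - 1) + wlen)
        for i, spec in enumerate(spectra):
            frame = np.fft.irfft(spec)[:wlen]       # S31: back to the time domain
            out[i * hop : i * hop + wlen] += frame  # S32: merge overlapping frames
        out = np.clip(out, -1.0, 1.0)
        return (out * 32767.0).astype(np.int16)     # S33: 16-bit pulse code modulation output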
The embodiment of the present invention does not depend on a dedicated accompaniment library server and can provide the accompaniment of a music audio track for the user. It can also process a song while it is being played, removing the vocals so that the accompaniment is output synchronously. Furthermore, because vocal removal is not applied in the low-frequency and high-frequency bands, the computational complexity and the algorithm delay are halved compared with existing accompaniment extraction methods, the music components of the low-frequency region are well preserved, and the problem of excessive vocal residue in the high-frequency region is solved.
Embodiment 2:
As shown in Fig. 5 to Fig. 8, a stereo audio processing system comprises:
a comparison module 1, configured to compare, for each frequency bin, the phase difference between the phase in the first left-channel frequency-domain signal and the phase in the first right-channel frequency-domain signal with the value, at that bin, of the phase determination curve computed from the preset parameters;
In one embodiment, the comparison module 1 comprises:
a phase determination curve calculation unit 11, configured to calculate the phase determination curve from the preset parameters.
The preset parameters include: signal processing strength, signal processing precision, audio data sample rate, and signal processing frequency range;
the phase determination curve is:
where
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio data sample rate;
FFTSIZE is the number of FFT points, calculated as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
In one embodiment, the phase of a frequency bin can be computed as follows. Let the complex value of bin FIndex in the frequency-domain signal be x + yi, and let the resulting phase be r. Then:
1. Take the absolute values of x and y to obtain a and b:
a = |x|, b = |y|
2. If a and b are both 0, the phase is r = 0.
3. If a and b are not both 0:
s = c × c
r = ((-0.0464964749*s + 0.15931422)*s - 0.327622764)*s*c + c
4. The computed value is then mapped into the range -π to π:
4.1 if b > a, the phase r
4.2 if x < 0, the phase r = π - r;
4.3 if y < 0, the phase r = -r.
This yields the phase of the frequency bin in the frequency-domain signal.
The phase difference divP(FIndex) can therefore be calculated as:
divP(FIndex) = PL(FIndex) - PR(FIndex),
where PL(FIndex) denotes the phase of the bin in the first left-channel frequency-domain signal and PR(FIndex) denotes its phase in the first right-channel frequency-domain signal.
After the phase difference divP(FIndex) and the curve value P(FIndex) of the bin have been obtained, divP(FIndex) is compared with P(FIndex).
In one embodiment, the system further comprises a signal pre-processing module 0, which comprises:
a normalization unit 01, configured to normalize the first left-channel time-domain signal and the first right-channel time-domain signal of the stereo audio to obtain the second left-channel time-domain signal and the second right-channel time-domain signal.
Normalization maps pulse code modulation audio data of different bit widths to the range -1 to +1, so that the amplitudes represented by audio data of different bit widths lie in the same magnitude range, which simplifies subsequent computation.
The normalization formula is:
where
val is the pulse code modulation audio data value, represented as a fixed-point number;
nval is the normalized audio data value, represented as a floating-point number;
bitnum is the bit width of val.
a framing unit 02, configured to split the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval.
The framing unit 02 divides the normalized second left-channel time-domain signal and second right-channel time-domain signal into multiple left-channel frames and right-channel frames and applies a window function to each frame. The window length is the length of each left-channel frame and right-channel frame and is determined by the vocal removal precision. The window function is a periodic Hamming window, and adjacent frames overlap by 75% of their samples, so that the transition between frames is smooth.
a Fourier transform unit 03, configured to apply a Fourier transform to the time-domain signal of each left-channel frame and each right-channel frame to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal.
The Fourier transform unit 03 applies a Fourier transform to the time-domain signal of each left-channel frame and right-channel frame, yielding left-channel frequency-domain data FFTDATA_L(FIndex) and right-channel frequency-domain data FFTDATA_R(FIndex). When the window length WLEN is smaller than the number of FFT points FFTSIZE, i.e. when a frame does not contain enough samples for the FFT, the data is zero-padded at the end to reach FFTSIZE points. Because the FFT of real data is conjugate-symmetric, only (FFTSIZE/2 + 1) values are used for the calculation, so FIndex ranges from 0 to FFTSIZE/2.
a phase difference calculation unit 04, configured to compute, for each frequency bin, the phase difference between the phase in the first left-channel frequency-domain signal and the phase in the first right-channel frequency-domain signal.
In the low-frequency band, the vocal fundamental frequency overlaps with low-frequency instruments such as drums, and when the accompaniment is superimposed, the human ear can hardly hear the vocals in this band; skipping it therefore does not affect the vocal removal result while preserving more low-frequency energy. Bins below roughly 100 to 200 Hz are thus excluded from vocal removal, and in this embodiment the lower frequency limit is set to 140 Hz. Above 13000 Hz there are essentially no vocal components, so vocal removal is not applied there either.
In summary, the phase difference calculation unit 04 computes the phase and the corresponding left-channel and right-channel frequency-domain data only for bins in the following range:
With this approach, each of the left and right channels requires only about half of the phase computations, while the low-frequency and high-frequency components of the music are fully preserved.
a signal processing module 2, configured to adjust the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result to obtain the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
In this embodiment, the signal processing module 2 further comprises:
a processing unit 21, configured to, when the phase difference is smaller than the curve value and larger than the negative of the absolute value of the curve value, i.e. -|P(FIndex)| < phase difference < |P(FIndex)|, set the data of that frequency bin to zero in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal to obtain the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
When the phase difference satisfies -|P(FIndex)| < phase difference < |P(FIndex)|, the processing unit 21 sets the data of that bin to zero in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal, i.e. FFTDATA_L(FIndex) = 0 and FFTDATA_R(FIndex) = 0.
a signal post-processing module 3, configured to transform the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse code modulation data and output the data.
In this embodiment, the signal post-processing module 3 further comprises:
an inverse Fourier transform unit 31, configured to apply an inverse Fourier transform to the second left-channel frequency-domain signal and the second right-channel frequency-domain signal to obtain third left-channel time-domain signals and third right-channel time-domain signals;
a merging unit 32, configured to merge the third left-channel time-domain signals and the third right-channel time-domain signals, respectively, to obtain the fourth left-channel time-domain signal and the fourth right-channel time-domain signal;
a conversion unit 33, configured to convert the fourth left-channel time-domain signal and the fourth right-channel time-domain signal into pulse code modulation data and output the data.
The embodiment of the present invention does not depend on a dedicated accompaniment library server and can provide the accompaniment of a music audio track for the user. It can also process a song while it is being played, removing the vocals so that the accompaniment is output synchronously. Furthermore, because vocal removal is not applied in the low-frequency and high-frequency bands, the computational complexity and the algorithm delay are halved compared with existing accompaniment extraction methods, the music components of the low-frequency region are well preserved, and the problem of excessive vocal residue in the high-frequency region is solved.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A stereo audio processing method, characterized by comprising:
S1. comparing, for each frequency bin, the phase difference between the phase in a first left-channel frequency-domain signal and the phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters;
S2. adjusting the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result to obtain a second left-channel frequency-domain signal and a second right-channel frequency-domain signal;
S3. transforming the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse code modulation data, and outputting the data.
2. The stereo audio processing method according to claim 1, characterized in that, in step S1,
the preset parameters comprise: a signal processing strength, a signal processing precision, an audio data sample rate, and a signal processing frequency range;
the phase determination curve is:
where
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio data sample rate;
FFTSIZE is the number of FFT points, calculated as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
3. The stereo audio processing method according to claim 1, characterized in that, before step S1, the method comprises:
S01. normalizing a first left-channel time-domain signal and a first right-channel time-domain signal of the stereo audio to obtain a second left-channel time-domain signal and a second right-channel time-domain signal;
S02. splitting the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval;
S03. applying a Fourier transform to the time-domain signal of each left-channel frame and each right-channel frame to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal;
S04. computing, for each frequency bin, the phase difference between the phase in the first left-channel frequency-domain signal and the phase in the first right-channel frequency-domain signal.
4. The stereo audio processing method according to claim 1, characterized in that step S2 comprises:
S21. if the phase difference is smaller than the curve value and larger than the negative of the absolute value of the curve value, i.e. -|P(FIndex)| < phase difference < |P(FIndex)|, setting the data of that frequency bin to zero in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal, thereby obtaining the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
5. The stereo audio processing method according to claim 1, characterized in that step S3 comprises:
S31. applying an inverse Fourier transform to the second left-channel frequency-domain signal and the second right-channel frequency-domain signal to obtain third left-channel time-domain signals and third right-channel time-domain signals;
S32. merging the third left-channel time-domain signals and the third right-channel time-domain signals, respectively, to obtain a fourth left-channel time-domain signal and a fourth right-channel time-domain signal;
S33. converting the fourth left-channel time-domain signal and the fourth right-channel time-domain signal into pulse code modulation data, and outputting the data.
6. A stereo audio processing system, characterized by comprising:
a comparison module, configured to compare, for each frequency bin, the phase difference between the phase in a first left-channel frequency-domain signal and the phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters;
a signal processing module, configured to adjust the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result to obtain a second left-channel frequency-domain signal and a second right-channel frequency-domain signal;
a signal post-processing module, configured to transform the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse code modulation data and output the data.
7. The stereo audio processing system according to claim 6, characterized in that the comparison module comprises:
a phase determination curve calculation unit, configured to calculate the phase determination curve from the preset parameters;
the preset parameters comprise: a signal processing strength, a signal processing precision, an audio data sample rate, and a signal processing frequency range;
the phase determination curve is:
where
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio data sample rate;
FFTSIZE is the number of FFT points, calculated as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
8. The stereo audio processing system according to claim 6, characterized by further comprising a signal pre-processing module, which comprises:
a normalization unit, configured to normalize a first left-channel time-domain signal and a first right-channel time-domain signal of the stereo audio to obtain a second left-channel time-domain signal and a second right-channel time-domain signal;
a framing unit, configured to split the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval;
a Fourier transform unit, configured to apply a Fourier transform to the time-domain signal of each left-channel frame and each right-channel frame to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal;
a phase difference calculation unit, configured to compute, for each frequency bin, the phase difference between the phase in the first left-channel frequency-domain signal and the phase in the first right-channel frequency-domain signal.
9. The stereo audio processing system according to claim 6, characterized in that the signal processing module further comprises:
a processing unit, configured to, when the phase difference is smaller than the curve value and larger than the negative of the absolute value of the curve value, i.e. -|P(FIndex)| < phase difference < |P(FIndex)|, set the data of that frequency bin to zero in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal to obtain the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
10. The stereo audio processing system according to claim 6, characterized in that the signal post-processing module further comprises:
an inverse Fourier transform unit, configured to apply an inverse Fourier transform to the second left-channel frequency-domain signal and the second right-channel frequency-domain signal to obtain third left-channel time-domain signals and third right-channel time-domain signals;
a merging unit, configured to merge the third left-channel time-domain signals and the third right-channel time-domain signals, respectively, to obtain a fourth left-channel time-domain signal and a fourth right-channel time-domain signal;
a conversion unit, configured to convert the fourth left-channel time-domain signal and the fourth right-channel time-domain signal into pulse code modulation data and output the data.
CN201910349362.4A 2019-04-28 2019-04-28 Stereo audio processing method and system Active CN110139206B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910349362.4A CN110139206B (en) 2019-04-28 2019-04-28 Stereo audio processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910349362.4A CN110139206B (en) 2019-04-28 2019-04-28 Stereo audio processing method and system

Publications (2)

Publication Number Publication Date
CN110139206A (en) 2019-08-16
CN110139206B CN110139206B (en) 2020-11-27

Family

ID=67575403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910349362.4A Active CN110139206B (en) 2019-04-28 2019-04-28 Stereo audio processing method and system

Country Status (1)

Country Link
CN (1) CN110139206B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2088589B1 (en) * 2006-11-27 2016-05-18 Sony Computer Entertainment Inc. Audio processing device and audio processing method
CN101609667A (en) * 2009-07-22 2009-12-23 福州瑞芯微电子有限公司 Realize the method for Kara OK function in the PMP player
US8964993B2 (en) * 2010-04-27 2015-02-24 Yobe, Inc. Systems and methods for enhancing audio content
CN101894559A (en) * 2010-08-05 2010-11-24 展讯通信(上海)有限公司 Audio processing method and device thereof
CN104053120A (en) * 2014-06-13 2014-09-17 福建星网视易信息系统有限公司 Method and device for processing stereo audio frequency

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111615045A (en) * 2020-06-23 2020-09-01 腾讯音乐娱乐科技(深圳)有限公司 Audio processing method, device, equipment and storage medium
CN112053669A (en) * 2020-08-27 2020-12-08 海信视像科技股份有限公司 Method, device, equipment and medium for eliminating human voice
CN112053669B (en) * 2020-08-27 2023-10-27 海信视像科技股份有限公司 Method, device, equipment and medium for eliminating human voice
CN113473352A (en) * 2021-07-06 2021-10-01 北京达佳互联信息技术有限公司 Method and device for post-processing of two-channel audio
WO2023137861A1 (en) * 2022-01-18 2023-07-27 Shenzhen SynSense Technology Co., Ltd. Divisive normalization method, device, audio feature extractor and a chip

Also Published As

Publication number Publication date
CN110139206B (en) 2020-11-27

Similar Documents

Publication Publication Date Title
CN110139206A (en) A kind of processing method and system of stereo audio
Serra et al. Spectral modeling synthesis: A sound analysis/synthesis system based on a deterministic plus stochastic decomposition
Schroeder Vocoders: Analysis and synthesis of speech
JP4906230B2 (en) A method for time adjustment of audio signals using characterization based on auditory events
Goodwin Residual modeling in music analysis-synthesis
AU2010227994B2 (en) Method and device for audio signal classifacation
JPS63259696A (en) Voice pre-processing method and apparatus
CN108172210B (en) Singing harmony generation method based on singing voice rhythm
JP2018521366A (en) Method and system for decomposing acoustic signal into sound object, sound object and use thereof
US20050065781A1 (en) Method for analysing audio signals
CN104183245A (en) Method and device for recommending music stars with tones similar to those of singers
JP4050350B2 (en) Speech recognition method and system
CN107331403A (en) A kind of audio optimization method, intelligent terminal and storage device based on algorithm
CN103258543B (en) Method for expanding artificial voice bandwidth
EP1485691A1 (en) Method and system for measuring a system&#39;s transmission quality
CN108281150B (en) Voice tone-changing voice-changing method based on differential glottal wave model
CN101449321B (en) Out-of-band signal generator and frequency band expander
Romoli et al. A novel decorrelation approach for multichannel system identification
CN104658547A (en) Method for expanding artificial voice bandwidth
JP2003510665A (en) Apparatus and method for de-esser using adaptive filtering algorithm
CN109841223A (en) A kind of acoustic signal processing method, intelligent terminal and storage medium
CN111667803B (en) Audio processing method and related products
Nam et al. Alias-free virtual analog oscillators using a feedback delay loop
Wen et al. On the characterization of slowly varying sinusoids
Purnhagen Parameter estimation and tracking for time-varying sinusoids

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant