CN110139206A - Stereo audio processing method and system - Google Patents
Stereo audio processing method and system Download PDF Info
- Publication number
- CN110139206A (application number CN201910349362.4A)
- Authority
- CN
- China
- Prior art keywords
- signal
- frequency
- right channel frequency-domain signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/36—Accompaniment arrangements
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/005—Musical accompaniment, i.e. complete instrumental rhythm synthesis added to a performed melody, e.g. as output by drum machines
Abstract
An embodiment of the present invention provides a stereo audio processing method and system, comprising: S1. comparing the phase difference between the phase of each frequency bin in a first left-channel frequency-domain signal and its phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters; S2. adjusting the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result, to obtain a second left-channel frequency-domain signal and a second right-channel frequency-domain signal; S3. transforming the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse-code-modulation data, and outputting the data. Compared with existing accompaniment extraction methods, the invention can halve computational complexity and algorithm delay while preserving the music components of the low-frequency region and avoiding excessive vocal residue in the high-frequency region.
Description
Technical field
The present invention relates to the field of audio processing, and in particular to a stereo audio processing method and system.
Background technique
When newer songs, such as network songs and original songs, are first published online, no accompaniment track is released with them, and older songs are often missing their accompaniment. As a result, people who want to sing these songs cannot find an accompaniment, and the singing experience suffers. An accompaniment extraction method that removes the vocals from a song to obtain the accompaniment, without depending on a dedicated accompaniment library server, therefore has considerable market demand.
There are several existing accompaniment extraction methods:
1. Manual extraction. The vocals are removed by hand, mainly by manually adjusting an equalizer to reduce the gain at the vocal frequency bins. Because vocal harmonics are widely distributed, manual adjustment is unsatisfactory in both time cost and effect.
2. Subtracting the left and right channels of a stereo song in the time domain. This method places high demands on left/right channel synchronization, and the vocals remain clearly audible in the processed accompaniment.
3. Frequency-domain cross-correlation. The left-channel and right-channel data are framed and a frequency-domain cross-correlation is computed; bins with high correlation values are multiplied by a small coefficient and the result is transformed back to the time domain. This method has high computational complexity; its vocal removal improves on the previous method, but the vocals in the processed accompaniment are still clearly audible.
4. Frequency-domain phase-difference and amplitude-ratio elimination. The left and right channels are framed and transformed to the frequency domain, the phase difference and amplitude ratio at each corresponding bin are computed, thresholds are set, and bins whose phase difference is below the phase threshold and/or whose amplitude ratio is below the amplitude-ratio threshold are zeroed before transforming back to the time domain. Using the amplitude ratio gives poor vocal elimination; using the phase difference improves on the previous methods but has high computational complexity, and it over-attenuates the low frequencies, so low-frequency components such as drums are largely removed from the accompaniment, while considerable vocal residue remains in the high frequencies. The resulting accompaniment sounds noticeably thin in the bass, and the residual vocals are jarring.
In view of these problems, no effective solution has yet been proposed.
Summary of the invention
An embodiment of the present invention provides a stereo audio processing method and system, in order to improve upon the phase-difference vocal elimination method.
In one aspect, an embodiment of the invention provides a stereo audio processing method, comprising:
S1. comparing the phase difference between the phase of each frequency bin in a first left-channel frequency-domain signal and its phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters;
S2. adjusting the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result, to obtain a second left-channel frequency-domain signal and a second right-channel frequency-domain signal;
S3. transforming the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse-code-modulation data, and outputting the data.
Further, in step S1:
The preset parameters include a signal processing strength, a signal processing precision, an audio sample rate, and a signal processing frequency range.
The phase determination curve P(FIndex) is:
Wherein,
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio sample rate;
FFTSIZE is the FFT length, computed as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
and
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
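Under the definitions above, the derived quantities can be sketched in Python. The curve formula itself appears only as an image in the source, so just the stated parameter relations are implemented; the function name is illustrative:

```python
def derived_params(eliminate_precision: int, eliminate_strength: float):
    """Compute the FFT length and the two phase-threshold endpoints
    (PH, PL) named in the text. How the phase determination curve
    interpolates between PL and PH over frequency is not reproduced
    in the source, so only these relations are implemented."""
    fftsize = 1024 * 2 ** eliminate_precision  # FFTSIZE = 1024 x 2^ELIMINATE_PRECISION
    ph = eliminate_strength * 0.1              # PH = ELIMINATE_STRENGTH x 0.1
    pl = eliminate_strength * 0.3              # PL = ELIMINATE_STRENGTH x 0.3
    return fftsize, ph, pl
```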
Further, before step S1 the method comprises:
S01. normalizing a first left-channel time-domain signal and a first right-channel time-domain signal of the stereo audio, to obtain a second left-channel time-domain signal and a second right-channel time-domain signal;
S02. dividing the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval;
S03. applying a Fourier transform to the time-domain signal of each left-channel frame and right-channel frame, to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal;
S04. computing the phase difference between the phase of each frequency bin in the first left-channel frequency-domain signal and its phase in the first right-channel frequency-domain signal.
Further, step S2 comprises:
S21. if the phase difference is less than the curve value and greater than the negative of the curve value's absolute value, i.e. -|P(FIndex)| < phase difference < |P(FIndex)|, zeroing the data for that frequency bin in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal, to obtain the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
Further, step S3 comprises:
S31. applying an inverse Fourier transform to the second left-channel frequency-domain signal and the second right-channel frequency-domain signal, to obtain third left-channel time-domain signals and third right-channel time-domain signals;
S32. merging the third left-channel time-domain signals and the third right-channel time-domain signals respectively, to obtain a fourth left-channel time-domain signal and a fourth right-channel time-domain signal;
S33. converting the fourth left-channel time-domain signal and the fourth right-channel time-domain signal into pulse-code-modulation data, and outputting the data.
In another aspect, an embodiment of the invention provides a stereo audio processing system, comprising:
a comparison module, for comparing the phase difference between the phase of each frequency bin in a first left-channel frequency-domain signal and its phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters;
a signal processing module, for adjusting the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result, to obtain a second left-channel frequency-domain signal and a second right-channel frequency-domain signal;
a signal post-processing module, for transforming the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse-code-modulation data, and outputting the data.
Further, the comparison module comprises:
a phase determination curve computation unit, for computing the phase determination curve from the preset parameters.
The preset parameters include a signal processing strength, a signal processing precision, an audio sample rate, and a signal processing frequency range.
The phase determination curve P(FIndex) is:
Wherein,
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio sample rate;
FFTSIZE is the FFT length, computed as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
and
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
Further, the system comprises a signal pre-processing module, comprising:
a normalization unit, for normalizing a first left-channel time-domain signal and a first right-channel time-domain signal of the stereo audio, to obtain a second left-channel time-domain signal and a second right-channel time-domain signal;
a framing unit, for dividing the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval;
a Fourier transform unit, for applying a Fourier transform to the time-domain signal of each left-channel frame and right-channel frame, to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal;
a phase difference calculation unit, for computing the phase difference between the phase of each frequency bin in the first left-channel frequency-domain signal and its phase in the first right-channel frequency-domain signal.
Further, the signal processing module comprises:
a processing unit, for zeroing, when the phase difference is less than the curve value and greater than the negative of the curve value's absolute value, i.e. -|P(FIndex)| < phase difference < |P(FIndex)|, the data for that frequency bin in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal, to obtain the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
Further, the signal post-processing module comprises:
an inverse Fourier transform unit, for applying an inverse Fourier transform to the second left-channel frequency-domain signal and the second right-channel frequency-domain signal, to obtain third left-channel time-domain signals and third right-channel time-domain signals;
a merging unit, for merging the third left-channel time-domain signals and the third right-channel time-domain signals respectively, to obtain a fourth left-channel time-domain signal and a fourth right-channel time-domain signal;
a conversion unit, for converting the fourth left-channel time-domain signal and the fourth right-channel time-domain signal into pulse-code-modulation data, and outputting the data.
The above technical solution has the following beneficial effects:
The invention can provide the accompaniment of a music audio file to the user without depending on a dedicated accompaniment library server. It can also process a song as it plays, removing the vocals so that the accompaniment is output synchronously. In addition, compared with existing accompaniment extraction methods, the invention can halve computational complexity and algorithm delay, while preserving the music components of the low-frequency region and avoiding excessive vocal residue in the high-frequency region.
Description of the drawings
To explain the embodiments of the invention or the prior art more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a stereo audio processing method according to an embodiment of the invention;
Fig. 2 is a flowchart of the pre-processing steps according to an embodiment of the invention;
Fig. 3 is a flowchart of the data processing steps according to an embodiment of the invention;
Fig. 4 is a flowchart of the post-processing steps according to an embodiment of the invention;
Fig. 5 is a structural block diagram of a stereo audio processing system according to an embodiment of the invention;
Fig. 6 is a structural block diagram of the signal pre-processing module according to an embodiment of the invention;
Fig. 7 is a structural block diagram of the signal processing module according to an embodiment of the invention;
Fig. 8 is a structural block diagram of the signal post-processing module according to an embodiment of the invention.
Detailed description
The technical solutions in the embodiments of the invention are described below clearly and completely with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Embodiment 1:
As shown in Figs. 1 to 4, a stereo audio processing method comprises:
S1. comparing the phase difference between the phase of each frequency bin in a first left-channel frequency-domain signal and its phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters.
In one embodiment, the preset parameters include a signal processing strength, a signal processing precision, an audio sample rate, and a signal processing frequency range.
The phase determination curve P(FIndex) is:
Wherein,
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio sample rate;
FFTSIZE is the FFT length, computed as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
and
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
In one embodiment, the phase of a frequency bin can be computed as follows. Let the complex value of bin FIndex in the frequency-domain signal be x + yi, and let the computed phase be r. Then:
1. take the absolute values of x and y, obtaining a and b: a = |x|, b = |y|;
2. if a and b are both 0, the phase is r = 0;
3. if a and b are not both 0, let c be the ratio of the smaller of a and b to the larger, and let s = c × c; then
phase r = ((-0.0464964749 × s + 0.15931422) × s - 0.327622764) × s × c + c;
4. map the computed value into the range -π to π:
4.1 if b > a, phase r = π/2 - r;
4.2 if x < 0, phase r = π - r;
4.3 if y < 0, phase r = -r.
This yields the phase of the frequency bin in the frequency-domain signal.
The phase difference divP(FIndex) can therefore be computed as:
divP(FIndex) = PL(FIndex) - PR(FIndex),
where PL(FIndex) is the phase of the bin in the first left-channel frequency-domain signal and PR(FIndex) is its phase in the first right-channel frequency-domain signal.
After the phase difference divP(FIndex) and the curve value P(FIndex) of the bin are obtained, the two are compared.
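The four phase steps above amount to a polynomial arctangent approximation of atan2. A minimal Python sketch, assuming c is the ratio of the smaller of a and b to the larger (the defining formula for c appears only as an image in the source):

```python
import math

def approx_phase(x: float, y: float) -> float:
    """Polynomial arctangent approximation of the phase of x + yi,
    following steps 1-4 above. c = min(a, b) / max(a, b) is an
    assumption inferred from the fold-back in step 4.1."""
    a, b = abs(x), abs(y)
    if a == 0.0 and b == 0.0:
        return 0.0                       # step 2
    c = min(a, b) / max(a, b)            # ratio in [0, 1]
    s = c * c
    r = ((-0.0464964749 * s + 0.15931422) * s - 0.327622764) * s * c + c
    if b > a:                            # step 4.1: fold back past 45 degrees
        r = math.pi / 2 - r
    if x < 0:                            # step 4.2
        r = math.pi - r
    if y < 0:                            # step 4.3
        r = -r
    return r

def phase_difference(zl: complex, zr: complex) -> float:
    """divP(FIndex) = PL(FIndex) - PR(FIndex) for one frequency bin."""
    return approx_phase(zl.real, zl.imag) - approx_phase(zr.real, zr.imag)
```

On all four quadrants the result tracks the library atan2 to within the approximation error of the degree-7 polynomial.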
In one embodiment, before step S1 the method further comprises:
S01. normalizing a first left-channel time-domain signal and a first right-channel time-domain signal of the stereo audio, to obtain a second left-channel time-domain signal and a second right-channel time-domain signal.
Normalization maps pulse-code-modulation audio data values of different bit widths into the range -1 to +1, so that the amplitudes represented by audio data of different bit widths are on the same scale, which simplifies subsequent computation.
The normalization formula is as follows:
Wherein,
val is the pulse-code-modulation audio data value, represented as a fixed-point number;
nval is the normalized audio data value, represented as a floating-point number;
bitnum is the bit width of val.
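The normalization can be sketched as follows. The formula itself appears only as an image in the source; dividing by 2^(bitnum-1) is the standard signed-PCM mapping consistent with the variables defined above and is assumed here:

```python
def normalize_pcm(val: int, bitnum: int) -> float:
    """Map a signed fixed-point PCM sample of the given bit width
    into roughly [-1, +1). Assumed formula: nval = val / 2^(bitnum-1);
    the patent's own formula is not reproduced in this text."""
    return val / float(2 ** (bitnum - 1))
```

For 16-bit audio this sends -32768 to -1.0 and 32767 to just under +1.0.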
S02. dividing the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval.
The normalized second left-channel and second right-channel time-domain signals are each divided into multiple left-channel frames and right-channel frames, and each frame is multiplied by a window function. The window length is the length of each frame and is determined by the vocal elimination precision. The window function is a periodic Hamming window, and adjacent frames overlap by 75% of their samples, so that there is a smooth transition between frames.
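A minimal sketch of the framing step, assuming a hop of one quarter of the window length to realize the stated 75% overlap (the patent does not spell out the hop or how the signal tail is handled):

```python
import math

def periodic_hamming(wlen: int) -> list:
    """Periodic Hamming window: w[n] = 0.54 - 0.46*cos(2*pi*n/wlen)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / wlen) for n in range(wlen)]

def frame_signal(samples: list, wlen: int) -> list:
    """Split a channel into windowed frames with 75% overlap
    (hop = wlen / 4). Tail samples that do not fill a whole
    frame are dropped here; that choice is an assumption."""
    hop = wlen // 4
    win = periodic_hamming(wlen)
    frames = []
    for start in range(0, len(samples) - wlen + 1, hop):
        frames.append([samples[start + n] * win[n] for n in range(wlen)])
    return frames
```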
S03. applying a Fourier transform to the time-domain signal of each left-channel frame and right-channel frame, to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal.
A Fourier transform is applied to the time-domain signal of each frame, obtaining left-channel frequency-domain data FFTDATA_L(FIndex) and right-channel frequency-domain data FFTDATA_R(FIndex). When the window length WLEN is less than the FFT length FFTSIZE, i.e. when a frame has fewer samples than the FFT length, the frame data are zero-padded at the end to the FFT length. Because the FFT of real data is conjugate-symmetric, only (FFTSIZE/2 + 1) values are kept for computation, so FIndex ranges from 0 to FFTSIZE/2.
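The transform step can be sketched with NumPy (illustrative only; the patent names no FFT library):

```python
import numpy as np

def frame_to_bins(frame, fftsize: int):
    """Zero-pad one windowed frame to FFTSIZE and take the real FFT,
    keeping the FFTSIZE/2 + 1 non-redundant bins that conjugate
    symmetry leaves, as described above."""
    padded = np.zeros(fftsize)
    padded[:len(frame)] = frame          # zero-pad short frames
    return np.fft.rfft(padded)           # length fftsize//2 + 1
```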
S04. computing the phase difference between the phase of each frequency bin in the first left-channel frequency-domain signal and its phase in the first right-channel frequency-domain signal.
In the low frequencies, the vocal fundamental overlaps with low-frequency instruments such as drums, and the human ear can barely hear the vocals superimposed on the accompaniment in that range, so skipping those bins does not affect the vocal removal and preserves more low-frequency energy. Vocal elimination is therefore not considered for bins below a frequency in the 100-200 Hz range; in this embodiment the lower frequency limit is set to 140 Hz. Above 13000 Hz there are essentially no vocal components, so vocal elimination is not considered there either.
In summary, the phase and the corresponding left-channel and right-channel frequency-domain data are computed only for bins in the following range:
With this method, the phase computation for each of the left and right channels is reduced by half, while the low- and high-frequency components of the music are preserved.
S2. adjusting the first left-channel frequency-domain signal and the first right-channel frequency-domain signal according to the comparison result, to obtain a second left-channel frequency-domain signal and a second right-channel frequency-domain signal.
In this embodiment, step S2 comprises:
S21. if the phase difference is less than the curve value and greater than the negative of the curve value's absolute value, i.e. -|P(FIndex)| < phase difference < |P(FIndex)|, zeroing the data for that frequency bin in both the first left-channel frequency-domain signal and the first right-channel frequency-domain signal, to obtain the second left-channel frequency-domain signal and the second right-channel frequency-domain signal.
That is, when the phase difference satisfies -|P(FIndex)| < phase difference < |P(FIndex)|, the bin's data in both frequency-domain signals are zeroed: FFTDATA_L(FIndex) = 0 and FFTDATA_R(FIndex) = 0.
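Step S21 can be sketched as a per-bin zeroing pass (function and argument names are illustrative):

```python
def zero_center_bins(bins_l, bins_r, div_p, curve):
    """Zero every bin whose left/right phase difference falls inside
    the +/-|P(FIndex)| band; such bins are treated as center-panned
    vocals. bins_l/bins_r are mutable sequences of complex bin
    values; div_p and curve hold the per-bin phase differences and
    phase-determination-curve values."""
    for i, (d, p) in enumerate(zip(div_p, curve)):
        if -abs(p) < d < abs(p):
            bins_l[i] = 0j
            bins_r[i] = 0j
    return bins_l, bins_r
```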
S3. transforming the second left-channel frequency-domain signal and the second right-channel frequency-domain signal into pulse-code-modulation data, and outputting the data.
In this embodiment, step S3 comprises:
S31. applying an inverse Fourier transform to the second left-channel frequency-domain signal and the second right-channel frequency-domain signal, to obtain third left-channel time-domain signals and third right-channel time-domain signals;
S32. merging the third left-channel time-domain signals and the third right-channel time-domain signals respectively, to obtain a fourth left-channel time-domain signal and a fourth right-channel time-domain signal;
S33. converting the fourth left-channel time-domain signal and the fourth right-channel time-domain signal into pulse-code-modulation data, and outputting the data.
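Steps S31 through S33 can be sketched with NumPy for one channel, assuming overlap-add as the unspecified "merging" step (the usual counterpart of 75%-overlapped windowing) and 16-bit PCM output:

```python
import numpy as np

def frames_to_pcm16(frames_bins, wlen: int, fftsize: int):
    """Inverse-FFT each frame's bins back to the time domain,
    overlap-add at the hop (wlen / 4) used in framing, and convert
    to 16-bit PCM. Overlap-add and the 16-bit target are
    assumptions; window-sum gain compensation is omitted."""
    hop = wlen // 4
    out = np.zeros(hop * (len(frames_bins) - 1) + wlen)
    for k, bins in enumerate(frames_bins):
        frame = np.fft.irfft(bins, n=fftsize)[:wlen]  # drop zero-pad tail
        out[k * hop : k * hop + wlen] += frame
    return np.clip(np.round(out * 32768.0), -32768, 32767).astype(np.int16)
```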
The embodiment of the invention can provide the accompaniment of a music audio file to the user without depending on a dedicated accompaniment library server. It can also process a song as it plays, removing the vocals so that the accompaniment is output synchronously. Furthermore, because vocal elimination is not considered in the low-frequency and high-frequency ranges, compared with existing accompaniment extraction methods the invention can halve computational complexity and algorithm delay, while preserving the music components of the low-frequency region and avoiding excessive vocal residue in the high-frequency region.
Embodiment 2:
As shown in Figs. 5 to 8, a stereo audio processing system comprises:
a comparison module 1, for comparing the phase difference between the phase of each frequency bin in a first left-channel frequency-domain signal and its phase in a first right-channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters.
In one embodiment, the comparison module 1 comprises:
a phase determination curve computation unit 11, for computing the phase determination curve from the preset parameters.
The preset parameters include a signal processing strength, a signal processing precision, an audio sample rate, and a signal processing frequency range.
The phase determination curve P(FIndex) is:
Wherein,
FIndex is the frequency bin index;
round() is the rounding function;
FS is the audio sample rate;
FFTSIZE is the FFT length, computed as FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, where ELIMINATE_PRECISION is the signal processing precision;
and
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
where ELIMINATE_STRENGTH is the signal processing strength.
In one embodiment, the phase of a frequency bin can be computed as follows. Let the complex value of bin FIndex in the frequency-domain signal be x + yi, and let the computed phase be r. Then:
1. take the absolute values of x and y, obtaining a and b: a = |x|, b = |y|;
2. if a and b are both 0, the phase is r = 0;
3. if a and b are not both 0, let c be the ratio of the smaller of a and b to the larger, and let s = c × c; then
phase r = ((-0.0464964749 × s + 0.15931422) × s - 0.327622764) × s × c + c;
4. map the computed value into the range -π to π:
4.1 if b > a, phase r = π/2 - r;
4.2 if x < 0, phase r = π - r;
4.3 if y < 0, phase r = -r.
This yields the phase of the frequency bin in the frequency-domain signal.
The phase difference divP(FIndex) can therefore be computed as:
divP(FIndex) = PL(FIndex) - PR(FIndex),
where PL(FIndex) is the phase of the bin in the first left-channel frequency-domain signal and PR(FIndex) is its phase in the first right-channel frequency-domain signal.
After the phase difference divP(FIndex) and the curve value P(FIndex) of the bin are obtained, the two are compared.
In one embodiment, the system further comprises a signal pre-processing module 0, comprising:
a normalization unit 01, for normalizing a first left-channel time-domain signal and a first right-channel time-domain signal of the stereo audio, to obtain a second left-channel time-domain signal and a second right-channel time-domain signal.
Normalization maps pulse-code-modulation audio data values of different bit widths into the range -1 to +1, so that the amplitudes represented by audio data of different bit widths are on the same scale, which simplifies subsequent computation.
The normalization formula is as follows:
Wherein,
val is the pulse-code-modulation audio data value, represented as a fixed-point number;
nval is the normalized audio data value, represented as a floating-point number;
bitnum is the bit width of val.
A framing unit 02, for dividing the second left-channel time-domain signal and the second right-channel time-domain signal into multiple left-channel frames and right-channel frames at a preset interval.
The framing unit 02 divides the normalized second left-channel and second right-channel time-domain signals into multiple left-channel frames and right-channel frames and multiplies each frame by a window function; the window length is the length of each frame and is determined by the vocal elimination precision. The window function is a periodic Hamming window, and adjacent frames overlap by 75% of their samples, so that there is a smooth transition between frames.
A Fourier transform unit 03, for applying a Fourier transform to the time-domain signal of each left-channel frame and right-channel frame, to obtain the first left-channel frequency-domain signal and the first right-channel frequency-domain signal.
The Fourier transform unit 03 applies a Fourier transform to the time-domain signal of each frame, obtaining left-channel frequency-domain data FFTDATA_L(FIndex) and right-channel frequency-domain data FFTDATA_R(FIndex). When the window length WLEN is less than the FFT length FFTSIZE, i.e. when a frame has fewer samples than the FFT length, the frame data are zero-padded at the end to the FFT length. Because the FFT of real data is conjugate-symmetric, only (FFTSIZE/2 + 1) values are kept for computation, so FIndex ranges from 0 to FFTSIZE/2.
Phase difference calculating unit 04 calculates phase of each frequency point in the first L channel frequency-region signal and first right side
The phase difference of phase in sound channel frequency-region signal.
In view of breaking in low frequency, the low frequencies musical instruments such as voice fundamental frequency and drum sound are Chong Die, and the voice human ear that accompaniment is superimposed the frequency range is several
It can not hear voice, not influence voice eradicating efficacy, retain low frequency energy to be more, therefore in 100~200Hz frequency below
Frequency point does not consider that voice is eliminated within the scope of rate, and in the present embodiment, bass frequencies lower limit is selected as 140Hz;And it is being higher than 13000Hz
Vocal components are substantially not present in frequency range, therefore also do not consider that voice is eliminated.
In summary, the phase difference calculating unit 04 computes the phase, and the corresponding left channel and right channel frequency-domain data, only for bins within the following range:

With this approach, the phase computation for each channel is roughly halved, while the low-frequency and high-frequency components of the music are preserved in full.
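The bin range can be made concrete under the standard bin mapping FIndex = Round(f × FFTSIZE / FS). That mapping is an assumption here, since the patent's range formula is reproduced only as an image; the FS and FFTSIZE values are also illustrative:

```python
FS = 44100        # audio data sample rate (illustrative)
FFTSIZE = 2048    # number of FFT points (illustrative)

# Assumed mapping from the 140 Hz / 13000 Hz limits in the text to bin
# indices; the exact formula in the patent is given as an image.
low_bin = round(140 * FFTSIZE / FS)
high_bin = round(13000 * FFTSIZE / FS)

def bins_to_process():
    """Bins whose phase difference is actually computed: bins below low_bin
    and above high_bin are left untouched, preserving bass and treble."""
    return range(low_bin, high_bin + 1)
```

With these values, only 598 of the 1025 half-spectrum bins are examined, which is where the saving in phase computation comes from.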
A signal processing module 2, configured to adjust the first left channel frequency-domain signal and the first right channel frequency-domain signal according to the comparison result, obtaining the second left channel frequency-domain signal and the second right channel frequency-domain signal.
In the present embodiment, the signal processing module 2 further includes:
A processing unit 21, configured to zero the data of a bin in both the first left channel frequency-domain signal and the first right channel frequency-domain signal when the phase difference is smaller than the curve value and larger than the negative of the curve value's absolute value, i.e. −|P(FIndex)| < phase difference < |P(FIndex)|, thereby obtaining the second left channel frequency-domain signal and the second right channel frequency-domain signal.
When the phase difference satisfies −|P(FIndex)| < phase difference < |P(FIndex)|, the processing unit 21 zeroes that bin's data in both signals, i.e. it sets the left channel frequency-domain data FFTDATA_L(FIndex) = 0 and the right channel frequency-domain data FFTDATA_R(FIndex) = 0.
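A sketch of this zeroing step, assuming NumPy. The helper name eliminate_center and the wrapping of the phase difference into (−π, π] are my assumptions; the −|P(FIndex)| < phase difference < |P(FIndex)| test and the zeroing of both channels follow the text:

```python
import numpy as np

def eliminate_center(fft_l, fft_r, curve, bins):
    """Zero the bins whose left/right phase difference lies strictly inside
    (-|P(FIndex)|, |P(FIndex)|), as processing unit 21 does; returns modified
    copies of FFTDATA_L / FFTDATA_R."""
    out_l, out_r = fft_l.copy(), fft_r.copy()
    for k in bins:
        # phase difference, wrapped into (-pi, pi]
        d = np.angle(out_l[k]) - np.angle(out_r[k])
        d = (d + np.pi) % (2 * np.pi) - np.pi
        if -abs(curve[k]) < d < abs(curve[k]):
            out_l[k] = 0.0   # FFTDATA_L(FIndex) = 0
            out_r[k] = 0.0   # FFTDATA_R(FIndex) = 0
    return out_l, out_r
```

Intuitively, center-panned content (the vocal) has near-identical phase in both channels, so its bins fall inside the curve and are removed, while side-panned accompaniment survives.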
A signal post-processing module 3, configured to convert the second left channel frequency-domain signal and the second right channel frequency-domain signal into pulse code modulation data and output it.
In the present embodiment, the signal post-processing module 3 further includes:
An inverse Fourier transform unit 31, configured to apply an inverse Fourier transform to the second left channel frequency-domain signal and the second right channel frequency-domain signal, obtaining third left channel time-domain signals and third right channel time-domain signals;
A combining unit 32, configured to merge the third left channel time-domain signals and the third right channel time-domain signals respectively, obtaining a fourth left channel time-domain signal and a fourth right channel time-domain signal;
A converting unit 33, configured to convert the fourth left channel time-domain signal and the fourth right channel time-domain signal into pulse code modulation data and output it.
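The post-processing chain can be sketched as follows. Assumptions not fixed by the text: NumPy, WLEN = 1024 with 75% overlap, overlap-add merging without window-gain normalization, and 16-bit integers as the PCM output format:

```python
import numpy as np

WLEN = 1024
HOP = WLEN // 4   # matches the 75% overlap used when framing

def overlap_add(half_spectra, fftsize):
    """Inverse-transform each processed half spectrum (unit 31) and merge the
    resulting frames by overlap-add (unit 32). Window-gain normalization is
    omitted for brevity."""
    n = len(half_spectra)
    out = np.zeros((n - 1) * HOP + WLEN)
    for i, spec in enumerate(half_spectra):
        frame = np.fft.irfft(spec, n=fftsize)[:WLEN]  # back to time domain
        out[i * HOP : i * HOP + WLEN] += frame
    return out

def to_pcm16(x):
    """Clip to [-1, 1] and convert to 16-bit PCM samples (unit 33)."""
    return (np.clip(x, -1.0, 1.0) * 32767).astype(np.int16)
```

Run once per channel, this produces the fourth left channel and fourth right channel time-domain signals and their PCM output.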
Embodiments of the present invention do not depend on a dedicated accompaniment library server and can supply accompaniment for vocal music directly to the user. A song being played in real time can be processed, its vocal eliminated, and the accompaniment output synchronously. Furthermore, because vocal elimination is not attempted in the low-frequency and high-frequency bands, the present invention halves the computational complexity and algorithmic latency relative to existing accompaniment extraction methods, preserves the musical content of the low-frequency region well, and mitigates the problem of excessive vocal residue in the high-frequency region.
The specific embodiments described above further detail the objectives, technical solutions, and beneficial effects of the present invention. It should be understood that the foregoing is merely a specific embodiment of the present invention and is not intended to limit its scope of protection; any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A method for processing stereo audio, comprising:
S1. comparing the phase difference between the phase of each frequency bin in a first left channel frequency-domain signal and the phase of that bin in a first right channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters;
S2. adjusting the first left channel frequency-domain signal and the first right channel frequency-domain signal according to the comparison result, obtaining a second left channel frequency-domain signal and a second right channel frequency-domain signal;
S3. converting the second left channel frequency-domain signal and the second right channel frequency-domain signal into pulse code modulation data and outputting it.
2. The method for processing stereo audio of claim 1, wherein, in step S1:
the preset parameters include: a signal processing intensity, a signal processing precision, an audio data sample rate, and a signal processing frequency range;
the phase determination curve is:

wherein,
FIndex is the frequency bin;
Round() is the rounding function;
FS is the audio data sample rate;
FFTSIZE is the number of FFT points, computed by the formula FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, wherein ELIMINATE_PRECISION is the signal processing precision;
and wherein
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
wherein ELIMINATE_STRENGTH is the signal processing intensity.
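For illustration, the parameter formulas above evaluate as follows; the example values of the precision and intensity are assumptions, not values taken from the claim:

```python
# Illustrative evaluation of the parameter formulas in claim 2
# (variable names follow the claim; sample values are assumptions)
ELIMINATE_PRECISION = 1          # signal processing precision
ELIMINATE_STRENGTH = 5           # signal processing intensity

FFTSIZE = 1024 * 2 ** ELIMINATE_PRECISION   # number of FFT points
PH = ELIMINATE_STRENGTH * 0.1               # high end of the curve
PL = ELIMINATE_STRENGTH * 0.3               # low end of the curve
```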
3. The method for processing stereo audio of claim 1, wherein, before step S1, the method comprises:
S01. normalizing a first left channel time-domain signal and a first right channel time-domain signal of the stereo audio, obtaining a second left channel time-domain signal and a second right channel time-domain signal;
S02. splitting the second left channel time-domain signal and the second right channel time-domain signal into multiple left channel frames and right channel frames at a preset interval;
S03. applying a Fourier transform to the time-domain signal of each left channel frame and each right channel frame, obtaining the first left channel frequency-domain signal and the first right channel frequency-domain signal;
S04. computing the phase difference between the phase of each frequency bin in the first left channel frequency-domain signal and its phase in the first right channel frequency-domain signal.
4. The method for processing stereo audio of claim 1, wherein step S2 comprises:
S21. if the phase difference is smaller than the curve value and larger than the negative of the curve value's absolute value, i.e. −|P(FIndex)| < phase difference < |P(FIndex)|, zeroing the data of the bin in both the first left channel frequency-domain signal and the first right channel frequency-domain signal, obtaining the second left channel frequency-domain signal and the second right channel frequency-domain signal.
5. The method for processing stereo audio of claim 1, wherein step S3 comprises:
S31. applying an inverse Fourier transform to the second left channel frequency-domain signal and the second right channel frequency-domain signal, obtaining third left channel time-domain signals and third right channel time-domain signals;
S32. merging the third left channel time-domain signals and the third right channel time-domain signals respectively, obtaining a fourth left channel time-domain signal and a fourth right channel time-domain signal;
S33. converting the fourth left channel time-domain signal and the fourth right channel time-domain signal into pulse code modulation data and outputting it.
6. A system for processing stereo audio, comprising:
a comparison module, configured to compare the phase difference between the phase of each frequency bin in a first left channel frequency-domain signal and its phase in a first right channel frequency-domain signal with the value, at that bin, of a phase determination curve computed from preset parameters;
a signal processing module, configured to adjust the first left channel frequency-domain signal and the first right channel frequency-domain signal according to the comparison result, obtaining a second left channel frequency-domain signal and a second right channel frequency-domain signal;
a signal post-processing module, configured to convert the second left channel frequency-domain signal and the second right channel frequency-domain signal into pulse code modulation data and output it.
7. The system for processing stereo audio of claim 6, wherein the comparison module comprises:
a phase determination curve computation unit, configured to compute the phase determination curve from the preset parameters;
the preset parameters include: a signal processing intensity, a signal processing precision, an audio data sample rate, and a signal processing frequency range;
the phase determination curve is:

wherein,
FIndex is the frequency bin;
Round() is the rounding function;
FS is the audio data sample rate;
FFTSIZE is the number of FFT points, computed by the formula FFTSIZE = 1024 × 2^ELIMINATE_PRECISION, wherein ELIMINATE_PRECISION is the signal processing precision;
and wherein
PH = ELIMINATE_STRENGTH × 0.1
PL = ELIMINATE_STRENGTH × 0.3
wherein ELIMINATE_STRENGTH is the signal processing intensity.
8. The system for processing stereo audio of claim 6, further comprising a signal pre-processing module, which comprises:
a normalization unit, configured to normalize a first left channel time-domain signal and a first right channel time-domain signal of the stereo audio, obtaining a second left channel time-domain signal and a second right channel time-domain signal;
a framing unit, configured to split the second left channel time-domain signal and the second right channel time-domain signal into multiple left channel frames and right channel frames at a preset interval;
a Fourier transform unit, configured to apply a Fourier transform to the time-domain signal of each left channel frame and each right channel frame, obtaining the first left channel frequency-domain signal and the first right channel frequency-domain signal;
a phase difference calculating unit, configured to compute the phase difference between the phase of each frequency bin in the first left channel frequency-domain signal and its phase in the first right channel frequency-domain signal.
9. The system for processing stereo audio of claim 6, wherein the signal processing module further comprises:
a processing unit, configured to zero the data of a bin in both the first left channel frequency-domain signal and the first right channel frequency-domain signal when the phase difference is smaller than the curve value and larger than the negative of the curve value's absolute value, i.e. −|P(FIndex)| < phase difference < |P(FIndex)|, obtaining the second left channel frequency-domain signal and the second right channel frequency-domain signal.
10. The system for processing stereo audio of claim 6, wherein the signal post-processing module further comprises:
an inverse Fourier transform unit, configured to apply an inverse Fourier transform to the second left channel frequency-domain signal and the second right channel frequency-domain signal, obtaining third left channel time-domain signals and third right channel time-domain signals;
a combining unit, configured to merge the third left channel time-domain signals and the third right channel time-domain signals respectively, obtaining a fourth left channel time-domain signal and a fourth right channel time-domain signal;
a converting unit, configured to convert the fourth left channel time-domain signal and the fourth right channel time-domain signal into pulse code modulation data and output it.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910349362.4A CN110139206B (en) | 2019-04-28 | 2019-04-28 | Stereo audio processing method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110139206A true CN110139206A (en) | 2019-08-16 |
CN110139206B CN110139206B (en) | 2020-11-27 |
Family
ID=67575403
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910349362.4A Active CN110139206B (en) | 2019-04-28 | 2019-04-28 | Stereo audio processing method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110139206B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111615045A (en) * | 2020-06-23 | 2020-09-01 | 腾讯音乐娱乐科技(深圳)有限公司 | Audio processing method, device, equipment and storage medium |
CN112053669A (en) * | 2020-08-27 | 2020-12-08 | 海信视像科技股份有限公司 | Method, device, equipment and medium for eliminating human voice |
CN113473352A (en) * | 2021-07-06 | 2021-10-01 | 北京达佳互联信息技术有限公司 | Method and device for post-processing of two-channel audio |
WO2023137861A1 (en) * | 2022-01-18 | 2023-07-27 | Shenzhen SynSense Technology Co., Ltd. | Divisive normalization method, device, audio feature extractor and a chip |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101609667A (en) * | 2009-07-22 | 2009-12-23 | 福州瑞芯微电子有限公司 | Realize the method for Kara OK function in the PMP player |
CN101894559A (en) * | 2010-08-05 | 2010-11-24 | 展讯通信(上海)有限公司 | Audio processing method and device thereof |
CN104053120A (en) * | 2014-06-13 | 2014-09-17 | 福建星网视易信息系统有限公司 | Method and device for processing stereo audio frequency |
US8964993B2 (en) * | 2010-04-27 | 2015-02-24 | Yobe, Inc. | Systems and methods for enhancing audio content |
EP2088589B1 (en) * | 2006-11-27 | 2016-05-18 | Sony Computer Entertainment Inc. | Audio processing device and audio processing method |
Also Published As
Publication number | Publication date |
---|---|
CN110139206B (en) | 2020-11-27 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |