CN103943113B - Method and apparatus for removing accompaniment from a song - Google Patents

Info

Publication number
CN103943113B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410151551.8A
Other languages
Chinese (zh)
Other versions
CN103943113A (en)
Inventor
王子亮
陈凤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Star Net eVideo Information Systems Co Ltd
Original Assignee
Fujian Star Net eVideo Information Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Landscapes

  • Reverberation, Karaoke And Other Acoustics (AREA)

Abstract

The present invention provides a method for removing accompaniment from a song, comprising the steps of: obtaining an accompaniment audio signal and a song audio signal; performing an FFT on the song audio signal and the accompaniment audio signal respectively to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal; enhancing the accompaniment audio signal amplitude spectrum; and subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, then performing an inverse FFT in combination with the phase of the song audio signal to obtain the audio signal with the accompaniment removed. The invention also provides an apparatus for removing accompaniment from a song. The invention improves the purity of the separated singing voice, the separation executes efficiently, and the algorithm is simple to implement.

Description

Method and apparatus for removing accompaniment from a song
Technical field
The present invention relates to the field of audio signal processing, and in particular to a method and apparatus for removing accompaniment from a song.
Background
Singing-voice separation systems are widely used in several fields, such as automatic lyrics recognition and correction. Automatic lyrics recognition usually requires that the system input be the singing voice alone, with no accompaniment. For almost all songs this is impractical, because most songs contain both the singing voice and instrumental accompaniment.
Research on separating the singing voice from music is still scarce, and the task differs from the general acoustic sound-separation problem: it is easy for people but very difficult for machines. Speech separation has been widely studied, but music is an extremely complex signal in which the singing voice and several different instruments are mixed together, and the instrument sounds are correlated with the singing voice, so blind signal separation techniques can hardly isolate a pure singing voice.
The master's thesis《Song separation based on time frequency analysis》proposes singing-voice separation based on TF analysis. Its separation process depends mainly on pitch features. In many cases the fundamental and overtones of the singing voice overlap with those of the instruments, so the pitch alone rarely yields the complete TF information of the singing voice, and the singing voice often cannot be separated from the accompaniment. The method is also algorithmically complex, computationally heavy, and inefficient to execute.
Summary of the invention
The present invention provides a method for removing accompaniment from a song that improves the purity of the separated singing voice, executes efficiently, and uses a simple algorithm.
A method for removing accompaniment from a song comprises the steps of:
obtaining an accompaniment audio signal and a song audio signal;
preprocessing the song audio signal and the accompaniment audio signal respectively and performing an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;
enhancing the accompaniment audio signal amplitude spectrum;
subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and performing an inverse FFT in combination with the phase of the song audio signal to obtain the audio signal with the accompaniment removed.
The present invention also provides an apparatus for removing accompaniment from a song, the apparatus comprising:
an accompaniment audio signal and song audio signal acquisition module, for obtaining the accompaniment audio signal and the song audio signal;
a preprocessing and FFT module, for preprocessing the song audio signal and the accompaniment audio signal respectively and performing an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;
an accompaniment amplitude spectrum enhancement module, for enhancing the accompaniment audio signal amplitude spectrum;
a spectral subtraction and inverse FFT module, for subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and performing an inverse FFT in combination with the phase of the song audio signal to obtain the audio signal with the accompaniment removed.
The beneficial effects of the invention are as follows: the invention performs an FFT on the song audio signal and the accompaniment audio signal to obtain their amplitude spectra, enhances the amplitude spectrum of the accompaniment audio signal, subtracts the enhanced accompaniment amplitude spectrum from the song amplitude spectrum, and performs an inverse FFT in combination with the phase of the song audio signal to obtain the audio signal with the accompaniment removed. This improves the purity of the separated singing voice, the separation executes efficiently, and the algorithm is simple to implement.
Brief description of the drawings
Fig. 1 is a flowchart of a method for removing accompaniment from a song in an embodiment of the present invention;
Fig. 2 is a functional block diagram of an apparatus for removing accompaniment from a song in an embodiment of the present invention;
Fig. 3 is the time-domain waveform of the song audio of the example song《Meet》in an embodiment of the present invention;
Fig. 4 is the time-domain waveform of the accompaniment audio of the example song《Meet》in an embodiment of the present invention;
Fig. 5 is the time-domain waveform of the audio of the example song《Meet》after accompaniment removal in an embodiment of the present invention.
Description of main reference numerals:
10 - accompaniment audio signal and song audio signal acquisition module; 20 - preprocessing and FFT module; 30 - accompaniment amplitude spectrum enhancement module; 40 - spectral subtraction and inverse FFT module.
Detailed description
The present invention subtracts the amplitude-enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, thereby improving the purity of the separated singing voice while keeping the separation efficient.
To describe the technical content, structural features, objects, and effects of the invention in detail, embodiments are explained below with reference to the accompanying drawings.
Fig. 1 is a flowchart of the method for removing accompaniment from a song in this embodiment. The method comprises the steps of:
S1, obtaining an accompaniment audio signal and a song audio signal;
S2, preprocessing the song audio signal and the accompaniment audio signal respectively and performing an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;
S3, enhancing the accompaniment audio signal amplitude spectrum;
S4, subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and performing an inverse FFT in combination with the phase of the song audio signal to obtain the audio signal with the accompaniment removed.
By enhancing the accompaniment audio signal amplitude spectrum, subtracting the enhanced spectrum from the song audio signal amplitude spectrum, and performing an inverse FFT in combination with the phase of the song audio signal to obtain the audio signal with the accompaniment removed, the invention helps improve the purity of the separated singing voice, and the algorithm of this embodiment is simple and efficient.
In this embodiment, step S1 "obtaining an accompaniment audio signal and a song audio signal" is carried out by obtaining both signals from stereo song audio, specifically:
inverting the left-channel signal of the stereo song audio to obtain a left inverted signal;
adding the left inverted signal to the right-channel signal to obtain the accompaniment audio signal;
and taking the right-channel signal of the stereo song audio as the song audio signal from which the accompaniment is to be removed.
The stereo song audio comprises a left-channel signal and a right-channel signal. The left-channel signal is a mix of the voice and the left-channel accompaniment, and the right-channel signal is a mix of the voice and the right-channel accompaniment.
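The channel-inversion step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_song_and_accompaniment`, the NumPy array layout (samples in rows, left and right channels in columns), and the toy signals are all assumptions made for the example.

```python
import numpy as np

def split_song_and_accompaniment(stereo):
    """stereo: float array of shape (num_samples, 2), columns = (left, right)."""
    left, right = stereo[:, 0], stereo[:, 1]
    # Invert the left channel and add the right channel. A voice that is
    # identical in both channels cancels, leaving only accompaniment content.
    accompaniment = -left + right
    # The right channel is kept as the song audio signal to be processed.
    song = right.copy()
    return accompaniment, song

# Toy input: the same "voice" in both channels, different accompaniment per channel.
t = np.arange(8) / 8.0
voice = np.sin(2 * np.pi * t)
acc_l = 0.5 * np.cos(2 * np.pi * t)
acc_r = 0.25 * np.cos(4 * np.pi * t)
stereo = np.stack([voice + acc_l, voice + acc_r], axis=1)
accompaniment, song = split_song_and_accompaniment(stereo)
```

Note that the result is the difference of the two channel accompaniments rather than either accompaniment on its own, which is one reason the later amplitude-spectrum enhancement step is useful.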
In another embodiment, step S1 "obtaining an accompaniment audio signal and a song audio signal" is likewise carried out by obtaining both signals from stereo song audio, specifically:
inverting the right-channel signal of the stereo song audio to obtain a right inverted signal;
adding the right inverted signal to the left-channel signal to obtain the accompaniment audio signal;
and taking the left-channel signal of the stereo song audio as the song audio signal from which the accompaniment is to be removed.
In other embodiments, step S1 can also be realized by other methods. As long as the song audio and the accompaniment audio corresponding to it are available, the subsequent steps can proceed.
In this embodiment, to simplify processing of the song audio signal, the preprocessing of the song audio signal and the accompaniment audio signal in step S2 is implemented as follows:
Step S20: normalize the song audio signal and the accompaniment audio signal respectively. The normalization finds the maximum value of the song audio signal and of the accompaniment audio signal respectively, and divides each signal by its own maximum value;
divide the normalized song audio signal and accompaniment audio signal into N frames each, where N is a positive integer. Each song frame and accompaniment frame contains 1024 sampling points, and every two adjacent song frames or accompaniment frames share 512 overlapping sampling points.
Normalization limits the amplitudes of the song audio signal and the accompaniment audio signal to the range -1 to +1, which simplifies subsequent processing. The signals are divided into frames with 512 overlapping sampling points between adjacent frames so that the transition from frame to frame is smooth.
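A rough sketch of the normalization and framing described above is given below. Two points are assumptions rather than statements from the text: the peak is taken as the maximum absolute value (so that the result lies between -1 and +1 as described), and trailing samples that do not fill a whole frame are dropped. The function names are hypothetical.

```python
import numpy as np

FRAME_LEN = 1024  # sampling points per frame, as in the embodiment
HOP = 512         # adjacent frames share FRAME_LEN - HOP = 512 points

def normalize(x):
    # Assumption: divide by the largest absolute sample so values lie in [-1, 1].
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x

def split_into_frames(x, frame_len=FRAME_LEN, hop=HOP):
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    # Assumption: samples after the last complete frame are discarded.
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]  # shape (n_frames, frame_len)

signal = np.arange(4096, dtype=float) - 1000.0
frames = split_into_frames(normalize(signal))
```

With 4096 input samples this yields 7 frames, and the second half of each frame equals the first half of the next one.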
In this embodiment, to reduce the spectral leakage caused by the subsequent transformation to the frequency domain, step S2 further includes applying a Hanning window to each song frame and each accompaniment frame before the FFT is performed on the song audio signal and the accompaniment audio signal.
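The windowing and FFT step might look like the sketch below. It assumes NumPy, uses `np.hanning` for the Hanning window, and uses the real-input FFT `np.fft.rfft`, which returns points 0 to 512 of a 1024-point FFT (513 values per frame). The function name and the random test frames are illustrative only.

```python
import numpy as np

def amplitude_and_phase(frames):
    """frames: array of shape (n_frames, 1024) holding time-domain frames."""
    window = np.hanning(frames.shape[1])              # Hanning window per frame
    spectrum = np.fft.rfft(frames * window, axis=1)   # points 0..512 of the FFT
    return np.abs(spectrum), np.angle(spectrum), spectrum

rng = np.random.default_rng(0)
frames = rng.standard_normal((3, 1024))
amplitude, phase, spectrum = amplitude_and_phase(frames)
```

The amplitude spectrum is what the enhancement and subtraction steps operate on; the phase (or equivalently the complex spectrum) of the song frames is kept for the inverse transform.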
In this embodiment, the amplitude spectrum enhancement of the accompaniment audio signal in step S3 is implemented as follows:
Step S30: traverse all frames of the accompaniment audio signal amplitude spectrum $M_n(i)$ ($i = 0, 1, 2, \dots, 512$; $n = 0, 1, 2, \dots, N-1$). For each point, find the maximum over the corresponding points of the amplitude spectra of the current frame, the m frames before it, and the m frames after it ($2m+1$ frames in total), and take that maximum as the new value of the point in the current frame, where m is a preset positive integer.
For example, m is 2 in one embodiment. All frames of the accompaniment audio signal amplitude spectrum are traversed, except the first 2 and the last 2 frames; for each current frame, the 2 frames before it and the 2 frames after it are found, compared, and used for assignment. If the current frame is frame 2, the 2 preceding frames are frames 0 and 1 and the 2 following frames are frames 3 and 4. These 5 frames are traversed point by point from point 0 to point 512; the maximum over the 5 frames at each point is found and assigned to the corresponding point of the current frame. If, for instance, the maximum at point 0 among the 5 frames is the value in frame 3, that value is assigned to point 0 of the current frame, frame 2. Points 1 to 512 of the 5 frames are then compared in turn and assigned to the corresponding points of the current frame. Next, frame 3 becomes the current frame, with preceding frames 1 and 2 and following frames 4 and 5, and the same comparison and assignment are carried out. The formula is $M_n(i) = \max(MM_{n-2}(i), MM_{n-1}(i), MM_n(i), MM_{n+1}(i), MM_{n+2}(i))$, $i = 0, 1, 2, \dots, 512$, $n = 2, 3, 4, \dots, N-3$, where $MM_n(i) = M_n(i)$, $i = 0, 1, 2, \dots, 512$, $n = 0, 1, 2, \dots, N-1$, is a cached copy of the accompaniment audio signal amplitude spectrum. In other embodiments, m can be set to a positive integer other than 2, such as 1, 3, or 4.
Enhancing the accompaniment audio signal amplitude spectrum in this way allows the spectral subtraction and inverse FFT step to remove the accompaniment component of the song audio signal to a greater extent.
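The enhancement rule above amounts to a sliding maximum along the frame axis. The following sketch follows the stated formula, reading from an unmodified copy `MM` and leaving the first m and last m frames untouched; the function name and the small 6-frame, 2-point test matrix are made up for illustration.

```python
import numpy as np

def enhance_accompaniment(M, m=2):
    """M: amplitude spectra of shape (N, n_points). Returns the enhanced copy."""
    MM = M.copy()    # cached copy, so already-updated frames never feed later maxima
    out = M.copy()
    n_frames = M.shape[0]
    for n in range(m, n_frames - m):   # first m and last m frames stay unchanged
        out[n] = MM[n - m:n + m + 1].max(axis=0)   # per-point max over 2m+1 frames
    return out

M = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [0.0, 5.0],
              [7.0, 1.0],
              [3.0, 2.0],
              [9.0, 9.0]])
enhanced = enhance_accompaniment(M, m=2)
```

Here only frames 2 and 3 have a full neighbourhood of 5 frames, so only those two rows change.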
In this embodiment, step S4 specifically includes:
S41: according to the formula for $Y_n(i)$ ($i = 0, 1, 2, \dots, 512$; $n = 0, 1, 2, \dots, N-1$),
traverse all song audio frames and, within each frame, points 0 to 512, subtracting the corresponding amplitude spectrum of the enhanced accompaniment audio frame from the amplitude spectrum of the song audio frame to obtain the amplitude spectra of all frames of the audio after accompaniment removal. Here $S_n(i)$ is the song audio signal amplitude spectrum, $M_n(i)$ is the enhanced accompaniment audio signal amplitude spectrum, $Y_n(i)$ is the amplitude spectrum of the audio signal after accompaniment removal, a is the signal-to-noise-ratio adjustment factor, and b is the accompaniment adjustment factor;
In this embodiment a is 2 and b is 4; in other embodiments a and b can be set to other values. Increasing a improves the signal-to-noise ratio of the audio signal after accompaniment removal, and increasing b increases the amount of accompaniment removed.
S42: according to the formula $k_n(i) = Y_n(i) / S_n(i)$ ($i = 0, 1, 2, \dots, 512$; $n = 0, 1, 2, \dots, N-1$), divide the amplitude spectrum of each audio frame after accompaniment removal by the corresponding amplitude spectrum of the song audio frame to obtain the proportionality coefficient $k_n(i)$;
multiply the real and imaginary parts of the FFT of every song audio frame by the corresponding proportionality coefficient $k_n(i)$ to obtain the FFT real and imaginary parts of points 0 to 512 of the audio frame after accompaniment removal. By the symmetry of the FFT, sample values in the two symmetric halves are complex conjugates of each other (equal real parts, opposite imaginary parts), which gives the FFT real and imaginary parts of points 513 to 1023. Then perform a 1024-point inverse FFT;
splice together the frames obtained from the inverse FFT, taking the inter-frame overlap into account, to obtain the audio signal after accompaniment removal.
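Step S42 and the splicing can be sketched as below, taking the subtracted amplitude spectrum `Y` as a given input. Several details are assumptions: `np.fft.irfft` is used because it rebuilds points 513 to 1023 from conjugate symmetry internally, a small `eps` guards against division by zero, and the overlapping frames are combined by overlap-add, which is one plausible reading of splicing with overlap. The function name and the random test data are hypothetical.

```python
import numpy as np

def resynthesize(song_spectrum, Y, frame_len=1024, hop=512, eps=1e-12):
    """song_spectrum: complex FFT points 0..512 per frame, shape (N, 513).
    Y: amplitude spectrum after accompaniment removal, same shape."""
    S = np.abs(song_spectrum)
    k = Y / np.maximum(S, eps)        # k_n(i) = Y_n(i) / S_n(i), eps is an assumption
    scaled = song_spectrum * k        # scales real and imaginary parts, phase unchanged
    # irfft fills points 513..1023 by conjugate symmetry, then inverts 1024 points.
    frames = np.fft.irfft(scaled, n=frame_len, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for n, frame in enumerate(frames):  # assumption: overlap-add of adjacent frames
        out[n * hop:n * hop + frame_len] += frame
    return out

rng = np.random.default_rng(0)
frames_in = rng.standard_normal((2, 1024))
spectrum = np.fft.rfft(frames_in, axis=1)
audio = resynthesize(spectrum, 0.5 * np.abs(spectrum))
```

In this toy call every coefficient $k_n(i)$ is 0.5, so the output is simply the overlap-added input frames at half amplitude.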
Fig. 2 is a functional block diagram of the apparatus for removing accompaniment from a song that the present invention also provides. The apparatus comprises an accompaniment audio signal and song audio signal acquisition module 10, a preprocessing and FFT module 20, an accompaniment amplitude spectrum enhancement module 30, and a spectral subtraction and inverse FFT module 40;
the accompaniment audio signal and song audio signal acquisition module is used to obtain the accompaniment audio signal and the song audio signal;
the preprocessing and FFT module is used to preprocess the song audio signal and the accompaniment audio signal respectively and perform an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;
the accompaniment amplitude spectrum enhancement module is used to enhance the amplitude spectrum of the accompaniment audio signal;
the spectral subtraction and inverse FFT module is used to subtract the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and to perform an inverse FFT in combination with the phase of the song audio signal to obtain the audio signal with the accompaniment removed.
The accompaniment amplitude spectrum enhancement module enhances the amplitude spectrum of the accompaniment audio signal so that the spectral subtraction and inverse FFT module can remove the accompaniment component of the song audio to a greater extent, thereby improving the purity of the separated singing voice.
In this embodiment, the accompaniment audio signal and song audio signal acquisition module includes an accompaniment audio signal acquisition unit and a song audio signal acquisition unit.
The accompaniment audio signal acquisition unit inverts the left-channel signal of the stereo song audio to obtain a left inverted signal, and adds the left inverted signal to the right-channel signal to obtain the accompaniment audio signal.
The song audio signal acquisition unit takes the right-channel signal of the stereo song audio as the song audio signal from which the accompaniment is to be removed.
The stereo song audio comprises a left-channel signal and a right-channel signal. The left-channel signal is a mix of the voice and the left-channel accompaniment, and the right-channel signal is a mix of the voice and the right-channel accompaniment.
In another embodiment, the accompaniment audio signal and song audio signal acquisition module likewise includes an accompaniment audio signal acquisition unit and a song audio signal acquisition unit.
The accompaniment audio signal acquisition unit inverts the right-channel signal of the stereo song audio to obtain a right inverted signal, and adds the right inverted signal to the left-channel signal to obtain the accompaniment audio signal.
The song audio signal acquisition unit takes the left-channel signal of the stereo song audio as the song audio signal from which the accompaniment is to be removed.
In other embodiments, the accompaniment audio signal and song audio signal acquisition module can also be realized by other methods. As long as the song audio and the accompaniment audio corresponding to it are available, the subsequent processing can proceed.
In the above embodiments, the preprocessing and FFT module further includes a normalization unit, a framing unit, and a windowing unit;
the normalization unit normalizes the song audio signal and the accompaniment audio signal respectively. The normalization finds the maximum value of the song audio signal and of the accompaniment audio signal respectively, and divides each signal by its own maximum value;
the framing unit divides the normalized song audio signal and accompaniment audio signal into N frames each, where N is a positive integer. Each song frame and accompaniment frame contains 1024 sampling points, and every two adjacent song frames or accompaniment frames share 512 overlapping sampling points.
The windowing unit applies a Hanning window to each song frame and each accompaniment frame.
In the above embodiments, the accompaniment amplitude spectrum enhancement module traverses all frames of the accompaniment audio signal amplitude spectrum. For each point, it finds the maximum over the corresponding points of the amplitude spectra of the current frame, the m frames before it, and the m frames after it ($2m+1$ frames in total), and takes that maximum as the new value of the point in the current frame, where m is a preset positive integer.
The spectral subtraction and inverse FFT module includes a spectral subtraction unit, an inverse FFT unit, and a splicing unit;
the spectral subtraction unit subtracts the enhanced accompaniment audio signal amplitude spectrum from the amplitude spectrum of the song audio signal to obtain the amplitude spectrum of the audio signal after accompaniment removal, according to the formula for $Y_n(i)$ ($i = 0, 1, 2, \dots, 512$; $n = 0, 1, 2, \dots, N-1$). Here $S_n(i)$ is the song audio signal amplitude spectrum, $M_n(i)$ is the enhanced accompaniment audio signal amplitude spectrum, $Y_n(i)$ is the amplitude spectrum of the audio signal after accompaniment removal, a is the signal-to-noise-ratio adjustment factor, and b is the accompaniment adjustment factor;
the inverse FFT unit performs an inverse FFT on the amplitude spectrum of the audio signal after accompaniment removal. Specifically, according to the formula $k_n(i) = Y_n(i) / S_n(i)$ ($i = 0, 1, 2, \dots, 512$; $n = 0, 1, 2, \dots, N-1$), the amplitude spectrum of the audio signal after accompaniment removal is divided by the song audio signal amplitude spectrum to obtain the proportionality coefficient $k_n(i)$; the real and imaginary parts of the FFT of the song audio signal are then each multiplied by $k_n(i)$, and a 1024-point inverse FFT is performed;
the splicing unit splices together the frames obtained from the inverse FFT to obtain the audio signal after accompaniment removal.
In summary, the method and apparatus of the invention enhance the accompaniment audio signal amplitude spectrum, subtract the enhanced accompaniment amplitude spectrum from the amplitude spectrum of the song audio signal, and perform an inverse FFT in combination with the phase of the song audio signal to obtain the audio with the accompaniment removed. This helps improve the purity of the separated singing voice, and the algorithm of this embodiment is simple and efficient.
Example
The invention is illustrated below with a concrete example of removing the accompaniment from a song.
The song is Sun Yanzi's《Meet》; the audio format is two-channel stereo.
The left channel of the stereo song《Meet》is inverted to obtain an inverted signal; the inverted signal is added to the right-channel signal of the stereo song audio to obtain the accompaniment audio signal of《Meet》; and the right-channel signal of the stereo song audio is taken as the song audio signal of《Meet》.
These steps yield 2 audio files: Meet_original singer.wav and Meet_accompany.wav.
The audio data of Meet_original singer.wav and Meet_accompany.wav are read and preprocessed, and a 1024-point FFT is performed to obtain the song audio signal amplitude spectrum of Meet_original singer and the accompaniment audio signal amplitude spectrum of Meet_accompany. The accompaniment audio signal amplitude spectrum is then enhanced according to the following formula:
$M_n(i) = \max(MM_{n-2}(i), MM_{n-1}(i), MM_n(i), MM_{n+1}(i), MM_{n+2}(i))$, $i = 0, 1, 2, \dots, 512$, $n = 2, 3, 4, \dots, N-3$, where $MM_n(i) = M_n(i)$, $i = 0, 1, 2, \dots, 512$, $n = 0, 1, 2, \dots, N-1$, denotes the cached copy of the accompaniment audio signal amplitude spectrum and N denotes the number of frames.
According to the formula for $Y_n(i)$ ($i = 0, 1, 2, \dots, 512$; $n = 0, 1, 2, \dots, N-1$), the enhanced accompaniment audio signal amplitude spectrum of Meet_accompany is subtracted from the song audio signal amplitude spectrum of Meet_original singer to obtain the amplitude spectrum of the audio signal after accompaniment removal. Here a is 2 and b is 4.
According to the formula $k_n(i) = Y_n(i) / S_n(i)$ ($i = 0, 1, 2, \dots, 512$; $n = 0, 1, 2, \dots, N-1$), the amplitude spectrum of the audio signal after accompaniment removal is divided by the song audio signal amplitude spectrum of Meet_original singer to obtain the proportionality coefficient $k_n(i)$;
the real and imaginary parts of the FFT of the song audio signal of Meet_original singer are each multiplied by the proportionality coefficient $k_n(i)$, and a 1024-point inverse FFT is performed;
the frames obtained from the inverse FFT are spliced together to obtain the audio after accompaniment removal, Meet_voice.wav.
Figs. 3 to 5 show the time-domain waveforms of the song audio, the accompaniment audio, and the audio after accompaniment removal for the song《Meet》, respectively. When Meet_voice.wav is played in an audio player, the accompaniment can be heard to be essentially removed; although the amplitude of the voice is weakened, its sound quality is close to that of the voice in the original audio.
The foregoing describes only embodiments of the invention and does not limit the scope of the patent. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the invention.

Claims (8)

1. A method for removing accompaniment from a song, characterized by comprising the steps of:
obtaining an accompaniment audio signal and a song audio signal;
preprocessing the song audio signal and the accompaniment audio signal respectively and performing an FFT to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;
enhancing the accompaniment audio signal amplitude spectrum, wherein "enhancing the accompaniment audio signal amplitude spectrum" specifically comprises the step of: traversing all frames of the accompaniment audio signal amplitude spectrum, finding for each point the maximum over the corresponding points of the amplitude spectra of the current frame, the m frames before it, and the m frames after it ($2m+1$ frames in total), and taking that maximum as the new value of the point in the current frame, where m is a preset positive integer;
subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and performing an inverse FFT in combination with the phase of the song audio signal to obtain the audio signal with the accompaniment removed.
2. The method for removing accompaniment from a song according to claim 1, characterized in that the step "obtaining an accompaniment audio signal and a song audio signal" is carried out by obtaining the accompaniment audio signal and the song audio signal from stereo song audio, specifically:
inverting the left or right channel signal of the stereo song audio to obtain a left or right inverted signal;
adding the left or right inverted signal to the right or left channel signal to obtain the accompaniment audio signal;
and taking the right or left channel signal of the stereo song audio as the song audio signal from which the accompaniment is to be removed.
3. The method for removing accompaniment from a song according to claim 1, characterized in that "preprocessing the song audio signal and the accompaniment audio signal respectively" is implemented by the steps of:
normalizing the song audio signal and the accompaniment audio signal respectively, wherein the normalization finds the maximum value of the song audio signal and of the accompaniment audio signal respectively, and divides each signal by its own maximum value;
dividing the normalized song audio signal and accompaniment audio signal into N frames each, where N is a positive integer;
applying a Hanning window to each song frame and each accompaniment frame.
4. The method for removing accompaniment from a song according to claim 1, characterized in that "subtracting the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and performing an inverse FFT in combination with the phase of the song audio signal to obtain the audio signal with the accompaniment removed" specifically comprises the steps of:
according to the formula for $Y_n(i)$, subtracting the enhanced accompaniment audio signal amplitude spectrum from the amplitude spectrum of the song audio signal to obtain the amplitude spectrum of the audio signal after accompaniment removal, where $S_n(i)$ is the song audio signal amplitude spectrum, $M_n(i)$ is the enhanced accompaniment audio signal amplitude spectrum, $Y_n(i)$ is the amplitude spectrum of the audio signal after accompaniment removal, FN is the number of FFT points, a is the signal-to-noise-ratio adjustment factor, b is the accompaniment adjustment factor, and N is a positive integer;
according to the formula $k_n(i) = Y_n(i) / S_n(i)$ ($i = 0, 1, 2, \dots, FN/2$; $n = 0, 1, 2, \dots, N-1$), dividing the amplitude spectrum of the audio signal after accompaniment removal by the song audio signal amplitude spectrum to obtain the proportionality coefficient $k_n(i)$;
multiplying the real and imaginary parts of the FFT of the song audio signal each by the proportionality coefficient $k_n(i)$, and performing an FN-point inverse FFT;
splicing together the frames obtained from the inverse FFT to obtain the audio signal after accompaniment removal.
5. A device for removing accompaniment from a song, characterized by comprising:
an accompaniment audio signal and song audio signal acquisition module, configured to obtain an accompaniment audio signal and a song audio signal;
a preprocessing and FFT module, configured to preprocess the song audio signal and the accompaniment audio signal respectively and apply an FFT to each, to obtain the song audio signal amplitude spectrum, the accompaniment audio signal amplitude spectrum, and the phase of the song audio signal;
an accompaniment amplitude spectrum enhancement module, configured to enhance the accompaniment audio signal amplitude spectrum; specifically, to traverse all frames of the accompaniment audio signal amplitude spectrum, find, for each point of the amplitude spectrum, the maximum value at that point over 2m+1 frames in total (the current frame, the m frames preceding it, and the m frames following it), and take that maximum as the new value of the corresponding point in the current frame, wherein m is a preset positive integer;
a spectral subtraction and inverse FFT module, configured to subtract the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, and to perform an inverse FFT in combination with the phase of the song audio signal, to obtain the accompaniment-removed audio signal.
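The enhancement performed by the accompaniment amplitude spectrum enhancement module amounts to a sliding maximum along the frame axis. The following is a minimal sketch, assuming the amplitude spectra are held in a NumPy array with one row per frame; the function name `enhance_accompaniment` and the clipping of the window at the first and last frames are assumptions made for illustration, since the claim does not say how edge frames are handled.

```python
import numpy as np

def enhance_accompaniment(mag, m):
    """Replace each frame by the pointwise maximum over frames n-m .. n+m.

    mag: array of shape (N, FN//2 + 1), one amplitude spectrum per frame.
    m:   preset positive integer (half-width of the window, in frames).
    """
    mag = np.asarray(mag, dtype=float)
    num_frames = mag.shape[0]
    enhanced = np.empty_like(mag)
    for n in range(num_frames):
        # Assumption: near the edges the window is clipped to existing frames.
        lo = max(0, n - m)
        hi = min(num_frames, n + m + 1)
        # Maximum of each spectral point over the (up to) 2m+1 frames.
        enhanced[n] = mag[lo:hi].max(axis=0)
    return enhanced
```

Because every enhanced value is at least as large as the original one, the later subtraction removes more accompaniment energy than it would with the unenhanced spectrum.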
6. The device for removing accompaniment from a song according to claim 5, characterized in that the accompaniment audio signal and song audio signal acquisition module comprises an accompaniment audio signal acquisition unit and a song audio signal acquisition unit;
the accompaniment audio signal acquisition unit is configured to invert the left or right channel signal of the stereo song audio to obtain a left or right inverted signal, and to add the left or right inverted signal to the right or left channel signal, respectively, to obtain the accompaniment audio signal;
the song audio signal acquisition unit is configured to take the right or left channel signal of the stereo song audio as the song audio signal from which the accompaniment is to be removed.
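The two acquisition units can be sketched in a few lines. This is an illustrative sketch only: the function name `split_song_and_accompaniment`, the `(samples, 2)` array layout with the left channel in column 0, and the `song_channel` parameter are assumptions, not part of the claim.

```python
import numpy as np

def split_song_and_accompaniment(stereo, song_channel="right"):
    """Derive the song signal and an accompaniment estimate from stereo audio.

    stereo: array of shape (num_samples, 2); column 0 = left, column 1 = right
            (assumed layout).
    """
    stereo = np.asarray(stereo, dtype=float)
    left, right = stereo[:, 0], stereo[:, 1]
    if song_channel == "right":
        # Invert the left channel and add it to the right channel.
        accompaniment = -left + right
        song = right
    elif song_channel == "left":
        # Invert the right channel and add it to the left channel.
        accompaniment = -right + left
        song = left
    else:
        raise ValueError("song_channel must be 'left' or 'right'")
    return song, accompaniment
```

Any component that is identical in both channels cancels in the sum, so the result contains only what differs between the channels; the sketch treats that as the accompaniment estimate.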
7. The device for removing accompaniment from a song according to claim 5, characterized in that the preprocessing and FFT module further comprises a normalization unit, a framing unit, and a windowing unit;
the normalization unit is configured to normalize the song audio signal and the accompaniment audio signal respectively, wherein the normalization comprises: finding the maximum value of the song audio signal and of the accompaniment audio signal respectively, and dividing each signal by its corresponding maximum value;
the framing unit is configured to divide the normalized song audio signal and the normalized accompaniment audio signal into N frames each, wherein N is a positive integer;
the windowing unit is configured to apply Hanning-window filtering to each song frame and each accompaniment frame.
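The three preprocessing units can be sketched as one small function. Several details are assumptions made for illustration and are not stated in the claim: the name `preprocess`, the use of the maximum absolute value for normalization, a fixed frame length `frame_len`, and non-overlapping frames with any incomplete tail dropped.

```python
import numpy as np

def preprocess(signal, frame_len):
    """Normalize a signal, split it into frames, and apply a Hanning window.

    Returns an array of shape (N, frame_len), where N = len(signal) // frame_len.
    """
    signal = np.asarray(signal, dtype=float)
    # Normalization: divide by the peak value (assumed to be the peak magnitude).
    peak = np.max(np.abs(signal))
    if peak > 0:
        signal = signal / peak
    # Framing: assumed non-overlapping frames; an incomplete tail is dropped.
    num_frames = len(signal) // frame_len
    frames = signal[:num_frames * frame_len].reshape(num_frames, frame_len)
    # Windowing: multiply every frame by the same Hanning window.
    return frames * np.hanning(frame_len)
```

The same function would be applied separately to the song audio signal and to the accompaniment audio signal, so that both are scaled by their own peak before their spectra are compared.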
8. The device for removing accompaniment from a song according to claim 5, characterized in that the spectral subtraction and inverse FFT module comprises a spectral subtraction unit, an inverse FFT unit, and a splicing unit;
the spectral subtraction unit is configured to subtract the enhanced accompaniment audio signal amplitude spectrum from the song audio signal amplitude spectrum, to obtain the accompaniment-removed audio signal amplitude spectrum, the formula (not reproduced in this text) being defined in terms of the following quantities: $S_n(i)$ is the song audio signal amplitude spectrum, $M_n(i)$ is the enhanced accompaniment audio signal amplitude spectrum, $Y_n(i)$ is the accompaniment-removed audio signal amplitude spectrum, FN is the number of FFT points, a is the signal-to-noise ratio adjustment factor, b is the accompaniment adjustment factor, and N is a positive integer;
the inverse FFT unit is configured to perform an inverse FFT on the accompaniment-removed audio signal amplitude spectrum; specifically, to divide the accompaniment-removed audio signal amplitude spectrum by the song audio signal amplitude spectrum according to the formula $k_n(i) = Y_n(i)/S_n(i)$ ($i = 0, 1, 2, \ldots, FN/2$; $n = 0, 1, 2, \ldots, N-1$) to obtain the proportionality coefficient $k_n(i)$, then to multiply the real part and the imaginary part of the FFT of the song audio signal by the proportionality coefficient $k_n(i)$ respectively, and to perform an FN-point inverse FFT;
the splicing unit is configured to splice together the frames obtained after the inverse FFT, to obtain the accompaniment-removed audio signal.
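The three units of this claim can be sketched together as follows. The subtraction rule used here is an assumption: the formula in the claim is not reproduced in this text, so the sketch substitutes a common spectral-subtraction form in which a scales the subtracted accompaniment spectrum and b sets a floor proportional to it. The function name `remove_accompaniment`, the default values of `a` and `b`, and the handling of bins where $S_n(i) = 0$ are likewise illustrative choices.

```python
import numpy as np

def remove_accompaniment(song_frames, enhanced_mag, a=1.0, b=0.0):
    """Spectral subtraction, proportional scaling, inverse FFT, and splicing.

    song_frames:  (N, FN) windowed time-domain song frames.
    enhanced_mag: (N, FN//2 + 1) enhanced accompaniment amplitude spectra M_n(i).
    """
    song_frames = np.asarray(song_frames, dtype=float)
    fn = song_frames.shape[1]
    spectrum = np.fft.rfft(song_frames, n=fn, axis=1)  # bins i = 0 .. FN/2
    s_mag = np.abs(spectrum)                           # S_n(i)
    # ASSUMED rule (not the patent's formula): subtract a * M, floor at b * M.
    y_mag = np.maximum(s_mag - a * enhanced_mag, b * enhanced_mag)
    # k_n(i) = Y_n(i) / S_n(i); bins with S_n(i) = 0 get k = 0 (assumption).
    k = np.divide(y_mag, s_mag, out=np.zeros_like(s_mag), where=s_mag > 0)
    # Multiplying the complex spectrum by the real k scales the real and
    # imaginary parts alike, so the phase of the song signal is kept.
    scaled = spectrum * k
    frames_out = np.fft.irfft(scaled, n=fn, axis=1)    # FN-point inverse FFT
    # Splicing: concatenate the frames end to end.
    return frames_out.reshape(-1)
```

With an all-zero accompaniment spectrum the coefficient is 1 in every non-empty bin and the input frames are returned unchanged, which is a convenient sanity check for the scaling and inverse transform.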
CN201410151551.8A 2014-04-15 2014-04-15 Method and apparatus for removing accompaniment from a song Active CN103943113B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410151551.8A CN103943113B (en) 2014-04-15 2014-04-15 Method and apparatus for removing accompaniment from a song

Publications (2)

Publication Number Publication Date
CN103943113A (en) 2014-07-23
CN103943113B (en) 2017-11-07

Family

ID=51190745

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410151551.8A Active CN103943113B (en) Method and apparatus for removing accompaniment from a song

Country Status (1)

Country Link
CN (1) CN103943113B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104183245A (en) * 2014-09-04 2014-12-03 福建星网视易信息系统有限公司 Method and device for recommending music stars with tones similar to those of singers
CN104269174B (en) * 2014-10-24 2018-02-09 北京音之邦文化科技有限公司 Audio signal processing method and device
CN104778958B (en) * 2015-03-20 2017-11-24 广东欧珀移动通信有限公司 Method and device for Noise song splicing
CN105575393A (en) * 2015-12-02 2016-05-11 中国传媒大学 Personalized song recommendation method based on voice timbre
CN106157979B (en) * 2016-06-24 2019-10-08 广州酷狗计算机科技有限公司 Method and apparatus for obtaining voice pitch data
CN106024005B (en) 2016-07-01 2018-09-25 腾讯科技(深圳)有限公司 Audio data processing method and device
CN108962277A (en) * 2018-07-20 2018-12-07 广州酷狗计算机科技有限公司 Speech signal separation method, apparatus, computer equipment and storage medium
CN109308901A (en) * 2018-09-29 2019-02-05 百度在线网络技术(北京)有限公司 Singer recognition method and device
CN109872711B (en) * 2019-01-30 2021-06-18 北京雷石天地电子技术有限公司 Song fundamental frequency extraction method and device
CN111667805B (en) * 2019-03-05 2023-10-13 腾讯科技(深圳)有限公司 Accompaniment music extraction method, accompaniment music extraction device, accompaniment music extraction equipment and accompaniment music extraction medium
CN111429937B (en) * 2020-05-09 2023-09-15 北京声智科技有限公司 Voice separation method, model training method and electronic equipment
CN113129920B (en) * 2021-04-15 2021-08-17 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Music and human voice separation method based on U-shaped network and audio fingerprint

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1924992A (en) * 2006-09-12 2007-03-07 东莞市步步高视听电子有限公司 Karaoke human voice playing method
CN1945689A (en) * 2006-10-24 2007-04-11 北京中星微电子有限公司 Method and device for extracting accompaniment music from songs
CN101577117A (en) * 2009-03-12 2009-11-11 北京中星微电子有限公司 Method and device for extracting accompaniment music
CN102402977A (en) * 2010-09-14 2012-04-04 无锡中星微电子有限公司 Method and device for extracting accompaniment and human voice from stereo music

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6815600B2 (en) * 2002-11-12 2004-11-09 Alain Georges Systems and methods for creating, modifying, interacting with and playing musical compositions

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Simple vocal extraction with Adobe Audition; Yi; "Adobe Audition Tutorial"; 2013-10-29; page 1, paragraphs 1-3, figures 4-10 *

Also Published As

Publication number Publication date
CN103943113A (en) 2014-07-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant