CN105590633A - Method and device for generation of labeled melody for song scoring - Google Patents

Method and device for generation of labeled melody for song scoring

Info

Publication number
CN105590633A
CN105590633A (application CN201510784342.1A)
Authority
CN
China
Prior art keywords
track
energy
energy distribution
accompaniment
distribution spectrum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510784342.1A
Other languages
Chinese (zh)
Inventor
张瑞怀
董昌朝
刘小峰
陈伟煌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Bailiheng Information Technology Co Ltd
Original Assignee
Fujian Bailiheng Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Bailiheng Information Technology Co Ltd filed Critical Fujian Bailiheng Information Technology Co Ltd
Priority to CN201510784342.1A priority Critical patent/CN105590633A/en
Publication of CN105590633A publication Critical patent/CN105590633A/en
Pending legal-status Critical Current

Links

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00
    • G10L 25/78 - Detection of presence or absence of voice signals
    • G10L 25/81 - Detection of presence or absence of voice signals for discriminating voice from music
    • G10L 25/03 - Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L 25/18 - Speech or voice analysis techniques in which the extracted parameters are spectral information of each sub-band
    • G10L 25/21 - Speech or voice analysis techniques in which the extracted parameters are power information
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 1/00 - Details of electrophonic musical instruments
    • G10H 2210/00 - Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H 2210/031 - Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H 2210/091 - Musical analysis for performance evaluation, i.e. judging, grading or scoring the musical qualities or faithfulness of a performance, e.g. with respect to pitch, tempo or other timings of a reference performance

Abstract

The present invention provides a method and device for generating a labeled melody for song scoring, and relates to information extraction from audio data, in particular to extracting a labeled melody from songs. The method comprises the following steps: S010, obtaining a segment of real-valued signal X0 from the original track and the corresponding segment of real-valued signal X1 from the accompaniment track; S020, applying a windowed discrete Fourier transform to X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track; and S030, calculating the energy difference between the original track and the accompaniment track in each frequency band, and obtaining the human-voice energy distribution spectrum Xmag_diff from the difference. The present invention provides a method for batch generation of music scores.

Description

Method and device for generating a labeled melody for song scoring
Technical field
The present invention relates to information extraction from audio data, and in particular to a method for extracting a labeled melody from songs.
Background art
Music is a major product of human civilization; it is not only an art but also a social culture. Different music has different social effects, and outstanding music can cultivate taste and elevate the spirit. The music industry occupies a huge share of the global entertainment industry and is closely tied to the film and television, gaming and animation industries.
Music comes in many forms, of which songs are roughly the largest category. In terms of content, a song consists of three parts: lyrics, melody and arrangement. The melody is the most distinctive feature of a song and is where songs differ most from one another. The melody of a song is made up of the accompaniment melody and the vocal melody, and within a song the vocal part is the most critical element.
As the most critical element of a song, the vocal melody is the basis of various content-based music information retrieval and comparison functions, such as query by singing, originality comparison of musical works, and recommendation algorithms based on musical similarity. The vocal melody is also important material in music education and music composition.
The inventors found, in the course of making the present invention, that there are three ways to obtain the vocal melody of a song. The first is to obtain it directly from the record company that owns the song; however, in most cases record companies do not release the original vocal melody, so this method is usually unavailable.
The second is to have musically trained staff transcribe it by ear. This is primitive and inefficient: although its accuracy is the highest, it cannot be done quickly or automatically, its labor cost is very high, and it is particularly unsuitable for processing songs in large batches.
The third is to extract the vocal melody from the perspective of audio signal processing, based on the acoustic features of the human voice and of various instruments, or based on supervised or unsupervised machine learning. However, in current music production the individual vocal and instrumental tracks may each have various effects applied before mixdown, and different mixdown processes may superimpose further unknown effects, so the problem becomes semi-blind or fully blind source separation. This makes the approach difficult, and the resulting vocal melody is not very accurate.
None of the above three methods can meet the goal of automatically and efficiently computing, in batches, the vocal melodies of a massive number of songs.
Summary of the invention
A simplified summary of one or more aspects is given below in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended neither to identify key or critical elements of all aspects nor to delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description that is presented later.
Therefore, there is a need to provide a method and device that can automatically and efficiently compute, in batches, the vocal melodies of a massive number of songs.
To achieve the above object, the inventors provide a labeled-melody generation method for song scoring, characterized by comprising the steps of: S010, obtaining a segment of real-valued signal X0 from the original track and the corresponding segment of real-valued signal X1 from the accompaniment track; S020, applying a windowed discrete Fourier transform to X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track; S030, calculating, from X0' and X1', the energy difference between the original track and the accompaniment track in each frequency band, and obtaining the human-voice energy distribution spectrum Xmag_diff from the difference; S040, calculating the fundamental frequency from Xmag_diff; segmenting the song and performing steps S010~S040 on each segment to obtain the fundamental frequency of each segment; and splicing the per-segment fundamental frequencies in time order to obtain the labeled melody for song scoring.
Unlike the prior art, the above technical solution computes the human-voice energy from the real-valued signal X0 of the original track and the real-valued signal X1 of the accompaniment track, and determines the frequency of the voice (also called its pitch) from that energy. In this way the influence of the various voices, instruments and effects mixed into the accompaniment can be cancelled, increasing the accuracy of voice identification. The method can process songs in batches, efficiently and automatically, to obtain the melody of the vocal part, which can in turn be used by a singing scoring system. To the accomplishment of the foregoing and related ends, the one or more aspects comprise the features hereinafter fully described and particularly pointed out in the claims. The following description and the annexed drawings set forth in detail certain illustrative aspects of the one or more aspects. These features are, however, indicative of but a few of the various ways in which the principles of the various aspects may be employed, and this description is intended to include all such aspects and their equivalents.
Brief description of the drawings
The disclosed aspects are described below with reference to the accompanying drawings, which are provided to illustrate and not to limit the disclosed aspects. In the drawings, like reference numerals denote like elements, wherein:
Fig. 1 is a flowchart of one implementation of the present invention;
Fig. 2 shows the original track and the accompaniment track of a song;
Fig. 3 shows the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track;
Fig. 4 shows the resulting human-voice energy distribution spectrum Xmag_diff;
Fig. 5 shows the resulting labeled melody for song scoring;
Fig. 6 is a module diagram of one embodiment of the present invention.
Description of reference numerals:
10, preprocessing module;
20, real-signal acquisition module;
30, energy computation module;
40, fundamental frequency computation module;
50, melody synthesis module.
Detailed description of the invention
To explain in detail the technical content, structural features, objects achieved and effects of the technical solution, a detailed explanation is given below in conjunction with specific embodiments and the accompanying drawings. In the following description, numerous specific details are set forth for explanatory purposes in order to provide a thorough understanding of one or more aspects. It will be evident, however, that such aspects can be practiced without these specific details.
The present invention provides a labeled-melody generation method for song scoring. Referring to Fig. 1, the steps are as follows:
S010, obtain a segment of real-valued signal X0 from the original track and the corresponding segment of real-valued signal X1 from the accompaniment track;
S020, apply a windowed Fourier transform to X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track;
S030, calculate, from X0' and X1', the energy difference between the original track and the accompaniment track in each frequency band, and obtain the human-voice energy distribution spectrum Xmag_diff from the difference;
S040, calculate the fundamental frequency from the human-voice energy distribution spectrum Xmag_diff;
Segment the song and perform steps S010~S040 on each segment to obtain the fundamental frequency of each segment; splice the per-segment fundamental frequencies in time order to obtain the labeled melody for song scoring.
The human-voice energy distribution spectrum Xmag_diff is also referred to as the vocal magnitude spectrum.
In some embodiments, the above method is carried out as follows. The real-valued signal of the original track and the real-valued signal of the accompaniment track of a song are obtained, a windowed FFT is applied to each, and the spectrum of the short signal inside the window is computed; the Fourier transform here yields the frequency-domain distribution over a period of time, i.e., the energy spectrum. The preferred analysis window length is 4096 samples, with a hop length of 256 samples. For example, Fig. 2 shows, for a certain song, the real-valued signal X0 of the original track and the real-valued signal X1 of the accompaniment track used in the windowed Fourier transform. X0 and X1 are short signals of 4096 samples (corresponding to 1:26.600~1:26.685 of the song). After obtaining X0 and X1, a Hamming-windowed FFT is applied to each, yielding the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track. The spectra X0' and X1' obtained by Fourier transform of these 4096 consecutive samples are shown in Fig. 3 (the upper curve is X0', the lower curve is X1').
The Fourier transform applied to the real-valued signals X0 and X1 may be:
X0’=fft(x0·w)
X1’=fft(x1·w)
w(n) = 0.53836 - 0.46164·cos(2πn/(N-1))
It will be understood that other Fourier transform implementations, or improved variants thereof, may be used to obtain the energy distribution from the real-valued signals. With a different Fourier transform method or a different algorithm, the resulting X0' and X1' may differ from those shown in Fig. 3.
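To illustrate steps S010 and S020, the following is a minimal Python/NumPy sketch, assuming the original track and the accompaniment track are already available as time-aligned mono sample arrays. The 4096-sample window, the 256-sample hop and the Hamming coefficients follow the values given above, while the function and variable names (frame_energy_spectra, x0, x1) are illustrative and not taken from the patent.

    import numpy as np

    def frame_energy_spectra(x0, x1, start, win_len=4096):
        """Windowed FFT of one aligned frame of the original track (x0) and the
        accompaniment track (x1); returns the magnitude spectra |X0'| and |X1'|."""
        n = np.arange(win_len)
        w = 0.53836 - 0.46164 * np.cos(2 * np.pi * n / (win_len - 1))  # Hamming window, as above
        frame0 = x0[start:start + win_len] * w
        frame1 = x1[start:start + win_len] * w
        # Real-input FFT; the magnitudes give the energy distribution over frequency bins
        return np.abs(np.fft.rfft(frame0)), np.abs(np.fft.rfft(frame1))

Sliding the start index forward by 256 samples at a time reproduces the framing described above.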
From the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track computed above, the human-voice energy distribution spectrum Xmag_diff of the two tracks is calculated. A preferred calculation is:
Formula 1: (formula not reproduced in this text)
wherein i = 1, 2, ..., N.
It will be understood that the right-hand side of the equation may be multiplied by any constant; each such form is a variant of this method. For example, a variant may also be:
Formula 2: (formula not reproduced in this text)
wherein i = 1, 2, ..., N.
The human-voice energy distribution spectrum Xmag_diff computed, for a certain segment of a certain song, from the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track is shown in Fig. 4. It will be understood that different calculation methods may yield somewhat different spectra.
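Because the patent's exact formula is not reproduced in this text, the sketch below shows only one straightforward reading of the surrounding description: a per-band difference of the two magnitude spectra, clipped at zero so that bands dominated by the accompaniment contribute no vocal energy. The clipping and the name vocal_energy_spectrum are assumptions, not details from the patent.

    import numpy as np

    def vocal_energy_spectrum(X0p, X1p):
        """Assumed form of Xmag_diff: per-band energy difference between the
        original-track spectrum |X0'| and the accompaniment-track spectrum |X1'|."""
        diff = np.asarray(X0p) - np.asarray(X1p)
        return np.maximum(diff, 0.0)  # keep only bands where the original exceeds the accompaniment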
Optionally, calculating the fundamental frequency from the human-voice energy distribution spectrum Xmag_diff comprises the following specific steps:
For each sampled frequency band within the vocal range, compute, using Xmag_diff, the energy-weighted average sum maxAvgDb for that sampled band; compute the maximum maxOfMaxAvgDbs among the energy-weighted average sums maxAvgDb of all sampled bands; the harmonic corresponding to this maximum maxOfMaxAvgDbs is the harmonic bestOfBestFreq, and the frequency corresponding to bestOfBestFreq is the fundamental frequency.
Computing the energy-weighted average sum for a sampled band comprises: computing the possible harmonics of the band and, for each harmonic, its energy-weighted average sum avgDb; and taking the maximum maxAvgDb among the avgDb of all harmonics. The harmonic bestFreq corresponding to maxAvgDb gives the most probable fundamental frequency of that sampled band. In other embodiments, if the maximum maxOfMaxAvgDbs is less than a set value, no tone is generated for the segment, i.e., the segment contains no vocals. The set value may differ for different calculation methods; it is affected by the method used to compute the energy distribution spectra X0' and X1', the method used to compute the human-voice energy distribution spectrum Xmag_diff, and the method used to compute the fundamental frequency from Xmag_diff.
The above method of calculating the fundamental frequency from the human-voice energy distribution spectrum Xmag_diff is expressed in pseudocode in the original publication (the pseudocode appears there as an image and is not reproduced in this text).
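As a stand-in for that pseudocode, the following Python sketch follows only the textual description above: candidate fundamentals are sampled across an assumed vocal range, the vocal energy at each candidate's harmonics is averaged and expressed in dB (avgDb), the best candidate is kept (maxAvgDb / maxOfMaxAvgDbs), and no pitch is returned when the best score falls below a threshold. The candidate grid, the equal weighting of harmonics, the dB conversion, the vocal-range limits and the names estimate_f0 and min_db are all assumptions, and the sketch collapses the two-level search of the description into a single loop over candidates.

    import numpy as np

    def estimate_f0(xmag_diff, sample_rate, win_len=4096,
                    fmin=80.0, fmax=1000.0, n_harmonics=5, min_db=-40.0):
        """Pick the candidate fundamental whose harmonics carry the most vocal energy;
        return None when the best score is below min_db (segment treated as unvoiced)."""
        bin_hz = sample_rate / win_len                       # width of one FFT bin in Hz
        best_freq, best_score = None, -np.inf
        for f in np.arange(fmin, fmax, bin_hz):              # sampled frequencies in the vocal range
            bins = (np.arange(1, n_harmonics + 1) * f / bin_hz).astype(int)
            bins = bins[bins < len(xmag_diff)]
            if bins.size == 0:
                continue
            # energy-weighted average over this candidate's harmonics, in dB (avgDb)
            avg_db = 20.0 * np.log10(np.mean(xmag_diff[bins]) + 1e-12)
            if avg_db > best_score:                          # track the running maximum (maxAvgDb)
                best_score, best_freq = avg_db, f
        return best_freq if best_score >= min_db else None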
The labeled melody for song scoring obtained for a certain song by the above method is shown in Fig. 5 (the figure shows only a fragment of this melody).
With the above method, the human-voice energy is computed from the real-valued signal X0 of the original track and the real-valued signal X1 of the accompaniment track, and the frequency of the voice (also called its pitch) is determined from that energy. In this way the influence of the various voices, instruments and effects mixed into the accompaniment can be cancelled, increasing the accuracy of voice identification. The method can process songs in batches, efficiently and automatically, to obtain the melody of the vocal part, which can in turn be used by a singing scoring system.
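Tying the steps together, a minimal batch pipeline under the same assumptions might look like the sketch below: the two aligned tracks are segmented with the 4096-sample window and 256-sample hop described earlier, one fundamental frequency is estimated per segment, and the results are spliced in time order. It reuses the hypothetical helpers sketched above (frame_energy_spectra, vocal_energy_spectrum, estimate_f0), so it illustrates the workflow rather than the patented implementation.

    def labeled_melody(x0, x1, sample_rate, win_len=4096, hop=256):
        """Per-segment fundamental frequencies spliced in time order (None = no vocals)."""
        melody = []
        for start in range(0, min(len(x0), len(x1)) - win_len + 1, hop):
            X0p, X1p = frame_energy_spectra(x0, x1, start, win_len)
            xmag_diff = vocal_energy_spectrum(X0p, X1p)
            melody.append((start / sample_rate, estimate_f0(xmag_diff, sample_rate, win_len)))
        return melody  # list of (time in seconds, fundamental frequency in Hz or None)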
Before the step of obtaining a segment of real-valued signal X0 from the original track and the corresponding segment of real-valued signal X1 from the accompaniment track, the method further comprises separating the original track and the accompaniment track of an MPG-format MV.
The original track and the accompaniment track of the MPG-format MV are separated. In addition, for a two-channel (stereo) track, PCA (principal component analysis) is used to extract the principal component, yielding a monaural original track and a monaural accompaniment track. As shown in Fig. 2, the one-dimensional real-valued signal in the upper half is the extracted monaural original track, and the one-dimensional real-valued signal in the lower half is the extracted monaural accompaniment track.
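One plausible reading of "extracting the principal component by PCA" is to project the two channels onto their direction of largest variance, as in the sketch below; the centering step and the name stereo_to_mono_pca are assumptions rather than details taken from the patent.

    import numpy as np

    def stereo_to_mono_pca(stereo):
        """Reduce a (num_samples, 2) stereo track to mono by projecting the two
        channels onto their first principal component."""
        x = stereo - stereo.mean(axis=0)          # center each channel
        cov = np.cov(x, rowvar=False)             # 2x2 channel covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns eigenvalues in ascending order
        return x @ eigvecs[:, -1]                 # projection onto the largest-variance direction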
The inventors also provide an electronic device for generating a labeled melody for song scoring, characterized in that it comprises a real-signal acquisition module, an energy computation module, a fundamental frequency computation module and a melody synthesis module;
the real-signal acquisition module is configured to obtain a segment of real-valued signal X0 from the original track and the corresponding segment of real-valued signal X1 from the accompaniment track;
the energy computation module is configured to apply a Fourier transform to X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track, to calculate from X0' and X1' the energy difference between the original track and the accompaniment track in each frequency band, and to obtain the human-voice energy distribution spectrum Xmag_diff from the difference;
the fundamental frequency computation module is configured to calculate the fundamental frequency from the human-voice energy distribution spectrum Xmag_diff;
the melody synthesis module is configured to combine the fundamental frequencies calculated by the fundamental frequency computation module into a labeled melody for song scoring.
In other embodiments, the energy computation module is preferably configured to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track by a Fourier transform method.
In other embodiments, the energy computation module is preferably configured to calculate the human-voice energy distribution spectrum from the energy distribution spectrum X0' and the energy distribution spectrum X1' of the accompaniment track, the calculation formula being: (formula not reproduced in this text), wherein i = 1, 2, ..., N.
In other embodiments, the fundamental frequency computation module is preferably configured to: for each sampled frequency band within the vocal range, compute, using the human-voice energy distribution spectrum Xmag_diff, the energy-weighted average sum maxAvgDb for that sampled band; compute the maximum maxOfMaxAvgDbs among the energy-weighted average sums maxAvgDb of all sampled bands; the harmonic corresponding to this maximum maxOfMaxAvgDbs is the harmonic bestOfBestDiv, and the frequency corresponding to bestOfBestDiv is the fundamental frequency. Computing the energy-weighted average sum for a sampled band comprises: computing the possible harmonics of the band and, for each harmonic, its energy-weighted average sum avgDb, and taking the maximum maxAvgDb among the avgDb of all harmonics; the harmonic bestDiv corresponding to maxAvgDb gives the most probable fundamental frequency of that sampled band. If the maximum maxOfMaxAvgDbs is less than a set value, the corresponding segment contains no vocal component.
In other embodiments, a preprocessing module is preferably further included, the preprocessing module being configured to separate the original track and the accompaniment track of a song file.
It will be understood that the song file may be a video file or an audio file. It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include" and any of their variants are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device comprising a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or terminal device. In the absence of further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or terminal device that comprises the element. Furthermore, herein, "greater than", "less than", "exceeding" and the like are to be understood as excluding the stated number, while "above", "below", "within" and the like are to be understood as including the stated number.
Those skilled in the art will appreciate that the above embodiments may be provided as a method, a device, or a computer program product, and may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. All or part of the steps of the methods in the above embodiments may be completed by hardware under the instruction of a program, which may be stored in a storage medium readable by a computer device and used to perform all or part of the steps described in the above embodiments. The computer device includes, but is not limited to: personal computers, servers, general-purpose computers, special-purpose computers, network devices, embedded devices, programmable devices, intelligent mobile terminals, smart home devices, wearable smart devices, vehicle-mounted smart devices, and the like. The storage medium includes, but is not limited to: RAM, ROM, magnetic disks, magnetic tapes, optical discs, flash memory, USB drives, portable hard drives, memory cards, memory sticks, network server storage, network cloud storage, and the like.
The above embodiments are described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to the embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a computer device to produce a machine, such that the instructions executed by the processor of the computer device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-device-readable memory capable of directing a computer device to operate in a particular manner, such that the instructions stored in that memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer device, causing a series of operational steps to be performed on the computer device to produce computer-implemented processing, such that the instructions executed on the computer device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although the above embodiments have been described, once the basic inventive concept is known, those skilled in the art can make further changes and modifications to these embodiments. Therefore, the foregoing are only embodiments of the present invention and do not limit the scope of patent protection of the present invention; any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect use in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (11)

1. A labeled-melody generation method for song scoring, characterized by comprising the steps of:
S010, obtaining a segment of real-valued signal X0 from the original track and the corresponding segment of real-valued signal X1 from the accompaniment track;
S020, applying a windowed discrete Fourier transform to X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track;
S030, calculating, from the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track, the energy difference between the original track and the accompaniment track in each frequency band, and obtaining the human-voice energy distribution spectrum Xmag_diff from the difference;
S040, calculating the fundamental frequency from the human-voice energy distribution spectrum Xmag_diff;
segmenting the song and performing steps S010~S040 on each segment to obtain the fundamental frequency of each segment, and splicing the per-segment fundamental frequencies in time order to obtain the labeled melody for song scoring.
2. The labeled-melody generation method for song scoring according to claim 1, characterized in that calculating the human-voice energy distribution spectrum Xmag_diff of the original track and the accompaniment track from the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track is specifically:
(formula not reproduced in this text), wherein i = 1, 2, ..., N.
3. The labeled-melody generation method for song scoring according to claim 1, characterized in that applying a windowed Fourier transform to the real-valued signals X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track is specifically:
X0’=fft(x0·w)
X1’=fft(x1·w)
4. The labeled-melody generation method for song scoring according to claim 1, characterized in that calculating the fundamental frequency from the human-voice energy distribution spectrum Xmag_diff comprises the steps of:
for each sampled frequency within the vocal range, computing, using the human-voice energy distribution spectrum Xmag_diff, the energy-weighted average sum maxAvgDb for that sampled frequency band; computing the maximum maxOfMaxAvgDbs among the energy-weighted average sums maxAvgDb of all sampled bands; the harmonic corresponding to this maximum maxOfMaxAvgDbs being the harmonic bestOfBestFreq, and the frequency corresponding to bestOfBestFreq being the fundamental frequency;
wherein computing the energy-weighted average sum for a sampled band comprises: computing the possible harmonics of the sampled band and, for each harmonic, its energy-weighted average sum avgDb, and taking the maximum maxAvgDb among the avgDb of all harmonics, the harmonic bestFreq corresponding to maxAvgDb giving the most probable fundamental frequency of that sampled band.
5. The labeled-melody generation method for song scoring according to claim 4, characterized in that if the maximum maxOfMaxAvgDbs is less than a set value, no tone is generated for the segment.
6. The labeled-melody generation method for song scoring according to claim 1, characterized in that, before the step of obtaining a segment of real-valued signal X0 from the original track and the corresponding segment of real-valued signal X1 from the accompaniment track, the method further comprises separating the original track and the accompaniment track of the song file.
7. An electronic device for generating a labeled melody for song scoring, characterized by comprising a real-signal acquisition module, an energy computation module, a fundamental frequency computation module and a melody synthesis module;
the real-signal acquisition module being configured to obtain a segment of real-valued signal X0 from the original track and the corresponding segment of real-valued signal X1 from the accompaniment track;
the energy computation module being configured to apply a Fourier transform to X0 and X1 to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track, to calculate from X0' and X1' the energy difference between the original track and the accompaniment track in each frequency band, and to obtain the human-voice energy distribution spectrum Xmag_diff from the difference;
the fundamental frequency computation module being configured to calculate the fundamental frequency from the human-voice energy distribution spectrum Xmag_diff;
the melody synthesis module being configured to combine the fundamental frequencies calculated by the fundamental frequency computation module into a labeled melody for song scoring.
8. The electronic device according to claim 7, characterized in that the energy computation module is configured to obtain the energy distribution spectrum X0' of the original track and the energy distribution spectrum X1' of the accompaniment track by a Fourier transform method.
9. The electronic device according to claim 7, characterized in that the energy computation module is configured to calculate the human-voice energy distribution spectrum from the energy distribution spectrum X0' and the energy distribution spectrum X1' of the accompaniment track, the calculation formula being:
(formula not reproduced in this text), wherein i = 1, 2, ..., N.
10. The electronic device according to claim 7, characterized in that the fundamental frequency computation module is configured to: for each sampled frequency band within the vocal range, compute, using the human-voice energy distribution spectrum Xmag_diff, the energy-weighted average sum maxAvgDb for that sampled band; compute the maximum maxOfMaxAvgDbs among the energy-weighted average sums maxAvgDb of all sampled bands; the harmonic corresponding to this maximum maxOfMaxAvgDbs being the harmonic bestOfBestFreq, and the frequency corresponding to bestOfBestFreq being the fundamental frequency; computing the energy-weighted average sum for a sampled band comprising: computing the possible harmonics of the sampled band and, for each harmonic, its energy-weighted average sum avgDb, and taking the maximum maxAvgDb among the avgDb of all harmonics, the harmonic bestFreq corresponding to maxAvgDb giving the most probable fundamental frequency of that sampled band; and if the maximum maxOfMaxAvgDbs is less than a set value, the corresponding segment contains no vocal component.
11. The electronic device according to claim 7, characterized by further comprising a preprocessing module configured to separate the original track and the accompaniment track of a song file.
CN201510784342.1A 2015-11-16 2015-11-16 Method and device for generation of labeled melody for song scoring Pending CN105590633A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510784342.1A CN105590633A (en) 2015-11-16 2015-11-16 Method and device for generation of labeled melody for song scoring

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510784342.1A CN105590633A (en) 2015-11-16 2015-11-16 Method and device for generation of labeled melody for song scoring

Publications (1)

Publication Number Publication Date
CN105590633A true CN105590633A (en) 2016-05-18

Family

ID=55930155

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510784342.1A Pending CN105590633A (en) 2015-11-16 2015-11-16 Method and device for generation of labeled melody for song scoring

Country Status (1)

Country Link
CN (1) CN105590633A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107910019A (en) * 2017-11-30 2018-04-13 中国科学院微电子研究所 A kind of human acoustical signal's processing and analysis method
CN109300485A (en) * 2018-11-19 2019-02-01 北京达佳互联信息技术有限公司 Methods of marking, device, electronic equipment and the computer storage medium of audio signal
WO2020015411A1 (en) * 2018-07-18 2020-01-23 阿里巴巴集团控股有限公司 Method and device for training adaptation level evaluation model, and method and device for evaluating adaptation level

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1148230A (en) * 1995-04-18 1997-04-23 德克萨斯仪器股份有限公司 Method and system for karaoke scoring
US6057502A (en) * 1999-03-30 2000-05-02 Yamaha Corporation Apparatus and method for recognizing musical chords
US20030106413A1 (en) * 2001-12-06 2003-06-12 Ramin Samadani System and method for music identification
US20050065781A1 (en) * 2001-07-24 2005-03-24 Andreas Tell Method for analysing audio signals
CN1607575A (en) * 2003-10-16 2005-04-20 扬智科技股份有限公司 Humming transcription system and methodology
CN1924992A (en) * 2006-09-12 2007-03-07 东莞市步步高视听电子有限公司 Kara Ok human voice playing method
CN1945689A (en) * 2006-10-24 2007-04-11 北京中星微电子有限公司 Method and its device for extracting accompanying music from songs
CN101238511A (en) * 2005-08-11 2008-08-06 旭化成株式会社 Sound source separating device, speech recognizing device, portable telephone, and sound source separating method, and program
CN101894552A (en) * 2010-07-16 2010-11-24 安徽科大讯飞信息科技股份有限公司 Speech spectrum segmentation based singing evaluating system
CN101944355A (en) * 2009-07-03 2011-01-12 深圳Tcl新技术有限公司 Obbligato music generation device and realization method thereof
CN102054480A (en) * 2009-10-29 2011-05-11 北京理工大学 Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT)
CN102682762A (en) * 2011-03-15 2012-09-19 新加坡科技研究局 Harmony synthesizer and method for harmonizing vocal signals
CN103426433A (en) * 2012-05-14 2013-12-04 宏达国际电子股份有限公司 Noise cancellation method
US20140039891A1 (en) * 2007-10-16 2014-02-06 Adobe Systems Incorporated Automatic separation of audio data
CN103680517A (en) * 2013-11-20 2014-03-26 华为技术有限公司 Method, device and equipment for processing audio signals
CN104134444A (en) * 2014-07-11 2014-11-05 福建星网视易信息系统有限公司 Song accompaniment removing method and device based on MMSE
CN104219556A (en) * 2014-09-12 2014-12-17 北京阳光视翰科技有限公司 Use method of four-soundtrack karaoke identification playing system
CN104538011A (en) * 2014-10-30 2015-04-22 华为技术有限公司 Tone adjusting method and device and terminal device
CN104683933A (en) * 2013-11-29 2015-06-03 杜比实验室特许公司 Audio object extraction method

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1148230A (en) * 1995-04-18 1997-04-23 德克萨斯仪器股份有限公司 Method and system for karaoke scoring
US6057502A (en) * 1999-03-30 2000-05-02 Yamaha Corporation Apparatus and method for recognizing musical chords
US20050065781A1 (en) * 2001-07-24 2005-03-24 Andreas Tell Method for analysing audio signals
US20030106413A1 (en) * 2001-12-06 2003-06-12 Ramin Samadani System and method for music identification
CN1607575A (en) * 2003-10-16 2005-04-20 扬智科技股份有限公司 Humming transcription system and methodology
CN101238511A (en) * 2005-08-11 2008-08-06 旭化成株式会社 Sound source separating device, speech recognizing device, portable telephone, and sound source separating method, and program
CN1924992A (en) * 2006-09-12 2007-03-07 东莞市步步高视听电子有限公司 Kara Ok human voice playing method
CN1945689A (en) * 2006-10-24 2007-04-11 北京中星微电子有限公司 Method and its device for extracting accompanying music from songs
US20140039891A1 (en) * 2007-10-16 2014-02-06 Adobe Systems Incorporated Automatic separation of audio data
CN101944355A (en) * 2009-07-03 2011-01-12 深圳Tcl新技术有限公司 Obbligato music generation device and realization method thereof
CN102054480A (en) * 2009-10-29 2011-05-11 北京理工大学 Method for separating monaural overlapping speeches based on fractional Fourier transform (FrFT)
CN101894552A (en) * 2010-07-16 2010-11-24 安徽科大讯飞信息科技股份有限公司 Speech spectrum segmentation based singing evaluating system
CN102682762A (en) * 2011-03-15 2012-09-19 新加坡科技研究局 Harmony synthesizer and method for harmonizing vocal signals
CN103426433A (en) * 2012-05-14 2013-12-04 宏达国际电子股份有限公司 Noise cancellation method
CN103680517A (en) * 2013-11-20 2014-03-26 华为技术有限公司 Method, device and equipment for processing audio signals
CN104683933A (en) * 2013-11-29 2015-06-03 杜比实验室特许公司 Audio object extraction method
CN104134444A (en) * 2014-07-11 2014-11-05 福建星网视易信息系统有限公司 Song accompaniment removing method and device based on MMSE
CN104219556A (en) * 2014-09-12 2014-12-17 北京阳光视翰科技有限公司 Use method of four-soundtrack karaoke identification playing system
CN104538011A (en) * 2014-10-30 2015-04-22 华为技术有限公司 Tone adjusting method and device and terminal device

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107910019A (en) * 2017-11-30 2018-04-13 中国科学院微电子研究所 A kind of human acoustical signal's processing and analysis method
WO2020015411A1 (en) * 2018-07-18 2020-01-23 阿里巴巴集团控股有限公司 Method and device for training adaptation level evaluation model, and method and device for evaluating adaptation level
US11074897B2 (en) 2018-07-18 2021-07-27 Advanced New Technologies Co., Ltd. Method and apparatus for training adaptation quality evaluation model, and method and apparatus for evaluating adaptation quality
US11367424B2 (en) 2018-07-18 2022-06-21 Advanced New Technologies Co., Ltd. Method and apparatus for training adaptation quality evaluation model, and method and apparatus for evaluating adaptation quality
CN109300485A (en) * 2018-11-19 2019-02-01 北京达佳互联信息技术有限公司 Methods of marking, device, electronic equipment and the computer storage medium of audio signal
CN109300485B (en) * 2018-11-19 2022-06-10 北京达佳互联信息技术有限公司 Scoring method and device for audio signal, electronic equipment and computer storage medium

Similar Documents

Publication Publication Date Title
Nanni et al. Combining visual and acoustic features for audio classification tasks
CN103440873B (en) A kind of music recommend method based on similarity
Schlüter Learning to Pinpoint Singing Voice from Weakly Labeled Examples.
Lehner et al. On the reduction of false positives in singing voice detection
Lagrange et al. Normalized cuts for predominant melodic source separation
Lu et al. Fog computing approach for music cognition system based on machine learning algorithm
CN111400543B (en) Audio fragment matching method, device, equipment and storage medium
Wu et al. Combining visual and acoustic features for music genre classification
CN103871426A (en) Method and system for comparing similarity between user audio frequency and original audio frequency
CN105679324A (en) Voiceprint identification similarity scoring method and apparatus
CN104282316A (en) Karaoke scoring method based on voice matching, and device thereof
CN103489445A (en) Method and device for recognizing human voices in audio
CN105590633A (en) Method and device for generation of labeled melody for song scoring
CN107767850A (en) A kind of singing marking method and system
WO2024021882A1 (en) Audio data processing method and apparatus, and computer device and storage medium
Dandawate et al. Indian instrumental music: Raga analysis and classification
CN111445922B (en) Audio matching method, device, computer equipment and storage medium
Felipe et al. Acoustic scene classification using spectrograms
Yang et al. Research based on the application and exploration of artificial intelligence in the field of traditional music
Dhall et al. Music genre classification with convolutional neural networks and comparison with f, q, and mel spectrogram-based images
Smaragdis Polyphonic pitch tracking by example
MX2022006798A (en) Method for music generation, electronic device, storage medium cross reference to related applications.
Tsai et al. Clustering music recordings based on genres
Nagathil et al. Musical genre classification based on a highly-resolved cepstral modulation spectrum
Shirali-Shahreza et al. Fast and scalable system for automatic artist identification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160518