CN108806657A - Music model training, musical composition method, apparatus, terminal and storage medium - Google Patents
- Publication number
- CN108806657A CN108806657A CN201810570846.7A CN201810570846A CN108806657A CN 108806657 A CN108806657 A CN 108806657A CN 201810570846 A CN201810570846 A CN 201810570846A CN 108806657 A CN108806657 A CN 108806657A
- Authority
- CN
- China
- Prior art keywords
- midi
- music
- music score
- feature vector
- midi music
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H7/00—Instruments in which the tones are synthesised from a data store, e.g. computer organs
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/101—Music Composition or musical creation; Tools or processes therefor
- G10H2210/111—Automatic composing, i.e. using predefined musical rules
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/011—Files or data streams containing coded musical information, e.g. for transmission
- G10H2240/016—File editing, i.e. modifying musical data files or streams as such
- G10H2240/021—File editing, i.e. modifying musical data files or streams as such for MIDI-like files or data streams
Abstract
A music model training method includes: obtaining a MIDI music data set, the MIDI music data set comprising multiple MIDI scores; extracting a feature vector from each MIDI score; and inputting the feature vectors into a structured support vector machine for training to obtain a music model, comprising: constructing a discriminant function f(x; w), where x is a feature vector and w is a parameter vector, and outputting as the predicted value the data value y that maximizes f(x; w); computing the loss between the predicted value and the actual value according to a preset loss function, where P is the probability distribution of the data and the expected risk is replaced by the empirical risk computed on the training sample data; solving for the unique parameter vector ω with the SVM optimization formulation so that the empirical risk on the training sample data is zero; and solving the discriminant function f(x; ω) to finally output a musical time sequence. The present invention also provides a musical composition method, an apparatus, a terminal and storage media. The present invention is the first to use artificial intelligence to train a music model, and the trained music model can improve the feature-extraction capability for MIDI scores.
Description
Technical field
The present invention relates to the field of music technology, and in particular to a music model training method, a musical composition method, an apparatus, a terminal and a storage medium.
Background technology
In all fields of audio creation (for example, studio recording, live performance, broadcasting), a series of signal processing tools is commonly used to process audio signals. This includes processing individual audio signals, such as mixing for mastering, as well as processing and combining multiple audio signals created by different sound sources (for example, the component instruments of an ensemble). The goal of this processing is to improve the aesthetic quality of the resulting audio signal, such as creating a high-quality mix when combining multiple signals, or to satisfy certain functional constraints related to storage and transmission, such as minimizing the signal degradation caused by data compression schemes like mp3, or mitigating the effect of background noise on an aircraft. Currently, this work is done manually by technicians who specialize in the particular area of creation, which is very labor-intensive.
Invention content
In view of the foregoing, it is necessary to propose a music model training and/or musical composition method, apparatus, terminal and storage medium, so that a user can play a few notes on a piano and obtain a complete, rich song that can be played back, saving time and effort and eliminating the need for costly technicians who specialize in composition.
The first aspect of the present invention provides a music model training method, the method comprising:
obtaining a MIDI music data set, the MIDI music data set comprising multiple MIDI scores;
extracting a feature vector from each MIDI score;
inputting the feature vectors into a structured support vector machine for training to obtain a music model, comprising: constructing a discriminant function f(x; w), where x is a feature vector and w is a parameter vector, and outputting as the predicted value the data value y that maximizes f(x; w); computing the loss between the predicted value and the actual value according to a preset loss function, where P is the probability distribution of the data, and replacing the expected risk with the empirical risk computed on the training sample data; solving for the unique parameter vector ω with the SVM optimization formulation so that the empirical risk on the training sample data is zero; and solving the discriminant function f(x; ω) to finally output a musical time sequence.
Preferably, extracting the feature vector of each MIDI score comprises:
extracting the pitch sequence of the MIDI score as a first feature vector;
extracting the time series of the MIDI score as a second feature vector;
concatenating the first feature vector and the second feature vector to obtain the feature vector of the MIDI score.
Preferably, extracting the time series of the MIDI score as a second feature vector further comprises:
computing the greatest common divisor of all the time values as the unit time; or
computing, for each time value, its multiple of the unit time, and using that multiple as the timing sequence of the corresponding key.
Preferably, the method further comprises:
dividing the acquired MIDI music data set into a first data set and a second data set;
randomly selecting a first preset quantity of scores from the first data set to participate in the training of the music model;
randomly selecting one MIDI score from the second data set;
extracting the feature vector of the selected MIDI score over a preset time period;
inputting the feature vector over the preset time period into the trained music model and outputting a corresponding MIDI score;
verifying the performance of the trained music model according to the selected MIDI score and the output MIDI score.
Preferably, verifying the performance of the trained music model comprises:
extracting a first waveform of the selected MIDI score;
extracting a second waveform of the output MIDI score;
computing the similarity between the first waveform and the second waveform;
judging whether the similarity exceeds a preset similarity threshold;
if the similarity is greater than or equal to the preset similarity threshold, determining that the performance of the trained music model is good;
if the similarity is less than the preset similarity threshold, determining that the performance of the trained music model is poor.
The second aspect of the present invention provides a musical composition method, the method comprising:
acquiring a MIDI score containing several MIDI notes created by a user, as the MIDI score to be composed;
extracting the pitch sequence of the MIDI score to be composed as a third feature vector;
extracting the time series of the MIDI score to be composed as a fourth feature vector;
concatenating the third feature vector and the fourth feature vector to obtain the feature vector of the MIDI score;
inputting the feature vector into a pre-trained music model for learning, wherein the music model is trained with the music model training apparatus;
outputting a corresponding MIDI score.
The third aspect of the present invention provides a music model training apparatus, the apparatus comprising:
an acquisition module for obtaining a MIDI music data set, the MIDI music data set comprising multiple MIDI scores;
an extraction module for extracting the feature vector of each MIDI score;
a training module for inputting the feature vectors into a structured support vector machine for training to obtain a music model, comprising: constructing a discriminant function f(x; w), where x is a feature vector and w is a parameter vector, and outputting as the predicted value the data value y that maximizes f(x; w); computing the loss between the predicted value and the actual value according to a preset loss function, where P is the probability distribution of the data, and replacing the expected risk with the empirical risk computed on the training sample data; solving for the unique parameter vector ω with the SVM optimization formulation so that the empirical risk on the training sample data is zero; and solving the discriminant function f(x; ω) to finally output a musical time sequence.
The fourth aspect of the present invention provides a musical composition apparatus, the apparatus comprising:
an acquisition module for acquiring a MIDI score containing several MIDI notes created by a user, as the MIDI score to be composed;
a first extraction module for extracting the pitch sequence of the MIDI score to be composed as a third feature vector;
a second extraction module for extracting the time series of the MIDI score to be composed as a fourth feature vector;
a concatenation module for concatenating the third feature vector and the fourth feature vector to obtain the feature vector of the MIDI score;
a learning module for inputting the feature vector into a pre-trained music model for learning, wherein the music model is trained with the music model training apparatus;
an output module for outputting a corresponding MIDI score.
The fifth aspect of the present invention provides a terminal, the terminal comprising a processor and a memory, the processor being configured to implement the music model training method and/or the musical composition method when executing a computer program stored in the memory.
The sixth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the music model training method and/or the musical composition method when executed by a processor.
The present invention is the first to use artificial intelligence to train a music model, and the trained music model can significantly improve the feature-extraction capability for MIDI scores. With the trained music model, only a few MIDI notes need to be collected to compose MIDI music, which greatly reduces the cost of composing MIDI melodies, saves the expense of a large number of band performers, shortens studio recording time, and improves work efficiency.
Description of the drawings
In order to explain the technical solutions of the embodiments of the present invention or the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is the flow chart for the music model training method that the embodiment of the present invention one provides.
Fig. 2 is the flow chart of musical composition method provided by Embodiment 2 of the present invention.
Fig. 3 is the functional block diagram for the music model training apparatus that the embodiment of the present invention three provides.
Fig. 4 is the functional block diagram for the musical composition device that the embodiment of the present invention four provides.
Fig. 5 is the schematic diagram for the terminal that the embodiment of the present invention five provides.
The present invention will be further described in the following specific embodiments in conjunction with the above drawings.
Specific embodiments
To better understand the objects, features and advantages of the present invention, the present invention is described in detail below with reference to the drawings and specific embodiments. It should be noted that, in the absence of conflict, the embodiments of the present invention and the features in the embodiments can be combined with each other.
Many details are set forth in the following description to facilitate a thorough understanding of the present invention. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present invention. The terms used in the description are only for the purpose of describing specific embodiments and are not intended to limit the present invention.
The music model training method and/or musical composition method of the embodiments of the present invention are applied to one or more terminals. They can also be applied to a hardware environment consisting of a terminal and a server connected to the terminal through a network. Networks include, but are not limited to: a wide area network, a metropolitan area network, or a local area network. The music model training method and/or musical composition method of the embodiments of the present invention can be executed by a server, by a terminal, or jointly by a server and a terminal.
A terminal that needs to perform the music model training method and/or musical composition method can directly integrate the music model training and/or musical composition functions provided by the method of the present invention, or install a client for implementing the method of the present invention. As another example, the method provided by the present invention can also run on devices such as a server in the form of a Software Development Kit (SDK); interfaces for the music model training and/or musical composition functions are provided in the form of an SDK, and a terminal or other device can perform music model training and/or compose music through the provided interfaces.
Embodiment one
Fig. 1 is the flow chart of the music model training method provided by Embodiment 1 of the present invention. According to different requirements, the execution order in the flow chart can change, and certain steps can be omitted.
S11: obtain a MIDI music data set, the MIDI music data set comprising multiple MIDI scores.
MIDI (Musical Instrument Digital Interface) is the most widely used music standard format in the music industry, often described as "a music score that computers can understand": it records music as digital control signals of notes. What MIDI transmits is not an audio signal but instructions such as notes and control parameters; it tells a MIDI device what to do and how to do it, such as which note to play, at what volume, with what tone, and with what accompaniment. That is, MIDI data includes information such as the channel, time, pitch, dynamics, volume and reverberation that an instrument sends to a sound-producing device such as a MIDI synthesizer.
In this embodiment, the acquisition of the MIDI music data set may include:
1) collection from the NineS data set, which contains nine difficulty levels, each difficulty level with the data of 50 MIDI scores;
2) collection from the FourS data set, which consists of 400 MIDI piano score files in four difficulty levels, each difficulty level with the data of 100 MIDI scores.
The NineS and FourS data sets are professional MIDI music data sets dedicated to collecting MIDI music.
S12: extract the feature vector of each MIDI score.
In this embodiment, extracting the feature vector of each MIDI score includes:
1) extracting the pitch sequence of the MIDI score as the first feature vector.
A piano has black keys and white keys, 88 keys in total, each key representing a different pitch, so an 88-dimensional vector can be used to represent pitch.
A first identifier can be preset to mark the pitch of a key when it is pressed, and a second identifier can be preset to mark the pitch of a key when it is not pressed. The first identifier can be 1 and the second identifier can be 0, so the pitch sequence of the MIDI score can be marked with a 0-1 labeling method, recording which key is pressed at each instant. For example, when the 1st key of the piano is pressed at the first instant, its pitch is 1 and the pitch of the remaining 87 keys at the first instant is 0, so the pitch vector at the first instant is denoted P1 = (1, 0, 0, 0, ..., 0, 0); when the 5th key of the piano is pressed at the second instant, its pitch is 1 and the pitch of the remaining 87 keys at the second instant is 0, so the pitch vector at the second instant is denoted P5 = (0, 0, 0, 0, 1, ..., 0, 0).
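The 0-1 pitch labeling above can be sketched in a few lines of Python. This is a minimal illustration; the function name and the 1-based key indexing are assumptions for this example, not part of the patent:

```python
def pitch_vector(pressed_keys, n_keys=88):
    # One 88-dimensional 0/1 vector per instant: a key's position holds
    # the first identifier (1) while pressed, the second identifier (0) otherwise.
    v = [0] * n_keys
    for k in pressed_keys:   # 1-based key indices, as in the patent's examples
        v[k - 1] = 1
    return v

p1 = pitch_vector([1])   # first instant: key 1 pressed  -> P1 = (1, 0, 0, ..., 0)
p5 = pitch_vector([5])   # second instant: key 5 pressed -> P5 = (0, 0, 0, 0, 1, ..., 0)
```

A chord would simply set several positions to 1 at the same instant, e.g. `pitch_vector([1, 5])`.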
2) extracting the time series of the MIDI score as the second feature vector.
The time series refers to the time a key remains pressed after it is pressed at some instant, which can be denoted T. For example, if the 3rd key is held down for 2 seconds at some instant, the timing value of the 3rd key is 2.
3) concatenating the first feature vector and the second feature vector to obtain the feature vector of the MIDI score.
The first feature vector and the second feature vector can be connected sequentially; for example, the resulting feature vector of the MIDI score is denoted (P1, P2, ..., T1, T2, ...). The first feature vector and the second feature vector can also be interleaved; for example, the resulting feature vector of the MIDI score is denoted (P1, T1, P2, T2, ...).
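The two connection orders can be illustrated as follows. This is a sketch: the helper names are invented for this example, and short toy vectors stand in for the 88-dimensional pitch vectors:

```python
def concat_sequential(pitch_vecs, durations):
    # Sequential connection: (P1, P2, ..., PN, T1, T2, ..., TN)
    feat = [x for p in pitch_vecs for x in p]
    feat.extend(durations)
    return feat

def concat_interleaved(pitch_vecs, durations):
    # Interleaved connection: (P1, T1, P2, T2, ..., PN, TN)
    feat = []
    for p, t in zip(pitch_vecs, durations):
        feat.extend(p)
        feat.append(t)
    return feat

# Two instants with 3-key toy vectors instead of 88-key ones:
seq = concat_sequential([[1, 0, 0], [0, 1, 0]], [2, 1])
mix = concat_interleaved([[1, 0, 0], [0, 1, 0]], [2, 1])
```

Whichever order is chosen at training time must be reused at composition time, as Embodiment 2 notes.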
Preferably, in order to avoid excessively large timing values, extracting the time series of the MIDI score as the second feature vector can also include:
computing the greatest common divisor of all hold times as the unit time;
computing, for each hold time, its multiple of the unit time, and using that multiple as the timing value of the corresponding key.
For example, the greatest common divisor of all hold times is the unit duration; if the hold time of a key is K times the unit duration, the timing value of that key is denoted K, and on playback the pitch of the key is repeated K times to express the duration of that pitch.
In other embodiments, the unit time can also be a preset arbitrary number, for example, 3 seconds.
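The unit-time computation above can be sketched as follows. The function name is an assumption, and hold times are taken as integer milliseconds so the greatest common divisor is well defined:

```python
from math import gcd
from functools import reduce

def quantize_durations(durations_ms):
    # The greatest common divisor of all hold times is the unit time;
    # each key's timing value is its hold time expressed as a multiple K
    # of that unit, so its pitch can be repeated K times on playback.
    unit = reduce(gcd, durations_ms)
    return unit, [d // unit for d in durations_ms]

unit, timing = quantize_durations([2000, 1000, 3000])
# unit = 1000 ms; timing = [2, 1, 3]
```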
S13: input the feature vectors into a structured support vector machine for training to obtain a music model.
Unlike a traditional support vector machine (SVM), a structured support vector machine can construct a suitable structured feature function Ψ(x, y) according to the structural characteristics of the music data, so that it can effectively handle complex structured music data. The goal of the algorithm is to find a discriminant function f(x; ω) for prediction; after the discriminant function is determined, given a music data input x, the data value y that maximizes f(x; ω) is selected as the output.
The specific implementation process is as follows:
1) Input the feature vectors X = {Mi | i = 1, 2, ..., N}; each time point has two data values: pitch and duration.
2) Construct the discriminant function f(x; w), and output as the predicted value the data value y that maximizes f(x; w).
As shown in formula (1-1):

f(x; w) = argmax_{y ∈ Y} F(x, y; w)    (1-1)

where ω is the parameter vector. Assume that F is linear in the joint features of the input and output, as shown in (1-2):

F(x, y; w) = ⟨w, Ψ(x, y)⟩    (1-2)

where the structured feature function Ψ(x, y) is a feature representation of the input and output.
3) Compute the loss between the predicted value and the actual value according to a preset loss function.
To quantify prediction accuracy, a loss function Δ: Y × Y → R needs to be designed. The closer the predicted value is to the actual value, the smaller the loss; when the predicted value and the actual value differ greatly, the loss becomes larger. The total loss can be defined as shown in (1-3):

R^Δ_P(f) = ∫_{X×Y} Δ(y, f(x)) dP(x, y)    (1-3)

where P is the probability distribution of the data. Since P is unknown, the expected risk is replaced by the empirical risk computed on the training sample data. The performance of the discriminant function f(x; ω) can be measured by the loss function; different f(x; ω) correspond to different loss values, and in the training process of this algorithm, the smaller the empirical loss, the better.
4) Compute the parameter vector ω.
Compute the parameter vector ω so that the empirical risk on the training sample data is zero, under the condition shown in (1-4):

∀i: max_{y ∈ Y\{y_i}} ⟨w, Ψ(x_i, y)⟩ < ⟨w, Ψ(x_i, y_i)⟩    (1-4)

The n nonlinear conditions above can be expanded into n·|Y| − n linear conditions (1-5):

∀i, ∀y ∈ Y\{y_i}: ⟨w, δΨ_i(y)⟩ > 0    (1-5)

where δΨ_i(y) = Ψ(x_i, y_i) − Ψ(x_i, y).
5) Solve for the unique parameter vector ω.
Under the above constraints, there may be multiple solutions for the parameter vector ω. To obtain a unique parameter vector ω, the problem is next converted into an SVM optimization problem by the maximum-margin principle of SVM, as shown in (1-6):

min_w (1/2)‖w‖²  s.t. ∀i, ∀y ∈ Y\{y_i}: ⟨w, δΨ_i(y)⟩ ≥ 1    (1-6)

At this point, the optimization problem can be solved to obtain the unique parameter vector ω; the discriminant function f(x; ω) is then solved, and the musical time sequence is finally output.
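Once Ψ(x, y) is fixed, equations (1-1) and (1-2) reduce to an argmax over candidate outputs. The sketch below uses a deliberately simple, invented joint feature map (pairwise products of input and candidate features) only to show the mechanics; the patent does not specify Ψ:

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def joint_features(x, y):
    # Stand-in for the structured feature function Psi(x, y):
    # all pairwise products of input features and candidate-output features.
    return [xi * yj for xi in x for yj in y]

def predict(w, x, candidates):
    # f(x; w) = argmax over y of <w, Psi(x, y)>   -- equations (1-1)/(1-2)
    return max(candidates, key=lambda y: dot(w, joint_features(x, y)))

# With w rewarding the first product, the first candidate wins for x = (1, 0):
best = predict([1, 0, 0, 0], [1, 0], [[1, 0], [0, 1]])
```

In practice the candidate set Y is structured (musical time sequences), so the argmax is computed with a decoding procedure rather than exhaustive enumeration.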
Preferably, in order to verify the performance of the trained music model, the method can also include:
dividing the acquired MIDI music data set into a first data set and a second data set.
In this preferred embodiment, the idea of cross validation may be used when training the music model: the acquired MIDI music data set is divided into the first data set and the second data set according to a suitable ratio, such as 7:3.
The first data set is used to train the music model, and the second data set is used to test the performance of the trained music model. If the test accuracy is high, the performance of the trained music model is good; if the test accuracy is low, the performance of the trained music model is poor.
Further, if the total quantity of the first and second data sets is still large, using the entire first data set to participate in training would make finding the optimal parameter vector ω of the music model costly. Thus, after the acquired MIDI music data set is divided into the first data set, the method can also include: randomly selecting a first preset quantity of scores from the first data set to participate in the training of the music model.
In this preferred embodiment, in order to increase the randomness of the first data set participating in training, a random number generation algorithm may be used for the random selection.
In this preferred embodiment, the first preset quantity can be a preset fixed value, for example 40, i.e., 40 MIDI scores are randomly selected from the first data set to participate in the training of the music model. The first preset quantity can also be a preset ratio, for example 1/10, i.e., a sample amounting to 1/10 of the first data set is randomly selected to participate in the training of the music model.
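The 7:3 split plus random subsampling of the training half can be sketched as follows. This is a minimal illustration; the function name, the fixed seed, and the cap when the training set holds fewer scores than the preset quantity are assumptions:

```python
import random

def split_and_sample(scores, ratio=0.7, first_preset=40, seed=0):
    # Divide the data set 7:3 into a first (training) and second (test) set,
    # then randomly pick the first preset quantity of scores for training.
    rng = random.Random(seed)
    shuffled = list(scores)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * ratio)
    first, second = shuffled[:cut], shuffled[cut:]
    sample = rng.sample(first, min(first_preset, len(first)))
    return first, second, sample

first, second, sample = split_and_sample(range(100))
# len(first) == 70, len(second) == 30, len(sample) == 40
```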
Further, in order to verify the performance of the trained structured-SVM music model, the method can also include:
randomly selecting one MIDI score from the second data set;
extracting the feature vector of the selected MIDI score over a preset time period;
inputting the feature vector over the preset time period into the trained structured-SVM music model and outputting a corresponding MIDI score;
verifying the performance of the trained structured-SVM music model according to the selected MIDI score and the output MIDI score.
Verifying the performance of the trained structured-SVM music model according to the selected MIDI score and the output MIDI score specifically includes:
extracting the first waveform of the selected MIDI score;
extracting the second waveform of the output MIDI score;
computing the similarity between the first waveform and the second waveform;
judging whether the similarity exceeds a preset similarity threshold;
if the similarity is greater than or equal to the preset similarity threshold, determining that the performance of the trained structured-SVM music model is good;
if the similarity is less than the preset similarity threshold, determining that the performance of the trained structured-SVM music model is poor.
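One way to realize the similarity check. The patent does not fix a metric, so cosine similarity over equally long waveform sample sequences is an assumption here, as are the function names and the 0.9 threshold:

```python
import math

def cosine_similarity(a, b):
    # Similarity of two equally long waveform sample sequences, in [-1, 1].
    num = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return num / (na * nb) if na and nb else 0.0

def verify_model(first_wave, second_wave, threshold=0.9):
    # "good" when the similarity reaches the preset similarity threshold.
    return "good" if cosine_similarity(first_wave, second_wave) >= threshold else "poor"

same = verify_model([0.1, 0.5, -0.2], [0.1, 0.5, -0.2])   # identical waveforms
diff = verify_model([1.0, 0.0], [0.0, 1.0])               # orthogonal waveforms
```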
It is described in addition to the pitch and sequential of extraction MIDI music score are as feature vector in an alternative embodiment
Method can also include:The analysis that music element is carried out to the MIDI music of input obtains the MIDI music and is based on music member
The statistic analysis result of element;The statistic analysis result is input in structuring support vector machines and is trained.
The analysis of the music element includes:Structural analysis, track analysis, tone color analysis, rhythm analysis or speed point
Analysis.In MIDI music, chord is obtained into line, so as to carry out the structure point of MIDI music by the methods of chord identification
Analysis.Wherein, the structural analysis includes:Period analysis, phrase analysis, chord analysis, trifle analysis, note analysis.Period by
It is multiple to sound melodious, smooth, meet music chord and carries out theoretical trifle composition;Trifle is made of note.In MIDI files,
Including some channel events.Can be obtained from the event of channel such as pitch information, when value information, timbre information, dynamics information,
Expression information, Pitchbend Wheel or modulation wheel information, breath controller information, volume controller information, sound field controller information etc.,
Track analysis and tone color analysis can be carried out from these information.Wherein, the track, which is analyzed, includes:The analysis of drum rail, background rail
Analysis, the analysis of accompaniment rail, the analysis of melody rail.One on rhythm can be substantially obtained from the note distribution and volume distribution of drum rail
A little information carry out rhythm analysis.Further include some additional events, such as lyrics, label, track name, tone mark, bat in MIDI files
Number, velocity amplitude etc. can obtain the information such as such as speed and tune from the information of these events, to carry out velocity analysis.
The structure of MIDI music, track, tone color, rhythm or speed are analyzed according to the method described above, the MIDI can be obtained
Statistic analysis result of the music based on the music element.
The analysis of music elements performed on the input MIDI music, and the resulting statistical analysis result based on the music elements, are not elaborated further here.
The music model training method of the present invention applies artificial intelligence to training a music model, using a structured SVM for the training; the trained music model can significantly improve the feature extraction capability for MIDI music scores.
Embodiment two
Fig. 2 is a flowchart of the musical composition method provided by Embodiment 2 of the present invention. Depending on different requirements, the execution order in the flowchart can change, and certain steps can be omitted.
S21: Acquire a MIDI music score created by a user that includes several MIDI notes, as the MIDI music score to be created.
The user can play several notes arbitrarily on a piano and stop after playing them. The notes are then collected and input into the pre-trained music model, which can autonomously produce a complete piece of music.
S22: Extract the pitch sequence of the MIDI music score to be created as a third feature vector.
S23: Extract the time series of the MIDI music score to be created as a fourth feature vector.
S24: Connect the third feature vector and the fourth feature vector to obtain the feature vector of the MIDI music score.
If the sequential-connection method was used when training the music model, the third feature vector and the fourth feature vector are likewise connected sequentially.
If the interleaved-connection method was used when training the music model, the third feature vector and the fourth feature vector are likewise connected in an interleaved manner.
S25: Input the feature vector into the pre-trained music model for learning.
Because the trained music model based on the structured SVM has the function of memorizing MIDI music scores, when the feature vector of a MIDI music score containing several MIDI notes is input into the pre-trained music model, the model can automatically output the corresponding MIDI music score.
S26: Output the corresponding MIDI music score.
The musical composition method of the present invention uses a pre-trained music model and can substantially reduce the cost of creating MIDI music: it saves the expense of a large number of band performers, shortens the working time in the recording studio, and improves work efficiency.
The above are only specific implementations of the present invention, but the scope of protection of the present invention is not limited thereto. For those skilled in the art, improvements can also be made without departing from the concept of the present invention, and these all belong to the scope of protection of the present invention.
With reference to Figs. 3 to 5, the function modules and hardware configuration of a terminal realizing the above music model training method and musical composition method are introduced below.
Embodiment three
Fig. 3 is the functional block diagram of a preferred embodiment of the music model training apparatus of the present invention.
In some embodiments, the music model training apparatus 30 runs in a terminal. The music model training apparatus 30 may include multiple function modules made up of program code segments. The program code of each segment in the music model training apparatus 30 can be stored in a memory and executed by at least one processor, to perform the training of the music model (refer to Fig. 1 and its associated description).
In the present embodiment, the functions of the music model training apparatus 30 performed by the terminal can be divided into multiple function modules. The function modules may include: an acquisition module 301, an extraction module 302, a training module 303, and a verification module 304. A module in the present invention refers to a series of computer program segments that can be executed by at least one processor and can complete a fixed function, and that are stored in a memory. In some embodiments, the functions of each module will be described in detail in subsequent embodiments.
The acquisition module 301 is used for obtaining a MIDI music data set, the MIDI music data set including multiple MIDI music scores.
MIDI (Musical Instrument Digital Interface) is the most widely used music standard format in the music industry and can be described as "a music score that computers can understand"; it records music with digital control signals for notes. What MIDI transmits is not an audio signal but instructions such as notes and control parameters, which tell MIDI equipment what to do and how to do it, such as which note to play, at what volume, with what timbre, and with what accompaniment. That is, MIDI data includes information such as the channel, time, pitch, dynamics, volume, and reverberation that an instrument sends to a sound-producing device such as a MIDI synthesizer.
In the present embodiment, the acquisition of the MIDI music data set may include:
1) Collecting from the NineS data set, which includes nine difficulty grades, each grade having the data of 50 MIDI music scores.
2) Collecting from the FourS data set, which is composed of 400 MIDI piano score files divided into four difficulty grades, each grade having the data of 100 MIDI music scores.
The extraction module 302 is used for extracting the feature vector of each MIDI music score.
In the present embodiment, the extraction module 302 extracting the feature vector of each MIDI music score includes:
1) Extracting the pitch sequence of the MIDI music score as a first feature vector.
A piano has black keys and white keys, 88 keys in total, each key representing a different pitch; thus an 88-dimensional vector can be used to represent pitch.
A first identifier can be preset to mark the pitch of a key when it is pressed, and a second identifier can be preset to mark the pitch of a key when it is not pressed. The first identifier can be 1 and the second identifier can be 0; the pitch sequence of the MIDI can then be marked using this 0-1 labelling method, recording the pitch sequence at each moment a key is pressed. For example, when the 1st key of the piano is pressed at a first moment its pitch is 1 and the pitches of the remaining 87 keys at the first moment are 0, so the pitch sequence at the first moment is denoted P1 = (1, 0, 0, 0, …, 0, 0); when the 5th key is pressed at a second moment its pitch is 1 and the pitches of the remaining 87 keys at the second moment are 0, so the pitch sequence at the second moment is denoted P5 = (0, 0, 0, 0, 1, …, 0, 0).
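The 0-1 labelling above can be sketched as follows; a minimal illustration, with the helper name and zero-based key indexing being my own choices rather than the patent's:

```python
def pitch_vector(pressed_keys, n_keys=88):
    """Return the 0-1 pitch vector for one moment: 1 marks a pressed key, 0 an unpressed one."""
    return [1 if i in pressed_keys else 0 for i in range(n_keys)]

# 1st key pressed at the first moment -> P1 = (1, 0, 0, ..., 0)
P1 = pitch_vector({0})
# 5th key pressed at the second moment -> P5 = (0, 0, 0, 0, 1, ..., 0)
P5 = pitch_vector({4})
```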
2) Extracting the time series of the MIDI music score as a second feature vector.
The time series refers to the time a key remains pressed after being pressed at a certain moment, and can be denoted T. For example, if the 3rd key is held down for 2 seconds at some moment, the time series of the 3rd key is 2.
3) Connecting the first feature vector and the second feature vector to obtain the feature vector of the MIDI music score.
The first feature vector and the second feature vector can be connected sequentially; for example, the feature vector of the obtained MIDI music score is denoted (P1, …, Pn, T1, …, Tn).
The first feature vector and the second feature vector can also be interleaved; for example, the feature vector of the obtained MIDI music score is denoted (P1, T1, P2, T2, …, Pn, Tn).
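The two connection schemes can be sketched as follows, a minimal illustration under the assumption that both feature sequences have equal length (the function names are my own):

```python
def connect_sequential(pitch_seq, time_seq):
    """Sequential connection: (P1, ..., Pn, T1, ..., Tn)."""
    return list(pitch_seq) + list(time_seq)

def connect_interleaved(pitch_seq, time_seq):
    """Interleaved connection: (P1, T1, P2, T2, ..., Pn, Tn)."""
    out = []
    for p, t in zip(pitch_seq, time_seq):
        out.extend([p, t])
    return out
```

Whichever scheme is used in training must also be used when features are extracted for composition.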
Preferably, in order to avoid the time series of the keys becoming too large, extracting the time series of the MIDI music score as the second feature vector can also include:
solving for the greatest common divisor of all the time series, as a unit time;
calculating the multiple of the unit time that each time series represents, and using that multiple as the time series corresponding to the key.
For example, the greatest common divisor of all the time series is taken as the unit duration. If the duration a key is held is K times the unit duration, the time series of this key is recorded as K; during playback, the pitch of the key is repeated K times to represent the duration of this pitch.
In other embodiments, the unit time can also be a preset arbitrary number, for example, 3 seconds.
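The greatest-common-divisor quantization can be sketched as follows, assuming durations are expressed as integer multiples of some tick (the function name is my own):

```python
from functools import reduce
from math import gcd

def quantize_durations(durations):
    """Take the GCD of all durations as the unit time; express each duration as a multiple K of it."""
    unit = reduce(gcd, durations)
    return unit, [d // unit for d in durations]

# Durations 2, 4 and 6 share a unit time of 2, giving multiples 1, 2 and 3.
unit, multiples = quantize_durations([2, 4, 6])
```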
The training module 303 is used for inputting the feature vector into a structured support vector machine for training to obtain a music model.
Unlike a traditional support vector machine (SVM), a structured support vector machine can construct a suitable structured feature function Ψ(x, y) according to the structural characteristics inside the music data, so that complex structured music data can be processed effectively. The goal of the algorithm is to find a discriminant function f(x; ω) for prediction; after the discriminant function is determined, given a music data input value x, the data value y that maximizes the discriminant function f(x; ω) is selected as the output.
The specific implementation process is as follows:
1) Input the feature vector X = {Mi | i = 1, 2, …, N}; at each time point there are two data vectors: pitch and duration.
2) Construct the discriminant function f(x; ω); the data value y that maximizes the discriminant function f(x; ω) is output as the predicted value, as shown in formula (1-1):
f(x; ω) = argmax over y in Y of F(x, y; ω)   (1-1)
where ω is the parameter vector. Assume that F has a linear relationship with the combined features of the input and output, as shown in formula (1-2):
F(x, y; ω) = ⟨ω, Ψ(x, y)⟩   (1-2)
where the structured feature function Ψ(x, y) is a joint feature representation of the input and output.
3) Compare the predicted value with the actual value according to a preset loss function.
To quantify prediction accuracy, a loss function Δ: Y × Y → R needs to be designed: the closer the predicted value is to the actual value, the smaller the loss function, and when the predicted value and actual value differ greatly, the loss function becomes larger. The total loss function can be defined as shown in formula (1-3):
R(f) = ∫ Δ(y, f(x)) dP(x, y)   (1-3)
where P is the probability distribution of the data; since P is unknown, it is replaced by the empirical risk computed on the training sample data. The performance of the discriminant function f(x; ω) can be measured by the loss function, and different f(x; ω) correspond to different loss values; during training, the smaller the empirical loss, the better.
4) Compute the parameter vector ω.
The parameter vector ω is computed so that the empirical risk on the training sample data is zero, under the conditions shown in formula (1-4):
for all i and all y in Y\{yi}:  ⟨ω, δΨi(y)⟩ > 0   (1-4)
The n nonlinear conditions above are expanded into n|Y| − n linear conditions (1-5),
where δΨi(y) = Ψ(xi, yi) − Ψ(xi, y).
5) Solve for the unique parameter vector ω.
Under the above constraints, the solution for the parameter vector ω may not be unique. To obtain a unique parameter vector ω, the problem is next converted, by the maximum-margin principle of SVM, into the SVM optimization problem shown in formula (1-6):
minimize ½‖ω‖²  subject to  ⟨ω, δΨi(y)⟩ ≥ 1 for all i and all y in Y\{yi}   (1-6)
At this point the optimization problem can be solved to obtain the unique parameter vector ω, the discriminant function f(x; ω) can be solved, and finally the musical time sequence is output.
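A minimal sketch of the prediction rule of formula (1-1), selecting the candidate output y that maximizes ⟨ω, Ψ(x, y)⟩ over a finite candidate set; the toy joint feature map Ψ (a flattened outer product) and all names here are illustrative assumptions, not the patent's actual construction:

```python
def psi(x, y):
    """Toy joint feature map Psi(x, y): outer product of input and output vectors, flattened."""
    return [xi * yj for xi in x for yj in y]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def predict(x, candidates, w):
    """f(x; w) = argmax over candidate outputs y of <w, Psi(x, y)>."""
    return max(candidates, key=lambda y: dot(w, psi(x, y)))
```

In a full structured SVM, the argmax would be taken over a combinatorial output space with a dedicated inference procedure rather than an explicit list of candidates.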
Preferably, in order to verify the performance of the trained music model, the acquisition module 301 is further used for:
dividing the acquired MIDI music data set into a first data set and a second data set.
In this preferred embodiment, the idea of cross validation may be adopted when training the music model: the acquired MIDI music data set is divided into the first data set and the second data set according to a suitable ratio, for example 7:3.
The first data set is used for training the music model, and the second data set is used for testing the performance of the trained music model. If the accuracy of the test is higher, the performance of the trained music model is better; if the accuracy of the test is lower, the performance of the trained music model is worse.
Further, if the total quantity of the divided first data set and second data set is still large, using the entire first data set to participate in training the music model would make finding the optimal parameter vector ω of the music model costly. Thus, after the acquired MIDI music data set is divided into the first data set, the acquisition module 301 is further used for: randomly selecting a first preset quantity of data from the generated first data set to participate in the training of the music model.
In this preferred embodiment, in order to increase the randomness of the first data set participating in training, a random number generation algorithm may be used for the random selection.
In this preferred embodiment, the first preset quantity can be a preset fixed value, for example 40, i.e., 40 MIDI music scores are picked at random from the generated first data set to participate in the training of the music model. The first preset quantity can also be a preset ratio value, for example 1/10, i.e., a sample of 1/10 of the generated first data set is selected at random to participate in the training of the music model.
Further, in order to verify the performance of the trained music model based on the structured SVM, the music model training apparatus 30 can also include a verification module 304, used for:
randomly selecting one MIDI music score from the second data set;
extracting the feature vector of the music score within a preset time period of the selected MIDI music score;
inputting the feature vector of the music score within the preset time period into the trained music model based on the structured SVM, and outputting the corresponding MIDI music score;
verifying the performance of the trained music model based on the structured SVM according to the selected MIDI music score and the output MIDI music score.
The verification module 304 verifying the performance of the trained music model based on the structured SVM according to the selected MIDI music score and the output MIDI music score specifically includes:
extracting a first waveform of the selected MIDI music score;
extracting a second waveform of the output MIDI music score;
calculating the similarity of the first waveform and the second waveform;
judging whether the similarity exceeds a preset similarity threshold;
if the similarity is greater than or equal to the preset similarity threshold, determining that the performance of the trained music model based on the structured SVM is good;
if the similarity is less than the preset similarity threshold, determining that the performance of the trained music model based on the structured SVM is poor.
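One simple way to realize the waveform-similarity check above is cosine similarity over sampled waveforms; the patent does not specify the similarity measure or threshold, so the measure, names, and 0.9 default below are illustrative assumptions:

```python
from math import sqrt

def cosine_similarity(w1, w2):
    """Cosine similarity of two equal-length waveform sample sequences."""
    num = sum(a * b for a, b in zip(w1, w2))
    n1 = sqrt(sum(a * a for a in w1))
    n2 = sqrt(sum(b * b for b in w2))
    return num / (n1 * n2)

def model_performance_good(first_waveform, second_waveform, threshold=0.9):
    """True when the similarity reaches the preset threshold, i.e. the trained model performs well."""
    return cosine_similarity(first_waveform, second_waveform) >= threshold
```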
In an alternative embodiment, in addition to extracting the pitch and timing of the MIDI music score as feature vectors, the music model training apparatus 30 can also: perform music-element analysis on the input MIDI music to obtain a statistical analysis result of the MIDI music based on the music elements; and input the statistical analysis result into the structured support vector machine for training.
The music-element analysis includes: structural analysis, track analysis, timbre analysis, rhythm analysis, or tempo analysis. In MIDI music, chords can be obtained by methods such as chord recognition, so that structural analysis of the MIDI music can be carried out. The structural analysis includes: period analysis, phrase analysis, chord analysis, measure analysis, and note analysis. A period is composed of several measures that sound melodious and smooth and conform to music chord theory; a measure is composed of notes. A MIDI file contains channel events, from which information such as pitch, duration, timbre, dynamics, expression, pitch-bend or modulation wheel, breath controller, volume controller, and pan controller information can be obtained, and track analysis and timbre analysis can be performed from this information. The track analysis includes: drum-track analysis, background-track analysis, accompaniment-track analysis, and melody-track analysis. Some rhythm information can be roughly obtained from the note distribution and volume distribution of the drum track, so as to perform rhythm analysis. A MIDI file also contains additional events, such as lyrics, markers, track names, key signatures, time signatures, and tempo values, from which information such as tempo and key can be obtained, so as to perform tempo analysis.
By analyzing the structure, tracks, timbre, rhythm, or tempo of the MIDI music as described above, the statistical analysis result of the MIDI music based on the music elements can be obtained; this is not elaborated further here.
The music model training apparatus of the present invention applies artificial intelligence to training a music model, using a structured SVM for the training; the trained music model can significantly improve the feature extraction capability for MIDI music scores.
Embodiment four
Fig. 4 is the functional block diagram of a preferred embodiment of the musical composition device of the present invention.
In some embodiments, the musical composition device 40 runs in a terminal. The musical composition device 40 may include multiple function modules made up of program code segments. The program code of each segment in the musical composition device 40 can be stored in a memory and executed by at least one processor, to perform the musical composition (refer to Fig. 2 and its associated description).
In the present embodiment, the functions of the musical composition device 40 performed by the terminal can be divided into multiple function modules. The function modules may include: an acquisition module 401, a first extraction module 402, a second extraction module 403, a connection module 404, a learning module 405, and an output module 406. A module in the present invention refers to a series of computer program segments that can be executed by at least one processor and can complete a fixed function, and that are stored in a memory. In some embodiments, the functions of each module will be described in detail in subsequent embodiments.
The acquisition module 401 is used for acquiring a MIDI music score created by a user that includes several MIDI notes, as the MIDI music score to be created.
The user can play several notes arbitrarily on a piano and stop after playing them. The notes are then collected and input into the pre-trained music model, which can autonomously produce a complete piece of music.
The first extraction module 402 is used for extracting the pitch sequence of the MIDI music score to be created as a third feature vector.
The second extraction module 403 is used for extracting the time series of the MIDI music score to be created as a fourth feature vector.
The connection module 404 is used for connecting the third feature vector and the fourth feature vector to obtain the feature vector of the MIDI music score.
If the sequential-connection method was used when training the music model, the third feature vector and the fourth feature vector are likewise connected sequentially.
If the interleaved-connection method was used when training the music model, the third feature vector and the fourth feature vector are likewise connected in an interleaved manner.
The learning module 405 is used for inputting the feature vector into the pre-trained music model for learning.
Because the trained music model based on the structured SVM has the function of memorizing MIDI music scores, when the feature vector of a MIDI music score containing several MIDI notes is input into the pre-trained music model, the model can automatically output the corresponding MIDI music score.
The output module 406 is used for outputting the corresponding MIDI music score.
The musical composition device of the present invention uses a pre-trained music model and can substantially reduce the cost of creating MIDI music: it saves the expense of a large number of band performers, shortens the working time in the recording studio, and improves work efficiency.
The above integrated units realized in the form of software function modules can be stored in a computer-readable storage medium. The above software function modules are stored in a storage medium and include several instructions for causing a computer device (which can be a personal computer, a dual-screen device, a network device, or the like) or a processor to execute parts of the methods of the various embodiments of the present invention.
Embodiment five
Fig. 5 is a schematic diagram of the terminal provided by Embodiment 5 of the present invention.
The terminal 5 includes: a memory 51, at least one processor 52, a computer program 53 stored in the memory 51 and runnable on the at least one processor 52, and at least one communication bus 54.
When executing the computer program 53, the at least one processor 52 realizes the steps in the above music model training method and/or musical composition method embodiments.
Illustratively, the computer program 53 can be divided into one or more modules/units, which are stored in the memory 51 and executed by the at least one processor 52 to complete the present invention. The one or more modules/units can be a series of computer program instruction segments capable of completing specific functions, the instruction segments being used for describing the execution process of the computer program 53 in the terminal 5.
The terminal 5 can be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. Those skilled in the art will appreciate that the schematic diagram 5 is only an example of the terminal 5 and does not limit the structure of the terminal 5; the terminal can include more or fewer components than illustrated, combine certain components, or have different components. For example, the terminal 5 can also include input/output devices, network access devices, buses, and the like.
The at least one processor 52 can be a central processing unit (CPU), and can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 52 can be a microprocessor, or the processor 52 can be any conventional processor; the processor 52 is the control center of the terminal 5 and connects all parts of the entire terminal 5 using various interfaces and lines.
The memory 51 can be used for storing the computer program 53 and/or the modules/units. The processor 52 realizes the various functions of the terminal 5 by running or executing the computer programs and/or modules/units stored in the memory 51 and calling the data stored in the memory 51. The memory 51 can mainly include a program storage area and a data storage area, where the program storage area can store an operating system and the application programs needed by at least one function (such as a sound playing function, an image playing function, etc.), and the data storage area can store data created according to the use of the terminal 5 (such as audio data, phone books, etc.). In addition, the memory 51 may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, at least one disk storage device, flash device, or other volatile solid-state storage device.
If the integrated modules/units of the terminal 5 are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the present invention realizes all or part of the flow in the above embodiment methods, which can also be completed by instructing the relevant hardware through a computer program. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, can realize the steps of each of the above method embodiments. The computer program includes computer program code, which can be in source code form, object code form, an executable file, or certain intermediate forms. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content included in the computer-readable medium can be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electric carrier signals and telecommunication signals.
In the several embodiments provided by the present invention, it should be understood that the disclosed terminal and method can be realized in other ways. For example, the terminal embodiments described above are only schematic; for example, the division of the units is only a division of logical functions, and in actual implementation there may be other division manners.
In addition, each functional unit in each embodiment of the present invention can be integrated in the same processing unit, or each unit can exist alone physically, or two or more units can be integrated in the same unit. The above integrated units can be realized in the form of hardware, or in the form of hardware plus software function modules.
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the present invention can be realized in other specific forms without departing from the spirit or essential attributes of the invention. Therefore, from whatever point of view, the embodiments are to be considered illustrative and not restrictive, and the scope of the present invention is defined by the appended claims rather than the above description; it is intended that all variations falling within the meaning and scope of equivalents of the claims are included within the present invention. Any reference signs in the claims should not be construed as limiting the claims involved. In addition, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices stated in a system claim can also be realized by one unit or device through software or hardware. Words such as "first" and "second" are used to indicate names and do not represent any particular order.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention and are not limiting. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art will understand that modifications or equivalent replacements can be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the invention.
Claims (10)
1. A music model training method, characterized in that the method comprises:
obtaining a MIDI music data set, the MIDI music data set comprising multiple MIDI music scores;
extracting a feature vector of each MIDI music score;
inputting the feature vector into a structured support vector machine for training to obtain a music model, comprising: constructing a discriminant function f(x; ω), where x is the feature vector and ω is the parameter vector, and outputting the data value y that maximizes the discriminant function f(x; ω) as a predicted value; comparing the predicted value with an actual value according to a preset loss function, where the probability distribution P of the data is replaced by the empirical risk computed from the training sample data; using the optimization formula of the SVM to solve for a unique parameter vector ω such that the empirical risk on the training sample data is zero; and solving the discriminant function f(x; ω), finally outputting a musical time sequence.
2. The method as described in claim 1, characterized in that extracting the feature vector of each MIDI music score comprises:
extracting the pitch sequence of the MIDI music score as a first feature vector;
extracting the time series of the MIDI music score as a second feature vector;
connecting the first feature vector and the second feature vector to obtain the feature vector of the MIDI music score.
3. The method as claimed in claim 2, characterized in that extracting the time series of the MIDI music score as the second feature vector further comprises:
solving for the greatest common divisor of all the time series, as a unit time; or
calculating the multiple of the unit time that each time series represents, and using the multiple as the time series corresponding to the key.
4. The method as described in claim 1, characterized in that the method further comprises:
dividing the acquired MIDI music data set into a first data set and a second data set;
randomly selecting a first preset quantity of data from the first data set to participate in the training of the music model;
randomly selecting one MIDI music score from the second data set;
extracting the feature vector of the music score within a preset time period of the selected MIDI music score;
inputting the feature vector of the music score within the preset time period into the trained music model, and outputting a corresponding MIDI music score;
verifying the performance of the trained music model according to the selected MIDI music score and the output MIDI music score.
5. The method as claimed in claim 4, characterized in that verifying the performance of the trained music model comprises:
extracting a first waveform of the selected MIDI music score;
extracting a second waveform of the output MIDI music score;
calculating the similarity of the first waveform and the second waveform;
judging whether the similarity exceeds a preset similarity threshold;
if the similarity is greater than or equal to the preset similarity threshold, determining that the performance of the trained music model is good;
if the similarity is less than the preset similarity threshold, determining that the performance of the trained music model is poor.
6. A musical composition method, characterized in that the method comprises:
acquiring a MIDI music score created by a user and comprising several MIDI notes as the MIDI music score to be created;
extracting the pitch sequence of the MIDI music score to be created as a third feature vector;
extracting the time series of the MIDI music score to be created as a fourth feature vector;
concatenating the third feature vector and the fourth feature vector to obtain the feature vector of the MIDI music score;
inputting the feature vector into a pre-trained music model for learning, wherein the music model is trained using the method according to any one of claims 1 to 5;
outputting a corresponding MIDI music score.
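The feature-extraction steps of claim 6 can be sketched as below, assuming the parsed score is a list of (pitch, duration) pairs — a simplified stand-in for real MIDI events, not the patent's actual representation:

```python
def score_feature_vector(notes):
    """Claim 6 sketch: extract the pitch sequence (third feature vector)
    and the time series (fourth feature vector), then concatenate them
    into the feature vector of the MIDI music score."""
    pitches = [pitch for pitch, _ in notes]      # third feature vector
    durations = [dur for _, dur in notes]        # fourth feature vector
    return pitches + durations                   # concatenation
```

For two notes (60, 480) and (62, 240), the resulting feature vector is [60, 62, 480, 240].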
7. A music model training apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a MIDI music data set, the MIDI music data set comprising multiple MIDI music scores;
an extraction module, configured to extract the feature vector of each MIDI music score;
a training module, configured to input the feature vector into a structured support vector machine for training to obtain the music model, including: constructing a discriminant function f(x; w), where x is the feature vector and w is the parameter vector, and outputting as the predicted value the data value that maximizes the discriminant function f(x; w); evaluating the predicted value against the actual value according to a preset loss function R(f) = ∫ Δ(y, f(x)) dP(x, y), where P is the probability distribution of the data, replacing it with the empirical risk R_emp(f) = (1/n) Σᵢ Δ(yᵢ, f(xᵢ)) computed on the training sample data; using the SVM optimization formula min_ω ½‖ω‖² to solve for the unique parameter vector ω such that the empirical risk R_emp on the training sample data is zero; and solving the discriminant function f(x; ω) to finally output the musical time sequence.
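The structured-SVM training loop of claim 7 can be illustrated with its simplest special case, a multiclass (Crammer-Singer-style) linear model trained by subgradient descent on the margin-rescaled hinge objective. The joint feature map, 0/1 loss, and hyperparameters below are assumptions for illustration, not the patent's actual implementation:

```python
def joint_features(x, y, n_classes):
    """Joint feature map phi(x, y): x copied into the weight block for output y."""
    phi = [0.0] * (len(x) * n_classes)
    for i, v in enumerate(x):
        phi[y * len(x) + i] = float(v)
    return phi

def discriminant(w, x, y, n_classes):
    """f(x, y; w) = <w, phi(x, y)> -- the linear discriminant function."""
    return sum(wi * pi for wi, pi in zip(w, joint_features(x, y, n_classes)))

def predict(w, x, n_classes):
    """Output the value that maximizes the discriminant function."""
    return max(range(n_classes), key=lambda y: discriminant(w, x, y, n_classes))

def train(samples, n_classes, dim, epochs=50, lr=0.1, lam=0.01):
    """Subgradient descent on lam/2*||w||^2 + mean over i of
    max_y [Delta(y_i, y) + f(x_i, y) - f(x_i, y_i)], driving the
    empirical risk on the training samples toward zero."""
    w = [0.0] * (dim * n_classes)
    for _ in range(epochs):
        for x, y_true in samples:
            # loss-augmented inference with a 0/1 loss Delta
            y_hat = max(range(n_classes),
                        key=lambda y: (0.0 if y == y_true else 1.0)
                        + discriminant(w, x, y, n_classes))
            phi_true = joint_features(x, y_true, n_classes)
            phi_hat = joint_features(x, y_hat, n_classes)
            # subgradient step: regularizer plus margin-violation direction
            w = [wi - lr * (lam * wi + ph - pt)
                 for wi, pt, ph in zip(w, phi_true, phi_hat)]
    return w
```

On linearly separable data the learned w predicts every training label correctly, which is the "empirical risk is zero" condition the claim describes.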
8. A musical composition apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a MIDI music score created by a user and comprising several MIDI notes as the MIDI music score to be created;
a first extraction module, configured to extract the pitch sequence of the MIDI music score to be created as a third feature vector;
a second extraction module, configured to extract the time series of the MIDI music score to be created as a fourth feature vector;
a concatenation module, configured to concatenate the third feature vector and the fourth feature vector to obtain the feature vector of the MIDI music score;
a learning module, configured to input the feature vector into a pre-trained music model for learning, wherein the music model is trained using the apparatus according to claim 7;
an output module, configured to output a corresponding MIDI music score.
9. A terminal, characterized in that the terminal comprises a processor and a memory, the processor being configured to implement, when executing a computer program stored in the memory, the music model training method according to any one of claims 1 to 5 and/or the musical composition method according to claim 6.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the music model training method according to any one of claims 1 to 5 and/or the musical composition method according to claim 6.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810570846.7A CN108806657A (en) | 2018-06-05 | 2018-06-05 | Music model training, musical composition method, apparatus, terminal and storage medium |
PCT/CN2018/100333 WO2019232928A1 (en) | 2018-06-05 | 2018-08-14 | Musical model training method, music creation method, devices, terminal and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810570846.7A CN108806657A (en) | 2018-06-05 | 2018-06-05 | Music model training, musical composition method, apparatus, terminal and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108806657A true CN108806657A (en) | 2018-11-13 |
Family
ID=64088744
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810570846.7A Pending CN108806657A (en) | 2018-06-05 | 2018-06-05 | Music model training, musical composition method, apparatus, terminal and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108806657A (en) |
WO (1) | WO2019232928A1 (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109671416A (en) * | 2018-12-24 | 2019-04-23 | 成都嗨翻屋科技有限公司 | Music rhythm generation method, device and user terminal based on enhancing study |
CN109784006A (en) * | 2019-01-04 | 2019-05-21 | 平安科技(深圳)有限公司 | Watermark insertion and extracting method and terminal device |
CN109771944A (en) * | 2018-12-19 | 2019-05-21 | 武汉西山艺创文化有限公司 | A kind of sound effect of game generation method, device, equipment and storage medium |
CN110264984A (en) * | 2019-05-13 | 2019-09-20 | 北京奇艺世纪科技有限公司 | Model training method, music generating method, device and electronic equipment |
CN111539576A (en) * | 2020-04-29 | 2020-08-14 | 支付宝(杭州)信息技术有限公司 | Risk identification model optimization method and device |
CN111627410A (en) * | 2020-05-12 | 2020-09-04 | 浙江大学 | MIDI multi-track sequence representation method and application |
CN111968452A (en) * | 2020-08-21 | 2020-11-20 | 江苏师范大学 | Harmony learning method and device and electronic equipment |
CN112669796A (en) * | 2020-12-29 | 2021-04-16 | 西交利物浦大学 | Method and device for converting music into music book based on artificial intelligence |
CN113012665A (en) * | 2021-02-19 | 2021-06-22 | 腾讯音乐娱乐科技(深圳)有限公司 | Music generation method and training method of music generation model |
CN113053336A (en) * | 2021-03-17 | 2021-06-29 | 平安科技(深圳)有限公司 | Method, device and equipment for generating musical composition and storage medium |
CN113272890A (en) * | 2019-01-07 | 2021-08-17 | 雅马哈株式会社 | Image control system and image control method |
CN116030777A (en) * | 2023-03-13 | 2023-04-28 | 南京邮电大学 | Specific emotion music generation method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101702316A (en) * | 2009-11-20 | 2010-05-05 | 北京中星微电子有限公司 | Method for converting MIDI music into color information and system thereof |
CN103186527A (en) * | 2011-12-27 | 2013-07-03 | 北京百度网讯科技有限公司 | System for building music classification model, system for recommending music and corresponding method |
CN106847248A (en) * | 2017-01-05 | 2017-06-13 | 天津大学 | Chord recognition methods based on robustness scale contour feature and vector machine |
CN107123415A (en) * | 2017-05-04 | 2017-09-01 | 吴振国 | A kind of automatic music method and system |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104978974B (en) * | 2014-10-22 | 2019-10-08 | 广州酷狗计算机科技有限公司 | A kind of audio-frequency processing method and device |
CN107909090A (en) * | 2017-10-11 | 2018-04-13 | 天津大学 | Learn semi-supervised music-book on pianoforte difficulty recognition methods based on estimating |
2018
- 2018-06-05 CN CN201810570846.7A patent/CN108806657A/en active Pending
- 2018-08-14 WO PCT/CN2018/100333 patent/WO2019232928A1/en active Application Filing
Non-Patent Citations (2)
Title |
---|
Yu Chunyan: "Video Semantic Context Tag Tree and Its Structured Analysis", Journal of Graphics *
Yan Zhiyong: "Chord Recognition Based on SVM and Enhanced PCP Features", Computer Engineering *
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109771944A (en) * | 2018-12-19 | 2019-05-21 | 武汉西山艺创文化有限公司 | A kind of sound effect of game generation method, device, equipment and storage medium |
CN109771944B (en) * | 2018-12-19 | 2022-07-12 | 武汉西山艺创文化有限公司 | Game sound effect generation method, device, equipment and storage medium |
CN109671416A (en) * | 2018-12-24 | 2019-04-23 | 成都嗨翻屋科技有限公司 | Music rhythm generation method, device and user terminal based on enhancing study |
CN109784006A (en) * | 2019-01-04 | 2019-05-21 | 平安科技(深圳)有限公司 | Watermark insertion and extracting method and terminal device |
CN113272890A (en) * | 2019-01-07 | 2021-08-17 | 雅马哈株式会社 | Image control system and image control method |
CN113272890B (en) * | 2019-01-07 | 2024-06-18 | 雅马哈株式会社 | Image control system and image control method |
US20210335331A1 (en) * | 2019-01-07 | 2021-10-28 | Yamaha Corporation | Image control system and method for controlling image |
CN110264984A (en) * | 2019-05-13 | 2019-09-20 | 北京奇艺世纪科技有限公司 | Model training method, music generating method, device and electronic equipment |
CN110264984B (en) * | 2019-05-13 | 2021-07-06 | 北京奇艺世纪科技有限公司 | Model training method, music generation method and device and electronic equipment |
CN111539576A (en) * | 2020-04-29 | 2020-08-14 | 支付宝(杭州)信息技术有限公司 | Risk identification model optimization method and device |
CN111539576B (en) * | 2020-04-29 | 2022-04-22 | 支付宝(杭州)信息技术有限公司 | Risk identification model optimization method and device |
CN111627410A (en) * | 2020-05-12 | 2020-09-04 | 浙江大学 | MIDI multi-track sequence representation method and application |
CN111627410B (en) * | 2020-05-12 | 2022-08-09 | 浙江大学 | MIDI multi-track sequence representation method and application |
CN111968452A (en) * | 2020-08-21 | 2020-11-20 | 江苏师范大学 | Harmony learning method and device and electronic equipment |
CN112669796A (en) * | 2020-12-29 | 2021-04-16 | 西交利物浦大学 | Method and device for converting music into music book based on artificial intelligence |
CN113012665A (en) * | 2021-02-19 | 2021-06-22 | 腾讯音乐娱乐科技(深圳)有限公司 | Music generation method and training method of music generation model |
CN113012665B (en) * | 2021-02-19 | 2024-04-19 | 腾讯音乐娱乐科技(深圳)有限公司 | Music generation method and training method of music generation model |
CN113053336A (en) * | 2021-03-17 | 2021-06-29 | 平安科技(深圳)有限公司 | Method, device and equipment for generating musical composition and storage medium |
CN113053336B (en) * | 2021-03-17 | 2024-06-25 | 平安科技(深圳)有限公司 | Musical composition generation method, device, equipment and storage medium |
CN116030777A (en) * | 2023-03-13 | 2023-04-28 | 南京邮电大学 | Specific emotion music generation method and system |
CN116030777B (en) * | 2023-03-13 | 2023-08-18 | 南京邮电大学 | Specific emotion music generation method and system |
Also Published As
Publication number | Publication date |
---|---|
WO2019232928A1 (en) | 2019-12-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108806657A (en) | Music model training, musical composition method, apparatus, terminal and storage medium | |
Herremans et al. | A functional taxonomy of music generation systems | |
Sturm et al. | Music transcription modelling and composition using deep learning | |
Rowe | Machine musicianship | |
Kim et al. | Moodswings: A collaborative game for music mood label collection. | |
Pachet et al. | Reflexive loopers for solo musical improvisation | |
CN103823867B (en) | Humming type music retrieval method and system based on note modeling | |
Lerch et al. | An interdisciplinary review of music performance analysis | |
Hung et al. | Musical composition style transfer via disentangled timbre representations | |
CN101014994A (en) | Content creating device and content creating method | |
CN109346043B (en) | Music generation method and device based on generation countermeasure network | |
CN113813609A (en) | Game music style classification method and device, readable medium and electronic equipment | |
Ramirez et al. | Automatic performer identification in commercial monophonic jazz performances | |
Mukherjee et al. | ComposeInStyle: Music composition with and without Style Transfer | |
Sturm et al. | Folk the algorithms:(Mis) Applying artificial intelligence to folk music | |
Gu et al. | Modeling Piano Interpretation Using Switching Kalman Filter. | |
CN110134823B (en) | MIDI music genre classification method based on normalized note display Markov model | |
CN110516103A (en) | Song rhythm generation method, equipment, storage medium and device based on classifier | |
Choi et al. | YM2413-MDB: A multi-instrumental FM video game music dataset with emotion annotations | |
de Mantaras | Making music with AI: Some examples | |
Fuentes | Multi-scale computational rhythm analysis: a framework for sections, downbeats, beats, and microtiming | |
Angioni et al. | A transformers-based approach for fine and coarse-grained classification and generation of MIDI songs and soundtracks | |
Cao et al. | A Review of Automatic Music Generation Based on Performance RNN | |
Mo | Designing an automatic piano accompaniment system using artificial intelligence and sound pattern database | |
Colton et al. | Neuro-Symbolic Composition of Music with Talking Points |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
RJ01 | Rejection of invention patent application after publication ||
Application publication date: 20181113 |