CN109872708A

CN109872708A - Music generation method and device based on DCGAN

Info

Publication number: CN109872708A
Application number: CN201910066130.8A
Authority: CN
Inventors: 王义文; 王健宗
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2019-01-23
Filing date: 2019-01-23
Publication date: 2019-06-11
Anticipated expiration: 2039-01-23
Also published as: CN109872708B; WO2020151150A1

Abstract

An embodiment of the present application discloses a music generation method and device based on DCGAN. The method includes: constructing a deep convolutional generative adversarial network (DCGAN) model; obtaining a training dataset; inputting the N melody matrices in the training dataset and the corresponding N chord matrices into the DCGAN model for training, to obtain a trained DCGAN model; inputting an obtained target melody matrix into the trained DCGAN model for processing, and obtaining a target chord matrix, generated by the trained DCGAN model, that matches the target melody matrix; and finally outputting the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix. The embodiment can automatically generate a music file with matching chords and reduces manual processing.

Description

Music generation method and device based on DCGAN

Technical field

This application relates to the field of computer technology, and in particular to a music generation method and device based on DCGAN.

Background art

At present, music is usually generated by giving a melody segment to a professional musician, who adds chords to the given melody to obtain a music file with matching chords. Adding chords to a melody, however, requires the arranger to have hard skills such as a strong command of music theory and composition, and soft skills such as a strong, sensitive feel for music. The quality of the generated music file is therefore necessarily limited by the skill level of the arranger.

Summary of the invention

The embodiments of the present application provide a music generation method based on DCGAN that can automatically generate a music file with matching chords and reduce manual processing.

In a first aspect, an embodiment of the present application provides a music generation method based on DCGAN, the method comprising:

constructing a deep convolutional generative adversarial network (DCGAN) model;

obtaining a training dataset, the training dataset including N melody matrices and the corresponding N chord matrices, where the melody matrices and chord matrices are binary matrices;

inputting the N melody matrices and the corresponding N chord matrices in the training dataset into the DCGAN model for training, to obtain a trained DCGAN model;

inputting an obtained target melody matrix into the trained DCGAN model for processing, obtaining a target chord matrix, generated by the trained DCGAN model, that matches the target melody matrix, and outputting the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix.

With reference to the first aspect, in some possible embodiments, the DCGAN model includes a generator, a discriminator, and an adjuster, each of which is a convolutional neural network (CNN). Inputting the N melody matrices and the corresponding N chord matrices in the training dataset into the DCGAN model for training, to obtain a trained DCGAN model, includes:

for any melody matrix i in the training dataset, inputting melody matrix i into the generator of the DCGAN model to generate a first chord matrix j that matches melody matrix i; inputting the first chord matrix j, together with the chord matrix k that corresponds to melody matrix i in the training dataset, into the discriminator of the DCGAN model to determine the probability that the first chord matrix j is identical to chord matrix k; and judging whether the probability output by the discriminator for the first chord matrix j is within a preset range. If it is not, the probability is input into the adjuster of the DCGAN model to adjust the parameters of the generator's transposed convolutional layers; melody matrix i is input again into the adjusted generator to regenerate a first chord matrix j that matches melody matrix i; and the regenerated first chord matrix j and chord matrix k are input into the discriminator of the DCGAN model to determine the probability that they are identical. When the probability output by the discriminator for every first chord matrix generated by the generator is within the preset range, the trained DCGAN model is obtained.

With reference to the first aspect, in some possible embodiments, obtaining the training dataset includes:

obtaining a dual-track dataset including multiple dual-track music files, where a dual-track music file is a music file that includes a melody track and a chord track; determining N target dual-track music files from the dual-track dataset, where the chords in each target dual-track music file belong to a preset basic chord set that includes 12 major chords and 12 minor chords, and each bar of a target dual-track music file uses one chord; obtaining the melody matrix of each of the N target dual-track music files on its melody track, to obtain N melody matrices;

obtaining the chord matrix of each of the N target dual-track music files on its chord track, to obtain N chord matrices.

With reference to the first aspect, in some possible embodiments, obtaining the melody matrix of each of the N target dual-track music files on its melody track, to obtain N melody matrices, includes:

adjusting the melody of each of the N target dual-track music files to within a preset pitch range; obtaining the melody notes of each adjusted target dual-track music file; and generating, from those melody notes, the melody matrix of each adjusted target dual-track music file. The melody matrix is an h*w binary matrix, where h denotes a preset number of notes and w denotes the number of bars of the target dual-track music file.

With reference to the first aspect, in some possible embodiments, obtaining the chord matrix of each of the N target dual-track music files on its chord track, to obtain N chord matrices, includes:

obtaining the chord used in each bar of each of the N target dual-track music files and the chord category of that chord; and generating, from the chord used in each bar and its chord category, the chord matrix of each target dual-track music file. The chord matrix is a w*m binary matrix, where w denotes the number of bars of the target dual-track music file and m denotes the chord parameters of each bar.

With reference to the first aspect, in some possible embodiments, before the obtained target melody matrix is input into the trained DCGAN model for processing, the method further includes: obtaining a single-track music file that includes a melody track; adjusting the melody of the single-track music file to within the preset pitch range; obtaining the melody notes of the adjusted single-track music file; and generating, from those melody notes, the target melody matrix of the adjusted single-track music file.

With reference to the first aspect, in some possible embodiments, the generator includes at least one fully connected layer and at least one transposed convolutional layer; the discriminator includes at least one convolutional layer and at least one fully connected layer; and the adjuster, which is a reversed generator, includes at least one convolutional layer and at least one fully connected layer.

In a second aspect, an embodiment of the present application provides a music generation device, the device comprising:

a construction module, configured to construct a deep convolutional generative adversarial network (DCGAN) model;

a first obtaining module, configured to obtain a training dataset, the training dataset including N melody matrices and the corresponding N chord matrices, where the melody matrices and chord matrices are binary matrices;

a training module, configured to input the N melody matrices and the corresponding N chord matrices in the training dataset into the DCGAN model for training, to obtain a trained DCGAN model;

an input module, configured to input an obtained target melody matrix into the trained DCGAN model for processing, and to obtain a target chord matrix, generated by the trained DCGAN model, that matches the target melody matrix;

an output module, configured to output the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix.

With reference to the second aspect, in some possible embodiments, the DCGAN model includes a generator, a discriminator, and an adjuster, each of which is a convolutional neural network (CNN). The training module is specifically configured to: for any melody matrix i in the training dataset, input melody matrix i into the generator of the DCGAN model to generate a first chord matrix j that matches melody matrix i; input the first chord matrix j, together with the chord matrix k that corresponds to melody matrix i in the training dataset, into the discriminator of the DCGAN model to determine the probability that the first chord matrix j is identical to chord matrix k; and judge whether the probability output by the discriminator for the first chord matrix j is within a preset range. If it is not, the probability is input into the adjuster of the DCGAN model to adjust the parameters of the generator's transposed convolutional layers; melody matrix i is input again into the adjusted generator to regenerate a first chord matrix j that matches melody matrix i; and the regenerated first chord matrix j and chord matrix k are input into the discriminator of the DCGAN model to determine the probability that they are identical. When the probability output by the discriminator for every first chord matrix generated by the generator is within the preset range, the trained DCGAN model is obtained.

With reference to the second aspect, in some possible embodiments, the first obtaining module includes: a first obtaining unit, configured to obtain a dual-track dataset including multiple dual-track music files, where a dual-track music file is a music file that includes a melody track and a chord track; a determination unit, configured to determine N target dual-track music files from the dual-track dataset; a second obtaining unit, configured to obtain the melody matrix of each of the N target dual-track music files on its melody track, to obtain N melody matrices; and a third obtaining unit, configured to obtain the chord matrix of each of the N target dual-track music files on its chord track, to obtain N chord matrices. The chords in each target dual-track music file belong to a preset basic chord set that includes 12 major chords and 12 minor chords, and each bar of a target dual-track music file uses one chord.

With reference to the second aspect, in some possible embodiments, the second obtaining unit is specifically configured to: adjust the melody of each of the N target dual-track music files to within a preset pitch range; obtain the melody notes of each adjusted target dual-track music file; and generate, from those melody notes, the melody matrix of each adjusted target dual-track music file. The melody matrix is an h*w binary matrix, where h denotes a preset number of notes and w denotes the number of bars of the target dual-track music file.

With reference to the second aspect, in some possible embodiments, the third obtaining unit is specifically configured to: obtain the chord used in each bar of each of the N target dual-track music files and the chord category of that chord; and generate, from the chord used in each bar and its chord category, the chord matrix of each target dual-track music file. The chord matrix is a w*m binary matrix, where w denotes the number of bars of the target dual-track music file and m denotes the chord parameters of each bar.

With reference to the second aspect, in some possible embodiments, the device further includes a second obtaining module, configured to: obtain a single-track music file that includes a melody track; adjust the melody of the single-track music file to within the preset pitch range; obtain the melody notes of the adjusted single-track music file; and generate, from those melody notes, the target melody matrix of the adjusted single-track music file.

With reference to the second aspect, in some possible embodiments, the generator includes at least one fully connected layer and at least one transposed convolutional layer; the discriminator includes at least one convolutional layer and at least one fully connected layer; and the adjuster, which is a reversed generator, includes at least one convolutional layer and at least one fully connected layer.

In a third aspect, an embodiment of the present application provides a terminal including a processor, an input device, an output device, and a memory, which are connected to one another. The memory stores a computer program that enables the terminal to execute the above method; the computer program includes program instructions, and the processor is configured to call the program instructions to execute the DCGAN-based music generation method of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program. The computer program includes program instructions that, when executed by a processor, cause the processor to execute the DCGAN-based music generation method of the first aspect.

The embodiments of the present application construct a deep convolutional generative adversarial network (DCGAN) model, obtain a training dataset, and input the N melody matrices in the training dataset and the corresponding N chord matrices into the DCGAN model for training, to obtain a trained DCGAN model. An obtained target melody matrix is then input into the trained DCGAN model for processing, and a target chord matrix that matches the target melody matrix is obtained from the trained DCGAN model. Finally, the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix is output. A music file with matching chords can thus be generated automatically, with less manual processing.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of the DCGAN-based music generation method provided by an embodiment of the present application;

Fig. 2 is a schematic diagram of the network structure of the DCGAN model provided by an embodiment of the present application;

Fig. 3 is another schematic flowchart of the DCGAN-based music generation method provided by an embodiment of the present application;

Fig. 4a is a schematic diagram of MIDI notes provided by an embodiment of the present application;

Fig. 4b is a schematic diagram of a melody matrix provided by an embodiment of the present application;

Fig. 5a is a schematic diagram of the 24 chords provided by an embodiment of the present application;

Fig. 5b is a schematic diagram of a chord matrix provided by an embodiment of the present application;

Fig. 6 is a schematic block diagram of the music generation device provided by an embodiment of the present application;

Fig. 7 is a schematic block diagram of a terminal provided by an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present application without creative effort fall within the protection scope of this application.

It should be understood that the terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.

It should also be understood that a reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. Appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. A person skilled in the art understands, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.

It should be further understood that the term "and/or" used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.

The DCGAN-based music generation method and device provided by the embodiments of the present application are described below with reference to Fig. 1 to Fig. 7.

Refer to Fig. 1, which is a schematic flowchart of the DCGAN-based music generation method provided by an embodiment of the present application. As shown in Fig. 1, the DCGAN-based music generation method may include the following steps:

S101: construct a deep convolutional generative adversarial network (DCGAN) model.

In some possible embodiments, the terminal may construct a deep convolutional generative adversarial network (Deep Convolution Generative Adversarial Networks, DCGAN) model. The DCGAN model may include a generator, a discriminator, and an adjuster, each of which is a convolutional neural network (Convolutional Neural Network, CNN). The generator may include at least one fully connected layer and at least one transposed convolutional layer; the discriminator may include at least one convolutional layer and at least one fully connected layer; and the adjuster may be a reversed generator that includes at least one convolutional layer and at least one fully connected layer. The generator can be used to generate, from a given random sequence, a piece of music that is as realistic as possible in order to fool the discriminator, and the discriminator can be used to distinguish the music generated by the generator from real music as far as possible. The generator and the discriminator thus form a dynamic "game process". The adjuster can be used to adjust the parameters of the generator's transposed convolutional layers so that the music generated by the generator fools the discriminator more effectively.

Fig. 2 shows the network structure of the DCGAN model provided by an embodiment of the present application. In the figure, Conditioner CNN denotes the adjuster in the DCGAN model, Generator CNN denotes the generator, and Discriminator CNN denotes the discriminator. Because the adjuster is essentially a reversed generator, the adjuster and the generator have convolution kernels of the same shape, and their outputs also have the same shape. The output of each convolutional layer of the adjuster is therefore fed to the corresponding transposed convolutional layer of the generator, so that the parameters of the generator's transposed convolutional layers can be adjusted, while the output of the generator serves as one of the inputs of the discriminator. Noise z denotes the random sequence input to the generator, X or G(z) denotes the output of the generator, and 2D conditions denotes real data (that is, data not generated by the generator).

S102: obtain a training dataset.

In some possible embodiments, the terminal may obtain, from a preset training database, N training samples for training the DCGAN model. Each training sample may include one melody matrix and the corresponding chord matrix. The terminal may take the N training samples as the training dataset of the DCGAN model, so that the training dataset includes N training samples, that is, N melody matrices and the corresponding N chord matrices. N may be an integer greater than or equal to 2. A melody matrix may be a 128*16 binary matrix, and a chord matrix may be a 16*13 binary matrix.

S103: input the N melody matrices and the corresponding N chord matrices in the training dataset into the DCGAN model for training, to obtain a trained DCGAN model.

In some possible embodiments, the DCGAN model includes a generator, a discriminator, and an adjuster, each of which is a CNN. The terminal may train the generator and the discriminator in the DCGAN model separately and alternately. Take one round of iteration of the training process as an example. The terminal may keep the parameters of the discriminator's convolutional layers fixed and train the generator: it inputs any melody matrix i from the training dataset into the generator of the DCGAN model to generate a first chord matrix j that matches melody matrix i. The terminal then keeps the parameters of the generator's transposed convolutional layers fixed and trains the discriminator: it inputs the first chord matrix j, together with the chord matrix k that corresponds to melody matrix i in the training dataset, into the discriminator of the DCGAN model to determine the probability that the first chord matrix j is identical to chord matrix k (that is, the similarity between the first chord matrix j and chord matrix k). The terminal judges whether the probability output by the discriminator for the first chord matrix j is within a preset range (for example, between 0.85 and 1, inclusive). If the probability is not within the preset range, the terminal may input the probability output by the discriminator into the adjuster of the DCGAN model to adjust the parameters of the generator's transposed convolutional layers. The terminal may then input melody matrix i again into the adjusted generator to regenerate a first chord matrix j that matches melody matrix i, and may input the regenerated first chord matrix j together with chord matrix k into the discriminator of the DCGAN model to determine the probability that they are identical. If the probability output by the discriminator for the first chord matrix j is within the preset range, the terminal may choose another melody matrix from the training dataset and carry out another round of iteration. Every melody matrix in the training dataset needs one round of iteration in the training process; that is, with N melody matrices in the training dataset, the training process has at least N rounds of iteration. When the probability output by the discriminator for every first chord matrix generated by the generator is within the preset range, the trained DCGAN model is obtained.
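The control flow of the iteration described above can be sketched as follows. This is only an outline under stated assumptions: `generate`, `discriminate`, and `adjust` are hypothetical placeholder callables standing in for the generator, discriminator, and adjuster networks, and the `max_adjustments` cap is an added safeguard that the text does not mention. The default range 0.85 to 1 is the example range given above.

```python
def train(dataset, generate, discriminate, adjust,
          low=0.85, high=1.0, max_adjustments=1000):
    """Run one round per (melody, chord) pair; return the number of adjustments.

    dataset      -- list of (melody_matrix, chord_matrix) pairs
    generate     -- hypothetical: melody matrix -> generated chord matrix
    discriminate -- hypothetical: (generated, real) -> probability in [0, 1]
    adjust       -- hypothetical: probability -> None, updates the generator
    """
    adjustments = 0
    for melody, real_chord in dataset:            # one round per melody matrix i
        for _ in range(max_adjustments):          # assumed cap, not in the text
            generated = generate(melody)          # first chord matrix j
            p = discriminate(generated, real_chord)
            if low <= p <= high:                  # within the preset range
                break                             # move on to the next melody
            adjust(p)                             # adjuster updates the generator
            adjustments += 1
    return adjustments
```

With real networks, `adjust` would carry out the parameter update on the generator's transposed convolutional layers; here it is left abstract on purpose.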

In some possible embodiments, the training process of the DCGAN model can be expressed by the following function 1-1:

Here, p_data in function 1-1 denotes the N chord matrices in the training dataset, and p_z denotes the N melody matrices in the training dataset. D denotes the discriminator and G denotes the generator. G(z) denotes the output of the generator, and D(x) denotes the output of the discriminator (the value of D(x) lies between 0 and 1, inclusive). D is trained to maximize log D(x), and G is trained to minimize log(1 - D(G(z))), that is, to maximize the loss of D. Training usually fixes one party first (for example, the discriminator D) and updates the parameters of the other network (for example, the generator G), alternating iteratively so that each maximizes the other's error. Finally, when G converges, the training of G and D is complete and the trained DCGAN model is obtained.
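The formula for function 1-1 is not reproduced in this text. The symbols defined above (p_data, p_z, D, G, log D(x), and log(1 - D(G(z)))) match the standard GAN minimax objective, which is given here as an assumption about the intended formula rather than as a quotation from the patent:

```latex
\min_{G}\max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```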

In some possible embodiments, feature matching is added to the learning process of the generator of the DCGAN model. Feature matching can be expressed by the following function 1-2:

Here, E in function 1-2 denotes the mean, X denotes the chord matrices in the training dataset, z denotes the melody matrices in the training dataset, and G(z) denotes the output of the generator. f denotes the first convolutional layer of the discriminator, and λ₁ and λ₂ denote the adjustment parameters of the generator. The adjustment parameters are kept within the range in which the system is not distorted.
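The formula for function 1-2 is likewise not reproduced in this text. One form that is consistent with the symbols defined above, offered only as an assumption and not as the patent's own formula, is a feature-matching regularizer that penalizes the distance between the means of real and generated data, both directly and after the discriminator's first convolutional layer f:

```latex
\lambda_{1}\,\bigl\lVert \mathbb{E}\,X - \mathbb{E}\,G(z) \bigr\rVert_{2}^{2}
  + \lambda_{2}\,\bigl\lVert \mathbb{E}\,f(X) - \mathbb{E}\,f(G(z)) \bigr\rVert_{2}^{2}
```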

S104: input an obtained target melody matrix into the trained DCGAN model for processing, obtain a target chord matrix, generated by the trained DCGAN model, that matches the target melody matrix, and output the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix.

In some possible embodiments, after obtaining the trained DCGAN model, the terminal may obtain a target melody matrix. The target melody matrix may be a binary matrix entered directly by the user, or a binary matrix generated randomly by the terminal. For example, the terminal first obtains random noise (Gaussian noise, uniform noise, and so on), processes the noise into a matrix with the same data format as the melody matrices in the training dataset, and takes the resulting matrix as the target melody matrix. After obtaining the target melody matrix, the terminal may input it into the trained DCGAN model to generate a target chord matrix that matches it. The terminal may obtain the target chord matrix generated by the trained DCGAN model, map the target melody matrix to a melody track, and map the target chord matrix to a chord track. The terminal may then merge the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix to obtain a merged music file, and may output the merged music file in musical instrument digital interface (MIDI) format. The size of the target chord matrix is the same as the size of the chord matrices in the training dataset. The merged music file contains both melody and chords: at time t, for example, the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix each sound their own content for time t simultaneously. The embodiment of the present application constructs a DCGAN model, trains it with melody matrices and chord matrices to obtain a trained DCGAN model, and then inputs a target melody matrix (which may be random noise) into the trained DCGAN model, which generates a target chord matrix that matches this target melody matrix. A music file with matching chords can thus be generated automatically, which reduces manual processing and saves labor.
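How random noise might be processed into the melody-matrix format can be sketched as below. The text does not say how the noise is processed; thresholding uniform noise is an assumption made for illustration, and the function name and the `threshold` value are hypothetical. The 128*16 shape is the melody-matrix size given for the training dataset.

```python
import random

def random_melody_matrix(h=128, w=16, threshold=0.95, seed=None):
    """Turn uniform noise into an h*w binary matrix (list of h rows).

    Each cell becomes 1 when its noise sample exceeds `threshold`,
    otherwise 0. Thresholding is an illustrative assumption only.
    """
    rng = random.Random(seed)
    return [[1 if rng.random() > threshold else 0 for _ in range(w)]
            for _ in range(h)]
```

Passing a fixed `seed` makes the generated matrix reproducible, which is convenient when comparing the chord matrices produced for the same input.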

In this embodiment of the present application, the terminal constructs a deep convolutional generative adversarial network (DCGAN) model, obtains a training dataset, and inputs the N melody matrices in the training dataset and the corresponding N chord matrices into the DCGAN model for training, to obtain a trained DCGAN model. The terminal then inputs an obtained target melody matrix into the trained DCGAN model for processing and obtains a target chord matrix, generated by the trained DCGAN model, that matches the target melody matrix. Finally, it outputs the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix. A music file with matching chords can thus be generated automatically, with less manual processing.

Refer to Fig. 3, which is another schematic flowchart of the DCGAN-based music generation method provided by an embodiment of the present application. As shown in Fig. 3, the DCGAN-based music generation method may include the following steps:

S301: construct a deep convolutional generative adversarial network (DCGAN) model.

In some possible embodiments, for step S301 in this embodiment, refer to the implementation of step S101 in the embodiment shown in Fig. 1; details are not repeated here.

S302 obtains the twin track data set including multiple twin track music files.

S303 determines N number of target twin track music file from twin track data set.

In some possible embodiments, the terminal may obtain a MIDI data set, which may include multiple music files in MIDI format. The terminal may identify each music file in the MIDI data set that includes both a melody track and a chord track as a dual-track music file, and may take the dual-track music files in the MIDI data set as the dual-track data set. The terminal may then select, as target dual-track music files, those dual-track music files whose chords all belong to a preset basic chord set and whose number of bars equals a target threshold, thereby obtaining N target dual-track music files. The preset basic chord set may include 12 major chords and 12 minor chords. The 12 major chords are C, C#, D, D#, E, F, F#, G, G#, A, A#, B; the 12 minor chords are A, A#, B, C, C#, D, D#, E, F, F#, G, G#. Each bar of a target dual-track music file uses only one chord. The target threshold may be 16, that is, each target dual-track music file contains 16 bars. N may be an integer greater than or equal to 2.
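The selection rule can be sketched in a few lines. The in-memory representation below (a dictionary with a `bar_chords` list, labels such as "C" for C major and "Am" for A minor, and the file names) is hypothetical and chosen only for illustration; parsing real MIDI files is outside this sketch.

```python
# Hypothetical chord labels: "C" means C major, "Am" means A minor.
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
BASIC_CHORDS = set(ROOTS) | {root + "m" for root in ROOTS}  # 12 major + 12 minor
TARGET_BARS = 16  # target threshold given in the text


def is_target_file(bar_chords):
    """True if the file has exactly TARGET_BARS bars, each with one basic chord."""
    return (len(bar_chords) == TARGET_BARS
            and all(chord in BASIC_CHORDS for chord in bar_chords))


def select_targets(dataset):
    """Keep only the dual-track files that satisfy both selection rules."""
    return [f for f in dataset if is_target_file(f["bar_chords"])]


dataset = [
    {"name": "a.mid", "bar_chords": ["C", "Am", "F", "G"] * 4},   # kept
    {"name": "b.mid", "bar_chords": ["C", "G7", "F", "G"] * 4},   # G7 is not basic
    {"name": "c.mid", "bar_chords": ["C", "Am", "F", "G"] * 3},   # only 12 bars
]
targets = select_targets(dataset)
print([f["name"] for f in targets])
```

Only the first file passes both checks, so N would be 1 for this toy data set; the text requires N to be at least 2 in practice.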

In some possible embodiments, to match the input format of the DCGAN model, the terminal may split each of the N target dual-track music files sequentially into groups of 8 bars. For example, if a target dual-track music file has 18 bars in total and is split into groups of 8 bars, the first group is the first 8 bars, the second group is the middle 8 bars, and the third group is the last 2 bars.
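The 18-bar example can be reproduced with a short sketch. The function name `split_into_groups` and the use of bar numbers as stand-ins for bar contents are assumptions made for illustration.

```python
def split_into_groups(bars, group_size=8):
    """Split a list of bars into consecutive groups of at most group_size bars."""
    return [bars[i:i + group_size] for i in range(0, len(bars), group_size)]


bars = list(range(1, 19))            # bar numbers 1..18
groups = split_into_groups(bars)
print([len(g) for g in groups])      # [8, 8, 2]
print(groups[2])                     # [17, 18]
```

The last group is simply shorter when the bar count is not a multiple of 8; the text does not say whether such a remainder is padded or discarded, so the sketch leaves it as is.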

S304: Obtain the melody matrix of each of the N target dual-track music files on its melody track, obtaining N melody matrices.

In some possible embodiments, the terminal may adjust the melody of each of the N target dual-track music files to a preset pitch range. The preset pitch range may be the two octaves from C4 to B5. For example, in each target dual-track music file the terminal removes the melody notes whose pitch lies outside the preset two octaves and retains only the melody notes whose pitch lies within the two octaves from C4 to B5. The terminal may then obtain the melody notes of each adjusted target dual-track music file and generate the melody matrix of each file from them. The melody matrix may be an h*w binary matrix, where h denotes the number of MIDI notes, h=128, and w denotes the number of bars of the target dual-track music file, w=16. An element 0 in the melody matrix indicates that the adjusted target dual-track music file has no MIDI note at the position of that element; an element 1 indicates that the adjusted file has the corresponding MIDI note at the position of that element.

For example, consider generating one melody matrix from one target dual-track music file. Fig. 4a is a schematic diagram of MIDI notes provided by the embodiments of the application, and Fig. 4b is a schematic diagram of a melody matrix provided by the embodiments of the application. Here M denotes the melody matrix, whose size is 128 rows by 16 columns. Each row of M represents one MIDI note: the first row represents the first of the 128 MIDI notes, 00 (hexadecimal note code), the second row represents the second MIDI note, 01, the 13th row represents the 13th MIDI note, 0C, and so on. Each column of M represents one bar: the first column represents the first bar of the target dual-track music file, the tenth column represents the tenth bar, and so on. As shown in Fig. 4b, the element 1 in row 2, column 1 of M indicates that the 1st note of the 1st bar of the target dual-track music file is MIDI note 01; the element 1 in row 2, column 3 of M indicates that the 2nd note of the 3rd bar is MIDI note 01. The element 0 in row 1, column 5 of M indicates that the 5th bar of the target dual-track music file contains no MIDI note 00.
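A minimal sketch of building such a melody matrix is given below. It assumes the common convention in which C4 is MIDI note 60 and B5 is MIDI note 83, and it assumes the melody is already available as hypothetical (bar index, MIDI pitch) pairs; neither assumption comes from the application. Row and column indices are 0-based here, whereas the text counts rows and columns from 1.

```python
NUM_PITCHES = 128    # h: MIDI note numbers 0..127
NUM_BARS = 16        # w: bars per target file
LOW, HIGH = 60, 83   # assumed MIDI numbers for C4 and B5


def melody_matrix(notes):
    """Build a 128 x 16 binary matrix from (bar_index, midi_pitch) pairs.

    Notes outside the preset pitch range are dropped, as in the adjustment step.
    """
    matrix = [[0] * NUM_BARS for _ in range(NUM_PITCHES)]
    for bar, pitch in notes:
        if LOW <= pitch <= HIGH and 0 <= bar < NUM_BARS:
            matrix[pitch][bar] = 1
    return matrix


# (5, 40) lies below C4 and is therefore removed.
notes = [(0, 60), (0, 64), (2, 67), (5, 40), (15, 83)]
M = melody_matrix(notes)
print(len(M), len(M[0]))              # 128 16
print(M[60][0], M[67][2], M[40][5])   # 1 1 0
```

Because each column stands for a whole bar, the matrix records which pitches occur in a bar but not their order or duration inside it.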

S305: Obtain the chord matrix of each of the N target dual-track music files on its chord track, obtaining N chord matrices.

In some possible embodiments, each bar of a target dual-track music file uses only one chord. The terminal may obtain the chord used in each bar of each of the N target dual-track music files and determine the chord category of that chord (that is, whether the chord of each bar is a major chord or a minor chord). The terminal may then generate the chord matrix of each target dual-track music file from the chord used in each bar and its chord category. The chord matrix may be a w*m binary matrix, where w denotes the number of bars of the target dual-track music file, w=16, and m denotes the number of chord parameters per bar, m=13. The first 12 of the 13 chord parameters represent the 12 chords, and the 13th chord parameter represents the chord category, that is, major or minor.

For example, consider generating one chord matrix from one target dual-track music file. Fig. 5a is a schematic diagram of the 24 chords provided by the embodiments of the application. In Fig. 5a, "major" denotes major chords and "minor" denotes minor chords; "13" denotes the 13th chord parameter, where a value of 0 means the chord category is major and a value of 1 means it is minor. Fig. 5b is a schematic diagram of a chord matrix provided by the embodiments of the application. In Fig. 5b, Y denotes the chord matrix, which has 16 rows and 13 columns. Each row of Y represents one bar: the first row represents the first bar of the target dual-track music file, the fourth row represents the fourth bar, and so on. Each column of Y represents one chord parameter. In the first 12 columns, an element 0 means the corresponding chord is absent and an element 1 means it is present, and exactly one of the first 12 elements in each row of Y is 1. The 13th column of Y represents the chord category, with 0 for major and 1 for minor. As shown in Fig. 5b, the element in row 1, column 13 is 1, indicating a minor chord, so the element 1 in row 1, column 2 indicates that the 1st bar of the target dual-track music file uses the minor chord A#. The element in row 2, column 13 is 0, indicating a major chord, so the element 1 in row 2, column 4 indicates that the 2nd bar uses the major chord D#. Likewise, the element in row 16, column 13 is 0, indicating a major chord, so the element 1 in row 16, column 1 indicates that the 16th bar uses the major chord C.
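The row layout of Y can be illustrated with a small encode and decode sketch. It assumes that the first 12 columns follow the orders in which the 12 major chords (starting at C) and the 12 minor chords (starting at A) are listed in the description, which agrees with the three worked examples from Fig. 5b; the function names are hypothetical.

```python
MAJOR_ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MINOR_ROOTS = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def encode_bar(root, is_minor):
    """Return one 13-element row: a one-hot chord plus the major/minor flag."""
    roots = MINOR_ROOTS if is_minor else MAJOR_ROOTS
    row = [0] * 13
    row[roots.index(root)] = 1      # exactly one 1 among the first 12 columns
    row[12] = 1 if is_minor else 0  # 13th parameter: 0 = major, 1 = minor
    return row


def decode_bar(row):
    """Inverse of encode_bar: read the chord name back from a row of Y."""
    is_minor = row[12] == 1
    roots = MINOR_ROOTS if is_minor else MAJOR_ROOTS
    return roots[row[:12].index(1)] + (" minor" if is_minor else " major")


row1 = encode_bar("A#", is_minor=True)    # row 1 in Fig. 5b
row2 = encode_bar("D#", is_minor=False)   # row 2 in Fig. 5b
row16 = encode_bar("C", is_minor=False)   # row 16 in Fig. 5b
print(row1.index(1) + 1, row2.index(1) + 1, row16.index(1) + 1)  # 2 4 1
print(decode_bar(row1), "|", decode_bar(row2), "|", decode_bar(row16))
```

The 1-based column positions printed on the first line match the columns named in the worked examples (2, 4, and 1). A full chord matrix would stack 16 such rows, one per bar.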

S306: Input the N melody matrices in the training data set and the corresponding N chord matrices into the DCGAN model for training, obtaining a trained DCGAN model.

In some possible embodiments, step S306 may be implemented as described for step S103 of the embodiment shown in Fig. 1; details are not repeated here.

S307: Obtain a target melody matrix.

In some possible embodiments, the terminal may obtain from the MIDI data set any single-track music file that includes a melody track. The terminal may remove from the single-track music file the melody notes whose pitch lies outside the preset pitch range (the two octaves from C4 to B5) and retain only the melody notes whose pitch lies within the preset pitch range, obtaining an adjusted single-track music file. The terminal may then obtain the melody notes of the adjusted single-track music file and generate from them the target melody matrix of the adjusted single-track music file. The target melody matrix is a 128*16 binary matrix. An element 0 in the target melody matrix indicates that the adjusted single-track music file has no MIDI note at the position of that element; an element 1 indicates that the adjusted single-track music file has the corresponding MIDI note at the position of that element.

S308: Input the acquired target melody matrix into the trained DCGAN model for processing, obtain the target chord matrix that is generated by the trained DCGAN model and matches the target melody matrix, and output the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix.

In some possible embodiments, step S308 may be implemented as described for step S104 of the embodiment shown in Fig. 1; details are not repeated here.

In this embodiment of the application, the terminal constructs a deep convolutional generative adversarial network (DCGAN) model, obtains a dual-track data set that includes multiple dual-track music files, and determines N target dual-track music files from the dual-track data set. The terminal obtains the melody matrix of each of the N target dual-track music files on its melody track, obtaining N melody matrices, and obtains the chord matrix of each of the N target dual-track music files on its chord track, obtaining N chord matrices. The N melody matrices in the training data set and the corresponding N chord matrices are input into the DCGAN model for training to obtain a trained DCGAN model. A target melody matrix is then obtained and input into the trained DCGAN model for processing, yielding a target chord matrix that is generated by the trained DCGAN model and matches the target melody matrix. The terminal outputs the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix. A music file with matching chords can thus be generated automatically, reducing manual processing steps.

Fig. 6 is a schematic block diagram of the music generating device provided by the embodiments of the application. As shown in Fig. 6, the music generating device of this embodiment includes:

A construction module 10, configured to construct a deep convolutional generative adversarial network (DCGAN) model;

A first obtaining module 20, configured to obtain a training data set, the training data set including N melody matrices and corresponding N chord matrices, wherein the melody matrices and the chord matrices are binary matrices;

A training module 30, configured to input the N melody matrices in the training data set and the corresponding N chord matrices into the DCGAN model for training, obtaining a trained DCGAN model;

An input module 40, configured to input the acquired target melody matrix into the trained DCGAN model for processing and to obtain the target chord matrix that is generated by the trained DCGAN model and matches the target melody matrix;

An output module 50, configured to output the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix.

In some possible embodiments, the DCGAN model includes a generator, a discriminator, and an adjuster, each of which is a convolutional neural network (CNN). The training module 30 is specifically configured to: for any melody matrix i in the training data set, input the melody matrix i into the generator of the DCGAN model to generate a first chord matrix j that matches the melody matrix i; input the first chord matrix j and the chord matrix k that corresponds to the melody matrix i in the training data set into the discriminator of the DCGAN model to determine the probability that the first chord matrix j is identical to the chord matrix k; and judge whether the probability output by the discriminator for the first chord matrix j is within a preset range. If it is not, the probability is input into the adjuster of the DCGAN model to adjust the parameters of the transposed convolutional layers of the generator, the melody matrix i is input again into the adjusted generator to regenerate a first chord matrix j that matches the melody matrix i, and the regenerated first chord matrix j and the chord matrix k that corresponds to the melody matrix i in the training data set are input into the discriminator of the DCGAN model to determine the probability that the regenerated first chord matrix j is identical to the chord matrix k. When the probability output by the discriminator for each first chord matrix generated by the generator is within the preset range, the trained DCGAN model is obtained.
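The control flow of this training procedure (generate, discriminate, check the preset range, adjust, regenerate) can be sketched with toy stand-ins. Everything concrete below is an assumption made for illustration: the three networks are replaced by scalar arithmetic rather than CNNs, matrices are replaced by single numbers, the preset range is taken to be 0.9 to 1.0, and the update rule in `adjust` is invented. Only the loop structure follows the description.

```python
class ToyGenerator:
    """Stand-in for the generator: one scalar weight instead of CNN layers."""

    def __init__(self, weight=0.5):
        self.weight = weight

    def __call__(self, melody):
        return melody * self.weight


def discriminate(fake_chord, real_chord):
    """Toy 'probability that j is identical to k': 1 when equal, falling to 0."""
    return max(0.0, 1.0 - abs(fake_chord - real_chord))


def in_range(prob, low=0.9, high=1.0):
    """Assumed preset range for the discriminator output."""
    return low <= prob <= high


def adjust(generator, prob):
    """Toy adjuster: nudge the generator parameter using only the probability."""
    generator.weight += 0.1 * (1.0 - prob)


def train(pairs, generator, max_steps=10_000):
    """Return True once every (melody i, chord k) pair is within the preset range."""
    steps = 0
    while steps < max_steps:
        all_in_range = True
        for melody, real_chord in pairs:
            prob = discriminate(generator(melody), real_chord)
            while not in_range(prob) and steps < max_steps:
                adjust(generator, prob)   # adjuster updates the generator
                steps += 1
                all_in_range = False
                # Re-input melody i and re-check the regenerated chord j.
                prob = discriminate(generator(melody), real_chord)
        if all_in_range:
            return True
    return False


pairs = [(1.0, 2.0), (2.0, 4.0)]   # toy data: the real chord is twice the melody
gen = ToyGenerator()
trained = train(pairs, gen)
print(trained, round(gen.weight, 2))
```

The outer loop repeats until a full pass over the training pairs needs no adjustment, which corresponds to the stopping condition that the probability for each generated first chord matrix lies within the preset range.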

In some possible embodiments, the first obtaining module 20 includes a first obtaining unit 201, a determining unit 202, a second obtaining unit 203, and a third obtaining unit 204.

The first obtaining unit 201 is configured to obtain a dual-track data set that includes multiple dual-track music files, where a dual-track music file is a music file that contains a melody track and a chord track. The determining unit 202 is configured to determine N target dual-track music files from the dual-track data set. The second obtaining unit 203 is configured to obtain the melody matrix of each of the N target dual-track music files on its melody track, obtaining N melody matrices. The third obtaining unit 204 is configured to obtain the chord matrix of each of the N target dual-track music files on its chord track, obtaining N chord matrices. The chords in a target dual-track music file belong to a preset basic chord set, the basic chord set includes 12 major chords and 12 minor chords, and each bar of a target dual-track music file uses one chord.

In some possible embodiments, the second obtaining unit 203 is specifically configured to: adjust the melody of each of the N target dual-track music files to within a preset pitch range; obtain the melody notes of each adjusted target dual-track music file; and generate, from the melody notes of each adjusted target dual-track music file, the melody matrix of that file. The melody matrix is an h*w binary matrix, where h denotes a preset number of notes and w denotes the number of bars of the target dual-track music file.

In some possible embodiments, the third obtaining unit 204 is specifically configured to: obtain the chord used in each bar of each of the N target dual-track music files and the chord category of that chord; and generate, from the chord used in each bar of each target dual-track music file and its chord category, the chord matrix of that file. The chord matrix is a w*m binary matrix, where w denotes the number of bars of the target dual-track music file and m denotes the chord parameters of each bar.

In some possible embodiments, the device further includes a second obtaining module 60. The second obtaining module 60 is configured to: obtain a single-track music file that includes a melody track; adjust the melody of the single-track music file to within the preset pitch range; obtain the melody notes of the adjusted single-track music file; and generate, from those melody notes, the target melody matrix of the adjusted single-track music file.

In some possible embodiments, the generator includes at least one fully connected layer and at least one transposed convolutional layer, the discriminator includes at least one convolutional layer and at least one fully connected layer, the adjuster includes at least one convolutional layer and at least one fully connected layer, and the adjuster is a reversed generator.

In specific implementations, the music generating device can use the modules described above to carry out the implementations provided for each step in Fig. 1 or Fig. 3 and thereby realize the functions of the foregoing embodiments. For details, refer to the descriptions of the corresponding steps in the method embodiments shown in Fig. 1 or Fig. 3; they are not repeated here.

In this embodiment of the application, the music generating device constructs a deep convolutional generative adversarial network (DCGAN) model and obtains a training data set. The N melody matrices in the training data set and the corresponding N chord matrices are input into the DCGAN model for training to obtain a trained DCGAN model. The acquired target melody matrix is then input into the trained DCGAN model for processing, yielding a target chord matrix that is generated by the trained DCGAN model and matches the target melody matrix. Finally, the device outputs the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix. A music file with matching chords can thus be generated automatically, reducing manual processing steps.

Fig. 7 is a schematic block diagram of the terminal provided by the embodiments of the application. As shown in Fig. 7, the terminal in this embodiment may include one or more processors 701, one or more input devices 702, one or more output devices 703, and a memory 704. The processor 701, input device 702, output device 703, and memory 704 are connected by a bus 705. The memory 704 is configured to store a computer program that includes program instructions, and the processor 701 is configured to execute the program instructions stored in the memory 704.

The processor 701 is configured to invoke the program instructions to perform the following: constructing a deep convolutional generative adversarial network (DCGAN) model; obtaining a training data set, the training data set including N melody matrices and corresponding N chord matrices, wherein the melody matrices and the chord matrices are binary matrices; and inputting the N melody matrices in the training data set and the corresponding N chord matrices into the DCGAN model for training to obtain a trained DCGAN model. The input device 702 is configured to input the acquired target melody matrix into the trained DCGAN model for processing and to obtain the target chord matrix that is generated by the trained DCGAN model and matches the target melody matrix. The output device 703 is configured to output the music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix.

It should be understood that, in the embodiments of the application, the processor 701 may be a central processing unit (CPU), or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.

The input device 702 may include a touchpad, a microphone, and the like; the output device 703 may include a display (such as an LCD), a loudspeaker, and the like.

The memory 704 may include read-only memory and random access memory, and provides instructions and data to the processor 701. A part of the memory 704 may also include non-volatile random access memory. For example, the memory 704 may also store information about the device type.

In specific implementations, the processor 701, input device 702, and output device 703 described in the embodiments of the application can carry out the implementations described for the DCGAN-based music generating method provided by the embodiments of the application, and can also carry out the implementations of the music generating device described in the embodiments of the application; details are not repeated here.

The embodiments of the application also provide a computer-readable storage medium that stores a computer program. The computer program includes program instructions that, when executed by a processor, implement the DCGAN-based music generating method shown in Fig. 1 or Fig. 3. For details, refer to the description of the embodiments shown in Fig. 1 or Fig. 3; they are not repeated here.

The computer-readable storage medium may be an internal storage unit of the music generating device or electronic device described in any of the foregoing embodiments, such as a hard disk or memory of the electronic device. The computer-readable storage medium may also be an external storage device of the electronic device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the electronic device. Further, the computer-readable storage medium may include both an internal storage unit of the electronic device and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the electronic device, and may also be used to temporarily store data that has been output or is about to be output.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the application.

The application is described with reference to flowcharts and/or block diagrams of the method, device, and computer program product according to the embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction device, the instruction device implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are performed on the computer or other programmable equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although the application has been described in connection with specific features and embodiments, it is evident that various modifications and combinations can be made without departing from the spirit and scope of the application. Accordingly, the specification and drawings are merely exemplary illustrations of the application as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations, or equivalents within the scope of the application. Obviously, those skilled in the art can make various changes and variations to the application without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the application and their technical equivalents, the application is also intended to include them.

Claims

1. A music generating method based on DCGAN, characterized by comprising:

constructing a deep convolutional generative adversarial network (DCGAN) model;

obtaining a training data set, the training data set including N melody matrices and corresponding N chord matrices, wherein the melody matrices and the chord matrices are binary matrices;

inputting the N melody matrices in the training data set and the corresponding N chord matrices into the DCGAN model for training to obtain a trained DCGAN model;

inputting an acquired target melody matrix into the trained DCGAN model for processing, obtaining a target chord matrix that is generated by the trained DCGAN model and matches the target melody matrix, and outputting a music file obtained by merging the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix.

2. The method according to claim 1, characterized in that the DCGAN model includes a generator, a discriminator, and an adjuster, and the generator, the discriminator, and the adjuster are each a convolutional neural network (CNN);

the inputting of the N melody matrices in the training data set and the corresponding N chord matrices into the DCGAN model for training to obtain a trained DCGAN model comprises:

for any melody matrix i in the training data set, inputting the melody matrix i into the generator of the DCGAN model to generate a first chord matrix j that matches the melody matrix i;

inputting the first chord matrix j and the chord matrix k that corresponds to the melody matrix i in the training data set into the discriminator of the DCGAN model to determine the probability that the first chord matrix j is identical to the chord matrix k;

judging whether the probability output by the discriminator for the first chord matrix j is within a preset range, and if not, inputting the probability into the adjuster of the DCGAN model to adjust the parameters of the transposed convolutional layers of the generator, inputting the melody matrix i again into the adjusted generator to regenerate a first chord matrix j that matches the melody matrix i, and inputting the regenerated first chord matrix j and the chord matrix k that corresponds to the melody matrix i in the training data set into the discriminator of the DCGAN model to determine the probability that the regenerated first chord matrix j is identical to the chord matrix k;

when the probability output by the discriminator for each first chord matrix generated by the generator is within the preset range, obtaining the trained DCGAN model.

3. The method according to claim 1, characterized in that the obtaining of a training data set comprises:

obtaining a dual-track data set that includes multiple dual-track music files, a dual-track music file being a music file that contains a melody track and a chord track;

determining N target dual-track music files from the dual-track data set, wherein the chords in the target dual-track music files belong to a preset basic chord set, the basic chord set includes 12 major chords and 12 minor chords, and each bar of a target dual-track music file uses one chord;

obtaining the melody matrix of each of the N target dual-track music files on its melody track, obtaining N melody matrices;

obtaining the chord matrix of each of the N target dual-track music files on its chord track, obtaining N chord matrices.

4. The method according to claim 3, characterized in that the obtaining of the melody matrix of each of the N target dual-track music files on its melody track, obtaining N melody matrices, comprises:

adjusting the melody of each of the N target dual-track music files to within a preset pitch range;

obtaining the melody notes of each adjusted target dual-track music file;

generating, from the melody notes of each adjusted target dual-track music file, the melody matrix of each adjusted target dual-track music file, wherein the melody matrix is an h*w binary matrix, h denotes a preset number of notes, and w denotes the number of bars of the target dual-track music file.

5. The method according to claim 3, characterized in that the obtaining of the chord matrix of each of the N target dual-track music files on its chord track, obtaining N chord matrices, comprises:

obtaining the chord used in each bar of each of the N target dual-track music files and the chord category of the chord used in each bar;

generating, from the chord used in each bar of each target dual-track music file and the chord category of that chord, the chord matrix of each target dual-track music file, wherein the chord matrix is a w*m binary matrix, w denotes the number of bars of the target dual-track music file, and m denotes the chord parameters of each bar.

6. The method according to claim 1, wherein before inputting the obtained target melody matrix into the trained DCGAN model for processing, the method further comprises:

obtaining a single-track music file that includes a melody track;

adjusting the melody of the single-track music file into a preset pitch range;

obtaining the melody notes of the adjusted single-track music file;

generating, according to the melody notes of the adjusted single-track music file, the target melody matrix of the adjusted single-track music file.

7. The method according to claim 2, wherein the generator comprises at least one fully connected layer and at least one transposed convolutional layer, the discriminator comprises at least one convolutional layer and at least one fully connected layer, the adjuster comprises at least one convolutional layer and at least one fully connected layer, and the adjuster is a reversed generator.
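To make the role of the transposed convolutional layers in claim 7 concrete, the following sketch works through the output-size arithmetic of a hypothetical generator. It uses the standard transposed-convolution size relation for dilation 1, out = (in - 1) * stride - 2 * padding + kernel + output_padding. The starting feature-map size of 1*13, the three layers with kernel and stride (2, 1), and the resulting 8*13 output are assumptions chosen for illustration; the claim specifies neither layer counts nor sizes.

```python
def transposed_conv_len(size, kernel, stride=1, padding=0, output_padding=0):
    """Output length of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def generator_output_shape(start=(1, 13), layers=((2, 1), (2, 1), (2, 1))):
    """Trace the feature-map shape through a stack of transposed convolutions.

    start:  (height, width) produced by reshaping the output of the fully
            connected layer (hypothetical size).
    layers: per-layer (kernel_h, kernel_w); the stride is assumed to equal
            the kernel size, so each layer here doubles the height.
    """
    height, width = start
    for kernel_h, kernel_w in layers:
        height = transposed_conv_len(height, kernel_h, stride=kernel_h)
        width = transposed_conv_len(width, kernel_w, stride=kernel_w)
    return height, width


# Height grows 1 -> 2 -> 4 -> 8 while the width stays at 13.
example_shape = generator_output_shape()
```

Under these assumptions the generator would emit an 8*13 map, that is, a chord matrix for w = 8 bars with m = 13 chord parameters per bar. A convolutional stack with the same kernels and strides would reduce the shape along the opposite path, which is one way to read the statement that the adjuster is a reversed generator.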

8. A music generating device, comprising:

an obtaining module, configured to obtain a training data set, the training data set including N melody matrices and N corresponding chord matrices, wherein the melody matrices and the chord matrices are binary matrices;

a training module, configured to input the N melody matrices and the N corresponding chord matrices in the training data set into the DCGAN model for training, to obtain a trained DCGAN model;

an input module, configured to input the obtained target melody matrix into the trained DCGAN model for processing, and to obtain a target chord matrix that is generated by the trained DCGAN model and matches the target melody matrix;

9. A terminal, comprising a processor, an input device, an output device, and a memory, wherein the processor, the input device, the output device, and the memory are connected to one another, the memory is configured to store a computer program, the computer program comprises program instructions, and the processor is configured to invoke the program instructions to perform the method according to any one of claims 1-7.

10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, the computer program comprises program instructions, and the program instructions, when executed by a processor, cause the processor to perform the method according to any one of claims 1-7.