WO2020151150A1 - Dcgan-based music generation method, and music generation apparatus - Google Patents


Info

Publication number
WO2020151150A1
Authority
WO
WIPO (PCT)
Prior art keywords
chord
matrix
melody
track
target
Prior art date
Application number
PCT/CN2019/088805
Other languages
French (fr)
Chinese (zh)
Inventor
王义文 (Wang Yiwen)
王健宗 (Wang Jianzong)
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Publication of WO2020151150A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 — Details of electrophonic musical instruments
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 — Details of electrophonic musical instruments
    • G10H1/36 — Accompaniment arrangements
    • G10H1/38 — Chord
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H7/00 — Instruments in which the tones are synthesised from a data store, e.g. computer organs

Definitions

  • This application relates to the field of computer technology, in particular to a DCGAN-based music generation method and device.
  • In existing music generation methods, a melody is usually given, and professional musicians add chords to the given melody to obtain a music file with matching chords.
  • This requires the musician to have a strong command of music theory and instrument operation.
  • The musician is also required to have strong and sensitive musical experience. Therefore, generating a high-quality music file is bound to be restricted by the musician's skill level.
  • the embodiment of the present application provides a DCGAN-based music generation method, which can automatically generate music files with chord matching and reduce manual processing links.
  • an embodiment of the present application provides a DCGAN-based music generation method, which includes:
  • the training data set includes N melody matrices and corresponding N chord matrices, where the melody matrix and the chord matrix are both binary matrices;
  • an embodiment of the present application provides a music generating device, which includes:
  • Construction module, used to construct a deep convolutional generative adversarial network (DCGAN) model;
  • the first acquisition module is used to acquire a training data set, the training data set includes N melody matrices and corresponding N chord matrices, wherein the melody matrix and the chord matrix are both binary matrices;
  • the training module is used to input the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training to obtain a trained DCGAN model;
  • the input module is used to input the obtained target melody matrix into the trained DCGAN model for processing, and obtain the target chord matrix generated by the trained DCGAN model that matches the target melody matrix;
  • the output module is used to output a music file obtained by merging the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix.
  • an embodiment of the present application provides a terminal, including a processor, an input device, an output device, and a memory.
  • the processor, input device, output device, and memory are connected to each other, wherein the memory is used to store a computer program that supports the terminal in executing the above method;
  • the computer program includes program instructions, and the processor is configured to call the program instructions to execute the DCGAN-based music generation method of the above first aspect.
  • an embodiment of the present application provides a computer-readable storage medium that stores a computer program, and the computer program includes program instructions that, when executed by a processor, cause the processor to execute the DCGAN-based music generation method of the above first aspect.
  • a music file with chord matching can be automatically generated and manual processing steps are reduced.
  • FIG. 1 is a schematic flowchart of a DCGAN-based music generation method provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of the network structure of the DCGAN model provided by an embodiment of the present application.
  • FIG. 3 is another schematic flowchart of a DCGAN-based music generation method provided by an embodiment of the present application.
  • 4a is a schematic diagram of MIDI notes provided by an embodiment of the present application.
  • Figure 4b is a schematic diagram of a melody matrix provided by an embodiment of the present application.
  • Figure 5a is a schematic diagram of 24 chords provided by an embodiment of the present application.
  • Figure 5b is a schematic diagram of a chord matrix provided by an embodiment of the present application.
  • Fig. 6 is a schematic block diagram of a music generating device provided by an embodiment of the present application.
  • FIG. 7 is a schematic block diagram of a terminal provided by an embodiment of the present application.
  • Fig. 1 is a schematic flowchart of a DCGAN-based music generation method provided by an embodiment of the present application.
  • the DCGAN-based music generation method may include steps:
  • the terminal may construct a Deep Convolution Generative Adversarial Networks (DCGAN) model.
  • the DCGAN model can include a generator, a discriminator and a regulator.
  • the generator, discriminator and regulator are all convolutional neural networks (Convolutional Neural Network, CNN). The generator may include at least one fully connected layer and at least one transposed convolutional layer; the discriminator may include at least one convolutional layer and at least one fully connected layer; the regulator may be an inverted generator, including at least one convolutional layer and at least one fully connected layer.
  • the generator can be used to generate, from a given random sequence, a piece of music that is as realistic as possible in order to deceive the discriminator.
  • the discriminator can be used to distinguish the music generated by the generator from real music as far as possible, so that the generator and the discriminator constitute a dynamic "game process". The regulator can be used to adjust the parameters of the generator's transposed convolutional layers, so that the music generated by the generator can better deceive the discriminator.
  • FIG. 2 is a schematic diagram of the network structure of the DCGAN model provided by an embodiment of the present application.
  • Conditioner CNN represents the regulator in the DCGAN model
  • Generator CNN represents the generator in the DCGAN model
  • Discriminator CNN represents the discriminator in the DCGAN model.
  • the regulator is essentially a reverse generator
  • the regulator and the generator have convolution kernels of the same shape, and their outputs have the same shape, so the output of each convolutional layer of the regulator can be fed to the generator's corresponding transposed convolutional layer. In this way, the parameters of the generator's transposed convolutional layers can be adjusted, and the output of the generator is used as an input of the discriminator.
  • Noise z represents the random sequence of the input generator
  • X or G(z) represents the output of the generator
  • 2D conditions represents the real data (here, data not generated by the generator).
  • the terminal may obtain N training samples for training the above-mentioned DCGAN model from a preset training database, and each training sample may include a melody matrix and a corresponding chord matrix.
  • the terminal may determine the N training samples as the training sample set of the aforementioned DCGAN model, then the training sample set includes N training samples, that is, the training sample set may include N melody matrices and corresponding N chord matrices.
  • N can be an integer greater than or equal to 2.
  • the melody matrix can be a 128*16 binary matrix
  • the chord matrix can be a 16*13 binary matrix.
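The two data formats above can be sketched as binary NumPy arrays; the shapes come from the application, but the specific note and chord placements below are made-up illustrative values:

```python
import numpy as np

# A melody matrix is 128 x 16 (MIDI pitches x time steps) and a chord
# matrix is 16 x 13 (bars x chord parameters); both are binary.

melody = np.zeros((128, 16), dtype=np.uint8)
melody[60, 0] = 1   # MIDI note 60 sounds at the first time step
melody[64, 1] = 1   # MIDI note 64 sounds at the second time step

chord = np.zeros((16, 13), dtype=np.uint8)
chord[0, 0] = 1     # bar 1: a 1 in one of the first 12 columns selects the chord
                    # (the 13th column stays 0, i.e. the chord category is "major")

assert melody.shape == (128, 16) and chord.shape == (16, 13)
```

Every element of both matrices is either 0 or 1, so each matrix is simply a presence/absence grid over pitches (or chords) against time (or bars).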
  • the aforementioned DCGAN model includes a generator, a discriminator, and a regulator.
  • the generator, discriminator, and regulator are all CNN.
  • the terminal can use an alternating training method to train the generator and discriminator in the DCGAN model. Specifically, take one iteration of the training process as an example.
  • the terminal can fix the parameters on the convolutional layer of the discriminator and train the generator.
  • the terminal can input any melody matrix i in the above training data set into the generator of the DCGAN model to generate a first chord matrix j that matches the melody matrix i.
  • Then, with the parameters of the generator's transposed convolutional layers fixed, the discriminator is trained.
  • the terminal can input the first chord matrix j and the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model, which discriminates the probability that the first chord matrix j is the same as the chord matrix k (i.e., the similarity between the first chord matrix j and the chord matrix k). It is then determined whether the probability that the discriminator outputs for the first chord matrix j is within a preset range (for example, between 0.85 and 1, inclusive of 0.85 and 1). If the probability is not within the preset range, the terminal can input the probability output by the discriminator into the regulator of the DCGAN model to adjust the parameters of the generator's transposed convolutional layers.
  • the terminal can then re-input the melody matrix i into the adjusted generator to regenerate the first chord matrix j that matches the melody matrix i, and can input the regenerated first chord matrix j and the chord matrix k into the discriminator of the DCGAN model to again discriminate the probability that the first chord matrix j is the same as the chord matrix k. If the probability that the discriminator outputs for the first chord matrix j is within the preset range, the terminal can select another melody matrix from the training data set and perform another iteration of the training process. Each melody matrix in the above training data set requires one round of iteration during training; that is, with N melody matrices in the training data set, the training process has at least N rounds of iteration. When the probability that the discriminator outputs for each first chord matrix generated by the generator is within the preset range, a trained DCGAN model is obtained.
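The alternating scheme above can be sketched end to end with toy stand-ins. The real model uses (transposed) convolutional CNNs; here the generator is a single linear map with a sigmoid, the discriminator is a simple similarity score, and the "regulator" is a plain gradient nudge; all three stand-ins are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(melody_i, params):
    # 128x16 melody matrix -> 16x13 chord matrix of values in (0, 1)
    return 1.0 / (1.0 + np.exp(-(melody_i.T @ params)))

def discriminator(chord_j, chord_k):
    # probability that the generated chord matrix matches the real one
    return 1.0 - float(np.abs(chord_j - chord_k).mean())

def train(dataset, params, lr=0.01, accept=(0.85, 1.0), max_steps=200):
    # one iteration per melody/chord pair: regenerate until the
    # discriminator's output falls inside the accepted range
    for melody_i, chord_k in dataset:
        for _ in range(max_steps):
            chord_j = generator(melody_i, params)
            if accept[0] <= discriminator(chord_j, chord_k) <= accept[1]:
                break
            # "regulator" step: adjust the generator's parameters
            params += lr * (melody_i @ (chord_k - chord_j))
    return params

# toy training data set with N = 2 melody/chord pairs
melodies = [rng.integers(0, 2, (128, 16)).astype(float) for _ in range(2)]
chords = [rng.integers(0, 2, (16, 13)).astype(float) for _ in range(2)]
params = train(list(zip(melodies, chords)), rng.normal(0.0, 0.01, (128, 13)))
```

The inner loop mirrors the text: generate, score, and only move on to the next melody matrix once the score reaches the preset 0.85–1 range (or the step budget runs out).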
  • the training process of the aforementioned DCGAN model can be represented by the following function 1-1 (the standard GAN minimax objective):

    min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]    (1-1)
  • p_data in function 1-1 represents the N chord matrices in the training data set
  • p_z represents the N melody matrices in the training data set.
  • D stands for discriminator and G stands for generator.
  • G(z) represents the output of the generator
  • D(x) represents the output of the discriminator (the value of D(x) is in the range of 0 to 1, including 0 and 1).
  • Training D maximizes log D(x), while training G minimizes log(1 − D(G(z))), that is, G tries to maximize D's error.
  • the training process usually fixes one party (such as the discriminator D) and updates the parameters of the other network (such as the generator G), alternating iterations so that each maximizes the other's error.
  • When G converges, the training of G and D is completed, and a trained DCGAN model is obtained.
  • the generator of the DCGAN model adds feature matching during the learning process.
  • Feature matching can be represented by the following function 1-2:

    L_FM = λ1 · ‖E[x] − E[G(z)]‖² + λ2 · ‖E[f(x)] − E[f(G(z))]‖²    (1-2)

  • E in function 1-2 represents the mean value, x represents the chord matrices in the training data set, z represents the melody matrices in the training data set, and G(z) represents the output of the generator.
  • f denotes the first convolutional layer of the discriminator, and λ1, λ2 represent the tuning parameters of the generator.
  • the tuning parameters can be kept within a range that avoids systematic distortion.
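A numerical sketch of a feature-matching term of the form λ1·‖E[x] − E[G(z)]‖² + λ2·‖E[f(x)] − E[f(G(z))]‖², consistent with the symbols defined above. The stand-in feature map f (a fixed random linear layer with ReLU), the toy batches, and the λ values are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

x = rng.integers(0, 2, (8, 16 * 13)).astype(float)  # batch of real chord matrices, flattened
g_z = rng.random((8, 16 * 13))                      # batch of generator outputs

W = rng.normal(size=(16 * 13, 32))                  # stand-in for the discriminator's
                                                    # first convolutional layer
def f(batch):
    return np.maximum(batch @ W, 0.0)               # ReLU "first-layer" features

lam1, lam2 = 0.1, 1.0                               # tuning parameters λ1, λ2 (assumed values)
loss = (lam1 * np.sum((x.mean(axis=0) - g_z.mean(axis=0)) ** 2)
        + lam2 * np.sum((f(x).mean(axis=0) - f(g_z).mean(axis=0)) ** 2))
```

Both terms compare batch means (the E operator) rather than individual samples, which is what lets the generator match feature statistics of real data instead of memorizing single chords.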
  • S104: Input the obtained target melody matrix into the trained DCGAN model for processing, obtain a target chord matrix matching the target melody matrix generated by the trained DCGAN model, and output the music file obtained by merging the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix.
  • the terminal may obtain a target melody matrix after obtaining the trained DCGAN model.
  • the target melody matrix may be a binary matrix directly input by the user, or may be a binary matrix randomly generated by the terminal. For example, the terminal may first obtain random noise (Gaussian noise, uniform noise, etc.), then process the obtained random noise into a matrix with the same data format as the melody matrices in the above training data set, and determine the resulting matrix as the target melody matrix. After obtaining the target melody matrix, the terminal can input it into the above-mentioned trained DCGAN model to generate a target chord matrix that matches the target melody matrix.
  • the terminal can obtain the target chord matrix generated by the trained DCGAN model, and can map the target melody matrix to a melody track, and the target chord matrix to a chord track.
  • the terminal can merge the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix to obtain a merged music file, which can be output in musical instrument digital interface (MIDI) format.
  • the size of the target chord matrix is the same as the size of the chord matrix in the training data set.
  • the merged music file includes both melody and chords. For example, at time t, the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix simultaneously sound their respective notes.
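A hypothetical in-memory "merge" of the two tracks, showing how a melody matrix and a chord matrix line up on a shared timeline. A real implementation would write a MIDI file; the dictionary timeline and the chord column orders (C-based for major, A-based for minor, inferred from the Figure 5a/5b examples) are assumptions:

```python
import numpy as np

MAJOR = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MINOR = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def merge(melody, chord):
    # melody: 128x16 binary, chord: 16x13 binary -> {time: {"notes", "chord"}}
    timeline = {t: {"notes": [], "chord": None} for t in range(melody.shape[1])}
    for pitch, t in zip(*np.nonzero(melody)):
        timeline[int(t)]["notes"].append(int(pitch))
    for bar, row in enumerate(chord):
        order = MINOR if row[12] else MAJOR
        quality = "minor" if row[12] else "major"
        timeline[bar]["chord"] = f"{order[int(np.argmax(row[:12]))]} {quality}"
    return timeline

melody = np.zeros((128, 16), dtype=np.uint8)
melody[60, 0] = 1                     # one melody note at step 0
chord = np.zeros((16, 13), dtype=np.uint8)
chord[:, 0] = 1                       # every bar: C major
merged = merge(melody, chord)         # at step 0, note 60 and "C major" sound together
```

At each time index, the melody note(s) and the bar's chord are emitted together, matching the "simultaneously sound their respective notes" behavior described above.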
  • the embodiment of this application constructs a DCGAN model and trains it with the melody matrices and chord matrices to obtain a trained DCGAN model, and then inputs the target melody matrix (which can be derived from random noise) into the trained DCGAN model.
  • the trained DCGAN model generates a target chord matrix that matches the target melody matrix, so that music files with matching chords can be generated automatically, thereby saving manpower and reducing manual processing steps.
  • the terminal constructs a deep convolutional generative adversarial network (DCGAN) model, obtains a training data set, and then inputs the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training.
  • FIG. 3 is another schematic flowchart of a DCGAN-based music generation method provided by an embodiment of the present application.
  • the DCGAN-based music generation method may include steps:
  • step S301 in the embodiment of the present application can refer to the implementation manner of step S101 in the embodiment shown in FIG. 1, which will not be repeated here.
  • S302 Acquire a dual-track data set including multiple dual-track music files.
  • the terminal may obtain a MIDI data set, and the MIDI data set may include multiple music files in MIDI format.
  • the terminal may determine a music file including a melody track and a chord track in the MIDI data set as a dual-track music file, and may use multiple dual-track music files in the MIDI data set as a dual-track data set.
  • the terminal may determine a dual-track music file whose chords in the dual-track data set belong to the preset basic chord set and whose number of bars is equal to the target threshold as the target dual-track music file, and obtain N target dual-track music files.
  • the preset basic chord set may include 12 major chords and 12 minor chords.
  • the 12 major chords are: C, C#, D, D#, E, F, F#, G, G#, A, A#, B; the 12 minor chords are: A, A#, B, C, C#, D, D#, E, F, F#, G, G#.
  • the target threshold may be 16, that is, each target dual-track music file includes 16 bars.
  • N can be an integer greater than or equal to 2.
  • the terminal may sequentially divide each of the aforementioned N target dual-track music files into groups of 8 bars.
  • For example, if a target dual-track music file has a total of 18 bars, it is divided into groups of 8 bars: the first group is the first 8 bars, the second group is the middle 8 bars, and the third group is the last 2 bars.
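The grouping rule above can be sketched as a one-line helper (the function name is illustrative):

```python
def split_into_groups(n_bars, group=8):
    # sequentially split a file's bars into groups of up to 8 bars;
    # e.g. 18 bars -> groups of 8, 8 and 2, as in the example above
    return [min(group, n_bars - start) for start in range(0, n_bars, group)]
```

So `split_into_groups(18)` yields `[8, 8, 2]`, and a 16-bar target file splits evenly into `[8, 8]`.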
  • S304 Obtain the melody matrix of each target dual-track music file on the melody track among the N target dual-track music files, to obtain N melody matrices.
  • the terminal may adjust the melody of each target dual-track music file among the N target dual-track music files to within a preset pitch range.
  • the preset pitch range can be between the two octaves of C4 to B5.
  • the terminal removes the melody notes in each target dual-track music file whose pitches are not within the preset two octaves, and keeps only the melody notes whose pitches lie between C4 and B5.
  • the terminal may obtain the adjusted melody notes in each target dual-track music file, and may generate the melody matrix of each target dual-track music file according to the adjusted melody notes in each target dual-track music file.
  • Element 0 in the melody matrix can be used to indicate that the adjusted target dual-track music file has no MIDI notes at the position corresponding to the 0 element, and element 1 in the melody matrix can be used to indicate the adjusted target dual-track music file There is a corresponding MIDI note at the position corresponding to the 1 element.
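The pitch filtering and matrix construction steps above can be sketched as follows. Taking C4..B5 as MIDI numbers 60..83 is an assumption (the common "C4 = 60" convention; MIDI octave numbering varies between tools), and the `(time_step, midi_pitch)` note representation is illustrative:

```python
import numpy as np

LOW, HIGH = 60, 83   # C4..B5 under the C4 = 60 convention (assumed)

def melody_matrix(notes, n_steps=16):
    # notes: (time_step, midi_pitch) pairs; pitches outside C4..B5 are
    # removed, the rest become 1-elements of a 128 x n_steps binary matrix
    m = np.zeros((128, n_steps), dtype=np.uint8)
    for t, p in notes:
        if LOW <= p <= HIGH:
            m[p, t] = 1
    return m

m = melody_matrix([(0, 60), (1, 84), (2, 72)])  # pitch 84 (above B5) is filtered out
```

The resulting matrix has a 1 exactly where a retained MIDI note sounds, and 0 everywhere else, matching the element semantics described above.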
  • FIG. 4a is a schematic diagram of MIDI notes provided by an embodiment of the present application
  • FIG. 4b is a schematic diagram of a melody matrix provided by an embodiment of the present application.
  • M represents the melody matrix
  • the size of M is 128 rows and 16 columns.
  • Each line in M represents a MIDI note.
  • the first row represents the first of the 128 MIDI notes, note 00 (hexadecimal note code)
  • the second row represents the second of the 128 MIDI notes, note 01
  • the thirteenth row represents the thirteenth of the 128 MIDI notes, note 0C, and so on.
  • Each column in M represents a measure, for example, the first column represents the first measure in the target two-track music file, and the tenth column represents the tenth measure in the target two-track music file.
  • the element 1 in the second row and first column of M indicates that the first note in the first bar of the target dual-track music file is MIDI note 01;
  • the element 1 in the second row and third column of M indicates that the second note of the third bar of the target dual-track music file is MIDI note 01.
  • the element 0 in the first row and fifth column of M indicates that MIDI note 00 does not sound in the fifth bar of the target dual-track music file.
  • only one chord is used in each measure of the target two-track music file.
  • the terminal can obtain the chord used in each bar of each of the above N target dual-track music files, and can determine the chord category of the chord used in each bar (that is, whether each bar's chord is a major chord or a minor chord).
  • the terminal may generate the chord matrix of each target dual-track music file according to the chord adopted in each bar of the target dual-track music file and the chord type of the chord adopted in each bar.
  • of the 13 chord parameters, the first 12 represent the 12 chords respectively, and the 13th represents the chord category, namely major chord or minor chord.
  • Fig. 5a is a schematic diagram of the 24 chords provided in an embodiment of the present application. In Figure 5a, "major" represents a major chord and "minor" a minor chord; "13" denotes the 13th chord parameter: a value of 0 means the chord category is a major chord, and a value of 1 means it is a minor chord. As shown in Fig. 5b, Fig. 5b is a schematic diagram of a chord matrix provided by an embodiment of the present application, in which Y represents the chord matrix; Y has 16 rows and 13 columns.
  • Each row in Y represents a bar; for example, the first row represents the first bar in the target dual-track music file, the fourth row represents the fourth bar, and so on.
  • Each column in Y represents a chord parameter. A 0 element in the first 12 columns indicates that the corresponding chord is absent, a 1 element indicates that the corresponding chord is present, and each row of Y contains exactly one 1 among its first 12 elements.
  • the 13th column of Y indicates the chord category, 0 indicates a major chord, and 1 indicates a minor chord.
  • the element in the first row and the 13th column is 1, indicating a minor chord
  • the element 1 in the first row and the second column indicates that the first bar of the target dual-track music file adopts the minor chord A#.
  • the element in the second row and the 13th column is 0, indicating a major chord
  • the element 1 in the second row and the fourth column indicates that the second measure of the target dual-track music file uses the major chord D#.
  • the element in row 16 and column 13 is 0, indicating a major chord
  • the element 1 in row 16 and column 1 indicates that the 16th bar of the target dual-track music file uses a major chord C.
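Encoding per-bar chord labels into the 16 x 13 layout can be sketched as below. The column orders are inferred from the worked examples above ("A# minor" in column 2, "D# major" in column 4, "C major" in column 1): major chords appear to be indexed in C-based order and minor chords in A-based order, which is an assumption, as is the `(root, quality)` label format:

```python
import numpy as np

MAJOR = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MINOR = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def chord_matrix(bar_chords):
    # bar_chords: one (root, quality) label per bar, e.g. ("A#", "minor")
    y = np.zeros((16, 13), dtype=np.uint8)
    for bar, (root, quality) in enumerate(bar_chords):
        order = MINOR if quality == "minor" else MAJOR
        y[bar, order.index(root)] = 1                 # one-hot chord in the first 12 columns
        y[bar, 12] = 1 if quality == "minor" else 0   # 13th column: chord category
    return y

# first bar A# minor, second bar D# major, remaining bars C major,
# mirroring the worked examples in the text
y = chord_matrix([("A#", "minor"), ("D#", "major")] + [("C", "major")] * 14)
```

Each row then carries exactly one 1 in its first 12 columns plus the category flag, as the matrix description above requires.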
  • step S306 in the embodiment of the present application can refer to the implementation manner of step S103 in the embodiment shown in FIG. 1, which will not be repeated here.
  • the terminal may obtain any single-track music file including a melody track from the MIDI data set.
  • the terminal can remove the melody notes in the single-track music file whose pitches fall outside the preset pitch range (the two octaves from C4 to B5), retaining only the melody notes whose pitches are within the preset pitch range, to obtain an adjusted single-track music file.
  • the terminal may obtain the adjusted melody note of the single-track music file, and may generate the target melody matrix of the adjusted single-track music file according to the adjusted melody note of the single-track music file.
  • the target melody matrix is a binary matrix of 128*16.
  • Element 0 in the target melody matrix can be used to indicate that the adjusted single-track music file has no MIDI notes at the position corresponding to the 0 element
  • element 1 in the target melody matrix can be used to indicate that the adjusted single-track music file has a corresponding MIDI note at the position corresponding to the 1 element.
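Alternatively, as described earlier, a target melody matrix can come from random noise. The sketch below draws Gaussian or uniform noise and binarizes it into the 128 x 16 format; the thresholding step and the `density` parameter are assumptions, since the application only says the noise is processed into the same data format as the melody matrices:

```python
import numpy as np

rng = np.random.default_rng(42)

def target_melody_from_noise(kind="gaussian", density=0.05):
    # draw noise, then keep roughly the top `density` fraction of cells as 1s
    z = rng.normal(size=(128, 16)) if kind == "gaussian" else rng.uniform(size=(128, 16))
    return (z > np.quantile(z, 1.0 - density)).astype(np.uint8)

target = target_melody_from_noise()   # binary 128 x 16, ready for the trained model
```

A low density keeps the matrix sparse, which resembles the real melody matrices where most pitch/time-step cells are 0.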
  • step S308 in the embodiment of the present application can refer to the implementation manner of step S104 in the embodiment shown in FIG. 1, and details are not described herein again.
  • the terminal constructs a deep convolutional generative adversarial network (DCGAN) model, then obtains a dual-track data set including multiple dual-track music files, determines N target dual-track music files from the dual-track data set, obtains the melody matrix of each target dual-track music file on its melody track to obtain N melody matrices, and obtains the chord matrix of each target dual-track music file on its chord track to obtain N chord matrices.
  • the N melody matrices and the corresponding N chord matrices in the training data set are input into the DCGAN model for training, and a trained DCGAN model is obtained.
  • FIG. 6 is a schematic block diagram of a music generating apparatus provided by an embodiment of the present application.
  • the music generating device of the embodiment of the present application includes:
  • the construction module 10 is used to construct a deep convolutional generative adversarial network (DCGAN) model
  • the first obtaining module 20 is configured to obtain a training data set, the training data set includes N melody matrices and corresponding N chord matrices, wherein the melody matrix and the chord matrix are both binary matrices;
  • the training module 30 is configured to input the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training to obtain a trained DCGAN model;
  • the input module 40 is configured to input the obtained target melody matrix into the trained DCGAN model for processing, and obtain the target chord matrix generated by the trained DCGAN model that matches the target melody matrix;
  • the output module 50 is configured to output a music file obtained by merging the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix.
  • the aforementioned DCGAN model includes a generator, a discriminator, and a regulator, and the generator, the discriminator, and the regulator are all convolutional neural networks (CNN).
  • the above-mentioned training module 30 is specifically configured to: for any melody matrix i in the training data set, input the melody matrix i into the generator of the DCGAN model to generate a first chord matrix j that matches the melody matrix i; input the first chord matrix j and the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model, which discriminates the probability that the first chord matrix j is the same as the chord matrix k; determine whether the probability output for the first chord matrix j is within the preset range, and if not, input the probability into the regulator of the DCGAN model to adjust the parameters of the generator's transposed convolutional layers, re-input the melody matrix i into the adjusted generator to regenerate the first chord matrix j matching the melody matrix i, and input the regenerated first chord matrix j and the chord matrix k corresponding to the melody matrix i into the discriminator of the DCGAN model to discriminate the probability that the regenerated first chord matrix j is the same as the chord matrix k. When the probability that the discriminator outputs for each first chord matrix generated by the generator is within the preset range, a trained DCGAN model is obtained.
  • the aforementioned first acquiring module 20 includes a first acquiring unit 201, a determining unit 202, a second acquiring unit 203, and a third acquiring unit 204.
  • the above-mentioned first obtaining unit 201 is configured to obtain a dual-track data set including a plurality of dual-track music files, where a dual-track music file is a music file containing a melody track and a chord track; the determining unit 202 is used to determine N target dual-track music files from the dual-track data set; the second obtaining unit 203 is used to obtain the melody matrix of each of the N target dual-track music files on its melody track, to obtain N melody matrices; the third obtaining unit 204 is used to obtain the chord matrix of each of the N target dual-track music files on its chord track, to obtain N chord matrices.
  • the chords in the target dual-track music file belong to a preset basic chord set.
  • the basic chord set includes 12 major chords and 12 minor chords. Each measure of the target dual-track music file uses one chord.
  • the aforementioned second obtaining unit 203 is specifically configured to: adjust the melody of each of the N target dual-track music files to within a preset pitch range; obtain the adjusted melody notes in each target dual-track music file; and, according to the adjusted melody notes in each target dual-track music file, generate the melody matrix of each adjusted target dual-track music file, the melody matrix being an h*w binary matrix,
  • where h is used to represent the preset number of notes
  • and w is used to represent the number of bars of the target dual-track music file.
  • the above-mentioned third obtaining unit 204 is specifically configured to: obtain the chord used in each bar of each of the N target dual-track music files and the chord category of the chord used in each bar; and, according to the chord used in each bar and the chord category of the chord used in each bar, generate the chord matrix of each target dual-track music file, the chord matrix being a w*m binary matrix, where w is used to represent the number of bars of the target dual-track music file and m is used to represent the chord parameters of each bar.
  • the device further includes a second acquisition module 60.
  • the second acquiring module 60 is configured to: acquire a single-track music file including a melody track; adjust the melody of the single-track music file to within a preset pitch range; acquire the melody notes in the adjusted single-track music file; and, according to the melody notes in the adjusted single-track music file, generate the target melody matrix of the adjusted single-track music file.
  • the generator includes at least one fully connected layer and at least one transposed convolutional layer
  • the discriminator includes at least one convolutional layer and at least one fully connected layer
  • the regulator includes at least one convolutional layer.
  • the regulator is a reverse generator.
  • through the above-mentioned modules, the above-mentioned music generating device can execute the implementation of each step provided in the embodiments of Figure 1 or Figure 3, so as to realize the functions realized in the above-mentioned embodiments.
  • for details, please refer to the corresponding descriptions of each step in the method embodiments shown in FIG. 1 or FIG. 3, which will not be repeated here.
  • the music generating device constructs a deep convolutional generative adversarial network (DCGAN) model, obtains a training data set, inputs the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training to obtain a trained DCGAN model, then inputs the obtained target melody matrix into the trained DCGAN model for processing, obtains the target chord matrix generated by the trained DCGAN model that matches the target melody matrix, and finally outputs a music file in which the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix are combined.
  • FIG. 7 is a schematic block diagram of a terminal provided in an embodiment of the present application.
  • the terminal in this embodiment of the present application may include: one or more processors 701; one or more input devices 702, one or more output devices 703, and a memory 704.
  • the aforementioned processor 701, input device 702, output device 703, and memory 704 are connected via a bus 705.
  • the memory 704 is configured to store a computer program including program instructions, and the processor 701 is configured to execute the program instructions stored in the memory 704.
  • the processor 701 is configured to call the program instructions to: construct a deep convolutional generative adversarial network (DCGAN) model; obtain a training data set that includes N melody matrices and corresponding N chord matrices, where the melody matrices and the chord matrices are both binary matrices; and input the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training, to obtain a trained DCGAN model.
  • the input device 702 is configured to input the acquired target melody matrix into the trained DCGAN model for processing, and acquire a target chord matrix generated by the trained DCGAN model that matches the target melody matrix.
  • the output device 703 is configured to output a music file in which the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix are combined.
  • the processor 701 may be a central processing unit (CPU), and the processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the input device 702 may include a touch panel, a microphone, and the like.
  • the output device 703 may include a display (e.g., an LCD), a speaker, and the like.
  • the memory 704 may include a read-only memory and a random access memory, and provides instructions and data to the processor 701. A part of the memory 704 may also include a non-volatile random access memory. For example, the memory 704 may also store device type information.
  • the processor 701, input device 702, and output device 703 described in the embodiments of this application can perform the implementations described in the DCGAN-based music generation method provided in the embodiments of this application, and can also perform the implementations of the music generating device described in the embodiments of this application, which are not repeated here.
  • the embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, and the computer program includes program instructions.
  • when the program instructions are executed by a processor, the DCGAN-based music generation method shown in FIG. 1 or FIG. 3 is implemented.
  • for the specific details of the DCGAN-based music generation method, refer to the descriptions of the embodiments shown in FIG. 1 or FIG. 3, which are not repeated here.
  • the above-mentioned computer-readable storage medium may be the internal storage unit of the music generating apparatus or terminal described in any of the foregoing embodiments, such as the hard disk or memory of the terminal.
  • the computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal.
  • the computer-readable storage medium may also include both an internal storage unit of the terminal and an external storage device.
  • the computer-readable storage medium is used to store the computer program and other programs and data required by the terminal.
  • the computer-readable storage medium can also be used to temporarily store data that has been output or will be output.


Abstract

Provided are a DCGAN-based music generation method, and a music generation apparatus. The method comprises: constructing a deep convolution generative adversarial network (DCGAN) model (S101); then acquiring a training data set (S102); next, inputting N melody matrices and N corresponding chord matrices in the training data set into the DCGAN model for training so as to obtain a trained DCGAN model (S103); and then inputting an acquired target melody matrix into the trained DCGAN model for processing, acquiring a target chord matrix generated by the trained DCGAN model and matching the target melody matrix, and finally, outputting a music file formed after combining a melody track mapped by the target melody matrix with a chord track mapped by the target chord matrix (S104). By means of the music generation method, a music file with matching chords can be automatically generated, thereby reducing manual handling steps.

Description

A DCGAN-based music generation method and device
This application claims priority to a Chinese patent application filed with the Chinese Patent Office on January 23, 2019, with application number 2019100661308 and entitled "A DCGAN-based music generation method and device", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of computer technology, and in particular to a DCGAN-based music generation method and device.
Background
At present, music with chord accompaniment is usually produced by giving a melody to a professional musician, who writes matching chords for it to obtain a music file with matched chords. Concretely, writing chords for a melody requires the arranger to have solid technical grounding in music theory and instrument operation, as well as a strong and sensitive musical ear. Generating a high-quality piece of music is therefore inevitably limited by the arranger's skill level.
Summary of the invention
The embodiments of this application provide a DCGAN-based music generation method that can automatically generate music files with matching chords, reducing manual processing steps.
In a first aspect, an embodiment of this application provides a DCGAN-based music generation method, which includes:
constructing a deep convolutional generative adversarial network (DCGAN) model;
obtaining a training data set that includes N melody matrices and N corresponding chord matrices, where both the melody matrices and the chord matrices are binary matrices;
inputting the N melody matrices and the N corresponding chord matrices in the training data set into the DCGAN model for training, to obtain a trained DCGAN model; and
inputting an obtained target melody matrix into the trained DCGAN model for processing, obtaining a target chord matrix generated by the trained DCGAN model that matches the target melody matrix, and outputting a music file in which the melody track mapped from the target melody matrix is merged with the chord track mapped from the target chord matrix.
In a second aspect, an embodiment of this application provides a music generation device, which includes:
a construction module, configured to construct a deep convolutional generative adversarial network (DCGAN) model;
a first acquisition module, configured to obtain a training data set that includes N melody matrices and N corresponding chord matrices, where both the melody matrices and the chord matrices are binary matrices;
a training module, configured to input the N melody matrices and the N corresponding chord matrices in the training data set into the DCGAN model for training, to obtain a trained DCGAN model;
an input module, configured to input an obtained target melody matrix into the trained DCGAN model for processing and obtain a target chord matrix generated by the trained DCGAN model that matches the target melody matrix; and
an output module, configured to output a music file in which the melody track mapped from the target melody matrix is merged with the chord track mapped from the target chord matrix.
In a third aspect, an embodiment of this application provides a terminal, including a processor, an input device, an output device, and a memory that are connected to one another, where the memory is configured to store a computer program supporting the terminal in executing the above method. The computer program includes program instructions, and the processor is configured to call the program instructions to execute the DCGAN-based music generation method of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program. The computer program includes program instructions that, when executed by a processor, cause the processor to execute the DCGAN-based music generation method of the first aspect.
By training a DCGAN model and using the trained model to generate chord matrices, the embodiments of this application can automatically generate music files with matching chords and reduce manual processing steps.
Description of the drawings
FIG. 1 is a schematic flowchart of a DCGAN-based music generation method provided by an embodiment of this application;
FIG. 2 is a schematic diagram of the network structure of the DCGAN model provided by an embodiment of this application;
FIG. 3 is another schematic flowchart of a DCGAN-based music generation method provided by an embodiment of this application;
FIG. 4a is a schematic diagram of MIDI notes provided by an embodiment of this application;
FIG. 4b is a schematic diagram of a melody matrix provided by an embodiment of this application;
FIG. 5a is a schematic diagram of the 24 chords provided by an embodiment of this application;
FIG. 5b is a schematic diagram of a chord matrix provided by an embodiment of this application;
FIG. 6 is a schematic block diagram of a music generation device provided by an embodiment of this application;
FIG. 7 is a schematic block diagram of a terminal provided by an embodiment of this application.
Detailed description
The DCGAN-based music generation method and device provided by the embodiments of this application are described below with reference to FIG. 1 to FIG. 7.
Refer to FIG. 1, which is a schematic flowchart of a DCGAN-based music generation method provided by an embodiment of this application. As shown in FIG. 1, the DCGAN-based music generation method may include the following steps.
S101: Construct a deep convolutional generative adversarial network (DCGAN) model.
In some feasible implementations, the terminal may construct a deep convolutional generative adversarial network (DCGAN) model. The DCGAN model may include a generator, a discriminator, and a conditioner, all of which are convolutional neural networks (CNNs). The generator may include at least one fully connected layer and at least one transposed convolutional layer; the discriminator may include at least one convolutional layer and at least one fully connected layer; the conditioner may be a reversed generator, including at least one convolutional layer and at least one fully connected layer. The generator is used to generate, from a given random sequence, a piece of music that is as realistic as possible so as to fool the discriminator, and the discriminator is used to distinguish the music generated by the generator from real music as well as possible; the generator and the discriminator thus form a dynamic game. The conditioner is used to adjust the parameters of the generator's transposed convolutional layers so that the music generated by the generator can better fool the discriminator.
As shown in FIG. 2, which is a schematic diagram of the network structure of the DCGAN model provided by an embodiment of this application, Conditioner CNN denotes the conditioner in the DCGAN model, Generator CNN denotes the generator, and Discriminator CNN denotes the discriminator. Since the conditioner is essentially a reversed generator, it has the same convolution kernel shapes as the generator, and its outputs have the same shapes as the generator's; the output of each convolutional layer of the conditioner is therefore fed to the corresponding transposed convolutional layer of the generator so that the parameters of the generator's transposed convolutional layers can be adjusted, and the output of the generator serves as one input of the discriminator. Noise z denotes the random sequence input to the generator, X or G(z) denotes the output of the generator, and 2D conditions denote real data (here, data not produced by the generator).
S102: Obtain a training data set.
In some feasible implementations, the terminal may obtain, from a preset training database, N training samples for training the above DCGAN model, where each training sample may include one melody matrix and one corresponding chord matrix. The terminal may determine the N training samples as the training sample set of the DCGAN model, so the training sample set includes N training samples, i.e., N melody matrices and the corresponding N chord matrices. N may be an integer greater than or equal to 2. A melody matrix may be a 128*16 binary matrix, and a chord matrix may be a 16*13 binary matrix.
S103: Input the N melody matrices and the N corresponding chord matrices in the training data set into the DCGAN model for training, to obtain a trained DCGAN model.
In some feasible implementations, the above DCGAN model includes a generator, a discriminator, and a conditioner, all of which are CNNs. The terminal may train the generator and the discriminator in the DCGAN model alternately. Take one iteration of the training process as an example. With the parameters of the discriminator's convolutional layers fixed, the generator is trained: the terminal may input any melody matrix i from the training data set into the generator of the DCGAN model to generate a first chord matrix j matching the melody matrix i. Then, with the parameters of the generator's transposed convolutional layers fixed, the discriminator is trained: the terminal may input the first chord matrix j together with the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model, which estimates the probability that the first chord matrix j is the same as the chord matrix k (i.e., the similarity between the first chord matrix j and the chord matrix k). The terminal determines whether the probabilities output by the discriminator for the first chord matrix j are all within a preset range (for example, between 0.85 and 1, inclusive). If the probability output by the discriminator for the first chord matrix j is not within the preset range, the terminal may input the probability output by the discriminator into the conditioner of the DCGAN model to adjust the parameters of the generator's transposed convolutional layers. The terminal may then re-input the melody matrix i into the adjusted generator to regenerate the first chord matrix j, and input the regenerated first chord matrix j together with the chord matrix k into the discriminator again. If the probability output by the discriminator for the first chord matrix j is within the preset range, the terminal may select another melody matrix from the training data set and perform another iteration of the training process. One iteration is performed for each melody matrix in the training data set, i.e., with N melody matrices in the training data set, the training process has at least N iterations. When the probabilities output by the discriminator for all the first chord matrices generated by the generator are within the preset range, the trained DCGAN model is obtained.
In some feasible implementations, the training process of the above DCGAN model can be expressed by the following function (1-1):
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \qquad (1\text{-}1)$$
Here, p_data in function (1-1) denotes the N chord matrices in the training data set, and p_z denotes the N melody matrices in the training data set. D denotes the discriminator and G denotes the generator. G(z) denotes the output of the generator, and D(x) denotes the output of the discriminator (the value of D(x) lies between 0 and 1, inclusive). Training D maximizes log D(x), while training G minimizes log(1 - D(G(z))), i.e., maximizes D's loss. The training process usually fixes one side (e.g., the discriminator D) and updates the parameters of the other network (e.g., the generator G), alternating iterations so as to maximize the other side's error. Finally, when G converges, the training of G and D is complete and the trained DCGAN model is obtained.
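As a quick numeric check of function (1-1), the value V(D, G) for a small batch can be computed directly. The discriminator outputs below are made-up illustrative values, chosen only to show that training D pushes the value up while training G pushes it down.

```python
import math

# Made-up discriminator outputs for a tiny batch (illustrative values only,
# not produced by any real model):
d_real = [0.9, 0.8]   # D(x) for real chord matrices x ~ p_data
d_fake = [0.3, 0.1]   # D(G(z)) for generated chord matrices, z ~ p_z

# V(D, G) from function (1-1): E[log D(x)] + E[log(1 - D(G(z)))]
v = (sum(math.log(p) for p in d_real) / len(d_real)
     + sum(math.log(1.0 - p) for p in d_fake) / len(d_fake))
print(round(v, 4))  # training D increases this value; training G decreases it
```

A confident discriminator (d_real near 1, d_fake near 0) drives V toward its maximum of 0; a successful generator forces d_fake upward and V back down.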
In some feasible implementations, the generator of the DCGAN model incorporates feature matching in its learning process. Feature matching can be expressed by the following function (1-2):
$$\lambda_1 \left\| \mathbb{E}\,X - \mathbb{E}\,G(z) \right\|_2^2 + \lambda_2 \left\| \mathbb{E}\,f(X) - \mathbb{E}\,f(G(z)) \right\|_2^2 \qquad (1\text{-}2)$$
Here, E in function (1-2) denotes the mean, X denotes a chord matrix in the above training data set, z denotes a melody matrix in the training data set, and G(z) denotes the output of the generator. f denotes the first convolutional layer of the discriminator, and λ1 and λ2 denote the tuning parameters of the generator, which may take any values within the range in which the system is not distorted.
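Function (1-2) can be illustrated numerically. The tiny 2*2 "matrices", the global-average-pooling stand-in for f, and the λ values below are all assumptions made for the sake of a runnable sketch; in the model itself, f is the discriminator's first convolutional layer and the matrices are 16*13 chord matrices.

```python
def mean_matrix(batch):
    # element-wise mean over a batch of equally sized matrices (the E[.] terms)
    n, rows, cols = len(batch), len(batch[0]), len(batch[0][0])
    return [[sum(m[r][c] for m in batch) / n for c in range(cols)]
            for r in range(rows)]

def sq_l2(a, b):
    # squared L2 distance between two matrices
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def f(m):
    # stand-in for the discriminator's first layer: global average pooling
    vals = [v for row in m for v in row]
    return [[sum(vals) / len(vals)]]

real = [[[1, 0], [0, 1]], [[1, 1], [0, 0]]]   # "real" chord matrices X
fake = [[[0, 0], [1, 0]]]                     # generator outputs G(z)

lam1, lam2 = 0.1, 1.0                         # tuning parameters (assumed values)
loss = (lam1 * sq_l2(mean_matrix(real), mean_matrix(fake))
        + lam2 * sq_l2(mean_matrix([f(m) for m in real]),
                       mean_matrix([f(m) for m in fake])))
print(round(loss, 4))  # prints 0.3125
```

The first term matches raw batch means, the second matches means of the first-layer features, which is what keeps the generator's outputs statistically close to real chords rather than merely fooling the discriminator pointwise.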
S104: Input the obtained target melody matrix into the trained DCGAN model for processing, obtain the target chord matrix generated by the trained DCGAN model that matches the target melody matrix, and output a music file in which the melody track mapped from the target melody matrix is merged with the chord track mapped from the target chord matrix.
In some feasible implementations, after obtaining the trained DCGAN model, the terminal may obtain a target melody matrix. The target melody matrix may be a binary matrix input directly by the user, or a binary matrix randomly generated by the terminal: for example, the terminal first obtains random noise (Gaussian noise, uniform noise, etc.), processes the obtained random noise into a matrix with the same data format as the melody matrices in the above training data set, and determines the matrix obtained from the processed random noise as the target melody matrix. After obtaining the target melody matrix, the terminal may input it into the above trained DCGAN model to generate a target chord matrix matching the target melody matrix. The terminal may obtain the target chord matrix generated by the trained DCGAN model, map the target melody matrix to a melody track and the target chord matrix to a chord track, merge the melody track mapped from the target melody matrix with the chord track mapped from the target chord matrix to obtain a merged music file, and output the merged music file in musical instrument digital interface (MIDI) format. The size of the target chord matrix is the same as the size of the chord matrices in the above training data set. The merged music file includes both melody and chords; for example, at time t, the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix simultaneously sound their respective notes at time t. By constructing a DCGAN model, training it with melody matrices and chord matrices to obtain a trained DCGAN model, and then feeding the trained DCGAN model a target melody matrix (which may be random noise), the embodiment of this application has the trained model generate a target chord matrix matching that target melody matrix, and can automatically generate music files with matching chords, thereby saving manpower and reducing manual processing steps.
In the embodiment of this application, the terminal constructs a deep convolutional generative adversarial network (DCGAN) model, obtains a training data set, inputs the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training to obtain a trained DCGAN model, then inputs the obtained target melody matrix into the trained DCGAN model for processing, obtains the target chord matrix generated by the trained DCGAN model that matches the target melody matrix, and finally outputs a music file in which the melody track mapped from the target melody matrix is merged with the chord track mapped from the target chord matrix. Music files with matching chords can thus be generated automatically, reducing manual processing steps.
Refer to FIG. 3, which is another schematic flowchart of a DCGAN-based music generation method provided by an embodiment of this application. As shown in FIG. 3, the DCGAN-based music generation method may include the following steps.
S301: Construct a deep convolutional generative adversarial network (DCGAN) model.
In some feasible implementations, step S301 in this embodiment may be implemented in the same way as step S101 of the embodiment shown in FIG. 1, and details are not repeated here.
S302: Obtain a dual-track data set including multiple dual-track music files.
S303: Determine N target dual-track music files from the dual-track data set.
In some feasible implementations, the terminal may obtain a MIDI data set that includes multiple music files in MIDI format. The terminal may determine the music files in the MIDI data set that include both a melody track and a chord track as dual-track music files, and take the multiple dual-track music files in the MIDI data set as a dual-track data set. The terminal may then determine the dual-track music files in the dual-track data set whose chords belong to a preset basic chord set and whose number of bars equals a target threshold as target dual-track music files, obtaining N target dual-track music files. The preset basic chord set may include 12 major chords and 12 minor chords. The 12 major chords are C, C#, D, D#, E, F, F#, G, G#, A, A#, and B; the 12 minor chords are A, A#, B, C, C#, D, D#, E, F, F#, G, and G#. Each bar of each target dual-track music file uses only one chord. The target threshold may be 16, i.e., each target dual-track music file includes 16 bars. N may be an integer greater than or equal to 2.
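For reference, the 24 basic chords can be encoded one per bar as a 13-element binary vector, so that 16 bars give a 16*13 chord matrix of the kind described earlier. Treating the vector as 12 one-hot root slots plus one major/minor bit is an assumption made here for illustration; the text only specifies that the chord matrix is a 16*13 binary matrix with one chord per bar.

```python
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def encode_chord(root, is_major):
    # 13-element binary vector: 12 one-hot root slots + 1 major/minor flag.
    # (Interpreting the 13th chord parameter as a major/minor bit is an
    # assumed encoding; the text only requires a 16*13 binary chord matrix.)
    vec = [0] * 13
    vec[ROOTS.index(root)] = 1
    vec[12] = 1 if is_major else 0
    return vec

# One chord per bar, 16 bars -> a 16*13 binary chord matrix
progression = [("C", True), ("A", False), ("F", True), ("G", True)] * 4
chord_matrix = [encode_chord(root, major) for root, major in progression]

assert len(chord_matrix) == 16 and all(len(row) == 13 for row in chord_matrix)
```

Any of the 24 chords in the basic set maps to exactly one such vector, so filtering to the basic chord set guarantees every bar is encodable.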
In some feasible implementations, in order to meet the input format of the above DCGAN model, the terminal may sequentially divide each of the N target dual-track music files into groups of 8 bars. For example, a target dual-track music file with 18 bars in total is divided into groups of 8 bars: the first group is the first 8 bars, the second group is the middle 8 bars, and the third group is the last 2 bars.
S304: Obtain the melody matrix on the melody track of each of the N target dual-track music files, to obtain N melody matrices.
在一些可行的实施方式中,终端可以将上述N个目标双音轨音乐文件中的各个目标双音轨音乐文件的旋律调整至预设音高范围内。该预设的音高范围可以为C4到B5这两个八度之间。例如,终端将各个目标双音轨音乐文件中旋律音符的音高不在预设的两个八度之间的旋律音符去掉,只保留各个目标双音轨音乐文件中旋律音符的音高在C4到B5这两个八度之间的旋律音符。终端可以获取调整后的各个目标双音轨音乐文件中的旋律音符,并可以根据调整后各个目标双音轨音乐文件中的旋律音符生成该各个目标双音轨音乐文件的旋律矩阵。其中,旋律矩阵可以为h*w的二元矩阵,h可以用于表示MIDI音符数,h=128;w可以用于表示目标双音轨音乐文件的小节数,w=16。旋律矩阵中的元素0可以用于表示调整后的目标双音轨音乐文件在该0元素对应的位置上无MIDI音符,旋律矩阵中的元素1可以用于表示调整后的目标双音轨音乐文件在该1元素对应的位置上有对应的MIDI音符。In some feasible implementation manners, the terminal may adjust the melody of each target dual-track music file among the N target dual-track music files to within a preset pitch range. The preset pitch range can be between the two octaves of C4 to B5. For example, the terminal removes the melody notes whose pitches of the melody notes in each target dual-track music file are not between the preset two octaves, and only keeps the pitches of the melody notes in each target dual-track music file between C4 and C4. B5 The melody note between these two octaves. The terminal may obtain the adjusted melody notes in each target dual-track music file, and may generate the melody matrix of each target dual-track music file according to the adjusted melody notes in each target dual-track music file. Among them, the melody matrix can be a binary matrix of h*w, h can be used to represent the number of MIDI notes, h=128; w can be used to represent the number of bars of the target two-track music file, w=16. Element 0 in the melody matrix can be used to indicate that the adjusted target dual-track music file has no MIDI notes at the position corresponding to the 0 element, and element 1 in the melody matrix can be used to indicate the adjusted target dual-track music file There is a corresponding MIDI note at the position corresponding to the 1 element.
例如，以一个目标双音轨音乐文件生成一个旋律矩阵为例。如图4a所示，图4a是本申请实施例提供的MIDI音符的示意图；如图4b所示，图4b是本申请实施例提供的旋律矩阵的示意图。其中，M表示旋律矩阵，M的大小为128行16列。M中的每一行表示一个MIDI音符，如第一行表示128个MIDI音符中的第一个MIDI音符00(十六进制的音符代码)，第二行表示128个MIDI音符中的第二个MIDI音符01，第十三行表示128个MIDI音符中的第十三个MIDI音符0C等。M中的每一列表示一个小节，如第一列表示目标双音轨音乐文件中的第一个小节，第十列表示目标双音轨音乐文件中的第十个小节等。如图4b所示，M的第2行第1列中的元素1表示目标双音轨音乐文件的第1个小节的第1个音符为MIDI音符01；M的第2行第3列中的元素1表示目标双音轨音乐文件的第3个小节的第2个音符为MIDI音符01。M的第1行第5列中的元素0表示目标双音轨音乐文件的第5个小节中没有MIDI音符00。For example, take the generation of one melody matrix from one target dual-track music file. As shown in FIG. 4a, FIG. 4a is a schematic diagram of MIDI notes provided by an embodiment of the present application; as shown in FIG. 4b, FIG. 4b is a schematic diagram of a melody matrix provided by an embodiment of the present application. M represents the melody matrix, and the size of M is 128 rows by 16 columns. Each row in M represents one MIDI note: the first row represents the first of the 128 MIDI notes, note 00 (hexadecimal note code), the second row represents the second MIDI note 01, the thirteenth row represents the thirteenth MIDI note 0C, and so on. Each column in M represents one bar: the first column represents the first bar in the target dual-track music file, the tenth column represents the tenth bar, and so on. As shown in FIG. 4b, the element 1 in row 2, column 1 of M indicates that the first note of the first bar of the target dual-track music file is MIDI note 01; the element 1 in row 2, column 3 of M indicates that the second note of the third bar is MIDI note 01. The element 0 in row 1, column 5 of M indicates that there is no MIDI note 00 in the fifth bar of the target dual-track music file.
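The melody-matrix construction of S304 can be sketched as follows. This is a simplified illustration: the `(midi_pitch, bar_index)` input format is an assumption, and it assumes the common MIDI mapping in which C4 is pitch 60 and B5 is pitch 83.

```python
# Hypothetical sketch of building the 128x16 binary melody matrix M described
# above: notes is a list of (midi_pitch, bar_index) pairs; pitches outside the
# preset C4-B5 range (assumed to be MIDI 60-83) are dropped, and
# M[pitch][bar] = 1 marks a note while 0 marks its absence.
def build_melody_matrix(notes, h=128, w=16, lo=60, hi=83):
    m = [[0] * w for _ in range(h)]
    for pitch, bar in notes:
        if lo <= pitch <= hi:  # keep only notes within the preset two octaves
            m[pitch][bar] = 1
    return m
```

The same construction, with the adjusted single-track file as input, yields the target melody matrix used at inference time.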
S305,获取N个目标双音轨音乐文件中各个目标双音轨音乐文件在和弦音轨上的和弦矩阵,得到N个和弦矩阵。S305: Obtain the chord matrix of each target dual-track music file on the chord track among the N target dual-track music files, to obtain N chord matrices.
在一些可行的实施方式中，上述目标双音轨音乐文件的每小节只采用一个和弦。终端可以获取上述N个目标双音轨音乐文件中各个目标双音轨音乐文件的各小节所采用的和弦，并可以判断该各小节所采用和弦的和弦类别(即各小节的和弦属于大和弦或小和弦)。终端可以根据该各个目标双音轨音乐文件的各小节所采用的和弦以及各小节所采用和弦的和弦类别，生成该各个目标双音轨音乐文件的和弦矩阵。其中，和弦矩阵可以为w*m的二元矩阵，w可以用于表示目标双音轨音乐文件的小节数，w=16；m可以用于表示各小节的和弦参数，m=13，这13个和弦参数的前12个和弦参数分别表示12个和弦，第13个和弦参数表示和弦类别，即大和弦或小和弦。In some feasible implementation manners, each bar of the target dual-track music file uses only one chord. The terminal may obtain the chord used in each bar of each of the N target dual-track music files, and may determine the chord category of the chord used in each bar (that is, whether the chord of each bar is a major chord or a minor chord). The terminal may generate the chord matrix of each target dual-track music file according to the chord used in each bar and the chord category of that chord. The chord matrix may be a w*m binary matrix, where w represents the number of bars of the target dual-track music file, w=16, and m represents the chord parameters of each bar, m=13: the first 12 chord parameters represent the 12 chords respectively, and the 13th chord parameter represents the chord category, namely major chord or minor chord.
例如，以一个目标双音轨音乐文件生成一个和弦矩阵为例。如图5a所示，图5a是本申请实施例提供的24个和弦的示意图。其中，图5a中的major表示大和弦，minor表示小和弦；“13”表示第13个和弦参数，第13个和弦参数为0表示和弦类别为大和弦，第13个和弦参数为1表示和弦类别为小和弦。如图5b所示，图5b是本申请实施例提供的和弦矩阵的示意图。其中，图5b中的Y表示和弦矩阵，Y共有16行13列。Y中的每一行表示一个小节，如第一行表示目标双音轨音乐文件中的第一小节，第四行表示目标双音轨音乐文件中的第四小节等等。Y中的每一列表示一个和弦参数，前12列中的0元素表示没有对应的和弦，前12列中的1元素表示有对应的和弦，且Y每行的前12个元素中有且仅有一个1元素。Y的第13列表示和弦类别，0表示大和弦，1表示小和弦。如图5b所示，第1行第13列元素为1，表示小和弦，那么第1行第2列中的元素1表示目标双音轨音乐文件的第1个小节采用小和弦A#。第2行第13列元素为0，表示大和弦，那么第2行第4列的元素1表示目标双音轨音乐文件的第2个小节采用大和弦D#。因为第16行第13列元素为0，表示大和弦，那么第16行第1列的元素1表示目标双音轨音乐文件的第16个小节采用大和弦C。For example, take the generation of one chord matrix from one target dual-track music file. As shown in FIG. 5a, FIG. 5a is a schematic diagram of the 24 chords provided in an embodiment of the present application. In FIG. 5a, "major" denotes a major chord and "minor" denotes a minor chord; "13" denotes the 13th chord parameter, where a value of 0 indicates that the chord category is a major chord and a value of 1 indicates that it is a minor chord. As shown in FIG. 5b, FIG. 5b is a schematic diagram of a chord matrix provided by an embodiment of the present application. Y in FIG. 5b represents the chord matrix, which has 16 rows and 13 columns. Each row in Y represents one bar: the first row represents the first bar in the target dual-track music file, the fourth row represents the fourth bar, and so on. Each column in Y represents one chord parameter: in the first 12 columns, a 0 element indicates that the corresponding chord is absent and a 1 element indicates that it is present, and exactly one of the first 12 elements in each row of Y is 1. The 13th column of Y indicates the chord category, where 0 indicates a major chord and 1 indicates a minor chord. As shown in FIG. 5b, the element in row 1, column 13 is 1, indicating a minor chord, so the element 1 in row 1, column 2 indicates that the first bar of the target dual-track music file uses the minor chord A#. The element in row 2, column 13 is 0, indicating a major chord, so the element 1 in row 2, column 4 indicates that the second bar uses the major chord D#. Because the element in row 16, column 13 is 0, indicating a major chord, the element 1 in row 16, column 1 indicates that the 16th bar uses the major chord C.
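The chord-matrix layout described above can be sketched as follows; the chord-root column ordering and the `(root_index, is_minor)` input format are illustrative assumptions, not fixed by the embodiment:

```python
# Hypothetical sketch of the 16x13 binary chord matrix Y described above: one
# row per bar, columns 0-11 one-hot encode the 12 chord roots (an assumed
# ordering), and column 12 encodes the chord category (0 = major, 1 = minor).
def build_chord_matrix(bar_chords):
    # bar_chords: one (root_index, is_minor) pair per bar
    y = []
    for root, is_minor in bar_chords:
        row = [0] * 13
        row[root] = 1
        row[12] = 1 if is_minor else 0
        y.append(row)
    return y
```

Note that, as in FIG. 5b, each row carries exactly one 1 among its first 12 elements.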
S306,将训练数据集中的N个旋律矩阵以及对应的N个和弦矩阵输入DCGAN模型中进行训练,得到训练好的DCGAN模型。S306: Input the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training, to obtain a trained DCGAN model.
在一些可行的实施方式中,本申请实施例中步骤S306的实现方式可参考图1所示实施例的步骤S103的实现方式,在此不再赘述。In some feasible implementation manners, the implementation manner of step S306 in the embodiment of the present application can refer to the implementation manner of step S103 in the embodiment shown in FIG. 1, which will not be repeated here.
S307,获取目标旋律矩阵。S307: Acquire a target melody matrix.
在一些可行的实施方式中，终端可以从MIDI数据集中获取包括旋律音轨的任一单音轨音乐文件。终端可以去掉该单音轨音乐文件中旋律音符的音高在预设音高范围(C4到B5这两个八度)外的旋律音符，只保留该单音轨音乐文件中旋律音符的音高在预设音高范围内的旋律音符，得到调整后的单音轨音乐文件。终端可以获取调整后的单音轨音乐文件的旋律音符，并可以根据该调整后的单音轨音乐文件的旋律音符生成该调整后单音轨音乐文件的目标旋律矩阵。其中，目标旋律矩阵为128*16的二元矩阵。目标旋律矩阵中的元素0可以用于表示调整后的单音轨音乐文件在该0元素对应的位置上无MIDI音符，目标旋律矩阵中的元素1可以用于表示调整后的单音轨音乐文件在该1元素对应的位置上有对应的MIDI音符。In some feasible implementation manners, the terminal may obtain, from the MIDI data set, any single-track music file that includes a melody track. The terminal may remove the melody notes whose pitches fall outside the preset pitch range (the two octaves from C4 to B5), and keep only the melody notes whose pitches lie within the preset pitch range, to obtain an adjusted single-track music file. The terminal may obtain the melody notes of the adjusted single-track music file, and may generate the target melody matrix of the adjusted single-track music file according to those melody notes. The target melody matrix is a 128*16 binary matrix. An element 0 in the target melody matrix indicates that the adjusted single-track music file has no MIDI note at the position corresponding to that element, and an element 1 indicates that there is a corresponding MIDI note at the position corresponding to that element.
S308，将获取到的目标旋律矩阵输入训练好的DCGAN模型中进行处理，并获取训练好的DCGAN模型生成的与目标旋律矩阵匹配的目标和弦矩阵，输出目标旋律矩阵映射出的旋律音轨与目标和弦矩阵映射出的和弦音轨进行合并后的音乐文件。S308: Input the obtained target melody matrix into the trained DCGAN model for processing, obtain the target chord matrix generated by the trained DCGAN model that matches the target melody matrix, and output a music file in which the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix are merged.
在一些可行的实施方式中,本申请实施例中步骤S308的实现方式可参考图1所示实施例的步骤S104的实现方式,在此不再赘述。In some feasible implementation manners, the implementation manner of step S308 in the embodiment of the present application can refer to the implementation manner of step S104 in the embodiment shown in FIG. 1, and details are not described herein again.
在本申请实施例中，终端通过构造深度卷积生成式对抗网络DCGAN模型，再获取包括多个双音轨音乐文件的双音轨数据集，从双音轨数据集确定出N个目标双音轨音乐文件，获取N个目标双音轨音乐文件中各个目标双音轨音乐文件在旋律音轨上的旋律矩阵，得到N个旋律矩阵，获取N个目标双音轨音乐文件中各个目标双音轨音乐文件在和弦音轨上的和弦矩阵，得到N个和弦矩阵。将训练数据集中的N个旋律矩阵以及对应的N个和弦矩阵输入DCGAN模型中进行训练，得到训练好的DCGAN模型。再获取目标旋律矩阵，将目标旋律矩阵输入训练好的DCGAN模型中进行处理，并获取训练好的DCGAN模型生成的与目标旋律矩阵匹配的目标和弦矩阵，输出目标旋律矩阵映射出的旋律音轨与目标和弦矩阵映射出的和弦音轨进行合并后的音乐文件。可以自动生成带有和弦匹配的音乐文件，减少人工处理环节。In the embodiment of this application, the terminal constructs a deep convolutional generative adversarial network (DCGAN) model, obtains a dual-track data set including multiple dual-track music files, determines N target dual-track music files from the dual-track data set, obtains the melody matrix of each target dual-track music file on the melody track to obtain N melody matrices, and obtains the chord matrix of each target dual-track music file on the chord track to obtain N chord matrices. The N melody matrices and the corresponding N chord matrices in the training data set are input into the DCGAN model for training to obtain a trained DCGAN model. The terminal then obtains a target melody matrix, inputs it into the trained DCGAN model for processing, obtains the target chord matrix generated by the trained DCGAN model that matches the target melody matrix, and outputs a music file in which the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix are merged. Music files with matching chords can thus be generated automatically, reducing manual processing.
参见图6,是本申请实施例提供的音乐生成装置的一示意性框图。如图6所示,本申请实施例的音乐生成装置包括:Refer to FIG. 6, which is a schematic block diagram of a music generating apparatus provided by an embodiment of the present application. As shown in FIG. 6, the music generating device of the embodiment of the present application includes:
构造模块10，用于构造深度卷积生成式对抗网络DCGAN模型；The construction module 10 is configured to construct a deep convolutional generative adversarial network (DCGAN) model;
第一获取模块20,用于获取训练数据集,该训练数据集中包括N个旋律矩阵以及对应的N个和弦矩阵,其中旋律矩阵以及和弦矩阵均为二元矩阵;The first obtaining module 20 is configured to obtain a training data set, the training data set includes N melody matrices and corresponding N chord matrices, wherein the melody matrix and the chord matrix are both binary matrices;
训练模块30,用于将该训练数据集中的N个旋律矩阵以及对应的N个和弦矩阵输入该DCGAN模型中进行训练,得到训练好的DCGAN模型;The training module 30 is configured to input the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training to obtain a trained DCGAN model;
输入模块40,用于将获取到的目标旋律矩阵输入该训练好的DCGAN模型中进行处理,并获取该训练好的DCGAN模型生成的与该目标旋律矩阵匹配的目标和弦矩阵;The input module 40 is configured to input the obtained target melody matrix into the trained DCGAN model for processing, and obtain the target chord matrix generated by the trained DCGAN model that matches the target melody matrix;
输出模块50,用于输出该目标旋律矩阵映射出的旋律音轨与该目标和弦矩阵映射出的和弦音轨进行合并后的音乐文件。The output module 50 is configured to output a music file obtained by merging the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix.
在一些可行的实施方式中，上述DCGAN模型中包括生成器、辨别器以及调节器，该生成器、该辨别器以及该调节器均为卷积神经网络CNN。上述训练模块30具体用于：针对该训练数据集中的任一旋律矩阵i，将该旋律矩阵i输入该DCGAN模型的生成器中生成与该旋律矩阵i匹配的第一和弦矩阵j；将该第一和弦矩阵j以及该旋律矩阵i在该训练数据集中对应的和弦矩阵k输入该DCGAN模型的辨别器中辨别该第一和弦矩阵j与该和弦矩阵k相同的概率；判断该辨别器针对该第一和弦矩阵j输出的概率是否均在预设范围内，若否则将该概率输入该DCGAN模型的调节器中对该生成器的转置卷积层上的参数进行调整，将该旋律矩阵i重新输入经过调整后的生成器中重新生成与该旋律矩阵i匹配的第一和弦矩阵j，并将重新生成的第一和弦矩阵j以及该旋律矩阵i在该训练数据集中对应的和弦矩阵k输入该DCGAN模型的辨别器中辨别该重新生成的第一和弦矩阵j与该和弦矩阵k相同的概率。当该辨别器针对该生成器生成的各个第一和弦矩阵输出的概率均在预设范围内时，得到训练好的DCGAN模型。In some feasible implementation manners, the DCGAN model includes a generator, a discriminator, and a regulator, all of which are convolutional neural networks (CNN). The training module 30 is specifically configured to: for any melody matrix i in the training data set, input the melody matrix i into the generator of the DCGAN model to generate a first chord matrix j that matches the melody matrix i; input the first chord matrix j and the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model to determine the probability that the first chord matrix j is the same as the chord matrix k; determine whether the probabilities output by the discriminator for the first chord matrix j are all within a preset range, and if not, input the probability into the regulator of the DCGAN model to adjust the parameters on the transposed convolutional layers of the generator, re-input the melody matrix i into the adjusted generator to regenerate the first chord matrix j matching the melody matrix i, and input the regenerated first chord matrix j and the chord matrix k corresponding to the melody matrix i into the discriminator of the DCGAN model to determine the probability that the regenerated first chord matrix j is the same as the chord matrix k. When the probabilities output by the discriminator for each first chord matrix generated by the generator are all within the preset range, the trained DCGAN model is obtained.
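The generator/discriminator/regulator loop described above can be reduced to the following control-flow sketch. The three callables are placeholders (a real implementation would use CNNs and gradient updates), and the concrete preset probability range is an assumed value for illustration only:

```python
# Hypothetical control-flow sketch of the training procedure described above.
# generate, discriminate, and adjust stand in for the CNN-based generator,
# discriminator, and regulator; lo/hi define the preset probability range.
def train(melody_matrices, chord_matrices, generate, discriminate, adjust,
          lo=0.45, hi=0.55, max_rounds=1000):
    for melody, real_chord in zip(melody_matrices, chord_matrices):
        for _ in range(max_rounds):
            fake_chord = generate(melody)             # first chord matrix j
            p = discriminate(fake_chord, real_chord)  # P(j same as k)
            if lo <= p <= hi:   # discriminator can no longer tell them apart
                break
            adjust(p)  # regulator tunes the generator's transposed-conv params
```

The outer loop walks the training pairs; the inner loop repeats generate/discriminate/adjust until the discriminator's output falls inside the preset range, matching the stopping condition stated above.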
在一些可行的实施方式中,上述第一获取模块20包括第一获取单元201、确定单元202、第二获取单元203以及第三获取单元204。In some feasible implementation manners, the aforementioned first acquiring module 20 includes a first acquiring unit 201, a determining unit 202, a second acquiring unit 203, and a third acquiring unit 204.
上述第一获取单元201，用于获取包括多个双音轨音乐文件的双音轨数据集，该双音轨音乐文件用于表示包含旋律音轨以及和弦音轨的音乐文件；上述确定单元202，用于从该双音轨数据集确定出N个目标双音轨音乐文件；上述第二获取单元203，用于获取该N个目标双音轨音乐文件中各个目标双音轨音乐文件在旋律音轨上的旋律矩阵，得到N个旋律矩阵；上述第三获取单元204，用于获取该N个目标双音轨音乐文件中各个目标双音轨音乐文件在和弦音轨上的和弦矩阵，得到N个和弦矩阵。其中，该目标双音轨音乐文件中的和弦属于预设的基本和弦集合，该基本和弦集合中包括12个大和弦和12个小和弦，该目标双音轨音乐文件的每小节采用一个和弦。The first obtaining unit 201 is configured to obtain a dual-track data set including multiple dual-track music files, where a dual-track music file represents a music file containing a melody track and a chord track; the determining unit 202 is configured to determine N target dual-track music files from the dual-track data set; the second obtaining unit 203 is configured to obtain the melody matrix of each target dual-track music file on the melody track to obtain N melody matrices; the third obtaining unit 204 is configured to obtain the chord matrix of each target dual-track music file on the chord track to obtain N chord matrices. The chords in the target dual-track music files belong to a preset basic chord set, which includes 12 major chords and 12 minor chords, and each bar of a target dual-track music file uses one chord.
在一些可行的实施方式中，上述第二获取单元203具体用于：将该N个目标双音轨音乐文件中各个目标双音轨音乐文件的旋律调整至预设音高范围内；获取调整后各个目标双音轨音乐文件中的旋律音符；根据该调整后各个目标双音轨音乐文件中的旋律音符，生成该调整后各个目标双音轨音乐文件的旋律矩阵，该旋律矩阵为h*w的二元矩阵，该h用于表示预设的音符数，该w用于表示目标双音轨音乐文件的小节数。In some feasible implementation manners, the second obtaining unit 203 is specifically configured to: adjust the melody of each of the N target dual-track music files to within a preset pitch range; obtain the adjusted melody notes in each target dual-track music file; and generate, according to the adjusted melody notes, the melody matrix of each adjusted target dual-track music file, where the melody matrix is an h*w binary matrix, h represents the preset number of notes, and w represents the number of bars of the target dual-track music file.
在一些可行的实施方式中，上述第三获取单元204具体用于：获取该N个目标双音轨音乐文件中各个目标双音轨音乐文件的各小节所采用的和弦以及该各小节所采用和弦的和弦类别；根据该各个目标双音轨音乐文件的各小节所采用的和弦以及该各小节所采用和弦的和弦类别，生成该各个目标双音轨音乐文件的和弦矩阵，该和弦矩阵为w*m的二元矩阵，该w用于表示目标双音轨音乐文件的小节数，该m用于表示各小节的和弦参数。In some feasible implementation manners, the third obtaining unit 204 is specifically configured to: obtain the chord used in each bar of each of the N target dual-track music files and the chord category of the chord used in each bar; and generate, according to the chord used in each bar and its chord category, the chord matrix of each target dual-track music file, where the chord matrix is a w*m binary matrix, w represents the number of bars of the target dual-track music file, and m represents the chord parameters of each bar.
在一些可行的实施方式中，该装置还包括第二获取模块60。该第二获取模块60，用于获取包括旋律音轨的单音轨音乐文件；将该单音轨音乐文件的旋律调整至预设音高范围内；获取调整后单音轨音乐文件中的旋律音符；根据该调整后单音轨音乐文件中的旋律音符，生成该调整后单音轨音乐文件的目标旋律矩阵。In some feasible implementation manners, the apparatus further includes a second obtaining module 60. The second obtaining module 60 is configured to: obtain a single-track music file including a melody track; adjust the melody of the single-track music file to within a preset pitch range; obtain the melody notes in the adjusted single-track music file; and generate, according to the melody notes in the adjusted single-track music file, the target melody matrix of the adjusted single-track music file.
在一些可行的实施方式中，该生成器包括至少一个全连接层和至少一个转置卷积层，该辨别器包括至少一个卷积层和至少一个全连接层，该调节器包括至少一个卷积层和至少一个全连接层，该调节器为反向的生成器。In some feasible implementation manners, the generator includes at least one fully connected layer and at least one transposed convolutional layer, the discriminator includes at least one convolutional layer and at least one fully connected layer, and the regulator includes at least one convolutional layer and at least one fully connected layer; the regulator is a reversed generator.
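As a small illustration of why the generator ends in transposed convolutions: they upsample the fully connected layer's output toward the target matrix shape. The output-size relation below is the standard transposed-convolution formula, and the concrete sizes are illustrative assumptions, not layer sizes taken from the embodiment:

```python
# Output length of a 1-D transposed convolution:
#   out = (n - 1) * stride - 2 * padding + kernel
# Transposed-convolution layers use this relation to upsample a small feature
# map toward, e.g., the chord matrix's 16-bar dimension.
def conv_transpose_len(n, kernel, stride=1, padding=0):
    return (n - 1) * stride - 2 * padding + kernel
```

For example, a length-8 feature map upsampled with kernel 2 and stride 2 yields length 16, which would match the 16-bar dimension of the chord matrix.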
具体实现中，上述音乐生成装置可通过上述各个模块执行上述图1或图3所提供的实现方式中各个步骤所提供的实现方式，实现上述各实施例中所实现的功能，具体可参见上述图1或图3所示的方法实施例中各个步骤提供的相应描述，在此不再赘述。In specific implementation, the above music generating apparatus may perform, through the above modules, the implementations provided by the steps of the implementations provided in FIG. 1 or FIG. 3, so as to realize the functions realized in the above embodiments. For details, refer to the corresponding descriptions of the steps in the method embodiments shown in FIG. 1 or FIG. 3, which will not be repeated here.
在本申请实施例中，音乐生成装置通过构造深度卷积生成式对抗网络DCGAN模型，再获取训练数据集，接着将该训练数据集中的N个旋律矩阵以及对应的N个和弦矩阵输入该DCGAN模型中进行训练，从而得到训练好的DCGAN模型，然后将获取到的目标旋律矩阵输入该训练好的DCGAN模型中进行处理，并获取该训练好的DCGAN模型生成的与该目标旋律矩阵匹配的目标和弦矩阵，最后输出该目标旋律矩阵映射出的旋律音轨与该目标和弦矩阵映射出的和弦音轨进行合并后的音乐文件。可以自动生成带有和弦匹配的音乐文件，减少人工处理环节。In the embodiment of this application, the music generating apparatus constructs a deep convolutional generative adversarial network (DCGAN) model, obtains a training data set, inputs the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training to obtain a trained DCGAN model, then inputs the obtained target melody matrix into the trained DCGAN model for processing, obtains the target chord matrix generated by the trained DCGAN model that matches the target melody matrix, and finally outputs a music file in which the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix are merged. Music files with matching chords can thus be generated automatically, reducing manual processing.
参见图7，是本申请实施例提供的终端的一示意性框图。如图7所示，本申请实施例中的终端可以包括：一个或多个处理器701；一个或多个输入设备702，一个或多个输出设备703和存储器704。上述处理器701、输入设备702、输出设备703和存储器704通过总线705连接。存储器704用于存储计算机程序，所述计算机程序包括程序指令，处理器701用于执行存储器704存储的程序指令。Refer to FIG. 7, which is a schematic block diagram of a terminal provided in an embodiment of the present application. As shown in FIG. 7, the terminal in this embodiment of the present application may include: one or more processors 701; one or more input devices 702, one or more output devices 703, and a memory 704. The processor 701, the input device 702, the output device 703, and the memory 704 are connected via a bus 705. The memory 704 is configured to store a computer program including program instructions, and the processor 701 is configured to execute the program instructions stored in the memory 704.
其中，处理器701被配置用于调用所述程序指令执行：构造深度卷积生成式对抗网络DCGAN模型；获取训练数据集，该训练数据集中包括N个旋律矩阵以及对应的N个和弦矩阵，其中旋律矩阵以及和弦矩阵均为二元矩阵；将该训练数据集中的N个旋律矩阵以及对应的N个和弦矩阵输入该DCGAN模型中进行训练，得到训练好的DCGAN模型。输入设备702用于将获取到的目标旋律矩阵输入该训练好的DCGAN模型中进行处理，并获取该训练好的DCGAN模型生成的与该目标旋律矩阵匹配的目标和弦矩阵。输出设备703用于输出该目标旋律矩阵映射出的旋律音轨与该目标和弦矩阵映射出的和弦音轨进行合并后的音乐文件。The processor 701 is configured to invoke the program instructions to: construct a deep convolutional generative adversarial network (DCGAN) model; obtain a training data set, which includes N melody matrices and corresponding N chord matrices, where the melody matrices and the chord matrices are all binary matrices; and input the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training to obtain a trained DCGAN model. The input device 702 is configured to input the obtained target melody matrix into the trained DCGAN model for processing, and obtain a target chord matrix generated by the trained DCGAN model that matches the target melody matrix. The output device 703 is configured to output a music file in which the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix are merged.
应当理解，在本申请实施例中，所称处理器701可以是中央处理单元(Central Processing Unit，CPU)，该处理器还可以是其他通用处理器、数字信号处理器(Digital Signal Processor，DSP)、专用集成电路(Application Specific Integrated Circuit，ASIC)、现成可编程门阵列(Field-Programmable Gate Array，FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。It should be understood that, in the embodiments of the present application, the processor 701 may be a central processing unit (CPU), and the processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor, etc.
输入设备702可以包括触控板、麦克风等,输出设备703可以包括显示器(LCD等)、扬声器等。The input device 702 may include a touch panel, a microphone, etc., and the output device 703 may include a display (LCD, etc.), a speaker, etc.
该存储器704可以包括只读存储器和随机存取存储器,并向处理器701提供指令和数据。存储器704的一部分还可以包括非易失性随机存取存储器。例如,存储器704还可以存储设备类型的信息。The memory 704 may include a read-only memory and a random access memory, and provides instructions and data to the processor 701. A part of the memory 704 may also include a non-volatile random access memory. For example, the memory 704 may also store device type information.
具体实现中，本申请实施例中所描述的处理器701、输入设备702、输出设备703可执行本申请实施例提供的基于DCGAN的音乐生成方法中所描述的实现方式，也可执行本申请实施例所描述的音乐生成装置的实现方式，在此不再赘述。In specific implementation, the processor 701, the input device 702, and the output device 703 described in the embodiments of this application can perform the implementations described in the DCGAN-based music generation method provided in the embodiments of this application, and can also perform the implementation of the music generating apparatus described in the embodiments of this application, which will not be repeated here.
本申请实施例还提供一种计算机可读存储介质，该计算机可读存储介质存储有计算机程序，该计算机程序包括程序指令，该程序指令被处理器执行时实现图1或图3所示的基于DCGAN的音乐生成方法，具体细节请参照图1或图3所示实施例的描述，在此不再赘述。The embodiment of the present application also provides a computer-readable storage medium that stores a computer program including program instructions. When the program instructions are executed by a processor, the DCGAN-based music generation method shown in FIG. 1 or FIG. 3 is implemented. For specific details, refer to the description of the embodiment shown in FIG. 1 or FIG. 3, which will not be repeated here.
上述计算机可读存储介质可以是前述任一实施例所述的音乐生成装置或终端的内部存储单元，例如终端的硬盘或内存。该计算机可读存储介质也可以是该终端的外部存储设备，例如该终端上配备的插接式硬盘，智能存储卡(smart media card，SMC)，安全数字(secure digital，SD)卡，闪存卡(flash card)等。进一步地，该计算机可读存储介质还可以既包括该终端的内部存储单元也包括外部存储设备。该计算机可读存储介质用于存储该计算机程序以及该终端所需的其他程序和数据。该计算机可读存储介质还可以用于暂时地存储已经输出或者将要输出的数据。The above computer-readable storage medium may be an internal storage unit of the music generating apparatus or terminal described in any of the foregoing embodiments, such as the hard disk or memory of the terminal. The computer-readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal. Further, the computer-readable storage medium may include both an internal storage unit of the terminal and an external storage device. The computer-readable storage medium is used to store the computer program and other programs and data required by the terminal, and may also be used to temporarily store data that has been output or will be output.
以上所述，仅为本申请的具体实施方式，但本申请的保护范围并不局限于此，任何熟悉本技术领域的技术人员在本申请揭露的技术范围内，可轻易想到变化或替换，都应涵盖在本申请的保护范围之内。因此，本申请的保护范围应以所述权利要求的保护范围为准。The above are only specific implementations of this application, but the protection scope of this application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (20)

  1. 一种基于DCGAN的音乐生成方法,其特征在于,包括:A DCGAN-based music generation method is characterized in that it includes:
    构造深度卷积生成式对抗网络DCGAN模型；Construct a deep convolutional generative adversarial network (DCGAN) model;
    获取训练数据集,所述训练数据集中包括N个旋律矩阵以及对应的N个和弦矩阵,其中旋律矩阵以及和弦矩阵均为二元矩阵;Acquiring a training data set, the training data set including N melody matrices and corresponding N chord matrices, wherein the melody matrix and the chord matrix are both binary matrices;
    将所述训练数据集中的N个旋律矩阵以及对应的N个和弦矩阵输入所述DCGAN模型中进行训练,得到训练好的DCGAN模型;Input the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training, to obtain a trained DCGAN model;
    将获取到的目标旋律矩阵输入所述训练好的DCGAN模型中进行处理，并获取所述训练好的DCGAN模型生成的与所述目标旋律矩阵匹配的目标和弦矩阵，输出所述目标旋律矩阵映射出的旋律音轨与所述目标和弦矩阵映射出的和弦音轨进行合并后的音乐文件。Input the obtained target melody matrix into the trained DCGAN model for processing, obtain the target chord matrix generated by the trained DCGAN model that matches the target melody matrix, and output a music file in which the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix are merged.
  2. 根据权利要求1所述的方法，其特征在于，所述DCGAN模型中包括生成器、辨别器以及调节器，所述生成器、所述辨别器以及所述调节器均为卷积神经网络CNN；The method according to claim 1, wherein the DCGAN model includes a generator, a discriminator, and a regulator, and the generator, the discriminator, and the regulator are all convolutional neural networks (CNN);
    所述将所述训练数据集中的N个旋律矩阵以及对应的N个和弦矩阵输入所述DCGAN模型中进行训练,得到训练好的DCGAN模型,包括:The inputting the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training to obtain a trained DCGAN model includes:
    针对所述训练数据集中的任一旋律矩阵i,将所述旋律矩阵i输入所述DCGAN模型的生成器中生成与所述旋律矩阵i匹配的第一和弦矩阵j;For any melody matrix i in the training data set, input the melody matrix i into the generator of the DCGAN model to generate a first chord matrix j that matches the melody matrix i;
    将所述第一和弦矩阵j以及所述旋律矩阵i在所述训练数据集中对应的和弦矩阵k输入所述DCGAN模型的辨别器中辨别所述第一和弦矩阵j与所述和弦矩阵k相同的概率；Input the first chord matrix j and the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model to determine the probability that the first chord matrix j is the same as the chord matrix k;
    判断所述辨别器针对所述第一和弦矩阵j输出的概率是否均在预设范围内，若否则将所述概率输入所述DCGAN模型的调节器中对所述生成器的转置卷积层上的参数进行调整，将所述旋律矩阵i重新输入经过调整后的生成器中重新生成与所述旋律矩阵i匹配的第一和弦矩阵j，并将重新生成的第一和弦矩阵j以及所述旋律矩阵i在所述训练数据集中对应的和弦矩阵k输入所述DCGAN模型的辨别器中辨别所述重新生成的第一和弦矩阵j与所述和弦矩阵k相同的概率；Determine whether the probabilities output by the discriminator for the first chord matrix j are all within a preset range; if not, input the probability into the regulator of the DCGAN model to adjust the parameters on the transposed convolutional layer of the generator, re-input the melody matrix i into the adjusted generator to regenerate the first chord matrix j matching the melody matrix i, and input the regenerated first chord matrix j and the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model to determine the probability that the regenerated first chord matrix j is the same as the chord matrix k;
    当所述辨别器针对所述生成器生成的各个第一和弦矩阵输出的概率均在预设范围内时,得到训练好的DCGAN模型。When the probability that the discriminator outputs for each first chord matrix generated by the generator is within a preset range, a trained DCGAN model is obtained.
  3. 根据权利要求1或2所述的方法,其特征在于,所述获取训练数据集,包括:The method according to claim 1 or 2, wherein the obtaining a training data set comprises:
    获取包括多个双音轨音乐文件的双音轨数据集,所述双音轨音乐文件用于表示包含旋律音轨以及和弦音轨的音乐文件;Acquiring a dual-track data set including a plurality of dual-track music files, where the dual-track music file is used to represent a music file containing a melody track and a chord track;
    从所述双音轨数据集确定出N个目标双音轨音乐文件，所述目标双音轨音乐文件中的和弦属于预设的基本和弦集合，所述基本和弦集合中包括12个大和弦和12个小和弦，所述目标双音轨音乐文件的每小节采用一个和弦；N target dual-track music files are determined from the dual-track data set, the chords in the target dual-track music files belong to a preset basic chord set, the basic chord set includes 12 major chords and 12 minor chords, and each bar of the target dual-track music file uses one chord;
    获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件在旋律音轨上的旋律矩阵,得到N个旋律矩阵;Obtaining a melody matrix of each target dual-track music file on the melody track among the N target dual-track music files to obtain N melody matrices;
    获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件在和弦音轨上的和弦矩阵,得到N个和弦矩阵。Obtain the chord matrix of each target dual-track music file on the chord track among the N target dual-track music files to obtain N chord matrices.
  4. 根据权利要求3所述的方法，其特征在于，所述获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件在旋律音轨上的旋律矩阵，得到N个旋律矩阵，包括：The method according to claim 3, wherein the obtaining the melody matrix of each target dual-track music file on the melody track among the N target dual-track music files to obtain N melody matrices comprises:
    将所述N个目标双音轨音乐文件中各个目标双音轨音乐文件的旋律调整至预设音高范围内;Adjusting the melody of each target dual-track music file in the N target dual-track music files to within a preset pitch range;
    获取调整后各个目标双音轨音乐文件中的旋律音符;Obtain the melody notes in each target dual-track music file after adjustment;
    根据所述调整后各个目标双音轨音乐文件中的旋律音符，生成所述调整后各个目标双音轨音乐文件的旋律矩阵，所述旋律矩阵为h*w的二元矩阵，所述h用于表示预设的音符数，所述w用于表示目标双音轨音乐文件的小节数。According to the adjusted melody notes in each target dual-track music file, generate the melody matrix of each adjusted target dual-track music file, where the melody matrix is an h*w binary matrix, h represents the preset number of notes, and w represents the number of bars of the target dual-track music file.
  5. 根据权利要求3所述的方法，其特征在于，所述获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件在和弦音轨上的和弦矩阵，得到N个和弦矩阵，包括：The method according to claim 3, wherein the obtaining the chord matrix of each target dual-track music file on the chord track among the N target dual-track music files to obtain N chord matrices comprises:
    获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件的各小节所采用的和弦以及所述各小节所采用和弦的和弦类别;Acquiring the chord used in each bar of each target dual-track music file in the N target dual-track music files and the chord category of the chord used in each bar;
    根据所述各个目标双音轨音乐文件的各小节所采用的和弦以及所述各小节所采用和弦的和弦类别，生成所述各个目标双音轨音乐文件的和弦矩阵，所述和弦矩阵为w*m的二元矩阵，所述w用于表示目标双音轨音乐文件的小节数，所述m用于表示各小节的和弦参数。A chord matrix of each target dual-track music file is generated according to the chord used in each bar of that file and the chord category of that chord; the chord matrix is a w*m binary matrix, where w represents the number of bars of the target dual-track music file and m represents the chord parameters of each bar.
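The claim leaves m ("the chord parameters of each bar") open. One plausible choice, assumed here purely for illustration, is m = 13: a 12-way one-hot for the chord root plus one bit for the chord category (major or minor):

```python
import numpy as np

# Illustrative sketch (assumed encoding, not from the claims): build a w*m
# binary chord matrix with one row per bar, m = 13 here as an example.

ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chord_matrix(chords_per_bar):
    """chords_per_bar: length-w list of (root, is_minor) pairs, one per bar."""
    w = len(chords_per_bar)
    mat = np.zeros((w, 13), dtype=np.uint8)
    for bar, (root, is_minor) in enumerate(chords_per_bar):
        mat[bar, ROOTS.index(root)] = 1  # chord root, one-hot over 12 pitch classes
        if is_minor:
            mat[bar, 12] = 1             # chord category bit: 0 = major, 1 = minor
    return mat
```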
  6. 根据权利要求1-5任意一项所述的方法,其特征在于,所述在将获取到的目标旋律矩阵输入所述训练好的DCGAN模型中进行处理之前,所述方法还包括:The method according to any one of claims 1 to 5, wherein before inputting the acquired target melody matrix into the trained DCGAN model for processing, the method further comprises:
    获取包括旋律音轨的单音轨音乐文件;Obtain a single-track music file including a melody track;
    将所述单音轨音乐文件的旋律调整至预设音高范围内;Adjusting the melody of the single-track music file to a preset pitch range;
    获取调整后单音轨音乐文件中的旋律音符;Get the melody notes in the adjusted single track music file;
    根据所述调整后单音轨音乐文件中的旋律音符,生成所述调整后单音轨音乐文件的目标旋律矩阵。According to the melody notes in the adjusted single-track music file, a target melody matrix of the adjusted single-track music file is generated.
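The pitch-adjustment step above can be read as octave normalization. A minimal sketch, assuming an example MIDI range of 48-83 (the actual preset range is not specified in the claims), shifts each note by whole octaves until it lies in range, changing register while preserving the pitch class:

```python
# Hypothetical sketch of adjusting a melody to a preset pitch range:
# shift each note up or down by octaves (12 semitones) until it fits.
# The range [48, 83] is an assumed example, not taken from the claims.

def adjust_to_range(pitches, low=48, high=83):
    adjusted = []
    for p in pitches:
        while p < low:
            p += 12   # raise by an octave
        while p > high:
            p -= 12   # lower by an octave
        adjusted.append(p)
    return adjusted
```

Because every shift is a multiple of 12 semitones, the adjusted melody keeps the same note names as the original.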
  7. 根据权利要求2所述的方法，其特征在于，所述生成器包括至少一个全连接层和至少一个转置卷积层，所述辨别器包括至少一个卷积层和至少一个全连接层，所述调节器包括至少一个卷积层和至少一个全连接层，所述调节器为反向的生成器。The method according to claim 2, wherein the generator includes at least one fully connected layer and at least one transposed convolutional layer, the discriminator includes at least one convolutional layer and at least one fully connected layer, the regulator includes at least one convolutional layer and at least one fully connected layer, and the regulator is a reversed generator.
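The layer layout of claim 7 can be illustrated by tracking tensor sizes alone: a transposed convolution upsamples in the generator, and a plain convolution with the same kernel and stride undoes that size change, which is why the regulator can be described as a reversed generator. Kernel size, stride, and the fully connected width below are assumed values, not taken from the patent:

```python
# Shape-only sketch (no weights) of the claim-7 layout, 1-D for simplicity.

def conv_len(n, k, s):
    """Output length of an unpadded 1-D convolution."""
    return (n - k) // s + 1

def tconv_len(n, k, s):
    """Output length of an unpadded 1-D transposed convolution."""
    return (n - 1) * s + k

fc_out = 8                         # generator: fully connected layer output length
gen_out = tconv_len(fc_out, 4, 2)  # generator: transposed conv upsamples
disc_in = conv_len(gen_out, 4, 2)  # discriminator/regulator: conv downsamples back
```

With these assumed parameters the convolution exactly inverts the transposed convolution's size change, mirroring the generator/regulator symmetry described in the claim.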
  8. 一种音乐生成装置,其特征在于,包括:A music generating device, characterized by comprising:
    构造模块，用于构造深度卷积生成式对抗网络DCGAN模型；A construction module, configured to construct a deep convolutional generative adversarial network (DCGAN) model;
    获取模块,用于获取训练数据集,所述训练数据集中包括N个旋律矩阵以及对应的N个和弦矩阵,其中旋律矩阵以及和弦矩阵均为二元矩阵;An acquiring module, configured to acquire a training data set, the training data set includes N melody matrices and corresponding N chord matrices, wherein the melody matrix and the chord matrix are both binary matrices;
    训练模块,用于将所述训练数据集中的N个旋律矩阵以及对应的N个和弦矩阵输入所述DCGAN模型中进行训练,得到训练好的DCGAN模型;A training module, configured to input the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training, to obtain a trained DCGAN model;
    输入模块,用于将获取到的目标旋律矩阵输入所述训练好的DCGAN模型中进行处理,并获取所述训练好的DCGAN模型生成的与所述目标旋律矩阵匹配的目标和弦矩阵;An input module, configured to input the acquired target melody matrix into the trained DCGAN model for processing, and acquire a target chord matrix generated by the trained DCGAN model that matches the target melody matrix;
    输出模块,用于输出所述目标旋律矩阵映射出的旋律音轨与所述目标和弦矩阵映射出的和弦音轨进行合并后的音乐文件。The output module is used to output a music file obtained by combining the melody track mapped from the target melody matrix and the chord track mapped from the target chord matrix.
  9. 根据权利要求8所述的装置，其特征在于，所述DCGAN模型中包括生成器、辨别器以及调节器，所述生成器、所述辨别器以及所述调节器均为卷积神经网络CNN；The device according to claim 8, wherein the DCGAN model includes a generator, a discriminator, and a regulator, and the generator, the discriminator, and the regulator are all convolutional neural networks (CNN);
    所述训练模块,具体用于:The training module is specifically used for:
    针对所述训练数据集中的任一旋律矩阵i，将所述旋律矩阵i输入所述DCGAN模型的生成器中生成与所述旋律矩阵i匹配的第一和弦矩阵j；For any melody matrix i in the training data set, input the melody matrix i into the generator of the DCGAN model to generate a first chord matrix j that matches the melody matrix i;
    将所述第一和弦矩阵j以及所述旋律矩阵i在所述训练数据集中对应的和弦矩阵k输入所述DCGAN模型的辨别器中辨别所述第一和弦矩阵j与所述和弦矩阵k相同的概率；Input the first chord matrix j and the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model to discriminate the probability that the first chord matrix j is the same as the chord matrix k;
    判断所述辨别器针对所述第一和弦矩阵j输出的概率是否均在预设范围内，若否则将所述概率输入所述DCGAN模型的调节器中对所述生成器的转置卷积层上的参数进行调整，将所述旋律矩阵i重新输入经过调整后的生成器中重新生成与所述旋律矩阵i匹配的第一和弦矩阵j，并将重新生成的第一和弦矩阵j以及所述旋律矩阵i在所述训练数据集中对应的和弦矩阵k输入所述DCGAN模型的辨别器中辨别所述重新生成的第一和弦矩阵j与所述和弦矩阵k相同的概率；Determine whether each probability output by the discriminator for the first chord matrix j is within a preset range; if not, input the probability into the regulator of the DCGAN model to adjust the parameters on the transposed convolutional layer of the generator, re-input the melody matrix i into the adjusted generator to regenerate a first chord matrix j matching the melody matrix i, and input the regenerated first chord matrix j and the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model to discriminate the probability that the regenerated first chord matrix j is the same as the chord matrix k;
    当所述辨别器针对所述生成器生成的各个第一和弦矩阵输出的概率均在预设范围内时，得到训练好的DCGAN模型。When the probabilities output by the discriminator for all of the first chord matrices generated by the generator are within the preset range, a trained DCGAN model is obtained.
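The generate-discriminate-adjust loop above can be illustrated as a toy control flow. This is not the actual DCGAN training algorithm (real training uses gradient-based updates on network weights); a single scalar parameter and a hand-written update stand in for the generator's transposed-convolution parameters and the regulator:

```python
# Toy illustration of the claimed control flow: generate chords from a melody,
# score them against the ground truth, and keep adjusting the generator until
# the "discriminator" score falls inside a preset range. Everything numeric
# here (the scalar model, the similarity score) is an assumption for the sketch.

def train_toy(melody, true_chords, preset=(0.45, 0.55), step=0.1, max_iter=1000):
    theta = 0.0                                   # stand-in generator parameter
    prob = 0.0
    low, high = preset
    for _ in range(max_iter):
        generated = [theta * x for x in melody]   # "generator": melody -> chords
        err = sum(abs(g - t) for g, t in zip(generated, true_chords))
        prob = 1.0 / (1.0 + err)                  # "discriminator": similarity score
        if low <= prob <= high:                   # within the preset range: done
            return theta, prob
        theta += step if prob < low else -step    # "regulator": adjust and retry
    return theta, prob
```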
  10. 根据权利要求8或9所述的装置,其特征在于,所述第一获取模块包括:The device according to claim 8 or 9, wherein the first obtaining module comprises:
    第一获取单元,用于获取包括多个双音轨音乐文件的双音轨数据集,所述双音轨音乐文件用于表示包含旋律音轨以及和弦音轨的音乐文件;The first acquiring unit is configured to acquire a dual-track data set including a plurality of dual-track music files, where the dual-track music file is used to represent a music file containing a melody track and a chord track;
    确定单元,用于从所述双音轨数据集确定出N个目标双音轨音乐文件,所述目标双音轨音乐文件中的和弦属于预设的基本和弦集合,所述基本和弦集合中包括12个大和弦和12个小和弦,所述目标双音轨音乐文件的每小节采用一个和弦;The determining unit is configured to determine N target dual-track music files from the dual-track data set, the chords in the target dual-track music files belong to a preset basic chord set, and the basic chord set includes 12 major chords and 12 minor chords, each measure of the target dual-track music file adopts one chord;
    第二获取单元,用于获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件在旋律音轨上的旋律矩阵,得到N个旋律矩阵;The second acquiring unit is configured to acquire the melody matrix of each target double-track music file on the melody track among the N target double-track music files to obtain N melody matrices;
    第三获取单元,用于获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件在和弦音轨上的和弦矩阵,得到N个和弦矩阵。The third obtaining unit is used to obtain the chord matrix of each target dual-track music file on the chord track among the N target dual-track music files to obtain N chord matrices.
  11. 根据权利要求10所述的装置,其特征在于,所述第二获取单元具体用于:The device according to claim 10, wherein the second acquiring unit is specifically configured to:
    将所述N个目标双音轨音乐文件中各个目标双音轨音乐文件的旋律调整至预设音高范围内;Adjusting the melody of each target dual-track music file in the N target dual-track music files to within a preset pitch range;
    获取调整后各个目标双音轨音乐文件中的旋律音符;Obtain the melody notes in each target dual-track music file after adjustment;
    根据所述调整后各个目标双音轨音乐文件中的旋律音符，生成所述调整后各个目标双音轨音乐文件的旋律矩阵，所述旋律矩阵为h*w的二元矩阵，所述h用于表示预设的音符数，所述w用于表示目标双音轨音乐文件的小节数。A melody matrix of each adjusted target dual-track music file is generated according to the melody notes in that adjusted file; the melody matrix is an h*w binary matrix, where h represents the preset number of notes and w represents the number of bars of the target dual-track music file.
  12. 根据权利要求10所述的装置,其特征在于,所述第三获取单元具体用于:The device according to claim 10, wherein the third acquiring unit is specifically configured to:
    获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件的各小节所采用的和弦以及所述各小节所采用和弦的和弦类别;Acquiring the chord used in each bar of each target dual-track music file in the N target dual-track music files and the chord category of the chord used in each bar;
    根据所述各个目标双音轨音乐文件的各小节所采用的和弦以及所述各小节所采用和弦的和弦类别，生成所述各个目标双音轨音乐文件的和弦矩阵，所述和弦矩阵为w*m的二元矩阵，所述w用于表示目标双音轨音乐文件的小节数，所述m用于表示各小节的和弦参数。A chord matrix of each target dual-track music file is generated according to the chord used in each bar of that file and the chord category of that chord; the chord matrix is a w*m binary matrix, where w represents the number of bars of the target dual-track music file and m represents the chord parameters of each bar.
  13. 根据权利要求8-12任意一项所述的装置,其特征在于,所述装置还包括:The device according to any one of claims 8-12, wherein the device further comprises:
    第二获取模块，用于获取包括旋律音轨的单音轨音乐文件，将所述单音轨音乐文件的旋律调整至预设音高范围内，获取调整后单音轨音乐文件中的旋律音符，根据所述调整后单音轨音乐文件中的旋律音符，生成所述调整后单音轨音乐文件的目标旋律矩阵。A second acquiring module, configured to acquire a single-track music file including a melody track, adjust the melody of the single-track music file to within a preset pitch range, acquire the melody notes in the adjusted single-track music file, and generate a target melody matrix of the adjusted single-track music file according to those melody notes.
  14. 根据权利要求9所述的装置，其特征在于，所述生成器包括至少一个全连接层和至少一个转置卷积层，所述辨别器包括至少一个卷积层和至少一个全连接层，所述调节器包括至少一个卷积层和至少一个全连接层，所述调节器为反向的生成器。The device according to claim 9, wherein the generator includes at least one fully connected layer and at least one transposed convolutional layer, the discriminator includes at least one convolutional layer and at least one fully connected layer, the regulator includes at least one convolutional layer and at least one fully connected layer, and the regulator is a reversed generator.
  15. 一种终端，其特征在于，包括处理器、输入设备、输出设备和存储器，所述处理器、输入设备、输出设备和存储器相互连接，所述存储器用于存储计算机程序，所述计算机程序包括程序指令，所述处理器用于执行所述存储器的所述程序指令，其中：A terminal, including a processor, an input device, an output device, and a memory that are connected to each other, where the memory is used to store a computer program, the computer program includes program instructions, and the processor is used to execute the program instructions stored in the memory, wherein:
    所述处理器，用于构造深度卷积生成式对抗网络DCGAN模型；获取训练数据集，所述训练数据集中包括N个旋律矩阵以及对应的N个和弦矩阵，其中旋律矩阵以及和弦矩阵均为二元矩阵；将所述训练数据集中的N个旋律矩阵以及对应的N个和弦矩阵输入所述DCGAN模型中进行训练，得到训练好的DCGAN模型；The processor is configured to construct a deep convolutional generative adversarial network (DCGAN) model; acquire a training data set, the training data set including N melody matrices and corresponding N chord matrices, where the melody matrices and the chord matrices are all binary matrices; and input the N melody matrices and the corresponding N chord matrices in the training data set into the DCGAN model for training, to obtain a trained DCGAN model;
    所述输入设备,用于将获取到的目标旋律矩阵输入所述训练好的DCGAN模型中进行处理,并获取所述训练好的DCGAN模型生成的与所述目标旋律矩阵匹配的目标和弦矩阵;The input device is configured to input the acquired target melody matrix into the trained DCGAN model for processing, and acquire a target chord matrix generated by the trained DCGAN model that matches the target melody matrix;
    所述输出设备,用于输出所述目标旋律矩阵映射出的旋律音轨与所述目标和弦矩阵映射出的和弦音轨进行合并后的音乐文件。The output device is configured to output a music file obtained by merging a melody track mapped from the target melody matrix and a chord track mapped from the target chord matrix.
  16. 根据权利要求15所述的终端，其特征在于，所述DCGAN模型中包括生成器、辨别器以及调节器，所述生成器、所述辨别器以及所述调节器均为卷积神经网络CNN；The terminal according to claim 15, wherein the DCGAN model includes a generator, a discriminator, and a regulator, and the generator, the discriminator, and the regulator are all convolutional neural networks (CNN);
    所述处理器具体用于:The processor is specifically used for:
    针对所述训练数据集中的任一旋律矩阵i,将所述旋律矩阵i输入所述DCGAN模型的生成器中生成与所述旋律矩阵i匹配的第一和弦矩阵j;For any melody matrix i in the training data set, input the melody matrix i into the generator of the DCGAN model to generate a first chord matrix j that matches the melody matrix i;
    将所述第一和弦矩阵j以及所述旋律矩阵i在所述训练数据集中对应的和弦矩阵k输入所述DCGAN模型的辨别器中辨别所述第一和弦矩阵j与所述和弦矩阵k相同的概率；Input the first chord matrix j and the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model to discriminate the probability that the first chord matrix j is the same as the chord matrix k;
    判断所述辨别器针对所述第一和弦矩阵j输出的概率是否均在预设范围内，若否则将所述概率输入所述DCGAN模型的调节器中对所述生成器的转置卷积层上的参数进行调整，将所述旋律矩阵i重新输入经过调整后的生成器中重新生成与所述旋律矩阵i匹配的第一和弦矩阵j，并将重新生成的第一和弦矩阵j以及所述旋律矩阵i在所述训练数据集中对应的和弦矩阵k输入所述DCGAN模型的辨别器中辨别所述重新生成的第一和弦矩阵j与所述和弦矩阵k相同的概率；Determine whether each probability output by the discriminator for the first chord matrix j is within a preset range; if not, input the probability into the regulator of the DCGAN model to adjust the parameters on the transposed convolutional layer of the generator, re-input the melody matrix i into the adjusted generator to regenerate a first chord matrix j matching the melody matrix i, and input the regenerated first chord matrix j and the chord matrix k corresponding to the melody matrix i in the training data set into the discriminator of the DCGAN model to discriminate the probability that the regenerated first chord matrix j is the same as the chord matrix k;
    当所述辨别器针对所述生成器生成的各个第一和弦矩阵输出的概率均在预设范围内时,得到训练好的DCGAN模型。When the probability that the discriminator outputs for each first chord matrix generated by the generator is within a preset range, a trained DCGAN model is obtained.
  17. 根据权利要求15或16所述的终端,其特征在于,所述处理器具体用于:The terminal according to claim 15 or 16, wherein the processor is specifically configured to:
    获取包括多个双音轨音乐文件的双音轨数据集,所述双音轨音乐文件用于表示包含旋律音轨以及和弦音轨的音乐文件;Acquiring a dual-track data set including a plurality of dual-track music files, where the dual-track music file is used to represent a music file containing a melody track and a chord track;
    从所述双音轨数据集确定出N个目标双音轨音乐文件，所述目标双音轨音乐文件中的和弦属于预设的基本和弦集合，所述基本和弦集合中包括12个大和弦和12个小和弦，所述目标双音轨音乐文件的每小节采用一个和弦；N target dual-track music files are determined from the dual-track data set, where the chords in the target dual-track music files belong to a preset basic chord set, the basic chord set includes 12 major chords and 12 minor chords, and each bar of a target dual-track music file uses one chord;
    获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件在旋律音轨上的旋律矩阵,得到N个旋律矩阵;Obtaining a melody matrix of each target dual-track music file on the melody track among the N target dual-track music files to obtain N melody matrices;
    获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件在和弦音轨上的和弦矩阵，得到N个和弦矩阵。Acquire the chord matrix of each of the N target dual-track music files on the chord track, to obtain N chord matrices.
  18. 根据权利要求17所述的终端,其特征在于,所述处理器还具体用于:The terminal according to claim 17, wherein the processor is further specifically configured to:
    将所述N个目标双音轨音乐文件中各个目标双音轨音乐文件的旋律调整至预设音高范围内;Adjusting the melody of each target dual-track music file in the N target dual-track music files to within a preset pitch range;
    获取调整后各个目标双音轨音乐文件中的旋律音符;Obtain the melody notes in each target dual-track music file after adjustment;
    根据所述调整后各个目标双音轨音乐文件中的旋律音符，生成所述调整后各个目标双音轨音乐文件的旋律矩阵，所述旋律矩阵为h*w的二元矩阵，所述h用于表示预设的音符数，所述w用于表示目标双音轨音乐文件的小节数。A melody matrix of each adjusted target dual-track music file is generated according to the melody notes in that adjusted file; the melody matrix is an h*w binary matrix, where h represents the preset number of notes and w represents the number of bars of the target dual-track music file.
  19. 根据权利要求17所述的终端,其特征在于,所述处理器还具体用于:The terminal according to claim 17, wherein the processor is further specifically configured to:
    获取所述N个目标双音轨音乐文件中各个目标双音轨音乐文件的各小节所采用的和弦以及所述各小节所采用和弦的和弦类别;Acquiring the chord used in each bar of each target dual-track music file in the N target dual-track music files and the chord category of the chord used in each bar;
    根据所述各个目标双音轨音乐文件的各小节所采用的和弦以及所述各小节所采用和弦的和弦类别，生成所述各个目标双音轨音乐文件的和弦矩阵，所述和弦矩阵为w*m的二元矩阵，所述w用于表示目标双音轨音乐文件的小节数，所述m用于表示各小节的和弦参数。A chord matrix of each target dual-track music file is generated according to the chord used in each bar of that file and the chord category of that chord; the chord matrix is a w*m binary matrix, where w represents the number of bars of the target dual-track music file and m represents the chord parameters of each bar.
  20. 一种计算机可读存储介质，其特征在于，所述计算机可读存储介质存储有计算机程序，所述计算机程序包括程序指令，所述程序指令当被处理器执行时使所述处理器执行如权利要求1-7任一项所述的方法。A computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to perform the method according to any one of claims 1-7.
PCT/CN2019/088805 2019-01-23 2019-05-28 Dcgan-based music generation method, and music generation apparatus WO2020151150A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910066130.8 2019-01-23
CN201910066130.8A CN109872708B (en) 2019-01-23 2019-01-23 Music generation method and device based on DCGAN

Publications (1)

Publication Number Publication Date
WO2020151150A1 true WO2020151150A1 (en) 2020-07-30

Family

ID=66918056

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088805 WO2020151150A1 (en) 2019-01-23 2019-05-28 Dcgan-based music generation method, and music generation apparatus

Country Status (2)

Country Link
CN (1) CN109872708B (en)
WO (1) WO2020151150A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763910A (en) * 2020-11-25 2021-12-07 北京沃东天骏信息技术有限公司 Music generation method and device

Families Citing this family (4)

Publication number Priority date Publication date Assignee Title
CN111477200B (en) * 2020-04-03 2023-08-25 深圳市人工智能与机器人研究院 Music score file generation method, device, computer equipment and storage medium
CN113012665B (en) * 2021-02-19 2024-04-19 腾讯音乐娱乐科技(深圳)有限公司 Music generation method and training method of music generation model
CN112818164B (en) * 2021-03-24 2023-09-15 平安科技(深圳)有限公司 Music type identification method, device, equipment and storage medium
CN112915525B (en) * 2021-03-26 2023-06-16 平安科技(深圳)有限公司 Game music generation method, device, equipment and storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
CN1573915A (en) * 2003-06-06 2005-02-02 明基电通股份有限公司 Method of creating music file with main melody and accompaniment
CN109086416A (en) * 2018-08-06 2018-12-25 中国传媒大学 A kind of generation method of dubbing in background music, device and storage medium based on GAN
CN109189974A (en) * 2018-08-08 2019-01-11 平安科技(深圳)有限公司 A kind of method for building up, system, equipment and the storage medium of model of writing music


Non-Patent Citations (4)

Title
AKBARI MOHAMMAD; LIANG JIE: "Semi-Recurrent CNN-Based VAE-GAN for Sequential Data Generation", 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 20 April 2018 (2018-04-20), pages 2321 - 2325, XP033401002, DOI: 10.1109/ICASSP.2018.8461724 *
K1D: "Music Generation", 15 October 2018 (2018-10-15), XP055723613, Retrieved from the Internet <URL:https://www.jianshu.com/p/b4cdc377845f> *
LI-CHIA YANG, SZU-YU CHOU, YI-HSUAN YANG: "MidiNet: A Convolutional Generative Adversarial Network for Symbolic-domain Music Generation", COMPUTER SCIENCE, 18 July 2017 (2017-07-18), pages 1 - 8, XP081399939, DOI: 10.5281/zenodo.1415990 *
LIU HAO-MIN; YANG YI-HSUAN: "Lead Sheet Generation and Arrangement by Conditional Generative Adversarial Network,", 2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 20 December 2018 (2018-12-20), pages 722 - 727, XP033502358, DOI: 10.1109/ICMLA.2018.00114 *


Also Published As

Publication number Publication date
CN109872708A (en) 2019-06-11
CN109872708B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
WO2020151150A1 (en) Dcgan-based music generation method, and music generation apparatus
US10025773B2 (en) System and method for natural language processing using synthetic text
Ni et al. An end-to-end machine learning system for harmonic analysis of music
WO2020248393A1 (en) Speech synthesis method and system, terminal device, and readable storage medium
WO2022121257A1 (en) Model training method and apparatus, speech recognition method and apparatus, device, and storage medium
US20210232929A1 (en) Neural architecture search
WO2019232928A1 (en) Musical model training method, music creation method, devices, terminal and storage medium
JP6793708B2 (en) Music synthesis methods, systems, terminals, computer readable storage media and programs
CN111681631B (en) Collocation harmony method, collocation harmony device, electronic equipment and computer readable medium
CN116072098B (en) Audio signal generation method, model training method, device, equipment and medium
US20210241735A1 (en) Systems, devices, and methods for computer-generated musical compositions
CN104392716B (en) The phoneme synthesizing method and device of high expressive force
US20230230571A1 (en) Audio processing method and apparatus based on artificial intelligence, device, storage medium, and computer program product
CN113470664B (en) Voice conversion method, device, equipment and storage medium
CN113470684A (en) Audio noise reduction method, device, equipment and storage medium
WO2020029382A1 (en) Method, system and apparatus for building music composition model, and storage medium
TWI740315B (en) Sound separation method, electronic and computer readable storage medium
CN109448697B (en) Poem melody generation method, electronic device and computer readable storage medium
CN111477200A (en) Music score file generation method and device, computer equipment and storage medium
WO2021190660A1 (en) Music chord recognition method and apparatus, and electronic device and storage medium
CN113140230A (en) Method, device and equipment for determining pitch value of note and storage medium
US20200243162A1 (en) Method, system, and computing device for optimizing computing operations of gene sequencing system
Chen et al. Lightgrad: Lightweight diffusion probabilistic model for text-to-speech
WO2023245523A1 (en) Method and apparatus for generating training data
CN113112969A (en) Buddhism music score recording method, device, equipment and medium based on neural network

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19911264; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19911264; Country of ref document: EP; Kind code of ref document: A1)