CN109086416A - GAN-based soundtrack generation method and device, and storage medium - Google Patents

GAN-based soundtrack generation method and device, and storage medium

Info

Publication number
CN109086416A
Authority
CN
China
Prior art keywords
music
music data
data collection
gan
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810885875.2A
Other languages
Chinese (zh)
Inventor
靳聪
王洪亮
周帜
陈小森
李高玲
李中仝
孙圆圆
李雨静
王南苏
帖云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Communication University of China
Original Assignee
Communication University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Communication University of China filed Critical Communication University of China
Priority to CN201810885875.2A priority Critical patent/CN109086416A/en
Publication of CN109086416A publication Critical patent/CN109086416A/en
Pending legal-status Critical Current

Links

Landscapes

  • Auxiliary Devices For Music (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

This application provides a GAN-based soundtrack generation method, device and storage medium. The method comprises: obtaining a MIDI data set and a random sample; performing format conversion on each piece of MIDI data in the MIDI data set to obtain the score data corresponding to that MIDI data, the score data corresponding to the MIDI data constituting a score data set; screening the score data set according to preset melody conditions to obtain a screened score data set; and inputting the random sample and the screened score data set into a generative adversarial network (GAN) to obtain a target soundtrack. With the method of the embodiments of the present application, the melody conditions are set for the characteristics of the generative adversarial network so that the score data set is optimized and the training effect of the GAN is improved; furthermore, simply adjusting the style of the score data set changes the style of the soundtrack generated by the GAN, which increases the flexibility of soundtrack generation.

Description

GAN-based soundtrack generation method and device, and storage medium
Technical field
The present application relates to the field of speech and audio synthesis, and in particular to a GAN-based soundtrack generation method and device, and a storage medium.
Background art
With the development of artificial intelligence, soundtracks are now mostly generated with deep learning networks. In the related art, the soundtracks obtained from deep learning networks already have fairly high musicality: the melody is coherent and the harmony rarely contains mistakes. However, the connection between bars is weak, the overall contour is still rather random, the compositional style is monotonous, and the synthesis process still requires varying degrees of manual intervention. Therefore, obtaining a training data set better suited to the deep learning network, improving the training effect of the network, and thereby obtaining a better-fitting soundtrack from the trained network has become an important problem in optimizing soundtrack generation.
Summary of the invention
In view of this, the object of the present application is to provide a GAN-based soundtrack generation method and device, and a storage medium, so as to optimize the training data set used for soundtrack generation and to generate the soundtrack with a generative adversarial network, thereby improving the effect of soundtrack generation.
In a first aspect, an embodiment of the present application provides a GAN-based soundtrack generation method, comprising:
obtaining a MIDI data set and a random sample;
performing format conversion on each piece of MIDI data in the MIDI data set to obtain the score data corresponding to that MIDI data, the score data corresponding to the MIDI data constituting a score data set;
screening the score data set according to preset melody conditions to obtain a screened score data set;
inputting the random sample and the screened score data set into a generative adversarial network (GAN) to obtain a target soundtrack.
With reference to the first aspect, an embodiment of the present application provides a first possible implementation of the first aspect, wherein after performing format conversion on the MIDI data in the MIDI data set to obtain the score data, the method comprises:
merging the tracks of similar instruments in the score data to obtain track-merged score data.
With reference to the first possible implementation of the first aspect, an embodiment of the present application provides a second possible implementation of the first aspect, wherein screening the score data set according to the preset melody conditions to obtain the screened score data set comprises:
screening the track-merged score data set according to interval relations; and
screening the track-merged score data set according to the frequency of use of notes.
With reference to the second possible implementation of the first aspect, an embodiment of the present application provides a third possible implementation of the first aspect, wherein screening the score data set according to interval relations comprises:
selecting, from the score data set, the score data in which the span of every interval leap is less than an octave (an eighth); and
selecting, from the score data set, the score data in which no consecutive large leaps occur; and
selecting, from the score data set, the score data in which, after a large leap occurs, the melodic direction of the interval changes.
With reference to the second possible implementation of the first aspect, an embodiment of the present application provides a fourth possible implementation of the first aspect, wherein, after screening the score data set according to the preset melody conditions to obtain the screened score data set, the method comprises: performing data cleaning on the screened score data set according to preset format conditions to obtain a cleaned score data set.
With reference to the fourth possible implementation of the first aspect, an embodiment of the present application provides a fifth possible implementation of the first aspect, wherein inputting the random sample and the screened score data set into the generative adversarial network to obtain the target soundtrack comprises:
inputting the random sample into the generator network of the GAN and inputting the screened score data set into the discriminator network of the GAN, and training the GAN;
using the trained GAN as the generative adversarial network.
With reference to the first aspect, an embodiment of the present application provides a sixth possible implementation of the first aspect, wherein, after inputting the random sample and the screened score data into the generative adversarial network to obtain the target soundtrack, the method comprises: generating, based on a hidden Markov model (HMM) algorithm and the target soundtrack, a chord accompaniment matching the target soundtrack.
In a second aspect, an embodiment of the present application provides a GAN-based soundtrack generation device, comprising:
an acquisition unit, configured to obtain a MIDI data set and a random sample;
a format conversion unit, configured to perform format conversion on each piece of MIDI data in the MIDI data set to obtain the score corresponding to that MIDI data, the scores corresponding to the MIDI data constituting a score data set;
a score data screening unit, configured to screen the score data set according to preset melody conditions to obtain a screened score data set;
a soundtrack generation unit, configured to input the random sample and the screened score data set into a generative adversarial network to obtain a target soundtrack.
In a third aspect, an embodiment of the present application provides an electronic device, comprising a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate over the bus, and when executed by the processor, the machine-readable instructions perform the steps of the above GAN-based soundtrack generation method.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, the computer program performing the steps of the above GAN-based soundtrack generation method when run by a processor.
In the GAN-based soundtrack generation method provided by the embodiments of the present application, real MIDI data are selected and converted into score data to form a score data set, the score data set is screened according to preset melody conditions, and a soundtrack in the style of the score data set is then generated by the generative adversarial network. The melody conditions are set for the characteristics of the generative adversarial network so that the score data set is optimized and the training effect of the GAN is improved; furthermore, simply adjusting the style of the score data set changes the style of the soundtrack generated by the GAN, which increases the flexibility of soundtrack generation.
To make the above objects, features and advantages of the present application clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings show only some embodiments of the present application and therefore should not be regarded as limiting its scope; for a person of ordinary skill in the art, other relevant drawings can be obtained from these drawings without creative effort.
Fig. 1 shows a flow chart of a GAN-based soundtrack generation method provided by an embodiment of the present application;
Fig. 2 shows a schematic diagram of converting the MIDI data set provided by an embodiment of the present application into a score data set;
Fig. 3 shows a schematic structural diagram of the GAN provided by an embodiment of the present application;
Fig. 4 shows a schematic diagram of the hidden Markov model provided by an embodiment of the present application;
Fig. 5 shows a schematic structural diagram of a GAN-based soundtrack generation device provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present application.
Specific embodiment
To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. The components of the embodiments of the present application, as generally described and illustrated in the drawings, may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present application provided in the drawings is not intended to limit the claimed scope of the present application, but merely represents selected embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by a person skilled in the art without creative effort shall fall within the protection scope of the present application.
To facilitate understanding of the present embodiment, the GAN-based soundtrack generation method disclosed in the embodiments of the present application is first described in detail.
Embodiment one
Fig. 1 shows a flow chart of a GAN-based soundtrack generation method provided by an embodiment of the present application. As shown in Fig. 1, the method comprises the following steps:
S110: obtaining a MIDI data set and a random sample;
S120: performing format conversion on each piece of MIDI data in the MIDI data set to obtain the score corresponding to that MIDI data, the scores corresponding to the MIDI data constituting a score data set;
S130: screening the score data set according to preset melody conditions to obtain a screened score data set;
S140: inputting the random sample and the screened score data set into a generative adversarial network to obtain a target soundtrack.
Specifically, existing MIDI data are obtained to form a MIDI data set; the existing MIDI data may be crawled from the network or pulled from a first-party server that stores MIDI data. One piece of MIDI data is extracted from the MIDI data set and format-converted: according to the time steps, pitches, bar length and number of tracks of the MIDI data, the score data corresponding to that MIDI data are obtained. In this way each piece of MIDI data in the MIDI data set is format-converted to obtain its corresponding score data, and all the score data together form a score data set. The score data set is then screened according to preset melody conditions, such as the span of interval leaps, consecutive large leaps in the melodic direction and the frequency of use of notes, to obtain a screened score data set. The screened score data set and a segment of random sample are input into the generative adversarial network to obtain a target soundtrack. Depending on the style of the soundtrack data set, generative adversarial networks of different styles can be trained, so that soundtracks of different styles can be generated.
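By way of illustration only, the format-conversion step S120 can be sketched in Python as follows. This code is not part of the original disclosure: it assumes the pretty_midi library, uses the 96-steps-per-bar resolution of the example given later, and the function name midi_to_pianoroll as well as the tempo handling are illustrative assumptions.

```python
# Illustrative sketch (assumption, not the patent's implementation): convert one MIDI
# file into a (bars, time_steps, pitches, tracks) tensor with the pretty_midi library.
import numpy as np
import pretty_midi

STEPS_PER_BAR = 96  # time resolution per bar, following the example in the description

def midi_to_pianoroll(path, steps_per_bar=STEPS_PER_BAR):
    midi = pretty_midi.PrettyMIDI(path)
    tempo = midi.estimate_tempo()                 # beats per minute (rough estimate)
    seconds_per_bar = 4 * 60.0 / tempo            # assumes 4/4 metre
    fs = steps_per_bar / seconds_per_bar          # piano-roll sampling rate
    rolls = [inst.get_piano_roll(fs=fs).T > 0 for inst in midi.instruments]  # (T, 128) each
    length = max(r.shape[0] for r in rolls)
    n_bars = int(np.ceil(length / steps_per_bar))
    tensor = np.zeros((n_bars, steps_per_bar, 128, len(rolls)), dtype=bool)
    for k, r in enumerate(rolls):
        padded = np.zeros((n_bars * steps_per_bar, 128), dtype=bool)
        padded[: r.shape[0]] = r
        tensor[..., k] = padded.reshape(n_bars, steps_per_bar, 128)
    return tensor  # shape: (bars, time steps, pitches, tracks)
```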
In the embodiments of the present application, a trained generative adversarial network (GAN) is used as the generative adversarial network. The structure of the GAN is first briefly introduced: a GAN mainly comprises two networks, a generator network and a discriminator network. The generator network produces a target soundtrack, the aim being to make the generated target soundtrack as close as possible to the score data. The score data here are the score data obtained from the above MIDI data after format conversion and screening, and are real data; the target soundtrack is data generated by the generator network, and is simulated data. The purpose of the discriminator network is to distinguish the target soundtrack from the score data as well as possible, that is, to tell whether a sample is "real or fake": simulated data produced by the generator network or real data. When training the GAN, one of the generator network and the discriminator network is fixed while the other is trained with its own training objective, and the two alternate iteratively. During the alternating iterations both sides strive to optimize their own networks, forming a competitive adversarial game, until the two sides reach a dynamic balance, i.e. a Nash equilibrium, at which point the GAN training is finished.
With the method provided by the embodiments of the present application, the melody conditions are set for the characteristics of the generative adversarial network so that the score data set is optimized and the training effect of the GAN is improved; furthermore, simply adjusting the style of the score data set changes the style of the soundtrack generated by the GAN, which increases the flexibility of soundtrack generation.
Fig. 2 shows a schematic diagram of converting the MIDI data provided by an embodiment of the present application into score data. As shown in Fig. 2, after performing format conversion on each piece of MIDI data in the MIDI data set to obtain the score data corresponding to that MIDI data, the method comprises:
merging the tracks of similar instruments in the score data to obtain track-merged score data;
the track-merged score data forming a track-merged score data set.
Specifically, a piece of MIDI data in the MIDI data set is extracted and format-converted to obtain the score data corresponding to that MIDI data: according to the time steps, pitches, bar length and number of tracks of the MIDI data, one piece of corresponding score data is obtained.
In the score data, some instruments play only a few notes, which makes the data excessively sparse. This data imbalance is therefore resolved by merging the tracks of similar instruments: according to the similarity of the instruments, the tracks of all instruments in the score data are merged into the bass, drum, guitar and piano tracks respectively. The track-merged score data then form a score data set.
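By way of illustration only, the track-merging step can be sketched as follows; the grouping of the source tracks into the four instrument families in FAMILIES is an assumption made for demonstration, not a mapping disclosed by the patent.

```python
# Illustrative sketch: OR together the piano rolls of tracks that belong to the same
# instrument family, producing one bass, drum, guitar and piano track each.
import numpy as np

FAMILIES = {            # assumed assignment of the source tracks to 4 families
    "bass":   [0, 1],
    "drums":  [2, 3, 4],
    "guitar": [5, 6, 7, 8],
    "piano":  [9, 10, 11, 12, 13, 14],
}

def merge_tracks(tensor, families=FAMILIES):
    """tensor: (bars, steps, pitches, tracks) -> (bars, steps, pitches, n_families)."""
    merged = np.zeros(tensor.shape[:3] + (len(families),), dtype=bool)
    for i, track_ids in enumerate(families.values()):
        merged[..., i] = np.any(tensor[..., track_ids], axis=-1)
    return merged
```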
After the track-merged score data set is obtained, it also needs to be screened in the following manner:
screening the score data set according to the preset melody conditions to obtain the screened score data set comprises:
screening the track-merged score data set according to interval relations; and
screening the track-merged score data set according to the frequency of use of notes.
Wherein screening the score data set according to interval relations comprises:
selecting, from the score data set, the score data in which the span of every interval leap is less than an octave; and
selecting, from the score data set, the score data in which no consecutive large leaps occur; and
selecting, from the score data set, the score data in which, after a large leap occurs, the melodic direction of the interval changes.
Specifically, the melody conditions used to screen the track-merged score data set include, but are not limited to, the following:
(1) The span of a melodic leap is less than an octave (an eighth). (2) When the intervals before and after it are stepwise, the span of a leap is less than a fifth. (3) No consecutive large leaps occur. (4) No consecutive large leaps occur in the same direction. (5) After a large leap, the direction of the next interval changes. (6) When three notes rise in succession, the large leap is followed immediately by a smaller interval. (7) When three notes fall in succession, the smaller interval comes first. (8) No tritone (augmented fourth or diminished fifth) is used in the melodic line. (9) The same note is not repeated consecutively. (10) The same note is not reused on the strong beat of every bar. (11) The same note is not used too many times within one melodic phrase.
The score data selected by the above melody conditions form the screened score data set.
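By way of illustration only, part of this screening can be expressed as a predicate over a monophonic pitch sequence. The sketch below checks only a subset of the eleven conditions, and its semitone thresholds (12 semitones for the octave limit, leaps larger than a major third treated as "large") are assumptions rather than values fixed by the patent.

```python
# Illustrative sketch of interval-based screening on a list of MIDI pitch numbers.
def passes_interval_rules(pitches):
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    for i, iv in enumerate(intervals):
        if abs(iv) >= 12:                            # condition (1): leap span under an octave
            return False
        if abs(iv) > 4 and i + 1 < len(intervals):   # "large" leap: bigger than a major third
            nxt = intervals[i + 1]
            if abs(nxt) > 4:                         # condition (3): no consecutive large leaps
                return False
            if nxt * iv > 0:                         # condition (5): direction changes after a large leap
                return False
    if any(abs(iv) == 6 for iv in intervals):        # condition (8): no tritone step
        return False
    if any(iv == 0 for iv in intervals):             # condition (9): no immediate note repetition
        return False
    return True
```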
After the screened score data set is obtained, data cleaning also needs to be performed on it: the score data set is cut into segments of suitable length, for example 4 bars per segment; furthermore, pitches that are too high or too low, such as pitches below C1 or above C8, are removed from the data, and the score data set best suited for soundtrack generation is finally obtained.
For example, a piece of score data is obtained by extracting one piece of MIDI data from the MIDI data set and format-converting it; according to the bar length, time steps, pitches and number of tracks in the MIDI data, score data with a data tensor of 20 (bars) × 96 (time steps) × 128 (pitches) × 15 (tracks) are obtained. According to the similarity of the instruments' roles in the score data, the 15 tracks are merged into the 4 tracks of bass, guitar, drum and piano respectively. All the track-merged score data form a track-merged score data set, which is screened according to the above melody conditions; the score data that simultaneously satisfy the melody conditions are selected to form the screened score data set. Further, data cleaning is performed on the screened score data set: with 4 bars as one segment, pitches below C1 or above C8 are removed, and score data with a data tensor of 4 (bars) × 96 (time steps) × 84 (pitches) × 4 (tracks) are finally obtained.
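By way of illustration only, the cleaning of this example can be sketched as follows; the MIDI pitch bounds (24 for C1, 108 for C8, keeping 84 pitches) and the fixed 4-bar segmentation mirror the numbers above, but the exact cropping convention is an assumption.

```python
# Illustrative sketch: crop the pitch axis and cut the tensor into 4-bar training phrases.
import numpy as np

PITCH_LOW, PITCH_HIGH = 24, 108   # MIDI 24 = C1, MIDI 108 = C8 -> 84 retained pitches

def clean_and_segment(tensor, bars_per_phrase=4):
    """tensor: (bars, 96, 128, tracks) -> (phrases, 4, 96, 84, tracks)."""
    tensor = tensor[:, :, PITCH_LOW:PITCH_HIGH, :]        # drop pitches below C1 / above C8
    n_phrases = tensor.shape[0] // bars_per_phrase
    tensor = tensor[: n_phrases * bars_per_phrase]
    return tensor.reshape(n_phrases, bars_per_phrase, *tensor.shape[1:])
```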
The score data that have undergone data cleaning form a cleaned score data set, which is input into the generative adversarial network for training.
Optionally, inputting the random sample and the screened score data set into the generative adversarial network to obtain the target soundtrack comprises:
inputting the random sample into the generator network of the GAN and inputting the screened score data set into the discriminator network of the GAN, and training the GAN;
using the trained GAN as the generative adversarial network.
Specifically, the GAN mainly consists of a generating part (the generator network) and a discriminating part (the discriminator network). The random sample is input into the generator network to obtain test data; the test data (simulated data) and the score data (real data) are input into the discriminator network, which judges whether the currently received data are real and, according to this result, returns difference information to the generator network to correct it. The generator network then regenerates the test data, and the new test data and the score data are again input into the discriminator network for judgement; this continues until the discriminator network can no longer tell real from fake, at which point the generator network with the best performance is obtained.
Fig. 3 shows a schematic structural diagram of the GAN provided by an embodiment of the present application. As shown in Fig. 3, the generator network is denoted G, the discriminator network is denoted D, and the random sample is denoted z. The random sample is input into the generator network, which produces the test data G(z); the test data G(z) and the score data x are input into the discriminator network, which outputs one probability value for the input score data x and one for the test data G(z). On this basis the discriminator network judges whether the input is the score data x or the generated test data G(z), from which the performance of the generator network can also be judged. Finally, when the discriminator network can no longer distinguish the score data x from the test data G(z), the performance of the generator network is considered to have reached the optimum.
The difference information returned by the discriminator network is used to correct the generator network during training; when the difference information reaches its minimum, the generator network has reached optimal performance. The goal of the discriminator network is to make the difference between its response D(G(z)) on the test data G(z) produced by the generator network and its response D(x) on the score data as large as possible, so that the two can be distinguished; the goal of the generator network is to make the difference between D(G(z)) and D(x) as small as possible, so that the discriminator network cannot tell them apart. The optimization process is therefore one of mutual competition and confrontation: the performance of the generator network and the discriminator network is continuously improved over the iterations until D(G(z)) finally agrees with D(x) on the score data x, at which point neither network can be further optimized and the training of the generative adversarial network is complete.
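By way of illustration only, this alternating optimization can be sketched with PyTorch as follows; the multilayer-perceptron generator and discriminator, the noise dimension and the learning rates below are assumptions made for demonstration and do not describe the networks actually used in the embodiments.

```python
# Illustrative sketch of alternating GAN training: fix G to update D, then fix D to update G.
import torch
import torch.nn as nn

NOISE_DIM = 100
DATA_DIM = 4 * 96 * 84 * 4        # one flattened 4-bar, 4-track, 84-pitch phrase

G = nn.Sequential(nn.Linear(NOISE_DIM, 512), nn.ReLU(), nn.Linear(512, DATA_DIM), nn.Sigmoid())
D = nn.Sequential(nn.Linear(DATA_DIM, 512), nn.LeakyReLU(0.2), nn.Linear(512, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_batch):
    """real_batch: (B, DATA_DIM) phrases drawn from the cleaned score data set."""
    b = real_batch.size(0)
    fake = G(torch.randn(b, NOISE_DIM))

    # Step 1: fix G, train D to tell real phrases from generated ones.
    opt_d.zero_grad()
    loss_d = bce(D(real_batch), torch.ones(b, 1)) + bce(D(fake.detach()), torch.zeros(b, 1))
    loss_d.backward()
    opt_d.step()

    # Step 2: fix D, train G so that its output is scored as real by D.
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(b, 1))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```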
The generator network and the discriminator network described above may use a hidden Markov model (HMM).
A GAN can take real data directly as input, and the data it generates can closely approximate the real data; this characteristic is the greatest advantage of a GAN. The purpose of a GAN is to estimate the underlying distribution of the data samples and then generate new data samples according to that estimate.
After the target soundtrack is obtained from the generative adversarial network, the main melody of the target soundtrack can be identified by the HMM algorithm, and a chord accompaniment that agrees with the theme of the target soundtrack is generated.
After inputting the random sample and the screened score data into the generative adversarial network to obtain the target soundtrack, the method comprises: generating, based on the hidden Markov model (HMM) algorithm and the target soundtrack, a chord accompaniment matching the target soundtrack.
Specifically, the hidden Markov model is a model based on probability statistics; it assumes that the state of a random process at a given moment depends only on the state at the previous moment. Fig. 4 shows a schematic diagram of the hidden Markov model provided by an embodiment of the present application. As shown in Fig. 4, each circle in the first row represents the hidden state at the corresponding moment; the state sequence s0, s1, s2, s3, s4 cannot be observed directly. Each circle in the second row represents the observation at the corresponding moment; the observation sequence o0, o1, o2, o3, o4 is observed directly, and s0, s1, s2, s3, s4 correspond one-to-one with o0, o1, o2, o3, o4.
In the embodiments of the present application, the observations correspond to the theme of the target soundtrack generated by the GAN, and the hidden states correspond to the chord accompaniment information contained in the target soundtrack. Using the HMM algorithm, the first segment of the chord accompaniment is obtained from the theme of the current target soundtrack, the second segment is then computed from the first, the third from the second, and so on; this repeated feed-forward generation yields a chord accompaniment that matches the target soundtrack.
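By way of illustration only, the HMM idea can be sketched as follows. The sketch assigns one chord per bar with Viterbi decoding rather than the segment-by-segment feed-forward generation described above, and the chord vocabulary, transition and emission probabilities are toy assumptions, not values from the disclosure.

```python
# Illustrative sketch: hidden states are chords, observations are melody pitch classes;
# Viterbi decoding returns the most likely chord sequence for the generated melody.
import numpy as np

CHORDS = ["C", "F", "G", "Am"]                              # assumed hidden-state vocabulary
TRIADS = {0: {0, 4, 7}, 1: {5, 9, 0}, 2: {7, 11, 2}, 3: {9, 0, 4}}
start_p = np.log([0.4, 0.2, 0.2, 0.2])
trans_p = np.log([[0.5, 0.2, 0.2, 0.1],
                  [0.3, 0.4, 0.2, 0.1],
                  [0.4, 0.1, 0.4, 0.1],
                  [0.3, 0.2, 0.1, 0.4]])

def emit_logp(chord, pitch_classes):
    """Toy emission model: reward melody notes that belong to the chord triad."""
    hits = sum(pc in TRIADS[chord] for pc in pitch_classes)
    return np.log(0.2 + hits / max(len(pitch_classes), 1))

def viterbi_chords(bars):
    """bars: list of per-bar lists of melody pitch classes -> list of chord names."""
    n, k = len(bars), len(CHORDS)
    dp = np.full((n, k), -np.inf)
    back = np.zeros((n, k), dtype=int)
    dp[0] = start_p + [emit_logp(j, bars[0]) for j in range(k)]
    for t in range(1, n):
        for j in range(k):
            scores = dp[t - 1] + trans_p[:, j]
            back[t, j] = int(np.argmax(scores))
            dp[t, j] = scores[back[t, j]] + emit_logp(j, bars[t])
    path = [int(np.argmax(dp[-1]))]
    for t in range(n - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [CHORDS[i] for i in reversed(path)]
```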
Using the HMM algorithm to recognize the melody signal and automatically generate the chord accompaniment makes good use of the temporal characteristics in the signal and improves the fluency and accuracy of algorithmic composition.
Based on the same technical idea, the embodiments of the present application also provide a GAN-based soundtrack generation device, an electronic device, a computer storage medium and the like; details can be found in the following embodiments.
Embodiment two
Fig. 5 shows a schematic structural diagram of a GAN-based soundtrack generation device provided by an embodiment of the present application. As shown in Fig. 5, the GAN-based soundtrack generation device comprises:
an acquisition unit 210, configured to obtain a MIDI data set and a random sample;
a format conversion unit 220, configured to perform format conversion on each piece of MIDI data in the MIDI data set to obtain the score corresponding to that MIDI data, the scores corresponding to the MIDI data constituting a score data set;
a score data screening unit 230, configured to screen the score data set according to preset melody conditions to obtain a screened score data set;
a soundtrack generation unit 240, configured to input the random sample and the screened score data set into a generative adversarial network (GAN) to obtain a target soundtrack.
Embodiment three
Fig. 6 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present application. As shown in Fig. 6, the device comprises:
one or more processors 310 and a memory 320; one processor 310 is taken as an example in Fig. 6.
The processor 310 and the memory 320 may be connected via a bus or in other ways; connection via a bus is taken as an example in Fig. 6.
As a non-volatile computer-readable storage medium, the memory 320 may be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the GAN-based soundtrack generation method of the embodiments of the present application. By running the non-volatile software programs, instructions and modules stored in the memory 320, the processor 310 executes the various functional applications and data processing of the server, i.e. implements the GAN-based soundtrack generation method of the above method embodiments.
The memory 320 may comprise a program storage area and a data storage area, wherein the program storage area may store the operating system and the application programs required by at least one function, and the data storage area may store data created according to the use of any of the above methods, etc. In addition, the memory 320 may comprise a high-speed random access memory and may also comprise a non-volatile memory, for example at least one magnetic disk memory device, flash memory device or other non-volatile solid-state memory device. In some embodiments, the memory 320 optionally comprises memories located remotely from the processor 310, and these remote memories may be connected via a network to the processor running any of the above methods. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
One or more modules are stored in the memory 320; when executed by the one or more processors 310, they execute the steps of the GAN-based soundtrack generation method of any of the above method embodiments.
Example IV
An embodiment of the present application provides a computer-readable storage medium in which computer-executable instructions are stored; when run by a processor, the computer-executable instructions execute the steps of the GAN-based soundtrack generation method of the above embodiments of the application.
The computer program product of the soundtrack generation method provided by the embodiments of the present application comprises a computer-readable storage medium storing processor-executable non-volatile program code; the instructions contained in the program code can be used to execute the method described in the preceding method embodiments. For the specific implementation, reference may be made to the method embodiments, which are not repeated here.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. The device embodiments described above are merely exemplary; for example, the division of the units is only a division by logical function, and there may be other divisions in actual implementation; for another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the part of the technical solution of the present application that essentially contributes to the related art, or a part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
Finally, it should be noted that the embodiments described above are merely specific embodiments of the present application, used to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can, within the technical scope disclosed by the present application, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of the technical features; such modifications, variations or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application and shall all be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A GAN-based soundtrack generation method, characterized by comprising:
obtaining a MIDI data set and a random sample;
performing format conversion on each piece of MIDI data in the MIDI data set to obtain the score data corresponding to that MIDI data, the score data corresponding to the MIDI data constituting a score data set;
screening the score data set according to preset melody conditions to obtain a screened score data set;
inputting the random sample and the screened score data set into a generative adversarial network (GAN) to obtain a target soundtrack.
2. The GAN-based soundtrack generation method according to claim 1, characterized in that, after performing format conversion on each piece of MIDI data in the MIDI data set to obtain the score data corresponding to that MIDI data, the method comprises:
merging the tracks of similar instruments in the score data to obtain track-merged score data;
the track-merged score data forming a track-merged score data set.
3. The GAN-based soundtrack generation method according to claim 2, characterized in that screening the score data set according to the preset melody conditions to obtain the screened score data set comprises:
screening the track-merged score data set according to interval relations; and
screening the track-merged score data set according to the frequency of use of notes.
4. The GAN-based soundtrack generation method according to claim 3, characterized in that screening the score data set according to interval relations comprises:
selecting, from the score data set, the score data in which the span of every interval leap is less than an octave; and
selecting, from the score data set, the score data in which no consecutive large leaps occur; and
selecting, from the score data set, the score data in which, after a large leap occurs, the melodic direction of the interval changes.
5. The GAN-based soundtrack generation method according to claim 3, characterized in that, after screening the score data set according to the preset melody conditions to obtain the screened score data set, the method comprises: performing data cleaning on the screened score data set according to preset format conditions to obtain a cleaned score data set.
6. The GAN-based soundtrack generation method according to claim 5, characterized in that inputting the random sample and the screened score data set into the generative adversarial network to obtain the target soundtrack comprises:
inputting the random sample into the generator network of the GAN and inputting the cleaned score data set into the discriminator network of the GAN, and training the GAN;
using the trained GAN as the generative adversarial network.
7. The GAN-based soundtrack generation method according to claim 1, characterized in that, after inputting the random sample and the screened score data into the generative adversarial network (GAN) to obtain the target soundtrack, the method comprises: generating, based on a hidden Markov model (HMM) algorithm and the target soundtrack, a chord accompaniment matching the target soundtrack.
8. A GAN-based soundtrack generation device, characterized by comprising:
an acquisition unit, configured to obtain a MIDI data set and a random sample;
a format conversion unit, configured to perform format conversion on each piece of MIDI data in the MIDI data set to obtain the score corresponding to that MIDI data, the scores corresponding to the MIDI data constituting a score data set;
a score data screening unit, configured to screen the score data set according to preset melody conditions to obtain a screened score data set;
a soundtrack generation unit, configured to input the random sample and the screened score data set into a generative adversarial network to obtain a target soundtrack.
9. An electronic device, characterized by comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate over the bus, and when executed by the processor, the machine-readable instructions execute the steps of the GAN-based soundtrack generation method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when run by a processor, the computer program executes the steps of the GAN-based soundtrack generation method according to any one of claims 1 to 7.
CN201810885875.2A 2018-08-06 2018-08-06 A kind of generation method of dubbing in background music, device and storage medium based on GAN Pending CN109086416A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810885875.2A CN109086416A (en) 2018-08-06 2018-08-06 A kind of generation method of dubbing in background music, device and storage medium based on GAN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810885875.2A CN109086416A (en) 2018-08-06 2018-08-06 A kind of generation method of dubbing in background music, device and storage medium based on GAN

Publications (1)

Publication Number Publication Date
CN109086416A true CN109086416A (en) 2018-12-25

Family

ID=64834049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810885875.2A Pending CN109086416A (en) 2018-08-06 2018-08-06 A kind of generation method of dubbing in background music, device and storage medium based on GAN

Country Status (1)

Country Link
CN (1) CN109086416A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109545177A (en) * 2019-01-04 2019-03-29 平安科技(深圳)有限公司 A kind of melody is dubbed in background music method and device
CN109872708A (en) * 2019-01-23 2019-06-11 平安科技(深圳)有限公司 A kind of music generating method and device based on DCGAN
CN110136678A (en) * 2019-04-26 2019-08-16 北京奇艺世纪科技有限公司 A kind of music method, apparatus and electronic equipment
CN110136730A (en) * 2019-04-08 2019-08-16 华南理工大学 A kind of automatic allocation system of piano harmony and method based on deep learning
CN110288965A (en) * 2019-05-21 2019-09-27 北京达佳互联信息技术有限公司 A kind of music synthesis method, device, electronic equipment and storage medium
CN110781835A (en) * 2019-10-28 2020-02-11 中国传媒大学 Data processing method and device, electronic equipment and storage medium
CN111488422A (en) * 2019-01-25 2020-08-04 深信服科技股份有限公司 Incremental method and device for structured data sample, electronic equipment and medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106652984A (en) * 2016-10-11 2017-05-10 张文铂 Automatic song creation method via computer

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106652984A (en) * 2016-10-11 2017-05-10 张文铂 Automatic song creation method via computer

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HAO-WEN DONG ET AL.: "MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment", 《CONFERENCE: THE THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI)》 *
IAN SIMON ET AL.: "MySong: automatic accompaniment generation for vocal melodies", 《PROCEEDINGS OF THE SIGCHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS》 *
王坤峰等: "Generative Adversarial Networks: The State of the Art and Beyond" (生成式对抗网络GAN的研究进展与展望), 《自动化学报》 (Acta Automatica Sinica) *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109545177B (en) * 2019-01-04 2023-08-22 平安科技(深圳)有限公司 Melody matching method and device
CN109545177A (en) * 2019-01-04 2019-03-29 平安科技(深圳)有限公司 A kind of melody is dubbed in background music method and device
WO2020151150A1 (en) * 2019-01-23 2020-07-30 平安科技(深圳)有限公司 Dcgan-based music generation method, and music generation apparatus
CN109872708A (en) * 2019-01-23 2019-06-11 平安科技(深圳)有限公司 A kind of music generating method and device based on DCGAN
CN111488422A (en) * 2019-01-25 2020-08-04 深信服科技股份有限公司 Incremental method and device for structured data sample, electronic equipment and medium
CN110136730A (en) * 2019-04-08 2019-08-16 华南理工大学 A kind of automatic allocation system of piano harmony and method based on deep learning
CN110136730B (en) * 2019-04-08 2021-07-20 华南理工大学 Deep learning-based piano and acoustic automatic configuration system and method
CN110136678B (en) * 2019-04-26 2022-06-03 北京奇艺世纪科技有限公司 Music editing method and device and electronic equipment
CN110136678A (en) * 2019-04-26 2019-08-16 北京奇艺世纪科技有限公司 A kind of music method, apparatus and electronic equipment
CN110288965A (en) * 2019-05-21 2019-09-27 北京达佳互联信息技术有限公司 A kind of music synthesis method, device, electronic equipment and storage medium
CN110288965B (en) * 2019-05-21 2021-06-18 北京达佳互联信息技术有限公司 Music synthesis method and device, electronic equipment and storage medium
CN110781835A (en) * 2019-10-28 2020-02-11 中国传媒大学 Data processing method and device, electronic equipment and storage medium
CN110781835B (en) * 2019-10-28 2022-08-23 中国传媒大学 Data processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109086416A (en) A kind of generation method of dubbing in background music, device and storage medium based on GAN
Ellis Prediction-driven computational auditory scene analysis
Hainsworth et al. Particle filtering applied to musical tempo tracking
CN109346045B (en) Multi-vocal part music generation method and device based on long-short time neural network
US8380331B1 (en) Method and apparatus for relative pitch tracking of multiple arbitrary sounds
US11557269B2 (en) Information processing method
CN104217729A (en) Audio processing method, audio processing device and training method
Yildiz et al. A hierarchical neuronal model for generation and online recognition of birdsongs
CN109346043B (en) Music generation method and device based on generation countermeasure network
KR102224070B1 (en) Method for making rhythm game
Manzelli et al. An end to end model for automatic music generation: Combining deep raw and symbolic audio networks
CN109804427A (en) It plays control method and plays control device
Dadman et al. Toward interactive music generation: A position paper
Bonnasse-Gahot An update on the SOMax project
CN113178182A (en) Information processing method, information processing device, electronic equipment and storage medium
Miron Automatic detection of hindustani talas
Pons Puig Deep neural networks for music and audio tagging
Holzapfel et al. Bayesian meter tracking on learned signal representations
Roy et al. Modeling high performance music computing using Petri Nets
Rönnberg Classification of heavy metal subgenres with machine learning
Jiang et al. Piano score-following by tracking note evolution
Vogl Deep Learning Methods for Drum Transcription and Drum Pattern Generation/submitted by Richard Vogl
Asesh Markov chain sequence modeling
US20240087552A1 (en) Sound generation method and sound generation device using a machine learning model
Loyola Polyphonic music generation using neural networks

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20181225