CN113035161A - Chord-based song melody generation method, device, equipment and storage medium - Google Patents
- Publication number
- CN113035161A (application CN202110285841.1A)
- Authority
- CN
- China
- Prior art keywords
- vector
- target
- chord
- pitch
- lyric
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
- G10H1/0025—Automatic or semi-automatic music composition, e.g. producing random music, applying rules from music theory or modifying a musical piece
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/101—Music Composition or musical creation; Tools or processes therefor
- G10H2210/105—Composing aid, e.g. for supporting creation, edition or modification of a piece of music
Abstract
The invention relates to the field of artificial intelligence and discloses a chord-based song melody generation method, device, equipment and storage medium, which are applied to the field of intelligent education and are used for improving the audibility of song melodies and the efficiency of song melody generation. The method comprises the following steps: acquiring target lyrics input by a user and a target chord selected by the user in advance; generating a target tone vector according to the target lyrics and a target chord vector according to the target chord; combining the target tone vector and the target chord vector to generate a target tone chord vector; inputting the target tone chord vector into a trained Transformer model to obtain an output characteristic vector, wherein the output characteristic vector comprises a target lyric pitch vector and a target lyric duration vector; and generating the target song melody according to the target lyric pitch vector and the target lyric duration vector.
Description
Technical Field
The invention relates to the field of audio conversion, in particular to a chord-based song melody generation method, device, equipment and storage medium.
Background
Ordinarily, song creation is a difficult task that must be adjusted according to the overall effect of the song and the inspiration of its creators. With the popularization of artificial intelligence technology, it has become possible to generate songs directly using artificial intelligence. Song creation is a comprehensive art: beyond lyric writing, it proceeds step by step through further stages such as composition and singing.
Song generation needs to take different musical characteristics into account, and those characteristics take many forms of expression. Melody is the most important factor influencing a song, yet the melodies generated by existing schemes have low audibility and low quality.
Disclosure of Invention
The invention provides a chord-based song melody generation method, device and equipment and a storage medium, which are used for improving the quality of song melodies, improving the audibility of the song melodies and improving the generation efficiency of the song melodies.
A first aspect of an embodiment of the present invention provides a method for generating a song melody based on a chord, including: acquiring target lyrics input by a user and a target chord preselected by the user; generating a target tone vector according to the target lyrics and a target chord vector according to the target chord; combining the target tone vector and the target chord vector to generate a target tone chord vector; inputting the target tone chord vector into a trained Transformer model to obtain an output characteristic vector, wherein the output characteristic vector comprises a target lyric pitch vector and a target lyric duration vector; and generating a target song melody according to the target lyric pitch vector and the target lyric duration vector.
Optionally, in a first implementation manner of the first aspect of the embodiment of the present invention, before the obtaining the target lyric input by the user and the target chord pre-selected by the user, the method for generating the song melody based on the chord further includes: acquiring preset training data, wherein the preset training data comprises a digital music score of a plurality of songs; and training a preset initial model by using the preset training data to obtain a Transformer model.
Optionally, in a second implementation manner of the first aspect of the embodiment of the present invention, the training a preset initial model by using the preset training data to obtain a Transformer model includes: obtaining a plurality of digital music scores from preset training data, wherein each digital music score is used for indicating the lyrics, the tone, the chord, the pitch and the duration of a song; extracting tone information, chord information, pitch information and duration information of the songs from the digital music score corresponding to each song; sequentially generating a tone vector T, a chord vector C, a pitch vector P and a duration vector D of the song according to the tone information, the chord information, the pitch information and the duration information, and respectively combining the tone vector T, the chord vector C, the pitch vector P and the duration vector D of a plurality of songs to obtain a tone vector sequence T, a chord vector sequence C, a pitch vector sequence P and a duration vector sequence D; and training the preset initial model by using the tone vector sequence T, the chord vector sequence C, the pitch vector sequence P and the duration vector sequence D to obtain the Transformer model.
Optionally, in a third implementation manner of the first aspect of the embodiment of the present invention, the training the preset initial model by using the tone vector sequence T, the chord vector sequence C, the pitch vector sequence P and the duration vector sequence D to obtain the Transformer model includes: combining the tone vector sequence T = [t1, t2, t3, …, tn] and the chord vector sequence C = [c1, c2, c3, …, cn] to obtain a first high-dimensional vector Input, where Input = [[t1, c1], [t2, c2], [t3, c3], …, [tn, cn]]; combining the pitch vector sequence P = [p1, p2, p3, …, pn] and the duration vector sequence D = [d1, d2, d3, …, dn] to obtain a second high-dimensional vector Output, where Output = [[p1, d1], [p2, d2], [p3, d3], …, [pn, dn]]; and taking the first high-dimensional vector Input as an input vector and the second high-dimensional vector Output as an output vector, and training the preset initial model to obtain the Transformer model.
Optionally, in a fourth implementation manner of the first aspect of the embodiment of the present invention, the extracting tone information, chord information, pitch information and duration information of the song from the digital music score includes: converting a plurality of digital music scores into an XML format, wherein each digital music score corresponds to one song; reading lyric information from the digital music scores in the XML format to obtain the lyric Chinese characters in each digital music score; determining the tone information and pitch information corresponding to each character in each song according to the lyric Chinese characters in each digital music score; determining the corresponding chord information and duration information according to each digital music score; and generating the tone information, chord information, pitch information and duration information corresponding to the plurality of songs.
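As a minimal illustration of the XML-reading step described above, the sketch below parses lyric characters and pitch names from a MusicXML-like fragment. The element names, the tiny inline score, and the helper function are illustrative assumptions; the embodiment does not fix an exact schema.

```python
# Hedged sketch: extract lyric characters and pitches from an XML score.
# The <note>/<pitch>/<lyric> layout follows MusicXML conventions, but is
# only an assumption here, not the patent's prescribed format.
import xml.etree.ElementTree as ET

SCORE_XML = """
<score>
  <note><pitch><step>D</step><octave>5</octave></pitch>
        <lyric><text>好</text></lyric></note>
  <note><pitch><step>E</step><octave>5</octave></pitch>
        <lyric><text>好</text></lyric></note>
</score>
"""

def read_lyrics_and_pitches(xml_text):
    """Return the lyric character and pitch name carried by each note."""
    root = ET.fromstring(xml_text)
    chars, pitches = [], []
    for note in root.iter("note"):
        chars.append(note.find("lyric/text").text)
        step = note.find("pitch/step").text
        octave = note.find("pitch/octave").text
        pitches.append(step + octave)
    return chars, pitches

chars, pitches = read_lyrics_and_pitches(SCORE_XML)
# chars -> ['好', '好'], pitches -> ['D5', 'E5']
```

Tone information per character and chord/duration information would be derived from these extracted characters and score annotations in the same pass.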
Optionally, in a fifth implementation manner of the first aspect of the embodiment of the present invention, the training the preset initial model to obtain the Transformer model by using the first high-dimensional vector Input as an input vector and the second high-dimensional vector Output as an output vector includes: inputting the first high-dimensional vector Input into the encoder at the head end of an encoding assembly in the preset initial model, wherein the encoding assembly comprises a plurality of encoders connected in sequence, and each encoder comprises a multi-head attention layer and a feedforward network layer; inputting the result output by the encoder at the tail end of the encoding assembly into each multi-head attention layer in a decoding assembly, wherein the decoding assembly comprises a plurality of decoders connected in sequence, and each decoder comprises a masked multi-head attention layer, a multi-head attention layer and a feedforward network layer; inputting the second high-dimensional vector Output into the masked multi-head attention layer of the decoder at the head end of the decoding assembly in the preset initial model; and inputting the output result of the decoder at the tail end of the decoding assembly into a linear network to obtain the Transformer model.
Optionally, in a sixth implementation manner of the first aspect of the embodiment of the present invention, the generating a target song melody according to the target lyric pitch vector and the target lyric duration vector includes: aligning the target lyric pitch vector with the target lyric duration vector; generating a melody line of the target lyrics based on the target lyric pitch vector; generating the beat of the target lyrics according to the target lyric duration vector; and generating a target song melody according to the melody line of the target lyrics and the beat of the target lyrics.
A second aspect of an embodiment of the present invention provides a chord-based song melody generating apparatus, including: an obtaining module for obtaining target lyrics input by a user and a target chord selected by the user in advance; a first generation module for generating a target tone vector according to the target lyrics and a target chord vector according to the target chord; a combination module for combining the target tone vector and the target chord vector to generate a target tone chord vector; an input module for inputting the target tone chord vector into a trained Transformer model to obtain an output characteristic vector, the output characteristic vector comprising a target lyric pitch vector and a target lyric duration vector; and a second generation module for generating the target song melody according to the target lyric pitch vector and the target lyric duration vector.
Optionally, in a first implementation manner of the second aspect of the embodiment of the present invention, the chord-based song melody generating apparatus further includes: the data acquisition module is used for acquiring preset training data, and the preset training data comprises a digital music score of a plurality of songs; and the training module is used for training a preset initial model by using the preset training data to obtain a Transformer model.
Optionally, in a second implementation manner of the second aspect of the embodiment of the present invention, the training module includes: a score obtaining unit for obtaining a plurality of digital scores from preset training data, wherein each digital score is used for indicating lyrics, tone, chord, pitch and duration of a song; an information acquisition unit for extracting tone information, chord information, pitch information and duration information of a song from a digital music score; the sequence generating unit is used for sequentially generating a tone vector T, a chord vector C, a pitch vector P and a duration vector D of the song according to the tone information, the chord information, the pitch information and the duration information, and respectively combining the tone vector T, the chord vector C, the pitch vector P and the duration vector D of a plurality of songs to obtain a tone vector sequence T, a chord vector sequence C, a pitch vector sequence P and a duration vector sequence D; and the model training unit is used for training the preset initial model by utilizing the tone vector sequence T, the chord vector sequence C, the pitch vector sequence P and the duration vector sequence D to obtain the Transformer model.
Optionally, in a third implementation manner of the second aspect of the embodiment of the present invention, the model training unit specifically includes: a first combining subunit for combining the tone vector sequence T = [t1, t2, t3, …, tn] and the chord vector sequence C = [c1, c2, c3, …, cn] to obtain a first high-dimensional vector Input, where Input = [[t1, c1], [t2, c2], [t3, c3], …, [tn, cn]]; a second combining subunit for combining the pitch vector sequence P = [p1, p2, p3, …, pn] and the duration vector sequence D = [d1, d2, d3, …, dn] to obtain a second high-dimensional vector Output, where Output = [[p1, d1], [p2, d2], [p3, d3], …, [pn, dn]]; and a training subunit for training the preset initial model with the first high-dimensional vector Input as the input vector and the second high-dimensional vector Output as the output vector to obtain the Transformer model.
Optionally, in a fourth implementation manner of the second aspect of the embodiment of the present invention, the information obtaining unit is specifically configured to: convert a plurality of digital music scores into an XML format, wherein each digital music score corresponds to one song; read lyric information from the digital music scores in the XML format to obtain the lyric Chinese characters in each digital music score; determine the tone information and pitch information corresponding to each character in each song according to the lyric Chinese characters in each digital music score; determine the corresponding chord information and duration information according to each digital music score; and generate the tone information, chord information, pitch information and duration information corresponding to the plurality of songs.
Optionally, in a fifth implementation manner of the second aspect of the embodiment of the present invention, the training subunit is specifically configured to: input the first high-dimensional vector Input into the encoder at the head end of an encoding assembly in the preset initial model, wherein the encoding assembly comprises a plurality of encoders connected in sequence, and each encoder comprises a multi-head attention layer and a feedforward network layer; input the result output by the encoder at the tail end of the encoding assembly into each multi-head attention layer in a decoding assembly, wherein the decoding assembly comprises a plurality of decoders connected in sequence, and each decoder comprises a masked multi-head attention layer, a multi-head attention layer and a feedforward network layer; input the second high-dimensional vector Output into the masked multi-head attention layer of the decoder at the head end of the decoding assembly in the preset initial model; and input the output result of the decoder at the tail end of the decoding assembly into a linear network to obtain the Transformer model.
Optionally, in a sixth implementation manner of the second aspect of the embodiment of the present invention, the second generation module is specifically configured to: align the target lyric pitch vector with the target lyric duration vector; generate a melody line of the target lyrics based on the target lyric pitch vector; generate the beat of the target lyrics according to the target lyric duration vector; and generate the target song melody according to the melody line of the target lyrics and the beat of the target lyrics.
A third aspect of the embodiments of the present invention provides a chord-based song melody generating device, including a memory storing instructions and at least one processor, the memory and the at least one processor being interconnected by a line; the at least one processor invokes the instructions in the memory to cause the chord-based song melody generating device to perform the chord-based song melody generation method described above.
A fourth aspect of an embodiment of the present invention provides a computer-readable storage medium storing instructions that, when executed by a processor, implement the steps of the chord-based song melody generation method according to any one of the above-described embodiments.
According to the technical scheme provided by the embodiment of the invention, target lyrics input by a user and a target chord selected by the user in advance are obtained; a target tone vector is generated according to the target lyrics and a target chord vector according to the target chord; the target tone vector and the target chord vector are combined to generate a target tone chord vector; the target tone chord vector is input into a trained Transformer model to obtain an output characteristic vector, wherein the output characteristic vector comprises a target lyric pitch vector and a target lyric duration vector; and the target song melody is generated according to the target lyric pitch vector and the target lyric duration vector. In the embodiment of the invention, the initial Transformer model is trained on chords, pitches, tones and durations, and the trained Transformer model is then used to generate the target song melody, so that the quality and audibility of the song melody are improved and the efficiency of melody generation is further improved.
Drawings
FIG. 1 is a diagram of an embodiment of a chord-based song melody generation method according to an embodiment of the present invention;
FIG. 2 is a diagram of another embodiment of a chord-based song melody generation method according to an embodiment of the present invention;
FIG. 3 is a diagram of an embodiment of a chord-based song melody generating apparatus according to an embodiment of the present invention;
FIG. 4 is a diagram of another embodiment of the chord-based song melody generating apparatus according to an embodiment of the present invention;
FIG. 5 is a diagram of an embodiment of chord-based song melody generating equipment according to an embodiment of the present invention.
Detailed Description
The invention provides a chord-based song melody generation method, device and equipment and a storage medium, which are used for improving the quality of song melodies and the audibility of the song melodies and further improving the generation efficiency of the song melodies.
To enable those skilled in the art to better understand the solution of the present invention, the embodiments of the present invention are described below in conjunction with the accompanying drawings.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," or "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, a flowchart of a method for generating a song melody based on a chord according to an embodiment of the present invention specifically includes:
101. Acquire target lyrics input by a user and a target chord selected by the user in advance.
The server obtains the target lyrics input by a user and a target chord selected by the user in advance, wherein the target lyrics are a combination of characters input by the user, and the target chord is a chord the user selects from preset chords. When the user inputs a combination of characters of his own devising, that combination serves as the lyrics of the song to be composed; for example, the target lyrics may be "love yourself well and someone will love you, this optimistic saying". The preset chords include triads, seventh chords, tonic six-four chords, tonic sixth chords, secondary sixth chords and the like, which are not described in detail here.
It is to be understood that the executing body of the present invention may be the chord-based song melody generating apparatus, or may be the server, and is not limited thereto. The embodiment of the present invention is described by taking a server as an execution subject.
102. Generate a target tone vector according to the target lyrics and a target chord vector according to the target chord.
The server generates a target tone vector according to the target lyrics and a target chord vector according to the target chord.
The length of the target lyrics is not limited, and lyrics of different songs can differ considerably in length. For example, the lyrics of the song "Unforgettable Tonight" contain 129 Chinese characters, while the lyrics of the song "Sweet Honey" contain 179 Chinese characters.
It should be noted that different songs have lyrics of different lengths and different numbers of Chinese characters, so the number of generated tones also differs; moreover, one chord may correspond to several Chinese characters (that is, a chord may repeat). The target tone vector and the target chord vector are therefore given the same length, which facilitates alignment; details are not repeated here.
103. Combine the target tone vector and the target chord vector to generate a target tone chord vector.
The server combines the target tone vector and the target chord vector to generate a target tone chord vector.
Specifically, the tones and chords of the lyrics are combined into a higher-dimensional vector. For example, the tone vector sequence T = [t1, t2, t3, …, tn] and the chord vector sequence C = [c1, c2, c3, …, cn] are combined to obtain the high-dimensional vector Input = [[t1, c1], [t2, c2], [t3, c3], …, [tn, cn]].
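The pairwise combination described above can be sketched in a few lines. The concrete numeric values below are illustrative placeholders, not data from the patent: each character's tone is paired with the chord active under that character, and likewise pitch with duration for the training target.

```python
# Hedged sketch of the vector-combination step; all values are illustrative.
def combine(seq_a, seq_b):
    """Zip two equal-length per-character sequences into one
    higher-dimensional sequence of [a_i, b_i] pairs."""
    assert len(seq_a) == len(seq_b), "sequences must be aligned per character"
    return [[a, b] for a, b in zip(seq_a, seq_b)]

T = [3, 4, 3, 2]             # tone label of each lyric character (0-4)
C = [0, 0, 1, 1]             # chord index active under each character
P = [62, 64, 62, 57]         # numeric pitch per character (assumed encoding)
D = [0.25, 0.5, 0.25, 0.5]   # duration (in beats) per character

Input = combine(T, C)   # [[3, 0], [4, 0], [3, 1], [2, 1]]
Output = combine(P, D)  # [[62, 0.25], [64, 0.5], [62, 0.25], [57, 0.5]]
```

The same `combine` step serves both the model input (tone, chord) and the training target (pitch, duration), which is why the two vector pairs must share one length per song.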
104. Input the target tone chord vector into a trained Transformer model to obtain an output characteristic vector, wherein the output characteristic vector comprises a target lyric pitch vector and a target lyric duration vector.
The server inputs the target tone chord vector into the trained Transformer model to obtain an output characteristic vector, wherein the output characteristic vector comprises a target lyric pitch vector and a target lyric duration vector.
For example, if the input vector is Input = [[t1, c1], [t2, c2], [t3, c3], …, [tn, cn]], the trained Transformer model produces the output characteristic vector Output = [[p1, d1], [p2, d2], [p3, d3], …, [pn, dn]], which contains the pitch vector [p1, p2, p3, …, pn] and the duration vector [d1, d2, d3, …, dn].
105. Generate the target song melody according to the target lyric pitch vector and the target lyric duration vector.
Specifically, the server aligns the target lyric pitch vector and the target lyric duration vector; generates a melody line of the target lyrics based on the target lyric pitch vector; generates the beat of the target lyrics according to the target lyric duration vector; and generates the target song melody according to the melody line and the beat of the target lyrics.
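The alignment-and-assembly step performed here can be sketched as follows. The note-event dictionary format and the interpretation of the beat as cumulative onsets are illustrative assumptions, not the patent's prescribed representation.

```python
# Hedged sketch of step 105: pair each pitch with its duration, and let
# cumulative durations give each note's onset beat (the rhythm of the lyric).
def build_melody(pitches, durations):
    assert len(pitches) == len(durations), "vectors must be aligned"
    melody, onset = [], 0.0
    for p, d in zip(pitches, durations):
        melody.append({"pitch": p, "onset": onset, "duration": d})
        onset += d  # next note starts when this one ends
    return melody

notes = build_melody(["D5", "E5", "D5"], [0.25, 0.5, 0.25])
# onsets: 0.0, 0.25, 0.75
```

The melody line is the ordered pitch sequence; the onset/duration fields carry the beat, and together the events form the target song melody.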
In the embodiment of the invention, the initial Transformer model is trained on chords, pitches, tones and durations, and the trained Transformer model is then used to generate the target song melody, thereby improving the quality and audibility of the song melody and further improving the efficiency of melody generation. This scheme can also be applied in the field of smart education to promote the construction of smart cities.
Referring to fig. 2, another flowchart of the method for generating a song melody based on a chord according to the embodiment of the present invention specifically includes:
201. Acquire preset training data, the preset training data including digital music scores of a plurality of songs.
The server obtains preset training data, wherein the preset training data comprises a digital music score of a plurality of songs.
It should be noted that, in this embodiment, the digital music score includes at least lyrics, tones, chords, pitches and durations, and may further include other musical features, such as rests, final bar lines and accent marks, which are not limited here.
It is to be understood that the executing body of the present invention may be the chord-based song melody generating apparatus, or may be the server, and is not limited thereto. The embodiment of the present invention is described by taking a server as an execution subject.
For example, if the song "Keyword" is used as training data, the digital music score of the song must be determined, including its lyrics, tones, chords, pitches and durations. Taking the first line of the lyrics, "love yourself well and someone will love you, this optimistic saying", as an example, the corresponding tones are "3, 4, 3, 2, 4, 3, 4, 1, 0, 1, 2", where the digits 0, 1, 2, 3 and 4 respectively denote the neutral, first, second, third and fourth tones of the Chinese characters; the corresponding pitches are "D5, E5, D5, E5, A4, E5, D5, E5, D5, E5, A4, E5, D5, E5, D5, E5, G4"; and the corresponding durations are values such as "0.25, 0.5, 0.25, 0.5, 0.25", with the corresponding chords likewise read from the score. The remaining lyric lines are processed in the same way and are not described here in detail.
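Before training, the symbolic score data above must become numeric vectors. The sketch below shows one plausible encoding: Mandarin tone digits are kept as-is, and scientific pitch names such as "D5" are mapped to MIDI note numbers. The patent does not prescribe this particular mapping; it is an assumption for illustration.

```python
# Hedged sketch of encoding score data numerically (assumed MIDI mapping,
# natural notes only for brevity; sharps/flats are omitted).
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def pitch_to_midi(name):
    """Map a scientific pitch name like 'D5' to its MIDI note number."""
    step, octave = name[0], int(name[-1])
    return 12 * (octave + 1) + STEP_TO_SEMITONE[step]

tones = [3, 4, 3, 2, 4, 3, 4, 1, 0, 1, 2]  # per-character tone labels, already numeric
pitches = ["D5", "E5", "D5", "E5", "A4"]   # first few pitches from the score
midi = [pitch_to_midi(p) for p in pitches]
# D5 -> 74, E5 -> 76, A4 -> 69
```

With tones, chord indices, MIDI pitches and beat durations all numeric, the four vector sequences T, C, P and D of a song can be assembled directly.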
202. And training the preset initial model by using preset training data to obtain a Transformer model.
Specifically, the server acquires a plurality of digital music scores from preset training data, wherein each digital music score is used for indicating the lyrics, the tone, the chord, the pitch and the duration of a song; the server extracts tone information, chord information, pitch information and duration information of the songs from the digital music score; the server sequentially generates a tone vector T, a chord vector C, a pitch vector P and a duration vector D corresponding to each song according to the tone information, the chord information, the pitch information and the duration information, and combines the tone vector T, the chord vector C, the pitch vector P and the duration vector D of a plurality of songs respectively to obtain a tone vector sequence T, a chord vector sequence C, a pitch vector sequence P and a duration vector sequence D; and the server trains a preset initial model by using the tone vector sequence T, the chord vector sequence C, the pitch vector sequence P and the duration vector sequence D to obtain a Transformer model.
Optionally, the training, by the server, of the preset initial model by using the tone vector sequence T, the chord vector sequence C, the pitch vector sequence P, and the duration vector sequence D to obtain the Transformer model specifically includes: the server combines the tone vector sequence T = [t1, t2, t3, …, tn] and the chord vector sequence C = [c1, c2, c3, …, cn] to obtain a first high-dimensional vector Input, where Input = [[t1, c1], [t2, c2], [t3, c3], …, [tn, cn]]; the server combines the pitch vector sequence P = [p1, p2, p3, …, pn] and the duration vector sequence D = [d1, d2, d3, …, dn] to obtain a second high-dimensional vector Output, where Output = [[p1, d1], [p2, d2], [p3, d3], …, [pn, dn]]; and the server takes the first high-dimensional vector Input as the input vector and the second high-dimensional vector Output as the output vector, and trains the preset initial model to obtain the Transformer model.
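The pairing step described above — zipping tones with chords into Input and pitches with durations into Output — can be sketched as follows (the integer ids are illustrative):

```python
def pair_sequences(a, b):
    """Element-wise pairing of two equal-length sequences:
    [a1,...,an], [b1,...,bn] -> [[a1,b1],...,[an,bn]]."""
    assert len(a) == len(b), "sequences must be aligned"
    return [[x, y] for x, y in zip(a, b)]

Input = pair_sequences([3, 4, 3], [0, 0, 1])    # tones paired with chords
Output = pair_sequences([0, 1, 0], [0, 1, 0])   # pitches paired with durations
```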
Optionally, the extracting, by the server, of the tone information, chord information, pitch information and duration information of the songs from the digital music scores specifically includes: the server converts a plurality of digital music scores into an XML format, wherein each digital music score corresponds to one song; the server reads lyric information from the digital music scores in the XML format to obtain the lyric Chinese characters in each digital music score; the server determines the tone information and pitch information corresponding to each character of each song according to the lyric Chinese characters in each digital music score; the server determines the corresponding chord information and duration information according to each digital music score; and the server generates the tone information, chord information, pitch information and duration information corresponding to the plurality of songs.
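A hedged sketch of this extraction step, assuming a MusicXML-style score: the element names (`note`, `pitch`, `step`, `octave`, `lyric`, `text`) follow the MusicXML convention, and real digital scores may differ.

```python
import xml.etree.ElementTree as ET

# A two-note MusicXML-style fragment used purely for illustration.
SCORE_XML = """
<part><measure>
  <note><pitch><step>D</step><octave>5</octave></pitch>
        <duration>1</duration><lyric><text>好</text></lyric></note>
  <note><pitch><step>E</step><octave>5</octave></pitch>
        <duration>2</duration><lyric><text>想</text></lyric></note>
</measure></part>
"""

def extract_notes(xml_text):
    """Return (lyric character, pitch name, duration) triples from the score."""
    root = ET.fromstring(xml_text)
    out = []
    for note in root.iter("note"):
        char = note.findtext("lyric/text")
        pitch = note.findtext("pitch/step") + note.findtext("pitch/octave")
        dur = int(note.findtext("duration"))
        out.append((char, pitch, dur))
    return out

notes = extract_notes(SCORE_XML)
```

Tone information would then be looked up per character from a pinyin-tone dictionary, which is outside this sketch.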
Optionally, the training, by the server, of the preset initial model with the first high-dimensional vector Input as the input vector and the second high-dimensional vector Output as the output vector to obtain the Transformer model specifically includes: the server inputs the first high-dimensional vector Input into the encoder at the head end of the encoding assembly in the preset initial model, wherein the encoding assembly comprises a plurality of sequentially connected encoders, and each encoder comprises a multi-head attention layer and a feedforward network layer; the server inputs the output result of the encoder at the tail end of the encoding assembly into each multi-head attention layer in the decoding assembly, wherein the decoding assembly comprises a plurality of sequentially connected decoders, and each decoder comprises a mask multi-head attention layer, a multi-head attention layer and a feedforward network layer; the server inputs the second high-dimensional vector Output into the mask multi-head attention layer of the decoder at the head end of the decoding assembly in the preset initial model; and the server inputs the output result of the decoder at the tail end of the decoding assembly into a linear network to obtain the Transformer model.
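For illustration, the multi-head attention layers named above are built from scaled dot-product attention; a single-head, plain-Python version (a conceptual sketch, not the patent's implementation) looks like this:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(queries, keys, values):
    """For each query, return a weighted average of the values, with weights
    given by softmax of the scaled dot products between query and keys."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

# One query strongly aligned with the first key attends mostly to value 1.
result = attention([[10.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]],
                   [[1.0, 0.0], [0.0, 1.0]])
```

A multi-head layer runs several such attentions in parallel on learned projections and concatenates the results; the mask variant in the decoders additionally zeroes out attention to future positions.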
203. And acquiring the target lyrics input by the user and the target chord selected by the user in advance.
The server obtains the target lyrics input by the user and the target chord selected in advance by the user, wherein the target lyrics are a character combination input by the user, and the target chord is selected by the user from preset chords. When the user inputs a character combination of his own devising, that combination is used as the lyrics of the song to be composed; for example, the target lyrics may be "be good to yourself, someone will love you, these optimistic words". The preset chords include triads, seventh chords, main fourth chords, main sixth chords, secondary sixth chords and the like, which are not described in detail herein.
204. And generating a target tone vector according to the target lyrics and generating a target chord vector according to the target chord.
The server generates a target tone vector according to the target lyrics and generates a target chord vector according to the target chord.
The length of the target lyrics is not limited: for example, the lyrics of the song "Forget Tonight" contain 129 Chinese characters, while the lyrics of the song "Sweet Honey" contain 179 Chinese characters.
It should be noted that, for different songs, the lengths of the corresponding lyrics may differ, and so may the numbers of Chinese characters and of generated tones; one chord may correspond to a plurality of Chinese characters (i.e., the chord is repeated), so that the target tone vector and the target chord vector have the same length, which facilitates alignment. Details are not repeated here.
205. And combining the target tone vector and the target chord vector to generate the target tone chord vector.
The server combines the target key vector and the target chord vector to generate a target key chord vector.
Specifically, the tones and chords of the lyrics are combined into a higher-dimensional vector. For example, the tone vector sequence T = [t1, t2, t3, …, tn] and the chord vector sequence C = [c1, c2, c3, …, cn] are combined to obtain the high-dimensional vector Input = [[t1, c1], [t2, c2], [t3, c3], …, [tn, cn]].
206. And inputting the target tone chord vector into a trained Transformer model to obtain an output characteristic vector, wherein the output characteristic vector comprises a target lyric pitch vector and a target lyric time value vector.
And the server inputs the target tone chord vector into a trained Transformer model to obtain an output characteristic vector, wherein the output characteristic vector comprises a target lyric pitch vector and a target lyric time value vector.
For example, if the input vector is Input = [[t1, c1], [t2, c2], [t3, c3], …, [tn, cn]], the trained Transformer model yields the output feature vector Output = [[p1, d1], [p2, d2], [p3, d3], …, [pn, dn]], which comprises the pitch vector [p1, p2, p3, …, pn] and the duration vector [d1, d2, d3, …, dn].
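Unpacking the model's Output back into the separate pitch vector and duration vector can be sketched as follows (integer ids are illustrative):

```python
def split_output(output):
    """[[p1,d1],...,[pn,dn]] -> ([p1,...,pn], [d1,...,dn])."""
    pitches = [pair[0] for pair in output]
    durations = [pair[1] for pair in output]
    return pitches, durations

P, D = split_output([[0, 0], [1, 1], [0, 0]])
```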
207. And generating the target song melody according to the target lyric pitch vector and the target lyric time value vector.
Specifically, the server aligns a target lyric pitch vector and a target lyric duration vector; the server generates a melody line of the target lyrics based on the pitch vector of the target lyrics; the server generates the beat of the target lyrics according to the time value vector of the target lyrics; the server generates a target song melody according to the melody line of the target lyric and the beat of the target lyric.
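A minimal sketch of this step, under the assumption that duration values are expressed in beats (the patent does not fix a unit): each decoded pitch is paired with its duration, and cumulative onsets lay the notes on a timeline that determines the beat.

```python
def assemble_melody(pitches, durations):
    """Align pitch and duration vectors into note events with onset times."""
    assert len(pitches) == len(durations), "vectors must align"
    melody, onset = [], 0.0
    for p, d in zip(pitches, durations):
        melody.append({"pitch": p, "duration": d, "onset": onset})
        onset += d  # next note starts when this one ends
    return melody

melody = assemble_melody(["D5", "E5", "D5"], [0.25, 0.5, 0.25])
```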
According to the embodiment of the invention, the initial Transformer model is trained on chords, pitches, tones and duration values, and the trained Transformer model is then used to generate the target song melody, which improves the quality, audibility and generation efficiency of the song melody. The scheme can also be applied to the smart education field to promote the construction of smart cities.
The chord-based song melody generating method according to the embodiment of the present invention is described above. A chord-based song melody generating device according to the embodiment of the present invention is described below with reference to fig. 3. One embodiment of the chord-based song melody generating device includes:
an obtaining module 301, configured to obtain a target lyric input by a user and a target chord pre-selected by the user;
a first generating module 302, configured to generate a target pitch vector according to the target lyric and generate a target chord vector according to the target chord;
a combining module 303, configured to combine the target tone vector and the target chord vector to generate a target tone chord vector;
an input module 304, configured to input the target tone chord vector into a trained Transformer model to obtain an output feature vector, where the output feature vector includes a target lyric pitch vector and a target lyric duration vector;
a second generating module 305, configured to generate a target song melody according to the target lyric pitch vector and the target lyric duration vector.
According to the embodiment of the invention, the initial Transformer model is trained on chords, pitches, tones and duration values, and the trained Transformer model is then used to generate the target song melody, which improves the quality, audibility and generation efficiency of the song melody. The scheme can also be applied to the smart education field to promote the construction of smart cities.
Referring to fig. 4, another embodiment of the chord-based song melody generating apparatus according to the embodiment of the present invention includes:
an obtaining module 301, configured to obtain a target lyric input by a user and a target chord pre-selected by the user;
a first generating module 302, configured to generate a target pitch vector according to the target lyric and generate a target chord vector according to the target chord;
a combining module 303, configured to combine the target tone vector and the target chord vector to generate a target tone chord vector;
an input module 304, configured to input the target tone chord vector into a trained Transformer model to obtain an output feature vector, where the output feature vector includes a target lyric pitch vector and a target lyric duration vector;
a second generating module 305, configured to generate a target song melody according to the target lyric pitch vector and the target lyric duration vector.
Optionally, the chord-based song melody generating apparatus further includes:
a data obtaining module 306, configured to obtain preset training data, where the preset training data includes a digital music score of a plurality of songs;
and the training module 307 is configured to train a preset initial model by using the preset training data to obtain a Transformer model.
Optionally, the training module 307 includes:
a score obtaining unit 3071 for obtaining a plurality of digital scores from preset training data, wherein each digital score is used for indicating lyrics, tone, chord, pitch and duration of a song;
an information obtaining unit 3072 for extracting key information, chord information, pitch information and duration information of the song from the digital music score;
a sequence generating unit 3073, configured to sequentially generate a tone vector T, a chord vector C, a pitch vector P, and a duration vector D of a song according to the tone information, the chord information, the pitch information, and the duration information, and combine the tone vector T, the chord vector C, the pitch vector P, and the duration vector D of a plurality of songs respectively to obtain a tone vector sequence T, a chord vector sequence C, a pitch vector sequence P, and a duration vector sequence D;
and the model training unit 3074 is configured to train the preset initial model by using the tone vector sequence T, the chord vector sequence C, the pitch vector sequence P, and the duration vector sequence D to obtain the Transformer model.
Optionally, the model training unit 3074 includes:
a first combining subunit 30741, configured to combine the tone vector sequence T = [t1, t2, t3, …, tn] and the chord vector sequence C = [c1, c2, c3, …, cn] to obtain a first high-dimensional vector Input, where Input = [[t1, c1], [t2, c2], [t3, c3], …, [tn, cn]];
a second combining subunit 30742, configured to combine the pitch vector sequence P = [p1, p2, p3, …, pn] and the duration vector sequence D = [d1, d2, d3, …, dn] to obtain a second high-dimensional vector Output, where Output = [[p1, d1], [p2, d2], [p3, d3], …, [pn, dn]];
The training subunit 30743 is configured to train the preset initial model to obtain the Transformer model, where the first high-dimensional vector Input is used as an Input vector, and the second high-dimensional vector Output is used as an Output vector.
Optionally, the information obtaining unit 3072 is specifically configured to:
converting a plurality of digital music scores into an XML format, wherein each digital music score corresponds to one song; reading lyric information from the digital music score in the XML format to obtain song Chinese characters in each digital music score; determining tone information and pitch information corresponding to each word in each song according to the lyric Chinese characters in each digital music score; determining corresponding chord information and time value information according to each digital music score; tone information, chord information, pitch information, and duration information corresponding to the plurality of songs are generated.
Optionally, the training subunit 30743 is specifically configured to:
inputting the first high-dimensional vector Input into the encoder at the head end of the encoding assembly in the preset initial model, wherein the encoding assembly comprises a plurality of sequentially connected encoders, and each encoder comprises a multi-head attention layer and a feedforward network layer; inputting the result output by the encoder at the tail end of the encoding assembly into each multi-head attention layer in the decoding assembly, wherein the decoding assembly comprises a plurality of sequentially connected decoders, and each decoder comprises a mask multi-head attention layer, a multi-head attention layer and a feedforward network layer; inputting the second high-dimensional vector Output into the mask multi-head attention layer of the decoder at the head end of the decoding assembly in the preset initial model; and inputting the output result of the decoder at the tail end of the decoding assembly into a linear network to obtain the Transformer model.
Optionally, the second generating module 305 is specifically configured to:
aligning the target lyric pitch vector with the target lyric time value vector; generating a melody line of the target lyric based on the target lyric pitch vector; generating the beat of the target lyric according to the target lyric time value vector; and generating a target song melody according to the melody line of the target lyric and the beat of the target lyric.
According to the embodiment of the invention, the initial Transformer model is trained on chords, pitches, tones and duration values, and the trained Transformer model is then used to generate the target song melody, which improves the quality, audibility and generation efficiency of the song melody. The scheme can also be applied to the smart education field to promote the construction of smart cities.
Figs. 3 to 4 above describe the chord-based song melody generating device in the embodiment of the present invention in detail from the perspective of modular functional entities; the chord-based song melody generating apparatus in the embodiment of the present invention is described below in detail from the perspective of hardware processing.
Fig. 5 is a schematic structural diagram of a chord-based song melody generating apparatus 500 according to an embodiment of the present invention, which may have a relatively large difference due to different configurations or performances, and may include one or more processors (CPUs) 510 (e.g., one or more processors) and a memory 520, one or more storage media 530 (e.g., one or more mass storage devices) storing applications 533 or data 532. Memory 520 and storage media 530 may be, among other things, transient or persistent storage. The program stored in the storage medium 530 may include one or more modules (not shown), each of which may include a series of instruction operations in the chord-based song melody generating apparatus 500. Still further, the processor 510 may be configured to communicate with the storage medium 530 and execute a series of instruction operations in the storage medium 530 on the chord-based song melody generating device 500.
The chord-based song melody generating device 500 may further include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input-output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, etc. Those skilled in the art will appreciate that the configuration shown in fig. 5 does not constitute a limitation of the chord-based song melody generating device, which may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, and which may also be a volatile computer-readable storage medium, having stored therein instructions, which, when run on a computer, cause the computer to perform the steps of the chord-based song melody generating method.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A chord-based song melody generating method, comprising:
acquiring target lyrics input by a user and target chords preselected by the user;
generating a target tone vector according to the target lyrics and generating a target chord vector according to the target chord;
combining the target key vector and the target chord vector to generate a target key chord vector;
inputting the target tone chord vector into a trained Transformer model to obtain an output characteristic vector, wherein the output characteristic vector comprises a target lyric pitch vector and a target lyric duration vector;
and generating a target song melody according to the target lyric pitch vector and the target lyric time value vector.
2. The chord-based song melody generating method of claim 1, wherein before the obtaining of the target lyrics inputted by the user and the target chord pre-selected by the user, the chord-based song melody generating method further comprises:
acquiring preset training data, wherein the preset training data comprises a digital music score of a plurality of songs;
and training a preset initial model by using the preset training data to obtain a Transformer model.
3. The method of claim 2, wherein the training a preset initial model with the preset training data to obtain a Transformer model comprises:
obtaining a plurality of digital music scores from preset training data, wherein each digital music score is used for indicating the lyrics, the tone, the chord, the pitch and the duration of a song;
extracting tone information, chord information, pitch information and duration information of the songs from the digital music score;
sequentially generating a tone vector T, a chord vector C, a pitch vector P and a duration vector D of the song according to the tone information, the chord information, the pitch information and the duration information, and respectively combining the tone vector T, the chord vector C, the pitch vector P and the duration vector D of a plurality of songs to obtain a tone vector sequence T, a chord vector sequence C, a pitch vector sequence P and a duration vector sequence D;
and training the preset initial model by using the tone vector sequence T, the chord vector sequence C, the pitch vector sequence P and the duration vector sequence D to obtain the Transformer model.
4. The method of claim 3, wherein the training the preset initial model with the tone vector sequence T, the chord vector sequence C, the pitch vector sequence P, and the duration vector sequence D to obtain the Transformer model comprises:
combining the tone vector sequence T = [t1, t2, t3, …, tn] and the chord vector sequence C = [c1, c2, c3, …, cn] to obtain a first high-dimensional vector Input, where Input = [[t1, c1], [t2, c2], [t3, c3], …, [tn, cn]];
combining the pitch vector sequence P = [p1, p2, p3, …, pn] and the duration vector sequence D = [d1, d2, d3, …, dn] to obtain a second high-dimensional vector Output, where Output = [[p1, d1], [p2, d2], [p3, d3], …, [pn, dn]];
And taking the first high-dimensional vector Input as an Input vector and the second high-dimensional vector Output as an Output vector, and training the preset initial model to obtain the Transformer model.
5. The chord-based song melody generating method of claim 3, wherein the extracting of the tone information, chord information, pitch information, and duration information of the song from the digital music score comprises:
converting a plurality of digital music scores into an XML format, wherein each digital music score corresponds to one song;
reading lyric information from the digital music score in the XML format to obtain song Chinese characters in each digital music score;
determining tone information and pitch information corresponding to each word in each song according to the lyric Chinese characters in each digital music score;
determining corresponding chord information and time value information according to each digital music score;
tone information, chord information, pitch information, and duration information corresponding to the plurality of songs are generated.
6. The method of claim 4, wherein the training the preset initial model to obtain the Transformer model by using the first high-dimensional vector Input as an input vector and the second high-dimensional vector Output as an output vector comprises:
inputting the first high-dimensional vector Input into an encoder at the head end of an encoding assembly in a preset initial model, wherein the encoding assembly comprises a plurality of encoders which are connected in sequence, and each encoder comprises a multi-head attention layer and a feedforward network layer;
inputting a result output by an encoder at the tail end in an encoding assembly into each multi-head attention layer in a decoding assembly, wherein the decoding assembly comprises a plurality of decoders which are connected in sequence, and each decoder comprises a mask multi-head attention layer, a multi-head attention layer and a feedforward network layer;
inputting the second high-dimensional vector Output into the mask multi-head attention layer of the decoder at the head end of the decoding assembly in the preset initial model;
and inputting the output result of the decoder at the tail end of the decoding assembly into a linear network to obtain the Transformer model.
7. The chord-based song melody generating method of any one of claims 1 to 6, wherein the generating a target song melody according to the target lyric pitch vector and the target lyric duration vector comprises:
aligning the target lyric pitch vector with the target lyric time value vector;
generating a melody line of the target lyric based on the target lyric pitch vector;
generating the beat of the target lyric according to the target lyric time value vector;
and generating a target song melody according to the melody line of the target lyric and the beat of the target lyric.
8. A chord-based song melody generating apparatus, comprising:
the obtaining module is used for obtaining target lyrics input by a user and target chords selected by the user in advance;
the first generation module is used for generating a target tone vector according to the target lyrics and generating a target chord vector according to the target chord;
a combination module for combining the target tone vector and the target chord vector to generate a target tone chord vector;
the input module is used for inputting the target tone chord vector into a trained Transformer model to obtain an output feature vector, and the output feature vector comprises a target lyric pitch vector and a target lyric duration vector;
and the second generation module is used for generating the target song melody according to the target lyric pitch vector and the target lyric duration vector.
9. A chord-based song melody generating apparatus, characterized in that the chord-based song melody generating apparatus comprises: a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor invokes the instructions in the memory to cause the chord-based song melody generating device to perform the chord-based song melody generating method of any of claims 1-7.
10. A computer-readable storage medium storing instructions which, when executed by a processor, implement the chord-based song melody generating method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110285841.1A CN113035161B (en) | 2021-03-17 | 2021-03-17 | Song melody generation method, device and equipment based on chord and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113035161A true CN113035161A (en) | 2021-06-25 |
CN113035161B CN113035161B (en) | 2024-08-20 |
Family
ID=76471283
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110285841.1A Active CN113035161B (en) | 2021-03-17 | 2021-03-17 | Song melody generation method, device and equipment based on chord and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113035161B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113920969A (en) * | 2021-10-09 | 2022-01-11 | 北京灵动音科技有限公司 | Information processing method, information processing device, electronic equipment and storage medium |
CN116343723A (en) * | 2023-03-17 | 2023-06-27 | 广州趣研网络科技有限公司 | Melody generation method and device, storage medium and computer equipment |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004302318A (en) * | 2003-03-31 | 2004-10-28 | Doshisha | System, apparatus, and method for music data generation |
CN103902642A (en) * | 2012-12-21 | 2014-07-02 | 香港科技大学 | Music composition system using correlation between melody and lyrics |
JP2014170146A (en) * | 2013-03-05 | 2014-09-18 | Univ Of Tokyo | Method and device for automatically composing chorus from japanese lyrics |
KR101554662B1 (en) * | 2014-04-29 | 2015-09-21 | 김명구 | Method for providing chord for digital audio data and an user terminal thereof |
CN109166564A (en) * | 2018-07-19 | 2019-01-08 | 平安科技(深圳)有限公司 | For the method, apparatus and computer readable storage medium of lyrics text generation melody |
CN109859739A (en) * | 2019-01-04 | 2019-06-07 | 平安科技(深圳)有限公司 | Melody generation method, device and terminal device based on speech synthesis |
CN112309353A (en) * | 2020-10-30 | 2021-02-02 | 北京有竹居网络技术有限公司 | Composing method and device, electronic equipment and storage medium |
CN112489606A (en) * | 2020-11-26 | 2021-03-12 | 北京有竹居网络技术有限公司 | Melody generation method, device, readable medium and electronic equipment |
Non-Patent Citations (1)
Title |
---|
KYOYUN CHOI et al.: "Chord Conditioned Melody Generation With Transformer Based Decoders", IEEE Access, vol. 9, pages 42071, XP011844616, DOI: 10.1109/ACCESS.2021.3065831 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113920969A (en) * | 2021-10-09 | 2022-01-11 | 北京灵动音科技有限公司 | Information processing method, information processing device, electronic equipment and storage medium |
CN116343723A (en) * | 2023-03-17 | 2023-06-27 | 广州趣研网络科技有限公司 | Melody generation method and device, storage medium and computer equipment |
CN116343723B (en) * | 2023-03-17 | 2024-02-06 | 广州趣研网络科技有限公司 | Melody generation method and device, storage medium and computer equipment |
Also Published As
Publication number | Publication date |
---|---|
CN113035161B (en) | 2024-08-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Petermann | The musical novel: Imitation of musical structure, performance, and reception in contemporary fiction | |
US7696426B2 (en) | Recombinant music composition algorithm and method of using the same | |
CN111554255B (en) | MIDI playing style automatic conversion system based on recurrent neural network | |
KR102367772B1 (en) | Method and Apparatus for Generating Music Based on Deep Learning | |
CN113035161B (en) | Song melody generation method, device and equipment based on chord and storage medium | |
Swain | Harmonic rhythm: Analysis and interpretation | |
CN113010730A (en) | Music file generation method, device, equipment and storage medium | |
CN108922505B (en) | Information processing method and device | |
Wu et al. | MelodyGLM: multi-task pre-training for symbolic melody generation | |
CN106898341B (en) | Personalized music generation method and device based on common semantic space | |
Fifield | The German symphony between Beethoven and Brahms: the fall and rise of a genre | |
Dubnov et al. | Deep and shallow: Machine learning in music and audio | |
CN116052621A (en) | Music creation auxiliary method based on language model | |
Jensen | Evolutionary music composition: A quantitative approach | |
Madhumani et al. | Automatic neural lyrics and melody composition | |
CN111627410B (en) | MIDI multi-track sequence representation method and application | |
Wu et al. | MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing | |
KR102227415B1 (en) | System, device, and method to generate polyphonic music | |
CN113096624A (en) | Method, device, equipment and storage medium for automatically creating symphony music | |
Oura et al. | Parsing and memorizing tonal and modal melodies | |
Wentink | Creating and evaluating a lyrics generator specialized in rap lyrics with a high rhyme density | |
JP3571925B2 (en) | Voice information processing device | |
CN113053355B (en) | Human voice synthesizing method, device, equipment and storage medium for Buddha music | |
Sela | Giovanni Bassano's divisions: a computational approach to analyzing the gap between theory and practice | |
Dai et al. | An Efficient AI Music Generation mobile platform Based on Machine Learning and ANN Network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |
GR01 | Patent grant | | |