WO2020015153A1 - Method and device for generating music for lyrics text, and computer-readable storage medium - Google Patents


Info

Publication number
WO2020015153A1
WO2020015153A1 (PCT/CN2018/106267)
Authority
WO
WIPO (PCT)
Prior art keywords
lyrics
feature
text
melody
rhythm
Prior art date
Application number
PCT/CN2018/106267
Other languages
French (fr)
Chinese (zh)
Inventor
刘奡智
王健宗
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020015153A1

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 - Details of electrophonic musical instruments
    • G10H1/0008 - Associated control or indicating means
    • G10H1/0025 - Automatic or semi-automatic music composition, e.g. producing random music, applying rules from music theory or modifying a musical piece
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/243 - Classification techniques relating to the number of classes
    • G06F18/24323 - Tree-organised classifiers
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 - Details of electrophonic musical instruments
    • G10H1/36 - Accompaniment arrangements
    • G10H1/40 - Rhythm
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 - Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/101 - Music Composition or musical creation; Tools or processes therefor
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 - Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/101 - Music Composition or musical creation; Tools or processes therefor
    • G10H2210/111 - Automatic composing, i.e. using predefined musical rules
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 - Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/341 - Rhythm pattern selection, synthesis or composition
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00 - Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/121 - Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00 - Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/121 - Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods
    • G10H2240/131 - Library retrieval, i.e. searching a database or selecting a specific musical piece, segment, pattern, rule or parameter set
    • G10H2240/141 - Library retrieval matching, i.e. any of the steps of matching an inputted segment or phrase with musical database contents, e.g. query by humming, singing or playing; the steps may include, e.g. musical analysis of the input, musical feature extraction, query formulation, or details of the retrieval process

Definitions

  • the present disclosure relates to the field of Internet technologies, and in particular, to a method, an apparatus, and a computer-readable storage medium for generating music for lyrics text.
  • an object of the present application is to provide a method, an apparatus, and a computer-readable storage medium for generating music for lyrics text.
  • a method for generating music for lyrics text based on a random forest includes: obtaining lyrics text, the lyrics text being a sequence of several words; performing feature extraction on the lyrics text to obtain the text features mapped from the sequence; performing feature matching between the text features and the lyrics features in a corpus to obtain the lyrics features corresponding to the text features; and predicting, through a trained random forest classifier, the melody and rhythm corresponding to the words in the sequence from the obtained lyrics features, so as to generate music data adapted to the lyrics text.
  • an apparatus for generating music for lyrics text based on a random forest includes: an acquisition module configured to obtain lyrics text, the lyrics text being a sequence of several words; a text feature extraction module configured to perform feature extraction on the lyrics text to obtain the text features mapped from the sequence; a feature matching module configured to perform feature matching between the text features and the lyrics features in a corpus to obtain the lyrics features corresponding to the text features; and a music data generation module configured to predict, through a trained random forest classifier, the melody and rhythm corresponding to the words in the sequence from the obtained lyrics features, so as to generate music data adapted to the lyrics text.
  • an apparatus for generating music for lyrics text based on a random forest includes a processor and a memory for storing processor-executable instructions, wherein the processor is configured to execute the method for generating music for lyrics text described above.
  • a computer-readable storage medium has stored thereon a computer program that, when executed by a processor, implements the method for generating a musical composition for lyrics text as described above.
  • the music data corresponding to the lyrics text is automatically generated from the obtained lyrics text, so a user who has not mastered professional music knowledge can still compose music according to the lyrics text; the general public can thus use this disclosure to automatically generate music from lyrics text.
  • Fig. 1 is a schematic diagram of an implementation environment, according to an exemplary embodiment of the present disclosure.
  • Fig. 2 is a block diagram of a device 200 according to an exemplary embodiment.
  • Fig. 3 is a flow chart showing a method for generating music for lyrics text according to an exemplary embodiment.
  • FIG. 4 is a flowchart of steps before step S110 of the embodiment shown in FIG. 3 in an exemplary embodiment.
  • FIG. 5 is a flowchart of an exemplary embodiment of step S170 of the embodiment shown in FIG. 3.
  • FIG. 6 is a flowchart of step S175 in the embodiment shown in FIG. 5 in an exemplary embodiment.
  • FIG. 7 is a flowchart of step S303 in the embodiment shown in FIG. 6 in an exemplary embodiment.
  • Fig. 8 is a block diagram of a device for generating music for lyrics text according to an exemplary embodiment.
  • Fig. 9 is a block diagram of a device for generating music for lyrics text according to another exemplary embodiment.
  • FIG. 10 is a block diagram of a module 170 of the embodiment shown in FIG. 8 in an exemplary embodiment.
  • FIG. 11 is a block diagram of a musical piece data generating unit 175 of the embodiment shown in FIG. 10 in an exemplary embodiment.
  • FIG. 12 is a block diagram of the note information combining unit 303 of the embodiment shown in FIG. 11 in an exemplary embodiment.
  • Fig. 1 is a schematic diagram of an implementation environment, according to an exemplary embodiment of the present disclosure.
  • the implementation environment comprises a terminal 100 and a server 200 that establish a network communication connection, where the server 200 serves as the back end that generates music for lyrics text according to the present disclosure.
  • the terminal 100 may be a computer, a smart phone, or another communication device that runs a client for generating music for lyrics text and has a network connection function, which is not limited herein.
  • the terminal 100 can initiate a request to generate music data and provide the lyrics text, so that the server 200 receives the request initiated by the terminal 100, generates music data for the lyrics text provided by the terminal 100, and then outputs the generated music data to the terminal 100.
  • the server 200 may be a web server or an APP server.
  • Fig. 2 is a block diagram showing a hardware structure of a server according to an exemplary embodiment.
  • the server can be used to generate music data for lyrics text and deployed in the implementation environment shown in FIG. 1.
  • the server 200 is only an example adapted to the present disclosure and cannot be considered to limit the scope of use of the present disclosure in any way.
  • nor should the server 200 be interpreted as needing to rely on, or necessarily include, one or more of the components shown in FIG. 2.
  • the server 200 includes: a power supply 210, an interface 230, at least one memory 250, and at least one processor (CPU) 270.
  • the power supply 210 is used to provide working voltages for each hardware device on the server 200.
  • the interface 230 includes at least one wired or wireless network interface 231, at least one serial-to-parallel conversion interface 233, at least one input-output interface 235, and at least one USB interface 237, etc., for communicating with external devices.
  • for example, communication with the terminal 100 in the implementation environment of FIG. 1 may be carried out through the wireless network interface 231.
  • the memory 250 serves as a carrier for resource storage, and may be a read-only memory, a random access memory, a magnetic disk, or an optical disk.
  • the resources stored on the memory 250 include an operating system 251, an application program 253, and data 255.
  • the storage method may be temporary storage or permanent storage.
  • the operating system 251 is used to manage and control various hardware devices and application programs 253 on the server 200 to implement the calculation and processing of massive data 255 by the processor 270, which may be Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, etc.
  • the application program 253 is a computer program that completes at least one specific task based on the operating system 251. It may include at least one module (not shown in FIG. 2), and each module may contain a series of computer-readable instructions for the server 200.
  • the data 255 may be photos, pictures, and the like stored on the disk.
  • the processor 270 may include one or more processors, and is configured to communicate with the memory 250 through a bus, for calculating and processing the massive data 255 in the memory 250. As described in detail above, the server 200 to which the present disclosure is applied will complete the generation of music for the lyrics text by the processor 270 reading a series of computer-readable instructions stored in the memory 250.
  • the present disclosure can also be implemented by a hardware circuit, or by a hardware circuit in combination with software. Therefore, the implementation of the present disclosure is not limited to any specific hardware circuit, software, or combination of the two.
  • Fig. 3 is a flowchart illustrating a method for generating music for lyrics text according to an exemplary embodiment. As shown in Fig. 3, the method in this embodiment includes:
  • step S110 the lyrics text is obtained, and the lyrics text is a sequence composed of several words.
  • the word is the smallest unit in the lyrics text.
  • the language of the lyrics text is not limited: in Chinese text the characters serve as the words of the lyrics text, and in English text the words of the text serve as the words of the lyrics text, which is not limited here.
  • the obtained lyrics text may be the lyrics text input by the user in the interactive interface.
  • the length of the lyrics text is not limited, and it can be a sentence or a paragraph of text.
  • the interactive interface may also display lyrics text recommended by the server. The user may select the recommended lyrics text on the interactive interface, and the terminal inputs the selected lyrics text to the server according to the user's selection operation.
  • the server also provides an option for adjusting the number of words in the lyrics text, so that the user can enter text and adjust the number of words in the text.
  • Step S130 Perform feature extraction on the lyrics text to obtain the text features of the sequence map.
  • Feature extraction of the lyrics text means obtaining the syllable information corresponding to the lyrics text from the lyrics text, that is, reflecting the syllable information of the lyrics text with the text features, wherein each word in the lyrics text corresponds to a syllable.
  • the text features reflecting the syllable information of the lyrics text include, but are not limited to, the syllable type, the number of syllables, the word frequency, and the word rareness of each word in the lyrics text.
  • the syllable type refers to the classification of the syllable corresponding to the word in the lyrics, which can be: single syllable, start syllable, central syllable, end syllable, and the like.
  • the number of syllables refers to the number of syllables contained in a word.
  • the word frequency refers to the frequency with which a word occurs in the lyrics text; the word rareness is a function of the word frequency.
  • the lyrics text uniquely corresponds to the text feature, that is, the extracted text feature is a sequence mapped text feature, so that the lyrics text can be described by the syllable information reflected by the text feature.
  • text feature extraction can be performed by a Python program, in which the methods for extracting text features such as the syllable type, word frequency, and word rareness of each word in the lyrics text are set in a pre-written program.
  • text feature extraction by deep neural networks is also applicable to the present disclosure, which is not limited herein.
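As a rough illustration of the extraction step, the sketch below computes per-word features for an English lyrics line in plain Python. The vowel-run syllable counter and the negative-log rareness function are assumptions for illustration; the disclosure only states that rareness is some function of word frequency.

```python
from collections import Counter
import math

def extract_text_features(lyrics):
    """Sketch of per-word feature extraction for an English lyrics line."""
    words = lyrics.lower().split()
    counts = Counter(words)
    total = len(words)
    features = []
    for w in words:
        # Crude syllable count: number of runs of vowels in the word (assumption).
        syllables = sum(1 for i, c in enumerate(w)
                        if c in "aeiou" and (i == 0 or w[i - 1] not in "aeiou"))
        word_freq = counts[w] / total
        features.append({
            "word": w,
            "num_syllables": max(1, syllables),
            "word_freq": word_freq,
            "rareness": -math.log(word_freq),  # assumed rareness function
        })
    return features

feats = extract_text_features("twinkle twinkle little star")
```

Since "twinkle" appears twice among four words, its word frequency is 0.5 and its rareness is correspondingly low.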
  • Step S150 Perform feature matching between the text features and the lyrics features in the corpus to obtain the lyrics features corresponding to the text features.
  • the corpus is composed of music files of several songs.
  • the music files include the lyrics features, rhythm features and melody features extracted for each song.
  • the music file exists in XML format in the corpus. It is worth noting that the corpus has been constructed before text feature matching. The detailed construction of the corpus is described below.
  • the lyrics features similar to the text features, that is, the lyrics features corresponding to the text features, can be matched from the corpus, so that in the subsequent steps the rhythm and melody are predicted according to the matched lyrics features.
  • the matching between the text features and the lyrics features in the corpus includes matching on the syllable type, number of syllables, word frequency, and word rareness, so that lyrics features similar to those of the lyrics text in these respects can be obtained from the corpus.
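The matching step can be sketched as a nearest-neighbour search over the corpus. The specific distance function below (same syllable type required, Euclidean distance over the numeric features) is an assumption; the text only states which features are matched.

```python
def match_lyrics_feature(text_feature, corpus_features):
    """Match one word's text feature against lyrics features in the corpus."""
    def distance(a, b):
        if a["syllable_type"] != b["syllable_type"]:
            return float("inf")  # different syllable types never match (assumption)
        keys = ("num_syllables", "word_freq", "rareness")
        return sum((a[k] - b[k]) ** 2 for k in keys) ** 0.5
    # Return the corpus entry closest to the query feature.
    return min(corpus_features, key=lambda c: distance(text_feature, c))

corpus = [
    {"syllable_type": "single", "num_syllables": 1, "word_freq": 0.2, "rareness": 1.6},
    {"syllable_type": "single", "num_syllables": 2, "word_freq": 0.5, "rareness": 0.7},
]
query = {"syllable_type": "single", "num_syllables": 2, "word_freq": 0.4, "rareness": 0.9}
best = match_lyrics_feature(query, corpus)
```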
  • step S170 the obtained lyrics feature is used to predict the melody and rhythm corresponding to the words in the sequence through the trained random forest classifier to generate music data adapted to the lyrics text.
  • the duration of the notes in the composition of the music is the rhythm of the composition.
  • Rhythm characteristics are used to reflect the duration information of the notes, such as the start of the note and the duration.
  • the rhythm features may include: a beat signature, an offset, a measurement offset, a time value, and the like.
  • the beat signature refers to the time signature of the notes in the music corresponding to the lyrics;
  • the offset refers to the number of beats before the start of the music;
  • the measurement offset refers to the number of beats before the start of the measure;
  • the time value refers to the duration of the note corresponding to a word in the music.
  • the pitch of each note in the composition constitutes the melody of the composition.
  • Melody features are used to reflect the pitch information of a note.
  • the melody features may include: pitch symbols, tone levels, temporary marks, weak beats, and the like. Among them, the pitch symbol (tonality symbol) refers to the key symbol corresponding to a word in the lyrics; the tone level refers to the pitch level corresponding to a word in the lyrics; the temporary mark refers to the accidental placed directly before a note; the weak beat refers to a beat unit without a strong accent.
  • Lyric characteristics may include: syllable type, number of syllables, word frequency, and word rareness of each word in the lyrics text.
  • rhythm feature and melody feature may further include features other than the features listed above, or a combination of some features listed above and other features not listed above.
  • the lyrics features in the corpus correspond to rhythm features and melody features. Therefore, after the text features of the lyrics text are matched with the lyrics features in the corpus, the corresponding melody and rhythm can be predicted from the lyrics features obtained for the text features: the rhythm and melody features are predicted from those lyrics features, the rhythm and melody are generated from the predicted features, and the music data is then generated.
  • the random forest classifier is a combination of multiple decision trees constructed from features, where each feature constitutes a node of a decision tree. The random forest classifier can therefore combine features non-linearly and does not require a large amount of sample data for training, so the classifier can be trained without a large amount of sample data while the quality of the generated music is ensured.
  • the prediction is performed according to the characteristics of each node of the random forest classifier.
  • taking rhythm prediction from the lyrics features as an example: the input of the random forest classifier is the lyrics features, and the output is the rhythm features.
  • the decision trees used for rhythm prediction are constructed from the lyrics features in the corpus; that is, the lyrics features in the corpus determine the feature and the judgment condition at each node of the random forest classifier. For example, the number of syllables, syllable type, word frequency, and word rareness may be set from top to bottom as the nodes of a decision tree of the random forest classifier.
  • the output is then determined according to the judgment condition at each node; for example, at the number-of-syllables node: if the number of syllables ≤ 3, the corresponding time value is output; if the number of syllables > 3, the output time value is judged according to the syllable type.
  • judgment conditions and condition-dependent outputs are likewise set on the syllable type, word frequency, and word rareness, so that a time value can be output from the lyrics features.
  • other rhythm characteristics can be predicted by similar decision trees.
  • the rhythm feature corresponding to the lyrics feature can be obtained through the lyrics feature corresponding to the text feature and the random forest classifier prediction, and then the rhythm corresponding to the lyrics text can be obtained.
  • a similar method can be used to predict the melody features corresponding to the lyrics features that were matched from the text features, thereby achieving melody prediction. The rhythm and melody corresponding to the lyrics text are then formed from the predicted rhythm features and melody features, and the music data corresponding to the lyrics text is generated.
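The node-by-node judgment described above can be sketched as a single hand-written decision path; in the actual classifier such paths are learned from the corpus, and the concrete time values below are illustrative assumptions.

```python
def predict_time_value(lyrics_feature):
    """Hand-built sketch of one decision path: the number-of-syllables node
    outputs a time value directly when the count is <= 3, and otherwise
    defers to the syllable-type node."""
    if lyrics_feature["num_syllables"] <= 3:
        return 1.0   # e.g. one beat (assumed value)
    if lyrics_feature["syllable_type"] == "start":
        return 0.5   # assumed: a shorter note for a start syllable
    return 0.25

short_word = {"num_syllables": 2, "syllable_type": "single"}
long_word = {"num_syllables": 5, "syllable_type": "start"}
```

A full rhythm classifier would aggregate many such trees, each voting on the output feature.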
  • the generated music data may be in an XML format or a MIDI format, and is not specifically limited herein.
  • in this way, the music data corresponding to the lyrics text is automatically generated from the obtained lyrics text, and music composition is performed according to the lyrics text without the user needing to master professional music knowledge, so the general public can also compose music according to the present disclosure.
  • FIG. 4 is a flowchart of steps, in an exemplary embodiment, performed before step S110 in the embodiment shown in FIG. 3. As shown in FIG. 4, before step S110, the method in this embodiment further includes:
  • Step S210: extract lyrics features from the lyrics sample text in the sample data, and extract rhythm features and melody features from the music data corresponding to the lyrics sample text.
  • the sample data consists of several collected songs used to train the random forest classifier, where each collected song includes lyrics and music.
  • the lyrics sample text is the lyrics of the song in the sample data.
  • the lyrics feature is used to describe the lyrics of the sample data
  • the rhythm feature and melody feature are used to describe the rhythm and melody of the sample data, respectively.
  • the lyrics features reflect the syllable information of each word in the lyrics sample text, the rhythm features reflect the time-value information of the notes in the music, and the melody features reflect the pitch information of the notes in the music.
  • each syllable corresponds to a note; that is, each word in the lyrics corresponds to a note.
  • the sample data used is a song with a single track and a single instrument, so that in the sample data, each word of the lyrics uniquely corresponds to a note.
  • the extracted lyrics features may include: syllable type, number of syllables, word frequency, word rareness, and the like.
  • the extracted rhythm features may include features such as beat signatures, offsets, measurement offsets, and time values.
  • the extracted melody features may include: pitch symbols, tone levels, temporary marks, weak beats, and the like.
  • the specific categories of lyrics features, rhythm features, and melody features shown above are only examples suitable for the present disclosure and cannot be considered to limit the scope of use of the present disclosure in any way; nor can they be interpreted as requiring that all, or only, the specific lyrics features, rhythm features, and melody features in the above examples be extracted to implement the present disclosure.
  • features that are more or less than the lyrics features, rhythm features, and melody features of the specific categories listed above may be extracted to implement the present disclosure.
  • the more fully the extracted lyrics features, rhythm features, and melody features describe the song lyrics and the corresponding music, the better the accuracy of the random forest classifier, and the higher the accuracy of the music data automatically generated for the lyrics text.
  • the way of extracting lyrics features, rhythm features, and melody features may be to extract each feature (lyric features, rhythm features, and melody features) through a deep neural network.
  • each specific lyrics feature, rhythm feature, and melody feature may be extracted by a Python programming method. The method for extracting features is not limited herein.
  • step S230 a corpus is constructed from the lyrics features, rhythm features and melody features.
  • the extracted lyrics features, rhythm features, and melody features are used as a corpus of the random forest classifier, so that the random forest classifier is trained according to the corpus, and the rhythm and melody corresponding to the lyrics text are predicted after the training is completed.
  • the corpus may also include the sample text of the lyrics and the corresponding music data.
  • a corpus is constructed by extracting the lyrics features, rhythm features, and melody features of 24 single-track, single-instrument popular songs.
  • the corpus includes 59 features and has 12,358 observations.
  • the observed value refers to a value corresponding to a specific feature, for example, a feature of a syllable type, and the observed value for the feature may be a single syllable, a start syllable, a central syllable, and an end syllable.
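Corpus construction as described, with one observation (row) per word joining that word's lyrics, rhythm, and melody features, might be sketched as follows. The feature names are illustrative, and joining by index relies on the stated one-word-one-note property of single-track, single-instrument songs.

```python
def build_corpus(songs):
    """One observation per word: merge its lyrics, rhythm, and melody features."""
    corpus = []
    for song in songs:
        for lyr, rhy, mel in zip(song["lyrics_features"],
                                 song["rhythm_features"],
                                 song["melody_features"]):
            row = {}
            row.update(lyr)   # e.g. syllable and frequency features
            row.update(rhy)   # e.g. time value, offset
            row.update(mel)   # e.g. pitch, weak-beat flag
            corpus.append(row)
    return corpus

songs = [{
    "lyrics_features": [{"num_syllables": 1, "word_freq": 0.5}],
    "rhythm_features": [{"time_value": 1.0, "offset": 0}],
    "melody_features": [{"pitch": "C4", "weak_beat": False}],
}]
corpus_rows = build_corpus(songs)
```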
  • Step S250: iterative training of the random forest classifier is performed using the lyrics features, rhythm features, and melody features until the trained random forest classifier predicts the melody and rhythm of known song texts with a specified accuracy, at which point the iterative training of the random forest classifier is stopped.
  • in training, the rhythm features and melody features that the random forest classifier predicts from the lyrics features are compared with the actual rhythm features and melody features corresponding to those lyrics features. If they differ, the parameters of the random forest classifier are adjusted, the lyrics features are input again into the parameter-adjusted classifier, and the newly output rhythm and melody features are compared with the actual ones; these steps are repeated until they match, after which the next set of lyrics features in the corpus is used to train the classifier. This process is the iterative training of the random forest classifier; that is, the classifier is trained iteratively so that the trained classifier can predict melody and rhythm.
  • after the random forest classifier has been trained for a period of time, it is evaluated; that is, the prediction accuracy of the random forest classifier is assessed.
  • specifically, after the random forest classifier has been trained with the lyrics features, rhythm features, and melody features of several sample data, the classifier is evaluated.
  • the evaluation process is: the lyrics features of a song whose music already exists are input, and the random forest classifier predicts and outputs the corresponding rhythm features and melody features.
  • the rhythm features and melody features obtained through the random forest are compared with the actual rhythm features and melody features of the song to calculate the accuracy of the random forest classifier.
  • the accuracies of the random forest classifier over the evaluated songs are averaged to obtain the overall accuracy of the random forest classifier. If the calculated accuracy reaches the specified accuracy, the training of the random forest classifier is complete; if it does not, the random forest classifier continues to be trained with the lyrics features, rhythm features, and melody features of the sample data.
  • the sample data used to evaluate the random forest classifier is different from the sample data used during training. For example, if the random forest classifier was trained with the lyrics features, rhythm features, and melody features of the song "Hurried That Year" in the sample data, then the features corresponding to that song cannot be used to evaluate the random forest classifier.
  • the random forest classifier includes a rhythm classifier and a melody classifier, and the evaluation can assess the rhythm classifier and the melody classifier separately, so that after evaluation the accuracies of the rhythm classifier and the melody classifier are obtained respectively.
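The per-song evaluation and averaging described above can be sketched as follows; the toy predictor standing in for a trained rhythm or melody classifier is an assumption for illustration.

```python
def evaluate_classifier(predict, held_out_songs, target_key):
    """Average per-song accuracy of `predict` against the actual features."""
    song_accuracies = []
    for song in held_out_songs:
        actual = song[target_key]
        correct = sum(1 for lyr, act in zip(song["lyrics_features"], actual)
                      if predict(lyr) == act)
        song_accuracies.append(correct / len(actual))
    # Average the per-song accuracies into one overall figure.
    return sum(song_accuracies) / len(song_accuracies)

# Toy stand-in predictor: one time value per syllable-count bucket.
toy_predict = lambda lyr: 1.0 if lyr["num_syllables"] <= 3 else 0.5
held_out = [{
    "lyrics_features": [{"num_syllables": 2}, {"num_syllables": 4}],
    "rhythm_features": [1.0, 0.25],
}]
acc = evaluate_classifier(toy_predict, held_out, "rhythm_features")
```

Here the predictor gets one of the two notes right, so the (single-song) average accuracy is 0.5; training would continue until the average reaches the specified accuracy.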
  • step S170 includes:
  • Step S171: the lyrics features are input to the rhythm classifier to predict the rhythm features corresponding to the lyrics features.
  • the rhythm classifier is a model that predicts rhythm characteristics by combining several decision trees.
  • the rhythm characteristics are predicted by the lyrics characteristics.
  • each node of the decision trees of the rhythm classifier is constructed from the lyrics features of the sample data. Because each lyrics feature in the corpus of the random forest classifier has a corresponding rhythm feature, the rhythm classifier can predict the rhythm features corresponding to the lyrics features; the obtained rhythm features may be the beat signature, offset, measurement offset, time value, and other features corresponding to the lyrics features, or combinations thereof.
  • step S173 the lyrics feature and rhythm feature are input to the melody classifier to predict and obtain the melody feature corresponding to the lyrics feature.
  • the melody classifier is a model that predicts melody characteristics by combining several decision trees.
  • the melody feature is predicted by the lyrics feature and the rhythm feature.
  • each node of the decision tree of the melody classifier is constructed from the lyrics feature and the rhythm feature of the sample data.
• The rhythm features and lyrics features are input into the melody classifier to predict the melody features corresponding to the lyrics features, such as pitch degree, weak beats, temporary marks (accidentals), or combinations thereof.
• Step S175: the obtained rhythm features and melody features are combined to generate music data adapted to the lyrics text.
• The duration and pitch corresponding to the syllable information of each word in the lyrics text are obtained, so that each word corresponds to a note; the notes corresponding to the words are then combined in the order of the words in the lyrics text, thereby obtaining the music data for the lyrics text.
• The generated music data may be in XML format or in MIDI format.
• In this embodiment, a rhythm feature is first obtained through the rhythm classifier, then a melody feature is obtained through the melody classifier, and finally the rhythm feature and the melody feature are combined to generate the music data for the lyrics text.
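The two-stage prediction chain described above (lyrics features to rhythm features, then lyrics plus rhythm features to melody features) might be sketched as follows, with both predictors replaced by toy stand-in rules:

```python
# Sketch of the chained prediction: lyrics -> rhythm, then
# (lyrics + rhythm) -> melody. Both `predict_*` functions are
# hypothetical stand-ins for the trained classifiers.

def predict_rhythm(lyrics_feat):
    # duration in quarter-note units, keyed off syllable count (assumed rule)
    return {"duration": 0.5 if lyrics_feat["syllables"] <= 1 else 1.0}

def predict_melody(lyrics_feat, rhythm_feat):
    # toy rule: longer notes get a higher pitch
    return {"pitch": "G4" if rhythm_feat["duration"] >= 1.0 else "E4"}

def word_to_note(lyrics_feat):
    """Combine the two predicted feature sets into one note for a word."""
    rhythm = predict_rhythm(lyrics_feat)
    melody = predict_melody(lyrics_feat, rhythm)
    return {**rhythm, **melody}

print(word_to_note({"syllables": 2}))  # → {'duration': 1.0, 'pitch': 'G4'}
```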
  • This method of generating music data is only an exemplary embodiment of step S170.
• Alternatively, the melody feature may be obtained first by inputting the lyrics feature into the melody classifier, after which the lyrics feature and the melody feature are input into the rhythm classifier to obtain the rhythm feature.
  • each node of the decision tree of the melody classifier in this embodiment is constructed from the lyrics features of the sample data
  • each node of the decision tree of the rhythm classifier is constructed from the lyrics features and melody features of the sample data.
• Whether to obtain the rhythm feature or the melody feature first can be determined according to the accuracy of the random forest classifier (including the rhythm classifier and the melody classifier) obtained in step S250. That is, if, after training, the accuracy of the rhythm classifier is higher than that of the melody classifier, the rhythm feature can be obtained first through the rhythm classifier and the melody feature then obtained through the melody classifier; if the accuracy of the rhythm classifier is lower than that of the melody classifier, the melody feature can be obtained first, followed by the rhythm feature, to generate the song data. In this way, predicting first with the classifier of higher accuracy can improve the accuracy of the overall prediction result. Of course, whether to obtain the rhythm feature or the melody feature first can also be decided from other angles.
  • FIG. 6 is a flowchart of an exemplary embodiment of step S175 in the embodiment shown in FIG. 5. As shown in FIG. 6, step S175 includes:
• Step S301: generate the note information corresponding to the words in the sequence from the obtained rhythm features and melody features.
• The obtained rhythm features and melody features are combined to obtain the duration information and pitch information for the words in the lyrics text, so that each word yields a note; that is, the note information corresponding to the words in the sequence is generated.
• When generating subsequent note information, the features of the previously generated notes are taken into account. For example, when generating one piece of note information, the features of the preceding 5 notes (such as duration and pitch) may be referenced, so that each word in the sequence generates its corresponding note information.
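A minimal sketch of conditioning each new note on the features of the preceding 5 notes follows; the averaging rule is an assumption standing in for the classifier's actual prediction:

```python
# Sketch: each new note sees a sliding window of the previous 5 notes.
# `make_note` is a hypothetical stand-in, not the trained predictor.

WINDOW = 5

def make_note(word_feat, history):
    recent = history[-WINDOW:]  # features of up to 5 preceding notes
    if recent:
        avg = sum(n["duration"] for n in recent) / len(recent)
    else:
        avg = 1.0  # default duration for the first note (assumed)
    # toy rule: multi-syllable words halve the smoothed duration
    duration = avg / 2 if word_feat["syllables"] > 1 else avg
    return {"duration": duration}

notes = []
for feat in [{"syllables": 1}, {"syllables": 1}, {"syllables": 2}]:
    notes.append(make_note(feat, notes))

print([n["duration"] for n in notes])  # → [1.0, 1.0, 0.5]
```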
  • step S303 the note information corresponding to the words in the sequence is combined to generate music data of the lyrics text.
  • the note information corresponding to the words in the lyrics text is combined to obtain music data of the lyrics text.
• FIG. 7 is a flowchart of an exemplary embodiment of step S303 in the embodiment shown in FIG. 6. As shown in FIG. 7, step S303 includes:
• Step S3031: combine the note information corresponding to the words in the sequence, in the order of the words in the sequence, to generate a note sequence corresponding to the lyrics text.
• The words in the lyrics text form an ordered sequence. After the note information corresponding to the words is generated according to the rhythm and melody features, the note information is combined in the order of the words in the lyrics text to obtain the note sequence corresponding to the lyrics text.
  • Step S3033 Filter the note sequence according to the set note threshold.
• Filtering a note sequence means removing certain notes from it, and the set note threshold may be a specific note value or a range of note values. For example, to ignore shorter notes (such as 1/16 notes), 1/16 notes can be set as the threshold; all notes other than 1/16 notes are then kept, yielding a new note sequence.
  • the note threshold can be adjusted according to actual needs.
  • the note threshold set for different lyrics texts can be different.
• For one piece of lyrics text the note threshold may be set to 1/64 notes, so that the 1/64 notes in its note sequence are removed, while for another piece of lyrics text the threshold may be set so that the 1/32 notes in its note sequence are filtered out.
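A minimal sketch of the threshold filter, assuming durations are stored in whole-note units and notes at or below the threshold are dropped:

```python
# Sketch: remove notes whose duration does not exceed the set threshold.
# The note dictionaries and the "duration" key are illustrative assumptions.

def filter_notes(note_sequence, threshold):
    """Keep notes whose duration (in whole-note units) exceeds the threshold."""
    return [n for n in note_sequence if n["duration"] > threshold]

seq = [
    {"pitch": "C4", "duration": 1 / 4},
    {"pitch": "D4", "duration": 1 / 16},
    {"pitch": "E4", "duration": 1 / 8},
]

filtered = filter_notes(seq, threshold=1 / 16)
print([n["pitch"] for n in filtered])  # → ['C4', 'E4']
```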
  • step S3035 the music data of the lyrics text is generated by the filtered note sequence.
• The following is a device embodiment of the present disclosure, which can be used to execute the embodiments of the method for generating music for lyrics text performed by the server 200 of the present disclosure.
• For details not disclosed in the device embodiment of the present disclosure, please refer to the embodiments of the method for generating music for lyrics text according to the present disclosure.
  • Fig. 8 is a block diagram of a device for generating music for lyrics text according to an exemplary embodiment.
• The apparatus for generating music for lyrics text can be used in the server 200 shown in FIG. 2 to perform all or part of the steps of the method for generating music for lyrics text shown in the method embodiments above.
  • the device includes, but is not limited to, an acquisition module 110, a text feature extraction module 130, a feature matching module 150, and a music data generation module 170.
• The acquisition module 110 is configured to acquire lyrics text, the lyrics text being a sequence of several words in order.
  • the text feature extraction module 130 is connected to the acquisition module 110 and is configured to perform feature extraction on the lyrics text to obtain the sequence-mapped text features.
  • a feature matching module 150 which is connected to the text feature extraction module 130 and is configured to perform feature matching between text features and lyrics features in the corpus to obtain lyrics features corresponding to the text features.
• The music data generation module 170, which is connected to the feature matching module 150, is configured to predict, through the trained random forest classifier, the melody and rhythm corresponding to the words in the sequence from the obtained lyrics features, so as to generate music data adapted to the lyrics text.
  • Fig. 9 is a block diagram showing a device for generating music for lyrics text according to another exemplary embodiment.
• The device in this embodiment further includes a feature extraction module 210 configured to extract lyrics features from the lyrics sample text in the sample data, and to extract rhythm features and melody features from the music data corresponding to the lyrics sample text in the sample data.
  • a corpus construction module 230 which is connected to the feature extraction module 210, is configured to construct the corpus from the lyrics features, rhythm features, and melody features.
• A training module 250, which is connected to the corpus construction module 230, is configured to iteratively train the random forest classifier using the lyrics features, rhythm features, and melody features until the trained random forest classifier's prediction of melody and rhythm reaches a specified accuracy, at which point the iterative training of the random forest classifier is stopped.
  • FIG. 10 is a block diagram of a module 170 of the embodiment shown in FIG. 8 in an exemplary embodiment.
  • the random forest classifier includes a rhythm classifier and a melody classifier.
• The music data generation module 170 includes a rhythm feature obtaining unit 171 configured to obtain, through prediction by the rhythm classifier, the rhythm feature corresponding to the lyrics feature.
  • the melody feature obtaining unit 173 is connected to the rhythm feature obtaining unit 171 and is configured to input lyrics features and rhythm features to the melody classifier to predict and obtain melody features corresponding to the lyrics features.
• A music data generating unit 175, which is connected to the melody feature obtaining unit 173, is configured to combine the obtained rhythm features and melody features to generate music data adapted to the lyrics text.
  • FIG. 11 is an exemplary block diagram of the music data generating unit 175 of the embodiment shown in FIG. 10.
• The music data generating unit 175 includes a note information generating unit 301 configured to generate the note information corresponding to the words in the sequence from the obtained rhythm features and melody features.
  • the note information combining unit 303 is connected to the note information generating unit 301 and is configured to combine the note information corresponding to the words in the sequence to generate music data of the lyrics text.
  • FIG. 12 is an exemplary block diagram of the note information combining unit 303 shown in FIG. 11.
• The note information combining unit 303 includes a note sequence generating unit 3031 configured to combine the note information corresponding to the words in the sequence, in the order of the words in the sequence, to generate a note sequence corresponding to the lyrics text.
  • the filtering unit 3033 is connected to the note sequence generating unit 3031 and is configured to filter the note sequence according to a set note threshold.
  • a music data generating unit 3035 is connected to the filtering unit 3033 and is configured to generate music data of the lyrics text through the filtered note sequence.
  • modules or units can be implemented by hardware, software, or a combination of both.
  • these modules or units may be implemented as one or more hardware modules, such as one or more application specific integrated circuits.
  • these modules or units may be implemented as one or more computer programs executing on one or more processors.
  • the present disclosure also provides a device for generating music for lyrics text.
  • the device may be used for the server 200 described in FIG. 2.
  • the device includes: a processor; and a memory for storing processor-executable instructions.
  • the processor is configured to execute the method for generating music for lyrics text in the embodiment shown in any one of FIG. 3 to FIG. 7.
  • the present disclosure also provides a computer-readable storage medium.
• The computer-readable storage medium may be a memory 250 storing a computer program, which may be executed by the processor 270 of the server 200 to complete the foregoing method for generating music for lyrics text.

Abstract

A method and a device for generating music for a lyrics text on the basis of a random forest, and a computer-readable storage medium, relating to the field of artificial intelligence technology. The method comprises: acquiring a lyrics text, the lyrics text being a sequence composed of several words in order (S110); performing feature extraction on the lyrics text to obtain text features mapped to the sequence (S130); performing feature matching between the text features and lyrics features in a corpus to obtain lyrics features corresponding to the text features (S150); and predicting, by means of a trained random forest classifier, the melody and rhythm corresponding to the words in the sequence from the obtained lyrics features, to generate music data adapted to the lyrics text (S170). The method automatically generates, by means of a random forest model, the music data corresponding to a lyrics text; a user can compose music from a lyrics text without mastering professional musical knowledge, so that ordinary people can use the method to automatically generate music from a lyrics text.

Description

Method and device for generating music for lyrics text, and computer-readable storage medium

Technical Field
This application claims priority to Chinese patent application CN201810798036.7, filed on July 19, 2018 and entitled "Method and device for generating music for lyrics text, and computer-readable storage medium", the entire contents of which are incorporated herein by reference.
The present disclosure relates to the field of Internet technologies, and in particular, to a method, a device, and a computer-readable storage medium for generating music for lyrics text.
Background
Composing music from lyrics demands a high degree of expertise and generally requires applying a large amount of related musical knowledge, such as basic music theory, harmony, polyphony, orchestration, and musical form. Music is therefore usually composed by people with rich theoretical knowledge of music; for the general public, who lack such knowledge, composing music from lyrics is essentially impossible.
Technical Problem
Therefore, a method is needed that can automatically compose music from lyrics, so that the general public can also participate in music composition.
Technical Solutions
In order to solve the above technical problem, an object of the present application is to provide a method, a device, and a computer-readable storage medium for generating music for lyrics text.

The technical solutions adopted by the present application are as follows:
In one aspect, a method for generating music for lyrics text based on a random forest includes: obtaining lyrics text, the lyrics text being a sequence composed of several words in order; performing feature extraction on the lyrics text to obtain text features mapped to the sequence; performing feature matching between the text features and lyrics features in a corpus to obtain the lyrics features corresponding to the text features; and predicting, through a trained random forest classifier, the melody and rhythm corresponding to the words in the sequence from the obtained lyrics features, so as to generate music data adapted to the lyrics text.
In another aspect, a device for generating music for lyrics text based on a random forest includes: an acquisition module configured to obtain lyrics text, the lyrics text being a sequence composed of several words in order; a text feature extraction module configured to perform feature extraction on the lyrics text to obtain text features mapped to the sequence; a feature matching module configured to perform feature matching between the text features and lyrics features in a corpus to obtain the lyrics features corresponding to the text features; and a music data generation module configured to predict, through a trained random forest classifier, the melody and rhythm corresponding to the words in the sequence from the obtained lyrics features, so as to generate music data adapted to the lyrics text.
In another aspect, a device for generating music for lyrics text based on a random forest includes a processor and a memory for storing processor-executable instructions, wherein the processor is configured to execute the method for generating music for lyrics text described above.

In another aspect, a computer-readable storage medium stores a computer program which, when executed by a processor, implements the method for generating music for lyrics text described above.
Beneficial Effects
In the above technical solution, feature extraction and feature matching are performed on the lyrics text, and rhythm and melody are predicted through a random forest classifier, so that the music data corresponding to the acquired lyrics text is generated automatically. A user can compose music from lyrics text without mastering professional musical knowledge, so the general public can use the present disclosure to automatically generate music from lyrics text.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
Brief Description of the Drawings
The drawings herein are incorporated into and constitute a part of this specification, illustrate embodiments consistent with the present application, and together with the description serve to explain the principles of the application.
Fig. 1 is a schematic diagram of an implementation environment involved in the present disclosure, according to an exemplary embodiment.

Fig. 2 is a block diagram of a device 200 according to an exemplary embodiment.

Fig. 3 is a flowchart of a method for generating music for lyrics text according to an exemplary embodiment.

Fig. 4 is a flowchart of the steps before step S110 of the embodiment shown in Fig. 3, in an exemplary embodiment.

Fig. 5 is a flowchart of step S170 of the embodiment shown in Fig. 3, in an exemplary embodiment.

Fig. 6 is a flowchart of step S175 of the embodiment shown in Fig. 5, in an exemplary embodiment.

Fig. 7 is a flowchart of step S303 of the embodiment shown in Fig. 6, in an exemplary embodiment.

Fig. 8 is a block diagram of a device for generating music for lyrics text according to an exemplary embodiment.

Fig. 9 is a block diagram of a device for generating music for lyrics text according to another exemplary embodiment.

Fig. 10 is a block diagram of the module 170 of the embodiment shown in Fig. 8, in an exemplary embodiment.

Fig. 11 is a block diagram of the music data generating unit 175 of the embodiment shown in Fig. 10, in an exemplary embodiment.

Fig. 12 is a block diagram of the note information combining unit 303 of the embodiment shown in Fig. 11, in an exemplary embodiment.
The above drawings show specific embodiments of the present application, which are described in more detail hereinafter. These drawings and textual descriptions are not intended to limit the scope of the concept of the present application in any way, but to explain the concept of the application to those skilled in the art by reference to specific embodiments.
Embodiments of the Invention

Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of devices and methods consistent with some aspects of the application as detailed in the appended claims.
Fig. 1 is a schematic diagram of an implementation environment involved in the present disclosure, according to an exemplary embodiment. The implementation environment consists of a terminal 100 and a server 200 that establish a network communication connection, where the server 200 serves as the back end with which the present disclosure generates music for lyrics text. The terminal 100 may be a computer, a smartphone, or any other communication device that can run a client for generating music for lyrics text and has a network connection function; no limitation is made here. The terminal 100 can initiate a request to generate music data and provide the lyrics text; the server 200 receives the request initiated by the terminal 100, generates music data for the lyrics text provided by the terminal 100, and then outputs the generated music data to the terminal 100. In an exemplary embodiment, the server 200 may be a web server or an APP server.
Fig. 2 is a block diagram of the hardware structure of a server according to an exemplary embodiment. The server may be deployed in the implementation environment shown in Fig. 1 to generate music data for lyrics text. It should be noted that this server 200 is only an example adapted to the present disclosure and cannot be considered to limit the scope of use of the present disclosure in any way; nor can the server 200 be interpreted as depending on, or necessarily having, one or more of the components shown in Fig. 2.
The hardware structure of the server may vary greatly with configuration or performance. As shown in Fig. 2, the server 200 includes a power supply 210, an interface 230, at least one memory 250, and at least one processor (CPU, Central Processing Unit) 270. The power supply 210 provides the working voltage for each hardware device on the server 200.

The interface 230 includes at least one wired or wireless network interface 231, at least one serial-to-parallel conversion interface 233, at least one input/output interface 235, at least one USB interface 237, and so on, for communicating with external devices. In an exemplary embodiment, the server communicates with the terminal 100 in the implementation environment of Fig. 1 through the wireless network interface.

The memory 250, as a carrier of resource storage, may be a read-only memory, a random access memory, a magnetic disk, or an optical disc; the resources stored on it include an operating system 251, application programs 253, and data 255, and the storage may be transient or persistent. The operating system 251 manages and controls the hardware devices and application programs 253 on the server 200 so that the processor 270 can compute and process the massive data 255; it may be Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like. An application program 253 is a computer program that completes at least one specific task on top of the operating system 251; it may include at least one module (not shown in Fig. 2), each of which may contain a series of computer-readable instructions for the server 200. The data 255 may be photos, pictures, and the like stored on a disk.

The processor 270 may include one or more processors and is configured to communicate with the memory 250 through a bus, for computing and processing the massive data 255 in the memory 250. As described in detail above, the server 200 to which the present disclosure applies generates music for lyrics text by the processor 270 reading a series of computer-readable instructions stored in the memory 250.
In addition, the present disclosure can equally be implemented by a hardware circuit or by a hardware circuit combined with software; therefore, implementing the present disclosure is not limited to any specific hardware circuit, software, or combination of the two.
Fig. 3 is a flowchart of a method for generating music for lyrics text according to an exemplary embodiment. As shown in Fig. 3, the method of this embodiment includes:

Step S110: obtain the lyrics text, the lyrics text being a sequence composed of several words in order.
A word is the smallest unit of the lyrics text. For example, in the lyrics line "还没好好的感受，雪花绽放的气候" (roughly, "before having fully felt it, the season when snowflakes bloom"), each character of the text is a word of the lyrics text. The language of the lyrics text is not limited: for a Chinese text, the characters are the words of the lyrics text; for an English text, the words themselves are the words of the lyrics text. No limitation is made here.
In one embodiment, the obtained lyrics text may be the lyrics text entered by the user in an interactive interface. The length of the lyrics text is not limited; it may be a sentence or a passage of text. In another embodiment, the interactive interface may also display lyrics text recommended by the server; the user can select recommended lyrics text on the interactive interface, and the selected lyrics text is input to the server according to the user's selection. In another embodiment, the server also provides an option for adjusting the number of words in the lyrics text, so that the user can enter text and adjust its word count.
Step S130: perform feature extraction on the lyrics text to obtain the text features mapped to the sequence.

Performing feature extraction on the lyrics text means obtaining from it the syllable information corresponding to the lyrics text, that is, using text features to reflect the syllable information of the lyrics text, where each word in the lyrics text corresponds to one syllable. The text features reflecting the syllable information of the lyrics text include, but are not limited to: the syllable type, the number of syllables, the word frequency, and the word rarity of each word in the lyrics text. The syllable type is the classification of the syllable corresponding to the word in the lyrics, and may be: single syllable, initial syllable, middle syllable, final syllable, and so on. The number of syllables is the count of syllables; the word frequency is the frequency with which a given word appears in the lyrics text; and the word rarity is a function of the word frequency, where:
[Equation: word rarity as a function of word frequency; the original formula image is not recoverable from the source.]
The lyrics text uniquely corresponds to its text features; that is, the extracted text features are the text features mapped to the sequence, so the lyrics text can be described by the syllable information reflected by those text features.

In an exemplary embodiment, the text features can be extracted by Python programming, with the methods for extracting text features such as the syllable type, word frequency, and word rarity of each word in the lyrics text set out in a pre-written program. In other embodiments, text feature extraction through a deep neural network is also applicable to the present disclosure, and no limitation is made here.
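A minimal Python sketch of such per-word feature extraction follows. The rarity formula `1 - frequency` is an assumption of this sketch, since the original formula is not reproduced here:

```python
from collections import Counter

# Sketch: extract per-word text features (frequency and a frequency-derived
# rarity score) from a tokenized lyrics text. Names are illustrative.

def extract_text_features(words):
    counts = Counter(words)
    total = len(words)
    return [
        {"word": w, "freq": counts[w] / total, "rarity": 1.0 - counts[w] / total}
        for w in words
    ]

feats = extract_text_features(["the", "snow", "the", "climate"])
print(feats[0])  # → {'word': 'the', 'freq': 0.5, 'rarity': 0.5}
```

Syllable type and syllable count would be added analogously, e.g. from a pronunciation lookup.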
Step S150: perform feature matching between the text features and the lyrics features in the corpus to obtain the lyrics features corresponding to the text features.

The corpus is composed of the music files of a number of songs, where the music files include the lyrics features, rhythm features, and melody features extracted for each song. In one embodiment, the music files exist in the corpus in XML format. It should be noted that the corpus has already been constructed before the text feature matching is performed; the construction of the corpus is described in detail below. By matching the text features against the lyrics features in the corpus, lyrics features similar to the text features, that is, the lyrics features corresponding to the text features, can be found in the corpus, so that in the subsequent steps the rhythm and melody are predicted from the matched lyrics features.

Taking the extracted text features of syllable type, number of syllables, word frequency, and word rarity as an example, the matching between the text features and the lyrics features in the corpus covers syllable type, number of syllables, word frequency, and word rarity, so that lyrics features similar to these features of the lyrics text can be obtained from the corpus.
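Such feature matching might be sketched as a nearest-neighbor search over corpus entries; the corpus layout, feature keys, and Euclidean distance below are illustrative assumptions, not a format defined by this disclosure:

```python
# Sketch: match a text feature vector against corpus lyrics features
# by smallest distance over shared feature keys.

def distance(a, b):
    keys = ("syllables", "freq", "rarity")
    return sum((a[k] - b[k]) ** 2 for k in keys) ** 0.5

def match_lyrics_feature(text_feat, corpus):
    """Return the corpus entry whose lyrics feature is closest to text_feat."""
    return min(corpus, key=lambda entry: distance(text_feat, entry["feature"]))

corpus = [
    {"id": "song-1", "feature": {"syllables": 1, "freq": 0.2, "rarity": 0.8}},
    {"id": "song-2", "feature": {"syllables": 3, "freq": 0.05, "rarity": 0.95}},
]

query = {"syllables": 1, "freq": 0.25, "rarity": 0.75}
print(match_lyrics_feature(query, corpus)["id"])  # → song-1
```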
Step S170: use the trained random forest classifier to predict, from the obtained lyric features, the melody and rhythm corresponding to the words in the sequence, and generate music data adapted to the lyrics text.
The durations of the notes in a piece of music combine to form its rhythm. Rhythm features reflect note-duration information such as note onsets and time values, and may include: the time signature, the offset, the measure offset, and the time value. The time signature is the time signature of the notes in the music corresponding to the lyrics; the offset is the number of beats before the music begins; the measure offset is the number of beats before the measure begins; the time value is the duration of the note corresponding to a given word in the music.
The pitches of the individual notes combine to form the melody of a piece of music. Melody features reflect note-pitch information and may include: the key signature, the pitch, accidentals, and weak beats. The key signature is the key signature of the note corresponding to a word in the lyrics; the pitch is the pitch of the note corresponding to a word in the lyrics; an accidental is a chromatic sign placed directly before a note; a weak beat is an unaccented unit beat. Lyric features may include the syllable type, syllable count, word frequency, and word rarity of each word in the lyrics text. It should be noted that the rhythm and melody features listed above are merely examples adapted to the present disclosure and are not to be taken as limiting its use. In other embodiments, the rhythm and melody features may include features beyond those listed, or combinations of some of the listed features with unlisted ones.
For every song, since the lyrics and the music correspond to each other, the lyric features in the corpus correspond to the rhythm and melody features. Therefore, after the text features of the lyrics text are matched with the lyric features in the corpus, the corresponding melody and rhythm can be predicted from the matched lyric features. Predicting the rhythm and melody means predicting rhythm features and melody features from the lyric features corresponding to the text features, generating the rhythm and melody from the predicted rhythm and melody features, and then generating the music data.
A random forest classifier is a combination of multiple decision trees built from features, where each feature forms a node of a decision tree. The random forest classifier can thus combine the features non-linearly, and it does not require a large amount of sample data to train, so the quality of the generated music can be assured without a large training set.
When predicting the melody and rhythm, the prediction is made according to the features at the nodes of the random forest classifier. Take rhythm prediction from lyric features as an example, where the input of the random forest classifier is the lyric features and the output is the rhythm features. When the classifier is constructed and trained, the rhythm-predicting random forest is built from the lyric features in the corpus; that is, the lyric features in the corpus determine the feature and the decision condition at each node. For example, syllable count, syllable type, word frequency, and word rarity may be placed from top to bottom on the nodes of a decision tree, and each node applies its decision condition. At the syllable-count node, for instance: if the syllable count is at most 3, output the corresponding time value; if it is greater than 3, decide the output time value by syllable type. Likewise, the syllable-type, word-frequency, and word-rarity nodes have their own decision conditions and condition-dependent outputs, so a time value can be output from the lyric features; the other rhythm features can be predicted by similar decision trees. The rhythm features corresponding to the lyric features can thus be predicted by the random forest classifier from the lyric features corresponding to the text features, and the rhythm of the lyrics text obtained from them. Melody features corresponding to the lyric features can be predicted by a similar method, realizing the melody prediction. The rhythm and melody corresponding to the lyrics text are then formed from the predicted rhythm and melody features, and the music data corresponding to the lyrics text is generated. The generated music data may be in XML format or MIDI format, which is not specifically limited here.
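The single decision path just described (split on syllable count, then on syllable type) can be written out directly. This is only an illustration of what one tree node encodes: a trained random forest would learn many such thresholds from the corpus and vote across its trees rather than hard-coding them, and the concrete durations below are placeholders.

```python
def predict_time_value(syllable_count, syllable_type):
    """One hand-written decision path of the kind a tree node encodes:
    split on syllable count first, then on syllable type. The output
    durations (in quarter-note units) are illustrative placeholders,
    not values taken from the patent.
    """
    if syllable_count <= 3:
        return 1.0          # e.g. a quarter note
    if syllable_type in ("single", "begin"):
        return 0.5          # e.g. an eighth note
    return 0.25             # e.g. a sixteenth note
```

A forest of such trees, each trained on a bootstrap sample of the corpus, combines the lyric features non-linearly as described above.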
By performing feature extraction and feature matching on the lyrics text and predicting rhythm and melody with a random forest classifier, the music data corresponding to the lyrics text is generated automatically from the acquired lyrics text. A user can compose music from lyrics without mastering professional musical knowledge, so the general public can also compose music according to the present disclosure.
FIG. 4 is a flowchart, in an exemplary embodiment, of steps preceding step S110 of the embodiment shown in FIG. 3. As shown in FIG. 4, before step S110 the method of this embodiment further includes:
Step S210: extract lyric features from the lyric sample texts in the sample data, and extract rhythm features and melody features from the music data corresponding to the lyric sample texts in the sample data.
The sample data consists of a number of songs collected for training the random forest classifier, each collected song including both lyrics and music. The lyric sample text is the lyrics of a song in the sample data. The lyric features describe the lyrics of the sample data, and the rhythm and melody features describe its rhythm and melody, respectively. In other words, the lyric features capture the syllable information of each word in the lyric sample text, the rhythm features capture the time-value information of the notes in the music, and the melody features capture the pitch information of the notes. In each song the syllables correspond to the notes, that is, each word in the lyrics corresponds to a note. To ensure the prediction performance of the random forest classifier, in an exemplary embodiment the sample data consists of single-track, single-instrument songs, so that each word of the lyrics corresponds to exactly one note. In an exemplary embodiment, the extracted lyric features may include syllable type, syllable count, word frequency, and word rarity; the extracted rhythm features may include the time signature, offset, measure offset, and time value; and the extracted melody features may include the key signature, pitch, accidentals, and weak beats.
It should be noted that the specific categories of lyric, rhythm, and melody features shown above are only examples adapted to the present disclosure and do not limit its scope of use in any way, nor should they be read as requiring that all, or only, the specific features in the above examples be extracted to implement the present disclosure. In other embodiments, more or fewer features than the specific categories listed above may be extracted. Naturally, the more fully the extracted lyric, rhythm, and melody features describe the song's lyrics and music, the better the accuracy of the random forest classifier, and the higher the accuracy when the music data corresponding to a lyrics text is generated automatically.
In an exemplary embodiment, the lyric, rhythm, and melody features may each be extracted by a deep neural network. In another exemplary embodiment, each specific lyric, rhythm, and melody feature may be extracted by a Python program. The extraction method is not limited here.
Step S230: construct a corpus from the lyric features, rhythm features, and melody features.
The extracted lyric, rhythm, and melody features serve as the corpus of the random forest classifier; the classifier is trained on this corpus, and after training the rhythm and melody corresponding to a lyrics text are predicted with it. The corpus may also include the lyric sample texts of the sample data and the corresponding music data. In an exemplary embodiment, the corpus is constructed by extracting the lyric, rhythm, and melody features of 24 single-track, single-instrument pop songs; it contains 59 features and 12,358 observations. An observation is a value taken by a specific feature: for the syllable-type feature, for example, the observations may be single syllable, starting syllable, central syllable, and ending syllable.
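Since the embodiment stores music files in XML, a corpus entry might be parsed along the following lines. The tag and attribute names are invented for illustration; the disclosure does not specify the actual schema.

```python
import xml.etree.ElementTree as ET

def load_corpus_entry(xml_text):
    """Parse one corpus entry into its three feature groups.

    The <song>/<lyric_features>/<feature name=...> layout is a
    hypothetical schema assumed for this sketch.
    """
    root = ET.fromstring(xml_text)
    entry = {}
    for group in ("lyric_features", "rhythm_features", "melody_features"):
        node = root.find(group)
        entry[group] = {f.get("name"): f.text for f in node.findall("feature")}
    return entry
```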
Step S250: iteratively train the random forest classifier on the lyric, rhythm, and melody features, and stop the iterative training once the trained classifier predicts the melody and rhythm of known song texts to a specified accuracy.
In one embodiment, lyric features are input into the random forest classifier, and the rhythm and melody features it predicts for those lyric features are compared with the rhythm and melody features actually corresponding to them. If they differ, the parameters of the random forest classifier are adjusted, the same lyric features are fed into the adjusted classifier again, and the comparison is repeated; once the predicted rhythm and melody features match the actual ones, training continues with the next set of lyric features in the corpus. This process is the iterative training of the random forest classifier: the classifier is trained iteratively in a deep-learning fashion, so that after training it can predict melody and rhythm.
After a period of training, the random forest classifier is evaluated, that is, its accuracy is assessed.
After the random forest classifier has been trained with the lyric, rhythm, and melody features of a number of sample songs, it is evaluated as follows: the lyric features of a song with an existing melody are input, the random forest predicts the corresponding rhythm and melody features, and the predicted rhythm and melody features are compared with those of the music actually corresponding to the lyrics to compute the classifier's accuracy.
In one embodiment, if the lyric, rhythm, and melody features of several songs are used to evaluate the random forest classifier, the accuracies computed for the individual songs are averaged to obtain the classifier's accuracy. If the computed accuracy reaches the specified accuracy, training of the random forest classifier is complete; otherwise, training continues on the lyric, rhythm, and melody features of the sample data.
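The stopping rule just described (per-song accuracies averaged over held-out songs, compared against a target) is simple enough to state in a few lines. The tuple layout for per-song results is an assumption made for the sketch.

```python
def evaluate_classifier(per_song_results, target_accuracy):
    """Average per-song accuracies and decide whether training may stop.

    Each entry of `per_song_results` is assumed to be a
    (correctly_predicted_features, total_features) pair for one
    held-out song that was not used in training.
    """
    accuracies = [correct / total for correct, total in per_song_results]
    mean_accuracy = sum(accuracies) / len(accuracies)
    return mean_accuracy, mean_accuracy >= target_accuracy
```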
It should be noted that the sample data used to evaluate the random forest classifier differs from the sample data used in training. For example, if the lyric, rhythm, and melody features corresponding to the song 匆匆那年 in the sample data were used to train the random forest classifier, the features corresponding to that song cannot be used to evaluate it.
In an exemplary embodiment, the random forest classifier includes a rhythm classifier and a melody classifier; the evaluation then assesses the rhythm classifier and the melody classifier separately, yielding a separate accuracy for each.
FIG. 5 is a flowchart of step S170 of the embodiment shown in FIG. 3 in an exemplary embodiment. In this embodiment, the random forest classifier includes a rhythm classifier and a melody classifier. As shown in FIG. 5, step S170 includes:
Step S171: predict the rhythm features corresponding to the lyric features with the rhythm classifier.
The rhythm classifier is a model for predicting rhythm features formed by combining several decision trees. In this embodiment the rhythm features are predicted from the lyric features, so, correspondingly, the nodes of the rhythm classifier's decision trees are built from the lyric features of the sample data. Because the lyric features in the corpus of the random forest classifier have corresponding rhythm features, the rhythm classifier can predict the rhythm features corresponding to the lyric features; the obtained rhythm features may be the time signature, offset, measure offset, time value, and the like, or combinations thereof.
Step S173: input the lyric features and rhythm features into the melody classifier to predict the melody features corresponding to the lyric features.
The melody classifier is a model for predicting melody features formed by combining several decision trees. In this embodiment the melody features are predicted from the lyric and rhythm features, so, correspondingly, the nodes of the melody classifier's decision trees are built from the lyric and rhythm features of the sample data. The rhythm and lyric features are input into the melody classifier, which predicts the melody features corresponding to the lyric features, such as the pitch, weak beats, and accidentals, or combinations thereof.
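Steps S171 and S173 chain the two classifiers, with the melody classifier conditioned on the rhythm prediction. A minimal sketch of that pipeline follows, treating each classifier as any object with a `predict` method; the interface and the dict-based feature records are assumptions of the sketch.

```python
def predict_song_features(lyric_features, rhythm_clf, melody_clf):
    """Two-stage prediction: rhythm from lyric features (step S171),
    then melody from lyric + rhythm features (step S173). The
    classifiers are stand-ins for the trained random forests."""
    rhythm_features = [rhythm_clf.predict(f) for f in lyric_features]
    melody_features = [
        melody_clf.predict({**f, "rhythm": r})
        for f, r in zip(lyric_features, rhythm_features)
    ]
    return rhythm_features, melody_features
```

Swapping the two stages, as the alternative embodiment below describes, only requires exchanging which classifier runs first and which receives the other's output as an extra input.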
Step S175: combine the obtained rhythm features and melody features to generate music data adapted to the lyrics text.
Combining the obtained rhythm and melody features yields the time value and pitch corresponding to the syllable information of each word in the lyrics text, so that each word corresponds to one note. The notes corresponding to the words are combined in the order of the words in the lyrics text, yielding the music data of the acquired lyrics text.
In an exemplary embodiment, the generated music data may be in XML format or in MIDI format.
It should be noted that, in this embodiment, the rhythm features are obtained first by the rhythm classifier, the melody features are then obtained by the melody classifier, and finally the rhythm and melody features are combined to generate the music data of the acquired lyrics text. This way of generating music data is only one exemplary embodiment of step S170.
In other embodiments, the melody features may instead be obtained by inputting the lyric features into the melody classifier, after which the lyric and melody features are input into the rhythm classifier to obtain the rhythm features. Correspondingly, in that embodiment the nodes of the melody classifier's decision trees are built from the lyric features of the sample data, and the nodes of the rhythm classifier's decision trees are built from the lyric and melody features of the sample data. Finally, the obtained melody and rhythm features are combined to generate the music data corresponding to the acquired lyrics text.
Whether the rhythm features or the melody features are obtained first can be decided by the accuracies of the random forest classifier (that is, of the rhythm classifier and the melody classifier) obtained in step S250: if, after training, the rhythm classifier is more accurate than the melody classifier, the rhythm features can be predicted first and the melody features second; if the rhythm classifier is less accurate than the melody classifier, the music data can be generated with the melody features predicted first and the rhythm features second. Predicting first with the more accurate classifier improves the accuracy of the overall prediction. Of course, the order may also be decided from other considerations.
FIG. 6 is a flowchart of step S175 of the embodiment shown in FIG. 5 in an exemplary embodiment. As shown in FIG. 6, step S175 includes:
Step S301: generate, from the obtained rhythm features and melody features, the note information corresponding to the words in the sequence.
The obtained rhythm and melody features are combined to obtain the time-value and pitch information corresponding to the words in the lyrics text, so that each word yields one note; that is, the note information corresponding to the words in the sequence is generated. In an exemplary embodiment, when the note information corresponding to a word in the sequence is generated, subsequent note information is generated with reference to the features of the notes already generated; for example, a note may be generated with reference to the features (such as time value and pitch) of the preceding five notes, which ensures that corresponding note information is generated for every word in the sequence.
Step S303: combine the note information corresponding to the words in the sequence to generate the music data of the lyrics text. The note information corresponding to the words in the lyrics text is combined to obtain the music data of the lyrics text.
FIG. 7 is a flowchart of step S303 of the embodiment shown in FIG. 6 in an exemplary embodiment. As shown in FIG. 7, step S303 includes:
Step S3031: combine the note information corresponding to the words in the sequence in the order of the words, generating the note sequence corresponding to the lyrics text. The words of the lyrics text form an ordered sequence; after the note information corresponding to the words is generated from the rhythm and melody features, it is combined in the order of the words in the lyrics text to obtain the note sequence corresponding to the lyrics text.
Step S3033: filter the note sequence according to a set note threshold.
Filtering the note sequence means removing certain notes from it; the set note threshold may be a specific note or a range of notes. For example, to ignore shorter notes (such as 1/16 notes), the 1/16 note can be set as the threshold, so that all notes other than 1/16 notes are kept, producing a new note sequence. In an exemplary embodiment, the note threshold can be adjusted to actual needs, and different note thresholds may be set for different lyrics texts; for example, for one passage of lyrics the threshold may be set to the 1/64 note, removing the 1/64 notes from the note sequence, while for another passage a threshold is set that filters out the 1/32 notes from the note sequence.
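Reading the threshold as a minimum kept duration, the filtering step can be sketched as follows. Two points are interpretive: the (word, duration, pitch) tuple layout is assumed, and the source states only that 1/16 notes are removed when the 1/16 note is the threshold, so this sketch drops notes at or below the threshold duration.

```python
def filter_note_sequence(notes, threshold):
    """Drop notes whose duration (in whole-note fractions) is at or
    below `threshold`; e.g. threshold=1/16 removes 1/16 notes and any
    shorter ones. Each note is assumed to be a (word, duration, pitch)
    tuple.
    """
    return [note for note in notes if note[1] > threshold]
```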
Step S3035: generate the music data of the lyrics text from the filtered note sequence.
The following are device embodiments of the present disclosure, which can be used to execute the embodiments of the method for generating music for lyrics text executed by the above server 200 of the present disclosure. For details not disclosed in the device embodiments of the present disclosure, refer to the embodiments of the method for generating music for lyrics text of the present disclosure.
FIG. 8 is a block diagram of a device for generating music for lyrics text according to an exemplary embodiment. The device can be used in the server 200 shown in FIG. 2 to perform all or part of the steps of the method for generating music for lyrics text shown in the method embodiments above. As shown in FIG. 8, the device includes, but is not limited to: an acquisition module 110, a text feature extraction module 130, a feature matching module 150, and a music data generation module 170. The acquisition module 110 is configured to acquire a lyrics text, the lyrics text being a sequence of words. The text feature extraction module 130, connected to the acquisition module 110, is configured to perform feature extraction on the lyrics text to obtain the text features mapped from the sequence. The feature matching module 150, connected to the text feature extraction module 130, is configured to perform feature matching between the text features and the lyric features in the corpus to obtain the lyric features corresponding to the text features. The music data generation module 170, connected to the feature matching module 150, is configured to use the trained random forest classifier to predict, from the obtained lyric features, the melody and rhythm corresponding to the words in the sequence and to generate music data adapted to the lyrics text.
FIG. 9 is a block diagram of a device for generating music for lyrics text according to another exemplary embodiment. As shown in FIG. 9, in addition to the modules shown in FIG. 8, the device of this embodiment further includes: a feature extraction module 210, configured to extract lyric features from the lyric sample texts in the sample data and to extract rhythm features and melody features from the music data corresponding to the lyric sample texts in the sample data; a corpus construction module 230, connected to the feature extraction module 210 and configured to construct the corpus from the lyric features, rhythm features, and melody features; and a training module 250, connected to the corpus construction module 230 and configured to iteratively train the random forest classifier on the lyric, rhythm, and melody features, stopping the iterative training once the trained random forest classifier predicts the melody and rhythm of known song texts to a specified accuracy.
FIG. 10 is a block diagram of the module 170 of the embodiment shown in FIG. 8 in an exemplary embodiment. In this embodiment, the random forest classifier includes a rhythm classifier and a melody classifier. As shown in FIG. 10, the music data generation module 170 includes: a rhythm feature obtaining unit 171, configured to predict the rhythm features corresponding to the lyric features with the rhythm classifier; a melody feature obtaining unit 173, connected to the rhythm feature obtaining unit 171 and configured to input the lyric and rhythm features into the melody classifier to predict the melody features corresponding to the lyric features; and a music data generation unit 175, connected to the melody feature obtaining unit 173 and configured to combine the obtained rhythm and melody features to generate music data adapted to the lyrics text.
FIG. 11 is an exemplary block diagram of the music data generation unit 175 of the embodiment shown in FIG. 10. In this embodiment, the music data generation unit 175 includes: a note information generation unit 301, configured to generate, from the obtained rhythm and melody features, the note information corresponding to the words in the sequence; and a note information combination unit 303, connected to the note information generation unit 301 and configured to combine the note information corresponding to the words in the sequence to generate the music data of the lyrics text.
Fig. 12 is an exemplary block diagram of the note information combination unit 303 shown in Fig. 11. In this embodiment, the note information combination unit 303 includes: a note sequence generation unit 3031, configured to combine the note information corresponding to the words in the sequence in word order, generating the note sequence corresponding to the lyrics text; a filtering unit 3033, connected to the note sequence generation unit 3031 and configured to filter the note sequence according to a set note threshold; and a music data generation unit 3035, connected to the filtering unit 3033 and configured to generate the music data of the lyrics text from the filtered note sequence.
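Units 3031 to 3035 amount to ordering, thresholding, and emitting notes. A stdlib-only sketch, with made-up words, made-up (MIDI pitch, beats) pairs, and an assumed pitch-range reading of the "note threshold" (the patent does not define the threshold's exact form):

```python
# Hypothetical sketch of the Fig. 12 pipeline: combine per-word note
# info in word order, then drop notes outside a configured pitch range.
words = ["hold", "me", "close", "tonight"]          # sequence of words
notes = {"hold": (62, 0.5), "me": (64, 0.25),       # (MIDI pitch, beats),
         "close": (96, 0.5), "tonight": (60, 1.0)}  # all values made up

note_sequence = [notes[w] for w in words]           # unit 3031: word order
LOW, HIGH = 48, 84                                  # unit 3033: threshold
filtered = [(p, d) for p, d in note_sequence if LOW <= p <= HIGH]
print(filtered)  # → [(62, 0.5), (64, 0.25), (60, 1.0)]
```

The out-of-range pitch 96 is removed before unit 3035 would render the remaining notes into music data, which keeps the generated line singable within a plausible vocal range.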
For the implementation of the functions of the modules in the above device, see the implementation of the corresponding steps in the method for generating music for lyrics text described above; the details are not repeated here. It can be understood that these modules or units may be implemented in hardware, in software, or in a combination of both. When implemented in hardware, they may be realized as one or more hardware modules, such as one or more application-specific integrated circuits. When implemented in software, they may be realized as one or more computer programs executing on one or more processors.
Optionally, the present disclosure further provides a device for generating music for lyrics text. The device may be used in the server 200 described in Fig. 2 and includes: a processor; and a memory for storing processor-executable instructions. The processor is configured to execute the method for generating music for lyrics text of any of the embodiments shown in Figs. 3 to 7.
The specific manner in which the processor of the device in this embodiment performs operations has been described in detail in the embodiments of the method for generating music for lyrics text, and is not elaborated here.
In an exemplary embodiment, the present disclosure further provides a computer-readable storage medium. The computer-readable storage medium may be the memory 250 storing a computer program, which may be executed by the processor 270 of the server 200 to complete the above method for generating music for lyrics text.
The above content is only a preferred exemplary embodiment of the present application and is not intended to limit its implementation. A person of ordinary skill in the art can readily make corresponding variations or modifications according to the main idea and spirit of the present application, so the protection scope of the present application shall be subject to the scope claimed in the claims.

Claims (20)

  1. A method for generating music for lyrics text based on a random forest, wherein the method comprises:
    obtaining lyrics text, the lyrics text being a sequence of words in order;
    performing feature extraction on the lyrics text to obtain the text feature mapped from the sequence;
    performing feature matching between the text feature and lyrics features in a corpus to obtain the lyrics feature corresponding to the text feature;
    predicting, with the trained random forest classifier and from the obtained lyrics feature, the melody and rhythm corresponding to the words in the sequence, to generate music data adapted to the lyrics text.
  2. The method according to claim 1, wherein before the obtaining of the lyrics text, the method further comprises:
    extracting lyrics features from the lyrics sample text in sample data, and extracting rhythm features and melody features from the music data in the sample data corresponding to the lyrics sample text;
    constructing the corpus from the lyrics features, rhythm features, and melody features;
    performing iterative training of the random forest classifier with the lyrics features, rhythm features, and melody features, and stopping the iterative training once the trained random forest classifier predicts the melody and rhythm of known song texts with a specified accuracy.
  3. The method according to claim 1 or 2, wherein the random forest classifier comprises a rhythm classifier and a melody classifier, and the predicting, with the trained random forest classifier and from the obtained lyrics feature, the melody and rhythm corresponding to the words in the sequence to generate music data adapted to the lyrics text comprises:
    obtaining, through prediction by the rhythm classifier, the rhythm feature corresponding to the lyrics feature;
    inputting the lyrics feature and the rhythm feature into the melody classifier to predict the melody feature corresponding to the lyrics feature;
    combining the obtained rhythm feature and melody feature to generate music data adapted to the lyrics text.
  4. The method according to claim 3, wherein the combining of the obtained rhythm feature and melody feature to generate music data adapted to the lyrics text comprises:
    generating, from the obtained rhythm feature and melody feature, the note information corresponding to the words in the sequence;
    combining the note information corresponding to the words in the sequence to generate the music data of the lyrics text.
  5. The method according to claim 4, wherein the combining of the note information corresponding to the words in the sequence to generate the music data of the lyrics text comprises:
    combining the note information corresponding to the words in the sequence in the order of the words, to generate the note sequence corresponding to the lyrics text;
    filtering the note sequence according to a set note threshold;
    generating the music data of the lyrics text from the filtered note sequence.
  6. A device for generating music for lyrics text based on a random forest, wherein the device comprises:
    an obtaining module, configured to obtain lyrics text, the lyrics text being a sequence of words in order;
    a text feature extraction module, configured to perform feature extraction on the lyrics text to obtain the text feature mapped from the sequence;
    a feature matching module, configured to perform feature matching between the text feature and lyrics features in a corpus to obtain the lyrics feature corresponding to the text feature;
    a music data generation module, configured to predict, with the trained random forest classifier and from the obtained lyrics feature, the melody and rhythm corresponding to the words in the sequence, to generate music data adapted to the lyrics text.
  7. The device according to claim 6, wherein the device further comprises:
    a feature extraction module, configured to extract lyrics features from the lyrics sample text in sample data and to extract rhythm features and melody features from the music data in the sample data corresponding to the lyrics sample text;
    a corpus construction module, configured to construct the corpus from the lyrics features, rhythm features, and melody features;
    a training module, configured to perform iterative training of the random forest classifier with the lyrics features, rhythm features, and melody features, and to stop the iterative training once the trained random forest classifier predicts the melody and rhythm of known song texts with a specified accuracy.
  8. The device according to claim 7, wherein the random forest classifier comprises a rhythm classifier and a melody classifier, and the music data generation module comprises:
    a rhythm feature obtaining unit, configured to obtain, through prediction by the rhythm classifier, the rhythm feature corresponding to the lyrics feature;
    a melody feature obtaining unit, configured to input the lyrics feature and the rhythm feature into the melody classifier to predict the melody feature corresponding to the lyrics feature;
    a music data generation unit, configured to combine the obtained rhythm feature and melody feature to generate music data adapted to the lyrics text.
  9. The device according to claim 8, wherein the music data generation unit comprises:
    a note information generation unit, configured to generate, from the obtained rhythm feature and melody feature, the note information corresponding to the words in the sequence;
    a note information combination unit, configured to combine the note information corresponding to the words in the sequence to generate the music data of the lyrics text.
  10. The device according to claim 9, wherein the note information combination unit comprises:
    a note sequence generation unit, configured to combine the note information corresponding to the words in the sequence in the order of the words, to generate the note sequence corresponding to the lyrics text;
    a filtering unit, configured to filter the note sequence according to a set note threshold;
    a music data generation unit, configured to generate the music data of the lyrics text from the filtered note sequence.
  11. A device for generating music for lyrics text based on a random forest, the device comprising:
    a processor; and a memory for storing processor-executable instructions;
    wherein the processor is configured to perform the following steps:
    obtaining lyrics text, the lyrics text being a sequence of words in order;
    performing feature extraction on the lyrics text to obtain the text feature mapped from the sequence;
    performing feature matching between the text feature and lyrics features in a corpus to obtain the lyrics feature corresponding to the text feature;
    predicting, with the trained random forest classifier and from the obtained lyrics feature, the melody and rhythm corresponding to the words in the sequence, to generate music data adapted to the lyrics text.
  12. The device according to claim 11, wherein before the step of obtaining the lyrics text, the processor performs the following steps:
    extracting lyrics features from the lyrics sample text in sample data, and extracting rhythm features and melody features from the music data in the sample data corresponding to the lyrics sample text;
    constructing the corpus from the lyrics features, rhythm features, and melody features;
    performing iterative training of the random forest classifier with the lyrics features, rhythm features, and melody features, and stopping the iterative training once the trained random forest classifier predicts the melody and rhythm of known song texts with a specified accuracy.
  13. The device according to claim 11 or 12, wherein the random forest classifier comprises a rhythm classifier and a melody classifier, and in the step of predicting, with the trained random forest classifier and from the obtained lyrics feature, the melody and rhythm corresponding to the words in the sequence to generate music data adapted to the lyrics text, the processor performs the following steps:
    obtaining, through prediction by the rhythm classifier, the rhythm feature corresponding to the lyrics feature;
    inputting the lyrics feature and the rhythm feature into the melody classifier to predict the melody feature corresponding to the lyrics feature;
    combining the obtained rhythm feature and melody feature to generate music data adapted to the lyrics text.
  14. The device according to claim 13, wherein in the step of combining the obtained rhythm feature and melody feature to generate music data adapted to the lyrics text, the processor performs the following steps:
    generating, from the obtained rhythm feature and melody feature, the note information corresponding to the words in the sequence;
    combining the note information corresponding to the words in the sequence to generate the music data of the lyrics text.
  15. The device according to claim 14, wherein in the step of combining the note information corresponding to the words in the sequence to generate the music data of the lyrics text, the processor performs the following steps:
    combining the note information corresponding to the words in the sequence in the order of the words, to generate the note sequence corresponding to the lyrics text;
    filtering the note sequence according to a set note threshold;
    generating the music data of the lyrics text from the filtered note sequence.
  16. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the following steps:
    obtaining lyrics text, the lyrics text being a sequence of words in order;
    performing feature extraction on the lyrics text to obtain the text feature mapped from the sequence;
    performing feature matching between the text feature and lyrics features in a corpus to obtain the lyrics feature corresponding to the text feature;
    predicting, with the trained random forest classifier and from the obtained lyrics feature, the melody and rhythm corresponding to the words in the sequence, to generate music data adapted to the lyrics text.
  17. The computer-readable storage medium according to claim 16, wherein before the step of obtaining the lyrics text, the processor performs the following steps:
    extracting lyrics features from the lyrics sample text in sample data, and extracting rhythm features and melody features from the music data in the sample data corresponding to the lyrics sample text;
    constructing the corpus from the lyrics features, rhythm features, and melody features;
    performing iterative training of the random forest classifier with the lyrics features, rhythm features, and melody features, and stopping the iterative training once the trained random forest classifier predicts the melody and rhythm of known song texts with a specified accuracy.
  18. The computer-readable storage medium according to claim 16 or 17, wherein the random forest classifier comprises a rhythm classifier and a melody classifier, and in the step of predicting, with the trained random forest classifier and from the obtained lyrics feature, the melody and rhythm corresponding to the words in the sequence to generate music data adapted to the lyrics text, the processor performs the following steps:
    obtaining, through prediction by the rhythm classifier, the rhythm feature corresponding to the lyrics feature;
    inputting the lyrics feature and the rhythm feature into the melody classifier to predict the melody feature corresponding to the lyrics feature;
    combining the obtained rhythm feature and melody feature to generate music data adapted to the lyrics text.
  19. The computer-readable storage medium according to claim 18, wherein in the step of combining the obtained rhythm feature and melody feature to generate music data adapted to the lyrics text, the processor performs the following steps:
    generating, from the obtained rhythm feature and melody feature, the note information corresponding to the words in the sequence;
    combining the note information corresponding to the words in the sequence to generate the music data of the lyrics text.
  20. The computer-readable storage medium according to claim 18, wherein in the step of combining the note information corresponding to the words in the sequence to generate the music data of the lyrics text, the processor performs the following steps:
    combining the note information corresponding to the words in the sequence in the order of the words, to generate the note sequence corresponding to the lyrics text;
    filtering the note sequence according to a set note threshold;
    generating the music data of the lyrics text from the filtered note sequence.
PCT/CN2018/106267 2018-07-19 2018-09-18 Method and device for generating music for lyrics text, and computer-readable storage medium WO2020015153A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810798036.7 2018-07-19
CN201810798036.7A CN109166564B (en) 2018-07-19 2018-07-19 Method, apparatus and computer readable storage medium for generating a musical composition for a lyric text

Publications (1)

Publication Number Publication Date
WO2020015153A1 true WO2020015153A1 (en) 2020-01-23

Family

ID=64897874

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/106267 WO2020015153A1 (en) 2018-07-19 2018-09-18 Method and device for generating music for lyrics text, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN109166564B (en)
WO (1) WO2020015153A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815493B (en) * 2019-01-09 2020-10-27 厦门大学 Modeling method for intelligent hip-hop music lyric generation
CN109584905B (en) * 2019-01-22 2021-09-28 腾讯音乐娱乐科技(深圳)有限公司 Method, terminal and computer readable medium for measuring music speed
CN109920397B (en) * 2019-01-31 2021-06-01 李奕君 System and method for making audio function in physics
CN110148115A (en) * 2019-04-04 2019-08-20 中国科学院深圳先进技术研究院 A kind of screening technique, device and the storage medium of metastasis of cancer prediction image feature
CN110222226B (en) * 2019-04-17 2024-03-12 平安科技(深圳)有限公司 Method, device and storage medium for generating rhythm by words based on neural network
CN110516110B (en) * 2019-07-22 2023-06-23 平安科技(深圳)有限公司 Song generation method, song generation device, computer equipment and storage medium
CN110516103B (en) * 2019-08-02 2022-10-14 平安科技(深圳)有限公司 Song rhythm generation method, device, storage medium and apparatus based on classifier
CN110517656B (en) * 2019-08-02 2024-04-26 平安科技(深圳)有限公司 Lyric rhythm generation method, device, storage medium and apparatus
CN112309353A (en) * 2020-10-30 2021-02-02 北京有竹居网络技术有限公司 Composing method and device, electronic equipment and storage medium
CN112489606B (en) * 2020-11-26 2022-09-27 北京有竹居网络技术有限公司 Melody generation method, device, readable medium and electronic equipment
CN113066456B (en) * 2021-03-17 2023-09-29 平安科技(深圳)有限公司 Method, device, equipment and storage medium for generating melody based on Berlin noise
CN113035161A (en) * 2021-03-17 2021-06-25 平安科技(深圳)有限公司 Chord-based song melody generation method, device, equipment and storage medium
CN113920968A (en) * 2021-10-09 2022-01-11 北京灵动音科技有限公司 Information processing method, information processing device, electronic equipment and storage medium
CN116645957B (en) * 2023-07-27 2023-10-03 腾讯科技(深圳)有限公司 Music generation method, device, terminal, storage medium and program product

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902642A (en) * 2012-12-21 2014-07-02 香港科技大学 Music composition system using correlation between melody and lyrics
CN105513607A (en) * 2015-11-25 2016-04-20 网易传媒科技(北京)有限公司 Method and apparatus for music composition and lyric writing
CN106652984A (en) * 2016-10-11 2017-05-10 张文铂 Automatic song creation method via computer
CN106991993A (en) * 2017-05-27 2017-07-28 佳木斯大学 A kind of mobile communication terminal and its composing method with music composing function

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3309766B2 (en) * 1996-05-27 2002-07-29 ヤマハ株式会社 Automatic melody generator and recording medium
JP3932258B2 (en) * 2002-01-09 2007-06-20 株式会社ナカムラ Emergency escape ladder
WO2016029217A1 (en) * 2014-08-22 2016-02-25 Zya, Inc. System and method for automatically converting textual messages to musical compositions
CN104391980B (en) * 2014-12-08 2019-03-08 百度在线网络技术(北京)有限公司 The method and apparatus for generating song
CN106652997B (en) * 2016-12-29 2020-07-28 腾讯音乐娱乐(深圳)有限公司 Audio synthesis method and terminal
CN108268530B (en) * 2016-12-30 2022-04-29 阿里巴巴集团控股有限公司 Lyric score generation method and related device

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339350A (en) * 2020-03-27 2020-06-26 腾讯音乐娱乐科技(深圳)有限公司 Data processing method, data processing device, storage medium and electronic equipment
CN111339350B (en) * 2020-03-27 2023-11-28 腾讯音乐娱乐科技(深圳)有限公司 Data processing method and device, storage medium and electronic equipment
CN111754962A (en) * 2020-05-06 2020-10-09 华南理工大学 Folk song intelligent auxiliary composition system and method based on up-down sampling
CN111754962B (en) * 2020-05-06 2023-08-22 华南理工大学 Intelligent auxiliary music composing system and method based on lifting sampling
CN112309435A (en) * 2020-10-30 2021-02-02 北京有竹居网络技术有限公司 Method and device for generating main melody, electronic equipment and storage medium
CN112951187A (en) * 2021-03-24 2021-06-11 平安科技(深圳)有限公司 Van-music generation method, device, equipment and storage medium
CN112951187B (en) * 2021-03-24 2023-11-03 平安科技(深圳)有限公司 Var-bei music generation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN109166564B (en) 2023-06-06
CN109166564A (en) 2019-01-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18927008

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18927008

Country of ref document: EP

Kind code of ref document: A1