CN110516103B - Song rhythm generation method, device, storage medium and apparatus based on classifier - Google Patents


Info

Publication number
CN110516103B
CN110516103B (application number CN201910720248.8A)
Authority
CN
China
Prior art keywords
lyric
rhythm
sentence
song
lyrics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910720248.8A
Other languages
Chinese (zh)
Other versions
CN110516103A (en)
Inventor
朱照华
王健宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910720248.8A priority Critical patent/CN110516103B/en
Publication of CN110516103A publication Critical patent/CN110516103A/en
Application granted granted Critical
Publication of CN110516103B publication Critical patent/CN110516103B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/60: Information retrieval of audio data
    • G06F 16/63: Querying
    • G06F 16/65: Clustering; Classification
    • G06F 16/68: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/683: Retrieval using metadata automatically derived from the content
    • G06F 16/685: Retrieval using an automatically derived transcript of audio data, e.g. lyrics

Abstract

The invention discloses a classifier-based song rhythm generation method, device, storage medium and apparatus. The method comprises the following steps: acquiring a lyric text to be processed, and extracting first-sentence lyrics from the lyric text to be processed; selecting a target row corresponding to the first-sentence lyrics from a statistical matrix of a preset song rhythm generation model; determining the starting position of the first-sentence rhythm according to the target row and a preset rule; extracting lyric characteristic information from the lyric text to be processed; performing note prediction through the preset song rhythm generation model according to the lyric characteristic information to obtain a target note time value corresponding to each lyric in the lyric text to be processed; and generating a song rhythm corresponding to the lyric text to be processed according to the starting position and the target note time values. Based on artificial intelligence, a reasonable musical rhythm is generated adaptively by the preset song rhythm generation model according to the lyrics; the method is not constrained by lyric length or paragraph length and has good adaptability.

Description

Song rhythm generation method, device, storage medium and apparatus based on classifier
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a classifier-based song rhythm generation method, device, storage medium and apparatus.
Background
Rhythm is one of the important components of music: its quality directly influences the expressive force of the music, rhythms are diverse, and rhythm is closely related to musical style. Compared with pure instrumental music, songs carry multiple elements, which gives automatic composition for songs a different character: the song composition process must consider not only the quality of the melody but also how the melody combines with the lyrics.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The invention mainly aims to provide a classifier-based song rhythm generation method, device, storage medium and apparatus, so as to solve the technical problem of the poor quality of automatic song composition in the prior art.
In order to achieve the above object, the present invention provides a song rhythm generating method based on a classifier, which comprises the following steps:
acquiring a lyric text to be processed, and extracting first sentence lyrics from the lyric text to be processed;
selecting a target row corresponding to the lyrics of the first sentence from a statistical matrix of a preset song rhythm generation model;
determining the starting position of the first sentence rhythm according to the target row and a preset rule;
extracting characteristic information of the lyrics from the lyric text to be processed;
performing note prediction through the preset song rhythm generation model according to the lyric characteristic information to obtain a target note time value corresponding to each lyric in the lyric text to be processed;
and generating a song rhythm corresponding to the lyric text to be processed according to the starting position and the target note time values.
Preferably, the determining the starting position of the first sentence rhythm according to the target row and a preset rule includes:
obtaining the probability of each element in the target row;
randomly generating a control parameter, and respectively matching the control parameter with each probability;
taking an element corresponding to the probability of successful matching as a target element;
and acquiring the rhythm position of the target element as the initial position of the rhythm of the first sentence.
Preferably, the selecting a target row corresponding to the first sentence of lyrics from a statistical matrix of a preset song rhythm generating model includes:
acquiring a first word number of the lyrics of the first sentence, and acquiring a second word number corresponding to each row in a statistical matrix of a preset song rhythm generation model;
matching the first word number with each second word number respectively;
and selecting a row corresponding to the successfully matched second word number from a statistical matrix of a preset song rhythm generation model as a target row corresponding to the lyrics of the first sentence.
Preferably, before the obtaining of the lyric text to be processed and the extracting of the first sentence lyrics from the lyric text to be processed, the method for generating the song rhythm based on the classifier further includes:
acquiring a training sample set;
and training a random forest model according to each music sample in the training sample set to obtain a preset song rhythm generation model.
Preferably, the training a random forest model according to each music sample in the training sample set to obtain a preset song rhythm generation model includes:
preprocessing each music sample in the training sample set to obtain preprocessed music samples;
carrying out feature statistics on the preprocessed music samples to obtain different types of feature information corresponding to the preprocessed music samples;
converting the characteristic information into a floating point number form to obtain conversion characteristic information in the floating point number form;
and training a random forest model according to the conversion characteristic information to obtain a preset song rhythm generation model.
Preferably, the preprocessing each music sample in the training sample set to obtain a preprocessed music sample includes:
acquiring the lyric number of each music sample in the training sample set;
extracting the melody parts of the music samples and counting the number of notes of the melody parts;
judging whether the lyric number is equal to the note number;
if the lyric number is not equal to the note number, traversing all note rhythms in the melody part, and searching note rhythms without lyric information;
and combining the searched note rhythm without the lyric information with the previous note rhythm with the lyric information to obtain a preprocessed music sample.
Preferably, the training the random forest model according to the conversion feature information to obtain a preset song rhythm generation model includes:
extracting sentence characteristic information from the conversion characteristic information;
counting the starting positions of sentences with different word numbers according to the sentence characteristic information, and recording them as a statistical matrix;
taking information except the sentence characteristic information in the conversion characteristic information as sample lyric characteristic information;
and training a random forest model according to the sample lyric characteristic information and the statistical matrix to obtain a preset song rhythm generation model.
Furthermore, to achieve the above object, the present invention further provides a classifier-based song tempo generating device, which includes a memory, a processor, and a classifier-based song tempo generating program stored in the memory and operable on the processor, wherein the classifier-based song tempo generating program is configured to implement the steps of the classifier-based song tempo generating method as described above.
In addition, to achieve the above object, the present invention also provides a storage medium having a classifier-based song tempo generation program stored thereon, which when executed by a processor, implements the steps of the classifier-based song tempo generation method as described above.
In addition, in order to achieve the above object, the present invention further provides a song rhythm generating device based on a classifier, including:
the extraction module is used for acquiring a lyric text to be processed and extracting first sentence lyrics from the lyric text to be processed;
the selection module is used for selecting a target row corresponding to the lyrics of the first sentence from a statistical matrix of a preset song rhythm generation model;
the determining module is used for determining the starting position of the first sentence rhythm according to the target row and a preset rule;
the extraction module is also used for extracting the characteristic information of the lyrics from the lyrics text to be processed;
the note prediction module is used for performing note prediction through the preset song rhythm generation model according to the lyric characteristic information to obtain a target note time value corresponding to each lyric in the lyric text to be processed;
and the generating module is used for generating the song rhythm corresponding to the lyric text to be processed according to the starting position and the target note time values.
According to the method, a lyric text to be processed is acquired, first-sentence lyrics are extracted from it, a target row corresponding to the first-sentence lyrics is selected from the statistical matrix of a preset song rhythm generation model, and the starting position of the first-sentence rhythm is determined according to the target row and a preset rule. Based on artificial intelligence, the starting position of the first-sentence rhythm is thus determined adaptively from the lyrics by the preset song rhythm generation model, so the method is not constrained by lyric length or paragraph length and has good adaptability. Lyric characteristic information is then extracted from the lyric text to be processed, note prediction is performed through the preset song rhythm generation model according to the lyric characteristic information to obtain a target note time value corresponding to each lyric in the lyric text to be processed, and a song rhythm corresponding to the lyric text to be processed is generated according to the starting position and the target note time values, so that a reasonable musical rhythm is generated adaptively from the lyrics.
Drawings
Fig. 1 is a schematic diagram of a classifier-based song tempo generation apparatus of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a first embodiment of a song tempo generation method based on a classifier according to the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a song tempo generation method based on a classifier according to the present invention;
FIG. 4 is a flowchart illustrating a third embodiment of a song tempo generation method based on a classifier according to the present invention;
fig. 5 is a block diagram illustrating a first embodiment of a song tempo generating apparatus based on a classifier according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a classifier-based song tempo generation device of a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the classifier-based song tempo generating apparatus may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display); optionally, the user interface 1003 may further include standard wired and wireless interfaces, and in the present invention the wired interface of the user interface 1003 may be a USB interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (WI-FI) interface). The memory 1005 may be a Random Access Memory (RAM) or a Non-Volatile Memory (NVM) such as disk storage. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the classifier-based song tempo generating device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a classifier-based song tempo generation program.
In the song tempo generation apparatus based on a classifier shown in fig. 1, the network interface 1004 is mainly used for connecting to a background server and communicating with the background server; the user interface 1003 is mainly used for connecting user equipment; the classifier-based song tempo generating apparatus invokes a classifier-based song tempo generating program stored in the memory 1005 via the processor 1001 and executes the classifier-based song tempo generating method provided by the embodiment of the present invention.
Based on the hardware structure, the embodiment of the song rhythm generation method based on the classifier is provided.
Referring to fig. 2, fig. 2 is a flowchart illustrating a first embodiment of a song tempo generation method based on a classifier according to the present invention, and proposes the first embodiment of the song tempo generation method based on a classifier according to the present invention.
In a first embodiment, the method for generating a song tempo based on a classifier comprises the steps of:
step S10: and acquiring a lyric text to be processed, and extracting first sentence lyrics from the lyric text to be processed.
It should be understood that the execution subject of this embodiment is the classifier-based song rhythm generating device, which may be an electronic device such as a personal computer or a server. The lyric text to be processed is a given lyric text W = {W1, W2, …, Wc}, where c represents the number of sentences in the lyric text. The lyric text to be processed can be split into a plurality of sentences, and the first-sentence lyrics are extracted from the split sentences in time order. A note stream S is obtained by initialization and is used for storing the generated rhythm sequence; a time parameter is likewise initialized and is used for synchronously recording the time in the note stream.
Step S20: and selecting a target row corresponding to the lyrics of the first sentence from a statistical matrix of a preset song rhythm generation model.
It can be understood that the word number d of the first sentence of lyrics is counted, each row in the statistical matrix of the preset song rhythm generation model is a row corresponding to a lyric sentence with different word numbers, and a designated row H [ d ] is selected from the statistical matrix H of the song rhythm generation model according to the word number d of the first sentence of lyrics, namely a target row corresponding to the sentence word number of the first sentence of lyrics.
Step S30: and determining the initial position of the rhythm of the first sentence according to the target line and a preset rule.
It should be noted that the starting position of the first sentence rhythm is determined according to the target row H[d]. The preset rule may be to randomly generate a control parameter r and determine the starting position of the first sentence rhythm according to the probability interval in which r falls. Specifically: the control parameter r is matched against the probability of each element in the target row, and the rhythm position corresponding to the successfully matched probability is taken as the starting position of the first sentence rhythm, i.e., the time parameter is updated to this position.
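As a sketch of this preset rule, the sampling step can be written as follows (the function name and the list-of-pairs layout of the target row are illustrative assumptions; the patent only specifies matching r against probability intervals):

```python
import random

def pick_start_position(target_row):
    """Sample a rhythm start position from one row of the statistical
    matrix H. `target_row` is a list of (position, probability) pairs whose
    probabilities sum to 1. A control parameter r is drawn uniformly at
    random and matched against the cumulative probability intervals; the
    position whose interval contains r is the starting position."""
    r = random.random()          # randomly generated control parameter
    cumulative = 0.0
    for position, prob in target_row:
        cumulative += prob
        if r < cumulative:       # r falls inside this probability interval
            return position
    return target_row[-1][0]     # guard against floating-point round-off

# Hypothetical row for seven-word first sentences: start on beat 0, 1 or 2.
row = [(0, 0.5), (1, 0.3), (2, 0.2)]
start = pick_start_position(row)
```

Because r is random, repeated calls yield different starting positions in proportion to the observed probabilities, which is what keeps the generated rhythms from being monotonous.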
Step S40: and extracting lyric characteristic information from the lyric text to be processed.
It should be understood that the lyric characteristic information includes position information, rhythm information and statistical information of the lyric. The position information of a lyric includes its word index in the current sentence, whether it falls in the first measure, whether it is the first word of the sentence, and whether it is the last word of the sentence. The rhythm information of the lyric includes the beat number. The statistical information of the lyric includes the numbers of sixteenth notes, triplets, eighth notes, quarter notes, half notes and other notes appearing in the sentence before the current position.
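The per-lyric feature vector described above might be assembled as in the following sketch (the function and field names are illustrative, not taken from the patent):

```python
def lyric_features(sentence, index, is_first_measure, note_counts, beat):
    """Assemble one lyric's feature vector: position information, rhythm
    information, and statistics over notes already emitted earlier in the
    sentence (`note_counts` maps note type to count)."""
    return {
        # position information
        "position_in_sentence": index,            # word index in the current sentence
        "is_first_measure": is_first_measure,
        "is_first_word": index == 0,
        "is_last_word": index == len(sentence) - 1,
        # rhythm information
        "beat_number": beat,
        # statistical information: notes in the sentence before this position
        "sixteenth_notes_before": note_counts.get("sixteenth", 0),
        "triplets_before": note_counts.get("triplet", 0),
        "eighth_notes_before": note_counts.get("eighth", 0),
        "quarter_notes_before": note_counts.get("quarter", 0),
        "half_notes_before": note_counts.get("half", 0),
        "other_notes_before": note_counts.get("other", 0),
    }

# First word of a six-character sentence, in the first measure, on beat 1.
f = lyric_features("我和我的祖国", 0, True, {}, beat=1)
```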
Step S50: and performing note prediction through the preset song rhythm generation model according to the characteristic information of the lyrics to obtain a target note time value corresponding to each lyric in the lyric text to be processed.
In the specific implementation, a large number of music samples are obtained, lyric characteristic information of the music samples is extracted, and a random forest model is trained through the lyric characteristic information of the large number of music samples to obtain a preset song rhythm generation model. Therefore, the lyric characteristic information extracted from the lyric text to be processed can be input into the preset song rhythm generation model, and the target note time values corresponding to the lyrics in the lyric text to be processed can be automatically predicted.
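The patent specifies only "a random forest model", not a library; as an illustration under that assumption, the training and note-prediction steps can be sketched with scikit-learn's RandomForestClassifier, using stand-in random feature vectors and labels:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Candidate note time values (in quarter-note units); the actual label
# set is not specified in the patent, so these are assumptions.
DURATIONS = [0.25, 0.5, 1.0, 2.0]

# Stand-in training data: each row is one lyric's feature vector
# (11 features, matching the categories listed above); each label is
# an index into DURATIONS.
rng = np.random.default_rng(0)
X_train = rng.random((200, 11))
y_train = rng.integers(0, len(DURATIONS), size=200)

# The "preset song rhythm generation model": a trained random forest.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Note prediction: one feature vector per lyric in the text to be processed.
X_lyrics = rng.random((7, 11))
target_durations = [DURATIONS[i] for i in model.predict(X_lyrics)]
```

In a real system X_train and y_train would come from the feature statistics of the MIDI training samples rather than random numbers.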
Step S60: and generating a song rhythm corresponding to the lyric text to be processed according to the starting position and the target note duration.
It can be understood that each generated target note time value is added to the note stream S and the time parameter is updated. It is then judged whether every lyric in the lyric text to be processed has output its corresponding target note time value; if the last lyric has not been reached, step S40 is executed again until every lyric has a corresponding target note time value. When target note time values have been generated for all lyrics in the lyric text to be processed, the target note time value of each lyric is added to the note stream according to the starting position and the time parameter is updated, so as to obtain the rhythm corresponding to the lyric text to be processed.
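The overall generation loop of steps S40 to S60 can be sketched as follows (`predict_duration` stands in for the preset song rhythm generation model; any callable mapping a word and the current time to a note time value):

```python
def generate_rhythm(sentences, start_position, predict_duration):
    """Starting from the sampled start position, append a predicted note
    time value to the note stream for every lyric, advancing the time
    parameter as each note is added."""
    note_stream = []            # initialized note stream S
    t = start_position          # time parameter: current position in S
    for sentence in sentences:
        for word in sentence:
            duration = predict_duration(word, t)
            note_stream.append((t, duration))  # target note time value for this lyric
            t += duration                      # update the time parameter
    return note_stream

# Two two-character sentences, constant half-beat notes, starting at t = 1.0.
stream = generate_rhythm(["你好", "世界"], 1.0, lambda w, t: 0.5)
```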
In this embodiment, a lyric text to be processed is acquired, first-sentence lyrics are extracted from it, a target row corresponding to the first-sentence lyrics is selected from the statistical matrix of the preset song rhythm generation model, and the starting position of the first-sentence rhythm is determined according to the target row and a preset rule. Based on artificial intelligence, the starting position of the first-sentence rhythm is thus determined adaptively from the lyrics by the preset song rhythm generation model, free from the constraints of lyric length and paragraph length and with good adaptability. Lyric characteristic information is then extracted from the lyric text to be processed, note prediction is performed through the preset song rhythm generation model according to the lyric characteristic information to obtain a target note time value corresponding to each lyric in the lyric text to be processed, and a song rhythm corresponding to the lyric text to be processed is generated according to the starting position and the target note time values, so that a reasonable musical rhythm is generated adaptively from the lyrics.
Referring to fig. 3, fig. 3 is a schematic flowchart of a second embodiment of the song rhythm generation method based on a classifier according to the present invention, and the second embodiment of the song rhythm generation method based on a classifier according to the present invention is provided based on the first embodiment shown in fig. 2.
In the second embodiment, the step S30 includes:
step S301: and acquiring the probability of each element in the target row.
It should be understood that, by using the property of a random forest model that a plurality of weak classifiers combine into a strong classifier, different decision tree models are constructed by analyzing specific features of the music, thereby producing a reliable rhythm generation model, i.e., the preset song rhythm generation model. In the training stage of the preset song rhythm generation model, the sentence characteristic information of the music samples is counted to obtain a statistical matrix H recording the word-count types of first sentences and the sentence starting positions under different word counts, and the counts in each row of H are then normalized row by row into probabilities. The probability of each element in the target row can therefore be obtained from the statistical matrix H of the preset song rhythm generation model.
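The training-stage statistics can be sketched as follows (a minimal illustration; the function name and the dict-of-dicts layout of H are assumptions, not the patent's data structures):

```python
from collections import Counter, defaultdict

def build_statistical_matrix(samples):
    """Build the statistical matrix H. Each training observation is a
    (first_sentence_word_count, start_position) pair taken from one music
    sample; each row of H corresponds to one word count and holds the
    probability of each observed start position (counts normalized row
    by row)."""
    counts = defaultdict(Counter)
    for word_count, start in samples:
        counts[word_count][start] += 1
    H = {}
    for word_count, positions in counts.items():
        total = sum(positions.values())
        H[word_count] = {start: n / total for start, n in positions.items()}
    return H

# Four observed first sentences: three with 7 words, one with 5.
H = build_statistical_matrix([(7, 0), (7, 0), (7, 2), (5, 1)])
```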
Step S302: and randomly generating a control parameter, and respectively matching the control parameter with each probability.
It can be understood that the value range of the probabilities of the elements in the target row is first obtained, and a control parameter r is randomly generated within this range; r is a random probability. The control parameter r is then matched against the probability of each element in the target row: a probability interval can be assigned to each element's probability in advance, and r is compared with each probability interval to realize the matching.
Step S303: and taking the element corresponding to the matching success probability as a target element.
It should be noted that, a probability interval corresponding to the probability of each element in the target row is set in advance, the control parameter r is compared with each probability interval, if the control parameter r is in a certain probability interval, the probability matching between the control parameter and the probability interval is considered to be successful, and the element corresponding to the probability of successful matching is acquired as the target element.
Step S304: and acquiring the rhythm position of the target element as the initial position of the rhythm of the first sentence.
In a specific implementation, in the training stage of the preset song rhythm generation model, the sentence characteristic information of the music samples is counted to obtain a statistical matrix H recording the word-count types of first sentences and the sentence starting positions under different word counts. The rhythm position of the target element is obtained from the target row of the statistical matrix H and is used as the starting position of the first sentence rhythm, that is, the time parameter is updated to this position.
In this embodiment, the step S20 includes:
acquiring a first word number of the lyrics of the first sentence, and acquiring a second word number corresponding to each row in a statistical matrix of a preset song rhythm generation model;
matching the first word number with each second word number respectively;
and selecting a row corresponding to the successfully matched second word number from a statistical matrix of a preset song rhythm generation model as a target row corresponding to the lyrics of the first sentence.
It should be understood that each row in the statistical matrix of the preset song rhythm generation model corresponds to lyric sentences of one word count. A first word number of the first-sentence lyrics is counted, a second word number corresponding to each row in the statistical matrix is obtained, and the first word number is compared with each second word number respectively; if the two word counts are consistent, the matching is determined to be successful.
Understandably, a row corresponding to a second word number which is successfully matched is selected from a statistical matrix of a preset song rhythm generation model, namely, a specified row H [ d ] corresponding to the first word number d of the lyrics of the first sentence is obtained and is used as a target row corresponding to the lyrics of the first sentence.
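The target-row selection described above reduces to a word-count lookup; a sketch under the assumption that H is keyed by word count (an illustrative layout, not the patent's):

```python
def select_target_row(first_sentence, H):
    """Count the words of the first sentence of lyrics (the first word
    number), compare it against the word count of each row in the
    statistical matrix H (the second word numbers), and return the row
    whose word count matches."""
    first_word_number = len(first_sentence)          # first word number
    for second_word_number, target_row in H.items():
        if first_word_number == second_word_number:  # matching succeeds
            return target_row
    return None   # no row with this word count in the matrix

# Rows for 5-word and 7-word first sentences; the input sentence has 5 words.
H = {5: {0: 0.6, 1: 0.4}, 7: {0: 1.0}}
row = select_target_row("今天天气好", H)
```

With a dict, the per-row comparison loop could of course be a direct `H.get(first_word_number)`; the loop mirrors the matching steps as stated.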
In this embodiment, the probability of each element in the target row is obtained, a control parameter is randomly generated and matched against each probability, the element corresponding to the successfully matched probability is taken as the target element, and the rhythm position of the target element is taken as the starting position of the first sentence rhythm. Because the starting position is determined from a randomly generated control parameter, the generated song rhythms are not monotonous but more varied, with good adaptability.
Referring to fig. 4, fig. 4 is a flowchart illustrating a third embodiment of the song tempo generating method based on a classifier according to the present invention, and the third embodiment of the song tempo generating method based on a classifier according to the present invention is provided based on the second embodiment shown in fig. 3.
In the third embodiment, before the step S10, the method further includes:
step S01: a set of training samples is obtained.
It should be understood that the training sample set includes a plurality of music samples, and different decision tree models are constructed by analyzing specific features in the music samples to generate a reliable song rhythm generation model. The training sample set T = {T1, T2, …, Ti, …, TN} consists of Musical Instrument Digital Interface (MIDI) files carrying lyric information, where N is an integer greater than or equal to 1 representing the number of music samples in the training sample set.
Step S02: and training a random forest model according to each music sample in the training sample set to obtain a preset song rhythm generation model.
It is understood that, for any music sample Ti, where i is an integer between 1 and N, U = {U1, U2, …, Ua} represents the lyric information in the music sample, a represents the number of lyrics, V = {V1, V2, …, Vb} represents the rhythm information of the melody vocal part in the music sample, and b represents the number of notes in the melody vocal part. Characteristic analysis is performed on each music sample in the training sample set, and different types of characteristic information of each music sample are counted, including sentence information, position information, rhythm information and statistical information. The random forest model is then trained on these different types of characteristic information to obtain the preset song rhythm generation model.
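The feature-statistics step over the samples can be sketched as follows, assuming U and V are already aligned one note per lyric; the function name and the particular four features emitted are illustrative stand-ins for the categories named in the text (the floats anticipate the floating-point conversion described below):

```python
def sample_feature_table(samples):
    """For each music sample, pair lyric information U with melody-part
    rhythm information V (len(U) == len(V) after preprocessing) and emit
    one floating-point feature row per lyric plus its note time value as
    the training label."""
    rows, labels = [], []
    for U, V in samples:          # U: lyrics, V: note time values
        for i, (lyric, duration) in enumerate(zip(U, V)):
            rows.append([
                float(i),                    # position in the sentence
                float(i == 0),               # first word of the sentence?
                float(i == len(U) - 1),      # last word of the sentence?
                float(len(U)),               # sentence word count
            ])
            labels.append(duration)          # note time value to predict
    return rows, labels

# One two-lyric sample: "你" sung on a half-beat note, "好" on a full beat.
X, y = sample_feature_table([(["你", "好"], [0.5, 1.0])])
```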
In this embodiment, the step S02 includes:
preprocessing each music sample in the training sample set to obtain preprocessed music samples;
carrying out feature statistics on the preprocessed music samples to obtain different types of feature information corresponding to the preprocessed music samples;
converting the characteristic information in a floating point number form to obtain conversion characteristic information in the floating point number form;
and training a random forest model according to the conversion characteristic information to obtain a preset song rhythm generation model.
It should be noted that, during training, each music sample in the training sample set needs to be preprocessed. Specifically, when constructing the preset song rhythm generation model it must be ensured that every note in the melody vocal part of each music sample corresponds to one lyric, so whether the number of lyrics a in the music sample equals the number of notes b in the melody vocal part needs to be detected. If a equals b, feature information statistics are performed on the music sample; otherwise, all notes in the rhythm information V of the melody vocal part are traversed to find note rhythms without lyric information, each such note rhythm is combined with the preceding rhythm that does carry lyric information, and the feature information statistics step is then performed. In this embodiment, the preprocessing of each music sample in the training sample set to obtain preprocessed music samples includes: acquiring the lyric number of each music sample in the training sample set; extracting the melody vocal part of each music sample and counting the number of notes in the melody vocal part; judging whether the lyric number is equal to the note number; if the lyric number is not equal to the note number, traversing all note rhythms in the melody vocal part and searching for note rhythms without lyric information; and combining each found note rhythm without lyric information with the preceding note rhythm with lyric information to obtain a preprocessed music sample.
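The merging step described above can be sketched as follows; the tuple representation of notes and the function names are illustrative assumptions, not the patent's own data structures:

```python
def merge_unlyriced_notes(notes):
    """notes: list of (duration, lyric) pairs; lyric is None when the note
    carries no lyric. Folds each lyric-less note's duration into the
    preceding lyric-bearing note, so every remaining note has a lyric."""
    merged = []
    for duration, lyric in notes:
        if lyric is None and merged:
            prev_dur, prev_lyric = merged[-1]
            merged[-1] = (prev_dur + duration, prev_lyric)  # combine durations
        else:
            merged.append((duration, lyric))
    return merged

def preprocess_sample(lyrics, notes):
    # Only merge when the lyric count a differs from the note count b,
    # as the preprocessing step describes.
    return notes if len(lyrics) == len(notes) else merge_unlyriced_notes(notes)
```

Notes tied across a melisma would be merged the same way, since only lyric-bearing onsets survive.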
In a specific implementation, the features of the music samples are counted according to the feature information shown in Table 1 below to obtain the different types of feature information.

TABLE 1 (the feature list is reproduced only as an image in the original publication)
It should be understood that, since a decision tree model requires its input features in floating-point form, the above feature information needs to be converted into floating-point form before it can be applied to the model: each yes/no judgement among the above features is represented as a floating-point value, and the sample's beat number is recorded as a single number combining the numerator a and denominator b of the time-signature marking a/b in the music, thereby obtaining the conversion feature information in floating-point form.
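As an illustration of this conversion, the sketch below maps a handful of plausible tabulated features to floats; the feature names and the particular packing of the time signature (a*10 + b) are assumptions, since the original table survives only as an image:

```python
def to_float_features(feat):
    """Hypothetical conversion of a feature record to the floating-point
    vector a decision tree expects: yes/no judgements become 0.0/1.0 and
    the time-signature marking a/b is packed into one number."""
    return [
        1.0 if feat["is_first_sentence"] else 0.0,
        1.0 if feat["is_first_word_in_sentence"] else 0.0,
        1.0 if feat["is_last_word_in_sentence"] else 0.0,
        float(feat["words_in_sentence"]),
        # Assumed packing of the a/b time signature: 4/4 -> 44.0
        float(feat["ts_numerator"] * 10 + feat["ts_denominator"]),
    ]
```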
It can be understood that, after the conversion characteristic information of all music samples is obtained, the following two operations are performed:
the first section is operative to count sentence characteristic information including whether or not a first sentence, the number of words in the sentence, and a start position of the first sentence from the conversion characteristic information, to obtain a vector O = { O1, O2, \8230;, on _ H } recording the number of kinds of words in the first sentence, and a statistical matrix H (n _ H, n _ w) of the start positions of the sentences under different numbers of words, where n _ H denotes the number of categories of the first sentences of different numbers in the music sample, n _ w denotes different beat positions in the music sample, H [ i, j ] denotes the number of sentences having the number of words O [ i ] and starting at jth beat among all the music samples, and then probabilizes all parameters of each line among the matrix H line by line to obtain the statistical matrix H.
The second operation takes the features other than the sentence features as sample lyric feature information, the sample lyric feature information comprising position information, rhythm information and statistical information. The sample lyric feature information is used to train the random forest model, with the note duration corresponding to the current lyric serving as the basis for constructing each decision tree. During training, subsets of music samples are randomly selected from all music samples to construct different decision trees, and this construction process is repeated until the final song rhythm generation model is built. In this embodiment, the training of the random forest model according to the conversion feature information to obtain the preset song rhythm generation model includes: extracting sentence feature information from the conversion feature information; counting the start positions of sentences with different word counts according to the sentence feature information and recording them as a statistical matrix; taking the information in the conversion feature information other than the sentence feature information as sample lyric feature information; and training the random forest model according to the sample lyric feature information and the statistical matrix to obtain the preset song rhythm generation model.
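A minimal, stand-alone sketch of this bootstrap training loop follows; the toy `majority_tree` builder and all names are assumptions standing in for real decision-tree construction over the lyric features:

```python
import random

def majority_tree(samples):
    """Toy 'decision tree': always predicts the most common note duration
    in its bootstrap sample (a real tree would split on lyric features)."""
    durations = [duration for _, duration in samples]
    most_common = max(set(durations), key=durations.count)
    return lambda features: most_common

def train_forest(samples, build_tree=majority_tree, n_trees=10, seed=0):
    """Random-forest-style loop as described: each tree is built from a
    randomly drawn (bootstrap) subset of the music samples, repeated
    until the forest is complete."""
    rng = random.Random(seed)
    return [build_tree([rng.choice(samples) for _ in samples])
            for _ in range(n_trees)]

def predict_duration(forest, features):
    """Majority vote over the trees' predicted note durations."""
    votes = [tree(features) for tree in forest]
    return max(set(votes), key=votes.count)
```

In practice each `samples` entry would pair a per-lyric feature vector with its observed note duration.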
In this embodiment, the training sample set is obtained and the random forest model is trained on each music sample in the training sample set to obtain the preset song rhythm generation model, so that a reasonable music rhythm is generated adaptively from the lyrics by the preset song rhythm generation model, free from the constraints of lyric length and paragraph length and therefore well adapted. Moreover, the random forest rhythm generation model can generate music rhythms of different styles when implemented with a training sample set of a specific style, and thus has good extensibility.
Furthermore, an embodiment of the present invention further provides a storage medium, where a classifier-based song tempo generation program is stored on the storage medium, and the classifier-based song tempo generation program, when executed by a processor, implements the steps of the classifier-based song tempo generation method as described above.
In addition, referring to fig. 5, an embodiment of the present invention further provides a song rhythm generating apparatus based on a classifier, where the song rhythm generating apparatus based on the classifier includes:
and the extraction module 10 is configured to obtain a lyric text to be processed, and extract first sentence lyrics from the lyric text to be processed.
It should be understood that the lyrics text to be processed is a given piece of lyrics text W = { W1, W2, \8230;, wc }, where c represents the number of sentences in the lyrics text. The lyric text to be processed can be split into a plurality of sentences, and the lyrics of the first sentence are extracted from the split sentences according to the time sequence. A note stream S can be obtained by initializing the note stream for storing the generated rhythm sequence; the time parameter d can be obtained by initializing the time parameter for synchronously recording the time in the note stream.
The selecting module 20 is configured to select the target row corresponding to the first-sentence lyrics from the statistical matrix of a preset song rhythm generation model.

It can be understood that the word count d of the first-sentence lyrics is counted. Each row in the statistical matrix of the preset song rhythm generation model corresponds to lyric sentences with a particular word count, and the designated row H[d], namely the target row corresponding to the word count of the first-sentence lyrics, is selected from the statistical matrix H of the song rhythm generation model according to the word count d.
The determining module 30 is configured to determine, according to the target row and a preset rule, the start position of the first-sentence rhythm.

It should be noted that the start position of the first-sentence rhythm is determined according to the target row H[d]. The preset rule may be to randomly generate a control parameter r and determine the start position of the first-sentence rhythm according to the probability interval in which r falls, specifically: the control parameter r is matched against the probability of each element in the target row, and the rhythm position corresponding to the successfully matched probability is taken as the start position of the first-sentence rhythm, i.e., the value of the time parameter d is updated.
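The probability-interval matching described above amounts to drawing a start beat from the target row's distribution; a sketch, with illustrative names:

```python
import random

def sample_start_position(row_probs, r=None):
    """Determine the first-sentence start beat from the target row H[d]:
    generate the control parameter r in [0, 1) and return the beat whose
    cumulative-probability interval contains r."""
    r = random.random() if r is None else r
    cumulative = 0.0
    for beat, p in enumerate(row_probs):
        cumulative += p
        if r < cumulative:
            return beat
    return len(row_probs) - 1  # guard against floating-point rounding
```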
The extraction module 10 is further configured to extract lyric feature information from the to-be-processed lyric text.
It should be understood that the lyric feature information includes position information, rhythm information and statistical information of the lyric. The position information includes the number of words in the sentence containing the lyric, whether the lyric falls in the first measure, whether the lyric is the first word in the sentence, and whether the lyric is the last word in the sentence; the rhythm information includes the beat number; and the statistical information includes the numbers of sixteenth notes, triplets, eighth notes, quarter notes, half notes, and other notes in the sentence before the current position.
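The statistical part of the lyric feature information could be computed along these lines; the duration-to-name mapping is an assumption (durations measured in quarter-note beats):

```python
from collections import Counter

# Assumed mapping from note durations (in quarter-note beats) to the
# note-type names used in the statistical features.
NOTE_TYPES = {0.25: "sixteenth", 1 / 3: "triplet", 0.5: "eighth",
              1.0: "quarter", 2.0: "half"}

def statistical_features(durations_before_position):
    """Count how many notes of each type occur in the sentence before the
    current position, as listed in the statistical information above."""
    counts = Counter(NOTE_TYPES.get(d, "other") for d in durations_before_position)
    return {name: counts.get(name, 0)
            for name in ("sixteenth", "triplet", "eighth", "quarter", "half", "other")}
```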
The note prediction module 40 is configured to perform note prediction through the preset song rhythm generation model according to the lyric feature information, obtaining the target note time value corresponding to each lyric in the lyric text to be processed.
In the specific implementation, a large number of music samples are obtained, lyric characteristic information of the music samples is extracted, and a random forest model is trained through the lyric characteristic information of the large number of music samples to obtain a preset song rhythm generation model. Therefore, the lyric characteristic information extracted from the lyric text to be processed can be input into the preset song rhythm generation model, and the target note time values corresponding to the lyrics in the lyric text to be processed can be automatically predicted.
The generating module 50 is configured to generate the song rhythm corresponding to the lyric text to be processed according to the starting position and the target note time values.

It can be understood that each generated target note time value is added to the note stream S and the value of the time parameter d is updated; whether every lyric in the lyric text to be processed has output a corresponding target note time value is judged, and if not, note prediction is continued for the remaining lyrics until every lyric in the lyric text to be processed has a corresponding target note time value. When all lyrics have generated corresponding target note time values, the target note time value of each lyric has been added to the note stream according to the starting position and the time parameter d updated, thereby obtaining the song rhythm corresponding to the lyric text to be processed.
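Putting the pieces together, the generation loop above can be sketched as follows, with `predict_duration` standing in for the trained song rhythm generation model:

```python
def generate_rhythm(lyrics, start_beat, predict_duration):
    """Assemble the song rhythm: start the note stream at the sampled
    start position, then append every lyric's predicted note duration
    while keeping the time parameter in sync with the stream."""
    note_stream, t = [], float(start_beat)
    for word in lyrics:
        duration = predict_duration(word, t)   # target note duration for this lyric
        note_stream.append((t, duration, word))
        t += duration                          # advance the synchronized time
    return note_stream
```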
In this embodiment, a lyric text to be processed is obtained, the first-sentence lyrics are extracted from it, the target row corresponding to the first-sentence lyrics is selected from the statistical matrix of the preset song rhythm generation model, and the start position of the first-sentence rhythm is determined according to the target row and a preset rule; the start position of the first-sentence rhythm is thus determined adaptively from the lyrics by the artificial-intelligence-based preset song rhythm generation model, free from the constraints of lyric length and paragraph length and with good adaptability. Lyric feature information is then extracted from the lyric text to be processed, note prediction is performed through the preset song rhythm generation model according to the lyric feature information to obtain the target note time value corresponding to each lyric in the lyric text to be processed, and the song rhythm corresponding to the lyric text to be processed is generated according to the start position and the target note time values, so that a reasonable music rhythm is generated adaptively from the lyrics by the preset song rhythm generation model.
In one embodiment, the classifier-based song rhythm generating apparatus further includes:
the acquisition module is used for acquiring the probability of each element in the target row;
the matching module is used for randomly generating a control parameter and matching the control parameter with each probability respectively;
the determining module 30 is further configured to take an element corresponding to the probability of successful matching as a target element;
the obtaining module is further configured to obtain a rhythm position of the target element as an initial position of the first sentence rhythm.
In an embodiment, the obtaining module is further configured to obtain a first word count of the lyrics of the first sentence, and obtain a second word count corresponding to each line in a statistical matrix of a preset song rhythm generation model;
the matching module is further used for matching the first word number with each second word number respectively;
the selecting module 20 is further configured to select, from the statistical matrix of the preset song rhythm generating model, a line corresponding to the second word number that is successfully matched as a target line corresponding to the first sentence of lyrics.
In one embodiment, the classifier-based song rhythm generating apparatus further includes:
the acquisition module is further used for acquiring a training sample set;
and the training module is used for training the random forest model according to each music sample in the training sample set to obtain a preset song rhythm generation model.
In an embodiment, the training module is further configured to pre-process each music sample in the training sample set to obtain a pre-processed music sample; carrying out feature statistics on the preprocessed music samples to obtain different types of feature information corresponding to the preprocessed music samples; converting the characteristic information in a floating point number form to obtain conversion characteristic information in the floating point number form; and training a random forest model according to the conversion characteristic information to obtain a preset song rhythm generation model.
In an embodiment, the training module is further configured to obtain a lyric number of each music sample in the training sample set; extracting the melody vocal part of each music sample, and counting the number of notes of each melody vocal part; judging whether the lyric number is equal to the note number; if the lyric number is not equal to the note number, traversing all note rhythms in the melody part, and searching note rhythms without lyric information; and combining the searched note rhythm without the lyric information with the previous note rhythm with the lyric information to obtain a preprocessed music sample.
In an embodiment, the training module is further configured to extract sentence feature information from the conversion feature information; counting the starting positions of sentences with different word numbers according to the sentence characteristic information, and recording the starting positions as a counting matrix; taking information except the sentence characteristic information in the conversion characteristic information as sample lyric characteristic information; and training a random forest model according to the sample lyric characteristic information and the statistical matrix to obtain a preset song rhythm generation model.
Other embodiments or specific implementation manners of the song rhythm generating device based on the classifier according to the present invention may refer to the above method embodiments, and are not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional like elements in the process, method, article, or system that comprises that element.
The above serial numbers of the embodiments of the present invention are for description only and do not represent the relative merits of the embodiments. In the unit claims enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words are to be interpreted merely as names used to distinguish the elements.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases the former is the better implementation. Based on such understanding, the technical solutions of the present invention, or the portions thereof contributing to the prior art, may be embodied in the form of a software product, where the computer software product is stored in a storage medium (such as a Read-Only Memory (ROM)/Random Access Memory (RAM), a magnetic disk, or an optical disk) and includes several instructions for enabling a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (8)

1. A song rhythm generating method based on a classifier is characterized by comprising the following steps of:
acquiring a lyric text to be processed, and extracting first-sentence lyrics from the lyric text to be processed;
selecting a target row corresponding to the lyrics of the first sentence from a statistical matrix of a preset song rhythm generation model;
determining the initial position of the rhythm of the first sentence according to the target line and a preset rule;
extracting lyric characteristic information from the lyric text to be processed;
performing note prediction through the preset song rhythm generation model according to the lyric feature information to obtain target note durations corresponding to the lyrics in the lyric text to be processed;
generating a song rhythm corresponding to the lyric text to be processed according to the starting position and the target note duration;
the determining of the initial position of the rhythm of the first sentence according to the target line and a preset rule comprises the following steps:
obtaining the probability of each element in the target row;
randomly generating a control parameter, and respectively matching the control parameter with each probability;
taking an element corresponding to the probability of successful matching as a target element;
acquiring the rhythm position of the target element as the initial position of the rhythm of the first sentence;
the selecting a target row corresponding to the lyrics of the first sentence from a statistical matrix of a preset song rhythm generation model comprises:
acquiring a first word number of the lyrics of the first sentence, and acquiring a second word number corresponding to each row in a statistical matrix of a preset song rhythm generation model;
matching the first word number with each second word number respectively;
and selecting a row corresponding to the successfully matched second word number from a statistical matrix of a preset song rhythm generation model as a target row corresponding to the lyrics of the first sentence.
2. The method for classifier based song tempo generation as claimed in claim 1, wherein said obtaining a text of lyrics to be processed, before extracting first sentence lyrics from said text of lyrics to be processed, said method for classifier based song tempo generation further comprises:
acquiring a training sample set;
and training a random forest model according to each music sample in the training sample set to obtain a preset song rhythm generation model.
3. The method for generating song tempos based on a classifier according to claim 2, wherein the training of the random forest model according to each music sample in the training sample set to obtain the preset song tempo generation model comprises:
preprocessing each music sample in the training sample set to obtain preprocessed music samples;
carrying out feature statistics on the preprocessed music samples to obtain different types of feature information corresponding to the preprocessed music samples;
converting the characteristic information in a floating point number form to obtain conversion characteristic information in the floating point number form;
and training a random forest model according to the conversion characteristic information to obtain a preset song rhythm generation model.
4. The classifier-based song tempo generation method of claim 3, wherein the pre-processing each music sample in the training sample set to obtain pre-processed music samples comprises:
acquiring the lyric number of each music sample in the training sample set;
extracting the melody vocal part of each music sample, and counting the number of notes of each melody vocal part;
judging whether the lyric number is equal to the note number;
if the lyric number is not equal to the note number, traversing all note rhythms in the melody part, and searching note rhythms without lyric information;
and combining the searched note rhythm without the lyric information with the previous note rhythm with the lyric information to obtain a preprocessed music sample.
5. The classifier-based song tempo generation method of claim 3, wherein the training of the random forest model according to the conversion feature information to obtain a preset song tempo generation model comprises:
extracting sentence characteristic information from the conversion characteristic information;
counting the starting positions of sentences with different word numbers according to the sentence characteristic information, and recording the starting positions as a counting matrix;
taking information except the sentence characteristic information in the conversion characteristic information as sample lyric characteristic information;
and training a random forest model according to the sample lyric characteristic information and the statistical matrix to obtain a preset song rhythm generation model.
6. A classifier-based song tempo generating apparatus, comprising: a memory, a processor, and a classifier-based song tempo generation program stored on the memory and executable on the processor, the classifier-based song tempo generation program when executed by the processor implementing the steps of the classifier-based song tempo generation method according to any one of claims 1-5.
7. A storage medium, characterized in that the storage medium has stored thereon a classifier-based song tempo generation program which, when executed by a processor, implements the steps of the classifier-based song tempo generation method according to any one of claims 1 to 5.
8. A classifier-based song tempo generating apparatus, comprising:
the extraction module is used for acquiring a lyric text to be processed and extracting first sentence lyrics from the lyric text to be processed;
the selection module is used for selecting a target row corresponding to the lyrics of the first sentence from a statistical matrix of a preset song rhythm generation model;
the determining module is used for determining the initial position of the first sentence rhythm according to the target line and a preset rule;
the extraction module is also used for extracting the characteristic information of the lyrics from the lyrics text to be processed;
the note prediction module is used for performing note prediction through the preset song rhythm generation model according to the lyric characteristic information to obtain a target note time value corresponding to each lyric in the lyric text to be processed;
the generating module is used for generating a song rhythm corresponding to the lyric text to be processed according to the starting position and the target note duration;
the determining module is further configured to obtain probabilities of elements in the target row; randomly generating a control parameter, and respectively matching the control parameter with each probability; taking an element corresponding to the probability of successful matching as a target element; acquiring the rhythm position of the target element as the initial position of the rhythm of the first sentence;
the selection module is further used for acquiring a first word number of the lyrics of the first sentence and acquiring a second word number corresponding to each row in a statistical matrix of a preset song rhythm generation model; matching the first word number with each second word number respectively; and selecting a row corresponding to the successfully matched second word number from a statistical matrix of a preset song rhythm generation model as a target row corresponding to the lyrics of the first sentence.
CN201910720248.8A 2019-08-02 2019-08-02 Song rhythm generation method, device, storage medium and apparatus based on classifier Active CN110516103B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910720248.8A CN110516103B (en) 2019-08-02 2019-08-02 Song rhythm generation method, device, storage medium and apparatus based on classifier


Publications (2)

Publication Number Publication Date
CN110516103A CN110516103A (en) 2019-11-29
CN110516103B true CN110516103B (en) 2022-10-14

Family

ID=68625187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910720248.8A Active CN110516103B (en) 2019-08-02 2019-08-02 Song rhythm generation method, device, storage medium and apparatus based on classifier

Country Status (1)

Country Link
CN (1) CN110516103B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111445897B (en) * 2020-03-23 2023-04-14 北京字节跳动网络技术有限公司 Song generation method and device, readable medium and electronic equipment
CN116343723B (en) * 2023-03-17 2024-02-06 广州趣研网络科技有限公司 Melody generation method and device, storage medium and computer equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105788589A (en) * 2016-05-04 2016-07-20 腾讯科技(深圳)有限公司 Audio data processing method and device
JP2016157086A (en) * 2015-02-26 2016-09-01 パイオニア株式会社 Lyrics voice output device, lyrics voice output method, and program
CN107871012A (en) * 2017-11-22 2018-04-03 广州酷狗计算机科技有限公司 Audio-frequency processing method, device, storage medium and terminal
CN109166564A (en) * 2018-07-19 2019-01-08 平安科技(深圳)有限公司 For the method, apparatus and computer readable storage medium of lyrics text generation melody
CN109841202A (en) * 2019-01-04 2019-06-04 平安科技(深圳)有限公司 Rhythm generation method, device and terminal device based on speech synthesis
CN109979497A (en) * 2017-12-28 2019-07-05 阿里巴巴集团控股有限公司 Generation method, device and system and the data processing and playback of songs method of song


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on an improved BPM audio rhythm feature extraction algorithm; Wu Hao et al.; Journal of Lanzhou University of Arts and Science (Natural Science Edition); 2018-07-10 (No. 04); 57-64 *

Also Published As

Publication number Publication date
CN110516103A (en) 2019-11-29

Similar Documents

Publication Publication Date Title
CN109523986B (en) Speech synthesis method, apparatus, device and storage medium
Gururani et al. An attention mechanism for musical instrument recognition
US8115089B2 (en) Apparatus and method for creating singing synthesizing database, and pitch curve generation apparatus and method
CN109920449B (en) Beat analysis method, audio processing method, device, equipment and medium
JPWO2010047019A1 (en) Statistical model learning apparatus, statistical model learning method, and program
CN110516103B (en) Song rhythm generation method, device, storage medium and apparatus based on classifier
CN109346043B (en) Music generation method and device based on generation countermeasure network
US10297241B2 (en) Sound signal processing method and sound signal processing apparatus
JPH09293083A (en) Music retrieval device and method
US20120300950A1 (en) Management of a sound material to be stored into a database
CN110010159B (en) Sound similarity determination method and device
CN113010730B (en) Music file generation method, device, equipment and storage medium
CN116710998A (en) Information processing system, electronic musical instrument, information processing method, and program
CN111198965B (en) Song retrieval method, song retrieval device, server and storage medium
JP2017161572A (en) Sound signal processing method and sound signal processing device
JP6954780B2 (en) Karaoke equipment
CN110517656B (en) Lyric rhythm generation method, device, storage medium and apparatus
JP2008233759A (en) Mixed model generating device, sound processor, and program
CN113196381A (en) Sound analysis method and sound analysis device
CN113658570B (en) Song processing method, apparatus, computer device, storage medium, and program product
CN117235300B (en) Song recommendation method, system and storage medium of intelligent K song system
US20240013760A1 (en) Text providing method and text providing device
CN116189636B (en) Accompaniment generation method, device, equipment and storage medium based on electronic musical instrument
CN116312425A (en) Audio adjustment method, computer device and program product
WO2019208391A1 (en) Method for presenting musicality information, musicality information presenting device, and musicality information presenting system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant