CN111739492B - Music melody generation method based on pitch contour curve - Google Patents
- Publication number
- CN111739492B (application CN202010559217.1A)
- Authority
- CN
- China
- Prior art keywords
- melody
- length
- long-term structure
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
- G10H1/0025—Automatic or semi-automatic music composition, e.g. producing random music, applying rules from music theory or modifying a musical piece
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/101—Music Composition or musical creation; Tools or processes therefor
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/101—Music Composition or musical creation; Tools or processes therefor
- G10H2210/111—Automatic composing, i.e. using predefined musical rules
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Abstract
The invention relates to the technical field of music generation, and in particular to a music melody generation method based on a pitch contour curve, comprising the following steps: step one, extracting the long-term structure information of the pitch contour in the frequency domain, where the long-term structure information comprises the low-frequency part of the pitch contour's frequency-domain sequence and reflects the long-term trend of the melody; step two, fitting the long-term structure information with a label-controlled neural network, so that long-term structure information corresponding to a given label can be generated; step three, training another neural network with the long-term structure information and the melody length information of the music data, so that the network can predict melody length information from long-term structure information. By exploiting the frequency-domain characteristics of the pitch contour curve, the invention generates music melodies with a controllable long-term structure whose distribution is closer to that of real music than that of melodies generated by a long short-term memory network.
Description
Technical Field
The invention relates to the technical field of music generation, in particular to a music melody generation method based on a pitch contour curve.
Background
Music generation has long been an active direction of exploration in the field of computer art. In the early days of computing, people began to use traditional algorithms to generate music. In recent years, deep neural networks have increasingly been applied to music generation, including long short-term memory (LSTM) networks, generative adversarial networks, convolutional neural networks, and improved variational autoencoders. The short pieces generated by these networks perform quite well; the generation of long pieces, however, remains under-studied. There is currently no good solution for giving the melody of a generated long piece reasonable phrase arrangement, a satisfying progression, and smooth transitions between sections. In view of this, we propose a music melody generation method based on the pitch contour.
Disclosure of Invention
To remedy the above shortcomings, the invention provides a music melody generation method based on a pitch contour.
The technical scheme of the invention is as follows:
A music melody generation method based on a pitch contour curve comprises the following steps:
step one, extracting the long-term structure information of the pitch contour in the frequency domain, where the long-term structure information comprises the low-frequency part of the pitch contour's frequency-domain sequence and reflects the long-term trend of the melody;
step two, fitting the long-term structure information with a label-controlled neural network, so that long-term structure information corresponding to a given label can be generated;
step three, training another neural network with the long-term structure information and the melody length information of the music data, so that the network can predict melody length information from long-term structure information;
step four, determining the length of the target melody with the trained neural network, and expanding the long-term structure in the frequency domain to obtain a rough melody curve;
step five, performing segment-by-segment vocabulary matching and replacement on the rough melody curve with a vocabulary collected from the music dataset, finally obtaining music with optimized details.
As a preferred technical scheme of the invention, the specific steps of the long-term structure fitting network in step two are as follows:
first, a suitable length is determined to compress the long-term structure; after reasonable selection, the compressed long-term structures are unified to a length of 300 bits;
then, the mean pitch of every melody is shifted to C3, i.e. 60; after the DC component of the frequency-domain sequence is deleted, the sequence carries only the long-term characteristics of the melody, decoupling it from the specific melody;
next, the real and imaginary parts of the frequency-domain sequence are separated and recombined into a sequence of length 600;
finally, label information describing the rise and fall of the melody's long-term structure is constructed and fed into the fitting network together with the corresponding long-term structure.
As a preferred technical scheme of the invention, an embedding-layer network is used within the long-term structure fitting network of step two to control the trend of the generated long-term structure.
As a preferred technical scheme of the invention, the specific steps of the melody length determination network in step four are as follows:
first, a melody frequency-domain sequence of a musical piece is generated with a long short-term memory network;
then, a module that assists in memorizing the low-frequency part is designed as the stopping mark of the long short-term memory network and can serve as a reference mark for the other frequency bands; on this basis, the auxiliary low-frequency memory module can be separated out as an independent network module, and this network can estimate the probable length of the melody from the low-frequency part of the frequency-domain sequence;
then, a range is determined for the melody lengths of the music used to train the network, and the output of the neural network is normalized with this range;
finally, this length range is mapped uniformly onto the (-1, 1) output range of the tanh activation function.
As a preferred technical scheme of the invention, the data format used to train the long short-term memory network is a pitch contour curve with a time step of one sixteenth note and pitches encoded with C3 mapped to 60; the long short-term memory network uses RMSProp as its optimizer, and the length of the generated melody is 500.
As a preferred technical scheme of the invention, the specific steps of vocabulary matching in step five are as follows:
first, the keys of all melodies in the music library are determined and uniformly transposed to C major;
then, the melodies are cut into segments of the vocabulary length to obtain a corpus;
finally, the corpus is matched piecewise against the rough melody generated by the neural network, the matching criterion being minimization of the mean squared error.
A preferred technical scheme of the invention comprises the following parameter settings:
the label length is set to 10;
the noise input length is 100;
the length of the output frequency-domain information is 600;
the frequency-domain intensity scaling factor is set to 0.2;
the long-term structure fitting network uses an Adam optimizer for parameter optimization, with the learning rate set to 1×10⁻⁴.
A preferred technical scheme of the invention further comprises the following parameter settings:
the length determination network performs parameter optimization with an Adam optimizer whose learning rate is set to 1×10⁻⁴;
the vocabulary matching length is set to 8, and classification labels are used for fast retrieval;
the melody length of the music is set to between 300 and 3000 bits, corresponding to melody durations of 40 seconds to 7 minutes.
Compared with the prior art, the invention has the beneficial effects that:
the invention generates the music melody with controllable long-term structure by utilizing the frequency domain characteristic of the pitch contour curve, and can realize the music distribution which is closer to and more realistic than the music generated by a long-short-time network.
Drawings
FIG. 1 is a basic framework diagram of the operational flow of the present invention;
FIG. 2 is a schematic diagram of a long-term structure-fitting network according to the present invention;
FIG. 3 is a schematic diagram of a length-determining network according to the present invention;
FIG. 4 is a diagram illustrating a vocabulary matching step according to the present invention;
FIG. 5 is a diagram of the long short-term memory network used in the comparative experiments of the present invention;
FIG. 6 is a schematic diagram of a method for calculating a rhythm transfer matrix according to the present invention;
FIG. 7 is a diagram showing the long-term structure of a music melody and the corresponding label according to the present invention.
Detailed Description
The following description of the embodiments of the present invention is made clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
In the description of the present invention, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", etc. indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings are merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention.
A music melody generation method based on a pitch contour curve comprises the following steps:
step one, extracting the long-term structure information of the pitch contour in the frequency domain, where the long-term structure information comprises the low-frequency part of the pitch contour's frequency-domain sequence and reflects the long-term trend of the melody;
step two, fitting the long-term structure information with a label-controlled neural network, so that long-term structure information corresponding to a given label can be generated;
step three, training another neural network with the long-term structure information and the melody length information of the music data, so that the network can predict melody length information from long-term structure information;
step four, determining the length of the target melody with the trained neural network, and expanding the long-term structure in the frequency domain to obtain a rough melody curve;
step five, performing segment-by-segment vocabulary matching and replacement on the rough melody curve with a vocabulary collected from the music dataset, finally obtaining music with optimized details.
In a specific operation process, as shown in fig. 1, a dataset of music data is first obtained; the music in the dataset is processed to obtain the compressed long-term structures, the long-term structure labels, and the set of music lengths. The long-term structure fitting network is trained with the long-term structures and their labels; the length determination network is trained with the long-term structures and the set of music lengths; and a basic melody vocabulary is collected from the dataset.
In a specific operation process, as shown in fig. 2, the specific steps of the long-term structure fitting network in step two are as follows:
first, a suitable length is determined to compress the long-term structure; after reasonable selection, the compressed long-term structures are unified to a length of 300 bits;
then, the mean pitch of every melody is shifted to C3, i.e. 60; after the DC component of the frequency-domain sequence is deleted, the sequence carries only the long-term characteristics of the melody, decoupling it from the specific melody;
next, the real and imaginary parts of the frequency-domain sequence are separated and recombined into a sequence of length 600;
finally, label information describing the rise and fall of the melody's long-term structure is constructed and fed into the fitting network together with the corresponding long-term structure.
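As an illustrative sketch of the extraction just described (the function name, the use of a real-valued FFT, and the zero-padding of short melodies are our assumptions, not details given by the patent), compression to 300 complex bins, DC removal, and real/imaginary recombination into a length-600 vector might look like:

```python
import numpy as np

def extract_long_term_structure(pitch_contour, keep_bins=300):
    """Keep the lowest `keep_bins` complex FFT bins of a pitch contour
    (DC removed) and split real/imaginary parts into one real-valued
    vector of length 2 * keep_bins."""
    contour = np.asarray(pitch_contour, dtype=float)
    # Shift the mean pitch to C3 = 60 as in the text; only the DC bin is
    # affected by this shift, and that bin is deleted below anyway.
    centred = contour - contour.mean() + 60.0
    spectrum = np.fft.rfft(centred)
    spectrum[0] = 0.0                      # delete the DC component
    low = spectrum[:keep_bins]             # low-frequency part only
    if low.size < keep_bins:               # pad very short melodies
        low = np.pad(low, (0, keep_bins - low.size))
    # separate real and imaginary axes, recombine into a length-600 vector
    return np.concatenate([low.real, low.imag])

vec = extract_long_term_structure(60 + 5 * np.sin(np.linspace(0, 20, 1500)))
print(vec.shape)  # (600,)
```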
It should be noted that, in the last step above, the melody is uniformly divided into ten regions; the mean pitch of each region is compared with the mean pitch of the full melody, regions above the mean being marked 1 and regions below it marked 0, which finally yields the 10-bit label.
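The 10-bit label computation can be sketched directly (the function name and the tie-handling at exactly the mean are our assumptions):

```python
import numpy as np

def structure_label(pitch_contour, regions=10):
    """Split the melody into `regions` equal parts and mark each part 1
    if its mean pitch exceeds the full-melody mean, else 0."""
    contour = np.asarray(pitch_contour, dtype=float)
    overall_mean = contour.mean()
    chunks = np.array_split(contour, regions)
    return [1 if chunk.mean() > overall_mean else 0 for chunk in chunks]

# low first half, high second half -> label [0]*5 + [1]*5
label = structure_label([60] * 50 + [72] * 50)
print(label)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```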
It should be noted that, as shown in fig. 2, in step two the long-term structure is fitted with fully connected layers; before this, a noise sequence of length 600 is input, and an embedding-layer network is used to control the trend of the generated long-term structure. The embedding layer is a special neural network layer whose connection weights are updated automatically from back-propagated gradient information. It encodes the input label information and maps it into a high-dimensional space, so that the other parts of the network can better interpret and act on the information the label carries.
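As a rough, framework-free stand-in for the role the embedding layer plays (the embedding dimension and the summing of per-bit rows are purely our assumptions, and a real implementation would learn the weights by back-propagation), a label lookup table might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

class LabelEmbedding:
    """Map a 10-bit structure label to a dense vector: each (position,
    bit-value) pair indexes a weight row, and the rows are summed so the
    fitting network receives a trainable encoding of the label."""
    def __init__(self, label_len=10, dim=16):
        # one weight row per (position, bit-value) pair
        self.weights = rng.normal(size=(label_len, 2, dim))

    def __call__(self, label):
        return sum(self.weights[i, bit] for i, bit in enumerate(label))

emb = LabelEmbedding()
print(emb([0, 1] * 5).shape)  # (16,)
```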
In a specific operation process, as shown in fig. 3, the specific steps of the melody length determination network in step four are as follows:
first, a melody frequency-domain sequence of a musical piece is generated with a long short-term memory network;
then, a module that assists in memorizing the low-frequency part is designed as the stopping mark of the long short-term memory network and can serve as a reference mark for the other frequency bands; on this basis, the auxiliary low-frequency memory module can be separated out as an independent network module, and this network can estimate the probable length of the melody from the low-frequency part of the frequency-domain sequence;
then, a range is determined for the melody lengths of the music used to train the network, and the output of the neural network is normalized with this range;
finally, this length range is mapped uniformly onto the (-1, 1) output range of the tanh activation function.
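The range normalization onto the tanh output interval can be written as a pair of simple affine maps, using the 300-3000-bit range given later in the document (the function names are ours):

```python
def normalize_length(length, lo=300, hi=3000):
    """Map a melody length in [lo, hi] onto [-1, 1], matching the tanh
    output range of the length determination network."""
    return 2.0 * (length - lo) / (hi - lo) - 1.0

def denormalize_length(y, lo=300, hi=3000):
    """Invert the mapping: turn a tanh-range network output back into a
    melody length in bits."""
    return (y + 1.0) / 2.0 * (hi - lo) + lo

print(normalize_length(300), normalize_length(3000))  # -1.0 1.0
print(round(denormalize_length(0.0)))                 # 1650
```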
It should be noted that the data format used to train the long short-term memory network is a pitch contour curve with a time step of one sixteenth note and pitches encoded with C3 mapped to 60; the long short-term memory network uses RMSProp as its optimizer, and the length of the generated melody is 500.
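A minimal sketch of this data format (the note representation as (pitch, beats) pairs and the helper name are our assumptions; only the sixteenth-note time step and the C3 = 60 convention come from the text):

```python
def encode_contour(notes, step=0.25):
    """Expand (midi_pitch, duration_in_beats) pairs into one pitch value
    per sixteenth-note time step (step = 0.25 of a quarter-note beat)."""
    contour = []
    for pitch, beats in notes:
        contour.extend([pitch] * int(round(beats / step)))
    return contour

# one quarter note on C3 (60) and an eighth note on D3 (62)
print(len(encode_contour([(60, 1.0), (62, 0.5)])))  # 4 + 2 = 6 steps
```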
In a specific operation process, as shown in fig. 4, the specific steps of vocabulary matching in step five are as follows:
first, the keys of all melodies in the music library are determined and uniformly transposed to C major;
then, the melodies are cut into segments of the vocabulary length to obtain a corpus;
finally, the corpus is matched piecewise against the rough melody generated by the neural network, the matching criterion being minimization of the mean squared error.
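The piecewise matching can be sketched as a brute-force nearest-word search under mean squared error (the patent's classification-label index for fast retrieval is omitted here; the function and variable names are ours):

```python
import numpy as np

def match_vocabulary(rough_melody, corpus, word_len=8):
    """Slide over the rough curve in word_len chunks and replace each
    chunk with the corpus word of minimum mean squared error."""
    rough = np.asarray(rough_melody, dtype=float)
    words = np.asarray(corpus, dtype=float)   # shape (n_words, word_len)
    out = []
    for start in range(0, len(rough) - word_len + 1, word_len):
        segment = rough[start:start + word_len]
        mse = ((words - segment) ** 2).mean(axis=1)  # MSE to every word
        out.append(words[mse.argmin()])
    return np.concatenate(out)

corpus = [[60] * 8, [62] * 8, [64] * 8]
result = match_vocabulary([60.4] * 8 + [63.7] * 8, corpus)
print(result)  # first segment snaps to 60, second to 64
```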
It should be noted that the above operation steps of the present invention include the following parameter settings:
the label length is set to 10;
the noise input length is 100;
the length of the outputted frequency domain information is 600;
the frequency domain intensity scaling factor is set to 0.2;
the long-term structure fitting network uses an Adam optimizer for parameter optimization, with the learning rate set to 1×10⁻⁴.
It should be noted that the above operation steps of the present invention further include the following parameter settings:
the length determination network performs parameter optimization with an Adam optimizer whose learning rate is set to 1×10⁻⁴;
the vocabulary matching length is set to 8, and classification labels are used for fast retrieval;
in addition, as shown in the diagram of the distribution of melody lengths in the music library in fig. 7, the melody lengths exhibit a clear distribution pattern, so the melody length of the music is limited to between 300 and 3000 bits, corresponding to melody durations of 40 seconds to 7 minutes.
A total of 120 melodies were generated with the networks described herein for performance assessment in the comparative experiments below. Considering the degree of optimization between networks, music generated by a long short-term memory network with the three-layer structure shown in fig. 5 was selected for the comparison experiments. Considering the training time of the long short-term memory network, the original music library was shortened before being used to train its parameters. The trained long short-term memory network was likewise used to generate 120 melodies for the performance comparison. There are many ways to quantify the internal variation of a melody, but they are all, in essence, rules describing melodic change. The statistical method for the rhythm and pitch transition patterns shown in fig. 6 was designed following the idea of a Markov chain. Considering the distribution of actual melodies, the size of the rhythm transition matrix is set to 16, corresponding to note lengths from one sixteenth note to one whole note. Following the same idea, a pitch-change transition matrix can also be computed; its size is set to 12, corresponding to pitch changes from one semitone to one octave.
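The Markov-chain-style transition statistics can be sketched for the rhythm case as follows (the binning of durations into sixteenth-note multiples and the row normalization are our reading of the text; the pitch-change matrix is analogous with size 12):

```python
import numpy as np

def rhythm_transfer_matrix(durations, size=16):
    """Count transitions between successive note durations, measured in
    sixteenth notes (1 = sixteenth, 16 = whole note), then normalize each
    row so observed rows form probability distributions."""
    idx = np.clip(np.asarray(durations, dtype=int), 1, size) - 1
    matrix = np.zeros((size, size))
    for a, b in zip(idx[:-1], idx[1:]):
        matrix[a, b] += 1
    row_sums = matrix.sum(axis=1, keepdims=True)
    # rows with no observed transitions stay all-zero
    return np.divide(matrix, row_sums, out=np.zeros_like(matrix),
                     where=row_sums > 0)

# quarter, quarter, half, quarter, whole (in sixteenth-note units)
m = rhythm_transfer_matrix([4, 4, 8, 4, 16])
print(m[3, 3], m[3, 7])  # quarter->quarter and quarter->half frequencies
```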
Using the performance statistics described above, the following table shows the mean squared error against the true values of the rhythm and pitch-change transition matrices for the method of the present invention and for the long short-term memory network method.
Comparing the results shows that the music generated by the proposed method is closer to the distribution of real music than the music generated by the long short-term memory network.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although the present description is organized by embodiment, not every embodiment contains only one independent technical solution; the description is written this way merely for clarity, and the embodiments may be combined as appropriate by those skilled in the art to form other implementations.
Claims (8)
1. A music melody generation method based on a pitch contour curve, characterized by comprising the following steps:
step one, extracting the long-term structure information of the pitch contour in the frequency domain, where the long-term structure information comprises the low-frequency part of the pitch contour's frequency-domain sequence and reflects the long-term trend of the melody;
step two, fitting the long-term structure information with a label-controlled neural network, so that long-term structure information corresponding to a given label can be generated;
step three, training another neural network with the long-term structure information and the melody length information of the music data, so that the network can predict melody length information from long-term structure information;
step four, determining the length of the target melody with the trained neural network, and expanding the long-term structure in the frequency domain to obtain a rough melody curve;
step five, performing segment-by-segment vocabulary matching and replacement on the rough melody curve with a vocabulary collected from the music dataset, finally obtaining music with optimized details.
2. The method for generating a musical melody based on a pitch contour as defined in claim 1, wherein the specific steps of the long-term structure fitting network in step two are as follows:
first, a suitable length is determined to compress the long-term structure; after reasonable selection, the compressed long-term structures are unified to a length of 300 bits;
then, the mean pitch of every melody is shifted to C3, i.e. 60; after the DC component of the frequency-domain sequence is deleted, the sequence carries only the long-term characteristics of the melody, decoupling it from the specific melody;
next, the real and imaginary parts of the frequency-domain sequence are separated and recombined into a sequence of length 600;
finally, label information describing the rise and fall of the melody's long-term structure is constructed and fed into the fitting network together with the corresponding long-term structure.
3. The method for generating a musical melody based on a pitch contour as defined in claim 1, wherein in step two an embedding-layer network is used within the long-term structure fitting network to control the trend of the generated long-term structure.
4. The method for generating a musical melody based on a pitch contour as defined in claim 1, wherein the specific steps of the melody length determination network in step four are as follows:
first, a melody frequency-domain sequence of a musical piece is generated with a long short-term memory network;
then, a module that assists in memorizing the low-frequency part is designed as the stopping mark of the long short-term memory network and can serve as a reference mark for the other frequency bands; on this basis, the auxiliary low-frequency memory module can be separated out as an independent network module, and this network can estimate the probable length of the melody from the low-frequency part of the frequency-domain sequence;
then, a range is determined for the melody lengths of the music used to train the network, and the output of the neural network is normalized with this range;
finally, this length range is mapped uniformly onto the (-1, 1) output range of the tanh activation function.
5. The method for generating a musical melody based on a pitch contour as defined in claim 4, wherein the data format used to train the long short-term memory network is a pitch contour curve with a time step of one sixteenth note and pitches encoded with C3 mapped to 60; the long short-term memory network uses RMSProp as its optimizer, and the length of the generated melody is 500.
6. The method for generating a musical melody based on a pitch contour as defined in claim 1, wherein the specific steps of vocabulary matching in step five are as follows:
first, the keys of all melodies in the music library are determined and uniformly transposed to C major;
then, the melodies are cut into segments of the vocabulary length to obtain a corpus;
finally, the corpus is matched piecewise against the rough melody generated by the neural network, the matching criterion being minimization of the mean squared error.
7. The method for generating a musical melody based on a pitch contour as defined in claim 1, wherein the method comprises the following parameter settings:
the label length is set to 10;
the noise input length is 100;
the length of the output frequency-domain information is 600;
the frequency-domain intensity scaling factor is set to 0.2;
the long-term structure fitting network uses an Adam optimizer for parameter optimization, with the learning rate set to 1×10⁻⁴.
8. The method for generating a musical melody based on a pitch contour as defined in claim 1, wherein the method further comprises the following parameter settings:
the length determination network performs parameter optimization with an Adam optimizer whose learning rate is set to 1×10⁻⁴;
the vocabulary matching length is set to 8, and classification labels are used for fast retrieval;
the melody length of the music is set to between 300 and 3000 bits, corresponding to melody durations of 40 seconds to 7 minutes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010559217.1A CN111739492B (en) | 2020-06-18 | 2020-06-18 | Music melody generation method based on pitch contour curve |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010559217.1A CN111739492B (en) | 2020-06-18 | 2020-06-18 | Music melody generation method based on pitch contour curve |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111739492A CN111739492A (en) | 2020-10-02 |
CN111739492B true CN111739492B (en) | 2023-07-11 |
Family
ID=72649711
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010559217.1A Active CN111739492B (en) | 2020-06-18 | 2020-06-18 | Music melody generation method based on pitch contour curve |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111739492B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112489606B (en) * | 2020-11-26 | 2022-09-27 | 北京有竹居网络技术有限公司 | Melody generation method, device, readable medium and electronic equipment |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002287757A (en) * | 2001-03-23 | 2002-10-04 | Yamaha Corp | Sound data transfer method and sound data transfer apparatus, and program |
CN1737798A (en) * | 2005-09-08 | 2006-02-22 | 上海交通大学 | Music rhythm sectionalized automatic marking method based on eigen-note |
CN101000765A (en) * | 2007-01-09 | 2007-07-18 | 黑龙江大学 | Speech synthetic method based on rhythm character |
KR20170128073A (en) * | 2017-02-23 | 2017-11-22 | 반병현 | Music composition method based on deep reinforcement learning |
WO2018065029A1 (en) * | 2016-10-03 | 2018-04-12 | Telefonaktiebolaget Lm Ericsson (Publ) | User authentication by subvocalization of melody singing |
WO2018155800A1 (en) * | 2017-02-24 | 2018-08-30 | Samsung Electronics Co., Ltd. | Mobile device and method for executing music-related application |
CN110263728A (en) * | 2019-06-24 | 2019-09-20 | 南京邮电大学 | Anomaly detection method based on improved pseudo- three-dimensional residual error neural network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5293460B2 (en) * | 2009-07-02 | 2013-09-18 | ヤマハ株式会社 | Database generating apparatus for singing synthesis and pitch curve generating apparatus |
Non-Patent Citations (3)
Title |
---|
Melody Extraction From Polyphonic Music Signals Using Pitch Contour Characteristics; Justin Salamon et al.; IEEE Transactions on Audio, Speech, and Language Processing; pp. 1759-1760 * |
Research and Implementation of a Query-by-Humming Music Retrieval System; Li Yang; China Masters' Theses Full-text Database; pp. 19-40 * |
Multi-scale Object Detection Based on Outer-Contour Blurring; Cheng Yanyun; Journal of Nanjing University of Posts and Telecommunications; Vol. 38, No. 2; pp. 78-80 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112364894B (en) | Zero sample image classification method of countermeasure network based on meta-learning | |
CN104854654B (en) | For the method and system using the speech recognition of search inquiry information to process | |
CN104347067A (en) | Audio signal classification method and device | |
CN110310666B (en) | Musical instrument identification method and system based on SE convolutional network | |
CN110287325A (en) | A kind of power grid customer service recommended method and device based on intelligent sound analysis | |
CN108197294A (en) | A kind of text automatic generation method based on deep learning | |
CN109727590A (en) | Music generating method and device based on Recognition with Recurrent Neural Network | |
CN102822889B (en) | Pre-saved data compression for tts concatenation cost | |
CN109801645B (en) | Musical tone recognition method | |
TW201417092A (en) | Guided speaker adaptive speech synthesis system and method and computer program product | |
Chen et al. | Diagnose Parkinson’s disease and cleft lip and palate using deep convolutional neural networks evolved by IP-based chimp optimization algorithm | |
CN111414513B (en) | Music genre classification method, device and storage medium | |
CN111382260A (en) | Method, device and storage medium for correcting retrieved text | |
CN111739492B (en) | Music melody generation method based on pitch contour curve | |
CN114676687A (en) | Aspect level emotion classification method based on enhanced semantic syntactic information | |
Marxer et al. | Unsupervised incremental online learning and prediction of musical audio signals | |
CN113948066A (en) | Error correction method, system, storage medium and device for real-time translation text | |
CN110675879B (en) | Audio evaluation method, system, equipment and storage medium based on big data | |
CN115841119A (en) | Emotional cause extraction method based on graph structure | |
Chuan et al. | Generating and evaluating musical harmonizations that emulate style | |
CN111178051A (en) | Building information model self-adaptive Chinese word segmentation method and device | |
CN107886132A (en) | A kind of Time Series method and system for solving music volume forecasting | |
CN110032642B (en) | Modeling method of manifold topic model based on word embedding | |
CN110134823B (en) | MIDI music genre classification method based on normalized note display Markov model | |
JP4945465B2 (en) | Voice information processing apparatus and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||