CN109935222B - Method and device for constructing chord transformation vector and computer readable storage medium - Google Patents


Info

Publication number
CN109935222B
CN109935222B (application CN201811409175.2A)
Authority
CN
China
Prior art keywords: chord, sample, neural network, network model, training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811409175.2A
Other languages
Chinese (zh)
Other versions
CN109935222A (en)
Inventor
马丹 (Ma Dan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Migu Cultural Technology Co Ltd
China Mobile Communications Group Co Ltd
Original Assignee
Migu Cultural Technology Co Ltd
China Mobile Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Migu Cultural Technology Co Ltd, China Mobile Communications Group Co Ltd filed Critical Migu Cultural Technology Co Ltd
Priority to CN201811409175.2A priority Critical patent/CN109935222B/en
Publication of CN109935222A publication Critical patent/CN109935222A/en
Application granted granted Critical
Publication of CN109935222B publication Critical patent/CN109935222B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a method, an apparatus, and a computer-readable storage medium for constructing chord transformation vectors. The method comprises the following steps: obtaining a chord sample to be analyzed; preprocessing the chord sample to be analyzed to obtain a sample coding data set; inputting the sample coding data set into a neural network model for training according to the time sequence of the chord progression of the chord sample, the output of the neural network model being the predicted chord code at time t; determining the training quality of the sample coding data set according to an objective function of the neural network model; and, when a preset training quality is met, obtaining the weights of the hidden layer of the neural network model for use as chord transformation vectors.

Description

Method and device for constructing chord transformation vector and computer readable storage medium
Technical Field
The present invention relates to the field of machine learning technologies, and in particular, to a method and an apparatus for constructing a chord transformation vector, and a computer-readable storage medium.
Background
In the prior art, chord analysis is performed on symbolic representations, which are high-level abstract summaries of human experience. For a machine to analyze chord data, the chords must be converted into a mathematical vector form that the machine can read and compute with. In the field of intelligent music research there is as yet no mature technical scheme for encoding chord data into a vector representation. Analyzing and processing chord data with manually defined rules requires extensive music theory and is extremely inefficient. Without a vector representation, the discrete chord data cannot be processed from the numerical-analysis point of view used in artificial intelligence, nor used as resource data for machine learning. The prior art therefore lacks a way of converting different chords into a vector representation, which hinders the application of chord data in the field of artificial intelligence.
Disclosure of Invention
To solve the foregoing technical problem, embodiments of the present invention provide a method, an apparatus, and a computer-readable storage medium for constructing a chord transformation vector.
The method for constructing the chord transformation vector provided by the embodiment of the invention comprises the following steps:
obtaining a chord sample to be analyzed;
preprocessing the chord sample to be analyzed to obtain a sample coding data set;
inputting the sample coded data set into a neural network model for training according to the time sequence of the chord progression of the chord sample, wherein the output of the neural network model is a predicted chord code at the time t;
determining the training quality of the sample encoding data set according to an objective function of the neural network model;
and obtaining the weight of the hidden layer of the neural network model when the preset training quality is met, and using the weight as a chord transformation vector.
Wherein the preprocessing the chord sample to be analyzed comprises:
determining a root note of the chord sample;
calculating the frequency of the root note;
taking the root note as the reference, normalizing the notes that constitute the chord sample to obtain a normalized chord;
splitting the normalized chord according to twelve-tone equal temperament to obtain a temperament chord;
performing one-hot encoding on the temperament chord to obtain an encoded chord;
combining the frequency of the root note with the encoded chord to obtain the sample coding data set;
the sample coding data set is divided into a sample coding training set and a sample coding verification set according to a preset proportion.
Wherein the inputting the sample encoded data set into a neural network model for training comprises:
and inputting the sample coding training set into the neural network model to be used as neurons for training the neural network model.
Wherein the inputting the sample encoding dataset into a neural network model for training further comprises:
calculating chord sample prediction accuracy of the neural network model on the sample code verification set at intervals of a preset training interval;
when the chord sample prediction accuracy is lower than a first prediction threshold value, judging that the quality of the predicted chord code output by the neural network model is unqualified;
and when the chord sample prediction accuracy is higher than a second prediction threshold value, judging that the quality of the predicted chord code is qualified.
Wherein the objective function is:
L = E[ ||chord_predict(t) - chord_ground_truth(t)||² ]
where chord_predict(t) is the predicted chord code at time t, and chord_ground_truth(t) is the chord code at time t in the sample coding training set.
Wherein the determining the training quality of the sample encoding dataset according to the objective function of the neural network model comprises:
calculating a current calculated value of the objective function;
when the current calculated value of the objective function is lower than a first loss threshold value, judging that the predicted chord code is qualified;
and when the current calculated value of the objective function is higher than a second loss threshold value, judging that the predicted chord code is unqualified.
Wherein the determining the training quality of the sample encoded data set according to the objective function of the neural network model further comprises:
and determining the quality of the predicted chord code output by the neural network model according to the current calculated value of the target function and the chord sample prediction accuracy.
The embodiment of the invention provides a device for constructing chord transformation vectors, which comprises:
the sample obtaining module is used for obtaining a chord sample to be analyzed;
the preprocessing module is used for preprocessing the chord sample to be analyzed to obtain a sample coding data set;
the input module is used for inputting the sample coded data set into a neural network model for training according to the time sequence of the chord progression of the chord sample, and the output of the neural network model is the predicted chord code at the time t;
a determining module, configured to determine a training quality of the sample encoding data set according to an objective function of the neural network model;
and the output module is used for obtaining the weight of the hidden layer of the neural network model when the preset training quality is met and using the weight as a chord transformation vector.
Wherein, in the preprocessing module, the preprocessing the chord sample to be analyzed includes:
determining a root note of the chord sample;
calculating the frequency of the root note;
taking the root note as the reference, normalizing the notes that constitute the chord sample to obtain a normalized chord;
splitting the normalized chord according to twelve-tone equal temperament to obtain a temperament chord;
performing one-hot encoding on the temperament chord to obtain an encoded chord;
combining the frequency of the root note with the encoded chord to obtain the sample coding data set;
the sample coding data set is divided into a sample coding training set and a sample coding verification set according to a preset proportion.
Wherein, in the input module, the inputting the sample coding data set into a neural network model for training includes:
and inputting the sample coding training set into the neural network model to be used as neurons for training the neural network model.
Wherein, in the input module, the inputting the sample encoding data set into a neural network model for training further comprises:
calculating chord sample prediction accuracy of the neural network model on the sample code verification set at intervals of a preset training interval;
when the chord sample prediction accuracy is lower than a first prediction threshold value, judging that the quality of the predicted chord code output by the neural network model is unqualified;
and when the chord sample prediction accuracy is higher than a second prediction threshold value, judging that the quality of the predicted chord code is qualified.
Wherein, in the determining module, the objective function is:
L = E[ ||chord_predict(t) - chord_ground_truth(t)||² ]
where chord_predict(t) is the predicted chord code at time t, and chord_ground_truth(t) is the chord code at time t in the sample coding training set.
Wherein, in the determining module, the determining the training quality of the sample encoding data set according to the objective function of the neural network model includes:
calculating a current calculated value of the objective function;
when the current calculated value of the objective function is lower than a first loss threshold value, judging that the predicted chord code is qualified;
and when the current calculated value of the objective function is higher than a second loss threshold value, judging that the predicted chord code is unqualified.
Wherein, in the determining module, the determining the training quality of the sample encoding data set according to the objective function of the neural network model further includes:
and determining the quality of the predicted chord code output by the neural network model according to the current calculated value of the target function and the chord sample prediction accuracy.
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements any one of the steps in the foregoing method for constructing a chord transformation vector.
In the technical scheme of the embodiment of the invention, a chord sample to be analyzed is obtained; preprocessing the chord sample to be analyzed to obtain a sample coding data set; inputting the sample coded data set into a neural network model for training according to the time sequence of the chord progression of the chord sample, wherein the output of the neural network model is a predicted chord code at the time t; determining the training quality of the sample encoding data set according to an objective function of the neural network model; and obtaining the weight of the hidden layer of the neural network model when the preset training quality is met, and using the weight as a chord transformation vector. Therefore, the chord conversion vector is constructed, the chord data is expressed in a vector form, and the chord data can be applied to the field of artificial intelligence.
Drawings
The accompanying drawings generally illustrate, by way of example and not by way of limitation, various embodiments discussed herein;
FIG. 1 is a diagram illustrating a mapping relationship of chord transformation into a vector according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for constructing a chord transformation vector according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a three-layer neural network model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for constructing a chord transformation vector according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an apparatus for constructing a chord transformation vector according to an embodiment of the present invention.
Detailed Description
So that the manner in which the features and elements of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings.
The embodiment of the invention provides a method for constructing a chord (chord) transformation vector (vector), which is characterized in that different types of chords are mapped into an N-dimensional vector space by utilizing training of a neural network, and each chord is regarded as a point in the space (as shown in fig. 1, fig. 1 is a schematic diagram for transforming three chords into two-dimensional vectors).
The embodiment of the invention provides a method for constructing a chord transformation vector that converts a symbolically represented chord into a vector representation, and offers a feasible, simple, and efficient neural-network-based training method to realize this embedding transformation. Following the approach of deep-learning language models, the neural network extracts feature information from a large number of chord samples in a black-box manner, adjusting the neuron weights so as to fit the samples. During training, the weights of the intermediate hidden layer constitute the chord transformation vectors to be obtained.
Fig. 2 is a schematic flowchart of a method for constructing a chord transformation vector according to an embodiment of the present invention, and as shown in fig. 2, the method includes the following steps:
step 201, obtaining a chord sample to be analyzed.
First, a large amount of chord data needs to be collected as chord samples, and the chord sequences in the data set must conform to the rules of tonality; that is, both the harmony of each chord and the reasonableness of the chord-progression trend must be considered. Existing MIDI samples can serve as the source, for example the public Nottingham MIDI data set.
Step 202, preprocessing the chord sample to be analyzed to obtain a sample coding data set.
To normalize the chord patterns for machine understanding, the chord data collected from the MIDI samples may be preprocessed to form a sample encoded data set.
In one embodiment, preprocessing the chord sample to be analyzed comprises: determining the root note of the chord sample; calculating the frequency of the root note; taking the root note as the reference, normalizing the notes that constitute the chord sample to obtain a normalized chord; splitting the normalized chord according to twelve-tone equal temperament to obtain a temperament chord; performing one-hot encoding on the temperament chord to obtain an encoded chord; combining the frequency of the root note with the encoded chord to obtain the sample coding data set; and dividing the sample coding data set into a sample coding training set and a sample coding verification set according to a preset ratio. Specifically, each chord is expressed as "root frequency" (a floating-point number) + "chord mode" (a one-hot encoding).
"root" is the expression of the note code of the root of the chord in the MIDI standard, such as for the tertiary chord Em of C key, the constituent notes are 3 (treble) 5 (treble), 7 (west), and the root is 3 (miam) of C key, then according to the MIDI standard, the note is encoded as 78, according to the frequency calculation formula:
(d-69)/12
f=2*440Hz
by calculation, the root note frequency of 3 (mials) with root note of C key is expressed as: 739.98 Hz.
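The formula can be checked with a short script; the following is a minimal Python sketch (the function name is illustrative, not from the patent):

```python
# Minimal sketch of the MIDI-note-to-frequency conversion used above
# (standard equal-temperament formula; the function name is illustrative).
def midi_note_to_frequency(d: int) -> float:
    """Convert MIDI note number d to a frequency in Hz (A4 = note 69 = 440 Hz)."""
    return 2 ** ((d - 69) / 12) * 440.0

print(midi_note_to_frequency(78))  # ~739.99 Hz, i.e. the 739.98 Hz root frequency quoted above
```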
The "chord pattern" patterns a common chord and can be summarized as 30 patterns [ 'maj', 'min', 'dim', 'aug', 'sus 2', … ], and the like. Then, the root note is unified to be 0 based on the patterns, and the notes such as the rest three notes, the five notes and the like are correspondingly adjusted on the basis of keeping the interval unchanged to obtain a unified pattern, for example, the constituent notes of the Em minor chord are 3 (mi) 5 (treble), and 7 (west) is normalized to obtain [0,2,4 ]. And then splitting (considering semitones) according to twelve equal temperaments in the temperaments to obtain final mode data of [0,3,7], and similarly obtaining a chord mode of an Am chord normalized by the min chord of [0,4,7], thereby completing the unification of the chord modes.
The coding modes are shown in table 1:
Index | Mode name | Data
0     | 'maj'     | [0,4,7]
1     | 'min'     | [0,3,7]
2     | 'dim'     | [0,3,6]
…     | …         | …
30    | '7-13'    | [0,4,7,10,20]
TABLE 1
To facilitate model training, the normalized mode vectors above are subjected to one-hot encoding.
Combining the two steps above, a specific chord can be represented as a floating-point (float) variable plus a one-hot code. For example, the Em chord is finally fed into the neural network for machine learning as: (739.98, [0,0,0,0,1,…,0]). The sample coding data set obtained through this preprocessing is then divided into a sample coding training set and a sample coding verification set according to a preset ratio (for example, 8:2).
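A hedged sketch of this combined representation and of the train/validation split follows; the mode list, its ordering, and the helper names are illustrative assumptions rather than the patent's actual encoding order:

```python
import random

MODES = ['maj', 'min', 'dim', 'aug', 'sus2']            # illustrative subset of the 30 patterns in Table 1

def encode_chord(root_midi_note, mode):
    """Return the (root frequency, one-hot chord mode) pair fed to the network."""
    freq = 2 ** ((root_midi_note - 69) / 12) * 440.0    # floating-point root frequency
    one_hot = [1 if m == mode else 0 for m in MODES]    # one-hot chord mode
    return freq, one_hot

def split_dataset(samples, train_ratio=0.8):            # e.g. the 8:2 split mentioned above
    samples = list(samples)
    random.shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]

print(encode_chord(78, 'min'))                          # Em example: (~739.99, [0, 1, 0, 0, 0])
```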
Although the chords are now encoded symbolically, this encoding still has drawbacks: the data are presented as isolated, split items, so the correlation between chords cannot be reasonably expressed; and the mixed representation of a floating-point number plus a one-hot code lacks uniformity, which hinders further computation. Nevertheless, this combined encoding carries enough basic semantics for a machine to correctly understand the input chord data, so it can serve as the basic input data for training the neural network model, from which the spatially correlated chord vectors sought by the embodiment of the invention are finally extracted.
And 203, inputting the sample coded data set into a neural network model for training according to the time sequence of the chord progression of the chord sample, wherein the output of the neural network model is the predicted chord code at the time t.
The neural network model here may have n layers (n ≥ 3); only a shallow three-layer network is taken as the example below. As shown in fig. 3, the model is divided into an input layer, a hidden layer and an output layer.
Input layer: it simply accepts the symbolically expressed chord data. Each input chord sequence must be reasonable at the music-theory level, e.g. Am->Dm->G->C->Em->C… or C->Am->G->…, input according to the time order of the chord progression. The format of the input data is the mixed "root frequency + chord mode" form. As shown in fig. 3, the input layer takes the chord data at times t-1, t+1 and t+2.
Hidden layer: unlike natural language processing, the category information of chords is relatively small, so the processing differs from that of a language model: the efficient computation of simple vector addition can be used, and an activation function may also be added to apply a nonlinear transformation to the layer's input and improve the comprehensiveness of feature extraction. The trained weights of the hidden-layer neurons are the chord vectors that the embodiment of the invention ultimately seeks to obtain.
Output layer: its data format is consistent with that of the input layer, and it outputs the predicted chord code at time t. It consists of two branches: one performs the classification of the one-hot code based on a regression model (such as a softmax function), and the other fits the floating-point frequency value of the root note; both take part in the subsequent computation of the objective function.
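A minimal numpy sketch of this three-layer structure is given below; the layer sizes, the initialisation, and the exact wiring of the two output branches are assumptions of the sketch, not the patent's implementation:

```python
import numpy as np

n_modes, hidden_dim = 30, 16
input_dim = 1 + n_modes                                  # root frequency + one-hot chord mode

rng = np.random.default_rng(0)
W_hidden = rng.normal(scale=0.01, size=(input_dim, hidden_dim))  # rows later read off as chord vectors
W_mode = rng.normal(scale=0.01, size=(hidden_dim, n_modes))      # classification branch
W_freq = rng.normal(scale=0.01, size=(hidden_dim, 1))            # root-frequency regression branch

def forward(x):
    """x: input vector of length input_dim; returns (mode probabilities, predicted frequency)."""
    h = np.tanh(x @ W_hidden)                            # hidden layer with an optional non-linearity
    logits = h @ W_mode
    mode_probs = np.exp(logits) / np.exp(logits).sum()   # softmax branch for the one-hot chord mode
    freq_pred = (h @ W_freq).item()                      # regression branch fitting the root frequency
    return mode_probs, freq_pred
```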
In one embodiment, the training of the input of the sample encoded data set into a neural network model comprises: and inputting the sample coding training set into the neural network model to be used as neurons for training the neural network model.
In one embodiment, the inputting the sample encoding dataset into a neural network model for training further comprises: calculating the chord sample prediction accuracy of the neural network model on the sample coding verification set at preset training intervals; when the chord sample prediction accuracy is lower than a first prediction threshold, judging that the quality of the predicted chord code output by the neural network model is unqualified; and when the chord sample prediction accuracy is higher than a second prediction threshold, judging that the quality of the predicted chord code is qualified. For example, the chord sample prediction accuracy on the sample coding verification set may be checked every 50 epochs, with accuracy below 60% judged unqualified and accuracy above 95% judged qualified or excellent.
Step 204, determining the training quality of the sample encoding data set according to the objective function of the neural network model.
In one embodiment, the objective function may be selected as:
L = E[ ||chord_predict(t) - chord_ground_truth(t)||² ]
where chord_predict(t) is the predicted chord code at time t, and chord_ground_truth(t) is the chord code at time t in the sample coding training set.
The objective function, i.e. the loss function, means: take the expected value of the squared L2 (Euclidean) distance between chord_predict(t) and chord_ground_truth(t).
In general, the closer the objective function is to zero, the better the neural network model fits the distribution, and the more correctly the dense vectors extracted from the hidden layer can represent the relationships between chords. The specific decision thresholds for the objective function are influenced by external conditions such as the quality and size of the training set and the complexity of the model, and need to be determined for the specific application scenario.
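A hedged numpy sketch of this objective function follows; averaging the squared L2 distance over a batch stands in for the expectation:

```python
import numpy as np

def objective(chord_predict, chord_ground_truth):
    """L = E[ ||chord_predict(t) - chord_ground_truth(t)||^2 ], estimated over a batch.

    Both arguments are arrays of shape (batch, code_dim)."""
    diff = np.asarray(chord_predict) - np.asarray(chord_ground_truth)
    return float(np.mean(np.sum(diff ** 2, axis=1)))
```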
In one embodiment, the determining the training quality of the sample encoding dataset according to the objective function of the neural network model comprises: calculating a current calculated value of the objective function; when the current calculated value of the objective function is lower than a first loss threshold, judging that the predicted chord code is qualified; and when the current calculated value of the objective function is higher than a second loss threshold, judging that the predicted chord code is unqualified. For example, the objective function value may be monitored, with a value above 0.5 judged unqualified and a value below 0.0001 judged qualified or excellent.
In one embodiment, the determining the training quality of the sample encoding dataset according to the objective function of the neural network model further comprises: determining the quality of the predicted chord code output by the neural network model according to the current calculated value of the objective function and the chord sample prediction accuracy.
Specifically, the sample training and the determination of the training quality of the embodiment of the present invention may be performed according to steps S1-S5:
step S1: and inputting the sample coding training set into the neural network model to be used as the neuron for training the neural network model.
Step S2: and observing the chord sample prediction accuracy of the neural network model on the sample code verification set at regular training intervals (for example, setting the chord sample prediction accuracy to be every 50 cycles), wherein the chord sample prediction accuracy is unqualified for less than 60 percent, and is qualified or excellent for more than 95 percent.
Step S3: and observing the value of the objective function, wherein the value of the objective function is unqualified if the value of the objective function is more than 0.5, and qualified or excellent if the value of the objective function is less than 0.0001, wherein the value of the objective function is related to the realization of the neural network model and the selection of the sample coding data set.
Step S4: and determining whether the training duration and the hyper-parameter of the neural network model are reasonable or not based on the chord sample prediction accuracy and the objective function value.
Step S5: and if the training time of the neural network model and the hyper-parameters are unreasonable, adjusting the learning rate of the hyper-parameters. If the chord sample prediction accuracy rate and the target function value have larger oscillation, the learning rate is reduced; if the chord sample prediction accuracy and the objective function value tend to be smooth but the total value is larger and does not decrease, the learning rate is increased, and the step S2 is returned to continue training to achieve the best training quality.
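The monitoring logic of steps S1-S5 can be sketched as follows; the three callables (training, accuracy evaluation, loss evaluation), the oscillation test and the adjustment factors are assumptions, while the 50-epoch interval and the 60% / 95% and 0.5 / 0.0001 thresholds come from the example above:

```python
def train_until_qualified(train_epochs, evaluate_accuracy, evaluate_loss,
                          learning_rate=0.01, interval=50, max_rounds=200):
    """Repeat steps S1-S5 until the preset training quality is met.

    train_epochs(n, lr), evaluate_accuracy(), evaluate_loss() are caller-supplied hooks."""
    prev_loss = None
    for _ in range(max_rounds):
        train_epochs(interval, learning_rate)                  # S1: train on the sample coding training set
        accuracy = evaluate_accuracy()                         # S2: accuracy on the verification set
        loss = evaluate_loss()                                 # S3: current objective function value
        if accuracy > 0.95 and loss < 0.0001:                  # preset training quality met
            return learning_rate
        if prev_loss is not None:                              # S4/S5: adjust the learning rate
            if abs(loss - prev_loss) > 0.5 * prev_loss:        # strong oscillation -> reduce
                learning_rate *= 0.5
            elif loss > 0.5 and abs(loss - prev_loss) < 0.01 * prev_loss:  # plateau at a high loss -> increase
                learning_rate *= 2.0
        prev_loss = loss
    return learning_rate
```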
And step 205, obtaining the weight of the hidden layer of the neural network model when the preset training quality is met, and using the weight as a chord transformation vector.
When the training quality reaches the preset standard, for example when the current calculated value of the objective function is less than 0.0001 and the chord sample prediction accuracy of the neural network model on the sample coding verification set is greater than 95%, the weights of the hidden layer at that moment are extracted as the finally obtained dense chord transformation vectors, Chord2Vector.
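A minimal sketch of this extraction step follows; the stand-in weight matrix, the row-to-chord mapping and the vector dimension are assumptions of the sketch:

```python
import numpy as np

# Stand-in for the trained hidden-layer weights; in practice these are read from
# the model once the objective value drops below 0.0001 and accuracy exceeds 95%.
W_hidden = np.random.randn(31, 16)

chord_library = ['maj', 'min', 'dim']                    # illustrative subset of the chord library
chord2vector = {name: W_hidden[i] for i, name in enumerate(chord_library)}
print(chord2vector['min'].shape)                         # (16,): one dense chord transformation vector
```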
Example one
1. Experimental model: a three-layer neural network model is built for testing, based on the neural network architecture described in the previous embodiment.
2. Data set: the public Nottingham MIDI data set is used as training data; 4000 chord track segments are extracted from it, and the data are cleaned and pre-encoded in the manner described in the embodiment above.
3. Key experimental environment: OS: Ubuntu 16.04; deep learning framework: TensorFlow 1.2.1; graphics card: NVIDIA 1070 Ti (8 GB video memory).
4. Training results: it is expected that, after sufficient training, the vector-space distance between similar chords should be smaller than the distance between clearly different chords. Here the similar pair is taken as Bm and B7 (the two are strongly interchangeable), the dissimilar pair as Bm and Am, and the training results are shown in Table 2:
Result   | Training duration   | Loss   | Dis_Bm_B7 | Dis_Bm_Am
Result 1 | 3 hours 20 minutes  | 2.8839 | 0.7236    | 0.6668
Result 2 | 8 hours 15 minutes  | 1.3611 | 0.4090    | 0.7224
Result 3 | 14 hours 15 minutes | 0.8735 | 0.1042    | 0.8670
Result 4 | 26 hours 30 minutes | 0.2245 | 0.3566    | 0.8252
TABLE 2
The fourth column of Table 2 records the vector-space distance between the Bm and B7 chords computed by the neural network, and the fifth column records the distance between the Bm and Am chords; the smaller the former and the larger the latter, the better the experimental target is met. In terms of training effect, the expected outcome is "Result 3": the chord vectors extracted there accurately represent the relationships between chords and can be used for further computation and training.
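The comparison behind Table 2 can be reproduced with a short distance check; the placeholder vectors below stand in for trained Chord2Vector embeddings:

```python
import numpy as np

def euclidean_distance(a, b):
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    return 1.0 - float(a @ b) / (float(np.linalg.norm(a)) * float(np.linalg.norm(b)))

bm, b7, am = np.random.randn(3, 16)                      # placeholders for trained chord vectors
# After sufficient training (cf. Result 3), the similar pair should be the closer one:
print(euclidean_distance(bm, b7) < euclidean_distance(bm, am))
```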
The core idea of the embodiment of the invention is as follows: to convert chord data into a low-dimensional vector representation that reflects their correlation, a training model is built with a neural network. The training output of the model is not the focus of the invention; rather, the matrix formed by the hidden-layer node weights during training best approximates the distribution of the desired chord vector space, each row of the matrix represents one of the chord vectors sought by the invention, and the number of rows equals the size of the chord library. The invention can also approximate the trend of the predicted sequence by constructing a multilayer neural network with chord sequences as training data.
By converting chord data into vector representations through the method of constructing chord transformation vectors, the distances between different chord vectors in the low-dimensional continuous vector space (cosine distance, Euclidean distance, etc., chosen according to the application scenario) can intuitively express, or quantitatively measure, the correlation and difference between chords, representing their relationships simply and efficiently. Based on a given large music database, the correlation between chord data can be measured and music examined at the granularity of chords, allowing the melodic framework, emotional tendency and so on of a piece to be analyzed comprehensively, and providing a new theoretical basis for intelligent music, music-theory analysis, composition and related work.
The invention further converts chords, which exhibit temporal continuity and contextual correlation, into vectors via the method of constructing chord transformation vectors. Because chords are substitutable to a certain extent and their number is small compared with the vocabulary of a natural language, the method does not need the Huffman-tree coding and hierarchical softmax structures commonly used in comparable language models, and can more conveniently construct the mapping of the distributed sparse data.
Fig. 4 provides an apparatus for constructing a chord transformation vector according to an embodiment of the present invention, and as shown in fig. 4, the apparatus 400 for constructing a chord transformation vector includes:
a sample obtaining module 401, configured to obtain a chord sample to be analyzed;
a preprocessing module 402, configured to preprocess the chord sample to be analyzed to obtain a sample encoded data set;
an input module 403, configured to input the sample encoded data set into a neural network model for training according to a time sequence of chord progression of the chord sample, where an output of the neural network model is a predicted chord code at time t;
a determining module 404, configured to determine a training quality of the sample encoding data set according to an objective function of the neural network model;
and the output module 405 is configured to obtain a weight of the hidden layer of the neural network model when the preset training quality is met, and use the weight as a chord transformation vector.
In one embodiment, in the preprocessing module 402, preprocessing the chord sample to be analyzed includes: determining the root note of the chord sample; calculating the frequency of the root note; taking the root note as the reference, normalizing the notes that constitute the chord sample to obtain a normalized chord; splitting the normalized chord according to twelve-tone equal temperament to obtain a temperament chord; performing one-hot encoding on the temperament chord to obtain an encoded chord; combining the frequency of the root note with the encoded chord to obtain the sample coding data set; and dividing the sample coding data set into a sample coding training set and a sample coding verification set according to a preset ratio.
In one embodiment, in the inputting module 403, the inputting the sample coding data set into the neural network model for training includes: and inputting the sample coding training set into the neural network model to be used as neurons for training the neural network model.
In one embodiment, in the inputting module 403, the inputting the sample encoding data set into the neural network model for training further includes: calculating chord sample prediction accuracy of the neural network model on the sample code verification set at intervals of a preset training interval; when the chord sample prediction accuracy is lower than a first prediction threshold value, judging that the quality of the predicted chord code output by the neural network model is unqualified; and when the chord sample prediction accuracy is higher than a second prediction threshold value, judging that the quality of the predicted chord code is qualified.
In one embodiment, in the determining module 404, the objective function is:
L = E[ ||chord_predict(t) - chord_ground_truth(t)||² ]
where chord_predict(t) is the predicted chord code at time t, and chord_ground_truth(t) is the chord code at time t in the sample coding training set.
In one embodiment, in the determining module 404, the determining the training quality of the sample encoding data set according to the objective function of the neural network model includes: calculating a current calculated value of the objective function; when the current calculated value of the objective function is lower than a first loss threshold value, judging that the predicted chord code is qualified; and when the current calculated value of the objective function is higher than a second loss threshold value, judging that the predicted chord code is unqualified.
In one embodiment, in the determining module 404, the determining the training quality of the sample encoding data set according to the objective function of the neural network model further includes: determining the quality of the predicted chord code output by the neural network model according to the current calculated value of the objective function and the chord sample prediction accuracy.
It will be understood by those skilled in the art that the functions of the modules in the apparatus for constructing a chord transformation vector shown in fig. 4 can be understood by referring to the related description of the method for constructing a chord transformation vector. The functions of the blocks of the apparatus for constructing a chord transformation vector shown in fig. 4 may be implemented by a program running on a processor, or may be implemented by specific logic circuits.
Fig. 5 is a schematic structural diagram of an apparatus for constructing a chord transformation vector according to an embodiment of the present invention, and the apparatus 500 for constructing a chord transformation vector shown in fig. 5 is disposed on the terminal, and includes: at least one processor 501, memory 502, user interface 503, at least one network interface 504. The various components in the apparatus 500 for constructing a chord transformation vector are coupled together by a bus system 505. It is understood that the bus system 505 is used to enable connection communications between these components. The bus system 505 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 505 in FIG. 5.
The user interface 503 may include a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, a touch screen, or the like, among others.
The memory 502 in embodiments of the present invention is used to store various types of data to support the operation of the apparatus 500 for constructing chord transformation vectors. Examples of such data include: any computer program for operating on the apparatus 500 for constructing chord transformation vectors, such as the operating system 5021 and the application programs 5022; the operating system 5021 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, for implementing various basic services and processing hardware-based tasks. The application programs 5022 may contain various application programs for implementing various application services. The program for implementing the method according to the embodiment of the present invention may be included in the application program 5022.
The method disclosed by the above-mentioned embodiments of the present invention may be applied to the processor 501, or implemented by the processor 501. The processor 501 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 501. The processor 501 described above may be a general purpose processor, a digital signal processor, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. Processor 501 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 502, and the processor 501 reads the information in the memory 502 and performs the steps of the aforementioned methods in conjunction with its hardware.
It will be appreciated that the memory 502 can be volatile memory, nonvolatile memory, or both. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface storage may be disk storage or tape storage. The volatile memory can be Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM), SyncLink Dynamic Random Access Memory (SLDRAM), and Direct Rambus Random Access Memory (DRRAM). The memory 502 described in the embodiments of the invention is intended to comprise, without being limited to, these and any other suitable types of memory.
Based on the method for constructing a chord transformation vector provided in the embodiments of the present application, the present application further provides a computer-readable storage medium, which is shown in fig. 5 and may include: a memory 502 for storing a computer program executable by the processor 501 of the apparatus 500 for constructing a chord transformation vector for performing the steps of the method as described above. The computer readable storage medium may be Memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface Memory, optical disk, or CD-ROM.
It should be noted that: the technical schemes described in the embodiments of the present invention can be combined arbitrarily without conflict.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (15)

1. A method of constructing a chord transition vector, the method comprising:
obtaining a chord sample to be analyzed;
preprocessing the chord sample to be analyzed to obtain a sample coding data set;
inputting the sample coded data set into a neural network model for training according to the time sequence of the chord progression of the chord sample, wherein the output of the neural network model is a predicted chord code at the time t;
determining the training quality of the sample encoding data set according to an objective function of the neural network model;
and obtaining the weight of the hidden layer of the neural network model when the preset training quality is met, and using the weight as a chord transformation vector.
2. The method of constructing chord transition vectors according to claim 1, wherein said preprocessing the chord samples to be analyzed comprises:
determining a root note of the chord sample;
calculating the frequency of the root note;
taking the root note as the reference, normalizing the notes that constitute the chord sample to obtain a normalized chord;
splitting the normalized chord according to twelve-tone equal temperament to obtain a temperament chord;
performing one-hot encoding on the temperament chord to obtain an encoded chord;
combining the frequency of the root note with the encoded chord to obtain the sample coding data set;
the sample coding data set is divided into a sample coding training set and a sample coding verification set according to a preset proportion.
3. The method of constructing chord transformation vectors according to claim 2, wherein the inputting of the sample coding data set into a neural network model for training comprises:
and inputting the sample coding training set into the neural network model to be used as neurons for training the neural network model.
4. The method of constructing chord transformation vectors according to claim 2, wherein the inputting of the sample coding data set into a neural network model for training further comprises:
calculating chord sample prediction accuracy of the neural network model on the sample code verification set at intervals of a preset training interval;
when the chord sample prediction accuracy is lower than a first prediction threshold value, judging that the quality of the predicted chord code output by the neural network model is unqualified;
and when the chord sample prediction accuracy is higher than a second prediction threshold value, judging that the quality of the predicted chord code is qualified.
5. The method of constructing chord transformation vectors according to claim 4, wherein the objective function is:
L = E[ ||chord_predict(t) - chord_ground_truth(t)||² ]
wherein chord_predict(t) is the predicted chord code at time t, chord_ground_truth(t) is the chord code at time t in the sample coding data set, and E[·] denotes the expectation.
6. The method of constructing chord transformation vectors according to claim 5, wherein said determining the training quality of the sample encoded data sets according to the objective function of the neural network model comprises:
calculating a current calculated value of the objective function;
when the current calculated value of the objective function is lower than a first loss threshold value, judging that the predicted chord code is qualified;
and when the current calculated value of the objective function is higher than a second loss threshold value, judging that the predicted chord code is unqualified.
7. The method of constructing chord transformation vectors according to claim 4 or 6, wherein said determining the training quality of the sample encoded data sets according to the objective function of the neural network model further comprises:
and determining the quality of the predicted chord code output by the neural network model according to the current calculated value of the target function and the chord sample prediction accuracy.
8. An apparatus for constructing a chord transition vector, the apparatus comprising:
the sample obtaining module is used for obtaining a chord sample to be analyzed;
the preprocessing module is used for preprocessing the chord sample to be analyzed to obtain a sample coding data set;
the input module is used for inputting the sample coded data set into a neural network model for training according to the time sequence of the chord progression of the chord sample, and the output of the neural network model is the predicted chord code at the time t;
a determining module, configured to determine a training quality of the sample encoding data set according to an objective function of the neural network model;
and the output module is used for obtaining the weight of the hidden layer of the neural network model when the preset training quality is met and using the weight as a chord transformation vector.
9. The apparatus for constructing a chord transformation vector according to claim 8, wherein in the preprocessing module, the preprocessing the chord samples to be analyzed comprises:
determining a root note of the chord sample;
calculating the frequency of the root note;
taking the root note as the reference, normalizing the notes that constitute the chord sample to obtain a normalized chord;
splitting the normalized chord according to twelve-tone equal temperament to obtain a temperament chord;
performing one-hot encoding on the temperament chord to obtain an encoded chord;
combining the frequency of the root note with the encoded chord to obtain the sample coding data set;
the sample coding data set is divided into a sample coding training set and a sample coding verification set according to a preset proportion.
10. The apparatus for constructing chord transformation vectors according to claim 9, wherein, in the input module, the inputting of the sample coding data set into a neural network model for training comprises:
and inputting the sample coding training set into the neural network model to be used as neurons for training the neural network model.
11. The apparatus for constructing chord transformation vectors according to claim 9, wherein, in the input module, the inputting of the sample coding data set into a neural network model for training further comprises:
calculating chord sample prediction accuracy of the neural network model on the sample code verification set at intervals of a preset training interval;
when the chord sample prediction accuracy is lower than a first prediction threshold value, judging that the quality of the predicted chord code output by the neural network model is unqualified;
and when the chord sample prediction accuracy is higher than a second prediction threshold value, judging that the quality of the predicted chord code is qualified.
12. The apparatus for constructing a chord transformation vector according to claim 11, wherein in the determining module, the objective function is:
L = E[ ||chord_predict(t) - chord_ground_truth(t)||² ]
wherein chord_predict(t) is the predicted chord code at time t, chord_ground_truth(t) is the chord code at time t in the sample coding data set, and E[·] denotes the expectation.
13. The apparatus for constructing chord transformation vectors according to claim 12, wherein in said determining module, said determining the training quality of said sample encoded data sets according to the objective function of said neural network model comprises:
calculating a current calculated value of the objective function;
when the current calculated value of the objective function is lower than a first loss threshold value, judging that the predicted chord code is qualified;
and when the current calculated value of the objective function is higher than a second loss threshold value, judging that the predicted chord code is unqualified.
14. The apparatus for constructing chord transformation vectors according to claim 11 or 13, wherein in the determining module, the determining the training quality of the sample encoded data set according to the objective function of the neural network model further comprises:
and determining the quality of the predicted chord code output by the neural network model according to the current calculated value of the target function and the chord sample prediction accuracy.
15. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of constructing a chord transformation vector according to any one of claims 1 to 7.
CN201811409175.2A 2018-11-23 2018-11-23 Method and device for constructing chord transformation vector and computer readable storage medium Active CN109935222B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811409175.2A CN109935222B (en) 2018-11-23 2018-11-23 Method and device for constructing chord transformation vector and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811409175.2A CN109935222B (en) 2018-11-23 2018-11-23 Method and device for constructing chord transformation vector and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN109935222A CN109935222A (en) 2019-06-25
CN109935222B (en) 2021-05-04

Family

ID=66984657

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811409175.2A Active CN109935222B (en) 2018-11-23 2018-11-23 Method and device for constructing chord transformation vector and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN109935222B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539315B (en) * 2020-04-21 2023-04-18 招商局金融科技有限公司 Model training method and device based on black box model, electronic equipment and medium
CN114970651A (en) * 2021-02-26 2022-08-30 北京达佳互联信息技术有限公司 Training method of chord generation model, chord generation method, device and equipment
CN112989107B (en) * 2021-05-18 2021-07-30 北京世纪好未来教育科技有限公司 Audio classification and separation method and device, electronic equipment and storage medium
CN113571030B (en) * 2021-07-21 2023-10-20 浙江大学 MIDI music correction method and device based on hearing harmony evaluation

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1619640A (en) * 2003-11-21 2005-05-25 先锋株式会社 Automatic musical composition classification device and method
CN101123085A (en) * 2006-08-09 2008-02-13 株式会社河合乐器制作所 Chord-name detection apparatus and chord-name detection program
CN101473368A (en) * 2006-07-28 2009-07-01 莫达特公司 Device for producing signals representative of sounds of a keyboard and stringed instrument
JP4315180B2 (en) * 2006-10-20 2009-08-19 ソニー株式会社 Signal processing apparatus and method, program, and recording medium
CN103714806A (en) * 2014-01-07 2014-04-09 天津大学 Chord recognition method combining SVM with enhanced PCP
CN104395953A (en) * 2012-04-30 2015-03-04 诺基亚公司 Evaluation of beats, chords and downbeats from a musical audio signal
CN105976800A (en) * 2015-03-13 2016-09-28 三星电子株式会社 Electronic device, method for recognizing playing of string instrument in electronic device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9715870B2 (en) * 2015-10-12 2017-07-25 International Business Machines Corporation Cognitive music engine using unsupervised learning

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1619640A (en) * 2003-11-21 2005-05-25 先锋株式会社 Automatic musical composition classification device and method
CN101473368A (en) * 2006-07-28 2009-07-01 莫达特公司 Device for producing signals representative of sounds of a keyboard and stringed instrument
CN101123085A (en) * 2006-08-09 2008-02-13 株式会社河合乐器制作所 Chord-name detection apparatus and chord-name detection program
JP4315180B2 (en) * 2006-10-20 2009-08-19 ソニー株式会社 Signal processing apparatus and method, program, and recording medium
CN104395953A (en) * 2012-04-30 2015-03-04 诺基亚公司 Evaluation of beats, chords and downbeats from a musical audio signal
CN103714806A (en) * 2014-01-07 2014-04-09 天津大学 Chord recognition method combining SVM with enhanced PCP
CN105976800A (en) * 2015-03-13 2016-09-28 三星电子株式会社 Electronic device, method for recognizing playing of string instrument in electronic device

Also Published As

Publication number Publication date
CN109935222A (en) 2019-06-25

Similar Documents

Publication Publication Date Title
CN109935222B (en) Method and device for constructing chord transformation vector and computer readable storage medium
Yu et al. Deep attention based music genre classification
Bretan et al. A unit selection methodology for music generation using deep neural networks
CN109800411A (en) Clinical treatment entity and its attribute extraction method
US20210264277A1 (en) Hierarchical system and method for generating intercorrelated datasets
CN111400461B (en) Intelligent customer service problem matching method and device
CN112990530B (en) Regional population quantity prediction method, regional population quantity prediction device, electronic equipment and storage medium
Shi et al. Symmetry in computer-aided music composition system with social network analysis and artificial neural network methods
Madhavi et al. Multivariate deep causal network for time series forecasting in interdependent networks
Wu et al. The power of fragmentation: a hierarchical transformer model for structural segmentation in symbolic music generation
Arronte Alvarez et al. Distributed vector representations of folksong motifs
Bretan et al. Learning and evaluating musical features with deep autoencoders
Liu [Retracted] Research on Piano Performance Optimization Based on Big Data and BP Neural Network Technology
Fernandes et al. Enhanced deep hierarchal GRU & BILSTM using data augmentation and spatial features for tamil emotional speech recognition
Chang et al. [Retracted] Application of Hidden Markov Model in Financial Time Series Data
Gajjar et al. E-mixup and siamese networks for musical key estimation
Qin et al. Bar transformer: a hierarchical model for learning long-term structure and generating impressive pop music
CN115762657A (en) Molecular property prediction method based on chemical element knowledge graph and functional group prompt
CN109102006A (en) A kind of music automark method based on the enhancing of audio frequency characteristics induction information
CN113268628A (en) Music emotion recognition method based on modularized weighted fusion neural network
Yang et al. Differentiated analysis for music traffic in software defined networks: A method of deep learning
Dawande et al. Music Generation and Composition Using Machine Learning
Remesh et al. Symbolic domain music generation system based on LSTM architecture
Foscarin Chord sequences: Evaluating the effect of complexity on preference
Hu Research on the interaction of genetic algorithm in assisted composition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant