CN117935859A - Partial multi-label music style characteristic selection method integrating characteristic similarity - Google Patents
- Publication number
- CN117935859A
- Authority
- CN
- China
- Prior art keywords
- label
- music
- feature
- labels
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention belongs to the field of machine learning and pattern recognition and relates to a partial multi-label music style feature selection method integrating feature similarity. The method divides the candidate music style labels into real music style labels and noise music style labels. It then uses feature similarity to constrain the coefficient matrix W that maps to the real labels and the coefficient matrix S that maps to the noise labels, uses a partial multi-label classifier to constrain the sum matrix H of W and S, and finally scores the features with the matrix W to select the relevant music style features. This reduces the difficulty of the learning problem and improves the performance of the multi-label k-nearest-neighbor (MLKNN) model.
Description
Technical Field
The invention belongs to the field of machine learning and pattern recognition and relates to a partial multi-label music style feature selection method integrating feature similarity.
Background
Music style classification generalizes and differentiates musical compositions by partitioning them according to specific musical elements and stylistic characteristics. Such a classification system helps describe the diversity of music and helps listeners find the music they like. There are many genres and types, each with its own sound and mode of expression. Classical music expresses emotion and thought through complex structure and refined composition, while pop music focuses on accessible melodies and lyrics and forms the popular mainstream. Jazz is known for its complex harmony and improvisational performance, and electronic music uses electronic devices and techniques to create futuristic sounds. Ballad music is usually guitar-dominated and emphasizes storytelling and realism, while rock music is characterized by strong rhythms and guitar playing and expresses attitudes toward society and the individual. Overall, music style classification is a rich field that reflects how different cultures, eras, and individuals understand and create music.
Music style classification attaches labels to a piece of music according to its style. With these style labels, a music platform can better recommend music to interested users and improve their experience. Multi-label classification is commonly used here, since one song may be tagged with several styles such as "metal", "punk", "rock", and "pop". In practice, however, completely accurate style labels cannot be obtained when a data set is collected.
Most of the collected data is crawled from the network, where tags are often unreliable, so some labeling errors are unavoidable. As a result, some crawled labels mark a song with a style it does not have, and only part of the candidate labels are valid. For example, a piece of music may carry candidate tags on the network such as "ballad", "electronic", "pop", "independent", "jazz", and "classical". A professional who listens to the piece carefully may find that many of these tags are wrong and that only "ballad", "electronic", "pop", and "independent" are valid labels. Classification on a multi-label music style data set that contains such false labels is called partial multi-label music style classification.
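The tag example above can be written as 0/1 indicator vectors, which is the form the candidate label matrix takes later in the method. This is a minimal sketch: the label order and the helper name `to_indicator` are assumptions made for illustration, and only the tag names come from the example.

```python
# Hypothetical label vocabulary; the tag names come from the example above.
LABELS = ["ballad", "electronic", "pop", "independent", "jazz", "classical"]

def to_indicator(tags, labels=LABELS):
    """Return a 0/1 vector marking which labels appear in `tags`."""
    tag_set = set(tags)
    return [1 if name in tag_set else 0 for name in labels]

# Candidate labels crawled from the network versus labels a professional confirms.
candidate = to_indicator(["ballad", "electronic", "pop", "independent", "jazz", "classical"])
valid = to_indicator(["ballad", "electronic", "pop", "independent"])

# Noise labels are candidate labels that are not valid for the piece.
noise = [c - v for c, v in zip(candidate, valid)]
noisy_names = [name for name, flag in zip(LABELS, noise) if flag]
```

In the partial multi-label setting only `candidate` is observed during training; `valid` and `noise` are what the method has to separate.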
In the traditional multi-label music style classification task, reducing the extracted music features with a feature selection method can effectively improve the accuracy of the learning algorithm. Existing theory and practice show that a suitable feature selection method reduces the difficulty of the learning task and improves model accuracy. Multi-label feature selection removes redundant music features and features irrelevant to the classification task, which reduces the time and hardware resources the learner consumes. Because irrelevant and redundant features no longer influence the multi-label learner, learning performance also improves.
In partial multi-label music style classification, directly applying a multi-label feature selection algorithm ignores the influence of the noise labels on the selection result, so features related to noise music style labels are selected by mistake and the selected features lack reliability. Few partial multi-label feature selection methods exist. One such algorithm splits the candidate labels into two parts, constrains the two coefficient matrices that map the feature space to the label space with the nuclear norm and the $\ell_1$ norm, respectively, and then uses manifold regularization so that similar instances have similar labels.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a partial multi-label music style feature selection method integrating feature similarity.
The specific steps of the invention are as follows:
Step 1: Extract music features to obtain a multi-label music classification data set M and a specified feature subset dimension K, where M contains n music samples, q labels, and d music features;
Step 2: Divide the multi-label music classification data set M into a training sample set MT and a test sample set MP. X denotes the feature matrix of the training sample set, and $(X)_{ij}$ is the j-th feature value of the i-th sample. Y denotes the candidate label indication matrix of the training samples, and $(Y)_{ij}$ indicates whether the j-th candidate label of the i-th sample is present (1) or absent (0);
Step 3: Compute a local Gram matrix for each sample $x_g$ to represent feature similarity. The local Gram matrix $W_g \in \mathbb{R}^{d \times d}$ of sample $x_g$ is calculated as follows:
where $\Delta$ is the neighborhood-granularity threshold and $\delta$ is the distance metric.
Step 4: Define a partial multi-label classifier, whose objective function is as follows:
where W and S are the mapping matrices to the real labels and to the noise labels, respectively.
Step 5: Apply the feature-similarity constraint to the mapping matrices W and S, respectively. Using the idea of manifold regularization, the final objective function is defined as follows:
where $L_g = D_g + W_g$ and $D_g$ is the diagonal matrix whose entries are the row sums of $W_g$.
Step 6: Solve for W and S with an alternating optimization method that minimizes the objective function, compute the $\ell_2$ norm of each column vector of W as the score of the corresponding feature, and select the K features with the largest scores.
Step 7: Reduce the dimension of the training sample set MT and the test sample set MP with the K selected features to obtain the reduced training sample set MT' and the reduced test sample set MP', then train a multi-label k-nearest-neighbor (ML-KNN) model on MT' to obtain the trained ML-KNN model.
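The feature scoring described in the steps above (a norm per column of W, then the K largest) can be sketched as follows. This is a minimal illustration under stated assumptions: W is stored as a list of rows with one column per feature, the toy values are made up, and the helper names `score_features` and `select_top_k` are hypothetical.

```python
import math

def score_features(W):
    """L2 norm of each column of W (one column per feature)."""
    n_features = len(W[0])
    return [math.sqrt(sum(row[j] ** 2 for row in W)) for j in range(n_features)]

def select_top_k(scores, k):
    """Indices of the k largest scores, best first (ties keep the lower index first)."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    return order[:k]

# Toy W with q = 2 labels (rows) and d = 4 features (columns); values are made up.
W = [
    [0.0, 3.0, 0.1, 1.0],
    [0.0, 4.0, 0.2, 0.0],
]
scores = score_features(W)
selected = select_top_k(scores, k=2)
```

A feature whose column of W is close to zero contributes little to predicting the real labels, so it receives a low score and is dropped.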
The beneficial effects of the invention are as follows. The invention discloses a multi-label music style feature selection method based on feature fusion and compatibility, which comprises: preprocessing the multi-label data set, including missing-value filling, data discretization, and the like; screening the features of the processed data set with the multi-label music style feature selection method to obtain the selected feature set; and feeding the resulting feature data set into a multi-label k-nearest-neighbor (MLKNN) model to obtain an MLKNN model optimized on that data set. The invention divides the candidate music style labels into real music style labels and noise music style labels, uses feature similarity to constrain the coefficient matrix W that maps to the real labels and the coefficient matrix S that maps to the noise labels, uses a partial multi-label classifier to constrain the sum matrix H of W and S, and finally scores the features with the matrix W to select the relevant music style features. This reduces the difficulty of the learning problem and improves the performance of the MLKNN model.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The following describes specific embodiments of the present invention in detail.
Step 1: Extract music features to obtain a multi-label music classification data set M and a specified feature subset dimension K, where M has n music style samples, q music style labels, and d music features;
Step 2: Divide the multi-label music style classification data set M into a training sample set MT and a test sample set MP. X denotes the feature matrix of the training sample set MT, and $(X)_{ij}$ is the j-th feature value of the i-th sample. Y denotes the candidate label indication matrix of the training samples, and $(Y)_{ij}$ indicates whether the j-th candidate label of the i-th sample is present (1) or absent (0);
Step 3: Compute a local Gram matrix for each sample to represent feature similarity. The local Gram matrix $W_g \in \mathbb{R}^{d \times d}$ of sample $x_g$ is calculated as follows:
where $\Delta$ is the neighborhood-granularity threshold and $\delta$ is the Euclidean distance.
Step 4: Define a partial multi-label classifier, whose objective function is as follows:
where W and S are the mapping matrices to the real labels and to the noise labels, respectively.
Step 5: Apply the feature-similarity constraint to the mapping matrices W and S, respectively. Using the idea of manifold regularization, the final objective function is defined as follows:
where $L_g = D_g + W_g$ and $D_g$ is the diagonal matrix whose entries are the row sums of $W_g$. $\lambda$, $\beta$, $\gamma$, $\eta_1$, and $\eta_2$ are balance parameters. W and S are solved with an alternating optimization method that minimizes the objective function, the $\ell_2$ norm of each column vector of W is computed as the score of the corresponding feature, and the K features with the largest scores are selected.
Step 6: Reduce the dimension of the training sample set MT and the test sample set MP with the K selected features to obtain the reduced training sample set MT' and the reduced test sample set MP', then train a multi-label k-nearest-neighbor (ML-KNN) model on MT' to obtain the trained ML-KNN model.
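The dimension reduction in the last step amounts to keeping the same K feature columns in both the training and the test set. The sketch below assumes that samples are stored as rows and that the selected indices are 0-based; the helper name `reduce_columns` and the toy values are hypothetical. Training the ML-KNN model itself is not shown, since it would depend on an external library.

```python
def reduce_columns(X, selected):
    """Keep only the selected feature columns of X (rows are samples)."""
    return [[row[j] for j in selected] for row in X]

# Toy data: 3 training samples and 1 test sample with d = 4 features each.
MT = [[1.0, 2.0, 3.0, 4.0],
      [5.0, 6.0, 7.0, 8.0],
      [9.0, 10.0, 11.0, 12.0]]
MP = [[13.0, 14.0, 15.0, 16.0]]
selected = [1, 3]  # hypothetical output of the feature scoring step, 0-based

# The same index list is applied to both sets so their columns stay aligned.
MT_reduced = reduce_columns(MT, selected)
MP_reduced = reduce_columns(MP, selected)
```

Selecting features on the training set only and reusing the indices on the test set keeps the test data out of the selection step.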
The parameters $\lambda$, $\beta$, $\gamma$, $\eta_1$, and $\eta_2$ in step 5 are obtained by cross-validation.
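One way to organize such a cross-validation is to enumerate a grid of parameter combinations and evaluate each on several folds. The sketch below only builds the grid and the fold indices. The candidate values, the number of folds, and the helper name `kfold_indices` are assumptions, since the source does not state the search procedure.

```python
from itertools import product

def kfold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous (train, validation) index pairs."""
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    folds, start = [], 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, val))
        start += size
    return folds

# Hypothetical candidate values; the source does not state the search grid.
grid = {
    "lam": [0.1, 1.0],
    "beta": [0.1, 1.0],
    "gamma": [0.01, 0.1],
    "eta1": [0.01],
    "eta2": [0.01],
}
# One dict per parameter combination, in the order itertools.product yields them.
candidates = [dict(zip(grid, values)) for values in product(*grid.values())]
folds = kfold_indices(n_samples=10, n_folds=3)
```

Each entry of `candidates` would then be scored by running feature selection and the classifier on every `(train, val)` pair and averaging a chosen metric; that evaluation loop is left out here.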
The solving process in step 5 is as follows. First, the objective function is converted by variable substitution into
The above formulation is then converted into the following form by the LADMAP method
Fix W, S, P, Q and optimize H:
$$H^{k+1} = (Y^{T}X + \mu_1 S^{k} + \mu_1 W^{k} - Y_1)(XX^{T} + \lambda I + \mu_1 I)^{-1}$$
Fix W, H, S, Q and optimize P
Fix W, H, S, P and optimize Q
Fix H, P, Q and optimize W and S
is a diagonal matrix,
$$S^{k+1} = \operatorname{soft}_{2\gamma/(\mu_1+\mu_2)}(\mu_3 Q + Y_3 + \mu_1 H - \mu_1 W + Y_1)$$
where $\operatorname{soft}_{\xi}(x) = \operatorname{sign}(x)\max(|x| - \xi, 0)$.
Update $Y_1, Y_2, Y_3, \mu_1, \mu_2, \mu_3$
where $\mu_{\max}$ is the maximum value defined for $\mu_1, \mu_2, \mu_3$ and $\rho$ is the growth factor.
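The soft-thresholding operator defined above is simple to write down, and the role of the growth factor and the maximum value can be illustrated alongside it. This is a sketch under stated assumptions: the penalty update formula is not legible in the source, so `update_penalty` uses the common rule of multiplying by the growth factor and capping at the maximum, which may differ from the patent's exact rule. Function names are hypothetical.

```python
def soft_threshold(x, xi):
    """Scalar soft-thresholding: sign(x) * max(|x| - xi, 0)."""
    magnitude = abs(x) - xi
    if magnitude <= 0:
        return 0.0
    return magnitude if x > 0 else -magnitude

def soft_threshold_matrix(M, xi):
    """Apply soft-thresholding to every entry of a matrix given as a list of rows."""
    return [[soft_threshold(v, xi) for v in row] for row in M]

def update_penalty(mu, rho, mu_max):
    """Assumed rule (not taken from the source): grow mu by rho, capped at mu_max."""
    return min(rho * mu, mu_max)

# Entries with magnitude at most xi become zero; the rest shrink toward zero by xi.
shrunk = soft_threshold_matrix([[1.5, -0.2], [-2.0, 0.5]], xi=0.5)
mu = update_penalty(mu=1.0, rho=1.1, mu_max=1e6)
```

Soft-thresholding is what makes the updated matrix sparse: small entries are set exactly to zero instead of merely being reduced.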
Verification example:
To verify the validity of the application, we evaluate it on the music_style data set, a partial-label music style classification data set in which the style of each piece of music is predicted from annotations. It covers 10 music style labels, including classical, jazz, and rock, and contains 6839 pieces of music. Each piece has 98 features and is annotated with several of the 10 labels. The original labels collected from the network serve as the candidate labels, and the real labels were obtained by careful manual identification.
Following the steps of the invention, $\lambda$ is set to 1, $\beta$ to 1, $\gamma$ to 0.1, $\eta_1$ to 0.01, and $\eta_2$ to 0.01. The input set M is music_styles and the input feature subset dimension K is 24. The selected feature numbers are {65, 66, 26, 4, 54, 64, 96, 81, 92, 27, 79, 52, 29, 49, 16, 28, 58, 1, 5, 95, 55, 57, 9, 93}. A new training set is then built from the selected features, and the model MLKNN-FS is obtained by training the MLKNN classifier on the new training set.
One Error, Ranking Loss, Coverage Error, and Average Precision are used as the criteria for evaluating the multi-label classification models. For the comparison experiment, the MLKNN model is trained directly on the complete training set, which gives the model MLKNN-ALL without feature selection. The test set is then used to compute the four metrics for the MLKNN-ALL model. The results are summarized in the following table:
TABLE 1 comparison of four metrics for model MLKNN-ALL and model MLKNN-FS predictions
In Table 1, larger is better for Average Precision, while smaller is better for Coverage Error, One Error, and Ranking Loss. The experimental results show that the MLKNN-FS classifier outperforms the MLKNN-ALL classifier on every metric, which indicates that the invention can effectively improve the performance of the multi-label classification model.
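For readers unfamiliar with these ranking metrics, three of them can be sketched in a few lines. The definitions below follow common usage in the multi-label literature and are not taken from the source; the sketch assumes every sample has at least one true and one false label, counts tied scores as errors in the ranking loss, and uses made-up scores. Average Precision is omitted for brevity.

```python
def one_error(scores, truth):
    """Fraction of samples whose top-scored label is not a true label."""
    misses = 0
    for s, t in zip(scores, truth):
        top = max(range(len(s)), key=lambda j: s[j])
        misses += 1 - t[top]
    return misses / len(scores)

def coverage_error(scores, truth):
    """Average number of top-ranked labels needed to cover all true labels."""
    total = 0
    for s, t in zip(scores, truth):
        lowest = min(s[j] for j in range(len(s)) if t[j])
        total += sum(1 for v in s if v >= lowest)
    return total / len(scores)

def ranking_loss(scores, truth):
    """Average fraction of (true, false) label pairs ordered wrongly (ties count)."""
    total = 0.0
    for s, t in zip(scores, truth):
        pos = [s[j] for j in range(len(s)) if t[j]]
        neg = [s[j] for j in range(len(s)) if not t[j]]
        wrong = sum(1 for p in pos for n in neg if p <= n)
        total += wrong / (len(pos) * len(neg))
    return total / len(scores)

# Two samples, three labels; scores are hypothetical classifier outputs.
scores = [[0.9, 0.2, 0.6], [0.1, 0.8, 0.3]]
truth = [[1, 0, 1], [1, 0, 0]]
results = (one_error(scores, truth),
           coverage_error(scores, truth),
           ranking_loss(scores, truth))
```

All three are error-style metrics, which matches the statement above that smaller values are better for them.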
Claims (3)
1. A multi-label music style feature selection method integrating feature similarity comprises the following steps:
Step 1: extracting music features to obtain a multi-label music classification data set M and a specified number K of features to select, wherein the set M comprises n music samples, q labels, and d music features;
Step 2: dividing the multi-label music classification data set M into a training sample set MT and a test sample set MP, wherein X denotes the feature matrix of the training sample set, $(X)_{ij}$ denotes the j-th feature value of the i-th sample, Y denotes the candidate label indication matrix of the training samples, and $(Y)_{ij}$ indicates whether the j-th candidate label of the i-th sample is present, with 1 for present and 0 for absent;
Step 3: computing a local Gram matrix for each sample $x_g$ to represent feature similarity; the local Gram matrix $W_g \in \mathbb{R}^{d \times d}$ of sample $x_g$ is calculated as follows:
wherein $\Delta$ is the neighborhood-granularity threshold and $\delta$ is the distance metric;
Step 4: defining a partial multi-label classifier whose objective function is as follows:
wherein W and S are the mapping matrices to the real labels and to the noise labels, respectively;
Step 5: applying the feature-similarity constraint to the mapping matrices W and S, respectively, wherein, using the idea of manifold regularization, the final objective function is defined as follows:
wherein $L_g = D_g + W_g$ and $D_g$ is the diagonal matrix whose entries are the row sums of $W_g$; $\lambda$, $\beta$, $\gamma$, $\eta_1$, and $\eta_2$ are balance parameters; W and S are solved with an alternating optimization method that minimizes the objective function, the $\ell_2$ norm of each column vector of W is computed as the score of the corresponding feature, and the K features with the largest scores are selected;
Step 6: reducing the dimension of the training sample set MT and the test sample set MP with the K selected features to obtain the reduced training sample set MT' and the reduced test sample set MP', and then training a multi-label k-nearest-neighbor (ML-KNN) model on MT' to obtain the trained ML-KNN model.
2. The multi-label music style feature selection method integrating feature similarity according to claim 1, wherein in step 5:
first, the objective function is converted by variable substitution into
the above formulation is then converted into the following form by the LADMAP method
Fix W, S, P, Q and optimize H:
$$H^{k+1} = (Y^{T}X + \mu_1 S^{k} + \mu_1 W^{k} - Y_1)(XX^{T} + \lambda I + \mu_1 I)^{-1}$$
Fix W, H, S, Q and optimize P
Fix W, H, S, P and optimize Q
Fix H, P, Q and optimize W and S
is a diagonal matrix,
wherein $\operatorname{soft}_{\xi}(x) = \operatorname{sign}(x)\max(|x| - \xi, 0)$
Update $Y_1, Y_2, Y_3, \mu_1, \mu_2, \mu_3$
wherein $\mu_{\max}$ is the maximum value defined for $\mu_1, \mu_2, \mu_3$ and $\rho$ is the growth factor.
3. The method of claim 1, wherein in step 6, training the MLKNN classifier comprises:
inputting the newly generated feature subset into the MLKNN model, with the neighbor-count parameter k of the MLKNN model set to 10 and the other parameters left at their defaults, to finally obtain an optimized MLKNN model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410107121.XA CN117935859A (en) | 2024-01-25 | 2024-01-25 | Partial multi-label music style characteristic selection method integrating characteristic similarity |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117935859A (en) | 2024-04-26 |
Family
ID=90766259
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410107121.XA Pending CN117935859A (en) | 2024-01-25 | 2024-01-25 | Partial multi-label music style characteristic selection method integrating characteristic similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117935859A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||