CN117935859A - Partial multi-label music style characteristic selection method integrating characteristic similarity - Google Patents

Publication number
CN117935859A
Authority
CN
China
Prior art keywords
label
music
feature
labels
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410107121.XA
Other languages
Chinese (zh)
Inventor
杨涛
刘海波
马希骜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Gongshang University
Original Assignee
Zhejiang Gongshang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Gongshang University
Priority to CN202410107121.XA
Publication of CN117935859A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of machine learning and pattern recognition and relates to a partial multi-label music style feature selection method integrating feature similarity. The method divides the candidate music style labels into real music style labels and noise music style labels. It then uses feature similarity to constrain the coefficient matrix W, which maps to the real labels, and the coefficient matrix S, which maps to the noise labels, and uses a partial multi-label classifier to constrain H, the sum of W and S. Finally, the features are scored with the matrix W to select the relevant music style features. This reduces the difficulty of the learning problem and improves the performance of the multi-label k-nearest neighbor (MLKNN) model.

Description

Partial multi-label music style characteristic selection method integrating characteristic similarity
Technical Field
The invention belongs to the field of machine learning and pattern recognition and relates to a partial multi-label music style feature selection method integrating feature similarity.
Background
Music style classification is a way of summarizing and distinguishing musical works: different types of music are separated by specific musical elements and stylistic characteristics. Such a classification system helps people understand and describe the diversity of music and helps listeners choose the music they like. There are many genres, each with its own sound and means of expression. Classical music expresses emotion and thought through complex structure and refined composition, while pop music focuses on accessible melodies and lyrics and is the mainstream popular form. Jazz is known for its complex harmony and improvised performance, and electronic music uses electronic devices and techniques to create futuristic sounds. Ballad music is usually guitar-led and emphasizes storytelling and realism, while rock music is characterized by strong rhythms and guitar playing and expresses attitudes toward society and the individual. In general, music style classification is a rich field that reflects how different cultures, eras and individuals understand and create music, and it lets each listener find the sounds they prefer in a broad musical world.
Music style classification attaches style labels to a piece of music. With these labels a music platform can recommend music to interested users more accurately, which improves the user experience. Multi-label classification is commonly used for this task, since one song may be tagged with several styles such as "metal", "punk", "rock" and "pop". In practice, however, completely accurate style labels cannot be obtained when a dataset is collected.
Most of the collected data is crawled from the web, where annotators are often unreliable, so some labeling errors are unavoidable. Some of the crawled labels therefore assign a song to a style it does not belong to, and only part of the labels are valid. For example, a piece of music may carry the web tags "ballad", "electronic", "pop", "independent", "jazz" and "classical". After listening to the piece carefully, a professional may find that several of these tags are wrong and that only "ballad", "electronic", "pop" and "independent" are valid. Classification on a multi-label music style dataset that contains such false labels is called partial multi-label music style classification.
In traditional multi-label learning for music style classification, reducing the extracted music features with a feature selection method can effectively improve the accuracy of the learning algorithm. Existing theory and practice show that a suitable feature selection method lowers the difficulty of the learning task and improves model accuracy. By removing redundant music features and features irrelevant to the classification task, multi-label feature selection reduces the time and hardware resources that the learner consumes. Because irrelevant and redundant features no longer influence the multi-label learner, learning performance also improves.
In partial multi-label music style classification, directly applying a multi-label feature selection algorithm ignores the influence of the noise labels on the selection result, so features related to noise music style labels are selected by mistake and the selected features are unreliable. Few partial multi-label feature selection methods exist. One such algorithm divides the candidate labels into two parts, constrains the two coefficient matrices that map the feature space to the label space with the nuclear norm and the l1 norm respectively, and then uses manifold regularization so that similar instances have similar labels.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a partial multi-label music style feature selection method integrating feature similarity.
The specific steps of the invention are as follows:
Step 1: extract music features to obtain a multi-label music classification dataset M and specify the feature subset dimension K, where M contains n music samples, q labels and d music features.
Step 2: divide the dataset M into a training sample set MT and a test sample set MP. X denotes the feature matrix of the training sample set, and $(X)_{ij}$ is the j-th feature value of the i-th sample. Y denotes the candidate label indicator matrix of the training samples, and $(Y)_{ij}$ indicates whether the j-th candidate label of the i-th sample is present (1) or absent (0).
Step 3: compute a local Gram matrix for each sample $x_g$ to represent feature similarity. The local Gram matrix $W_g \in \mathbb{R}^{d \times d}$ of sample $x_g$ is calculated as follows:
where $\Delta$ is the threshold of the neighborhood granularity and $\delta$ is the distance metric.
Step 4: define a partial multi-label classifier with the following objective function:
where W and S are the mapping matrices to the real labels and the noise labels, respectively.
Step 5: constrain the mapping matrices W and S with the feature similarity. Using the idea of manifold regularization, the final objective function is defined as:
where $L_g = D_g + W_g$ and $D_g$ is the diagonal matrix formed from the row sums of $W_g$. W and S are solved by alternating optimization to minimize the objective function. The l2 norm of each column vector of W gives the score of the corresponding feature, and the K features with the largest scores are selected.
Step 6: reduce the dimension of the training sample set MT and the test sample set MP with the K selected features to obtain the reduced training set MT' and the reduced test set MP', then input MT' into a multi-label k-nearest neighbor (ML-KNN) model for training to obtain a trained ML-KNN model.
The beneficial effects of the invention are as follows. The invention discloses a multi-label music style feature selection method based on feature fusion and compatibility, which comprises: preprocessing the multi-label dataset, including missing-value filling and data discretization; screening the features of the processed dataset with the multi-label music style feature selection method to obtain a selected feature set; and inputting the resulting feature dataset into a multi-label k-nearest neighbor (MLKNN) model to obtain an MLKNN model optimized on that dataset. The invention divides the candidate music style labels into real music style labels and noise music style labels, uses feature similarity to constrain the coefficient matrix W mapped to the real labels and the coefficient matrix S mapped to the noise labels, uses a partial multi-label classifier to constrain H, the sum of W and S, and finally scores the features with the matrix W to select the relevant music style features. This reduces the difficulty of the problem and improves the performance of the MLKNN model.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The following describes specific embodiments of the present invention in detail.
Step 1: extract music features to obtain a multi-label music classification dataset M and specify the feature subset dimension K, where M contains n music samples, q music style labels and d music features.
Step 2: divide the multi-label music style classification dataset M into a training sample set MT and a test sample set MP. X denotes the feature matrix of the training sample set MT, and $(X)_{ij}$ is the j-th feature value of the i-th sample. Y denotes the candidate label indicator matrix of the training samples, and $(Y)_{ij}$ indicates whether the j-th candidate label of the i-th sample is present (1) or absent (0).
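The data layout of Steps 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions: the helper name `split_dataset`, the 20% test ratio, the random seed and the toy data are all hypothetical and do not come from the text.

```python
import numpy as np

def split_dataset(X, Y, test_ratio=0.2, seed=0):
    """Shuffle the samples and split them into training (MT) and test (MP) parts.

    X: (n, d) feature matrix, X[i, j] is the j-th feature of the i-th sample.
    Y: (n, q) candidate-label indicator, Y[i, j] = 1 if label j is a candidate
       label of sample i and 0 otherwise.
    test_ratio and seed are illustrative assumptions, not values from the text.
    """
    n = X.shape[0]
    order = np.random.default_rng(seed).permutation(n)
    n_test = int(round(n * test_ratio))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], Y[train_idx], X[test_idx], Y[test_idx]

# Toy data: 10 samples, 4 features, 3 candidate labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(10, 4))
Y = (rng.random((10, 3)) < 0.5).astype(int)
X_train, Y_train, X_test, Y_test = split_dataset(X, Y)
print(X_train.shape, Y_train.shape, X_test.shape, Y_test.shape)
```

Only the candidate labels are split here; the manually verified real labels would be kept aside for evaluation.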
Step 3: compute a local Gram matrix for each sample to represent feature similarity. The local Gram matrix $W_g \in \mathbb{R}^{d \times d}$ of sample $x_g$ is calculated as follows:
where $\Delta$ is the threshold of the neighborhood granularity and $\delta$ is the Euclidean distance.
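The formula for the local Gram matrix is not reproduced in the text, so the following is only one plausible reading, offered as a sketch. It assumes that the neighborhood of $x_g$ consists of all samples within Euclidean distance $\Delta$ of $x_g$, and that the $d \times d$ matrix is the Gram matrix of the feature columns restricted to that neighborhood. The function name `local_feature_gram` and the toy data are hypothetical.

```python
import numpy as np

def local_feature_gram(X, g, delta_threshold):
    """Return a d x d feature-similarity matrix for sample g.

    Assumption (not taken from the text): the neighborhood of x_g is every
    sample whose Euclidean distance to x_g is at most delta_threshold, and the
    local Gram matrix is N^T N, where N stacks the neighborhood samples row-wise.
    """
    dist = np.linalg.norm(X - X[g], axis=1)    # Euclidean distance to x_g
    neighborhood = X[dist <= delta_threshold]  # always contains x_g itself
    return neighborhood.T @ neighborhood

X = np.array([[1.0, 0.0],
              [0.9, 0.1],
              [5.0, 5.0]])
W_0 = local_feature_gram(X, g=0, delta_threshold=0.5)
print(W_0.shape)
```

In this toy case the third sample lies far outside the threshold, so only the first two rows contribute to the matrix.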
Step 4: define a partial multi-label classifier with the following objective function:
where W and S are the mapping matrices to the real labels and the noise labels, respectively.
Step 5: constrain the mapping matrices W and S with the feature similarity. Using the idea of manifold regularization, the final objective function is defined as:
where $L_g = D_g + W_g$ and $D_g$ is the diagonal matrix formed from the row sums of $W_g$; $\lambda$, $\beta$, $\gamma$, $\eta_1$ and $\eta_2$ are balance parameters. W is solved by alternating optimization to minimize the objective function. The l2 norm of each column vector of W gives the score of the corresponding feature, and the K features with the largest scores are selected.
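The two mechanical parts of Step 5, building $L_g$ from $W_g$ and scoring features from the columns of W, can be sketched as below. The function names are hypothetical, and the shape of W (one row per label, one column per feature) is an assumption inferred from the phrase "each column vector of W".

```python
import numpy as np

def degree_plus_similarity(W_g):
    """Build L_g = D_g + W_g, with D_g the diagonal matrix of the row sums of
    W_g, following the definition as it is written in the text."""
    D_g = np.diag(W_g.sum(axis=1))
    return D_g + W_g

def select_top_k_features(W, K):
    """Score each feature by the l2 norm of its column in W and keep the K best.

    W is assumed to have shape (q, d): one row per label, one column per
    feature. Returns 0-based feature indices, highest score first.
    """
    scores = np.linalg.norm(W, axis=0)
    return np.argsort(-scores, kind="stable")[:K]

L_g = degree_plus_similarity(np.array([[1.0, 2.0],
                                       [2.0, 3.0]]))

# Column norms of this toy W are 0, 5, about 0.14 and 1.
W = np.array([[0.0, 3.0, 0.1, 1.0],
              [0.0, 4.0, 0.1, 0.0]])
top = select_top_k_features(W, K=2)
print(top)
```

A column whose entries are all close to zero contributes to no label, which is why the column norm serves as a relevance score.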
Step 6: reduce the dimension of the training sample set MT and the test sample set MP with the K selected features to obtain the reduced training set MT' and the reduced test set MP', then input MT' into a multi-label k-nearest neighbor (ML-KNN) model for training to obtain a trained ML-KNN model.
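A rough sketch of Step 6 follows, under stated assumptions. `reduce_features`, `mlknn_fit` and `mlknn_predict` are hypothetical helper names. The ML-KNN part is a simplified NumPy rendering of the published ML-KNN procedure (Euclidean nearest neighbors, smoothing parameter s = 1), not the implementation used in the patent, and the toy data is made up.

```python
import numpy as np

def reduce_features(X, selected):
    """Keep only the selected feature columns (0-based indices)."""
    return X[:, selected]

def knn_indices(A, B, k, exclude_self=False):
    """Indices of the k nearest rows of B for every row of A (Euclidean)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    if exclude_self:                      # only valid when A and B are the same set
        np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def mlknn_fit(X, Y, k=10, s=1.0):
    """Estimate ML-KNN priors and neighbor-count likelihoods from training data."""
    n, q = Y.shape
    prior1 = (s + Y.sum(axis=0)) / (2 * s + n)
    counts = Y[knn_indices(X, X, k, exclude_self=True)].sum(axis=1)  # (n, q)
    c1 = np.zeros((q, k + 1))  # c1[l, j]: samples WITH label l having j such neighbors
    c0 = np.zeros((q, k + 1))  # c0[l, j]: samples WITHOUT label l having j such neighbors
    for l in range(q):
        for j in range(k + 1):
            hit = counts[:, l] == j
            c1[l, j] = np.sum(hit & (Y[:, l] == 1))
            c0[l, j] = np.sum(hit & (Y[:, l] == 0))
    like1 = (s + c1) / (s * (k + 1) + c1.sum(axis=1, keepdims=True))
    like0 = (s + c0) / (s * (k + 1) + c0.sum(axis=1, keepdims=True))
    return {"X": X, "Y": Y, "k": k, "prior1": prior1, "like1": like1, "like0": like0}

def mlknn_predict(model, X_new):
    """Predict a label as present when its posterior beats the posterior of absence."""
    counts = model["Y"][knn_indices(X_new, model["X"], model["k"])].sum(axis=1)
    labels = np.arange(counts.shape[1])
    p1 = model["prior1"] * model["like1"][labels, counts]
    p0 = (1 - model["prior1"]) * model["like0"][labels, counts]
    return (p1 > p0).astype(int)

# Toy data: two well-separated clusters, each with its own label.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 0.1, size=(20, 3)),
                     rng.normal(10.0, 0.1, size=(20, 3))])
Y_train = np.vstack([np.tile([1, 0], (20, 1)), np.tile([0, 1], (20, 1))])
selected = np.array([0, 2])               # pretend feature selection kept columns 0 and 2
model = mlknn_fit(reduce_features(X_train, selected), Y_train, k=3)
X_test = np.array([[0.0, 5.0, 0.0], [10.0, 5.0, 10.0]])
pred = mlknn_predict(model, reduce_features(X_test, selected))
print(pred)
```

The same column indices must be applied to both MT and MP, which is why `reduce_features` is called with one shared `selected` array.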
The parameters $\lambda$, $\beta$, $\gamma$, $\eta_1$ and $\eta_2$ in Step 5 are obtained by cross-validation.
The solving process in Step 5 is as follows. First, the objective function is rewritten by variable substitution:
The above formulation is then converted into the following form by the LADMAP method (linearized alternating direction method with adaptive penalty):
Fix W, S, P, Q and optimize H:
$H_{k+1} = (Y^{T}X + \mu_1 S_k + \mu_1 W_k - Y_1)(XX^{T} + \lambda I + \mu_1 I)^{-1}$
Fix W, H, S, Q and optimize P:
Fix W, H, S, P and optimize Q:
Fix H, P, Q and optimize W and S:
is a diagonal matrix,
Sk+1=soft/(μ12)(μ3Q+Y31H-μ1W+Y1), where $\mathrm{soft}_{\xi}(x) = \mathrm{sign}(x)\max(|x| - \xi, 0)$
Update $Y_1$, $Y_2$, $Y_3$, $\mu_1$, $\mu_2$, $\mu_3$:
where $\mu_{\max}$ is the upper bound on $\mu_1$, $\mu_2$, $\mu_3$ and $\rho$ is the growth factor.
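Three building blocks of the solver above can be sketched in NumPy: the closed-form H update, the soft-thresholding operator, and the multiplier and penalty update. Several points are assumptions rather than statements from the text. X is stored as (n, d), so the d x d Gram term is computed as `X.T @ X`. The H-subproblem is taken to be a ridge-type least-squares term plus the augmented Lagrangian of the constraint H = W + S. The constraint residuals are passed in by the caller, because the text does not show which residual pairs with $Y_2$ and $Y_3$. LADMAP normally applies the growth factor only when the iterates have nearly stopped changing, whereas this sketch grows the penalties unconditionally. All function names are hypothetical.

```python
import numpy as np

def update_H(X, Y, W, S, Y1, lam, mu1):
    """Closed-form H update: H (G + lam*I + mu1*I) = Y^T X + mu1*(S + W) - Y1.

    Assumed shapes: X (n, d), Y (n, q), and H, W, S, Y1 all (q, d).
    """
    d = X.shape[1]
    rhs = Y.T @ X + mu1 * (S + W) - Y1
    A = X.T @ X + (lam + mu1) * np.eye(d)
    # A is symmetric positive definite, so solve instead of forming the inverse.
    return np.linalg.solve(A, rhs.T).T

def soft(x, xi):
    """Soft-thresholding: sign(x) * max(|x| - xi, 0), applied element-wise."""
    return np.sign(x) * np.maximum(np.abs(x) - xi, 0.0)

def update_multipliers(multipliers, residuals, mus, rho, mu_max):
    """Dual ascent on each multiplier, then a capped geometric penalty increase."""
    new_multipliers = [Yk + mu * r for Yk, r, mu in zip(multipliers, residuals, mus)]
    new_mus = [min(rho * mu, mu_max) for mu in mus]
    return new_multipliers, new_mus

# Toy check that update_H zeroes the gradient of the assumed subproblem.
rng = np.random.default_rng(0)
n, d, q = 6, 4, 3
X = rng.normal(size=(n, d))
Y = (rng.random((n, q)) < 0.5).astype(float)
W, S = rng.normal(size=(q, d)), rng.normal(size=(q, d))
Y1 = np.zeros((q, d))
lam, mu1 = 1.0, 0.5
H = update_H(X, Y, W, S, Y1, lam, mu1)
grad = (H @ X.T - Y.T) @ X + lam * H + Y1 + mu1 * (H - W - S)
print(np.abs(grad).max())

shrunk = soft(np.array([-2.0, -0.3, 0.0, 0.5, 3.0]), 1.0)
new_Ys, new_mus = update_multipliers([Y1], [H - W - S], [mu1], rho=2.0, mu_max=0.8)
```

Soft-thresholding sets small entries exactly to zero, which is what produces sparse mapping matrices in l1-regularized subproblems.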
Verification example:
To verify the validity of the method, we run experiments on the music_style dataset, a partial multi-label music style classification dataset in which the style of each piece of music is predicted from its annotations. The dataset contains 6839 pieces of music, each described by 98 features and annotated with several of 10 music style labels such as classical, jazz and rock. The original labels collected from the web serve as the candidate labels, and the real labels were obtained by careful manual verification.
Following the steps of the invention, $\lambda$ is set to 1, $\beta$ to 1, $\gamma$ to 0.1, $\eta_1$ to 0.01 and $\eta_2$ to 0.01. The input set M is music_style and the input feature subset dimension K is 24. The selected feature numbers are {65, 66, 26, 4, 54, 64, 96, 81, 92, 27, 79, 52, 29, 49, 16, 28, 58, 1, 5, 95, 55, 57, 9, 93}. A new training set is then built from the selected features, and the MLKNN classifier is trained on it to obtain the model MLKNN-FS.
One Error, Ranking Loss, Coverage Error and Average Precision are used as the criteria for evaluating the multi-label classification models. For the comparison experiment, the MLKNN model is trained directly on the complete training set, which gives the model MLKNN-ALL without feature selection. Both models are then evaluated on the test set with the four metrics. The results are summarized in the following table:
TABLE 1 comparison of four metrics for model MLKNN-ALL and model MLKNN-FS predictions
For Average Precision in Table 1, larger is better, while for Coverage Error, One Error and Ranking Loss, smaller is better. The experimental results show that the MLKNN-FS classifier outperforms the MLKNN-ALL classifier on each metric, which indicates that the invention can effectively improve the performance of the multi-label classification model.
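For readers unfamiliar with the four ranking metrics, a small NumPy sketch is given here. It assumes that every sample has at least one relevant and one irrelevant label and that the scores contain no ties. Coverage is reported as the ranking depth needed to cover all relevant labels (some papers subtract 1). The function name `ranking_metrics` and the toy scores are hypothetical, and none of the numbers relate to Table 1.

```python
import numpy as np

def ranking_metrics(Y_true, scores):
    """Compute One Error, Coverage Error, Ranking Loss and Average Precision.

    Y_true: (n, q) binary matrix of real labels.
    scores: (n, q) real-valued label scores, higher means more confident.
    """
    n, q = Y_true.shape
    order = np.argsort(-scores, axis=1)                        # labels by decreasing score
    rank = np.empty_like(order)
    rank[np.arange(n)[:, None], order] = np.arange(1, q + 1)   # rank 1 = top label
    one_error, coverage, ranking_loss, avg_precision = [], [], [], []
    for i in range(n):
        relevant = Y_true[i] == 1
        rel_ranks = np.sort(rank[i, relevant])
        irr_ranks = rank[i, ~relevant]
        # One Error: is the top-ranked label irrelevant?
        one_error.append(0.0 if relevant[order[i, 0]] else 1.0)
        # Coverage: how far down the ranking the last relevant label sits.
        coverage.append(float(rel_ranks.max()))
        # Ranking Loss: share of (relevant, irrelevant) pairs ordered wrongly.
        wrong_pairs = sum(int((irr_ranks < r).sum()) for r in rel_ranks)
        ranking_loss.append(wrong_pairs / (len(rel_ranks) * len(irr_ranks)))
        # Average Precision: precision at the rank of each relevant label.
        hits_at_rank = np.arange(1, len(rel_ranks) + 1)
        avg_precision.append(float(np.mean(hits_at_rank / rel_ranks)))
    return {
        "one_error": float(np.mean(one_error)),
        "coverage_error": float(np.mean(coverage)),
        "ranking_loss": float(np.mean(ranking_loss)),
        "average_precision": float(np.mean(avg_precision)),
    }

Y_true = np.array([[1, 0, 0],
                   [0, 1, 1]])
scores = np.array([[0.9, 0.2, 0.1],
                   [0.8, 0.3, 0.6]])
metrics = ranking_metrics(Y_true, scores)
print(metrics)
```

In this toy case the first sample is ranked perfectly and the second has its only irrelevant label on top, so the averaged One Error and Ranking Loss both come out at 0.5.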

Claims (3)

1. A multi-label music style feature selection method integrating feature similarity, comprising the following steps:
Step 1: extracting music features to obtain a multi-label music classification dataset M and a specified number of selected features K, wherein the set M comprises n music samples, q labels and d music features;
Step 2: dividing the multi-label music classification dataset M into a training sample set MT and a test sample set MP, wherein X denotes the feature matrix of the training sample set, $(X)_{ij}$ is the j-th feature value of the i-th sample, Y denotes the candidate label indicator matrix of the training samples, and $(Y)_{ij}$ indicates whether the j-th candidate label of the i-th sample is present (1) or absent (0);
Step 3: calculating a local Gram matrix for each sample $x_g$ to represent feature similarity; the local Gram matrix $W_g \in \mathbb{R}^{d \times d}$ of sample $x_g$ is calculated as follows:
wherein $\Delta$ is the threshold of the neighborhood granularity and $\delta$ is the distance metric;
Step 4: defining a partial multi-label classifier with the following objective function:
wherein W and S are the mapping matrices to the real labels and the noise labels, respectively;
Step 5: constraining the mapping matrices W and S with the feature similarity, wherein, using the idea of manifold regularization, the final objective function is defined as:
wherein $L_g = D_g + W_g$ and $D_g$ is the diagonal matrix formed from the row sums of $W_g$; $\lambda$, $\beta$, $\gamma$, $\eta_1$ and $\eta_2$ are balance parameters; W and S are solved by alternating optimization to minimize the objective function, the l2 norm of each column vector of W is calculated to obtain the score of the corresponding feature, and the K features with the largest scores are selected;
Step 6: reducing the dimension of the training sample set MT and the test sample set MP with the K selected features to obtain a reduced training sample set MT' and a reduced test sample set MP', and then inputting MT' into a multi-label k-nearest neighbor (ML-KNN) model for training to obtain a trained ML-KNN model.
2. The multi-label music style feature selection method integrating feature similarity according to claim 1, wherein in step 5:
first, the objective function is rewritten by variable substitution:
the above formulation is then converted into the following form by the LADMAP method:
Fix W, S, P, Q and optimize H:
$H_{k+1} = (Y^{T}X + \mu_1 S_k + \mu_1 W_k - Y_1)(XX^{T} + \lambda I + \mu_1 I)^{-1}$
Fix W, H, S, Q and optimize P:
Fix W, H, S, P and optimize Q:
Fix H, P, Q and optimize W and S:
is a diagonal matrix,
wherein $\mathrm{soft}_{\xi}(x) = \mathrm{sign}(x)\max(|x| - \xi, 0)$
Update $Y_1$, $Y_2$, $Y_3$, $\mu_1$, $\mu_2$, $\mu_3$:
wherein $\mu_{\max}$ is the upper bound on $\mu_1$, $\mu_2$, $\mu_3$ and $\rho$ is the growth factor.
3. The method of claim 1, wherein in step 6 the step of training the MLKNN classifier comprises:
inputting the newly generated feature subset into the MLKNN model, with the parameter k of the MLKNN model set to 10 and the other parameters left at their defaults, to finally obtain an optimized MLKNN model.
CN202410107121.XA 2024-01-25 2024-01-25 Partial multi-label music style characteristic selection method integrating characteristic similarity Pending CN117935859A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410107121.XA CN117935859A (en) 2024-01-25 2024-01-25 Partial multi-label music style characteristic selection method integrating characteristic similarity

Publications (1)

Publication Number Publication Date
CN117935859A true CN117935859A (en) 2024-04-26

Family

ID=90766259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410107121.XA Pending CN117935859A (en) 2024-01-25 2024-01-25 Partial multi-label music style characteristic selection method integrating characteristic similarity

Country Status (1)

Country Link
CN (1) CN117935859A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination