LU103103B1 - Method for electronic music classification model construction based on machine learning and deep learning - Google Patents

Method for electronic music classification model construction based on machine learning and deep learning

Info

Publication number
LU103103B1
LU103103B1
Authority
LU
Luxembourg
Prior art keywords
music
electronic music
classification
electronic
classification model
Prior art date
Application number
LU103103A
Other languages
German (de)
Inventor
Yaping Tang
Original Assignee
Univ Hunan Humanities Sci & Tech
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Hunan Humanities Sci & Tech filed Critical Univ Hunan Humanities Sci & Tech
Priority to LU103103A priority Critical patent/LU103103B1/en
Application granted granted Critical
Publication of LU103103B1 publication Critical patent/LU103103B1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0499 Feedforward networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0008 Associated control or indicating means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/036 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal of musical genre, i.e. analysing the style of musical pieces, usually for selection, filtering or classification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/075 Musical metadata derived from musical analysis or for use in electrophonic musical instruments
    • G10H2240/081 Genre classification, i.e. descriptive metadata for classification or selection of musical pieces according to style
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/311 Neural networks for electrophonic musical instruments or musical processing, e.g. for musical recognition or control, automatic composition or improvisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a music genre classification model that takes a spectrogram as input, providing a new approach to audio classification and recognition. The model is used to perform classification simulation experiments on various electronic music signals, and it reduces the time required for constructing an electronic music classifier.

Description

METHOD FOR ELECTRONIC MUSIC CLASSIFICATION MODEL CONSTRUCTION BASED ON MACHINE LEARNING AND DEEP LEARNING
FIELD OF TECHNOLOGY
[0001] The present invention relates to the field of artificial intelligence, more particularly, to a method for electronic music classification model construction based on machine learning and deep learning.
BACKGROUND
[0002] With the development of information technology and storage technology, digital music has become increasingly popular, and major music companies have shifted their product focus to digital albums; physical albums such as tapes and CDs are increasingly rare. Different music has different styles, accompaniment instruments and other components, and music with different hierarchical structures and characteristics can be grouped into different genres. Because most people like listening to music, there are many kinds of electronic music, and everyone likes different types. If the types of electronic music signals are classified and identified in advance, listeners can choose the electronic music they want to listen to from the signal labels, which can greatly improve the management of electronic music; the classification and identification of electronic music signals has therefore become an important research direction in the field of artificial intelligence. Electronic music is music made using electronic musical instruments and related technologies; the instruments exchange music data through corresponding digital interfaces, synthesizers, sequencers and computers. With the development of computer technology, more detailed and in-depth research has been conducted on computer audio-visual information processing, and artificial intelligence technology can enable computers to understand music. Deep learning can be applied to create feature recognition modules, perform adaptive feature fusion on electronic music, and adjust the fusion with adaptive mechanisms. The fused feature factors are taken up by a neural network (NN), a distribution structure is introduced for multi-layer perceptual classification, and the special frequency effect of electronic music is used to construct an electronic music classification model. A single electronic music feature provides limited information, making it difficult to accurately describe the specific content of electronic music and to classify it correctly.
Electronic music has many features, such as short-term energy features, time-domain features, and frequency-domain features, through which the detailed content of electronic music can be described. Subsequently, artificial intelligence methods for electronic music signal identification appeared, such as linear discriminant analysis, artificial neural networks and support vector machines, which obtained better identification results than manual methods. However, in practical application these methods have shortcomings: the accuracy of electronic music signal identification by linear discriminant analysis is low, because it assumes a fixed linear relationship between feature vectors and electronic music signal types and therefore cannot fully reflect electronic music signals.
Overall, applying traditional machine learning methods to music genre classification requires manual feature design and relies on professional knowledge and experience in audio signal analysis; the steps are cumbersome, and there are bottlenecks in improving accuracy. Deep learning methods, such as the NNs popular in recent years, can provide new ideas for music genre classification modeling. In this invention, the classification and recognition of music genres is taken as the research direction: one-dimensional audio files are processed by short-time Fourier transform, Mel transform and constant-Q transform respectively to generate spectrograms and related data. Using a convolutional neural network, acoustic features such as rhythm, pitch and chord are automatically learned and extracted from the images, and a music genre classification model is constructed. To verify the effectiveness of the design, a simulation experiment is carried out in a simulated use environment. The experimental results show that the designed electronic music classification model can classify by feature fusion, and the classification result is very accurate.
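As an illustration only (not part of the patent text), the three time-frequency transforms named above can be sketched with the librosa library; the file path and transform parameters below are assumptions:

```python
import numpy as np
import librosa

# Load an audio file as a mono waveform (the path is a placeholder).
y, sr = librosa.load("track.wav", sr=22050)

# Short-time Fourier transform -> linear-frequency spectrogram.
stft = np.abs(librosa.stft(y, n_fft=2048, hop_length=512))

# Mel spectrogram -> perceptually scaled frequency bands.
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)

# Constant-Q transform -> log-frequency representation suited to pitch and chords.
cqt = np.abs(librosa.cqt(y, sr=sr))

# Log-compress so a CNN sees decibel-scaled inputs.
stft_db = librosa.amplitude_to_db(stft, ref=np.max)
mel_db = librosa.power_to_db(mel, ref=np.max)
cqt_db = librosa.amplitude_to_db(cqt, ref=np.max)
```

Any of the three decibel-scaled arrays can then serve as the image-like spectrogram input to the convolutional network.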
SUMMARY
[0003] In order to address such a technical problem in the prior art, one aspect of the invention provides a method for electronic music classification model construction based on machine learning and deep learning, the method comprising:
[0004] using an interface to transmit the received audio signal to an audio processing module, and processing the audio signal through analog-digital conversion and signal amplification to form music data;
[0005] constructing a music classification model with a spectrogram as input, based on learning and the structural characteristics of a NN;
[0006] obtaining dynamic parameters of the electronic music classification model by using the steganographic analysis algorithm of weight distribution to model the music data;
[0007] classifying modeled music data using the constructed music classification model.
[0008] According to an embodiment of the invention, a complementary processing is carried out according to the size of the differential gradient in electronic music.
[0009] According to an embodiment of the invention, adaptive docking is used for complementary processing when the characteristics of electronic audio are not obvious.
[0010] According to an embodiment of the invention, a NN multilayer perceptron (MLP) is used to divide the classification process into three layers: an import layer, a classification layer, and an output layer.
[0011] According to an embodiment of the invention, after various types of original ecological electronic music data are collected, the collected electronic music data is denoised.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] Fig. 1 shows a music classification flowchart provided by the embodiment of the present invention;
[0013] Fig. 2 shows a system hardware structure diagram;
[0014] Fig. 3 shows a classification framework of electronic music model;
[0015] Fig. 4 shows classification accuracy corresponding to different time periods.
DETAILED DESCRIPTION
[0016] The invention will now be described in greater detail with reference to the figures.
[0017] With respect to FIG. 1, a method for electronic music classification model construction based on machine learning and deep learning is described, the method comprising the following steps:
[0018] Step S101, using an interface to transmit the received audio signal to an audio processing module, and processing the audio signal through analog-digital conversion and signal amplification to form music data;
[0019] Step S102, constructing a music classification model with a spectrogram as input, based on learning and the structural characteristics of a NN;
[0020] Step S103, obtaining dynamic parameters of the electronic music classification model by using the steganographic analysis algorithm of weight distribution to model the music data;
[0021] Step S104, classifying modeled music data using the constructed music classification model.
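Purely as an orientation aid (not part of the patent text), steps S101-S104 can be summarized as the following Python skeleton; every function name here is hypothetical:

```python
# Hypothetical skeleton of steps S101-S104; names are illustrative only.

def acquire_music_data(raw_audio):
    """S101: A/D conversion and signal amplification -> music data."""
    raise NotImplementedError

def build_classification_model():
    """S102: spectrogram-input classification model built on a NN."""
    raise NotImplementedError

def model_music_data(model, music_data):
    """S103: obtain dynamic parameters via weight-distribution analysis."""
    raise NotImplementedError

def classify(model, modeled_data):
    """S104: classify the modeled music data."""
    raise NotImplementedError

def run_pipeline(raw_audio):
    music_data = acquire_music_data(raw_audio)       # S101
    model = build_classification_model()             # S102
    modeled = model_music_data(model, music_data)    # S103
    return classify(model, modeled)                  # S104
```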
[0022] Furthermore, steps S101-S104 may contain the following content:
[0023] The adaptive multi-feature fusion process of electronic music is in effect a screening process. When the background rhythm frequency of electronic music changes drastically, the features of electronic sound effects change in a continuous form; the feature results obtained by a single multi-feature fusion are too uniform and cannot be used for classification. Currently, electronic music is typically modeled and analyzed using a single feature, and the amount of information extracted from a single feature is limited, making it difficult to fully describe the type of electronic music. Therefore, this invention extracts multiple features for electronic music classification. Firstly, electronic music signals are collected. As electronic music signals are continuous, it is necessary to perform frame division processing on them in order to better extract electronic music classification features (a sketch of this step follows below). The hardware part mainly consists of an audio acquisition module, an audio processing module, a storage module, and a power module. The overall hardware structure is shown in Figure 2.
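A minimal sketch of the frame-division step mentioned above; the frame length and hop size are assumptions, not values given in the patent:

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Split a continuous 1-D signal into overlapping frames; the overlap
    (frame_len - hop samples) keeps transitions between frames smooth."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

frames = frame_signal(np.random.randn(22050))  # one second at 22.05 kHz
print(frames.shape)                            # -> (42, 1024)
```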
[0024] The interface is used to transmit the received signal to the audio processing module; the audio signal is processed through analog-digital conversion, signal amplification and other processes, stored in the memory module after processing, and at the same time transmitted to the computer through the audio device interface, where the electronic music is identified by the software part.
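A toy model of the analog-digital conversion and amplification just described; the gain and bit depth below are assumed values, not taken from the patent:

```python
import numpy as np

def simulate_adc(analog, gain=2.0, n_bits=16):
    """Amplify, clip to the converter's input range, and quantize."""
    amplified = np.clip(analog * gain, -1.0, 1.0)
    full_scale = 2 ** (n_bits - 1) - 1
    return np.round(amplified * full_scale).astype(np.int16)

t = np.linspace(0.0, 1.0, 8000, endpoint=False)
pcm = simulate_adc(0.3 * np.sin(2 * np.pi * 440 * t))  # a 440 Hz test tone
```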
[0025] If there is a differential gradient in electronic music, complementary processing is carried out according to the size of the differential gradient; if the characteristics of the electronic audio are not very obvious, adaptive docking is used for complementary processing. The complementary features are recorded each time, so that the features of electronic music can be expressed and integrated from different aspects. Treble and bass have an obvious influence on the process of multi-feature fusion, and the fusion features presented by different audio in the treble range also differ. Collecting and extracting the features of electronic music effects at different levels ensures the adequacy of the fusion method. Considering the actual functional requirements of the system, the TLV320AIC23 chip introduced by TI, also known as the AIC23 chip, is selected. It is a chip that supports MIC and LINE-IN input modes and has programmable gain adjustment for audio input and output. Program-controlled gain lets users better control and adjust audio signals and modify parameters online. The interaction between users and the program gain function is realized by a coding switch: users can adjust the switch to the appropriate gear according to their own needs, so that the gain ratio they select is more accurate and meets more precise control requirements.
[0026] In the process of multi-layer perceptual feature classification, a NN multilayer perceptron (MLP) is used to divide the classification process into three layers: the import layer, the classification layer (one or more layers), and the output layer. The network classification framework of NNs includes neurons, which can take up feature fusion factors and solve classification problems that are not linearly separable and cannot be solved by single-layer perceptual classification. It can not only classify using multiple features, but also reflect multiple classification paths. The frequency cepstrum coefficient is inspired by human auditory characteristics: the human ear's sensation is influenced by actual changes in loudness and amplitude. After the amplitude spectrum is logarithmized, the coefficients can be divided into several frequency bands according to frequency. As the obtained frequency vector has highly recognizable characteristics and complex correlations, in order to remove the correlation between loudness and amplitude, the Fourier-transformed audio characteristics are further processed. The extracted music features are then processed, and the basic framework of the electronic music classification model is shown in Figure 3.
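A sketch of the cepstral-coefficient extraction just described, assuming the librosa and scipy libraries are available; the 60-band choice mirrors the 60-dimensional feature vector mentioned later in the description and is otherwise an assumption:

```python
import numpy as np
import librosa
from scipy.fftpack import dct

y, sr = librosa.load("track.wav", sr=22050)  # placeholder file path

# Amplitude spectrum grouped into frequency bands (mel filter bank).
mel_power = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=60)

# Logarithmize, mirroring the ear's sensitivity to loudness.
log_mel = np.log(mel_power + 1e-10)

# A DCT removes the correlation between the band energies,
# yielding 60 cepstral coefficients per frame.
cepstra = dct(log_mel, axis=0, norm="ortho")
print(cepstra.shape)  # (60, n_frames)
```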
[0027] As can be seen from Figure 3, when constructing the electronic music classification model, various types of original ecological (raw) electronic music data should first be collected, and the collected data denoised. The denoised electronic music is then processed by framing and endpoint detection, yielding effective electronic music signals.
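The patent does not name a specific denoising algorithm; one plausible choice, shown here purely as a sketch, is spectral subtraction with the noise floor estimated from a leading silent segment:

```python
import numpy as np
import librosa

def denoise(y, sr, noise_seconds=0.5):
    """Crude spectral subtraction using the first noise_seconds as noise."""
    S = librosa.stft(y)                       # default n_fft=2048, hop=512
    mag, phase = np.abs(S), np.angle(S)
    n_noise = max(1, int(noise_seconds * sr / 512))
    noise_floor = mag[:, :n_noise].mean(axis=1, keepdims=True)
    cleaned = np.maximum(mag - noise_floor, 0.0)
    return librosa.istft(cleaned * np.exp(1j * phase))
```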
[0028] Using this method can reduce computational complexity and improve classification speed. The specific factors carry out multiple classifications in the classification layer and are allocated to different neurons according to their characteristics. Assuming that each neuron can accept only one characteristic factor, during the call process of the output layer the weights of the neurons are called, but what is output are the characteristic factors carried by the neurons. The NN multilayer perceptron is used for classification of the same feature factor. When the number of imported neurons equals the number of output neurons, the output feature factors will be separated by the MLP. Each neuron in the classification layer is an independent individual, but the connection paths differ, which can effectively eliminate bidirectional feature-factor classification.
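A minimal three-layer MLP of the kind described (import layer, one classification layer, output layer), sketched with scikit-learn on placeholder data; all sizes are assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.random.randn(1000, 60)        # placeholder fused feature factors
y = np.random.randint(0, 10, 1000)   # placeholder genre labels (10 types)

# Import layer = 60 inputs, one hidden classification layer, output layer = 10.
mlp = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
mlp.fit(X, y)
print(mlp.predict(X[:5]))
```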
[0029] Electronic music signals can also be identified with machine learning algorithms. At present, the NN learns according to the principle of empirical risk minimization, while the support vector machine is based on the principle of structural risk minimization; the learning effect of the NN is obviously lower than that of the support vector machine. The support vector machine is described in detail below.
[0030] The least squares support vector machine (LS-SVM) is a recently popular machine learning algorithm with a faster learning speed and better learning performance than a NN. It was therefore chosen to establish the electronic music signal identification model. Let the training sample set consisting of electronic music signal identification features and signal types be $\{(x_i, y_i)\},\ i = 1, 2, \dots, n,\ x_i \in R^m,\ y_i \in R$, where $x_i$ and $y_i$ are the identification features and types of the electronic music signals, respectively; the decision function is shown in equation (1).
[0031] $f(x) = \omega^{T}\varphi(x) + b$ (1)
[0032] Equation (1) is transformed and solved, as shown in equation (2):
[0033] $\min \; \frac{1}{2}\|\omega\|^{2} + \frac{\gamma}{2}\sum_{i=1}^{n}\xi_i^{2}$ (2)
[0034] s.t.
[0035] $y_i = \omega^{T}\varphi(x_i) + b + \xi_i$ (3)
[0036] where $\gamma$ represents the regularization parameter of the least squares support vector machine. Because the calculation process of formula (3) is very complicated, its equivalent Lagrangian form is established, as shown in formula (4):
[0037] $L(\omega, b, \xi, a) = \frac{1}{2}\|\omega\|^{2} + \frac{\gamma}{2}\sum_{i=1}^{n}\xi_i^{2} - \sum_{i=1}^{n} a_i\big(\omega^{T}\varphi(x_i) + b + \xi_i - y_i\big)$ (4)
[0038] According to optimization theory, the radial basis kernel function is adopted, as shown in equation (5):
[0039] $K(x, x_i) = \exp\!\left(-\frac{\|x - x_i\|^{2}}{2\sigma^{2}}\right)$ (5)
[0040] In the formula, $\sigma$ is the radial basis width.
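Training an LS-SVM under equations (1)-(5) reduces to solving a single linear system in the dual variables, which is why the text credits it with a fast learning speed. A minimal numpy sketch follows; the kernel width and regularization values are assumptions:

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Radial basis kernel of equation (5)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM dual system [[0, 1^T], [1, K + I/gamma]] [b; a] = [0; y]."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]  # dual coefficients a, bias b

def lssvm_predict(X_train, a, b, X_new, sigma=1.0):
    """f(x) = sum_i a_i K(x, x_i) + b, the kernel form of equation (1)."""
    return rbf_kernel(X_new, X_train, sigma) @ a + b
```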
[0041] After digital filtering, a movable Hamming window is used to perform windowing and framing, so that the audio characteristics remain stable. The framing adopts overlapping between adjacent frames, with the step between successive frames being the frame shift, which makes the transition between frames smooth and maintains continuity. After the smooth transition between frames is ensured and the truncation effect of the audio is reduced, the endpoint detection step is entered. Endpoint detection is the key to electronic music signal identification and has a great influence on subsequent feature extraction: accurately finding the starting and ending points of a single tone in noisy audio suppresses the noise interference of silent segments and reduces the amount of data, the computation, and the processing time. After the classifier is determined, the tones in the training set are input into it. Each time a 60-dimensional feature vector of tone data is input, the probability of each tone computed by the hidden layer and the output layer is obtained; the value lies between 0 and 1, and the output result is the maximum value. It is compared with the note corresponding to the input MFCC feature to determine whether they are the same, and the final result is output to complete the electronic music signal recognition.
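A short-time-energy endpoint detector over Hamming-windowed frames, as a minimal sketch of the step described above; the threshold ratio is an assumption:

```python
import numpy as np

def detect_endpoints(x, frame_len=1024, hop=512, energy_ratio=0.1):
    """Return (start, end) sample indices of the effective tone, found by
    thresholding the short-time energy of Hamming-windowed frames."""
    x = np.pad(x, (0, max(0, frame_len - len(x))))  # guard very short inputs
    win = np.hamming(frame_len)
    n = 1 + (len(x) - frame_len) // hop
    energy = np.array([np.sum((x[i * hop : i * hop + frame_len] * win) ** 2)
                       for i in range(n)])
    active = np.where(energy > energy_ratio * energy.max())[0]
    if active.size == 0:            # all silence: keep the whole signal
        return 0, len(x)
    return active[0] * hop, min(len(x), active[-1] * hop + frame_len)
```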
[0042] The following experiments were conducted to verify the rationality of the intelligent classification model design for electronic music based on reasonable weight allocation. The experiment included 10 types of music: Blues, Classical, Country, Disco, Hiphop, Jazz, Metal, Pop, Reggae, and Rock. Each type contained 100 pieces, and music features were extracted to obtain music fragments. This test is mainly aimed at electronic music with excessive modulation. The designed deep learning electronic music signal recognition system is used to test the decoding time of audio files; at the same time, a traditional electronic music signal recognition system is used to obtain test results for comparative analysis. The number of samples for each type of electronic music is shown in Table 1.
Table 1 Sample distribution of the ten electronic music categories
(The table lists, for each numbered electronic music type, its name and sample size; the category names recoverable from the source include Popular bel canto, HIP-HOP music, Folk rhyme, Rock and roll, Film music, and World Music.)
[0043] For these 10 music types, the lowest accuracy of the traditional classification method is 100, while the lowest accuracy of this classification method is 300; the highest accuracy of the traditional classification method is 20, while the highest accuracy of this classification method is 200. It can be seen that the intelligent electronic music classification model based on reasonable weight distribution can improve classification accuracy and gives accurate classification performance. The classification of music differs across training periods. Rock music, which has the highest classification accuracy in this invention, is selected as the experimental object to test whether the correct classification rate is affected as the training period increases. The result is shown in Figure 4.
[0044] From Figure 4, it can be seen that rock music has the highest classification accuracy at a segment time of 8 seconds, and as time increases further, the accuracy shows a decreasing trend. This indicates that a longer segment does not yield higher classification accuracy, nor does a shorter segment necessarily yield lower accuracy; rather, each moment of the music segment affects the extraction of classification information, so classification is accurate only at a specific time. The classification output rate is an output value that indirectly reflects the electronic music classification process: data that has not undergone multi-feature classification processing cannot be output, and data that is not accurately classified is isolated and not output.
[0045] The deep learning electronic music signal identification system plays a very important role in the development of electronic music. The system can not only assist professional grade examinations but is also suitable for non-professionals learning music. Under the same conditions, an identification simulation experiment is carried out with the classical method for comparison. The accuracy of electronic music signal identification by the machine learning algorithm far exceeds the requirements of practical application, and the signal identification error is lower than that of the classical method. According to the dynamic characteristics of music, the traditional classification method is improved, and an intelligent electronic music classification model based on reasonable weight distribution is proposed. The dynamic parameters of the model are obtained using the steganographic analysis algorithm of weight distribution; the music is then modeled, the obtained vectors are used as the sequence model, and the classification results are obtained. Comparative tests prove that this system overcomes the shortcomings of the traditional identification system, shortens the decoding time for excessively modulated files, is suitable for application in real life, makes it convenient for all walks of life to learn about and understand electronic music, and contributes to the development of electronic music.
[0046] The embodiments of the present disclosure are meant to cover all substitutes, modifications, and variations which fall within the scope of the appended claims. Therefore, within the spirit and principle of the present disclosure, any omission, modification, equivalent substitution, or variation should fall within the protection scope of the present disclosure.

Claims (5)

CLAIMS
1. A method for electronic music classification model construction based on machine learning and deep learning, the method comprising:
using an interface to transmit a received audio signal to an audio processing module, and processing the audio signal through analog-digital conversion and signal amplification to form music data;
constructing a music classification model with a spectrogram as input, based on learning and the structural characteristics of a NN;
obtaining dynamic parameters of the electronic music classification model by using the steganographic analysis algorithm of weight distribution to model the music data; and
classifying the modeled music data using the constructed music classification model.
2. The method according to claim 1, wherein a complementary processing is carried out according to the size of the differential gradient in electronic music.
3. The method according to claim 2, wherein adaptive docking is used for complementary processing when the characteristics of electronic audio are not obvious.
4. The method according to claim 3, wherein a NN multilayer perceptron (MLP) is used to divide the classification process into three layers: an import layer, a classification layer, and an output layer.
5. The method according to claim 4, wherein after various types of original ecological electronic music data are collected, the collected electronic music data is denoised.
LU103103A 2023-04-26 2023-04-26 Method for electronic music classification model construction based on machine learning and deep learning LU103103B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
LU103103A LU103103B1 (en) 2023-04-26 2023-04-26 Method for electronic music classification model construction based on machine learning and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
LU103103A LU103103B1 (en) 2023-04-26 2023-04-26 Method for electronic music classification model construction based on machine learning and deep learning

Publications (1)

Publication Number Publication Date
LU103103B1 true LU103103B1 (en) 2023-11-30

Family

ID=88925379

Family Applications (1)

Application Number Title Priority Date Filing Date
LU103103A LU103103B1 (en) 2023-04-26 2023-04-26 Method for electronic music classification model construction based on machine learning and deep learning

Country Status (1)

Country Link
LU (1) LU103103B1 (en)

Similar Documents

Publication Publication Date Title
US20230056955A1 (en) Deep Learning Based Method and System for Processing Sound Quality Characteristics
AU2013361099B2 (en) Audio decoding with supplemental semantic audio recognition and report generation
WO2019109787A1 (en) Audio classification method and apparatus, intelligent device, and storage medium
US20070291958A1 (en) Creating Music by Listening
CN111369982A (en) Training method of audio classification model, audio classification method, device and equipment
Zhang Music style classification algorithm based on music feature extraction and deep neural network
CN104900238B (en) A kind of audio real-time comparison method based on perception filtering
CN109408660B (en) Music automatic classification method based on audio features
CN104992713B (en) A kind of quick broadcast audio comparison method
CN109584904B (en) Video-song audio-song name recognition modeling method applied to basic music video-song education
Cheng et al. Convolutional neural networks approach for music genre classification
Elowsson et al. Predicting the perception of performed dynamics in music audio with ensemble learning
Reghunath et al. Transformer-based ensemble method for multiple predominant instruments recognition in polyphonic music
CN117294985A (en) TWS Bluetooth headset control method
Mounika et al. Music genre classification using deep learning
CN104900239B (en) A kind of audio real-time comparison method based on Walsh-Hadamard transform
LU103103B1 (en) Method for electronic music classification model construction based on machine learning and deep learning
Zhang Research on music classification technology based on deep learning
Cai et al. Music creation and emotional recognition using neural network analysis
US20220277040A1 (en) Accompaniment classification method and apparatus
CN110739006A (en) Audio processing method and device, storage medium and electronic equipment
Rituerto-González et al. End-to-end recurrent denoising autoencoder embeddings for speaker identification
CN113781989A (en) Audio animation playing and rhythm stuck point identification method and related device
Li et al. Construction of Electronic Music Classification Model Based on Machine Learning and Deep Learning Algorithm
CN114550675A (en) Piano transcription method based on CNN-Bi-LSTM network

Legal Events

Date Code Title Description
FG Patent granted

Effective date: 20231130