CN111681674A - Method and system for identifying musical instrument types based on naive Bayes model - Google Patents
- Publication number
- CN111681674A (application CN202010483915.8A)
- Authority
- CN
- China
- Prior art keywords
- music
- naive bayes
- bayes model
- instrument
- musical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10H2210/041—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal based on mfcc [mel -frequency spectral coefficients]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10H2210/056—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction or identification of individual instrumental parts, e.g. melody, chords, bass; Identification or separation of instrumental parts by their characteristic voices or timbres
Abstract
The invention relates to a method and system for identifying musical instrument types based on a naive Bayes model, comprising the following steps: S1, dividing the music piece to be identified into a number of audio frames; S2, extracting the time-domain information, frequency-domain information and Mel-frequency cepstral coefficients in each audio frame to form the feature vector corresponding to that frame; S3, inputting the audio feature vectors corresponding to a number of musical instruments and the feature vectors corresponding to all the audio frames into a naive Bayes model, and identifying the instruments according to the probability of each instrument appearing in the piece. Through this digitized music feature extraction, artificial intelligence can identify instrument types, timbres and playing techniques, helping to finely distinguish the relationships between homogeneous and heterogeneous instruments, and in particular to separate and precisely identify instruments of the same family whose sound subdivisions, timbres and playing techniques closely overlap.
Description
Technical Field
The invention relates to a method and a system for identifying musical instrument types based on a naive Bayes model, and belongs to the technical field of musical instrument identification.
Background
In recent years, with the rapid development of the internet era, more and more music applications influence people's daily lives, and digital music has grown explosively in the entertainment field. Music is no longer scarce in daily life, music communities have become popular, and the P2P mode of distribution has developed steadily, so helping people find the music they need is an important direction for the future development of music identification technology. Identification based on textual attributes such as song titles and singers is already widespread, and in the nineties identification based on musical features such as melody and rhythm was developed; it became a very widely applied technology soon after it appeared and promoted the development of music identification as a whole.
Currently, systems for identifying the musical instruments used in music are not common. For a music library of large scale, identifying instruments is more difficult than identifying textual attributes or melody and rhythm. Although some instruments can be distinguished to a large degree by waveform analysis, features such as tone, pitch and loudness alone are far from sufficient, so the audio features must be analyzed more accurately and more distinctively to separate the different sounds played by different instruments. Timbre is the attribute of sound quality, distinct from loudness and intensity, by which the ear can distinguish different instruments playing the same note. For example, the human auditory system can distinguish a violin from an oboe at 4410 Hz because their high-frequency overtone components differ and the amplitudes of those high-frequency components differ; this difference is timbre. The key to distinguishing the instruments in music is therefore to distinguish their timbres, and how to characterize music in terms of feature values is an urgent problem to be solved in this field.
Disclosure of Invention
In view of the above disadvantages of the prior art, the present invention provides a method and system for identifying musical instrument types based on a naive Bayes model. Through digitized music feature extraction, artificial intelligence can identify the type, timbre and playing technique of instruments, helping to finely distinguish the relationships between homogeneous and heterogeneous instruments, and in particular to separate and precisely identify instruments of the same family whose sound subdivisions, timbre similarities and playing techniques closely overlap.
In order to achieve this aim, the invention provides a method for identifying musical instrument types based on a naive Bayes model, comprising the following steps: S1, dividing the music piece to be identified into a number of audio frames; S2, extracting the time-domain information, frequency-domain information and Mel-frequency cepstral coefficients in each audio frame to form the feature vector corresponding to that frame; S3, inputting the audio feature vectors corresponding to a number of known musical instruments and the feature vectors corresponding to all the audio frames into a naive Bayes model, and identifying the instruments according to the probability of each instrument appearing in the piece.
Further, if the probability of the instrument appearing in the music exceeds the threshold value, it is determined that the instrument appears in the music, and if the probability of the instrument appearing in the music does not exceed the threshold value, it is determined that the instrument does not appear in the music.
Further, the musical instruments used in the music piece include a primary instrument and secondary instruments, and the primary and secondary instruments are distinguished by the probability of each instrument appearing in the piece as obtained from the naive Bayes model.
Further, the instrument with the highest probability of appearing in the music piece is the main instrument, and the other instruments appearing in the music piece are the secondary instruments.
Further, the output formula of the naive Bayes model is:
P(y_j | X_i) = P(y_j) · P(X_i | y_j) / Σ_{k=1..n} P(y_k) · P(X_i | y_k)
where X_i denotes a frame of the music piece X (z frames in total), y_j denotes a musical instrument (n instruments in total), and, under the naive independence assumption, P(X_i | y_j) is the product of the per-feature likelihoods of the feature vector of frame X_i.
Further, the specific operation process of S3 is as follows: S3.1, inputting the audio feature vectors corresponding to the plurality of musical instruments and the feature vectors corresponding to the audio frames into a pre-trained naive Bayes model; S3.2, calculating P(y_1 | X_i), P(y_2 | X_i), …, P(y_n | X_i) with the output formula of the naive Bayes model; S3.3, obtaining the probability of the musical instrument y_j appearing in the music piece X by the formula P(y_j | X) = (1/z) · Σ_{i=1..z} P(y_j | X_i).
Further, the pre-training process of the pre-trained naive Bayes model is as follows: a music piece whose instruments are known is input into the original naive Bayes model; the probability of a certain instrument appearing in the piece is obtained from the output formula of the model; whether the probability exceeds the threshold is judged, and the judgment result is compared with the instruments actually playing the piece. If they are the same, the naive Bayes model is taken as the final output model; if they differ, the output formula of the naive Bayes model is adjusted until the results are the same.
Further, the frequency-domain information is obtained by performing a Fourier transform on each audio frame; the inverse-frequency-domain information is obtained by rotating the frequency-domain graph formed from the frequency-domain information and representing its amplitude as a gray-scale map; the time-domain information is obtained by stacking the frequency-domain graphs along the time dimension.
Further, a Hamming window is applied to the audio frames to prevent spectral leakage.
The invention also discloses a system for identifying musical instrument types based on the naive Bayes model, comprising: a preprocessing module for dividing the music to be identified into a number of audio frames; a feature extraction module for extracting the time-domain information, frequency-domain information, cepstral-domain information and Mel-frequency cepstral coefficients in each audio frame to form the feature vector corresponding to the frame; and a recognition module for inputting the audio feature vectors corresponding to the musical instruments and the feature vectors corresponding to all the audio frames into the naive Bayes model and recognizing the instruments according to the probability of each instrument appearing in the music.
Due to the adoption of the technical scheme, the invention has the following advantages:
1. Through digitized music feature extraction, the invention enables artificial intelligence to identify the type, timbre and playing technique of musical instruments, helping to finely distinguish the relationships between homogeneous and heterogeneous instruments, and in particular to separate and precisely identify instruments of the same family by their sound subdivisions, timbre similarity and technique overlap.
2. The music feature extraction method and the extracted feature vectors reduce the time consumed by instrument identification in music without affecting the precision and accuracy of the identification.
3. The method can be widely applied to a plurality of fields such as music appreciation, music classification and music recommendation, and the musical instruments used in the music greatly influence the style of the music, so the method can play a certain role in music information retrieval.
4. The invention trains on music with a naive Bayes classification model and represents the instruments that may correspond to a piece as probabilities, so that artificial-intelligence model learning can be applied to identifying key elements in music and common musical structures and rules, providing a reference for better applying artificial intelligence in the music field, for example in sound modification and music composition.
Drawings
FIG. 1 is a flow chart of a naive Bayes model based instrument type identification method in an embodiment of the invention;
FIG. 2 is a flow diagram of a pre-processing process for a musical composition in accordance with an embodiment of the present invention;
FIG. 3 is a flow chart of the audio frame timbre feature extraction process in one embodiment of the present invention;
FIG. 4 is a flow chart of a process for extracting the Mel cepstral coefficient feature of an audio frame according to an embodiment of the present invention;
FIG. 5 is a flow diagram of a naive Bayes classification model identification process in an embodiment of the invention;
FIG. 6 is a flow chart of a naive Bayes classification model training process in an embodiment of the invention.
Detailed Description
The present invention is described in detail through specific embodiments so that those skilled in the art can better understand its technical direction. It should be understood, however, that the detailed description is provided only for a better understanding of the invention and should not be taken as limiting it. In describing the present invention, the terminology used is for the purpose of description only and is not intended to indicate or imply relative importance.
The core of the invention is to form the feature vector of a piece of music by fusing its timbre features with its Mel-frequency cepstral coefficient (MFCC) features, to use this feature vector as input, and to identify the instruments playing the piece with a naive Bayes model. The instruments comprise a primary instrument that plays the leading role and several secondary instruments that accompany it. For example, if the main melody of a piece is carried by the piano, the piano is the primary instrument, while accompanying instruments such as the violin and flute are secondary instruments. The technical scheme of the invention can also be used to rank the importance of each of the secondary instruments.
Example one
A method for identifying a musical instrument category based on a naive bayes model, as shown in fig. 1, comprises the following steps:
S1 divides the music piece to be recognized into a number of audio frames and determines the number of frames. As shown in FIG. 2, each piece of music in the original data set is divided into a number of audio frames. A Hamming window is then applied to each audio frame to prevent spectral leakage; the window also smooths the Gibbs effect between frames. In order to keep both the time-domain and the frequency-domain information, a short-time Fourier transform is performed on the framed and windowed audio frames to obtain a spectrogram.
The process of generating the spectrogram by short-time Fourier transform comprises the following steps:
Frame the long music signal and apply a window; perform a Fourier transform on each audio frame (each frame is a short-time signal, so this is a short-time Fourier transform); rotate the spectrum and represent its magnitude as a gray-scale map; stack the frequency-domain graphs obtained by the Fourier transform along the time dimension to finally obtain the spectrogram. The frequency-domain information is thus obtained by Fourier-transforming each audio frame; the inverse-frequency-domain information is obtained by rotating the frequency-domain graph formed from the frequency-domain information and representing its amplitude as a gray-scale map; and the time-domain information is obtained by stacking the frequency-domain graphs along the time dimension.
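The framing, windowing and stacking steps above can be sketched in a few lines of numpy. This is a minimal sketch, not the patent's implementation; the frame length (25 ms), hop (10 ms), sampling rate (16000 Hz) and FFT size (512) are taken from the MFCC parameters given later, and `signal` is a hypothetical stand-in for a decoded music excerpt.

```python
import numpy as np

sr, frame_len, hop, n_fft = 16000, 400, 160, 512   # 25 ms frames, 10 ms hop
signal = np.random.randn(sr * 5)                   # stand-in for a 5 s music excerpt

# Frame the long signal and apply a Hamming window to each frame.
n_frames = 1 + (len(signal) - frame_len) // hop
frames = np.stack([signal[i * hop : i * hop + frame_len] * np.hamming(frame_len)
                   for i in range(n_frames)])

# Short-time Fourier transform: FFT each windowed frame, take the magnitude,
# and stack the per-frame spectra along the time dimension.
spectrogram = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)).T  # frequency x time
```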
S2, extracting the time domain information, frequency domain information and Mel frequency cepstrum coefficient in the audio frame, and forming the feature vector corresponding to the audio frame.
As shown in FIG. 3, based on the MPEG-7 (Multimedia Content Description Interface) standard, the timbre of each musical instrument is captured at three levels: the time domain (the timbre over time), the frequency domain (the frequencies of the timbre waveform) and the inverse frequency domain (the frequencies of the inverted timbre waveform). The timbre feature elements of these three levels are finely extracted and stored for each frame of each piece in the original data set.
As shown in fig. 4, the process of extracting Mel-Frequency Cepstrum Coefficient (MFCC) is:
2.1 Pre-emphasis
The low-frequency components of the data are stronger than the high-frequency components and are not easy to process, so the low-frequency components are filtered out so that the high-frequency characteristics become more prominent.
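A common realization of this step is a first-order pre-emphasis filter; the sketch below assumes the conventional coefficient 0.97, which the patent does not specify.

```python
# y[n] = x[n] - 0.97 * x[n-1]: attenuates low frequencies relative to high ones.
emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
```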
2.2 Framing
Framing assembles N sampling points into one observation unit. The time covered by each frame is set to 25 ms; since the sampling rate is 16000 Hz, each frame contains 400 sample points. In addition, to avoid excessive variation between two adjacent frames, adjacent frames overlap; since the overlap region is set to 15 ms, a new frame starts every 10 ms.
2.3 windowing of each frame
Because each frame signal is treated as periodic during the transform, sudden changes occur at the two end points of the frame, making the transformed spectrum differ greatly from the spectrum of the original signal. Each frame is therefore windowed so that no abrupt change occurs at the two end points of the Fourier transform of the in-frame signal.
2.4 zero padding for each frame
Since each frame signal is Fourier-transformed, and the transform requires an input of a certain length, the 400 samples in each frame are zero-padded to the nearest suitable length of 512 points.
2.5 Fourier transform of the signals of each frame
A 512-point Fourier transform is performed on each windowed frame to obtain the spectrum of each frame, and the power spectrum of the signal is obtained by taking the absolute value or the square of the spectrum.
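Condensing steps 2.2 to 2.5 and continuing the numpy sketch above (`frames` is the windowed frame matrix), `rfft` zero-pads each 400-sample frame to 512 points before transforming:

```python
mag = np.abs(np.fft.rfft(frames, n=512, axis=1))  # 512-point FFT of each frame
power = mag ** 2                                  # power spectrum, shape (n_frames, 257)
```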
2.6 mel filtering
Forty triangular filters are distributed evenly on the mel-frequency scale, with 50% overlap between every two adjacent filters. The actual frequency is converted to mel frequency; the minimum actual frequency is 0 Hz and the maximum is 16000/2 = 8000 Hz. After conversion to mel frequency, the mel-frequency distribution of the 40 triangular filters is calculated, and the filter positions are then converted back to actual frequency.
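A sketch of this filterbank under the stated parameters (40 triangular filters spanning 0 Hz to 8000 Hz with 50% overlap); the mel conversion formula mel = 2595 · log10(1 + f/700) is the standard one and is assumed here rather than quoted from the patent.

```python
def mel_filterbank(n_filters=40, n_fft=512, sr=16000):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2;
    each filter rises from its left neighbour's centre and falls to its
    right neighbour's centre, giving the 50% overlap described above."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, mid, hi = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, lo:mid] = (np.arange(lo, mid) - lo) / max(mid - lo, 1)
        fbank[i - 1, mid:hi] = (hi - np.arange(mid, hi)) / max(hi - mid, 1)
    return fbank

mel_energies = power @ mel_filterbank().T  # shape (n_frames, 40)
```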
2.7 logarithm of
Taking the logarithm of the outputs of the triangular filterbank yields a result similar to a homomorphic transformation.
2.8 discrete cosine transform (DCT transform)
A DCT (discrete cosine transform) is applied to the log-energy mel spectrum, and the first 13 dimensions of the output are taken, giving the mel cepstrum.
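Steps 2.7 and 2.8 in one short sketch, using scipy's type-II DCT; the small epsilon guarding the logarithm is an implementation detail assumed here.

```python
from scipy.fftpack import dct

log_mel = np.log(mel_energies + 1e-10)                     # step 2.7: logarithm
mfcc = dct(log_mel, type=2, axis=1, norm='ortho')[:, :13]  # step 2.8: first 13 dims
```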
2.9 normalization
All mel cepstra are normalized: first the mean of all cepstral vectors is computed, then the mean vector is subtracted from each cepstral vector to obtain the output Mel-frequency cepstral coefficient feature vectors.
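The normalization, plus the fusion with the S2 timbre features into per-frame feature vectors, might look as follows; `timbre` is a hypothetical array standing in for the time-domain, frequency-domain and inverse-frequency-domain features.

```python
mfcc -= mfcc.mean(axis=0, keepdims=True)  # subtract the mean cepstral vector
timbre = np.random.randn(len(mfcc), 8)    # hypothetical per-frame timbre features
features = np.hstack([timbre, mfcc])      # per-frame feature vectors for S3
```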
S3, inputting the audio feature vectors corresponding to a plurality of existing musical instruments and the feature vectors corresponding to all audio frames into a naive Bayes model, and identifying the musical instruments according to the probability of the musical instruments appearing in the music.
As shown in fig. 5, the specific operation procedure of step S3 is as follows:
S3.1 inputs the set of musical instruments C = {y_1, y_2, …, y_j, …, y_n} and the feature vectors corresponding to the audio frames into a pre-trained naive Bayes model;
S3.2 calculates P(y_1 | X_i), P(y_2 | X_i), …, P(y_n | X_i) with the output formula of the naive Bayes model:
P(y_j | X_i) = P(y_j) · P(X_i | y_j) / Σ_{k=1..n} P(y_k) · P(X_i | y_k)
where X_i denotes a frame of the music piece X (z frames in total) and y_j denotes a musical instrument (n instruments in total).
The probability of each instrument appearing in the music piece X is obtained by the above procedure. Since the probability of an instrument that does not appear in the piece is not necessarily exactly zero, a threshold must be set: if the probability of an instrument appearing in the piece exceeds the threshold, it is judged to appear; otherwise it is judged not to appear. It should be noted that the threshold value must be determined for the specific music or by a general standard; the principle is to remove the instruments that do not appear in the piece without removing secondary instruments that appear only briefly. The threshold can be adjusted when the model is pre-trained.
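A minimal inference sketch using scikit-learn's Gaussian naive Bayes, one plausible realization of the model (the patent names no library). `model` is assumed to be trained as in the pre-training section below, `features` holds the per-frame vectors of the piece, and the threshold value is illustrative.

```python
# model: a pre-trained sklearn.naive_bayes.GaussianNB (see pre-training below).
frame_post = model.predict_proba(features)  # P(y_j | X_i) per frame, shape (z, n)
piece_prob = frame_post.mean(axis=0)        # P(y_j | X): average over the z frames
threshold = 0.05                            # illustrative; tuned during pre-training
present = piece_prob > threshold            # which instruments appear in the piece
```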
The instruments used in the piece comprise primary and secondary instruments, which are distinguished by the probability of each instrument appearing in the piece as given by the naive Bayes model: the instrument with the highest probability is the primary instrument, and the other instruments appearing in the piece are secondary instruments. Usually a piece has only one primary instrument, but some pieces are played by multiple instruments (two or more) whose probabilities differ little. Such cases cannot be generalized; the primary and secondary instruments must then be judged from the style of the piece.
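Continuing the sketch, sorting the piece-level probabilities separates the primary instrument from the secondary ones:

```python
order = np.argsort(piece_prob)[::-1]              # instruments ranked by P(y_j | X)
primary = order[0]                                # highest probability: primary instrument
secondary = [j for j in order[1:] if present[j]]  # others above threshold: secondary
```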
As shown in FIG. 6, the pre-training process of the pre-trained naive Bayes model is as follows: a music piece whose instruments are known is input into the original naive Bayes model; the probability of each instrument appearing in the piece is obtained from the output formula of the model; whether the probability exceeds the threshold is judged, and the judgment result is compared with the instruments actually playing the piece. If they agree, the naive Bayes model is taken as the final output model; if they differ, the output formula of the naive Bayes model is adjusted until the results agree.
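A hedged sketch of this loop: fit a Gaussian naive Bayes on frames whose instrument labels are known, then adjust the threshold until the piece-level decisions match the known instrumentation. `train_features`, `train_labels`, `val_features` and `val_truth` are hypothetical labeled data, and the candidate thresholds are illustrative.

```python
from sklearn.naive_bayes import GaussianNB

model = GaussianNB()
model.fit(train_features, train_labels)  # frames with known instrument labels

# Sweep candidate thresholds until the piece-level decisions agree with
# the instruments actually known to play the validation piece.
for t in (0.01, 0.02, 0.05, 0.1):
    decisions = model.predict_proba(val_features).mean(axis=0) > t
    if np.array_equal(decisions, val_truth):
        threshold = t
        break
```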
Through the steps, on the basis of obtaining a naive Bayes model through training, musical instruments used by the music needing to be identified can be classified, and meanwhile, because the obtained output result is the probability value of each musical instrument used by each music, the results can be sorted according to needs, and the main musical instrument and the secondary musical instrument of the music can be distinguished.
Example two
Based on the same inventive concept, the embodiment also discloses a system for identifying the type of musical instrument based on the naive bayes model, which comprises:
the preprocessing module is used for dividing the music to be identified into a plurality of audio frames;
the characteristic extraction module is used for extracting time domain information, frequency domain information, cepstrum domain information and Mel frequency cepstrum coefficients in the audio frame to form a characteristic vector corresponding to the audio frame;
and the recognition module is used for inputting the audio feature vectors corresponding to the plurality of musical instruments and the feature vectors corresponding to all the audio frames into the naive Bayes model and recognizing the musical instruments according to the probability of the musical instruments appearing in the music.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto; any change or substitution that a person skilled in the art can easily conceive within the technical scope disclosed in the present application shall be covered by it. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A naive Bayes model-based musical instrument type identification method is characterized by comprising the following steps:
s1 dividing the music to be identified into a plurality of audio frames;
s2, extracting the time-domain information, inverse-frequency-domain information and Mel-frequency cepstral coefficients in the audio frames to form the feature vectors corresponding to the audio frames;
s3, inputting the audio feature vectors corresponding to a plurality of existing musical instruments and the feature vectors corresponding to all the audio frames into a naive Bayes model, and identifying the musical instruments according to the probability of the musical instruments appearing in the music.
2. The naive bayes model-based instrument class identification method of claim 1, wherein if the probability that the instrument appears in the music piece exceeds a threshold value, it is judged that the instrument appears in the music piece, and if the probability that the instrument appears in the music piece does not exceed the threshold value, it is judged that the instrument does not appear in the music piece.
3. The naive bayes model-based instrument class identification method of claim 2, wherein the instruments used in the piece of music comprise primary and secondary instruments, the primary and secondary instruments being distinguished by the probability of each of said instruments appearing in the piece of music obtained through the naive bayes model.
4. The naive bayes model-based instrument class identification method as claimed in claim 3, wherein the instrument with the highest probability appearing in said piece of music is the primary instrument and the other instruments appearing in said piece of music are the secondary instruments.
5. The naive bayes model-based instrument class identification method as claimed in any of claims 1-4, wherein the output formula of the naive Bayes model is:
P(y_j | X_i) = P(y_j) · P(X_i | y_j) / Σ_{k=1..n} P(y_k) · P(X_i | y_k)
where X_i denotes a frame of the music piece X (z frames in total) and y_j denotes a musical instrument (n instruments in total).
6. The naive bayes model-based instrument class identification method as claimed in claim 5, wherein the specific operation procedure of S3 is as follows:
s3.1, inputting the audio feature vectors corresponding to the plurality of musical instruments and the feature vectors corresponding to the audio frames into a pre-trained naive Bayes model;
s3.2 calculating P (y) by using output formula of naive Bayes model1|Xi),P(y2|Xi),…,P(yn|Xi);
7. The naive bayes model-based instrument class identification method of claim 6, wherein the pre-training process of the pre-trained naive bayes model is:
inputting a music piece whose instruments are known into an original naive Bayes model; obtaining from the output formula of the naive Bayes model the probability of a certain instrument appearing in the piece; judging whether the probability exceeds a threshold value and comparing the judgment result with the instruments actually playing the piece; if they are the same, taking the naive Bayes model as the final output model; if they differ, adjusting the output formula of the naive Bayes model until the results are the same.
8. The naive bayes model-based instrument class identification method as set forth in any of claims 1-4, wherein said frequency domain information is obtained by performing a fourier transform on each of said audio frames, and said inverse frequency domain information is obtained by rotating a frequency domain graph formed by said frequency domain information and representing the amplitude of said frequency domain graph by a gray scale; the time domain information is obtained by stacking the frequency domain maps in a time dimension.
9. A naive Bayes model based instrument class identification method as in any of claims 1-4, wherein a Hamming window is applied to a number of said audio frames to prevent spectral leakage.
10. A naive Bayes model based instrument type identification system, comprising:
the preprocessing module is used for dividing the music to be identified into a plurality of audio frames;
the feature extraction module is used for extracting time domain information, frequency domain information, cepstrum domain information and Mel frequency cepstrum coefficients in the audio frame to form feature vectors corresponding to the audio frame;
and the recognition module is used for inputting the audio feature vectors corresponding to the plurality of musical instruments and the feature vectors corresponding to all the audio frames into a naive Bayes model and recognizing the musical instruments according to the probability of the musical instruments appearing in the music.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010483915.8A CN111681674B (en) | 2020-06-01 | 2020-06-01 | Musical instrument type identification method and system based on naive Bayesian model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111681674A true CN111681674A (en) | 2020-09-18 |
CN111681674B CN111681674B (en) | 2024-03-08 |
Family
ID=72453206
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010483915.8A Active CN111681674B (en) | 2020-06-01 | 2020-06-01 | Musical instrument type identification method and system based on naive Bayesian model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111681674B (en) |
- 2020-06-01: CN CN202010483915.8A, patent CN111681674B (en), status: Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10319948A (en) * | 1997-05-15 | 1998-12-04 | Nippon Telegr & Teleph Corp <Ntt> | Sound source kind discriminating method of musical instrument included in musical playing |
US20080314231A1 (en) * | 2007-06-20 | 2008-12-25 | Mixed In Key, Llc | System and method for predicting musical keys from an audio source representing a musical composition |
CN101546556A (en) * | 2008-03-28 | 2009-09-30 | 展讯通信(上海)有限公司 | Classification system for identifying audio content |
CN103761965A (en) * | 2014-01-09 | 2014-04-30 | 太原科技大学 | Method for classifying musical instrument signals |
CN105719661A (en) * | 2016-01-29 | 2016-06-29 | 西安交通大学 | Automatic discrimination method for playing timbre of string instrument |
CN106952644A (en) * | 2017-02-24 | 2017-07-14 | 华南理工大学 | A kind of complex audio segmentation clustering method based on bottleneck characteristic |
CN108962279A (en) * | 2018-07-05 | 2018-12-07 | 平安科技(深圳)有限公司 | New Method for Instrument Recognition and device, electronic equipment, the storage medium of audio data |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113421589A (en) * | 2021-06-30 | 2021-09-21 | 平安科技(深圳)有限公司 | Singer identification method, singer identification device, singer identification equipment and storage medium |
CN113421589B (en) * | 2021-06-30 | 2024-03-01 | 平安科技(深圳)有限公司 | Singer identification method, singer identification device, singer identification equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN111681674B (en) | 2024-03-08 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |