CN115910099B - Automatic musical instrument identification method based on depth probability map neural network - Google Patents

Automatic musical instrument identification method based on depth probability map neural network

Info

Publication number
CN115910099B
CN115910099B (application CN202211391028.3A)
Authority
CN
China
Prior art keywords
formula
instrument
improved
model
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211391028.3A
Other languages
Chinese (zh)
Other versions
CN115910099A (en)
Inventor
Zhang Jian (张健)
Hou Haiwei (侯海薇)
Du Wei (杜威)
Ding Shifei (丁世飞)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Mining and Technology CUMT
Original Assignee
China University of Mining and Technology CUMT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Mining and Technology CUMT filed Critical China University of Mining and Technology CUMT
Priority to CN202211391028.3A
Publication of CN115910099A
Application granted
Publication of CN115910099B
Legal status: Active


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Image Analysis (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The automatic musical instrument identification method based on the depth probability map neural network comprises the steps of: dividing the audio data into time slices, splitting a piece of audio into N fixed-length time slices and recording the labels corresponding to each time slice; converting the audio data of each time slice into a mel frequency spectrum image and then regularizing the image; extracting features from the regularized mel spectrum images with a convolutional neural network, mapping the extracted features from two dimensions to one dimension, and combining them with the labels to form time-slice mel-spectrum image feature-label pairs; constructing an improved conditional restricted Boltzmann machine (CRBM) model and obtaining its conditional probability distribution; training the improved CRBM model using Gibbs sampling; constructing an objective function and training the improved CRBM model with it to obtain the automatic instrument identification model; and outputting the predicted instrument labels. The method effectively addresses the difficulty, in the prior art, of accurately identifying the instruments in polyphonic music.

Description

Automatic musical instrument identification method based on depth probability map neural network
Technical Field
The invention belongs to the technical field of voice recognition, and particularly relates to an automatic musical instrument recognition method based on a depth probability map neural network.
Background
With the development of artificial intelligence, machine-learning-based intelligent music analysis has gradually become a core technology and research direction in fields such as main-melody recognition and music-style detection, and the automatic recognition of the instruments in polyphonic music is a key step toward intelligent music analysis. In current research and applications, the mainstream approach combines signal processing with machine learning. For instrument recognition in polyphonic music, however, the harmonic nature of musical instruments causes complex signal superposition in both the time domain and the frequency domain, which degrades recognition accuracy; this phenomenon has long been a difficulty to be solved in instrument recognition for polyphonic music.
From a signal-processing perspective, instrument recognition can be viewed as a branch of audio data processing; unlike other audio data, however, vocal and instrumental music have distinctive harmonic energy distributions. Some scholars therefore extract features from polyphonic music from an acoustic (and psychoacoustic) perspective, obtaining feature representations such as playing time, spectral centroid, energy envelope and mel-frequency cepstral coefficients, and then design corresponding traditional machine learning or deep learning methods to carry out the instrument recognition task. Although such methods manually extract acoustic (psychoacoustic) features, they still do not recognize polyphonic music satisfactorily, especially when distinguishing different instruments within the same instrument family. The reason is that the harmonic distributions of instruments differ between instrument families but are similar within the same family, and the harmonic nature of instruments causes complex signals to be superimposed on each other in both the time domain and the frequency domain: the signal at a certain frequency may be the fundamental frequency of one instrument while the fundamental frequencies and harmonics of other instruments are superimposed at the same frequency. Therefore, although the result of instrument identification is independent of the pitch (fundamental frequency) being played, machine-learning-based instrument identification is strongly affected by the fundamental frequency and its harmonics, and it is difficult to distinguish the harmonic distributions of different instruments effectively. With the development of deep learning, some scholars have drawn on the excellent performance of deep neural networks in image processing tasks: the time-domain polyphonic music is expressed in the frequency domain, a spectrogram or mel spectrogram is constructed, and a deep neural network is then used to complete the instrument recognition task through supervised learning. Deep neural networks based on frequency-domain images have brought great progress to instrument recognition, but some problems remain unresolved. First, such algorithms extract the harmonic characteristics of instruments from single-instrument audio and then apply them to multi-instrument polyphonic test data for multi-label classification, so the classification result depends on whether the model has fully learned the characteristics of the instruments on the training set. In addition, since the test data is polyphonic music, the superposition of multiple instruments in the spectrum strongly interferes with the recognition result. Moreover, neural-network-based instrument recognition methods generally analyze the problem from the perspective of spectral image features or of label-specific features alone, converting instrument recognition into image recognition, and rarely consider the similarity between instrument categories as a key feature for distinguishing the harmonic distributions of instruments at the same time, which leads to low recognition accuracy.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an automatic musical instrument identification method based on a depth probability map neural network, which has simple steps and high identification accuracy and effectively solves the problem, in the prior art, that the instruments in polyphonic music are difficult to identify accurately.
In order to achieve the above object, the present invention provides an automatic instrument recognition method based on a depth probability map neural network, comprising the steps of:
step one: preprocessing data;
s11, dividing the audio data into time slices, dividing a section of audio into N time slices with fixed lengths, and simultaneously recording labels corresponding to each time slice;
s12, converting the obtained audio data of each time slice into a Mel frequency spectrum image, then regularizing the image, and regularizing the value range of the pixel points to obtain a regularized Mel frequency spectrum image;
step two: extracting the characteristics of the data;
performing feature extraction on the obtained regularized Mel spectrum image by using a convolutional neural network to obtain image features of Mel spectrum, mapping the extracted features from two dimensions to one dimension, and combining the tags to form a time slice Mel spectrum image feature tag pair;
step three: modeling tag correlation features using an improved CRBM model;
s31, constructing an improved CRBM model according to the energy function proposed by the formula (1);
where x represents the input spectral feature data, y represents the label corresponding to x, which is also the expected output in the prediction stage, h represents the expected feature expression, s is an additional introduced variable, and W, W_y, β, μ, b are training parameters;
s32, obtaining a conditional joint probability distribution according to a formula (1), as shown in a formula (2);
wherein Z represents the partition function;
s33, obtaining formulas (3) and (4) based on the formula (2), and obtaining the conditional probability P (h|x, y) of h according to the formula (3); obtaining the activation probability of each component of h according to formula (4); obtaining a conditional probability formula (5) of y based on x and h according to formulas (2) and (3);
P(h|x,y) = Π_i P(h_i|x,y)   (3);
s34, obtaining formulas (6) and (7) based on formulas (2) and (4), and obtaining the activation probability of each component of S according to formula (6); obtaining the activation probability of each component of y according to formula (7);
wherein N represents a Gaussian distribution;
step four: training an improved CRBM model classified as a target by utilizing the correlation characteristics, and outputting a predicted instrument label;
s41, constructing an objective function according to a formula (8), and training an improved CRBM model by using the objective function;
Loss = log(p(y|x)) + Rank-Loss(y|x) + σ||y||_l1   (8);
where log(p(y|x)) represents the likelihood function, Rank-Loss(y|x) represents the ranking loss function, ||y||_l1 represents l1 regularization, and σ is a hyper-parameter;
s42, based on formulas (3), (4), (6) and (7), using Gibbs sampling to obtain a gradient formula (9) of a likelihood function in a formula (8), calculating the gradient of the formula (8) according to the formula (9), training an improved CRBM model according to the gradient of the formula (8), and obtaining a feature expression h containing label correlation and a conditional probability of the label through training;
where E represents a mathematical expectation, θ represents a parameter set, and two mathematical expectations are obtained using Gibbs sampling according to formulas (4), (6) and (7);
s43, after the improved CRBM training is completed, the label y which maximizes log (p (y|x) is calculated according to a formula (10) in the face of the input of the instrument category to be predicted, and therefore a predicted instrument label is output according to the instrument automatic identification model based on the improved CRBM, and an instrument automatic identification model is obtained;
preferably, in step S12 of the first step, the audio data is converted into a mel-frequency spectrum image using an open source tool.
Further, in step two, the features of the mel spectrum image are extracted using a ResNet101 neural network that has been pre-trained on the ImageNet dataset. In this way, mel-spectrum mapping together with ResNet101 image feature extraction effectively extracts features usable for the instrument recognition task, and the polyphonic-instrument recognition method based on label correlation features further improves recognition accuracy.
In the data-processing part of the method, the music audio is cut into time slices and silent segments are removed, and each time slice is then converted into a mel frequency spectrum. In the model-construction part, a convolutional neural network extracts image features from the mel frequency spectra of the polyphonic music, an improved conditional restricted Boltzmann machine (Conditional Restricted Boltzmann Machine, CRBM) model is used to model the correlations between these image features and the instrument labels corresponding to each time slice's mel frequency spectrum, and the improved CRBM model is finally trained; based on the obtained correlation features, the predicted instrument labels are output. Since a piece of polyphonic music is typically played by several instruments at the same time, instrument recognition in polyphonic music is naturally a multi-label recognition problem. Meanwhile, existing neural-network-based instrument recognition methods do not treat the correlation between instrument categories as a key feature for distinguishing the harmonic distributions of instruments when recognizing polyphonic music. To this end, the invention learns, from the two angles of spectral feature extraction and instrument label-specific features, the correlations between the harmonic features of the instruments in polyphonic music and the different instrument labels, together with the label-specific features (label-specific features aim to extract, from the label's perspective, those features in the data that are directly related to the label). In the improved CRBM model the label-specific features are expressed in the form of the conditional probability P(h|x,y); the variable pair (x,y) is introduced to model the correlation between image features and labels, the activation probability of y models the correlation between labels, and P(y|x) and P(y_i|x,h,s) model the association between harmonic features and the different instrument labels, so that features usable for the instrument recognition task are fully extracted. In this way the method not only draws on the idea of multi-label learning but also models the correlations among the multiple instrument labels in polyphonic music from the two angles of image features and label-specific features, so that the instruments in polyphonic music can be identified from the label correlations and from the correlations between labels and spectral image features: which instruments are likely to have overlapping harmonics can be modeled through the label correlations, and the possible overlaps are associated with the corresponding spectral images through the conditional probabilities between spectral image features and their labels, so that the instruments in polyphonic music can be effectively distinguished and identified. The method overcomes the shortcoming of existing recognition methods, which analyze only spectral image features or only label-specific features, thereby greatly improving recognition accuracy and solving the instrument recognition problem for polyphonic music.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a modified CRBM model constructed in accordance with the present invention;
fig. 3 is a block diagram of an automatic recognition model of musical instruments in the present invention.
Detailed Description
The present invention will be further described below.
As shown in fig. 1 and 3, the present invention provides an automatic musical instrument recognition method based on a depth probability map neural network, comprising the steps of:
step one: preprocessing data;
s11, dividing the audio data into time slices, dividing a section of audio into N time slices with fixed lengths, and simultaneously recording labels corresponding to each time slice;
s12, converting the obtained audio data of each time slice into a Mel frequency spectrum image, then regularizing the image, and regularizing the value range of the pixel points to obtain a regularized Mel frequency spectrum image;
step two: extracting the characteristics of the data;
performing feature extraction on the obtained regularized Mel spectrum image by using a convolutional neural network to obtain image features of Mel spectrum, mapping the extracted features from two dimensions to one dimension, and combining the tags to form a time slice Mel spectrum image feature tag pair;
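The feature-extraction step could look roughly as follows, assuming a torchvision ResNet101 backbone with its classification head removed, a 224x224 resize, and channel replication for the single-channel spectrogram; these implementation details are assumptions, since the patent only specifies an ImageNet-pretrained ResNet101.

import torch
import torch.nn as nn
from torchvision import models
from torchvision.transforms import functional as TF

def extract_feature_label_pairs(mel_images, labels):
    # ImageNet-pretrained ResNet101 with the classifier removed: the globally
    # pooled output is used as the one-dimensional image feature.
    backbone = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
    backbone.fc = nn.Identity()
    backbone.eval()
    pairs = []
    with torch.no_grad():
        for img, label in zip(mel_images, labels):
            x = torch.from_numpy(img).float().unsqueeze(0)      # (1, H, W)
            x = TF.resize(x, [224, 224], antialias=True)        # assumed input size
            x = x.repeat(3, 1, 1).unsqueeze(0)                  # replicate to 3 channels
            feat = backbone(x).squeeze(0)                       # 2-D map -> 1-D vector
            pairs.append((feat, torch.as_tensor(label, dtype=torch.float32)))
    return pairs  # time-slice mel-spectrum image feature-label pairs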
step three: modeling tag correlation features using an improved CRBM model, as shown in fig. 2;
s31, constructing an improved CRBM model according to the energy function proposed by the formula (1);
where x represents the input spectral feature data, y represents the label corresponding to x, which is also the expected output in the prediction stage, h represents the expected feature expression, s is an additional introduced variable, and W, W_y, β, μ, b are training parameters;
s32, obtaining a conditional joint probability distribution according to a formula (1), as shown in a formula (2);
wherein Z represents the partition function;
s33, obtaining formulas (3) and (4) based on the formula (2), and obtaining the conditional probability P (h|x, y) of h according to the formula (3); obtaining the activation probability of each component of h according to formula (4); the conditional probability formula (5) of y based on x and h can be obtained according to the formulas (2) and (3);
P(h|x,y) = Π_i P(h_i|x,y)   (3);
however, the covariance matrix of equation (5) is a non-diagonal matrix, and although equation (5) can represent the correlation between the components of the tag y through the covariance matrix, equation (5) is difficult to directly sample and thus is not suitable for training the improved CRBM model, and the present invention further decomposes the conditional probability of y according to S34 in order to train the model.
S34, obtaining formulas (6) and (7) based on formulas (2) and (4), and obtaining the activation probability of each component of s according to formula (6); obtaining the activation probability of each component of y according to formula (7);
wherein N represents a Gaussian distribution; thus, conditioned on s, the components of y become conditionally independent, and the computation can be carried out through Gibbs sampling.
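The Gibbs sweep implied by S33 and S34 can be sketched structurally as below; since formulas (4), (6) and (7) are only available as equation images, the helpers h_logits, s_mean, s_std and y_logits are hypothetical placeholders standing in for the actual conditional parameters.

import torch

def gibbs_sweep(x, y, h, s, model, n_steps=1):
    # One sweep of the block Gibbs sampler over (h, s, y) with the input x clamped.
    for _ in range(n_steps):
        # Formula (4): activation probability of each component of h given (x, y).
        h = torch.bernoulli(torch.sigmoid(model.h_logits(x, y)))
        # Formula (6): each component of s is Gaussian given (x, y, h).
        s = model.s_mean(x, y, h) + model.s_std * torch.randn_like(s)
        # Formula (7): given (x, h, s) the components of y are conditionally independent.
        y = torch.bernoulli(torch.sigmoid(model.y_logits(x, h, s)))
    return y, h, s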
Step four: training an improved CRBM model classified as a target by utilizing the correlation characteristics, and outputting a predicted instrument label;
s41, constructing an objective function according to a formula (8), and training an improved CRBM model by using the objective function;
Loss = log(p(y|x)) + Rank-Loss(y|x) + σ||y||_l1   (8);
where log(p(y|x)) represents the likelihood function, Rank-Loss(y|x) represents the ranking loss function, ||y||_l1 represents l1 regularization, and σ is a hyper-parameter;
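A sketch of the objective in formula (8) follows; the pairwise hinge form of Rank-Loss and the default value of σ are assumptions, since the patent names the three terms without spelling out the ranking loss.

import torch

def objective(log_p_y_given_x, y_scores, y_true, sigma=0.01):
    # Pairwise hinge ranking loss: every relevant label should be scored above
    # every irrelevant one (assumed form of Rank-Loss(y|x)).
    pos = y_scores[y_true > 0.5]
    neg = y_scores[y_true <= 0.5]
    if pos.numel() > 0 and neg.numel() > 0:
        rank_loss = torch.clamp(1.0 - (pos.unsqueeze(1) - neg.unsqueeze(0)), min=0.0).mean()
    else:
        rank_loss = torch.zeros((), device=y_scores.device)
    l1_term = sigma * y_scores.abs().sum()              # sigma * ||y||_l1
    return log_p_y_given_x + rank_loss + l1_term        # formula (8)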
s42, based on the formulas (3), (4), (6) and (7), using a gradient formula (9) of likelihood function in Gibbs sampling acquisition formula (8),
where E represents a mathematical expectation and θ represents the parameter set; both mathematical expectations can be obtained using Gibbs sampling according to formulas (4), (6) and (7). The gradient of formula (8) can therefore be calculated according to formula (9), and the improved CRBM model can be trained according to the gradient of formula (8); training yields a feature expression h that contains the label correlations, together with the conditional probability of the labels;
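One possible training step following S42 is sketched below, reusing the gibbs_sweep sketch above; the contrastive-divergence-style use of an energy difference and the sample_h_s and energy helpers are assumptions standing in for formulas (1) and (9), which are only available as equation images.

def training_step(model, x, y, optimizer, k=1):
    # Positive phase: clamp (x, y) from a feature-label pair and sample h, s.
    h_pos, s_pos = model.sample_h_s(x, y)
    # Negative phase: k Gibbs sweeps approximate the model expectation in formula (9).
    y_neg, h_neg, s_neg = y.clone(), h_pos.clone(), s_pos.clone()
    for _ in range(k):
        y_neg, h_neg, s_neg = gibbs_sweep(x, y_neg, h_neg, s_neg, model)
    # The difference of the two sampled expectations drives the parameter update.
    loss = model.energy(x, y, h_pos, s_pos) - model.energy(x, y_neg, h_neg, s_neg)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()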
s43, after the improved CRBM training is completed, calculating a label y maximizing log (p (y|x) according to a formula (10) in face of input of a musical instrument class to be predicted;
Calculating the label y that maximizes log(p(y|x)) is accomplished by computing the gradient of formula (10) with respect to y; the improved-CRBM-based model thereby outputs the predicted instrument labels, and the automatic instrument identification model is obtained.
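The prediction step S43 could be realized roughly as follows, performing gradient ascent on log p(y|x) over a relaxed continuous y; the log_p_y_given_x helper, the step size, the iteration count and the 0.5 threshold are all assumptions, since formula (10) is only available as an equation image.

import torch

def predict_labels(model, x, n_labels, steps=50, lr=0.1):
    # Relaxed label vector, updated by gradient ascent on log p(y|x).
    y = torch.full((n_labels,), 0.5, requires_grad=True)
    opt = torch.optim.SGD([y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        (-model.log_p_y_given_x(x, y)).backward()   # minimize the negative, i.e. ascend
        opt.step()
        with torch.no_grad():
            y.clamp_(0.0, 1.0)
    return (y.detach() > 0.5).int()                 # predicted multi-label instrument tags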
Preferably, in step S12 of the first step, the audio data is converted into a mel-frequency spectrum image using an open source tool.
In order to more fully extract features that can be used for the instrument recognition task, in step two the features of the mel spectrum image are extracted using a ResNet101 neural network pre-trained on the ImageNet dataset. In this way, mel-spectrum mapping together with ResNet101 image feature extraction effectively extracts features usable for the instrument recognition task, and the polyphonic-instrument recognition method based on label correlation features further improves recognition accuracy.
In the data-processing part of the method, the music audio is cut into time slices and silent segments are removed, and each time slice is then converted into a mel frequency spectrum. In the model-construction part, a convolutional neural network extracts image features from the mel frequency spectra of the polyphonic music, an improved conditional restricted Boltzmann machine (Conditional Restricted Boltzmann Machine, CRBM) model is used to model the correlations between these image features and the instrument labels corresponding to each time slice's mel frequency spectrum, and the improved CRBM model is finally trained; based on the obtained correlation features, the predicted instrument labels are output. Since a piece of polyphonic music is typically played by several instruments at the same time, instrument recognition in polyphonic music is naturally a multi-label recognition problem. Meanwhile, existing neural-network-based instrument recognition methods do not treat the correlation between instrument categories as a key feature for distinguishing the harmonic distributions of instruments when recognizing polyphonic music. To this end, the invention learns, from the two angles of spectral feature extraction and instrument label-specific features, the correlations between the harmonic features of the instruments in polyphonic music and the different instrument labels, together with the label-specific features (label-specific features aim to extract, from the label's perspective, those features in the data that are directly related to the label). In the improved CRBM model the label-specific features are expressed in the form of the conditional probability P(h|x,y); the variable pair (x,y) is introduced to model the correlation between image features and labels, the activation probability of y models the correlation between labels, and P(y|x) and P(y_i|x,h,s) model the association between harmonic features and the different instrument labels, so that features usable for the instrument recognition task are fully extracted. In this way the method not only draws on the idea of multi-label learning but also models the correlations among the multiple instrument labels in polyphonic music from the two angles of image features and label-specific features, so that the instruments in polyphonic music can be identified from the label correlations and from the correlations between labels and spectral image features: which instruments are likely to have overlapping harmonics can be modeled through the label correlations, and the possible overlaps are associated with the corresponding spectral images through the conditional probabilities between spectral image features and their labels, so that the instruments in polyphonic music can be effectively distinguished and identified. The method overcomes the shortcoming of existing recognition methods, which analyze only spectral image features or only label-specific features, thereby greatly improving recognition accuracy and solving the instrument recognition problem for polyphonic music.
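Putting the pieces together, a hypothetical end-to-end flow over the sketches above might read as follows; the epoch count and the orchestration details are assumptions, not values from the patent.

def run_pipeline(train_files, train_labels, model, optimizer, test_file, n_labels):
    # Preprocess and featurize the training audio using the sketches above.
    pairs = []
    for path, labels in zip(train_files, train_labels):
        images = slice_and_melspec(path)
        pairs += extract_feature_label_pairs(images, [labels] * len(images))
    # Train the improved CRBM on the time-slice feature-label pairs.
    for epoch in range(10):                      # epoch count is an assumption
        for x, y in pairs:
            training_step(model, x, y, optimizer)
    # Predict instrument labels for each time slice of a test recording.
    test_images = slice_and_melspec(test_file)
    dummy = [[0] * n_labels for _ in test_images]
    test_pairs = extract_feature_label_pairs(test_images, dummy)
    return [predict_labels(model, x, n_labels) for x, _ in test_pairs]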

Claims (3)

1. An automatic musical instrument identification method based on a depth probability map neural network is characterized by comprising the following steps:
step one: preprocessing data;
s11, dividing the audio data into time slices, dividing a section of audio into N time slices with fixed lengths, and simultaneously recording labels corresponding to each time slice;
s12, converting the obtained audio data of each time slice into a Mel frequency spectrum image, then regularizing the image, and regularizing the value range of the pixel points to obtain a regularized Mel frequency spectrum image;
step two: extracting the characteristics of the data;
performing feature extraction on the obtained regularized Mel spectrum image by using a convolutional neural network to obtain image features of Mel spectrum, mapping the extracted features from two dimensions to one dimension, and combining the tags to form a time slice Mel spectrum image feature tag pair;
step three: modeling tag correlation features using an improved CRBM model, the CRBM model being a conditional restricted boltzmann machine;
s31, constructing an improved CRBM model according to the energy function proposed by the formula (1);
(1);
where x represents the input spectral feature data, y represents the label corresponding to x, which is also the expected output in the prediction stage, h represents the expected feature expression, s is an additional introduced variable, and W, W_y, β, μ, b are training parameters;
s32, obtaining a conditional joint probability distribution according to a formula (1), as shown in a formula (2);
(2);
where Z represents the partition function;
s33, obtaining formulas (3) and (4) based on formula (2), and obtaining according to formula (3)hConditional probability of (2)The method comprises the steps of carrying out a first treatment on the surface of the Obtained according to formula (4)hActivation probability of each component of (a); obtained according to formulas (2), (3)yBased onxAndhconditional probability formula (5) of (2);
P(h|x,y) = Π_i P(h_i|x,y)   (3);
(4);
(5);
s34, obtaining formulas (6) and (7) based on formulas (2) and (4), and obtaining according to formula (6)sActivation probability of each component of (a); obtained according to formula (7)yActivation probability of each component of (a);
(6);
(7);
where N represents a Gaussian distribution;
step four: training an improved CRBM model classified as a target by utilizing the correlation characteristics, and outputting a predicted instrument label;
s41, constructing an objective function according to a formula (8), and training an improved CRBM model by using the objective function;
Loss = log(p(y|x)) + Rank-Loss(y|x) + σ||y||_l1   (8);
where log(p(y|x)) represents the likelihood function, Rank-Loss(y|x) represents the ranking loss function, ||y||_l1 represents l1 regularization, and σ is a hyper-parameter;
s42, based on formulas (3), (4), (6) and (7), using Gibbs sampling to obtain a gradient formula (9) of likelihood function in formula (8), calculating the gradient of formula (8) according to formula (9), training an improved CRBM model according to the gradient of formula (8), and training to obtain a feature expression containing label correlationhAnd conditional probability of the tag;
(9);
where E represents a mathematical expectation, θ represents the parameter set, and the two mathematical expectations are obtained using Gibbs sampling according to formulas (4), (6) and (7);
s43, after finishing the improved CRBM training, calculating log according to the formula (10) to obtain the input of the type of musical instrument to be predictedp(y|x) Maximum tagyWhereby a predicted instrument tag is output from the improved CRBM based instrument automatic identification model to obtain an instrument automatic identification model;
(10).
2. The automatic instrument recognition method based on the depth probability map neural network according to claim 1, wherein in step S12, the audio data is converted into mel-frequency spectrum images using an open source tool.
3. The automatic instrument recognition method based on the depth probability map neural network according to claim 1 or 2, wherein in step two, the features of the mel spectrum images are extracted using a ResNet101 neural network pre-trained on the ImageNet dataset.
CN202211391028.3A 2022-11-08 2022-11-08 Automatic musical instrument identification method based on depth probability map neural network Active CN115910099B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211391028.3A CN115910099B (en) 2022-11-08 2022-11-08 Automatic musical instrument identification method based on depth probability map neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211391028.3A CN115910099B (en) 2022-11-08 2022-11-08 Automatic musical instrument identification method based on depth probability map neural network

Publications (2)

Publication Number Publication Date
CN115910099A CN115910099A (en) 2023-04-04
CN115910099B true CN115910099B (en) 2023-08-04

Family

ID=86492715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211391028.3A Active CN115910099B (en) 2022-11-08 2022-11-08 Automatic musical instrument identification method based on depth probability map neural network

Country Status (1)

Country Link
CN (1) CN115910099B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106328121A (en) * 2016-08-30 2017-01-11 南京理工大学 Chinese traditional musical instrument classification method based on depth confidence network
CN106920544A (en) * 2017-03-17 2017-07-04 深圳市唯特视科技有限公司 A kind of audio recognition method based on deep neural network features training
CN109918535A (en) * 2019-01-18 2019-06-21 华南理工大学 Music automatic marking method based on label depth analysis
CN110909820A (en) * 2019-12-02 2020-03-24 齐鲁工业大学 Image classification method and system based on self-supervision learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8873813B2 (en) * 2012-09-17 2014-10-28 Z Advanced Computing, Inc. Application of Z-webs and Z-factors to analytics, search engine, learning, recognition, natural language, and other utilities
US11531852B2 (en) * 2016-11-28 2022-12-20 D-Wave Systems Inc. Machine learning systems and methods for training with noisy labels

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106328121A (en) * 2016-08-30 2017-01-11 南京理工大学 Chinese traditional musical instrument classification method based on depth confidence network
CN106920544A (en) * 2017-03-17 2017-07-04 深圳市唯特视科技有限公司 A kind of audio recognition method based on deep neural network features training
CN109918535A (en) * 2019-01-18 2019-06-21 华南理工大学 Music automatic marking method based on label depth analysis
CN110909820A (en) * 2019-12-02 2020-03-24 齐鲁工业大学 Image classification method and system based on self-supervision learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Conditional Restricted Boltzmann Machines for Multi-label Learning with Incomplete Labels; Xin Li; Proceedings of Machine Learning Research; Vol. 38; full text *

Also Published As

Publication number Publication date
CN115910099A (en) 2023-04-04

Similar Documents

Publication Publication Date Title
CN105023573B (en) It is detected using speech syllable/vowel/phone boundary of auditory attention clue
CN107680582A (en) Acoustic training model method, audio recognition method, device, equipment and medium
WO2020248388A1 (en) Method and device for training singing voice synthesis model, computer apparatus, and storage medium
CN111724770B (en) Audio keyword identification method for generating confrontation network based on deep convolution
CN105719661A (en) Automatic discrimination method for playing timbre of string instrument
CN112259105A (en) Training method of voiceprint recognition model, storage medium and computer equipment
CN111128236B (en) Main musical instrument identification method based on auxiliary classification deep neural network
CN112259104A (en) Training device of voiceprint recognition model
CN115565540B (en) Invasive brain-computer interface Chinese pronunciation decoding method
CN112559797A (en) Deep learning-based audio multi-label classification method
Haque et al. High-fidelity audio generation and representation learning with guided adversarial autoencoder
CN116665669A (en) Voice interaction method and system based on artificial intelligence
Diment et al. Semi-supervised learning for musical instrument recognition
Mousavi et al. Persian classical music instrument recognition (PCMIR) using a novel Persian music database
CN111666996A (en) High-precision equipment source identification method based on attention mechanism
Azarloo et al. Automatic musical instrument recognition using K-NN and MLP neural networks
CN115910099B (en) Automatic musical instrument identification method based on depth probability map neural network
CN112052880A (en) Underwater sound target identification method based on weight updating support vector machine
CN114495990A (en) Speech emotion recognition method based on feature fusion
Kohlsdorf et al. Feature Learning and Automatic Segmentation for Dolphin Communication Analysis.
Guerrero-Turrubiates et al. Guitar chords classification using uncertainty measurements of frequency bins
Bhaskar et al. Analysis of language identification performance based on gender and hierarchial grouping approaches
CN111681674A (en) Method and system for identifying musical instrument types based on naive Bayes model
CN112735477A (en) Voice emotion analysis method and device
Dodia et al. Identification of raga by machine learning with chromagram

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant