CN113807408A - Data-driven audio classification method, system and medium for supervised dictionary learning - Google Patents


Info

Publication number
CN113807408A
Authority
CN
China
Prior art keywords
dictionary
data
learning
training
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110988214.4A
Other languages
Chinese (zh)
Other versions
CN113807408B (en)
Inventor
陈真
邱小群
向友君
张淘珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202110988214.4A priority Critical patent/CN113807408B/en
Publication of CN113807408A publication Critical patent/CN113807408A/en
Application granted granted Critical
Publication of CN113807408B publication Critical patent/CN113807408B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/28Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063Training
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063Training
    • G10L2015/0631Creating reference templates; Clustering
    • G10L2015/0633Creating reference templates; Clustering using lexical or orthographic knowledge sources

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an audio classification method, system, and medium based on data-driven supervised dictionary learning. The method comprises the following steps: determining the number of classes in the sample set; training class-specific dictionaries using the input samples and their corresponding class labels; obtaining sparse codes of the input samples using the trained dictionaries, and training an SVM classifier with the sparse codes as features; and classifying the input samples using the trained dictionaries and the trained SVM classifier, and outputting the predicted labels. By learning one dictionary per class, the invention minimizes intra-class uniformity, maximizes class separability, promotes sparsity to control the complexity of signal decomposition over the dictionary, minimizes the class-based reconstruction error, and promotes the pairwise orthogonality of the dictionaries. The invention is applicable to a range of scenarios, such as computational auditory scene recognition and musical chord recognition; its results on the test datasets are relatively stable, and its generalization capability is excellent.

Description

Data-driven audio classification method, system and medium for supervised dictionary learning
Technical Field
The invention belongs to the technical field of sparse representation and supervised dictionary learning, and particularly relates to a data-driven supervised dictionary learning-based audio classification method, system and medium.
Background
Conventional dictionary learning formulations minimize the reconstruction error between a given signal and its sparse representation over a learned dictionary. Although this approach works well for signal denoising, it may not be suitable for classification tasks, where the final goal is to obtain a discriminative decomposition of the training signals over the learned dictionary. Because of these limitations of conventional dictionary learning for classification, supervised dictionary learning has come into wide use.
Ramirez et al. suggest making the learned dictionaries as different as possible by promoting orthogonality between them, so that each captures different information, i.e., one dictionary corresponds to one class; Fulkerson et al. propose first learning a very large dictionary and then merging its atoms according to predefined criteria, including the agglomerative information bottleneck (AIB), to obtain a compressed dictionary; Mairal et al. propose jointly learning the dictionary and the classification task; post-tensioning and young et al propose embedding class labels into the learning of the dictionary and the sparse codes, so as to minimize intra-class differences and maximize inter-class differences.
Disclosure of Invention
The invention mainly aims to overcome the shortcomings of conventional dictionary learning methods on audio recognition tasks, and provides an audio classification method, system, and medium based on data-driven supervised dictionary learning.
To achieve this purpose, the invention adopts the following technical solutions:
In one aspect of the present invention, an audio classification method based on data-driven supervised dictionary learning is provided, comprising the following steps:
S1, determining the number of classes C of the sample set, and training C class-specific dictionaries $D_c$, $c \in [1, C]$, using the input samples $x_n$ and their corresponding class labels $y_n$;
S2, using the trained dictionaries $D_c$, $c \in [1, C]$, to obtain the sparse codes $a_n$ of the input samples $x_n$, and training an SVM classifier with the sparse codes as features;
S3, using the trained dictionaries $D_c$, $c \in [1, C]$, and the trained SVM classifier to classify the input samples $x_n$ and output the predicted labels $\hat{y}_n$.
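The data flow of S1 to S3 can be sketched in Python with NumPy as follows. This is a minimal sketch, not the patented implementation: `mean_atom`, `correlation_code`, and `predict` are hypothetical stand-ins for the dictionary learning, sparse coding, and SVM stages, and concatenating the class dictionaries into one matrix is an assumption about how the codes are formed.

```python
import numpy as np

def train_dictionaries(X, y, learn_dictionary):
    """S1: learn one dictionary per class from that class's samples (columns of X)."""
    classes = np.unique(y)
    dictionaries = [learn_dictionary(X[:, y == c]) for c in classes]
    return classes, dictionaries

def encode(X, dictionaries, code):
    """S2/S3: code every sample over the concatenated dictionary [D_1, ..., D_C]."""
    D = np.hstack(dictionaries)
    return code(X, D)

# Hypothetical stand-ins, chosen only to keep the sketch short.
def mean_atom(X_c):
    """Stand-in dictionary learner: a single unit-norm atom, the class mean."""
    m = X_c.mean(axis=1, keepdims=True)
    return m / np.linalg.norm(m)

def correlation_code(X, D):
    """Stand-in coder: correlations D^T x instead of true sparse codes."""
    return D.T @ X

def predict(A, classes):
    """Stand-in classifier: with one atom per class, pick the largest code entry."""
    return classes[np.argmax(A, axis=0)]
```

In a fuller implementation the three stand-ins would be replaced by the iterative dictionary training, a Lasso solver, and a trained SVM, while the two orchestration functions would stay the same.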
As a preferred technical solution, the C class-specific dictionaries $D_c$, $c \in [1, C]$, are trained as follows:
S11, initializing the dictionary $D_c^0$, the learning rate $\eta_0$, the learning-rate update rate $\alpha$, and the number of iterations T;
S12, determining the loss function J;
S13, starting an iterative solution process of T iterations; at iteration t, fixing the dictionary $D^{t-1}$ and computing the set of sparse codes $A^t$;
S14, fixing the set of sparse codes $A^t$ and updating the dictionary $D_c^t$;
S15, setting t = t + 1 and proceeding to the next iteration until t = T.
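The alternating structure of S11 to S15 can be sketched as a loop that takes the two sub-steps as functions. This is a minimal sketch under stated assumptions: the geometric decay `eta *= alpha` is one possible reading of the "learning rate update rate", and `lstsq_codes` and `gradient_dict_step` are hypothetical stand-ins that use a plain reconstruction error rather than the patent's loss J.

```python
import numpy as np

def alternate(X, D0, code_step, dict_step, eta0=0.5, alpha=0.9, T=10):
    """Alternate between coding (dictionary fixed) and dictionary update (codes fixed)."""
    D, A, eta = D0.copy(), None, eta0
    for _ in range(T):
        A = code_step(X, D)           # S13: D^{t-1} fixed, compute A^t
        D = dict_step(X, D, A, eta)   # S14: A^t fixed, compute D^t
        eta *= alpha                  # assumed learning-rate schedule
    return D, A

def lstsq_codes(X, D):
    """Stand-in coder: least-squares codes, no sparsity penalty."""
    return np.linalg.lstsq(D, X, rcond=None)[0]

def gradient_dict_step(X, D, A, eta):
    """Stand-in update: one gradient step on 0.5 * ||X - D A||_F^2, scaled by 1/L."""
    G = (D @ A - X) @ A.T
    L = np.linalg.norm(A @ A.T, 2) + 1e-12   # Lipschitz constant of the gradient
    return D - (eta / L) * G
```

With these stand-ins and a step factor below 2, neither sub-step can increase the reconstruction error, which is the property the alternating scheme relies on.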
As a preferred technical solution, the loss function J takes the following specific form:
$$J(D,A) = J_1(D,A) + \mu J_2(D,A) + \lambda J_3(A) + \gamma_1 J_4(A) + \gamma_2 J_5(D);$$
Figure BDA0003231440840000021
Figure BDA0003231440840000022
Figure BDA0003231440840000023
Figure BDA0003231440840000031
Figure BDA0003231440840000032
where $\mu$ is the sample constraint parameter, $\lambda$ is the classifier constraint parameter, $\gamma_1$ is the sparse-coding constraint parameter, and $\gamma_2$ is the dictionary-learning constraint parameter.
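The individual terms of J are not recoverable from the text here, so the following Python sketch only illustrates the general shape of such a weighted composite loss. The three terms (class-based reconstruction error, an l1 sparsity penalty, and a pairwise incoherence penalty between class dictionaries) are assumptions taken from the properties named in the abstract, and the weights `w_sparse` and `w_orth` are hypothetical names that do not correspond one-to-one to the parameters above.

```python
import numpy as np

def class_reconstruction_error(X_by_class, D_by_class, A_by_class):
    """Sum over classes of 0.5 * ||X_c - D_c A_c||_F^2."""
    return sum(0.5 * np.linalg.norm(X - D @ A) ** 2
               for X, D, A in zip(X_by_class, D_by_class, A_by_class))

def l1_sparsity(A_by_class):
    """Sum of absolute values of all code entries."""
    return sum(np.abs(A).sum() for A in A_by_class)

def pairwise_incoherence(D_by_class):
    """Sum over class pairs i < j of ||D_i^T D_j||_F^2 (zero for orthogonal dictionaries)."""
    total = 0.0
    for i in range(len(D_by_class)):
        for j in range(i + 1, len(D_by_class)):
            total += np.linalg.norm(D_by_class[i].T @ D_by_class[j]) ** 2
    return total

def total_loss(X_by_class, D_by_class, A_by_class, w_sparse=0.1, w_orth=0.1):
    """Weighted sum of the three assumed terms."""
    return (class_reconstruction_error(X_by_class, D_by_class, A_by_class)
            + w_sparse * l1_sparsity(A_by_class)
            + w_orth * pairwise_incoherence(D_by_class))
```

Each weight trades one property against the others: raising `w_sparse` yields sparser codes at the cost of reconstruction accuracy, and raising `w_orth` pushes the class dictionaries apart.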
As a preferred technical solution, in the iterative solution process of T iterations, the dictionary $D^{t-1}$ is fixed at iteration t and the set of sparse codes $A^t$ is computed; specifically, $A^t$ is obtained by minimizing the loss function $J(D^{t-1}, A^t)$ with the Lasso algorithm.
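The sparse coding step with a fixed dictionary can be illustrated with a small Lasso solver. This is a minimal sketch, assuming the objective reduces to a reconstruction error plus an l1 penalty, 0.5 * ||X - D A||_F^2 + lam * ||A||_1; it uses ISTA (iterative soft thresholding), which is one standard way to solve a Lasso problem and is not necessarily the solver used in the patent. The names `lasso_ista`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np

def soft_threshold(Z, tau):
    """Proximal operator of tau * ||.||_1: shrink every entry toward zero by tau."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def lasso_ista(X, D, lam=0.1, n_iter=200):
    """Minimize 0.5 * ||X - D A||_F^2 + lam * ||A||_1 over A with D held fixed."""
    L = np.linalg.norm(D, 2) ** 2 + 1e-12      # Lipschitz constant of the smooth part
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = D.T @ (D @ A - X)               # gradient of the reconstruction term
        A = soft_threshold(A - grad / L, lam / L)
    return A
```

Entries of the code whose correlation with the residual stays below the threshold are set exactly to zero, which is what makes the resulting representation sparse.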
As a preferred technical solution, fixing the set of sparse codes $A^t$ and updating the dictionary $D_c^t$ specifically comprises the following steps:
S141, calculating the gradient $G^t$ of the loss function J with respect to the dictionary D;
S142, performing a preliminary update, $D_c^{t/2} = D_c^{t-1} - \eta G^t$;
S143, constraining the preliminarily updated dictionary through the proximal operator Prox;
S144, repeating until $J(D_c^t, A^t) < J(D_c^{t-1}, A^{t-1})$, at which point the dictionary update ends.
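The dictionary update in S141 to S144 can be sketched as a proximal gradient step with a decrease check. This is a minimal sketch under several assumptions: the loss is reduced to the reconstruction error 0.5 * ||X - D A||_F^2, Prox is taken to be the projection of each atom onto the unit l2 ball, the decrease check compares losses at the same codes A, and the step size is shrunk by a hypothetical factor `shrink` whenever the loss does not decrease.

```python
import numpy as np

def recon_loss(X, D, A):
    """Stand-in loss: 0.5 * ||X - D A||_F^2."""
    return 0.5 * np.linalg.norm(X - D @ A) ** 2

def prox_unit_columns(D):
    """Assumed Prox: project every atom (column) onto the unit l2 ball."""
    norms = np.maximum(np.linalg.norm(D, axis=0), 1.0)
    return D / norms

def update_dictionary(X, D_prev, A, eta=1.0, shrink=0.5, max_tries=30):
    G = (D_prev @ A - X) @ A.T                 # S141: gradient with respect to D
    base = recon_loss(X, D_prev, A)
    for _ in range(max_tries):
        D_half = D_prev - eta * G              # S142: preliminary update
        D_new = prox_unit_columns(D_half)      # S143: constrain through Prox
        if recon_loss(X, D_new, A) < base:     # S144: accept once the loss decreases
            return D_new
        eta *= shrink                          # otherwise retry with a smaller step
    return D_prev                              # no decrease found: keep the old dictionary
```

Returning the previous dictionary when no decreasing step is found keeps the outer iteration monotone, at the cost of making no progress in that round.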
As a preferred technical solution, training the SVM classifier specifically comprises: training to obtain a hyperplane that separates samples of different classes; in the testing stage, determining on which side of the hyperplane a sample lies.
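The test-time decision described above amounts to evaluating the sign of a linear function of the sparse code. The sketch below assumes a linear SVM whose hyperplane parameters `w` and `b` have already been trained; the one-vs-rest extension to more than two classes is an assumption, since the text only describes the two-sided case.

```python
import numpy as np

def side_of_hyperplane(w, b, a):
    """Return +1 or -1 depending on which side of the hyperplane w.a + b = 0 the code a lies."""
    return 1 if float(w @ a + b) >= 0.0 else -1

def one_vs_rest_predict(W, b, a):
    """Assumed multi-class rule: one hyperplane per class (rows of W), pick the largest score."""
    scores = W @ a + b
    return int(np.argmax(scores))
```

For example, with `w = [1, -1]` and `b = 0`, the code `[2, 1]` falls on the positive side and `[0, 3]` on the negative side.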
In another aspect of the present invention, an audio classification system based on data-driven supervised dictionary learning is further provided, which applies the above audio classification method and comprises a dictionary training module, an SVM classifier training module, and a prediction output module;
the dictionary training module is used for determining the number of classes C of the sample set and training C class-specific dictionaries $D_c$, $c \in [1, C]$, using the input samples $x_n$ and their corresponding class labels $y_n$;
the SVM classifier training module is used for obtaining the sparse codes $a_n$ of the input samples $x_n$ using the trained dictionaries $D_c$, $c \in [1, C]$, and training an SVM classifier with the sparse codes as features;
the prediction output module is used for classifying the input samples $x_n$ using the trained dictionaries $D_c$, $c \in [1, C]$, and the trained SVM classifier, and outputting the predicted labels $\hat{y}_n$.
In another aspect of the present invention, a storage medium is provided, which stores a program, and when the program is executed by a processor, the program implements the above-mentioned method for audio classification based on data-driven supervised dictionary learning.
Compared with the prior art, the invention has the following advantages and beneficial effects:
(1) the supervised dictionary learning audio recognition method based on data driving disclosed by the invention realizes the minimization of intra-class uniformity by learning one dictionary per class, maximizes the separability of the classes, improves the sparsity to control the complexity of signal decomposition on the dictionary, simultaneously minimizes class-based reconstruction errors, and improves the pair-wise orthogonality of the dictionaries;
(2) the method provided by the invention is applicable to a range of scenarios, such as computational auditory scene recognition and musical chord recognition; its results on the test datasets are relatively stable, and its generalization capability is excellent;
(3) the method provided by the invention improves audio recognition accuracy and performs well in security-related computing fields such as voice authentication and audio identification.
Drawings
FIG. 1 is a flowchart of implementation steps of a method for audio classification based on data-driven supervised dictionary learning according to an embodiment of the present invention;
FIG. 2 is a flowchart of the learning steps of the class-specific dictionary $D_c$ according to an embodiment of the present invention;
FIG. 3 is a flowchart of the training steps of an SVM classifier according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a process of classifying and outputting prediction tags during a testing phase according to an embodiment of the present invention;
FIG. 5 is a pairwise similarity graph of the class-specific dictionaries learned on the Rouen dataset according to an embodiment of the present invention;
FIG. 6 is a pairwise similarity graph of the class-specific dictionaries learned on the musical chord dataset according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an audio classification system based on data-driven supervised dictionary learning according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a storage medium according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Examples
As shown in FIG. 1, the present embodiment provides an audio classification method based on data-driven supervised dictionary learning, comprising the following steps:
S1, determining the number of classes C of the sample set, and training C class-specific dictionaries $D_c$, $c \in [1, C]$, using the input samples $x_n$ and their corresponding class labels $y_n$; as shown in FIG. 2, this specifically comprises the following steps:
S11, initializing the dictionary $D_c^0$, the learning rate $\eta_0$, the learning-rate update rate $\alpha$, and the number of iterations T;
S12, determining the loss function J;
further, the loss function J takes the following specific form:
$$J(D,A) = J_1(D,A) + \mu J_2(D,A) + \lambda J_3(A) + \gamma_1 J_4(A) + \gamma_2 J_5(D);$$
Figure BDA0003231440840000051
Figure BDA0003231440840000052
Figure BDA0003231440840000053
Figure BDA0003231440840000061
Figure BDA0003231440840000062
where $\mu$ is the sample constraint parameter, $\lambda$ is the classifier constraint parameter, $\gamma_1$ is the sparse-coding constraint parameter, and $\gamma_2$ is the dictionary-learning constraint parameter.
S13, starting an iterative solution process of T iterations; at iteration t, fixing the dictionary $D^{t-1}$ and computing the sparse codes $A^t$;
further, the sparse codes $A^t$ are obtained by minimizing the loss function $J(D^{t-1}, A^t)$ with the Lasso algorithm.
S14, fixing the sparse codes $A^t$ and updating the dictionary $D_c^t$, comprising the following steps:
S141, calculating the gradient $G^t$ of the loss function J with respect to the dictionary D; specifically, the loss function is:
Figure BDA0003231440840000063
wherein:
Figure BDA0003231440840000064
Figure BDA0003231440840000065
Figure BDA0003231440840000066
the gradient is:
Figure BDA0003231440840000067
wherein:
Figure BDA0003231440840000068
Figure BDA0003231440840000071
Figure BDA0003231440840000072
S142, performing a preliminary update, $D_c^{t/2} = D_c^{t-1} - \eta G^t$;
S143, constraining the preliminarily updated dictionary through the proximal operator Prox;
S144, repeating until $J(D_c^t, A^t) < J(D_c^{t-1}, A^{t-1})$, at which point the dictionary update ends.
S15, setting t = t + 1 and proceeding to the next iteration until t = T.
S2, using the trained dictionaries $D_c$, $c \in [1, C]$, to obtain the sparse codes $a_n$ of the input samples $x_n$, and training an SVM classifier with the sparse codes as features, as shown in FIG. 3;
training the SVM classifier specifically comprises: training to obtain a hyperplane that separates samples of different classes; in the testing stage, determining on which side of the hyperplane a sample lies.
S3, in the testing stage, using the trained dictionaries $D_c$, $c \in [1, C]$, and the trained SVM classifier to classify the input samples $x_n$ and output the predicted labels $\hat{y}_n$, as shown in FIG. 4.
In this embodiment, two different audio signal classification problems were tested, namely computational auditory scene recognition and musical chord recognition:
(1) For computational auditory scene recognition, experiments were performed on two datasets, East Anglia and LITIS Rouen. Table 1 compares the results of the method of the present invention with those of other methods;
Figure BDA0003231440840000073
Figure BDA0003231440840000081
TABLE 1. Comparison of the method of the present invention with other methods on computational auditory scene recognition
As Table 1 shows, the method of the present invention outperforms some of the compared methods, its results on the two datasets are relatively stable, and its generalization ability is excellent, which indicates that the method merits further exploration. FIG. 5 shows the pairwise similarities of the different dictionaries. For computational auditory scene recognition, dictionaries corresponding to different categories still show considerable similarity; that is, the features extracted for different categories may be similar, which does not help classification, and a growing number of categories makes it difficult to enforce dissimilarity between the dictionaries.
(2) For musical chord recognition, 2156 musical chord samples covering 14 different categories were produced, each with a duration of 2 s and a sampling frequency of 44100 Hz. The method of the present invention was compared with several conventional features, giving the results shown in Table 2;
Features               Music chord
Chroma                 0.19±0.01
Interpolated PSD       0.15±0.02
Spectrogram pooling    0.14±0.01
Dictionary learning    0.66±0.01
TABLE 2. Comparison of the method of the present invention with conventional features on musical chord recognition
As Table 2 shows, the method of the present invention outperforms the other conventional features. FIG. 6 shows the pairwise similarities of the different dictionaries: the maximum values lie on the diagonal from top left to bottom right, which shows that the method achieves the desired effect on the musical chord dataset, i.e., dictionaries corresponding to different categories extract different information. This helps explain why the method outperforms the conventional features.
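A pairwise similarity matrix of the kind plotted in FIG. 5 and FIG. 6 could be computed as in the following sketch. The similarity measure is not stated in the text; the Frobenius norm of $D_i^T D_j$ used here is an assumption, chosen because it is zero for mutually orthogonal dictionaries, and the function name is hypothetical.

```python
import numpy as np

def pairwise_dictionary_similarity(dictionaries):
    """Return the C x C matrix S with S[i, j] = ||D_i^T D_j||_F (assumed measure)."""
    C = len(dictionaries)
    S = np.zeros((C, C))
    for i, D_i in enumerate(dictionaries):
        for j, D_j in enumerate(dictionaries):
            S[i, j] = np.linalg.norm(D_i.T @ D_j)   # Frobenius norm by default
    return S
```

When the class dictionaries capture different information, the off-diagonal entries are small and the row maxima sit on the diagonal, which matches the pattern described for the musical chord dataset.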
In another embodiment of the present application, as shown in fig. 7, there is provided a data-driven supervised dictionary learning based audio classification system, which includes a dictionary training module, an SVM classifier training module, and a prediction output module;
the dictionary training module is used for determining the number of classes C of the sample set and training C class-specific dictionaries $D_c$, $c \in [1, C]$, using the input samples $x_n$ and their corresponding class labels $y_n$;
the SVM classifier training module is used for obtaining the sparse codes $a_n$ of the input samples $x_n$ using the trained dictionaries $D_c$, $c \in [1, C]$, and training an SVM classifier with the sparse codes as features;
the prediction output module is used for classifying the input samples $x_n$ using the trained dictionaries $D_c$, $c \in [1, C]$, and the trained SVM classifier, and outputting the predicted labels $\hat{y}_n$.
It should be noted that the system provided in the above embodiment is only exemplified by the division of the above functional modules, and in practical applications, the above function allocation may be completed by different functional modules according to needs, that is, the internal structure is divided into different functional modules to complete all or part of the above described functions.
As shown in FIG. 8, in another embodiment of the present application, a storage medium storing a program is further provided; when executed by a processor, the program implements the audio classification method based on data-driven supervised dictionary learning, specifically:
S1, determining the number of classes C of the sample set, and training C class-specific dictionaries $D_c$, $c \in [1, C]$, using the input samples $x_n$ and their corresponding class labels $y_n$;
S2, using the trained dictionaries $D_c$, $c \in [1, C]$, to obtain the sparse codes $a_n$ of the input samples $x_n$, and training an SVM classifier with the sparse codes as features;
S3, using the trained dictionaries $D_c$, $c \in [1, C]$, and the trained SVM classifier to classify the input samples $x_n$ and output the predicted labels $\hat{y}_n$.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (8)

1. An audio classification method based on data-driven supervised dictionary learning, characterized by comprising the following steps:
determining the number of classes C of the sample set, and training C class-specific dictionaries $D_c$, $c \in [1, C]$, using the input samples $x_n$ and their corresponding class labels $y_n$;
using the trained dictionaries $D_c$, $c \in [1, C]$, to obtain the sparse codes $a_n$ of the input samples $x_n$, and training an SVM classifier with the sparse codes as features;
using the trained dictionaries $D_c$, $c \in [1, C]$, and the trained SVM classifier to classify the input samples $x_n$ and output the predicted labels $\hat{y}_n$.
2. The audio classification method based on data-driven supervised dictionary learning of claim 1, wherein the C class-specific dictionaries $D_c$, $c \in [1, C]$, are trained as follows:
initializing the dictionary $D_c^0$, the learning rate $\eta_0$, the learning-rate update rate $\alpha$, and the number of iterations T;
determining the loss function J;
starting an iterative solution process of T iterations; at iteration t, fixing the dictionary $D^{t-1}$ and computing the set of sparse codes $A^t$;
fixing the set of sparse codes $A^t$ and updating the dictionary $D_c^t$;
setting t = t + 1 and proceeding to the next iteration until t = T.
3. The audio classification method based on data-driven supervised dictionary learning according to claim 2, wherein the loss function J takes the following specific form:
$$J(D,A) = J_1(D,A) + \mu J_2(D,A) + \lambda J_3(A) + \gamma_1 J_4(A) + \gamma_2 J_5(D);$$
Figure FDA0003231440830000011
Figure FDA0003231440830000012
Figure FDA0003231440830000013
Figure FDA0003231440830000014
Figure FDA0003231440830000015
where $\mu$ is the sample constraint parameter, $\lambda$ is the classifier constraint parameter, $\gamma_1$ is the sparse-coding constraint parameter, and $\gamma_2$ is the dictionary-learning constraint parameter.
4. The audio classification method based on data-driven supervised dictionary learning of claim 2, wherein, in the iterative solution process of T iterations, the dictionary $D^{t-1}$ is fixed at iteration t and the set of sparse codes $A^t$ is computed; specifically, $A^t$ is obtained by minimizing the loss function $J(D^{t-1}, A^t)$ with the Lasso algorithm.
5. The audio classification method based on data-driven supervised dictionary learning according to claim 2, wherein fixing the set of sparse codes $A^t$ and updating the dictionary $D_c^t$ specifically comprises the following steps:
calculating the gradient $G^t$ of the loss function J with respect to the dictionary D;
performing a preliminary update, $D_c^{t/2} = D_c^{t-1} - \eta G^t$;
constraining the preliminarily updated dictionary through the proximal operator Prox;
repeating until $J(D_c^t, A^t) < J(D_c^{t-1}, A^{t-1})$, at which point the dictionary update ends.
6. The audio classification method based on data-driven supervised dictionary learning of claim 1, wherein training the SVM classifier specifically comprises: training to obtain a hyperplane that separates samples of different classes; in the testing stage, determining on which side of the hyperplane a sample lies.
7. An audio classification system based on data-driven supervised dictionary learning, characterized by being applied to the audio classification method based on data-driven supervised dictionary learning of any one of claims 1 to 6, and comprising a dictionary training module, an SVM classifier training module, and a prediction output module;
the dictionary training module is used for determining the number of classes C of the sample set and training C class-specific dictionaries $D_c$, $c \in [1, C]$, using the input samples $x_n$ and their corresponding class labels $y_n$;
the SVM classifier training module is used for obtaining the sparse codes $a_n$ of the input samples $x_n$ using the trained dictionaries $D_c$, $c \in [1, C]$, and training an SVM classifier with the sparse codes as features;
the prediction output module is used for classifying the input samples $x_n$ using the trained dictionaries $D_c$, $c \in [1, C]$, and the trained SVM classifier, and outputting the predicted labels $\hat{y}_n$.
8. A storage medium storing a program, characterized in that: the program, when executed by a processor, implements the data-driven supervised dictionary learning based audio classification method of any one of claims 1-6.
CN202110988214.4A 2021-08-26 2021-08-26 Data-driven supervised dictionary learning audio classification method, system and medium Active CN113807408B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110988214.4A CN113807408B (en) 2021-08-26 2021-08-26 Data-driven supervised dictionary learning audio classification method, system and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110988214.4A CN113807408B (en) 2021-08-26 2021-08-26 Data-driven supervised dictionary learning audio classification method, system and medium

Publications (2)

Publication Number Publication Date
CN113807408A true CN113807408A (en) 2021-12-17
CN113807408B CN113807408B (en) 2023-08-22

Family

ID=78941984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110988214.4A Active CN113807408B (en) 2021-08-26 2021-08-26 Data-driven supervised dictionary learning audio classification method, system and medium

Country Status (1)

Country Link
CN (1) CN113807408B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115082727A (en) * 2022-05-25 2022-09-20 江苏大学 Scene classification method and system based on multilayer local perception depth dictionary learning
CN115273819A (en) * 2022-09-28 2022-11-01 深圳比特微电子科技有限公司 Sound event detection model establishing method and device and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104966105A (en) * 2015-07-13 2015-10-07 苏州大学 Robust machine error retrieving method and system
EP3166020A1 (en) * 2015-11-06 2017-05-10 Thomson Licensing Method and apparatus for image classification based on dictionary learning
CN109948735A (en) * 2019-04-02 2019-06-28 广东工业大学 A kind of multi-tag classification method, system, device and storage medium
CN111160387A (en) * 2019-11-28 2020-05-15 广东工业大学 Graph model based on multi-view dictionary learning
US20200312321A1 (en) * 2017-10-27 2020-10-01 Ecole De Technologie Superieure In-ear nonverbal audio events classification system and method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104966105A (en) * 2015-07-13 2015-10-07 苏州大学 Robust machine error retrieving method and system
EP3166020A1 (en) * 2015-11-06 2017-05-10 Thomson Licensing Method and apparatus for image classification based on dictionary learning
US20200312321A1 (en) * 2017-10-27 2020-10-01 Ecole De Technologie Superieure In-ear nonverbal audio events classification system and method
CN109948735A (en) * 2019-04-02 2019-06-28 广东工业大学 A kind of multi-tag classification method, system, device and storage medium
CN111160387A (en) * 2019-11-28 2020-05-15 广东工业大学 Graph model based on multi-view dictionary learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
宋科建;杨海南;: "结合字典学习的多标签分类算法", 电子世界, no. 02, pages 67 - 68 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115082727A (en) * 2022-05-25 2022-09-20 江苏大学 Scene classification method and system based on multilayer local perception depth dictionary learning
CN115082727B (en) * 2022-05-25 2023-05-05 江苏大学 Scene classification method and system based on multi-layer local perception depth dictionary learning
CN115273819A (en) * 2022-09-28 2022-11-01 深圳比特微电子科技有限公司 Sound event detection model establishing method and device and readable storage medium

Also Published As

Publication number Publication date
CN113807408B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN113807408A (en) Data-driven audio classification method, system and medium for supervised dictionary learning
US11908457B2 (en) Orthogonally constrained multi-head attention for speech tasks
Richard et al. A bag-of-words equivalent recurrent neural network for action recognition
CN110188827B (en) Scene recognition method based on convolutional neural network and recursive automatic encoder model
CN112884010A (en) Multi-mode self-adaptive fusion depth clustering model and method based on self-encoder
WO2019214289A1 (en) Image processing method and apparatus, and electronic device and storage medium
CN108898181B (en) Image classification model processing method and device and storage medium
CN110705636B (en) Image classification method based on multi-sample dictionary learning and local constraint coding
US20240177697A1 (en) Audio data processing method and apparatus, computer device, and storage medium
US9269024B2 (en) Image recognition system based on cascaded over-complete dictionaries
Huang et al. Deep learning vector quantization for acoustic information retrieval
CN111860834A (en) Neural network tuning method, system, terminal and storage medium
CN114613450A (en) Method and device for predicting property of drug molecule, storage medium and computer equipment
WO2016181474A1 (en) Pattern recognition device, pattern recognition method and program
KR101969346B1 (en) Apparatus for classifying skin conditions, and apparatus for generating skin condition classification model used on the same apparatus and method thereof
CN115881160A (en) Music genre classification method and system based on knowledge graph fusion
ES2536560T3 (en) Method to discover and recognize patterns
Zhao et al. Asymmetric deep hashing for person re-identifications
CN113160135A (en) Intelligent colon lesion identification method, system and medium based on unsupervised migration image classification
US20220383117A1 (en) Bayesian personalization
US20220318633A1 (en) Model compression using pruning quantization and knowledge distillation
CN114913358B (en) Medical hyperspectral foreign matter detection method based on automatic encoder
CN116541763A (en) Data distillation method, medium and visual task processing method
US20230281477A1 (en) Framework system for improving performance of knowledge graph embedding model and method for learning thereof
JP3237606B2 (en) Multiple character string alignment method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant