CN113807408A - Data-driven audio classification method, system and medium for supervised dictionary learning - Google Patents
Data-driven audio classification method, system and medium for supervised dictionary learning
- Publication number
- CN113807408A (application CN202110988214.4A)
- Authority
- CN
- China
- Prior art keywords
- dictionary
- data
- learning
- training
- class
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/2411 — Classification techniques relating to the classification model based on the proximity to a decision surface, e.g. support vector machines
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/217 — Validation; Performance evaluation; Active pattern learning techniques
- G06F18/28 — Determining representative reference patterns, e.g. by averaging or distorting; Generating dictionaries
- G10L15/063 — Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L2015/0631 — Creating reference templates; Clustering
- G10L2015/0633 — Creating reference templates; Clustering using lexical or orthographic knowledge sources
Abstract
The invention discloses a data-driven supervised dictionary learning audio classification method, system and medium. The method comprises the following steps: determining the number of sample-set classes; training a class-specific dictionary using the input samples and their corresponding class labels; obtaining sparse codes of the input samples with the trained dictionaries and training an SVM classifier with the sparse codes as features; and classifying the input samples with the trained dictionaries and the trained SVM classifier, outputting the predicted labels. By learning one dictionary per class, the invention minimizes intra-class variability and maximizes inter-class separability, promotes sparsity to control the complexity of the signal decomposition on the dictionary, minimizes the class-wise reconstruction error, and promotes pairwise orthogonality between the dictionaries. The invention can be widely applied in many scenarios, such as computational auditory scene recognition and musical chord recognition; it is relatively stable across test datasets and generalizes well.
Description
Technical Field
The invention belongs to the technical field of sparse representation and supervised dictionary learning, and particularly relates to a data-driven supervised dictionary learning-based audio classification method, system and medium.
Background
Conventional dictionary learning formulations minimize the reconstruction error between a given signal and its sparse representation on a learned dictionary. Although convenient for signal denoising, this may be unsuitable for classification, whose final goal is to obtain a discriminative decomposition of the training signals on the learned dictionary. Due to these limitations of traditional dictionary learning in classification, supervised dictionary learning has become widely used.
Ramirez et al. propose enhancing the orthogonality between dictionaries so that the learned dictionaries are as different as possible, i.e., one dictionary per class; Fulkerson et al. propose first learning a very large dictionary and then merging its atoms according to predefined criteria, including the agglomerative information bottleneck (AIB), to act as a compressed dictionary; Mairal et al. propose jointly learning the dictionary and the classification task; later work proposes embedding class labels into the dictionary and the sparse-coding learning so as to minimize intra-class differences and maximize inter-class differences.
Disclosure of Invention
The invention mainly aims to overcome the defects of traditional dictionary learning methods on audio recognition tasks, and provides a data-driven supervised dictionary learning audio classification method, system and medium.
In order to achieve the purpose, the invention adopts the following technical scheme:
In one aspect of the present invention, an audio classification method based on data-driven supervised dictionary learning is provided, comprising the following steps:
S1, determining the number of classes C of the sample set, and using the input samples x_n and their corresponding class labels y_n to train C class-specific dictionaries D_c, c ∈ [1, C];
S2, using the trained dictionaries D_c, c ∈ [1, C], to obtain the sparse code a_n of each input sample x_n, and training an SVM classifier with the sparse codes as features;
S3, using the trained dictionaries D_c, c ∈ [1, C], and the trained SVM classifier to classify the input samples x_n and output the predicted labels ŷ_n.
As an optimized technical scheme, the C class-specific dictionaries D_c, c ∈ [1, C], are trained as follows:
S11, initializing the dictionary D_c^0, the learning rate η_0, the learning-rate update rate α, and the number of iterations T;
S12, determining a loss function J;
S13, starting an iterative solution process of T rounds; at iteration t, fixing the dictionary D^{t-1} and computing the sparse-code set A^t;
S14, fixing the sparse-code set A^t and updating the dictionary D_c^t;
S15, setting t = t + 1 and entering the next iteration, until t = T.
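Steps S11-S15 describe an alternating minimization: fix the dictionary to solve for the codes, then fix the codes to update the dictionary. Since the patent does not give the explicit forms of the loss terms, the following numpy sketch uses a plain reconstruction loss with soft-thresholding in place of the full Lasso solve; the function name and all hyperparameters are illustrative assumptions:

```python
import numpy as np

def train_class_dictionary(X, n_atoms, T=50, eta0=0.1, alpha=0.95, lam=0.1, seed=0):
    """Alternating-minimization sketch for one class-specific dictionary.
    X: (d, N) training samples of one class. Returns D of shape (d, n_atoms)."""
    rng = np.random.default_rng(seed)
    d, N = X.shape
    D = rng.standard_normal((d, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)     # unit-norm atoms (projection constraint)
    eta = eta0
    for t in range(1, T + 1):
        # Sparse-coding step (S13): least squares + soft-thresholding stands in for the Lasso
        A = np.linalg.lstsq(D, X, rcond=None)[0]
        A = np.sign(A) * np.maximum(np.abs(A) - lam, 0.0)
        # Dictionary step (S14): gradient of the reconstruction term, then renormalization
        G = (D @ A - X) @ A.T / N
        D = D - eta * G
        D /= np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1e-12)
        eta *= alpha                                   # learning-rate decay via update rate α
    return D
```

One such dictionary would be trained per class, as in step S1.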
As a preferred technical solution, the loss function J has the specific form:
J(A, D) = J_1(D, A) + μJ_2(D, A) + λJ_3(A) + γ_1 J_4(A) + γ_2 J_5(D);
where μ is the sample constraint parameter, λ is the classifier constraint parameter, γ_1 is the sparse-coding constraint parameter, and γ_2 is the dictionary-learning constraint parameter.
As a preferred technical solution, in the iterative solution process of T rounds, at iteration t the dictionary D^{t-1} is fixed and the sparse-code set A^t is computed, specifically by minimizing the loss function J(D^{t-1}, A^t) with the Lasso algorithm to obtain A^t.
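The Lasso step in S13 can be reproduced with an off-the-shelf solver. A sketch using scikit-learn, with a random dictionary and sample as stand-ins; note that scikit-learn's `Lasso` scales the data-fit term by 1/(2N), so its `alpha` does not correspond exactly to the patent's λ:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))                 # fixed dictionary D^{t-1} with 32 atoms
D /= np.linalg.norm(D, axis=0, keepdims=True)     # unit-norm atoms
x = rng.standard_normal(16)                       # one input sample x_n

# min_a  (1/2N)*||x - D a||^2 + alpha*||a||_1   (sparse-coding step with D fixed)
lasso = Lasso(alpha=0.1, fit_intercept=False, max_iter=5000)
lasso.fit(D, x)
a = lasso.coef_                                   # sparse code a_n
print(np.count_nonzero(a), "nonzero coefficients out of", a.size)
```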
As a preferred technical solution, fixing the sparse-code set A^t and updating the dictionary D_c^t specifically comprises:
S141, calculating the gradient G^t of the loss function J with respect to the dictionary D;
S142, a preliminary update D_c^{t/2} = D_c^{t-1} − ηG^t;
S143, constraining the preliminarily updated dictionary through the proximal projection operator Prox;
S144, repeating until J(D_c^t, A^t) < J(D_c^{t-1}, A^{t-1}), then ending the dictionary update.
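Steps S141-S143 amount to a projected gradient step. Since the full loss J is not given explicitly here, the sketch below takes the gradient only of the reconstruction term and assumes the Prox operator projects each atom onto the unit ball; both are stand-in assumptions, as are the function names:

```python
import numpy as np

def prox_unit_columns(D):
    """Assumed Prox operator: project each dictionary atom onto the unit ball."""
    norms = np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1.0)
    return D / norms

def dictionary_update_step(D, A, X, eta):
    """One S141-S143 update of a class dictionary, reconstruction term only."""
    G = (D @ A - X) @ A.T              # S141: gradient of 0.5*||X - D A||_F^2 w.r.t. D
    D_half = D - eta * G               # S142: preliminary update D_c^{t/2}
    return prox_unit_columns(D_half)   # S143: constrain via Prox
```

In the patent, this step is repeated until the loss decreases (S144), e.g. by shrinking η.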
As a preferred technical solution, training the SVM classifier specifically comprises: training to obtain a hyperplane that separates samples of different classes; the testing stage determines on which side of the hyperplane-divided space a sample lies.
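The SVM stage is standard: training finds a separating hyperplane over the sparse-code features, and testing checks which side a sample falls on. A scikit-learn sketch, using randomly shifted stand-in codes in place of real sparse codes:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Stand-in sparse codes a_n for two classes, shifted apart so a hyperplane exists
A0 = rng.standard_normal((40, 10)) + 2.0
A1 = rng.standard_normal((40, 10)) - 2.0
feats = np.vstack([A0, A1])
labels = np.array([0] * 40 + [1] * 40)

clf = LinearSVC()            # training: find the separating hyperplane
clf.fit(feats, labels)
pred = clf.predict(feats)    # testing: which side of the hyperplane each sample lies on
print("training accuracy:", (pred == labels).mean())
```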
In another aspect of the present invention, a data-driven supervised dictionary learning audio classification system is further provided, which applies the above audio classification method and comprises a dictionary training module, an SVM classifier training module, and a prediction output module;
The dictionary training module is used for determining the number of classes C of the sample set and using the input samples x_n and their corresponding class labels y_n to train C class-specific dictionaries D_c, c ∈ [1, C];
The SVM classifier training module is used for obtaining, with the trained dictionaries D_c, c ∈ [1, C], the sparse code a_n of each input sample x_n and training an SVM classifier with the sparse codes as features;
The prediction output module is used for classifying the input samples x_n with the trained dictionaries D_c, c ∈ [1, C], and the trained SVM classifier and outputting the predicted labels ŷ_n.
In another aspect of the present invention, a storage medium is provided, which stores a program, and when the program is executed by a processor, the program implements the above-mentioned method for audio classification based on data-driven supervised dictionary learning.
Compared with the prior art, the invention has the following advantages and beneficial effects:
(1) The data-driven supervised dictionary learning audio recognition method disclosed by the invention learns one dictionary per class, thereby minimizing intra-class variability and maximizing inter-class separability; it promotes sparsity to control the complexity of the signal decomposition on the dictionary, minimizes the class-wise reconstruction error, and promotes pairwise orthogonality between the dictionaries;
(2) The method provided by the invention can be widely applied in many scenarios, such as computational auditory scene recognition and musical chord recognition; it is relatively stable across test datasets and generalizes well.
(3) The method provided by the invention can effectively improve audio recognition and performs well in security-related applications such as voice authentication and audio identification.
Drawings
FIG. 1 is a flowchart of the implementation steps of the data-driven supervised dictionary learning audio classification method according to an embodiment of the present invention;
FIG. 2 is a flowchart of the learning steps of the class-specific dictionaries D_c according to an embodiment of the present invention;
FIG. 3 is a flowchart of the training steps of the SVM classifier according to an embodiment of the present invention;
FIG. 4 is a flowchart of classifying and outputting predicted labels in the testing stage according to an embodiment of the present invention;
FIG. 5 is a graph of the pairwise similarity of the class-specific dictionaries learned on the Litis Rouen dataset by an embodiment of the present invention;
FIG. 6 is a graph of the pairwise similarity of the class-specific dictionaries learned on the musical chord dataset by an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of the data-driven supervised dictionary learning audio classification system according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a storage medium according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Examples
As shown in FIG. 1, the present embodiment provides a data-driven supervised dictionary learning audio classification method, comprising the following steps:
S1, determining the number of classes C of the sample set, and using the input samples x_n and their corresponding class labels y_n to train C class-specific dictionaries D_c, c ∈ [1, C]; as shown in FIG. 2, this specifically comprises the following steps:
S11, initializing the dictionary D_c^0, the learning rate η_0, the learning-rate update rate α, and the number of iterations T;
s12, determining a loss function J;
Further, the loss function J has the specific form:
J(A, D) = J_1(D, A) + μJ_2(D, A) + λJ_3(A) + γ_1 J_4(A) + γ_2 J_5(D);
where μ is the sample constraint parameter, λ is the classifier constraint parameter, γ_1 is the sparse-coding constraint parameter, and γ_2 is the dictionary-learning constraint parameter.
S13, starting an iterative solution process of T rounds; at iteration t, fixing the dictionary D^{t-1} and computing the sparse codes A^t;
Further, the sparse codes A^t are obtained by minimizing the loss function J(D^{t-1}, A^t) with the Lasso algorithm.
S14, fixing the sparse codes A^t and updating the dictionary D_c^t, comprising:
S141, calculating the gradient G^t of the loss function J with respect to the dictionary D; [the explicit forms of the loss terms and of the gradient appear as display equations in the original document and are not reproduced in this text version]
S142, a preliminary update D_c^{t/2} = D_c^{t-1} − ηG^t;
S143, constraining the preliminarily updated dictionary through the proximal projection operator Prox;
S144, repeating until J(D_c^t, A^t) < J(D_c^{t-1}, A^{t-1}), then ending the dictionary update.
S15, setting t = t + 1 and entering the next iteration, until t = T.
S2, using the trained dictionaries D_c, c ∈ [1, C], to obtain the sparse code a_n of each input sample x_n, and training an SVM classifier with the sparse codes as features, as shown in FIG. 3;
Training the SVM classifier specifically comprises: training to obtain a hyperplane that separates samples of different classes; the testing stage determines on which side of the hyperplane-divided space a sample lies.
S3, in the testing stage, using the trained dictionaries D_c, c ∈ [1, C], and the trained SVM classifier to classify the input samples x_n and output the predicted labels ŷ_n, as shown in FIG. 4.
In this embodiment, two different audio-signal classification problems were tested: computational auditory scene recognition and musical chord recognition.
(1) On the computational auditory scene recognition problem, experiments were performed on both the East Anglia and Litis Rouen datasets. Table 1 lists the results of the method of the present invention compared with other methods;
TABLE 1. Comparison of the method of the present invention with other methods on computational auditory scene recognition [the table itself is not reproduced in this text version]
As is clear from Table 1, the method of the present invention outperforms several existing methods; its results are relatively stable across the two datasets and it generalizes well, indicating that the method has promise worth exploring. FIG. 5 shows the pairwise similarities of the different dictionaries; it can be seen that, for computational auditory scene recognition, the dictionaries of different classes still have considerable similarity, i.e., the features extracted for different classes may be similar, which is unfavorable for classification, and adding more classes makes it difficult to enforce dissimilarity between the dictionaries.
(2) For musical chord recognition, 2156 musical chord samples covering 14 different classes were generated, each sample having a duration of 2 s and a sampling rate of 44100 Hz. Comparing the method of the present invention with several conventional features gives the results shown in Table 2;
Features | Musical chord
---|---
Chroma | 0.19±0.01
Interpolated PSD | 0.15±0.02
Spectrogram pooling | 0.14±0.01
Dictionary learning | 0.66±0.01

TABLE 2. Comparison of the method of the present invention with conventional features on musical chord recognition
As is apparent from Table 2, the method of the present invention is superior to the other conventional features. FIG. 6 shows the pairwise similarity of the different dictionaries; the maximum values lie on the main diagonal, which shows that on the musical chord dataset the method achieves the desired effect: the dictionaries of different classes extract different information, illustrating that the method outperforms the traditional features.
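FIGS. 5 and 6 plot pairwise similarities between the learned class-specific dictionaries. The patent does not state its exact similarity measure; one plausible choice, sketched below as an assumption, is the largest absolute cosine similarity between any atom of one dictionary and any atom of another:

```python
import numpy as np

def dict_similarity(Di, Dj):
    """Max absolute cosine similarity between atoms of D_i and atoms of D_j."""
    Di = Di / np.linalg.norm(Di, axis=0, keepdims=True)
    Dj = Dj / np.linalg.norm(Dj, axis=0, keepdims=True)
    return np.abs(Di.T @ Dj).max()

rng = np.random.default_rng(0)
Ds = [rng.standard_normal((16, 8)) for _ in range(4)]   # stand-in dictionaries
S = np.array([[dict_similarity(Ds[i], Ds[j]) for j in range(4)] for i in range(4)])
print(np.round(S, 2))   # diagonal entries are 1: each dictionary is maximally similar to itself
```

Small off-diagonal values would indicate near-orthogonal dictionaries, the regime the patent aims for.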
In another embodiment of the present application, as shown in FIG. 7, a data-driven supervised dictionary learning audio classification system is provided, comprising a dictionary training module, an SVM classifier training module, and a prediction output module;
The dictionary training module is used for determining the number of classes C of the sample set and using the input samples x_n and their corresponding class labels y_n to train C class-specific dictionaries D_c, c ∈ [1, C];
The SVM classifier training module is used for obtaining, with the trained dictionaries D_c, c ∈ [1, C], the sparse code a_n of each input sample x_n and training an SVM classifier with the sparse codes as features;
The prediction output module is used for classifying the input samples x_n with the trained dictionaries D_c, c ∈ [1, C], and the trained SVM classifier and outputting the predicted labels ŷ_n.
It should be noted that the system provided in the above embodiment is only exemplified by the division of the above functional modules, and in practical applications, the above function allocation may be completed by different functional modules according to needs, that is, the internal structure is divided into different functional modules to complete all or part of the above described functions.
As shown in FIG. 8, in another embodiment of the present application, a storage medium storing a program is further provided; when executed by a processor, the program implements the data-driven supervised dictionary learning audio classification method, specifically:
S1, determining the number of classes C of the sample set, and using the input samples x_n and their corresponding class labels y_n to train C class-specific dictionaries D_c, c ∈ [1, C];
S2, using the trained dictionaries D_c, c ∈ [1, C], to obtain the sparse code a_n of each input sample x_n, and training an SVM classifier with the sparse codes as features;
S3, using the trained dictionaries D_c, c ∈ [1, C], and the trained SVM classifier to classify the input samples x_n and output the predicted labels ŷ_n.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (8)
1. An audio classification method based on data-driven supervised dictionary learning, characterized by comprising the following steps:
determining the number of classes C of the sample set, and using the input samples x_n and their corresponding class labels y_n to train C class-specific dictionaries D_c, c ∈ [1, C];
using the trained dictionaries D_c, c ∈ [1, C], to obtain the sparse code a_n of each input sample x_n, and training an SVM classifier with the sparse codes as features;
using the trained dictionaries D_c, c ∈ [1, C], and the trained SVM classifier to classify the input samples x_n and output the predicted labels ŷ_n.
2. The audio classification method based on data-driven supervised dictionary learning of claim 1, wherein the C class-specific dictionaries D_c, c ∈ [1, C], are trained as follows:
initializing the dictionary D_c^0, the learning rate η_0, the learning-rate update rate α, and the number of iterations T;
determining a loss function J;
starting an iterative solution process of T rounds; at iteration t, fixing the dictionary D^{t-1} and computing the sparse-code set A^t;
fixing the sparse-code set A^t and updating the dictionary D_c^t;
setting t = t + 1 and entering the next iteration, until t = T.
3. The audio classification method based on data-driven supervised dictionary learning according to claim 2, wherein the loss function J has the specific form:
J(A, D) = J_1(D, A) + μJ_2(D, A) + λJ_3(A) + γ_1 J_4(A) + γ_2 J_5(D);
where μ is the sample constraint parameter, λ is the classifier constraint parameter, γ_1 is the sparse-coding constraint parameter, and γ_2 is the dictionary-learning constraint parameter.
4. The audio classification method based on data-driven supervised dictionary learning of claim 2, wherein in the iterative solution process of T rounds, at iteration t the dictionary D^{t-1} is fixed and the sparse-code set A^t is computed, specifically by minimizing the loss function J(D^{t-1}, A^t) with the Lasso algorithm to obtain A^t.
5. The data-driven supervised dictionary learning-based audio classification method according to claim 2, wherein fixing the sparse-code set A^t and updating the dictionary D_c^t specifically comprises:
calculating the gradient G^t of the loss function J with respect to the dictionary D;
a preliminary update D_c^{t/2} = D_c^{t-1} − ηG^t;
constraining the preliminarily updated dictionary through the proximal projection operator Prox;
repeating until J(D_c^t, A^t) < J(D_c^{t-1}, A^{t-1}), then ending the dictionary update.
6. The audio classification method based on data-driven supervised dictionary learning of claim 1, wherein training the SVM classifier specifically comprises: training to obtain a hyperplane that separates samples of different classes; the testing stage determines on which side of the hyperplane-divided space a sample lies.
7. An audio classification system based on data-driven supervised dictionary learning, characterized by being applied to the audio classification method based on data-driven supervised dictionary learning of any one of claims 1 to 6, and comprising a dictionary training module, an SVM classifier training module, and a prediction output module;
the dictionary training module is used for determining the number of classes C of the sample set and using the input samples x_n and their corresponding class labels y_n to train C class-specific dictionaries D_c, c ∈ [1, C];
the SVM classifier training module is used for obtaining, with the trained dictionaries D_c, c ∈ [1, C], the sparse code a_n of each input sample x_n and training an SVM classifier with the sparse codes as features;
the prediction output module is used for classifying the input samples x_n with the trained dictionaries D_c, c ∈ [1, C], and the trained SVM classifier and outputting the predicted labels ŷ_n.
8. A storage medium storing a program, characterized in that: the program, when executed by a processor, implements the data-driven supervised dictionary learning based audio classification method of any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110988214.4A CN113807408B (en) | 2021-08-26 | 2021-08-26 | Data-driven supervised dictionary learning audio classification method, system and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110988214.4A CN113807408B (en) | 2021-08-26 | 2021-08-26 | Data-driven supervised dictionary learning audio classification method, system and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113807408A true CN113807408A (en) | 2021-12-17 |
CN113807408B CN113807408B (en) | 2023-08-22 |
Family
ID=78941984
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110988214.4A Active CN113807408B (en) | 2021-08-26 | 2021-08-26 | Data-driven supervised dictionary learning audio classification method, system and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113807408B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115082727A (en) * | 2022-05-25 | 2022-09-20 | 江苏大学 | Scene classification method and system based on multilayer local perception depth dictionary learning |
CN115273819A (en) * | 2022-09-28 | 2022-11-01 | 深圳比特微电子科技有限公司 | Sound event detection model establishing method and device and readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104966105A (en) * | 2015-07-13 | 2015-10-07 | 苏州大学 | Robust machine error retrieving method and system |
EP3166020A1 (en) * | 2015-11-06 | 2017-05-10 | Thomson Licensing | Method and apparatus for image classification based on dictionary learning |
CN109948735A (en) * | 2019-04-02 | 2019-06-28 | 广东工业大学 | A kind of multi-tag classification method, system, device and storage medium |
CN111160387A (en) * | 2019-11-28 | 2020-05-15 | 广东工业大学 | Graph model based on multi-view dictionary learning |
US20200312321A1 (en) * | 2017-10-27 | 2020-10-01 | Ecole De Technologie Superieure | In-ear nonverbal audio events classification system and method |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104966105A (en) * | 2015-07-13 | 2015-10-07 | 苏州大学 | Robust machine error retrieving method and system |
EP3166020A1 (en) * | 2015-11-06 | 2017-05-10 | Thomson Licensing | Method and apparatus for image classification based on dictionary learning |
US20200312321A1 (en) * | 2017-10-27 | 2020-10-01 | Ecole De Technologie Superieure | In-ear nonverbal audio events classification system and method |
CN109948735A (en) * | 2019-04-02 | 2019-06-28 | 广东工业大学 | A kind of multi-tag classification method, system, device and storage medium |
CN111160387A (en) * | 2019-11-28 | 2020-05-15 | 广东工业大学 | Graph model based on multi-view dictionary learning |
Non-Patent Citations (1)
Title |
---|
SONG Kejian; YANG Hainan: "Multi-label classification algorithm combined with dictionary learning", Electronics World, no. 02, pages 67-68 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115082727A (en) * | 2022-05-25 | 2022-09-20 | 江苏大学 | Scene classification method and system based on multilayer local perception depth dictionary learning |
CN115082727B (en) * | 2022-05-25 | 2023-05-05 | 江苏大学 | Scene classification method and system based on multi-layer local perception depth dictionary learning |
CN115273819A (en) * | 2022-09-28 | 2022-11-01 | 深圳比特微电子科技有限公司 | Sound event detection model establishing method and device and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN113807408B (en) | 2023-08-22 |
Legal Events

Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 
 | GR01 | Patent grant | 