CN115457966A - Pig cough sound identification method based on improved DS evidence theory multi-classifier fusion - Google Patents
Pig cough sound identification method based on improved DS evidence theory multi-classifier fusion
- Publication number
- CN115457966A (application CN202211128776.2A)
- Authority
- CN
- China
- Prior art keywords
- base
- classifier
- fusion
- classifiers
- improved
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS; G10—MUSICAL INSTRUMENTS; ACOUSTICS; G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING; G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/14—Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
Abstract
The application discloses a pig cough sound identification method based on improved DS evidence theory multi-classifier fusion, comprising the following steps: collecting sound segments of live pigs in a pigsty to obtain a corpus; obtaining a training set and a test set from the corpus, and extracting a plurality of acoustic features from each; inputting the acoustic features of the training set into a plurality of base classifiers and computing performance evaluation indexes for the base classifiers; screening the base classifiers according to these indexes to obtain the preferred base classifiers; training the preferred base classifiers on the training set to obtain the target training model; and inputting the test set into the target training model and fusing the outputs of the preferred base classifiers with the improved DS evidence theory to recognize pig cough sounds. The method improves DS fusion with distance fusion, which overcomes the unreliability of DS fusion for data near the decision boundary and can markedly improve the identification accuracy of live pig cough sounds.
Description
Technical Field
The application relates to the field of voice signal processing, in particular to a pig cough sound identification method based on improved DS evidence theory multi-classifier fusion.
Background
In the process of live pig farming, respiratory diseases have become one of the main constraints on the industry owing to their high fatality rate and strong infectivity, so a rapid and accurate early-warning method for respiratory disease is urgently needed. In recent years, research has shown that early warning of respiratory disease can be achieved by monitoring pig cough sounds, the key technology being recognition of those cough sounds. In the existing research, the main approaches focus on feature selection, feature fusion and classifier optimization to improve classification performance. However, these algorithms are all built on a single classifier model, are susceptible to environmental noise interference, and their classification accuracy is difficult to improve further.
Disclosure of Invention
The application identifies live pig cough sounds with an improved DS multi-classifier fusion method, which markedly improves sound identification accuracy.
In order to achieve the above object, the present application provides a pig cough sound identification method based on improved DS evidence theory multi-classifier fusion, comprising the steps of:
collecting sound segments of live pigs in a pigsty to obtain a corpus;
obtaining a training set and a test set based on the corpus, and extracting a plurality of acoustic features in the training set and the test set;
inputting a plurality of acoustic features in the training set into a plurality of base classifiers, and outputting to obtain a plurality of base classifier performance evaluation indexes;
screening the base classifier according to the performance evaluation index of the base classifier to obtain an optimal base classifier;
training the optimal base classifier by using the training set to complete a target training model;
and inputting the test set into the target training model, and fusing output results of the optimal base classifier by adopting an improved DS evidence theory to complete the pig cough sound recognition.
Preferably, the method for obtaining the training set and the test set includes:
labeling the corpus to obtain cough sound segments and non-cough sound segments;
and dividing the cough sound segments and the non-cough sound segments into a training set and a test set in a set proportion.
Preferably, the acoustic features include: Mel-frequency cepstral coefficients, linear prediction cepstral coefficients, Gammatone cepstral coefficients, and power spectral density.
Preferably, the base classifier includes: support vector machines, random forests and K nearest neighbors classifiers.
Preferably, the base classifier performance evaluation indexes include: classification accuracy, error similarity, and the comprehensive classification accuracy-error similarity evaluation, where the indexes are defined as follows:
assume the total number of cough and non-cough samples involved in classification is NA, the number of samples correctly classified by the ith base classifier is NR_i, the number correctly classified by the jth base classifier is NR_j, and the number of samples misclassified by both base classifiers i and j is NF_ij;
The classification accuracy of the ith base classifier is defined as:

OA_i = NR_i / NA
correspondingly, the classification accuracy of the jth base classifier is:

OA_j = NR_j / NA
where OA_i and OA_j represent the classification accuracies of base classifiers i and j, respectively;
thus OA represents the ratio of the number of correctly classified samples to the total number of samples, i.e. the classification accuracy, with value range [0,1]; meanwhile, the error similarity between the ith and jth base classifiers is defined as:

ESR_ij = NF_ij / NA
where ESR_ij represents the degree of error similarity between base classifiers i and j;
thus ESR represents the proportion of samples misclassified simultaneously by both base classifiers to the total number of samples, i.e. the degree of error similarity between the two, with value range [0,1]; the comprehensive classification accuracy-error similarity evaluation between the ith and jth base classifiers is defined as:
where OAESR_ij represents the comprehensive classification accuracy-error similarity evaluation between base classifiers i and j;
for a system consisting of N base classifiers, its OAESR is defined as:
wherein OAESR represents the classification precision-error similarity comprehensive evaluation index.
Preferably, the method for screening the plurality of base classifiers is as follows: a two-step screening method is adopted to obtain the preferred base classifiers, the two-step screening method comprising:
assume the number of initial base classifiers is L_1 and set an ESR threshold ESR_thr; compute OA for each base classifier and ESR_ij for each pair of base classifiers. If the ESR_ij between two base classifiers does not exceed ESR_thr, both are temporarily retained; if the ESR_ij exceeds ESR_thr, the error similarity of the two base classifiers is too high, so their OA values are compared, the classifier with the smaller OA is eliminated and the one with the larger OA is temporarily retained. All ESR values are traversed in this way, continually removing base classifiers with high ESR and low OA; the classifiers that finally remain are the preferred classifiers of the first screening, and their number is L_2. The preferred base classifiers are then formed into groups according to the number of classifiers fused, the OAESR values of all classifier combinations in each group are computed and sorted in descending order, an OAESR threshold OAESR_thr is set, and the combinations in each group that exceed the threshold are fused.
Preferably, the method for fusing the output results of the preferred base classifier by using the improved DS evidence theory includes:
assume each base classifier outputs proposition A_i with probability m_i(A_i); then, for n base classifiers, the DS-fused output is:

m(A) = (1 / (1 - K)) * Σ_{A_1 ∩ … ∩ A_n = A} ∏_{i=1}^{n} m_i(A_i)
where K represents the conflict coefficient, expressed as:

K = Σ_{A_1 ∩ … ∩ A_n = ∅} ∏_{i=1}^{n} m_i(A_i)
where Σ denotes summation, ∩ denotes intersection, and ∅ denotes the empty set. In DS fusion, a KNN output probability of 0 causes a one-vote veto: if any single base classifier outputs probability 0, the fused output probability is forced to 0, so fusion brings no advantage and may even degrade performance. The hard KNN outputs of 0 and 1 are therefore converted: an output of 1 is replaced by the probability α and an output of 0 by the probability 1 - α, where α ranges over (0.5, 1); the optimal α is obtained by linear search;
when the fusion result is close to the decision boundary 0.5, the DS fusion result is unreliable owing to noise interference and other factors, so a distance fusion algorithm is adopted to improve the DS fusion algorithm. For a test sample x, let P be the cough probability output after DS evidence theory fusion and RD the classification result of distance fusion; the final classification result R of the test sample is:

R = 1 if P > 1 - β; R = RD if β ≤ P ≤ 1 - β; R = 0 if P < β
where 1 denotes cough, 0 denotes non-cough, and β is the conversion boundary with value range [0.3, 0.5]; the optimal β for each base classifier fusion strategy is found by linear search.
Preferably, the process of using the distance fusion algorithm includes:
let x_i^j denote the ith feature vector of the jth training sample and y_i the ith feature vector of the current test sample, where i = 1, 2, 3, 4 index the Mel-frequency cepstral coefficients, linear prediction cepstral coefficients, Gammatone cepstral coefficients, and power spectral density, respectively. Let ρ(j) be a function that returns the class of the jth training sample, and let d(y_i, x_i^j) denote the distance between y_i and x_i^j, computed as the Manhattan distance. The fusion distance D_j from the current test sample to the jth training sample is defined as:
wherein,
where M represents the total number of training samples; the class R of the test sample is:

R = ρ(argmin_j D_j)
compared with the prior art, the application has the following beneficial effects:
The application identifies live pig cough sounds with an improved DS multi-classifier fusion method: combinations of different acoustic features and classifiers are screened with a two-step screening method, three indexes (OA, ESR and OAESR) are defined for the screening of the base classifiers, and the outputs of the screened base classifiers are fused with the improved DS algorithm to obtain the classification result. Different features and different classifiers are screened jointly, and DS fusion is improved with distance fusion, which solves the unreliability of DS fusion for data near the decision boundary; compared with existing algorithms, the method markedly improves the identification accuracy of live pig cough sounds.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required in the embodiments will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a method in an embodiment of the present application;
FIG. 2 is a flow chart of a first-pass filter-based classifier algorithm in an embodiment of the present application;
fig. 3 is a flowchart of a second filtering-based classifier algorithm in the embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
As shown in fig. 1, a schematic flow chart of a method according to an embodiment of the present application includes:
Collect sound segments of live pigs in a pigsty to obtain a corpus. The corpus consists of labeled cough and non-cough segments collected in an actual pigsty; 1250 cough sounds and 1250 non-cough sounds are randomly selected from it to form the training and test sets. In this embodiment, all sound samples are divided into a training set and a test set in a ratio of 4. The sound signal is first preprocessed: it is band-pass filtered between 100 Hz and 16 kHz, then framed and windowed with a frame length of 20 ms, an overlap of 10 ms, and a Hamming window. The acoustic features of the sound signal are then extracted, including Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), Gammatone cepstral coefficients (GTCC), and power spectral density (PSD).
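The framing and windowing step above (20 ms frames, 10 ms overlap, Hamming window) can be sketched as follows; the function name is my own, and the 100 Hz to 16 kHz band-pass stage is omitted for brevity:

```python
import numpy as np

def frame_signal(signal, fs=16000, frame_ms=20, overlap_ms=10):
    """Split a 1-D signal into overlapping Hamming-windowed frames
    (20 ms frames with 10 ms overlap, as in the embodiment)."""
    frame_len = int(fs * frame_ms / 1000)
    hop = frame_len - int(fs * overlap_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    window = np.hamming(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return frames

x = np.random.randn(16000)   # 1 s of audio at 16 kHz
frames = frame_signal(x)
print(frames.shape)          # (99, 320): 20 ms frames, 10 ms hop
```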
The MFCC extraction process is as follows: apply a fast Fourier transform to the preprocessed signal to obtain its spectrum, compute the power spectrum, filter it through a bank of Mel filters, and apply a discrete cosine transform (DCT) to obtain the MFCC.
The LPCC is the representation of the linear prediction coefficients (LPC) in the cepstral domain. The LPC is a set of prediction coefficients determined directly from the sound signal that minimizes the prediction error between the actual signal and its linear prediction under the minimum mean-square error criterion. Applying the cepstral transform to the LPC yields the LPCC, which captures the envelope of the signal spectrum.
The GTCC extraction process is the same as for the MFCC except that the Mel filter bank is replaced with a Gammatone filter bank.
The PSD reflects how the power of the sound signal varies with frequency. PSD estimation methods include the autocorrelation function method, the periodogram method, and the averaged periodogram method. The most common is the averaged periodogram method: the signal is divided into a number of segments, the PSD is computed in each segment, and the results are averaged.
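The averaged-periodogram estimate described above corresponds to Welch's method; a minimal sketch using `scipy.signal.welch` with the 1024-point FFT from the embodiment (the 1 kHz test tone is purely illustrative):

```python
import numpy as np
from scipy.signal import welch

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)   # 1 kHz tone, 1 s long

# Averaged periodogram: split into segments, take a periodogram per
# segment, then average; nperseg=1024 matches the 1024-point FFT.
f, psd = welch(x, fs=fs, nperseg=1024)
peak_hz = f[np.argmax(psd)]
print(round(peak_hz))              # peak near 1000 Hz
```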
In the acoustic feature extraction, the MFCC order is 13, the LPCC order is 24, the GTCC order is 13, and a 1024-point FFT is used when computing the PSD.
Inputting different features into different classifiers yields different results. In this embodiment, each feature input into one classifier is called a base classifier, so four features and three classifiers give twelve distinct base classifiers. The base classifiers used are support vector machine (SVM), random forest (RF) and K-nearest-neighbor (KNN) classifiers. For convenience, C_i denotes the ith base classifier, with i from 1 to 12; the specific numbering rule is shown in Table 1. For example, C_1 is the base classifier obtained by inputting the LPCC into the SVM classifier. (C_i, C_j) denotes the fusion of C_i and C_j. The SVM uses an RBF kernel, the KNN uses the Manhattan distance, and the RF uses 100 decision trees.
TABLE 1
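The twelve base classifiers can be assembled as (feature, classifier) pairs with scikit-learn; since Table 1 is not reproduced in this text, the numbering below is only an assumed illustration that happens to make C1 the LPCC + SVM pair mentioned above:

```python
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Hyperparameters follow the embodiment: RBF-kernel SVM (probability
# outputs are needed later for DS fusion), 100-tree RF, Manhattan KNN.
FEATURES = ["LPCC", "MFCC", "GTCC", "PSD"]

def make_classifiers():
    base = {}
    k = 1
    for feat in FEATURES:
        for clf in (SVC(kernel="rbf", probability=True),
                    RandomForestClassifier(n_estimators=100),
                    KNeighborsClassifier(metric="manhattan")):
            base[f"C{k}"] = (feat, clf)
            k += 1
    return base

classifiers = make_classifiers()
print(len(classifiers))   # 12
```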
In this embodiment, accuracy and diversity indexes are defined to evaluate and screen the base classifiers: the Overall Accuracy (OA), the Error Similarity (ESR), and a fusion of the two, the Overall Accuracy-Error Similarity (OAESR). OA measures a classifier's overall recognition accuracy, ESR measures how similar the misclassified data of two classifiers are, and the combined index OAESR evaluates both the accuracy and the diversity of two classifiers. These indexes are used for the screening of the base classifiers.
Assume the total number of cough and non-cough samples involved in classification is NA, the number of samples correctly classified by the ith base classifier is NR_i, the number correctly classified by the jth base classifier is NR_j, and the number of samples misclassified by both base classifiers is NF_ij.
Then the OA of the ith base classifier is defined as:

OA_i = NR_i / NA
correspondingly, the OA of the jth base classifier is:

OA_j = NR_j / NA
OA represents the ratio of correctly classified samples to the total number of samples, with value range [0,1]; the ESR_ij between the ith and jth base classifiers is defined as:

ESR_ij = NF_ij / NA
ESR_ij represents the proportion of samples misclassified simultaneously by both base classifiers to the total number of samples, with value range [0,1]; the OAESR_ij between the ith and jth base classifiers is defined as:
for a system consisting of N base classifiers, its OAESR is defined as:
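The OA and ESR indexes defined above can be computed directly from prediction vectors; the OAESR formulas themselves appear only as images in the source, so only OA and ESR are sketched here:

```python
import numpy as np

def oa(pred, truth):
    """Overall accuracy: correctly classified / total, range [0, 1]."""
    return np.mean(pred == truth)

def esr(pred_i, pred_j, truth):
    """Error similarity: fraction of samples BOTH classifiers get wrong."""
    return np.mean((pred_i != truth) & (pred_j != truth))

truth  = np.array([1, 1, 1, 0, 0, 0, 1, 0])
pred_i = np.array([1, 1, 0, 0, 0, 1, 1, 0])   # 2 errors -> OA = 0.75
pred_j = np.array([1, 0, 0, 0, 0, 1, 1, 0])   # 3 errors -> OA = 0.625
print(oa(pred_i, truth), esr(pred_i, pred_j, truth))  # 0.75 0.25
```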
the screening process of the base classifiers is divided into two times, the first screening algorithm flow is shown in figure 2, and the number of the initial base classifiers is assumed to be L 1 Setting ESR threshold ESR thr Calculate OA and ESR between each base classifier ij Wherein the set A = { C 1 ,C 2 ,…C L1 }. If ESR between two base classifiers ij Not greater than a threshold ESR thr Then the base classifier is temporarily retained if the ESR between two base classifiers is present ij ESR greater than threshold thr If the error similarity of the two base classifiers is higher, continuing to judge the OA of the two base classifiers, eliminating the base classifiers with lower OA values, temporarily retaining the base classifiers with higher OA values, traversing all ESR, and continuously removing the base classifiers with higher ESR and lower OA values, wherein the finally retained base classifier is the preferred classifier after the first screening. After the first screening, the number of the preferred base classifiers is L 2 。
The OA and ESR of each base classifier were computed; in this embodiment the ESR threshold is set to 2.5%, and the base classifiers retained after the first screening are C_1, C_2, C_5, C_8 and C_9.
The L_2 base classifiers obtained in the first screening admit many combinations depending on how many classifiers are fused, so a further screening is required. In this embodiment, all combinations of the L_2 base classifiers are first enumerated and grouped by the number of fused classifiers; the OAESR values of all combinations in each group are computed and sorted in descending order. The combinations whose OAESR values fall in the top 20% of each group are taken for fusion, and the fusion results are then obtained and comparatively analyzed.
The five base classifiers obtained after the first screening are screened a second time, as shown in Fig. 3. Four groups of combinations of sizes 2, 3, 4 and 5 are obtained and denoted sets 2 to 5; set 5 contains only one combination, which is retained. The OAESR values of the combinations in each set are computed and the top 20% of each set are taken, yielding the final fusion strategies (C_2, C_5), (C_2, C_9), (C_1, C_2, C_5), (C_2, C_5, C_9), (C_1, C_2, C_5, C_8) and (C_1, C_2, C_5, C_8, C_9).
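The grouping-by-size and top-20% selection can be sketched with `itertools.combinations`; the scoring callback below is a toy stand-in for the patent's OAESR formula, which is not reproduced here:

```python
from itertools import combinations
from math import ceil

def candidate_fusions(classifiers, oaesr_of, top_frac=0.20):
    """Group the retained classifiers into sets by fusion size (2..N),
    rank each group by its OAESR score, and keep the top fraction
    (at least one combination per group)."""
    selected = []
    n = len(classifiers)
    for size in range(2, n + 1):
        group = sorted(combinations(classifiers, size),
                       key=oaesr_of, reverse=True)
        keep = max(1, ceil(len(group) * top_frac))
        selected.extend(group[:keep])
    return selected

clfs = ["C1", "C2", "C5", "C8", "C9"]
score = lambda combo: ("C2" in combo) + len(combo) / 10  # toy score
fusions = candidate_fusions(clfs, score)
print(len(fusions))   # 6 candidate strategies for 5 classifiers
```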
The screened combinations are fused with the improved DS multi-classifier fusion algorithm. Assume each base classifier outputs proposition A_i with probability m_i(A_i); then, for n base classifiers, the DS-fused output is:

m(A) = (1 / (1 - K)) * Σ_{A_1 ∩ … ∩ A_n = A} ∏_{i=1}^{n} m_i(A_i)
where K is the conflict coefficient, expressed as:

K = Σ_{A_1 ∩ … ∩ A_n = ∅} ∏_{i=1}^{n} m_i(A_i)
where Σ denotes summation, ∩ denotes intersection, and ∅ denotes the empty set. In DS fusion, a KNN output probability of 0 causes a one-vote veto: if any single base classifier outputs probability 0, the fused output probability is forced to 0, so fusion brings no advantage and performance may even degrade. The hard KNN outputs of 0 and 1 are therefore converted: an output of 1 is replaced by the probability α and an output of 0 by the probability 1 - α, where α ranges over (0.5, 1); the optimal α is obtained by linear search;
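For the two-class frame {cough, non-cough} with masses only on the singletons, Dempster's rule reduces to a product form; the sketch below also applies the α remapping of hard KNN outputs described above (α = 0.9 is an arbitrary illustration):

```python
import numpy as np

def remap_knn(p, alpha=0.9):
    """Replace hard KNN outputs: 1 -> alpha, 0 -> 1 - alpha, so a single
    zero cannot veto the whole fusion (alpha in (0.5, 1))."""
    if p == 1.0:
        return alpha
    if p == 0.0:
        return 1.0 - alpha
    return p

def ds_fuse(probs):
    """Dempster's rule for {cough, non-cough} with singleton masses:
    the conflict K collects the contradictory assignments and the
    surviving mass is renormalised."""
    p = np.prod(probs)                    # every classifier says cough
    q = np.prod(1.0 - np.asarray(probs))  # every classifier says non-cough
    k = 1.0 - p - q                       # conflict coefficient K
    return p / (1.0 - k)                  # fused P(cough)

probs = [remap_knn(x) for x in (0.8, 0.7, 1.0)]
print(round(float(ds_fuse(probs)), 3))
```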
when the fusion result is close to the decision boundary 0.5, the DS fusion result is unreliable owing to noise interference and other factors, so a distance fusion algorithm is adopted to improve the DS fusion algorithm. For a test sample x, let P be the cough probability output after DS evidence theory fusion and RD the classification result of distance fusion; the final classification result R of the test sample is:

R = 1 if P > 1 - β; R = RD if β ≤ P ≤ 1 - β; R = 0 if P < β
where 1 denotes cough, 0 denotes non-cough, and β is the conversion boundary with value range [0.3, 0.5]; the optimal β for each base classifier fusion strategy is found by linear search.
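A sketch of the conversion rule as I read it: the exact piecewise formula is an image in the source, so the thresholding below (trust DS outside [β, 1 - β], otherwise fall back on the distance result) is a reconstruction consistent with the surrounding text:

```python
def final_decision(p_ds, rd, beta=0.4):
    """p_ds: fused cough probability from DS evidence theory.
    rd: distance-fusion label (0 or 1), used near the 0.5 boundary.
    beta: conversion boundary in [0.3, 0.5] per the text."""
    if p_ds > 1.0 - beta:
        return 1      # DS is confidently cough
    if p_ds < beta:
        return 0      # DS is confidently non-cough
    return rd         # near the boundary: use the distance-fusion result

print(final_decision(0.95, rd=0))   # 1  (DS confident)
print(final_decision(0.52, rd=0))   # 0  (near boundary -> distance result)
```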
The distance fusion algorithm is as follows. Let x_i^j denote the ith feature vector of the jth training sample and y_i the ith feature vector of the current test sample, where i = 1, 2, 3, 4 index the LPCC, MFCC, GTCC and PSD, respectively. Let ρ(j) be a function that returns the class of the jth training sample, and let d(y_i, x_i^j) denote the distance between y_i and x_i^j, computed as the Manhattan distance. The fusion distance D_j from the current test sample to the jth training sample is defined as:
wherein,
where M represents the total number of training samples; the class R of the test sample is:

R = ρ(argmin_j D_j)
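A sketch of the distance fusion: the test sample takes the label of the training sample with the smallest fused Manhattan distance over the four feature vectors. The patent's weight formula is an image and is not reproduced, so the uniform weights here are an assumption:

```python
import numpy as np

def distance_fusion(test_feats, train_feats, train_labels, weights=None):
    """test_feats: list of 4 feature vectors y_i of the test sample.
    train_feats: per training sample, a list of 4 feature vectors x_i^j.
    Returns rho(argmin_j D_j), the label of the nearest training sample
    under the weighted sum of per-feature Manhattan distances."""
    if weights is None:
        weights = np.ones(len(test_feats)) / len(test_feats)
    fused = []
    for sample in train_feats:
        d = sum(w * np.sum(np.abs(y - x))       # Manhattan distance
                for w, y, x in zip(weights, test_feats, sample))
        fused.append(d)
    return train_labels[int(np.argmin(fused))]

y = [np.array([1.0, 1.0])] * 4
train = [[np.array([0.0, 0.0])] * 4,            # far from y
         [np.array([1.0, 0.9])] * 4]            # near y
print(distance_fusion(y, train, train_labels=[0, 1]))   # 1
```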
in the embodiment, the KNN output probabilities are 0 and 1, and the conversion parameter α finds the optimal value by means of linear search. And replacing the DS fusion samples close to the decision boundary beta by adopting the classification result of distance fusion. The decision boundary of each combination may be different, and the optimal boundary value needs to be obtained by means of linear search.
The results of the different combinations are compared and analyzed, and a suitable base classifier fusion scheme is selected according to classification accuracy and computational complexity.
The above-described embodiments are merely illustrative of the preferred embodiments of the present application, and do not limit the scope of the present application, and various modifications and improvements made to the technical solutions of the present application by those skilled in the art without departing from the spirit of the present application should fall within the protection scope defined by the claims of the present application.
Claims (8)
1. The pig cough sound identification method based on the improved DS evidence theory multi-classifier fusion is characterized by comprising the following steps of:
collecting sound fragments of live pigs in a pigsty to obtain a corpus;
obtaining a training set and a test set based on the corpus, and extracting a plurality of acoustic features in the training set and the test set;
inputting a plurality of acoustic features in the training set into a plurality of base classifiers, and outputting to obtain a plurality of base classifier performance evaluation indexes;
screening the base classifier according to the performance evaluation index of the base classifier to obtain an optimal base classifier;
training the optimal base classifier by using the training set to complete a target training model;
and inputting the test set into the target training model, and fusing output results of the optimal base classifier by adopting an improved DS evidence theory to complete the pig cough sound recognition.
2. The improved DS evidence theory multi-classifier fusion-based pig cough sound identification method according to claim 1, wherein the method for obtaining the training set and the test set comprises:
labeling the corpus to obtain cough sound segments and non-cough sound segments;
and dividing the cough sound segments and the non-cough sound segments into a training set and a test set in a set proportion.
3. The improved DS evidence theory multi-classifier fusion-based pig cough sound identification method according to claim 1, wherein the acoustic features comprise: Mel-frequency cepstral coefficients, linear prediction cepstral coefficients, Gammatone cepstral coefficients, and power spectral density.
4. The improved DS evidence theory multi-classifier fusion-based pig cough sound identification method according to claim 1, wherein the base classifier comprises: support vector machine, random forest and K nearest neighbors classifier.
5. The improved DS evidence theory multi-classifier fusion-based pig cough sound identification method according to claim 1, wherein the base classifier performance evaluation indexes comprise: the classification accuracy, the error similarity, and the comprehensive classification accuracy-error similarity evaluation, which are defined as follows:
assuming that the total number of cough and non-cough samples involved in classification is NA, the number of samples correctly classified by the i-th base classifier is NR_i, the number of samples correctly classified by the j-th base classifier is NR_j, and the number of samples misclassified simultaneously by both base classifiers i and j is NF_ij;
the classification accuracy of the i-th base classifier is defined as:
OA_i = NR_i / NA
and correspondingly, the classification accuracy of the j-th base classifier is:
OA_j = NR_j / NA
where OA_i and OA_j represent the classification accuracies of base classifiers i and j, respectively;
thus OA represents the ratio of the number of correctly classified samples to the total number of samples, i.e., the classification accuracy, and its value range is [0,1]; meanwhile, the error similarity between the i-th and j-th base classifiers is defined as:
ESR_ij = NF_ij / NA
where ESR_ij represents the degree of error similarity between the two base classifiers i and j;
thus ESR represents the ratio of the number of samples misclassified simultaneously by the two base classifiers to the total number of samples, i.e., the degree of error similarity between them, and its value range is [0,1]; the comprehensive classification accuracy-error similarity evaluation between the i-th and j-th base classifiers is defined as:
where OAESR_ij represents the comprehensive classification accuracy-error similarity evaluation between base classifiers i and j;
for a system consisting of N base classifiers, its OAESR is defined as:
where OAESR represents the comprehensive classification accuracy-error similarity evaluation index of the system.
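The OA and ESR indexes above can be computed directly from prediction vectors. A minimal sketch follows; the patent's exact OAESR combination formula is not reproduced in the extracted text, so only OA and ESR are shown:

```python
import numpy as np

def oa(pred, true):
    # OA_i = NR_i / NA: fraction of correctly classified samples
    pred, true = np.asarray(pred), np.asarray(true)
    return float(np.mean(pred == true))

def esr(pred_i, pred_j, true):
    # ESR_ij = NF_ij / NA: fraction of samples that both classifiers
    # misclassify simultaneously
    pred_i, pred_j, true = map(np.asarray, (pred_i, pred_j, true))
    return float(np.mean((pred_i != true) & (pred_j != true)))
```

For example, two classifiers that are each 50% accurate but share only one wrong sample out of four have ESR = 0.25, indicating their errors are largely complementary.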
6. The improved DS evidence theory multi-classifier fusion-based pig cough sound identification method according to claim 5, wherein the base classifiers are screened by a two-step screening method to obtain the preferred base classifiers, the two-step screening method comprising:
assuming the number of initial base classifiers is L_1, setting a threshold ESR_thr for the error similarity, and calculating the OA of each base classifier and the pairwise ESR_ij; if the ESR_ij between two base classifiers is not greater than the threshold ESR_thr, both base classifiers are temporarily retained; if the ESR_ij between two base classifiers is greater than the threshold ESR_thr, the error similarity between the two is high, so their OA values are compared, the base classifier with the smaller OA is eliminated, and the one with the larger OA is temporarily retained; all ESR_ij values are traversed in this way, continuously removing base classifiers with high ESR and low OA, and the L_2 base classifiers remaining after this first screening are the preferred base classifiers; in the second step, the preferred base classifiers are formed into groups by combination size, the OAESR values of all base classifier combinations in each group are calculated and sorted in descending order, a threshold OAESR_thr is set, and the combinations in each group exceeding the threshold are fused.
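The first screening step of claim 6 can be sketched as follows (a hypothetical helper; the second step's grouping and OAESR ranking is omitted because the OAESR formula is not reproduced in the extracted text):

```python
def first_screening(oas, esr_matrix, esr_thr):
    """From each pair of base classifiers whose pairwise ESR exceeds
    esr_thr, drop the one with the lower OA; keep everything else.

    oas: per-classifier OA values; esr_matrix[i][j]: pairwise ESR.
    Returns the indices of the retained (preferred) base classifiers."""
    n = len(oas)
    keep = set(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if i in keep and j in keep and esr_matrix[i][j] > esr_thr:
                # errors too similar: eliminate the less accurate one
                keep.discard(i if oas[i] < oas[j] else j)
    return sorted(keep)
```

Classifiers whose errors strongly overlap add little diversity to the ensemble, which is why the less accurate member of each high-ESR pair is the one removed.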
7. The improved DS evidence theory multi-classifier fusion-based pig cough sound identification method according to claim 3, wherein the method for fusing the output results of the preferred base classifiers using the improved DS evidence theory comprises:
assuming each base classifier i outputs proposition A_i with probability m_i(A_i), then for n base classifiers, the DS-fused output is:
m(A) = [ Σ_{A_1 ∩ A_2 ∩ … ∩ A_n = A} m_1(A_1) m_2(A_2) … m_n(A_n) ] / (1 − K)
where K represents the conflict coefficient, expressed as:
K = Σ_{A_1 ∩ A_2 ∩ … ∩ A_n = ∅} m_1(A_1) m_2(A_2) … m_n(A_n)
where Σ denotes summation, ∩ denotes intersection, and ∅ denotes the empty set; in DS fusion, when the KNN output probability is 0, a one-vote veto occurs: if a single base classifier outputs probability 0, the fused output is forced to 0, so the fusion yields no benefit and performance may even degrade; therefore, KNN output probabilities of 0 and 1 are converted: an output probability of 1 is replaced by α and an output probability of 0 by 1 − α, where α has the value range (0.5, 1), and a linear search method can be used to obtain the optimal α value;
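For the two-class cough/non-cough frame of discernment with mass only on the singletons, Dempster's rule and the 0/1 conversion above reduce to the following sketch (α = 0.9 is an assumed value, not taken from the patent):

```python
import numpy as np

def ds_fuse(cough_probs, alpha=0.9):
    """Fuse per-classifier cough probabilities with Dempster's rule,
    first softening hard 0/1 outputs to (1 - alpha)/alpha to avoid
    the one-vote veto."""
    p = np.array([alpha if q == 1.0 else (1.0 - alpha) if q == 0.0 else q
                  for q in cough_probs], dtype=float)
    m_cough = np.prod(p)          # joint mass on "cough"
    m_non = np.prod(1.0 - p)      # joint mass on "non-cough"
    # normalisation by 1 - K, K being the conflict mass
    return m_cough / (m_cough + m_non)
```

Note that a hard 0 from one classifier no longer annihilates the fused result: it is softened to 1 − α, so the other classifiers' evidence still contributes.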
when the fusion result is close to the decision boundary of 0.5, the DS fusion result is unreliable owing to factors such as noise interference, so the DS fusion algorithm is improved with a distance fusion algorithm; for a test sample x, assuming the probability of cough output after DS evidence theory fusion is P and the classification result of distance fusion is RD, the final classification result R of the test sample is:
where 1 represents cough, 0 represents non-cough, and β is a conversion boundary with value range [0.3, 0.5]; the optimal β value is searched for each base classifier fusion strategy by a linear search method.
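Since the extracted text omits the piecewise formula for R, the sketch below encodes one plausible reading of claim 7: trust the DS result when P is far from 0.5, and fall back to the distance-fusion result RD inside the conversion boundary β (β = 0.4 is an assumed value):

```python
def final_decision(p_ds, rd, beta=0.4):
    # Assumed decision rule, not the patent's exact formula:
    # decisive DS probabilities are thresholded directly, while the
    # ambiguous band [beta, 1 - beta] defers to distance fusion.
    if p_ds > 1.0 - beta:
        return 1            # cough
    if p_ds < beta:
        return 0            # non-cough
    return rd               # near the 0.5 boundary: use RD
```

The width of the ambiguous band (controlled by β) trades off how often the noisier DS result is overridden by the distance-fusion fallback.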
8. The improved DS evidence theory multi-classifier fusion-based pig cough sound identification method according to claim 7, wherein the process of adopting the distance fusion algorithm comprises:
let x_i^(j) denote the i-th feature vector of the j-th training sample and y_i the i-th feature vector of the current test sample, where i = 1, 2, 3, 4 indexes the Mel-frequency cepstral coefficients, the linear prediction cepstral coefficients, the gammatone cepstral coefficients, and the power spectral density, respectively; let p(j) be a function that returns the class of the j-th training sample, and let d(y_i, x_i^(j)) denote the distance between y_i and x_i^(j), computed as the Manhattan distance; the fused distance D_j from the current test sample to the j-th training sample is defined as:
wherein,
where M represents the total number of training samples; the class R of the test sample is then R = p(argmin_{j=1..M} D_j), i.e., the class of the training sample with the minimum fused distance.
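A sketch of the distance-fusion classifier of claim 8, assuming the fused distance D_j is a weighted sum of per-feature Manhattan distances (the patent's weight formula is not reproduced in the extracted text, so equal weights are assumed here):

```python
import numpy as np

def distance_fusion_classify(test_feats, train_feats, train_labels,
                             weights=None):
    """test_feats: four feature vectors (MFCC, LPCC, GTCC, PSD) of the
    test sample; train_feats: for each training sample, the same four
    vectors. Returns the label of the training sample with the minimum
    fused distance."""
    if weights is None:
        weights = [1.0] * len(test_feats)
    fused = []
    for sample in train_feats:
        # Manhattan distance per feature type, combined by weights
        d = sum(w * float(np.sum(np.abs(np.asarray(y) - np.asarray(x))))
                for w, y, x in zip(weights, test_feats, sample))
        fused.append(d)
    return train_labels[int(np.argmin(fused))]
```

This is effectively a 1-nearest-neighbor rule in the fused distance, which is what makes its result RD a useful tiebreaker when the DS probability sits near 0.5.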
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211128776.2A CN115457966B (en) | 2022-09-16 | 2022-09-16 | Pig cough sound identification method based on improved DS evidence theory multi-classifier fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115457966A true CN115457966A (en) | 2022-12-09 |
CN115457966B CN115457966B (en) | 2023-05-12 |
Family
ID=84305069
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117423342A (en) * | 2023-10-27 | 2024-01-19 | 东北农业大学 | Pig abnormal state monitoring method and system based on edge calculation |
CN117647587A (en) * | 2024-01-30 | 2024-03-05 | 浙江大学海南研究院 | Acoustic emission signal classification method, computer equipment and medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140336537A1 (en) * | 2011-09-15 | 2014-11-13 | University Of Washington Through Its Center For Commercialization | Cough detecting methods and devices for detecting coughs |
CN106847262A (en) * | 2016-12-28 | 2017-06-13 | 华中农业大学 | A kind of porcine respiratory disease automatic identification alarm method |
CN109471942A (en) * | 2018-11-07 | 2019-03-15 | 合肥工业大学 | Chinese comment sensibility classification method and device based on evidential reasoning rule |
CN112861984A (en) * | 2021-02-25 | 2021-05-28 | 西华大学 | Speech emotion classification method based on feature fusion and ensemble learning |
CN113240034A (en) * | 2021-05-25 | 2021-08-10 | 北京理工大学 | Depth decision fusion method based on entropy method and D-S evidence theory |
CN114330453A (en) * | 2022-01-05 | 2022-04-12 | 东北农业大学 | Live pig cough sound identification method based on fusion of acoustic features and visual features |
CN114330454A (en) * | 2022-01-05 | 2022-04-12 | 东北农业大学 | Live pig cough sound identification method based on DS evidence theory fusion characteristics |
Non-Patent Citations (2)
Title |
---|
黎煊; 赵建; 高云; 刘望宏; 雷明刚; 谭鹤群: "Recognition of continuous pig coughing sounds based on continuous speech recognition technology" (基于连续语音识别技术的猪连续咳嗽声识别), 农业工程学报 (Transactions of the Chinese Society of Agricultural Engineering) *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110120218B (en) | Method for identifying highway large-scale vehicles based on GMM-HMM | |
CN115457966A (en) | Pig cough sound identification method based on improved DS evidence theory multi-classifier fusion | |
US7177808B2 (en) | Method for improving speaker identification by determining usable speech | |
CN104795064B (en) | The recognition methods of sound event under low signal-to-noise ratio sound field scape | |
CN102915729B (en) | Speech keyword spotting system and system and method of creating dictionary for the speech keyword spotting system | |
CN103730130A (en) | Detection method and system for pathological voice | |
CN102799899A (en) | Special audio event layered and generalized identification method based on SVM (Support Vector Machine) and GMM (Gaussian Mixture Model) | |
US9043207B2 (en) | Speaker recognition from telephone calls | |
CN112259104A (en) | Training device of voiceprint recognition model | |
CN106910495A (en) | A kind of audio classification system and method for being applied to abnormal sound detection | |
CN115101076B (en) | Speaker clustering method based on multi-scale channel separation convolution feature extraction | |
CN112015874A (en) | Student mental health accompany conversation system | |
Xiao et al. | AMResNet: An automatic recognition model of bird sounds in real environment | |
Ge et al. | Speaker change detection using features through a neural network speaker classifier | |
CN107886071A (en) | A kind of processing method of fibre reinforced composites damage acoustic emission signal | |
CN114822557A (en) | Method, device, equipment and storage medium for distinguishing different sounds in classroom | |
CN114373453A (en) | Voice keyword detection method based on motion trail and discriminative information | |
CN117727307B (en) | Bird voice intelligent recognition method based on feature fusion | |
CN113571050A (en) | Voice depression state identification method based on Attention and Bi-LSTM | |
CN109036390B (en) | Broadcast keyword identification method based on integrated gradient elevator | |
Aurchana et al. | Musical instruments sound classification using GMM | |
Kumar et al. | Parkinson’s Speech Detection Using YAMNet | |
CN118098288B (en) | Weak supervision voice depression detection method based on self-learning label correction | |
Muthusamy et al. | A review of research in automatic language identification | |
Li et al. | The analysis on the acoustic parameters of distinctive features for Mandarin vowels |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||