CN111340132B - Machine olfaction mode identification method based on DA-SVM - Google Patents
- Publication number
- CN111340132B (application CN202010161893.3A)
- Authority: CN (China)
- Legal status: Active
Classifications
- G06F18/2411: classification techniques based on the proximity to a decision surface, e.g. support vector machines
- G01N33/0031: gas analysers whose detector comprises two or more sensors, e.g. a sensor array
- G01N33/0047: detectors specially adapted to detect a particular organic compound
- G01N33/0054: detectors specially adapted to detect ammonia
- G06N3/045: neural networks; combinations of networks
- G06N3/084: learning methods; backpropagation, e.g. using gradient descent
Abstract
The invention discloses a machine olfaction pattern recognition method based on DA-SVM, comprising the following steps: 1. acquire a raw data set S1 of the olfactory system, then normalize and manually label it; 2. construct a deep autoencoder (DA), remove the label column of S1, take the remaining data as the DA input, and obtain a dimension-reduced feature data set through iterative training; 3. attach the labels of step 1 again to the feature data set obtained in step 2, generating a new data set S2; 4. send S2 into a support vector machine model for training, and establish the SVM classifier through repeated parameter tuning; 5. pattern recognition of the olfactory system can then be realized with the SVM classifier. The invention can handle the problems of the machine olfactory system regarding large samples, high-dimensional features, many categories, long-term drift, and the like, and improves the accuracy of machine olfactory perception.
Description
Technical Field
The invention relates to a machine olfactory system, and in particular to an olfactory perception classifier combining a deep autoencoder and a support vector machine.
Background
Machine olfaction is a novel bionic detection technology that simulates the working principle of biological olfaction. A machine olfactory system generally consists of a cross-sensitive chemical sensor array and a suitable computer pattern recognition algorithm, and can detect, analyse, and identify various odors. A complete machine olfactory system comprises gas sensor array hardware and a set of pattern recognition techniques oriented toward sensor signal and data processing. The pattern recognition part mainly establishes a suitable machine learning model that judges the composition and concentration of the detected gas, or the odor of the detected target, thereby realizing bionic (machine) olfaction.
However, existing machine olfactory systems still perform poorly in practical gas recognition and odor judgment applications. On the one hand, as an olfactory sensor is poisoned or degrades over its service life, its response signal gradually drifts away from its original value, and this drift reduces the recognition accuracy of the electronic nose or even makes it unreliable. On the other hand, olfactory pattern recognition usually trains a classifier on a large amount of data, which introduces considerable noise interference; it also faces high-dimensional, multi-variable interference among the sensing signals, so the truly useful feature signals are submerged or hard to extract, which ultimately degrades the recognition performance of the machine olfactory system.
in order to improve the performance of a machine olfactory system, ZL 2016610120715. X discloses an electronic nose mode identification method based on deep belief network feature extraction, ZL201110340338.8 discloses an electronic nose on-line drift suppression method based on a multi-self-organization neural network, and ZL201610216768.1 discloses an electronic nose gas identification method for target domain transfer extreme learning. However, these methods are mainly for building deep neural network based classifier models, which require classifying data by a large number of neurons. The classifier directly adopting the deep learning method or the neural network is too complex compared with the traditional machine learning classifier, and is limited in application on a plurality of low-power consumption low-computation chips, although the precision is improved.
Disclosure of Invention
In order to solve the above problems, the invention provides a machine olfactory pattern recognition method combining a deep autoencoder and a support vector machine (DA-SVM). The DA-SVM classifier established by this method uses the deep autoencoder to realize automatic dimensionality reduction and effective feature extraction on large-sample data, while building the machine olfactory pattern recognition model on a shallow SVM classifier; the method can therefore ultimately improve the accuracy of machine olfactory perception with respect to large samples (≥ 10000), high-dimensional features (≥ 100), many categories, long-term drift, and similar problems.
In order to achieve the above purpose, the present invention is realized by the following technical scheme:
the machine olfactory pattern recognition method based on the DA-SVM is characterized by comprising the following steps of:
step one, obtain an original data set of the olfactory system, then normalize and manually label it; the data set can be recorded as S1 = {(x_1, y_1), (x_2, y_2), …, (x_m, y_m)}, where (x_i, y_i) is the i-th sample pair, i = 1, …, m; x_i is the raw-data feature of the sample, y_i is the corresponding label, and m is the number of samples;
step two, construct a deep autoencoder (DA); remove the label column (y_i) of S1 from step one and take the remaining feature set (x_i) as the input of the network; after repeated iterative training, a new feature set x_i^o can be output, where the superscript o indicates new data;
step three, attach the labels (y_i) of step one again to the features x_i^o obtained in step two, generating a new data set, which can be expressed as S2 = {(x_1^o, y_1), (x_2^o, y_2), …, (x_m^o, y_m)};
Step four, the new data set S of step three 2 Sending the model into a Support Vector Machine (SVM) for training, and obtaining parameters of an SVM classifier model through multiple parameter adjustment until the model error is reduced to a reasonable interval;
and fifthly, realizing the mode identification of the olfactory system by utilizing the SVM model parameters in the step four.
Further, the data preprocessing in step one uses the Min-Max function for normalization, mapping each original value x into the interval [0, 1] via x' = (x - min)/(max - min); this handles the differing dimensions (units) of the olfactory signals. Meanwhile, the labels y_i in step one are one-hot encoded, with as many encoding dimensions as there are values of the gas category feature.
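As an illustrative aid (not part of the patented method itself), the step-one preprocessing can be sketched in pure Python: Min-Max normalization of a sensor channel into [0, 1], plus one-hot encoding of the gas labels. The function names and toy readings below are ours; the 6-gas class count follows the patent's example.

```python
def min_max_normalize(column):
    """Map a list of raw sensor readings into [0, 1] with the Min-Max rule."""
    lo, hi = min(column), max(column)
    if hi == lo:                      # constant channel: map everything to 0
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def one_hot(label_index, num_classes=6):
    """One-hot encode a gas label; dimension count equals the class count."""
    vec = [0] * num_classes
    vec[label_index] = 1
    return vec

readings = [120.0, 260.0, 190.0, 330.0]
normalized = min_max_normalize(readings)
print(normalized)        # all values lie in [0, 1]
print(one_hot(0))        # first gas -> [1, 0, 0, 0, 0, 0]
```

The same per-channel scaling would be applied to every sensor dimension of the 128-dimensional feature vectors before they are fed to the autoencoder.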
Further, the deep autoencoder algorithm framework in step two is constructed in the following form; the specific steps are as follows:
first, construct a deep autoencoder network comprising an input layer, an output layer, and n hidden layers (2 ≤ n ≤ 20); initialize the network structure and determine the node numbers [128, 6, 64], i.e. the input layer has 128 neurons, the output layer contains 6 neurons, and the hidden layer has 64 neurons;
secondly, coding dimension reduction: connecting the input layer with the neuron nodes of the first hidden layer according to the formula f (x) =f (w i x i +b i ) Encoding the input layer and the first hidden layer, w i As a weight matrix, b i As bias item, f is the mapping function of coding, repeating the coding step until the middle hidden layer is connected;
finally, decoding and reconstructing: connecting the most intermediate hidden layer with the neurons of the subsequent hidden layer according to the formulaAnd performing layer-by-layer reconstruction until the layer is connected to a final output layer, wherein g is a decoded mapping function, the superscript T represents the transposition of the vector, and the reconstruction process decodes a vector with the same size as the original size according to the function g.
Preferably, during the training of the DA weights in step two, a loss function (Loss Function) is used to measure the error of each iterative computation, finally yielding the optimal parameters. Here the chosen loss function is the cross-entropy loss L(y, ŷ) = -Σ_i [ y_i · log(ŷ_i) + (1 - y_i) · log(1 - ŷ_i) ], where ŷ is the predicted value of the true label y. According to the loss-minimization criterion Q_New = argmin_Q L(y, ŷ), the parameter set Q is continuously optimized until the optimal solution Q_New is reached; the symbol Q denotes the parameter set formed by all weights w_i and biases b_i, Q_New denotes the updated parameter set, and argmin_Q abbreviates the optimization of minimizing the loss over the parameter Q. Compared with other loss functions, this loss is monotonic over the whole curve, and a larger loss yields a larger gradient, which facilitates gradient-descent back-propagation and optimization.
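The cross-entropy loss described above can be computed directly; a minimal pure-Python sketch (our own function name, with a small clipping constant added to avoid log(0)) is:

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """L(y, y_hat) = -sum_i [ y_i*log(y_hat_i) + (1-y_i)*log(1-y_hat_i) ]."""
    loss = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)    # clip to keep log() finite
        loss -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return loss

confident = cross_entropy([1, 0, 0], [0.9, 0.05, 0.05])
wrong = cross_entropy([1, 0, 0], [0.1, 0.8, 0.1])
print(confident < wrong)   # True: the worse prediction incurs a larger loss
```

The monotonicity noted in the text is visible here: as a predicted probability moves away from its one-hot target, the loss (and its gradient) grows without bound.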
Further, the parameter Q_New is optimized according to the Adam adaptive-learning-rate gradient descent method; the Adam algorithm designs an independent adaptive learning rate for each parameter by computing first-moment and second-moment estimates of the gradient, thereby realizing the gradient-descent parameter update. Concretely, Adam proceeds according to the following formulas:

m_t = α1 · m_(t-1) + (1 - α1) · g_t
v_t = α2 · v_(t-1) + (1 - α2) · g_t^2
m̂_t = m_t / (1 - α1^t)
v̂_t = v_t / (1 - α2^t)
Q_New = Q_New-1 - γ · m̂_t / (sqrt(v̂_t) + θ)

where g_t is the gradient at iteration t; m̂_t and v̂_t denote the first-moment mean value and second-moment variance value respectively; m_t and v_t are the first-moment and second-moment gradient momenta; α1 and α2 are the respective decay coefficients, taking the values 0.9 and 0.999; Q_New-1 is the parameter set of the previous iteration relative to Q_New; γ is the user-defined learning rate; the sub/superscript t denotes the t-th iterative computation; and θ is a minimal constant preventing the denominator from being 0, usually 10e-8.
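A single Adam update step matching the recursions above can be sketched in pure Python; the decay rates α1 = 0.9, α2 = 0.999 and θ ≈ 1e-8 follow the text, while the toy quadratic loss q → (q - 3)^2 and the function name are ours for illustration.

```python
import math

def adam_step(q, m, v, grad, t, gamma=0.1, a1=0.9, a2=0.999, theta=1e-8):
    m = a1 * m + (1 - a1) * grad            # first-moment gradient momentum
    v = a2 * v + (1 - a2) * grad * grad     # second-moment gradient momentum
    m_hat = m / (1 - a1 ** t)               # bias-corrected first moment
    v_hat = v / (1 - a2 ** t)               # bias-corrected second moment
    q = q - gamma * m_hat / (math.sqrt(v_hat) + theta)
    return q, m, v

q, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    grad = 2 * (q - 3.0)                    # gradient of (q - 3)^2
    q, m, v = adam_step(q, m, v, grad, t)
print(round(q, 2))                          # converges near the minimum at 3.0
```

In the patent's setting, q would stand for each individual weight or bias in the parameter set Q, each receiving its own momentum state and hence its own effective learning rate.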
The new feature set x_i^o in step three is obtained by directly selecting the data output by the middle-most hidden layer of the autoencoder as the finally selected features; these representative features form the new data set S2.
The training and parameter-tuning process of the SVM classifier in step four also needs a loss function to compute the model error, in order to judge whether the tuning is optimal; here the hinge loss (Hinge Loss) is selected to determine the error, defined as L = Σ_i max(0, 1 - y_i · ŷ_i), where y_i is the true label value and ŷ_i is the distance of the predicted point to the separating hyperplane.
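The hinge loss above is a one-liner; a small pure-Python sketch (function name ours) shows its key property, that correctly classified points with margin at least 1 contribute nothing:

```python
def hinge_loss(y_true, distances):
    """L = sum_i max(0, 1 - y_i * d_i), y_i in {-1, +1}, d_i signed distance."""
    return sum(max(0.0, 1.0 - y * d) for y, d in zip(y_true, distances))

# Confident, correct predictions (margin >= 1) incur zero loss ...
print(hinge_loss([+1, -1], [2.0, -1.5]))   # 0.0
# ... while a misclassified point on the wrong side is penalized.
print(hinge_loss([+1], [-0.5]))            # 1.5
```

This margin-based penalty is what the tuning loop minimizes while searching over the SVM parameters.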
Further, the training parameter tuning in step four refers to adjusting two important parameters of the SVM model, c (penalty factor) and gamma (Gaussian kernel parameter); the optimal parameters can be determined with ten-fold cross-validation (10-fold cross-validation). The model parameter tuning also has the following characteristics: the model solving method is the Adam adaptive-learning-rate gradient descent method, with the initial momentum set to 0.9, the initial step size (learning rate) set to 0.1, and the iteration period set to 1000.
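The ten-fold split used in the tuning above can be sketched as follows: the data set is divided into 10 folds, each fold serves once as the validation set while the other 9 form the training set, and the 10 validation accuracies are then averaged. The fold-index generator below is a minimal pure-Python illustration (names ours; it assumes the sample count is divisible by the fold count).

```python
def ten_fold_indices(n, k=10):
    """Yield (train_indices, validation_indices) pairs for k folds."""
    fold_size = n // k
    for f in range(k):
        val = list(range(f * fold_size, (f + 1) * fold_size))
        val_set = set(val)
        train = [i for i in range(n) if i not in val_set]
        yield train, val

n = 100
folds = list(ten_fold_indices(n))
print(len(folds))                          # 10 folds
train, val = folds[0]
print(len(train), len(val))                # 90 10
# every sample appears in exactly one validation fold
all_val = sorted(i for _, v in folds for i in v)
print(all_val == list(range(n)))           # True
```

For each candidate (c, gamma) pair, an SVM would be trained on each 9-fold training set and scored on the held-out fold, and the pair with the best average score would be kept.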
Further, the olfactory pattern recognition method in step five also has the following feature: when the machine olfactory system acquires only a single new sample, steps one and two are repeated to extract its features, and the SVM classifier obtained in step four is then used to identify the new sample; when the machine olfactory system has newly acquired a large number of labeled samples, however, steps one through four are repeated to retrain the DA and SVM models and thereby update them.
Compared with the prior art, the invention has the beneficial effects that:
the invention provides a machine olfaction mode identification method combining a depth self-encoder and a support vector machine, which can automatically reduce the dimension and extract the characteristics by utilizing the self-encoder, and simultaneously adopts a simple and reliable SVM classifier in reality for identification, so that the method can finally solve the problems of large samples, high-dimension characteristics, multiple categories, long-term drift and the like, and improve the mode identification performance of a machine olfaction system. Compared with other methods for identifying the machine olfaction mode by directly using a deep neural network, such as a deep belief network, a multiple self-organizing neural network, a deep convolutional neural network and the like, the method avoids training a complex or high-dimensional classifier, and trains a relatively simple SVM classifier by adopting effective low-dimensional characteristics, so that the method has stronger practicability when facing to a machine olfaction system with a large sample in practical application.
Drawings
FIG. 1 is a flow chart of the steps for implementing the present invention.
FIG. 2 is a schematic diagram of the autoencoder of the invention.
FIG. 3 is a graph of test results for one example of the present invention.
Detailed Description
The invention is further elucidated below in connection with the drawings and the detailed description; the scope of the invention is, however, not limited to the described embodiments.
A machine olfaction pattern recognition method based on DA-SVM, as shown in FIG. 1, comprises the following steps:
Step one, obtain an original data set of the olfactory system, then normalize and manually label it; the data set can be recorded as S1 = {(x_1, y_1), (x_2, y_2), …, (x_m, y_m)}, where (x_i, y_i) is the i-th sample pair, x_i is the raw-data feature of the sample, y_i is the corresponding label, and m is the number of samples;
Step two, construct an autoencoder; remove the label column (y_i) of S1 from step one and take the remaining feature set (x_i) as the input of the network; after repeated iterative training, a new feature set x_i^o can be output;
Step three, attach the labels (y_i) of step one again to the features x_i^o obtained in step two, generating a new data set, which can be expressed as S2 = {(x_1^o, y_1), (x_2^o, y_2), …, (x_m^o, y_m)};
Step four, send the new data set S2 of step three into a support vector machine (SVM) model for training, and obtain the parameters of the SVM classifier model after repeated parameter tuning, until the model error falls into a reasonable interval.

Step five, realize pattern recognition of the olfactory system using the SVM model parameters of step four.
The data preprocessing in step one uses the Min-Max function for normalization, mapping each original value x into the interval [0, 1]; this handles the problem of differing dimensions among the olfactory signals. Meanwhile, the labels y_i in step one are one-hot encoded, with as many encoding dimensions as there are values of the gas category feature.
In one example of the invention, there are a total of 6 gases among the labels in step one; the first gas takes the one-hot code [1,0,0,0,0,0], the second gas [0,1,0,0,0,0], and so on.
The deep autoencoder algorithm framework in step two is constructed in the following form, as shown in FIG. 2; the specific steps are as follows:
first, construct a deep autoencoder network comprising an input layer, an output layer, and n hidden layers (2 ≤ n ≤ 20); initialize the network structure and determine the node numbers [128, 6, 64], i.e. the input layer has 128 neurons, the output layer contains 6 neurons, and the hidden layer has 64 neurons;
secondly, coding dimension reduction: to input layer and first hiddenThe neuronal nodes of the layers are connected according to the formula f (x) =f (w i x i +b i ) Encoding the input layer and the first hidden layer, w i As a weight matrix, b i As bias item, f is the mapping function of coding, repeating the coding step until the middle hidden layer is connected;
finally, decoding and reconstructing: connecting the most intermediate hidden layer with the neurons of the subsequent hidden layer according to the formulaLayer-by-layer reconstruction is performed until it is connected to the last output layer, and the reconstruction process decodes a vector of the same size as the original size by the function g.
Further, during the update training of the network weights in step two, the adopted loss function (Loss Function) is the cross-entropy loss L(y, ŷ) = -Σ_i [ y_i · log(ŷ_i) + (1 - y_i) · log(1 - ŷ_i) ], where ŷ is the predicted output and y is the true label value. According to the loss-minimization criterion Q_New = argmin_Q L(y, ŷ), the parameter set Q is continuously optimized until the optimal solution Q_New is reached; the symbol Q denotes the parameter set formed by all weights w_i and biases b_i, and Q_New denotes the updated parameter set. Compared with other loss functions, this loss is monotonic over the whole curve, and a larger loss yields a larger gradient, which facilitates gradient-descent back-propagation and optimization.
Further, the parameter Q_New in step two is optimized according to the Adam adaptive-learning-rate gradient descent method; the Adam algorithm tracks the first-moment estimate with decay coefficient α1 and the second-moment (squared-gradient) estimate with decay coefficient α2, and performs the gradient-descent parameter update according to the following formulas:

m_t = α1 · m_(t-1) + (1 - α1) · g_t
v_t = α2 · v_(t-1) + (1 - α2) · g_t^2
m̂_t = m_t / (1 - α1^t)
v̂_t = v_t / (1 - α2^t)
Q_New = Q_New-1 - γ · m̂_t / (sqrt(v̂_t) + θ)

where g_t is the gradient at iteration t; m̂_t and v̂_t represent the first-moment mean value and second-moment variance value respectively; Q_New-1 is the parameter set of the previous iteration relative to Q_New; γ is the user-defined learning rate; and θ is a minimal constant preventing the denominator from being 0, usually 10e-8.
In one embodiment of the present invention, the construction of the autoencoder in step two can be implemented on the Keras deep learning framework; Keras is an open-source artificial neural network library written in the Python language, and is suitable for the model design, debugging, evaluation, application, visualization, and so on, of the machine olfactory system of the invention.
The new feature set x_i^o in step three is obtained by directly selecting the data output by the middle-most hidden layer of the autoencoder as the finally selected features; these representative features form the new data set S2.
In the training process of the SVM classifier in step four, the hinge loss (Hinge Loss) is adopted to determine the error, defined as L = Σ_i max(0, 1 - y_i · ŷ_i), where y_i is the true label value and ŷ_i is the distance of the predicted point to the separating hyperplane;
further, the training parameter tuning in step four refers to adjusting two important parameters of the SVM model, c (penalty factor) and gamma (Gaussian kernel parameter); the optimal parameters can be determined with ten-fold cross-validation (10-fold cross-validation). The model parameter tuning also has the following characteristics: the model solving method is the Adam adaptive-learning-rate gradient descent method, with the initial momentum set to 0.9, the initial step size (learning rate) set to 0.1, and the iteration period set to 1000.
In a preferred embodiment of the present invention, the training and parameter tuning of the SVM classifier model in step four can be implemented with the Scikit-learn machine learning toolkit; one only needs to send the new data set S2 obtained in step three into the toolkit for debugging. In the parameter-tuning operation of the olfactory SVM classifier, ten-fold cross-validation divides the data set into 10 groups: 9 groups form the training set of the model and the remaining group serves as its validation set, and the cross-validation result takes the average accuracy of the 10 classifiers on their validation sets, which helps prevent overfitting of the model.
The olfactory pattern recognition method in step five also has the following characteristics: when the machine olfactory system acquires only a single new sample, steps one and two are repeated to extract its features, and the SVM classifier obtained in step four is then used to identify the new sample; when the machine olfactory system has newly acquired a large number of labeled samples, however, steps one through four are repeated, thereby retraining the DA and SVM models to update them.
To better illustrate the overall effect of the invention, the published UCI machine olfaction database (http://archive.ics.uci.edu/ml/data/gas+sensor+array+drift+data) was also selected for test verification. This database took 3 years to collect 13910 samples covering 6 analytes, including acetone, ethanol, acetaldehyde, ethylene, ammonia, and toluene; each sample is a feature vector containing 128 dimensions. Using this database, the invention also follows the procedure of the literature [Vergara A, Vembu S, Ayhan T, et al. Chemical gas sensor drift compensation using classifier ensembles. Sensors and Actuators B: Chemical, 2012, 166: 320-329]: all data are divided into 10 batches, and 4 different pattern recognition methods are compared, as shown in FIG. 3. Test 1 is a conventional SVM recognition algorithm, Test 2 is a recognition algorithm augmented with bagging, Test 3 is the DA-SVM pattern recognition algorithm of the invention, and Test 4 is a recognition algorithm based on a random forest model. The hardware platform for the tests is a portable computer equipped with a GTX 1060Ti graphics processor (GPU) and 6.0 GB of RAM, which meets the training requirements of all the tests and algorithms.
From the final measured results of FIG. 3 it can be observed that: the conventional SVM classifier of Test 1 and the random forest classifier of Test 4 perform comparably in gas recognition, with average accuracies of about 84% and 82% and worst-case accuracies of about 68% and 59%, respectively; the bagging pattern recognition algorithm of Test 2 performs worst, in particular with the worst stability, e.g. the accuracy difference between batch 2 and batch 10 can exceed 80%; the DA-SVM classifier adopted in Test 3 of the invention reaches an average accuracy as high as 96%, a large advantage over the other algorithms, and in this test the established DA part automatically reduces the 128-dimensional features of a single sample to 64, while the worst case in the results still keeps an accuracy of 90%.
The foregoing is an example of the present invention and is not intended to limit the invention. All equivalents and alternatives falling within the scope of the invention are intended to be included within its scope. What is not elaborated in the invention belongs to the prior art known to the person skilled in the art.
Claims (8)
1. The machine olfactory pattern recognition method based on the DA-SVM is characterized by comprising the following steps of:
step one, acquire an original data set S1 of the olfactory system, then normalize and label the data set;
step two, construct a deep autoencoder; remove the label column of the original data set S1 of step one, take the remaining data as the input of the network, and output a new feature data set after repeated iterative training. First, construct a deep autoencoder network containing an input layer, an output layer, and n hidden layers, with 2 ≤ n ≤ 20; initialize the network structure and determine the node numbers [128, 6, 64], i.e. the input layer has 128 neurons, the output layer contains 6 neurons, and the hidden layer has 64 neurons;
next, encoding for dimension reduction: connect the input layer to the neuron nodes of the first hidden layer and encode the input layer and the first hidden layer according to the formula f(x) = f(w_i·x_i + b_i), where w_i is a weight matrix, b_i is a bias term, f is the encoding mapping function, and x_i is a feature of the raw sample data; repeat the encoding step until the middle hidden layer is connected;
finally, decoding and reconstruction: connect the middle-most hidden layer to the neurons of the subsequent hidden layer and reconstruct layer by layer according to the formula g(x) = g(w_i^T·x_i + b_i) until the final output layer is connected, where g is the decoding mapping function and the superscript T denotes the transpose of the vector; the reconstruction process decodes, according to the function g, a vector of the same size as the original;
step three, reattaching the labels from step one to the new feature data set obtained in step two to generate a new data set S2;
step four, feeding the new data set S2 from step three into a support vector machine model for training, and obtaining the parameters of the SVM classifier model through repeated parameter tuning until the model error falls within a reasonable interval;
step five, using the SVM model parameters from step four to realize pattern recognition for the olfactory system; when the machine olfaction system acquires only a new sample, steps one and two are repeated to extract its features, and the SVM classifier obtained in step four is then used to identify the new sample; however, when the machine olfactory system newly acquires a large number of labeled samples, steps one through four are repeated, retraining the DA and SVM models so as to update them.
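Purely as an illustrative sketch (not part of the claimed method), the encode/decode steps of claim 1 can be mimicked with a tiny tied-weight autoencoder; the data, layer sizes, learning rate and training loop below are made-up stand-ins for the [128, 6, 64] structure described above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the sensor features of claim 1 (8-dim here instead of 128,
# a 4-dim code instead of 64); all sizes are illustrative assumptions.
X = rng.normal(size=(200, 8))

n_in, n_code = X.shape[1], 4
W = rng.normal(scale=0.1, size=(n_in, n_code))  # encoder weights w_i
b = np.zeros(n_code)                            # encoder bias b_i
c = np.zeros(n_in)                              # decoder bias (weights tied: W^T)

def forward(X):
    h = np.tanh(X @ W + b)   # encoding step: f(x) = f(w_i x_i + b_i)
    Xr = h @ W.T + c         # decoding step with transposed (tied) weights
    return h, Xr

h, Xr = forward(X)
loss0 = np.mean((Xr - X) ** 2)           # reconstruction error before training
lr = 0.1
for _ in range(500):
    h, Xr = forward(X)
    err = 2 * (Xr - X) / X.size          # dL/dXr for the MSE loss
    dpre = (err @ W) * (1 - h ** 2)      # back-prop through the tanh encoder
    gW = X.T @ dpre + err.T @ h          # encoder + decoder weight gradients
    W -= lr * gW
    b -= lr * dpre.sum(axis=0)
    c -= lr * err.sum(axis=0)

h, Xr = forward(X)
loss1 = np.mean((Xr - X) ** 2)           # lower than loss0 after training
# h (the middle-layer code) would then be relabeled (step three) and
# fed into an SVM classifier (step four).
```

In the patented pipeline the code layer output `h` replaces the raw 128-dimensional sensor vector before SVM training.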
2. The method of claim 1, wherein step one uses the Min-Max function for normalization, mapping raw values to standard values in the interval [0, 1].
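As a minimal illustration of the Min-Max normalization in claim 2 (the sample values are hypothetical):

```python
def min_max(column):
    """Map a column of raw sensor readings onto the interval [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

scaled = min_max([2.0, 4.0, 10.0])
print(scaled)  # [0.0, 0.25, 1.0]
```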
3. The method for identifying a machine olfactory pattern based on a DA-SVM of claim 1, wherein the labels in said step one are in one-hot code form.
4. The machine olfactory pattern recognition method based on DA-SVM according to claim 1, wherein the iterative training process in step two uses a loss function to measure the error of each iterative calculation and finally obtains the optimal parameters; the selected loss function is the cross-entropy loss L(y, ŷ) = -Σ_i [y_i·log(ŷ_i) + (1 - y_i)·log(1 - ŷ_i)], where ŷ is the predicted value of the true label y; according to the loss-minimization criterion Q_New = argmin_Q L(y, ŷ), the parameter Q is continuously optimized until the optimal solution Q_New is reached; the symbol Q denotes the parameter set formed by all weights w_i and biases b_i, Q_New denotes the updated parameter set, and argmin_Q is shorthand for the minimization algorithm over the parameter Q.
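A small sketch of the cross-entropy loss of claim 4 (the label and prediction vectors are hypothetical; `y` is one-hot per claim 3):

```python
import math

def cross_entropy(y, y_hat):
    # L = -sum over classes of [y*log(yhat) + (1-y)*log(1-yhat)], as in claim 4
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for t, p in zip(y, y_hat))

y = [0, 1, 0]                               # one-hot true label
good = cross_entropy(y, [0.05, 0.9, 0.05])  # confident, correct prediction
bad = cross_entropy(y, [0.4, 0.3, 0.3])     # diffuse, wrong prediction
print(good < bad)  # True: the better prediction has lower loss
```

Minimizing this loss over all weights and biases yields the parameter set Q_New of claim 4.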
5. The method for identifying a machine olfactory pattern based on a DA-SVM according to claim 4, wherein said parameter Q_New is optimized according to the Adam adaptive-learning-rate gradient descent method; the Adam algorithm designs an independent adaptive learning rate for each parameter by computing first-moment and second-moment estimates of the gradient, thereby realizing the gradient-descent parameter update; specifically, the Adam algorithm proceeds according to the following formulas:
m_t = α1·m_(t-1) + (1 - α1)·g_t
v_t = α2·v_(t-1) + (1 - α2)·g_t²
m̂_t = m_t / (1 - α1^t)
v̂_t = v_t / (1 - α2^t)
Q_New = Q_(New-1) - γ·m̂_t / (√v̂_t + θ)
where g_t is the gradient at iteration t, m̂_t and v̂_t denote the bias-corrected first-moment mean and second-moment variance respectively, m_t and v_t are the first-moment and second-moment gradient momenta, α1 and α2 are the respective decay coefficients, taking the values 0.9 and 0.999, Q_(New-1) is the parameter set of the previous iteration relative to Q_New, γ is the user-defined learning rate, the subscript t denotes the t-th iterative calculation, and θ is a small value preventing the denominator from being 0, generally taken as 10e-8.
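The Adam update of claim 5 can be sketched for a single scalar parameter as follows; the test objective f(q) = (q - 3)² and the learning rate are illustrative assumptions:

```python
import math

def adam_step(q, grad, m, v, t, gamma=0.001, a1=0.9, a2=0.999, theta=1e-8):
    """One Adam update as described in claim 5 (scalar parameter)."""
    m = a1 * m + (1 - a1) * grad          # first-moment momentum m_t
    v = a2 * v + (1 - a2) * grad ** 2     # second-moment momentum v_t
    m_hat = m / (1 - a1 ** t)             # bias-corrected mean
    v_hat = v / (1 - a2 ** t)             # bias-corrected variance
    q = q - gamma * m_hat / (math.sqrt(v_hat) + theta)
    return q, m, v

# Minimize f(q) = (q - 3)^2, whose gradient is 2*(q - 3).
q, m, v = 0.0, 0.0, 0.0
for t in range(1, 3001):
    q, m, v = adam_step(q, 2 * (q - 3), m, v, t, gamma=0.01)
# q converges to (and jitters slightly around) the minimizer 3.0
```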
6. The method as claimed in claim 1, wherein the new feature data set in step three directly selects the data output by the middle hidden layer of the autoencoder as the finally selected features, and these representative features are used to form the new data set S2.
7. The machine olfactory pattern recognition method based on the DA-SVM of claim 1, wherein, in the training process of the SVM classifier in step four, a hinge loss function is used to determine the error, defined as L(y_i, ŷ_i) = max(0, 1 - y_i·ŷ_i), where y_i is the true label value and ŷ_i is the distance from the predicted point to the separating hyperplane.
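The hinge loss of claim 7 is a one-liner; the margin values below are hypothetical:

```python
def hinge_loss(y_true, margin):
    """Hinge loss from claim 7: max(0, 1 - y*f(x)).
    y_true is in {-1, +1}; margin is the signed distance f(x) to the hyperplane."""
    return max(0.0, 1.0 - y_true * margin)

print(hinge_loss(+1, 2.5))  # correct side, beyond the margin -> 0.0
print(hinge_loss(+1, 0.4))  # correct side, inside the margin -> 0.6
print(hinge_loss(-1, 0.4))  # wrong side of the hyperplane -> 1.4
```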
8. The machine olfactory pattern recognition method based on DA-SVM of claim 5, wherein the training parameter adjustment in step four adjusts two important parameters of the SVM model, the penalty factor and the Gaussian kernel, and the optimal parameters are determined by ten-fold cross-validation; the model parameter tuning further has the following characteristics: the model is solved by the adaptive-learning-rate gradient descent method, the initial momentum is set to 0.9, the initial step size is set to 0.1, and the iteration period is set to 1000.
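The ten-fold cross-validation grid search of claim 8 can be sketched as follows; `fold_score` is a hypothetical stand-in for "train an SVM with (C, gamma) and return the fold accuracy", and the grid values are illustrative:

```python
def k_fold_indices(n, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for f in folds:
        test = set(f)
        yield [i for i in range(n) if i not in test], f

# Hypothetical scorer: a real implementation would fit SVM(C, gamma) on
# train_idx and measure accuracy on test_idx. This toy peaks at C=1, gamma=0.1.
def fold_score(C, gamma, train_idx, test_idx):
    return 1.0 / (1.0 + abs(C - 1.0) + abs(gamma - 0.1))

grid = [(C, g) for C in (0.1, 1.0, 10.0) for g in (0.01, 0.1, 1.0)]
best = max(grid, key=lambda p: sum(fold_score(p[0], p[1], tr, te)
                                   for tr, te in k_fold_indices(50, 10)) / 10)
print(best)  # (1.0, 0.1)
```

The pair with the highest mean fold score becomes the final penalty factor and kernel width.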
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010161893.3A CN111340132B (en) | 2020-03-10 | 2020-03-10 | Machine olfaction mode identification method based on DA-SVM |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111340132A CN111340132A (en) | 2020-06-26 |
CN111340132B true CN111340132B (en) | 2024-02-02 |
Family
ID=71182212
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010161893.3A Active CN111340132B (en) | 2020-03-10 | 2020-03-10 | Machine olfaction mode identification method based on DA-SVM |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111340132B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112367338A (en) * | 2020-11-27 | 2021-02-12 | 腾讯科技(深圳)有限公司 | Malicious request detection method and device |
CN113378935B (en) * | 2021-06-11 | 2022-07-01 | 中国石油大学(华东) | Intelligent olfactory sensation identification method for gas |
CN113506596B (en) * | 2021-09-08 | 2022-11-15 | 汉王科技股份有限公司 | Method and device for screening olfactory receptor, model training and identifying wine product |
CN113808197A (en) * | 2021-09-17 | 2021-12-17 | 山西大学 | Automatic workpiece grabbing system and method based on machine learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1482453A (en) * | 2003-07-11 | 2004-03-17 | 华东理工大学 | Machine olfaction odor distinguishing method based on modularized composite neural net |
CN103544392A (en) * | 2013-10-23 | 2014-01-29 | 电子科技大学 | Deep learning based medical gas identifying method |
CN105913079A (en) * | 2016-04-08 | 2016-08-31 | 重庆大学 | Target domain migration extreme learning-based electronic nose heterogeneous data identification method |
CN108760829A (en) * | 2018-03-20 | 2018-11-06 | 天津大学 | A kind of electronic nose recognition methods based on bionical olfactory bulb model and convolutional neural networks |
Non-Patent Citations (2)
Title |
---|
Souhir Bedoui, Hekmet Samet and Mounir Samet. Gases Identification with Support Vector Machines Technique (SVMs). 1st International Conference on Advanced Technologies for Signal and Image Processing (ATSIP'2014). 2014, full text. *
Electronic nose pattern recognition algorithm based on online support vector machine; Yu Wei et al.; Journal of Northwest University; full text. *
Also Published As
Publication number | Publication date |
---|---|
CN111340132A (en) | 2020-06-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||