CN103544392B - Medical gas identification method based on deep learning - Google Patents

Medical gas identification method based on deep learning

Info

Publication number
CN103544392B
CN103544392B (application CN201310503402.9A)
Authority
CN
China
Prior art keywords
layer
parameter
represent
sigma
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310503402.9A
Other languages
Chinese (zh)
Other versions
CN103544392A (en)
Inventor
刘启和
陈雷霆
蔡洪斌
邱航
蒲晓蓉
胡晓楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201310503402.9A
Publication of CN103544392A
Application granted
Publication of CN103544392B
Expired - Fee Related
Anticipated expiration


Abstract

The invention discloses a medical gas identification method based on deep learning. The raw frequency-response signal is simply normalized and then fed into a stacked autoencoder network, which learns an abstract representation of the raw data through layer-by-layer extraction. The network thus hides the processes of feature extraction, dimensionality reduction and drift suppression from the outside, and a classification layer added on top of the network lets these features enter the classifier directly. Training is divided into two stages, pre-training and fine-tuning, which effectively improves the learning capacity of the network; once training is complete, a new sample fed into the network directly yields the predicted class. The method automatically extracts effective discriminative features of medical gases, merges feature extraction, feature selection and drift suppression into a single step, greatly reduces the complexity of traditional methods, and improves the efficiency of gas detection and identification.

Description

Medical gas identification method based on deep learning
Technical field
The invention belongs to the field of biomedical technology, and specifically relates to a medical gas identification method.
Background technology
Machine olfaction is a kind of artificial intelligence system. Its basic principle is: odor molecules are adsorbed by a sensor array and produce electrical signals; various signal-processing techniques then extract features, and a computer pattern-recognition system makes the final judgment, completing tasks such as gas identification and concentration measurement. The electronic nose system is a typical application of machine olfaction and plays a very important role in the medical domain, for example diagnosing certain diseases, identifying bacterial species in blood, and detecting gases harmful to the respiratory system.
Gas detection and identification by sensing have important applications in the medical domain. For example, electronic-nose equipment can collect sample data from the oral cavity, thoracic cavity or blood; various signal-processing techniques then analyze and process the data, and a computer pattern-recognition system makes a judgment, completing tasks such as disease diagnosis, pathogen identification and drug-concentration determination.
Traditional gas detection and identification methods generally comprise steps such as feature extraction and feature selection, and finally reach the preset goal by means such as classification, regression or clustering. For equipment that must be used over a long period, effective sensor drift-compensation techniques are needed to suppress the influence of drift. In medical applications these conventional operations are complicated and relatively inefficient, so a compromise between accuracy and real-time performance is usually required.
The data obtained by sensor sampling can be regarded as a time-series signal. The signal structure is complicated, hard to interpret, and often very high-dimensional. To identify it well, features usually have to be designed according to various attributes of the signal, followed by feature selection such as dimensionality reduction, before the features can serve as input to a classification algorithm such as a support vector machine.
Sensor drift means that, as time goes on, the response of a sensor changes slowly and randomly. Such changes prevent the pattern-recognition system from applying the currently learned model to subsequent test samples, so the accuracy of gas detection and identification gradually decreases. In medical applications there are generally two measures to suppress the influence of sensor drift: (1) develop an effective drift-compensation technique; this process is often separate from feature extraction, complicated to operate, and inefficient; (2) since the degree of drift within a short time is small, periodically maintain and replace the electronic-nose equipment to keep the sampled data reliable and stable; but this substantially increases cost and shortens the service life of the equipment.
In fact, some well-designed features have good robustness to drift. From this angle, sensor drift can be suppressed simply by extracting better features, thereby fusing the two processes together. Deep learning builds artificial neural networks containing multiple hidden layers that imitate the human brain in analyzing, learning from and interpreting data; it can obtain highly abstract representations of the data and is good at discovering latent patterns, making it well suited to the above problems.
The document "M. Trincavelli, S. Coradeschi, A. Loutfi, B. Söderquist, P. Thunberg, Direct identification of bacteria in blood culture samples using an electronic nose, IEEE Trans. Biomedical Engineering, 57(12), 2884-2890, 2010" proposes an effective method for identifying pathogens in blood-culture specimens. The method first acquires sample data with electronic-nose equipment, then performs feature extraction and dimensionality reduction, and finally completes classification with a support vector machine; in the feature-extraction part, two feature-extraction methods are applied to the whole signal waveform: steady-state response and response derivative.
Sometimes, to obtain higher recognition accuracy on complicated problems, the signal waveform must be analyzed more carefully and higher-dimensional features extracted. The document "A. Vergara, S. Vembu, T. Ayhan, M. A. Ryan, M. L. Homer and R. Huerta, Chemical gas sensor drift compensation using classifier ensembles, Sensors and Actuators B: Chemical, vol. 166-167, pp. 320-329, May 2012" studies how to improve the recognition accuracy of gases such as ethanol under drift, and designs 8 different kinds of features.
When the classification algorithm is fixed, the recognition accuracy for a gas depends solely on the quality of the features. Compared with the raw frequency-response values of the original signal, well-designed features can greatly reduce dimensional redundancy while highlighting the differences between classes, and usually yield fairly good recognition accuracy.
Hand-designed features, however, usually target a specific application scenario (gas type, sensor type, external environment, etc.); they are therefore highly purpose-specific and generalize poorly. Moreover, because of the cross-sensitivity of sensors, the dimensionality of the finally extracted features is still very high, so an efficient dimensionality-reduction algorithm such as PCA or LDA is usually needed. If in some new application none of the existing features reaches the required recognition accuracy, better features must be designed, which undoubtedly further increases the complexity of the task.
At present the most effective way to suppress drift is drift compensation through periodic recalibration. Its general idea is to find a linear transformation that normalizes the sensor responses so that the classifier can be applied directly to the transformed data.
CN1514239A discloses a method for detecting and correcting gas sensor drift. By combining principal component analysis and wavelet-transform techniques, the method improves the sensitivity and accuracy of drift detection. For a sensor in which drift has been detected, a correction method based on an adaptive drift model corrects the sensor output online, while the drift model itself can be updated online, thereby improving the reliability of the sensing system and extending its service life.
The document "T. Artursson, T. Eklöv, I. Lundström, P. Mårtensson, M. Sjöström and M. Holmberg, Drift correction for gas sensors using multivariate methods, J. Chemom., vol. 14, no. 5-6, pp. 711-723, 2000" uses a reference gas to approximately estimate the drift direction, and then corrects the responses to the gases under analysis accordingly.
These methods, however, assume that the drift trajectory of the sensor is linear, which has not been confirmed, and they generally require a reference gas whose chemical properties are stable over a long period and whose behavior on the sensor array is similar to that of the gases under analysis; in practical applications this condition is undoubtedly very harsh. In addition, these methods are quite complicated to operate in practice and their efficiency is very low.
Summary of the invention
The object of the invention is to simplify the complexity of traditional gas detection and identification methods, and to develop a simpler and more efficient gas detection and identification method that is more robust to sensor drift.
The scheme of the invention takes the raw frequency-response signal, applies a simple normalization to it, and then feeds it into a stacked autoencoder network. Through layer-by-layer extraction, the network finally learns an abstract representation of the raw data, hiding the processes of feature extraction, dimensionality reduction and drift suppression from the outside; a classification layer is finally added to the network so that these features can enter the classifier directly. Training is divided into two stages, pre-training and fine-tuning, which effectively improves the learning capacity of the network; after training, a new sample fed into the network directly yields the predicted class.
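As a concrete anchor for the layer rule $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$ used throughout the steps below, here is a minimal NumPy sketch; the dimensions and random data are purely illustrative assumptions, not values from the disclosure.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation used by every layer of the network."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
a0 = rng.random((16, 5))                 # 16-dim normalized inputs, 5 samples
W1 = rng.standard_normal((8, 16)) * 0.1  # weight matrix W^(1)
b1 = np.zeros((8, 1))                    # bias vector b^(1)
a1 = sigmoid(W1 @ a0 + b1)               # hidden activation a^(1), shape (8, 5)
```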
The technical scheme of the invention is a medical gas identification method based on deep learning, comprising the following steps:
Step 1. Data normalization. Suppose there are m samples, each organized in the form $v = [s_1, s_2, \ldots, s_t]$, where $s_i$ is the i-th frequency-response value and there are t response values in total. The whole gas data set and the corresponding labels can be expressed as:
$$V = [v_1^T, v_2^T, \ldots, v_i^T, \ldots, v_m^T]$$
$$Y = [y_1, y_2, \ldots, y_i, \ldots, y_m]^T$$
where T denotes the transpose of a vector, the i-th row $v_i$ of the matrix V represents the i-th sample, and the i-th element of Y is the class label of the corresponding sample;
the data set is normalized to [0, 1] using the formula $V_{i,j} \leftarrow L + (U - L)\,\frac{V_{i,j} - \min_i}{\max_i - \min_i}$,
where $V_{i,j}$ denotes the i-th frequency-response value of the j-th sample, L is the lower bound of normalization, with value 0, U is the upper bound of normalization, with value 1, $\max_i$ and $\min_i$ are the maximum and minimum of each row of the matrix, and the normalized data set is denoted $a^{(0)}$;
Step 2. Pre-train the stacked autoencoder network. In the stacked autoencoder network, v, h and y denote the input layer, hidden layer and output layer respectively, $W^{(i)}$ is the weight matrix connecting the layers, and $b^{(i)}$ is the bias vector of the hidden layer;
Step 2.1. Train the first layer, i.e., the first autoencoder, whose objective function is:
$$J = \frac{1}{2m}\sum_i \lVert v_i - \hat{v}_i \rVert^2 + \frac{\lambda}{2}\sum_i\sum_j W_{ij}^2 + \beta\sum_j\Big[\rho\log\frac{\rho}{p_j} + (1-\rho)\log\frac{1-\rho}{1-p_j}\Big];$$
where the first term is the reconstruction-error term, representing the difference between input and output: $v_i$ denotes the i-th input sample after the normalization of step 1, and $\hat{v}_i$ denotes the output of the network at the output layer for sample $v_i$. The second term is the weight-decay term, which reduces the magnitude of the weights to prevent over-fitting; $W_{ij}$ denotes the weight between the j-th unit of the current layer and the i-th unit of the next layer. The third term is the sparsity penalty, where $p_j$ denotes the average activation of hidden unit j; λ, β and ρ are preset parameters, m is the number of samples, and J denotes the objective function of the first autoencoder;
The objective function is optimized as follows; for an n-layer autoencoder the concrete optimization steps are:
Step 2.1.1. Randomly initialize the parameters $W^{(i)}$, $b^{(i)}$, and initialize all-zero matrices and vectors $\Delta W^{(i)} = 0$, $\Delta b^{(i)} = 0$;
Step 2.1.2. For each sample, use the back-propagation algorithm to compute the partial derivatives $\nabla_{W^{(i)}} J(W, b)$ and $\nabla_{b^{(i)}} J(W, b)$. The detailed process is as follows:
feedforward computation obtains the activation $a^{(i)}$ of every layer by the formula $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$, where $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid function, whose output range is [0, 1];
for the output layer, compute the residual $\delta^{(n)} = -(v - a^{(n)}) \cdot \sigma'(z^{(n)})$, where "·" denotes element-wise multiplication, $z^{(n)} = W^{(n-1)} a^{(n-1)} + b^{(n-1)}$, and σ′ denotes the derivative of σ(x);
for each layer l = n-1, n-2, …, 2, compute $\delta^{(l)} = ((W^{(l)})^T \delta^{(l+1)}) \cdot \sigma'(z^{(l)})$;
compute the partial derivatives $\nabla_{W^{(i)}} J(W, b) = \delta^{(i+1)} (a^{(i)})^T$ and $\nabla_{b^{(i)}} J(W, b) = \delta^{(i+1)}$, where $\nabla_{W^{(i)}} J$ denotes the partial derivative of J(W, b) with respect to $W^{(i)}$ and $\nabla_{b^{(i)}} J$ the partial derivative with respect to $b^{(i)}$;
Step 2.1.3. Add the partial derivatives so obtained to $\Delta W^{(i)}$ and $\Delta b^{(i)}$ respectively: $\Delta W^{(i)} := \Delta W^{(i)} + \nabla_{W^{(i)}} J$, $\Delta b^{(i)} := \Delta b^{(i)} + \nabla_{b^{(i)}} J$;
Step 2.1.4. Update the parameters: $W^{(i)} := W^{(i)} - \alpha\big(\frac{1}{m}\Delta W^{(i)} + \lambda W^{(i)}\big)$, $b^{(i)} := b^{(i)} - \alpha\,\frac{1}{m}\Delta b^{(i)}$, where α is the learning rate;
Step 2.1.5. Repeat steps 2.1.2 to 2.1.4, gradually reducing the value of the objective function, until the set threshold is reached; this yields the coding-layer parameters (W, b) and the decoding-layer parameters $(\hat{W}, \hat{b})$;
Step 2.2. After training, the decoding layer $(\hat{W}, \hat{b})$ is discarded, and the coding-layer parameters (W, b) serve as the initial parameters of the corresponding level of the stacked autoencoder network, i.e. $W^{(1)} = W$, $b^{(1)} = b$;
Step 2.3. Compute the hidden-layer activation of the current autoencoder: $a^{(1)} = \sigma(W^{(1)} a^{(0)} + b^{(1)})$;
Step 2.4. Train the second layer, i.e., the second autoencoder, on the activation $a^{(1)}$: the hidden layer of the first autoencoder serves as the input layer of the second, and the training process is identical to that of the first layer except that the input becomes $a^{(1)}$. Training yields the initial parameters $W^{(2)}$, $b^{(2)}$ of the second layer and the hidden-layer activation $a^{(2)}$;
Step 2.5. For the third to the n-th layer, repeat the process of steps 2.1 to 2.4 to obtain the initial parameters of every hidden layer, and finally the activation $a^{(n)}$ of the n-th hidden layer; this activation also serves as the input of the softmax layer and is denoted $a^S$;
Step 2.6. Train the last layer of the network, i.e., the softmax classifier, with the $a^S$ obtained in step 2.5 and the labels Y, obtaining the initial parameter $W^S$ of the last layer;
Let x and θ denote $a^S$ and $W^S$ respectively, and assume there are k classes in total. For the i-th sample, the predicted probability that its class label is j is:
$$P(y_i = j \mid x_i; \theta) = \frac{\exp(\theta_j^T x_i)}{\sum_{l=1}^{k} \exp(\theta_l^T x_i)}$$
where $\theta_j$ denotes the j-th row of θ, i.e., the row vector of weights connecting the j-th output unit to all input units; l is a counting variable with 1 ≤ l ≤ k; k is the number of classes for the softmax-layer input $a^S$ and the initial parameter $W^S$ of the softmax classifier; and $x_i$ is the softmax-layer input for the i-th sample. The final output is a probability column vector P, whose j-th component represents the probability that the sample is judged to belong to the j-th class. The weight matrix θ is trained by minimizing the loss function:
$$J(\theta) = -\frac{1}{m}\Big[\sum_{i=1}^m \sum_{j=1}^k 1\{y_i = j\}\,\log P(y_i = j)\Big] + \frac{\lambda}{2}\sum_{i=1}^m\sum_{j=1}^n \theta_{ij}^2$$
where $\log P(y_i = j)$ denotes the natural logarithm of the probability $P(y_i = j)$, and $1\{\cdot\}$ is the indicator function, whose value is 1 when the condition in the braces is true and 0 otherwise; m is the number of samples and n is the number of layers of the autoencoder;
Step 3. Fine-tuning: the network is regarded as a whole, the partial derivatives of the parameters of every layer are computed by back-propagation, and gradient descent is then used for iterative optimization. The detailed process is as follows:
Step 3.1. Use the formula $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$ for feedforward computation to obtain the activation $a^{(i)}$ of every layer;
Step 3.2. Compute the partial derivative $\nabla_{W^S} J$ of the softmax-layer parameter $W^S$ from the derivative of the loss function J(θ) of step 2.6, where P is the conditional probability vector computed in step 2.6;
Step 3.3. Compute the residual of the last hidden layer: $\delta^{(n)} = \nabla_{a^{(n)}} J \cdot \sigma'(z^{(n)})$, where $\nabla_{a^{(n)}} J$ denotes the partial derivative of J(W, b) with respect to $a^{(n)}$, the activation of the n-th hidden layer;
Step 3.4. For each layer l = n-1, n-2, …, 2, compute $\delta^{(l)} = ((W^{(l)})^T \delta^{(l+1)}) \cdot \sigma'(z^{(l)})$;
Step 3.5. Compute the partial derivatives of each hidden layer: $\nabla_{W^{(i)}} J = \delta^{(i+1)} (a^{(i)})^T$, $\nabla_{b^{(i)}} J = \delta^{(i+1)}$;
Step 3.6. Use the partial derivatives obtained above to update the parameters of every layer:
$$W^{S\prime} = W^S - \alpha\Big(\tfrac{1}{m}\textstyle\sum \nabla_{W^S} J + \lambda\theta\Big),\qquad W^{(i)\prime} = W^{(i)} - \alpha\Big(\tfrac{1}{m}\textstyle\sum \nabla_{W^{(i)}} J\Big),\qquad b^{(i)\prime} = b^{(i)} - \alpha\Big(\tfrac{1}{m}\textstyle\sum \nabla_{b^{(i)}} J\Big);$$
where $\nabla_{W^S} J$ denotes the partial derivative of J(W, b) with respect to $W^S$, the initial parameter of the softmax classifier, $\nabla_{W^{(i)}} J$ the partial derivative with respect to $W^{(i)}$, and $\nabla_{b^{(i)}} J$ the partial derivative with respect to $b^{(i)}$;
Step 3.7. Repeat the above steps, reducing the value of the objective function by iteration, until the set threshold is reached;
Step 4. Predict the class of a test sample. The detailed process is as follows:
Step 4.1. Normalize the test sample $v_p$ to [0, 1];
Step 4.2. For the hidden layers, use the formula $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$ layer by layer for feedforward computation, obtaining the softmax-layer input $a^S$;
Step 4.3. Compute the conditional probability vector P according to the probability formula of step 2.6; the class corresponding to its largest component is the predicted class of the sample.
In the above formulas, i and j denote counting indices.
The beneficial effects of the method are: the invention designs a network structure adapted to medical gas-signal processing, in which features are extracted from the input sample layer by layer, so that what finally enters the classification layer has relatively low dimensionality and good robustness to drift. Compared with traditional feature-extraction methods, this method automatically extracts effective discriminative features of medical gases, merges steps such as feature extraction, feature selection and drift suppression into a single one, greatly reduces the complexity of traditional methods, and improves the efficiency of gas detection and identification. This is embodied in the following aspects:
(1) Except for training the softmax classifier in step 2, no other process requires class labels, so the feature-extraction process is unsupervised; if labeled samples are scarce, a large number of unlabeled samples can be used to train all layers before the classification layer, which is then fine-tuned with a small number of labeled samples;
(2) From the network structure it can be seen that every layer has fewer units than the preceding layer, so the input dimensionality that finally enters the classifier is low, much smaller than the original input; this can be regarded as a dimensionality-reduction process;
(3) Feature extraction is performed automatically without manual intervention, eliminating the complexity of hand-designed features while retaining wide applicability;
(4) The extracted features are very robust to drift, which effectively improves the accuracy of gas detection and identification under drift and extends the service life of the equipment.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the medical gas identification method of the embodiment of the present invention.
Fig. 2 shows the stacked autoencoder network for medical gas identification of the embodiment of the present invention.
Fig. 3 shows an autoencoder containing one hidden layer, as used in the embodiment of the present invention.
Detailed description of the invention
The embodiments of the invention are further described below in conjunction with the accompanying drawings.
The overall flow of the gas identification method of the invention is shown in Fig. 1:
Step 1. Data normalization. Suppose there are m samples, each organized in the form $v = [s_1, s_2, \ldots, s_t]$, where $s_i$ is the i-th frequency-response value and there are t response values in total. The whole gas data set and the corresponding labels can be expressed as:
$$V = [v_1^T, v_2^T, \ldots, v_i^T, \ldots, v_m^T]$$
$$Y = [y_1, y_2, \ldots, y_i, \ldots, y_m]^T$$
where T denotes the transpose of a vector, the i-th row $v_i$ of the matrix V represents the i-th sample, and the i-th element of Y is the class label of the corresponding sample.
The data set is normalized to [0, 1] using the formula $V_{i,j} \leftarrow L + (U - L)\,\frac{V_{i,j} - \min_i}{\max_i - \min_i}$,
where $V_{i,j}$ denotes the i-th frequency-response value of the j-th sample, L is the lower bound of normalization, with value 0, U is the upper bound of normalization, with value 1, $\max_i$ and $\min_i$ are the maximum and minimum of each row of the matrix, and the normalized data set is denoted $a^{(0)}$.
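As an illustration of step 1 only, the following is a minimal NumPy sketch of this min-max normalization; the function and variable names are ours, and each row is assumed non-constant so the denominator is non-zero.

```python
import numpy as np

def normalize(V, L=0.0, U=1.0):
    """Step 1: min-max normalize each row of V into [L, U]."""
    mins = V.min(axis=1, keepdims=True)
    maxs = V.max(axis=1, keepdims=True)
    return L + (U - L) * (V - mins) / (maxs - mins)

# Example: 3 rows of 4 frequency-response values each.
V = np.array([[2.0, 4.0, 6.0, 8.0],
              [1.0, 1.5, 2.0, 2.5],
              [0.3, 0.6, 0.9, 1.2]])
a0 = normalize(V)   # the normalized data set a^(0), all entries in [0, 1]
```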
Step 2. Pre-train the stacked autoencoder network. In the stacked autoencoder network, v, h and y denote the input layer, hidden layer and output layer respectively, $W^{(i)}$ is the weight matrix connecting the layers, and $b^{(i)}$ is the bias vector of the hidden layer.
The invention uses a network structure similar to Fig. 2. Depending on the specific task, the number of layers of the network and the number of units in each layer can be changed, so the corresponding parameter shapes change accordingly.
Such a network is often very deep and has many parameters, so it is difficult to train directly; the method of pre-training is therefore adopted first, training the parameters of each layer one layer at a time. Compared with random initialization, pre-training places the parameters of each layer at better positions in the parameter space.
Apart from the softmax layer used for classification, the rest of the network can be regarded as a stack of single-hidden-layer autoencoders in which the output of each layer is connected to the input of the next. Such an autoencoder obtains the activations of its hidden units by reconstructing the input (reconstruction is denoted with the symbol ^), and these activations serve as a feature representation of the original input, as shown in Fig. 3.
After training, each autoencoder retains only the parameters of its coding layer, i.e., W and b, which serve as the initial parameters of the corresponding level of the stacked autoencoder network. The detailed process is as follows:
Step 2.1. Train the first layer, i.e., the first autoencoder, whose objective function is:
$$J = \frac{1}{2m}\sum_i \lVert v_i - \hat{v}_i \rVert^2 + \frac{\lambda}{2}\sum_i\sum_j W_{ij}^2 + \beta\sum_j\Big[\rho\log\frac{\rho}{p_j} + (1-\rho)\log\frac{1-\rho}{1-p_j}\Big];$$
where the first term is the reconstruction-error term, representing the difference between input and output: $v_i$ denotes the i-th input sample after the normalization of step 1, and $\hat{v}_i$ denotes the output of the network at the output layer for sample $v_i$. The second term is the weight-decay term, which reduces the magnitude of the weights to prevent over-fitting; $W_{ij}$ denotes the weight between the j-th unit of the current layer and the i-th unit of the next layer. The third term is the sparsity penalty, where $p_j$ denotes the average activation of hidden unit j, and λ, β and ρ are preset parameters; its purpose is to make the average activation of every hidden unit close to a small number ρ, that is, only a small number of hidden units are activated at a time. m is the number of samples, and J denotes the objective function of the first autoencoder.
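A sketch of this objective for a single-hidden-layer autoencoder follows; the squared-norm reading of the first term and all identifiers are our assumptions, not verbatim parts of the disclosure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sparse_ae_loss(W1, b1, W2, b2, V, lam=1e-4, beta=3.0, rho=0.05):
    """Objective J of step 2.1: reconstruction error + weight decay
    + KL-divergence sparsity penalty. V holds one sample per column."""
    m = V.shape[1]
    H = sigmoid(W1 @ V + b1)                  # hidden activations
    V_hat = sigmoid(W2 @ H + b2)              # reconstruction of the input
    recon = np.sum((V - V_hat) ** 2) / (2 * m)
    decay = (lam / 2) * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    p = H.mean(axis=1)                        # average activations p_j
    kl = np.sum(rho * np.log(rho / p)
                + (1 - rho) * np.log((1 - rho) / (1 - p)))
    return recon + decay + beta * kl
```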
The objective function is optimized by gradient descent; during each iteration the partial derivatives of the parameters must be computed, and this computation is completed by the back-propagation algorithm (backpropagation).
For an n-layer autoencoder, the concrete optimization steps are:
Step 2.1.1. Randomly initialize the parameters $W^{(i)}$, $b^{(i)}$, and initialize all-zero matrices and vectors $\Delta W^{(i)} = 0$, $\Delta b^{(i)} = 0$;
Step 2.1.2. For each sample, use the back-propagation algorithm to compute the partial derivatives $\nabla_{W^{(i)}} J(W, b)$ and $\nabla_{b^{(i)}} J(W, b)$. The detailed process is as follows:
feedforward computation obtains the activation $a^{(i)}$ of every layer by the formula $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$, where $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid function, whose output range is [0, 1];
for the output layer, compute the residual $\delta^{(n)} = -(v - a^{(n)}) \cdot \sigma'(z^{(n)})$, where "·" denotes element-wise multiplication, $z^{(n)} = W^{(n-1)} a^{(n-1)} + b^{(n-1)}$, and σ′ denotes the derivative of σ(x);
for each layer l = n-1, n-2, …, 2, compute $\delta^{(l)} = ((W^{(l)})^T \delta^{(l+1)}) \cdot \sigma'(z^{(l)})$;
compute the partial derivatives $\nabla_{W^{(i)}} J(W, b) = \delta^{(i+1)} (a^{(i)})^T$ and $\nabla_{b^{(i)}} J(W, b) = \delta^{(i+1)}$, where $\nabla_{W^{(i)}} J$ denotes the partial derivative of J(W, b) with respect to $W^{(i)}$ and $\nabla_{b^{(i)}} J$ the partial derivative with respect to $b^{(i)}$.
Step 2.1.3. Add the partial derivatives so obtained to $\Delta W^{(i)}$ and $\Delta b^{(i)}$ respectively: $\Delta W^{(i)} := \Delta W^{(i)} + \nabla_{W^{(i)}} J$, $\Delta b^{(i)} := \Delta b^{(i)} + \nabla_{b^{(i)}} J$;
Step 2.1.4. Update the parameters: $W^{(i)} := W^{(i)} - \alpha\big(\frac{1}{m}\Delta W^{(i)} + \lambda W^{(i)}\big)$, $b^{(i)} := b^{(i)} - \alpha\,\frac{1}{m}\Delta b^{(i)}$, where α is the learning rate.
Step 2.1.5. Repeat steps 2.1.2 to 2.1.4, gradually reducing the value of the objective function, until the set threshold is reached. This yields the coding-layer parameters (W, b) and the decoding-layer parameters $(\hat{W}, \hat{b})$.
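For illustration, steps 2.1.1 to 2.1.5 are sketched below for one autoencoder, reusing the sigmoid helper defined above; as simplifying assumptions of ours, the sparsity term is omitted from the gradient and a fixed iteration count replaces the threshold test.

```python
def train_autoencoder(V, n_hidden, alpha=0.1, lam=1e-4, n_iter=200, seed=0):
    """Steps 2.1.1-2.1.5 for one autoencoder (sparsity gradient omitted
    for brevity); returns only the coding-layer parameters (W, b)."""
    n_in, m = V.shape
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((n_hidden, n_in)) * 0.01    # coding layer
    b1 = np.zeros((n_hidden, 1))
    W2 = rng.standard_normal((n_in, n_hidden)) * 0.01    # decoding layer
    b2 = np.zeros((n_in, 1))
    for _ in range(n_iter):
        a1 = sigmoid(W1 @ V + b1)          # feedforward
        a2 = sigmoid(W2 @ a1 + b2)         # reconstruction v-hat
        d2 = -(V - a2) * a2 * (1 - a2)     # output-layer residual (2.1.2)
        d1 = (W2.T @ d2) * a1 * (1 - a1)   # hidden-layer residual
        gW2, gb2 = d2 @ a1.T, d2.sum(axis=1, keepdims=True)
        gW1, gb1 = d1 @ V.T, d1.sum(axis=1, keepdims=True)
        W2 -= alpha * (gW2 / m + lam * W2)  # update (2.1.4)
        b2 -= alpha * (gb2 / m)
        W1 -= alpha * (gW1 / m + lam * W1)
        b1 -= alpha * (gb1 / m)
    return W1, b1    # decoding layer (W2, b2) is discarded in step 2.2
```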
Step 2.2. After training, the decoding layer $(\hat{W}, \hat{b})$ is discarded, and the coding-layer parameters (W, b) serve as the initial parameters of the corresponding level of the stacked autoencoder network, i.e. $W^{(1)} = W$, $b^{(1)} = b$.
Step 2.3. Compute the hidden-layer activation of the current autoencoder: $a^{(1)} = \sigma(W^{(1)} a^{(0)} + b^{(1)})$.
Step 2.4. Train the second layer, i.e., the second autoencoder, on the activation $a^{(1)}$: the hidden layer of the first autoencoder serves as the input layer of the second, and the training process is identical to that of the first layer except that the input becomes $a^{(1)}$. Training yields the initial parameters $W^{(2)}$, $b^{(2)}$ of the second layer and the hidden-layer activation $a^{(2)}$.
Step 2.5. For the third to the n-th layer, repeat the process of steps 2.1 to 2.4 to obtain the initial parameters of every hidden layer, and finally the activation $a^{(n)}$ of the n-th hidden layer; this activation also serves as the input of the softmax layer and is denoted $a^S$. In this embodiment the autoencoder stack has 3 layers.
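Greedy layer-wise pre-training (steps 2.2 to 2.5) can then be sketched as follows, building on the train_autoencoder and sigmoid helpers above; the hidden-layer sizes in the example are illustrative assumptions, not values from the disclosure.

```python
def pretrain_stack(A0, layer_sizes):
    """Steps 2.2-2.5: train one autoencoder per layer on the previous
    layer's activation, keeping only each coding layer (W, b)."""
    params, a = [], A0
    for n_hidden in layer_sizes:
        W, b = train_autoencoder(a, n_hidden)   # step 2.1 on current input
        params.append((W, b))
        a = sigmoid(W @ a + b)                  # step 2.3: next layer's input
    return params, a                            # a is a^S, the softmax input

# Example with a 3-layer stack as in this embodiment (sizes assumed):
# params, aS = pretrain_stack(a0, [64, 32, 16])
```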
Step 2.6. Train the last layer of the network, i.e., the softmax classifier, with the $a^S$ obtained in step 2.5 and the labels Y, obtaining the initial parameter $W^S$ of the last layer.
Softmax regression is the generalization of logistic regression to multi-class problems. For convenience of notation, let x and θ denote $a^S$ and $W^S$ respectively, and assume there are k classes in total. For the i-th sample, the predicted probability that its class label is j is:
$$P(y_i = j \mid x_i; \theta) = \frac{\exp(\theta_j^T x_i)}{\sum_{l=1}^{k} \exp(\theta_l^T x_i)}$$
where $\theta_j$ denotes the j-th row of θ, i.e., the row vector of weights connecting the j-th output unit to all input units; l is a counting variable with 1 ≤ l ≤ k; k is the number of classes for the softmax-layer input $a^S$ and the initial parameter $W^S$ of the softmax classifier; and $x_i$ is the softmax-layer input for the i-th sample. The final output is a probability column vector P, whose j-th component represents the probability that the sample is judged to belong to the j-th class. The weight matrix θ is trained by minimizing the loss function:
$$J(\theta) = -\frac{1}{m}\Big[\sum_{i=1}^m \sum_{j=1}^k 1\{y_i = j\}\,\log P(y_i = j)\Big] + \frac{\lambda}{2}\sum_{i=1}^m\sum_{j=1}^n \theta_{ij}^2$$
where $\log P(y_i = j)$ denotes the natural logarithm of the probability, and $1\{\cdot\}$ is the indicator function, whose value is 1 when the condition in the braces is true and 0 otherwise. This loss function is strictly convex, so optimization algorithms such as gradient descent or L-BFGS can find the globally optimal solution; m is the number of samples and n is the number of layers of the autoencoder.
Every parameter connecting two adjacent layers here is a weight matrix; $W^S$, i.e., θ, is the weight matrix connecting the last two layers.
In this embodiment, the detailed process of training the weight matrix θ by minimizing the loss function is as follows:
Step 2.6.1. Randomly initialize the parameter matrix θ;
Step 2.6.2. Directly compute the derivative of J(θ), where $\theta_j$ denotes the j-th row of the matrix:
$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^m\Big[x_i\Big(1\{y_i = j\} - P(y_i = j \mid x_i; \theta)\Big)\Big] + \lambda\theta_j$$
Step 2.6.3. Update the parameter θ: $\theta_j := \theta_j - \alpha\,\nabla_{\theta_j} J(\theta)$, where α is the learning rate and $\nabla_{\theta_j} J(\theta)$ denotes the partial derivative of J(θ) with respect to $\theta_j$;
Step 2.6.4. Repeat steps 2.6.2 to 2.6.3, gradually reducing the value of J(θ) until the set threshold is reached; the θ obtained at this point is the final weight matrix, namely $W^S$.
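A sketch of steps 2.6.1 to 2.6.4 follows; the one-hot encoding of Y, the max-subtraction for numerical stability, and the fixed iteration count in place of the threshold test are our assumptions.

```python
import numpy as np

def train_softmax(X, Y, k, alpha=0.5, lam=1e-4, n_iter=500, seed=0):
    """Steps 2.6.1-2.6.4: gradient descent on J(theta).
    X: (d, m) matrix of softmax inputs a^S; Y: m integer labels in [0, k)."""
    d, m = X.shape
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((k, d)) * 0.01        # step 2.6.1
    onehot = np.eye(k)[:, Y]                          # 1{y_i = j}, shape (k, m)
    for _ in range(n_iter):
        Z = theta @ X
        Z -= Z.max(axis=0, keepdims=True)             # numerical stability
        P = np.exp(Z) / np.exp(Z).sum(axis=0, keepdims=True)
        grad = -(onehot - P) @ X.T / m + lam * theta  # step 2.6.2
        theta -= alpha * grad                         # step 2.6.3
    return theta                                      # the final W^S
```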
Step 3. Fine-tuning: the network is regarded as a whole, the partial derivatives of the parameters of every layer are computed by back-propagation, and gradient descent is then used for iterative optimization.
After pre-training is complete, the initial parameters of every layer of the network are determined. All parameters then undergo one round of fine-tuning to improve the classification capacity of the network. Fine-tuning regards the network as a whole, computes the partial derivatives of the parameters of each layer by back-propagation, and then applies gradient descent iteratively. At this point the network no longer performs reconstruction, so the objective function is the same as that of the softmax layer; the softmax layer is regarded as one additional layer and handled individually, while the optimization of each hidden layer is essentially the same as described in step 2.1.
The detailed process is as follows:
Step 3.1. Use the formula $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$ for feedforward computation to obtain the activation $a^{(i)}$ of every layer;
Step 3.2. Compute the partial derivative $\nabla_{W^S} J$ of the softmax-layer parameter $W^S$, using the gradient formula of step 2.6.2, where P is the conditional probability vector computed in step 2.6;
Step 3.3. Compute the residual of the last hidden layer: $\delta^{(n)} = \nabla_{a^{(n)}} J \cdot \sigma'(z^{(n)})$, where $\nabla_{a^{(n)}} J$ denotes the partial derivative of J(W, b) with respect to $a^{(n)}$, the activation of the n-th hidden layer;
Step 3.4. For each layer l = n-1, n-2, …, 2, compute $\delta^{(l)} = ((W^{(l)})^T \delta^{(l+1)}) \cdot \sigma'(z^{(l)})$;
Step 3.5. Compute the partial derivatives of each hidden layer: $\nabla_{W^{(i)}} J = \delta^{(i+1)} (a^{(i)})^T$, $\nabla_{b^{(i)}} J = \delta^{(i+1)}$;
Step 3.6. Use the partial derivatives obtained above to update the parameters of every layer:
$$W^{S\prime} = W^S - \alpha\Big(\tfrac{1}{m}\textstyle\sum \nabla_{W^S} J + \lambda\theta\Big),\qquad W^{(i)\prime} = W^{(i)} - \alpha\Big(\tfrac{1}{m}\textstyle\sum \nabla_{W^{(i)}} J\Big),\qquad b^{(i)\prime} = b^{(i)} - \alpha\Big(\tfrac{1}{m}\textstyle\sum \nabla_{b^{(i)}} J\Big);$$
where $\nabla_{W^S} J$ denotes the partial derivative of J(W, b) with respect to $W^S$, the initial parameter of the softmax classifier, $\nabla_{W^{(i)}} J$ the partial derivative with respect to $W^{(i)}$, and $\nabla_{b^{(i)}} J$ the partial derivative with respect to $b^{(i)}$;
Step 3.7. Repeat the above steps, reducing the value of the objective function by iteration, until the set threshold is reached.
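Fine-tuning (steps 3.1 to 3.7) is sketched below, reusing sigmoid, pretrain_stack and train_softmax from the earlier sketches; the fixed iteration count and the absence of weight decay on the hidden layers follow our reading of the update formula above, not an explicit statement in the disclosure.

```python
def finetune(params, theta, A0, Y, k, alpha=0.1, lam=1e-4, n_iter=200):
    """Steps 3.1-3.7: joint back-propagation through all hidden layers
    and the softmax layer, with batch gradient-descent updates."""
    m = A0.shape[1]
    onehot = np.eye(k)[:, Y]
    for _ in range(n_iter):
        # step 3.1: feedforward, storing every activation a^(i)
        acts = [A0]
        for W, b in params:
            acts.append(sigmoid(W @ acts[-1] + b))
        aS = acts[-1]
        Z = theta @ aS
        Z -= Z.max(axis=0, keepdims=True)
        P = np.exp(Z) / np.exp(Z).sum(axis=0, keepdims=True)
        # step 3.2: partial derivative of the softmax parameter W^S
        g_theta = -(onehot - P) @ aS.T / m + lam * theta
        # step 3.3: residual of the last hidden layer
        delta = (theta.T @ (P - onehot)) * aS * (1 - aS)
        # steps 3.4-3.6: back-propagate residuals and update each layer
        updated = []
        for (W, b), a_prev in zip(reversed(params), reversed(acts[:-1])):
            gW = delta @ a_prev.T / m                      # step 3.5
            gb = delta.mean(axis=1, keepdims=True)
            delta = (W.T @ delta) * a_prev * (1 - a_prev)  # step 3.4
            updated.append((W - alpha * gW, b - alpha * gb))
        params = list(reversed(updated))
        theta = theta - alpha * g_theta
    return params, theta
```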
Step 4. Predict the class of a test sample. The detailed process is as follows:
Step 4.1. Normalize the test sample $v_p$ to [0, 1];
Step 4.2. For the hidden layers, use the formula $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$ layer by layer for feedforward computation, obtaining the softmax-layer input $a^S$;
Step 4.3. Compute the conditional probability vector P according to the probability formula of step 2.6; the class corresponding to its largest component is the predicted class of the sample.
In the above formulas, i and j denote counting indices.
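Prediction (steps 4.1 to 4.3) is then a single feedforward pass; a sketch using the helpers defined above:

```python
def predict(params, theta, V_new):
    """Steps 4.1-4.3: normalize, feed forward, take the arg-max class."""
    a = normalize(V_new)                  # step 4.1
    for W, b in params:                   # step 4.2: hidden layers
        a = sigmoid(W @ a + b)
    Z = theta @ a                         # step 4.3: softmax probabilities
    Z -= Z.max(axis=0, keepdims=True)
    P = np.exp(Z) / np.exp(Z).sum(axis=0, keepdims=True)
    return P.argmax(axis=0)               # index of the largest component
```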
The core of the invention is a network structure adapted to medical gas-signal processing: deep learning is used to process patient gas data sampled by an electronic nose, automatically extracting features that are more general and more robust to sensor drift, thus completing the task of gas detection and identification simply and effectively. This undoubtedly has great practical value in the medical field, where both accuracy and real-time performance are required.

Claims (2)

1. A medical gas identification method based on deep learning, comprising the following steps:
Step 1. Data normalization. Suppose there are m samples, each organized in the form $v = [s_1, s_2, \ldots, s_t]$, where $s_i$ is the i-th frequency-response value and there are t response values in total; the whole gas data set and the corresponding labels can be expressed as:
$$V = [v_1^T, v_2^T, \ldots, v_i^T, \ldots, v_m^T]$$
$$Y = [y_1, y_2, \ldots, y_i, \ldots, y_m]^T$$
where T denotes the transpose of a vector, the i-th row $v_i$ of the matrix V represents the i-th sample, and the i-th element of Y is the class label of the corresponding sample;
the data set is normalized to [0, 1] using the formula $V_{i,j} \leftarrow L + (U - L)\,\frac{V_{i,j} - \min_i}{\max_i - \min_i}$,
where $V_{i,j}$ denotes the i-th frequency-response value of the j-th sample, L is the lower bound of normalization, with value 0, U is the upper bound of normalization, with value 1, $\max_i$ and $\min_i$ are the maximum and minimum of each row of the matrix, and the normalized data set is denoted $a^{(0)}$;
Step 2. Pre-train the stacked autoencoder network. In the stacked autoencoder network, v, h and y denote the input layer, hidden layer and output layer respectively, $W^{(i)}$ is the weight matrix connecting the layers, and $b^{(i)}$ is the bias vector of the hidden layer;
Step 2.1. Train the first layer, i.e., the first autoencoder, whose objective function is:
$$J = \frac{1}{2m}\sum_i \lVert v_i - \hat{v}_i \rVert^2 + \frac{\lambda}{2}\sum_i\sum_j W_{ij}^2 + \beta\sum_j\Big[\rho\log\frac{\rho}{p_j} + (1-\rho)\log\frac{1-\rho}{1-p_j}\Big];$$
where the first term is the reconstruction-error term, representing the difference between input and output: $v_i$ denotes the i-th input sample after the normalization of step 1, and $\hat{v}_i$ denotes the output of the network at the output layer for sample $v_i$; the second term is the weight-decay term, which reduces the magnitude of the weights to prevent over-fitting, where $W_{ij}$ denotes the weight between the j-th unit of the current layer and the i-th unit of the next layer; the third term is the sparsity penalty, where $p_j$ denotes the average activation of hidden unit j, and λ, β and ρ are preset parameters; m is the number of samples, and J denotes the objective function of the first autoencoder;
The objective function is optimized as follows; for an n-layer autoencoder the concrete optimization steps are:
Step 2.1.1. Randomly initialize the parameters $W^{(i)}$, $b^{(i)}$, and initialize all-zero matrices and vectors $\Delta W^{(i)} = 0$, $\Delta b^{(i)} = 0$;
Step 2.1.2. For each sample, use the back-propagation algorithm to compute the partial derivatives $\nabla_{W^{(i)}} J(W, b)$ and $\nabla_{b^{(i)}} J(W, b)$. The detailed process is as follows:
feedforward computation obtains the activation $a^{(i)}$ of every layer by the formula $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$, where $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid function, whose output range is [0, 1];
for the output layer, compute the residual $\delta^{(n)} = -(v - a^{(n)}) \cdot \sigma'(z^{(n)})$, where "·" denotes element-wise multiplication, $z^{(n)} = W^{(n-1)} a^{(n-1)} + b^{(n-1)}$, and σ′ denotes the derivative of σ(x);
for each layer l = n-1, n-2, …, 2, compute $\delta^{(l)} = ((W^{(l)})^T \delta^{(l+1)}) \cdot \sigma'(z^{(l)})$;
compute the partial derivatives $\nabla_{W^{(i)}} J(W, b) = \delta^{(i+1)} (a^{(i)})^T$ and $\nabla_{b^{(i)}} J(W, b) = \delta^{(i+1)}$, where $\nabla_{W^{(i)}} J$ denotes the partial derivative of J(W, b) with respect to $W^{(i)}$ and $\nabla_{b^{(i)}} J$ the partial derivative with respect to $b^{(i)}$;
Step 2.1.3. Add the partial derivatives so obtained to $\Delta W^{(i)}$ and $\Delta b^{(i)}$ respectively: $\Delta W^{(i)} := \Delta W^{(i)} + \nabla_{W^{(i)}} J$, $\Delta b^{(i)} := \Delta b^{(i)} + \nabla_{b^{(i)}} J$;
Step 2.1.4. Update the parameters: $W^{(i)} := W^{(i)} - \alpha\big(\frac{1}{m}\Delta W^{(i)} + \lambda W^{(i)}\big)$, $b^{(i)} := b^{(i)} - \alpha\,\frac{1}{m}\Delta b^{(i)}$, where α is the learning rate;
Step 2.1.5. Repeat steps 2.1.2 to 2.1.4, gradually reducing the value of the objective function, until the set threshold is reached; this yields the coding-layer parameters (W, b) and the decoding-layer parameters $(\hat{W}, \hat{b})$;
Step 2.2. After training, the decoding layer $(\hat{W}, \hat{b})$ is discarded, and the coding-layer parameters (W, b) serve as the initial parameters of the corresponding level of the stacked autoencoder network, i.e. $W^{(1)} = W$, $b^{(1)} = b$;
Step 2.3. Compute the hidden-layer activation of the current autoencoder: $a^{(1)} = \sigma(W^{(1)} a^{(0)} + b^{(1)})$;
Step 2.4. Train the second layer, i.e., the second autoencoder, on the activation $a^{(1)}$: the hidden layer of the first autoencoder serves as the input layer of the second, and the training process is identical to that of the first layer except that the input becomes $a^{(1)}$; training yields the initial parameters $W^{(2)}$, $b^{(2)}$ of the second layer and the hidden-layer activation $a^{(2)}$;
Step 2.5. For the third to the n-th layer, repeat the process of steps 2.1 to 2.4 to obtain the initial parameters of every hidden layer, and finally the activation $a^{(n)}$ of the n-th hidden layer; this activation also serves as the input of the softmax layer and is denoted $a^S$;
Step 2.6. Train the last layer of the network, i.e., the softmax classifier, with the $a^S$ obtained in step 2.5 and the labels Y, obtaining the initial parameter $W^S$ of the last layer;
Let x and θ denote $a^S$ and $W^S$ respectively, and assume there are k classes in total; for the i-th sample, the predicted probability that its class label is j is:
$$P(y_i = j \mid x_i; \theta) = \frac{\exp(\theta_j^T x_i)}{\sum_{l=1}^{k} \exp(\theta_l^T x_i)}$$
where $\theta_j$ denotes the j-th row of θ, i.e., the row vector of weights connecting the j-th output unit to all input units, and k is the number of classes for the softmax-layer input $a^S$ and the initial parameter $W^S$ of the softmax classifier; the final output is a probability column vector P, whose j-th component represents the probability that the sample is judged to belong to the j-th class; the weight matrix θ is trained by minimizing the loss function:
$$J(\theta) = -\frac{1}{m}\Big[\sum_{i=1}^m \sum_{j=1}^k 1\{y_i = j\}\,\log P(y_i = j)\Big] + \frac{\lambda}{2}\sum_{i=1}^m\sum_{j=1}^n \theta_{ij}^2$$
where $\log P(y_i = j)$ denotes the natural logarithm of the probability $P(y_i = j)$, and $1\{\cdot\}$ is the indicator function, whose value is 1 when the condition in the braces is true and 0 otherwise; m is the number of samples and n is the number of layers of the autoencoder;
Step 3. Fine-tuning: the network is regarded as a whole, the partial derivatives of the parameters of every layer are computed by back-propagation, and gradient descent is then used for iterative optimization. The detailed process is as follows:
Step 3.1. Use the formula $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$ for feedforward computation to obtain the activation $a^{(i)}$ of every layer;
Step 3.2. Compute the partial derivative $\nabla_{W^S} J$ of the softmax-layer parameter $W^S$ from the derivative of the loss function of step 2.6, where P is the conditional probability vector computed in step 2.6;
Step 3.3. Compute the residual of the last hidden layer: $\delta^{(n)} = \nabla_{a^{(n)}} J \cdot \sigma'(z^{(n)})$, where $\nabla_{a^{(n)}} J$ denotes the partial derivative of J(W, b) with respect to $a^{(n)}$, the activation of the n-th hidden layer;
Step 3.4. For each layer l = n-1, n-2, …, 2, compute $\delta^{(l)} = ((W^{(l)})^T \delta^{(l+1)}) \cdot \sigma'(z^{(l)})$;
Step 3.5. Compute the partial derivatives of each hidden layer: $\nabla_{W^{(i)}} J = \delta^{(i+1)} (a^{(i)})^T$, $\nabla_{b^{(i)}} J = \delta^{(i+1)}$;
Step 3.6. Use the partial derivatives obtained above to update the parameters of every layer:
$$W^{S\prime} = W^S - \alpha\Big(\tfrac{1}{m}\textstyle\sum \nabla_{W^S} J + \lambda\theta\Big),\qquad W^{(i)\prime} = W^{(i)} - \alpha\Big(\tfrac{1}{m}\textstyle\sum \nabla_{W^{(i)}} J\Big),\qquad b^{(i)\prime} = b^{(i)} - \alpha\Big(\tfrac{1}{m}\textstyle\sum \nabla_{b^{(i)}} J\Big);$$
where $\nabla_{W^S} J$ denotes the partial derivative of J(W, b) with respect to $W^S$, the initial parameter of the softmax classifier, $\nabla_{W^{(i)}} J$ the partial derivative with respect to $W^{(i)}$, and $\nabla_{b^{(i)}} J$ the partial derivative with respect to $b^{(i)}$; Step 3.7. Repeat the above steps, reducing the value of the objective function by iteration, until the set threshold is reached;
Step 4. Predict the class of a test sample. The detailed process is as follows:
Step 4.1. Normalize the test sample $v_p$ to [0, 1];
Step 4.2. For the hidden layers, use the formula $a^{(i)} = \sigma(W^{(i)} a^{(i-1)} + b^{(i)})$ layer by layer for feedforward computation, obtaining the softmax-layer input $a^S$;
Step 4.3. Compute the conditional probability vector P according to the probability formula of step 2.6; the class corresponding to its largest component is the predicted class of the sample.
2. The medical gas identification method based on deep learning according to claim 1, characterized in that in step 2.6 the detailed process of training the weight matrix θ by minimizing the loss function is as follows:
Step 2.6.1. Randomly initialize the parameter matrix θ;
Step 2.6.2. Directly compute the derivative of J(θ), where $\theta_j$ denotes the j-th row of the matrix:
$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^m\Big[x_i\Big(1\{y_i = j\} - P(y_i = j \mid x_i; \theta)\Big)\Big] + \lambda\theta_j$$
Step 2.6.3. Update the parameter θ: $\theta_j := \theta_j - \alpha\,\nabla_{\theta_j} J(\theta)$, where α is the learning rate and $\nabla_{\theta_j} J(\theta)$ denotes the partial derivative of J(θ) with respect to $\theta_j$;
Step 2.6.4. Repeat steps 2.6.2 to 2.6.3, gradually reducing the value of J(θ) until the set threshold is reached; the θ obtained at this point is the final weight matrix, namely $W^S$.
CN201310503402.9A 2013-10-23 2013-10-23 Medical gas identification method based on deep learning Expired - Fee Related CN103544392B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310503402.9A CN103544392B (en) 2013-10-23 2013-10-23 Medical gas identification method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310503402.9A CN103544392B (en) 2013-10-23 2013-10-23 Medical gas identification method based on deep learning

Publications (2)

Publication Number Publication Date
CN103544392A CN103544392A (en) 2014-01-29
CN103544392B true CN103544392B (en) 2016-08-24

Family

ID=49967837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310503402.9A Expired - Fee Related CN103544392B (en) 2013-10-23 2013-10-23 Medical gas identification method based on deep learning

Country Status (1)

Country Link
CN (1) CN103544392B (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103996056B (en) * 2014-04-08 2017-05-24 浙江工业大学 Tattoo image classification method based on deep learning
CN104021224A (en) * 2014-06-25 2014-09-03 中国科学院自动化研究所 Image labeling method based on layer-by-layer label fusing deep network
CN104484684B * 2015-01-05 2018-11-02 苏州大学 Handwritten character recognition method and system
CN105844331B * 2015-01-15 2018-05-25 富士通株式会社 Training method of a neural network system and the neural network system
CN104866727A (en) 2015-06-02 2015-08-26 陈宽 Deep learning-based method for analyzing medical data and intelligent analyzer thereof
CN105913079B * 2016-04-08 2019-04-23 重庆大学 Electronic nose heterogeneous data recognition method based on target-domain transfer extreme learning
CN106202054B (en) * 2016-07-25 2018-12-14 哈尔滨工业大学 A kind of name entity recognition method towards medical field based on deep learning
CN106264460B * 2016-07-29 2019-11-19 北京医拍智能科技有限公司 Self-learning-based encoding/decoding method and device for multidimensional time-series signals of brain activity
CN106156530A * 2016-08-03 2016-11-23 北京好运到信息科技有限公司 Physical examination data analysis method and device based on stacked autoencoders
CN108122035B (en) 2016-11-29 2019-10-18 科大讯飞股份有限公司 End-to-end modeling method and system
CN107368671A * 2017-06-07 2017-11-21 万香波 System and method for supporting pathological diagnosis of benign gastritis based on big-data deep learning
CN107368670A * 2017-06-07 2017-11-21 万香波 System and method for supporting pathological diagnosis of gastric cancer based on big-data deep learning
US10885470B2 (en) * 2017-06-26 2021-01-05 D5Ai Llc Selective training for decorrelation of errors
CN108416439B (en) * 2018-02-09 2020-01-03 中南大学 Oil refining process product prediction method and system based on variable weighted deep learning
CN109472303A (en) * 2018-10-30 2019-03-15 浙江工商大学 A kind of gas sensor drift compensation method based on autoencoder network decision
CN111474297B (en) * 2020-03-09 2022-05-03 重庆邮电大学 Online drift compensation method for sensor in bionic olfaction system
CN111340132B (en) * 2020-03-10 2024-02-02 南京工业大学 Machine olfaction mode identification method based on DA-SVM
CN111915069B (en) * 2020-07-17 2021-12-07 天津理工大学 Deep learning-based detection method for distribution of lightweight toxic and harmful gases

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101042339B (en) * 2006-03-21 2012-05-30 深圳迈瑞生物医疗电子股份有限公司 Device for recognizing zone classification of anesthetic gas type and method thereof

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101135639A (en) * 2007-09-27 2008-03-05 中国人民解放军空军工程大学 Mixture gas component concentration infrared spectrum analysis method based on supporting vector quantities machine correct model
CN102411687A (en) * 2011-11-22 2012-04-11 华北电力大学 Deep learning detection method of unknown malicious codes
CN103267793A (en) * 2013-05-03 2013-08-28 浙江工商大学 Carbon nano-tube ionization self-resonance type gas sensitive sensor

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A comparative study of SVM and BP algorithms in gas identification; Wang Dan et al.; Chinese Journal of Sensors and Actuators (《传感技术学报》); 2005-03-26; vol. 18, no. 1; full text *
Research on gas identification based on support vector machines and wavelet decomposition; Ge Haifeng; Chinese Journal of Scientific Instrument (《仪器仪表学报》); 2006-06-30; vol. 27, no. 6; full text *
Research on gas identification based on the support vector machine algorithm; Wang Dan; Journal of Transducer Technology (《传感器技术》); 2005-02-20; vol. 24, no. 2; full text *

Also Published As

Publication number Publication date
CN103544392A (en) 2014-01-29

Similar Documents

Publication Publication Date Title
CN103544392B (en) Medical gas identification method based on deep learning
CN103728551B (en) A kind of analog-circuit fault diagnosis method based on cascade integrated classifier
CN101135689A Electronic nose development platform
CN105738109A (en) Bearing fault classification diagnosis method based on sparse representation and ensemble learning
CN109979541B (en) Method for predicting pharmacokinetic property and toxicity of drug molecules based on capsule network
CN110298264B (en) Human body daily behavior activity recognition optimization method based on stacked noise reduction self-encoder
CN103268607B (en) A kind of common object detection method under weak supervision condition
CN111340132B (en) Machine olfaction mode identification method based on DA-SVM
Cai et al. Anomaly detection of earthquake precursor data using long short-term memory networks
CN105609116B (en) A kind of automatic identifying method in speech emotional dimension region
CN108447057A (en) SAR image change detection based on conspicuousness and depth convolutional network
CN110455512B (en) Rotary mechanical multi-integration fault diagnosis method based on depth self-encoder DAE
Glezakos et al. Plant virus identification based on neural networks with evolutionary preprocessing
CN115343676B (en) Feature optimization method for positioning technology of redundant substances in sealed electronic equipment
CN111354338A (en) Parkinson speech recognition system based on PSO convolution kernel optimization sparse transfer learning
CN113887342A (en) Equipment fault diagnosis method based on multi-source signals and deep learning
Ye et al. A deep learning-based method for automatic abnormal data detection: Case study for bridge structural health monitoring
Yang et al. Stacking-based and improved convolutional neural network: a new approach in rice leaf disease identification
CN110488020A (en) A kind of protein glycation site identification method
CN117516939A (en) Bearing cross-working condition fault detection method and system based on improved EfficientNetV2
CN117350364A (en) Knowledge distillation-based code pre-training model countermeasure sample generation method and system
Orlic et al. Earthquake—explosion discrimination using genetic algorithm-based boosting approach
Castro-Cabrera et al. Adaptive classification using incremental learning for seismic-volcanic signals with concept drift
CN112541524A (en) BP-Adaboost multi-source information motor fault diagnosis method based on attention mechanism improvement
CN113378935B (en) Intelligent olfactory sensation identification method for gas

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160824

Termination date: 20171023

CF01 Termination of patent right due to non-payment of annual fee