CN116384450A - Medical data-oriented deep convolution fuzzy neural network and training method thereof - Google Patents
Medical data-oriented deep convolution fuzzy neural network and training method thereof
- Publication number: CN116384450A
- Application number: CN202310431951.3A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/043—Architecture based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
- G16H50/70—ICT specially adapted for mining of medical data, e.g. analysing previous cases of other patients
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention provides a medical data-oriented deep convolution fuzzy neural network and a training method thereof. The network comprises a medical data interpretability prediction model (IP-DCFNN) based on a deep convolutional fuzzy neural network, and the IP-DCFNN comprises three parts. Fuzzy logic front part: the front part extracts the input data and, through the operation of its membership functions, converts each numerical input into a group of membership values over fuzzy linguistic variables. Deep convolution calculation part: this part extracts hidden features from the input rule weights and converts the hidden-layer weights into a high-dimensional information representation. Fuzzy result representation part: this part handles the defuzzification process of fuzzy inference. The invention relates to the technical field of computers; the IP-DCFNN adds the concepts of deep convolutional neural networks on top of a fuzzy reasoning system to achieve interpretable prediction for medical data.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a deep convolution fuzzy neural network oriented to medical data and a training method thereof.
Background
A fuzzy inference system is limited by its data path and complex structure: it often incurs extremely high computational cost when facing large-scale, high-dimensional data, and it has difficulty expressing the relations between data items, so its prediction accuracy is inferior to that of some algorithm models with network structures.
Although neural-network-related algorithms can effectively process high-dimensional data, they construct a feedforward propagation path through the topological relations of a large number of neurons, so the weighting coefficients iterated by the hidden-layer neurons have no direct connection with the result and cannot represent any real meaning directly related to the task being processed, the so-called black-box property. This leaves the neural network lacking interpretability and affects the credibility and acceptance of its results.
The attribute features of medical data often carry more potential associations than other types of data, and partial logical associations exist between a patient's different physiological indicators or morphological characterization data, which prevents some common inference models from processing medical data well.
Disclosure of Invention
In view of this, the present invention aims to propose a deep convolution fuzzy neural network for medical data, so as to add the concepts of deep neural networks on top of a fuzzy reasoning system and thereby achieve both interpretability and the ability to process high-dimensional data.
In order to achieve the above purpose, the technical scheme of the invention is realized as follows:
the deep convolution fuzzy neural network for medical data is characterized in that: including medical data interpretability prediction model (IP-DCFNN) based on deep convolutional fuzzy neural network;
the IP-DCFNN consists of three parts:
fuzzy logic front part: the front part extracts the input data and, through the operation of its membership functions, converts each numerical input into a group of membership values over fuzzy linguistic variables;
a deep convolution calculation part: this part extracts hidden features from the input rule weights and converts the hidden-layer weights into a high-dimensional information representation;
the fuzzy result representation part: this part handles the defuzzification process of fuzzy inference.
for input data $x_i$ of the input vector $[x_1, \ldots, x_n]$, the membership function $\mu_{A_i^k}(x_i) = \exp\!\big(-(x_i - \mu)^2 / (2\sigma^2)\big)$ maps the input data $x_i$ to the fuzzy set $[\text{low}, \text{high}]$, where $k = 1, \ldots, K$ represents the selection of the fuzzy linguistic variable; the parameters of the membership function are $\mu$ and $\sigma$;
the fuzzy logic front part is formed by adopting Takagi-Sugeno fuzzy model
Fuzzy rules described in (a);
the fuzzy variables generated by different dimensionalities of input data are arranged and combined to construct a regular front part, and the initial excitation intensity of the rule can be obtained by accumulating membership degrees corresponding to each fuzzy variable in the regular front part, wherein the initial excitation intensity represents the functional weight of the rule;
the initial excitation intensity of the jth rule is calculated by the membership degree of all fuzzy variables in the rule front part, and the specific formula is thatI.e. the weight of the fuzzy rule is obtained by performing an AND fuzzy logic operation on each fuzzy variable.
Further, the depth convolution part takes the initial excitation intensity generated by the fuzzy logic front part as an initial input and outputs a final excitation intensity, wherein the final excitation intensity is the same as the initial excitation intensity in dimension;
the depth convolution calculation part mainly comprises a convolution layer and a full connection layer;
the convolution layer comprises a one-dimensional convolution layer, a ReLU activation layer and a maximum pooling layer;
the full connection layer comprises a Linear layer and a ReLU activation layer;
the initial input is passed throughThe convolution process is extended to a high dimension by the one-dimensional convolution kernel and flattened in the fully-connected layer to an initial dimension, wherein +_>Represents the ith data unit in the first convolution layer,/and/or the second convolution layer>A weight, b representing the convolution kernel of the first layer l Representing the bias of the layer;
the network layers in series after the convolutional layers are fully connected, meaning that each node of the first layer is connected to all nodes of the first-1 layer,the parameter w represents the connection weight, b represents the bias of the node, and the formula of the feedforward propagation of the neural network is
in the network layers in series after the convolutional layers, the activation function is defined as $\sigma(x) = 1/(1 + e^{-x})$; one-dimensional convolution is used to process the excitation intensities of the input.
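The deep convolution calculation described above can be sketched with plain numpy: a one-dimensional convolution over the rule weights followed by a fully connected layer that flattens back to the initial dimension. The layer sizes, kernel width, and the single conv/dense pair are assumptions made for the sake of a small runnable example; a real model would stack more layers and include pooling.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1d(signal, kernel, bias):
    """Valid one-dimensional convolution: c_i = sigma(sum_m w_m * s_{i+m} + b)."""
    n, m = len(signal), len(kernel)
    out = np.array([np.dot(signal[i:i + m], kernel) + bias
                    for i in range(n - m + 1)])
    return sigmoid(out)

def dense(x, W, b):
    """Fully connected layer: every output node sees all nodes of the previous layer."""
    return sigmoid(W @ x + b)

rng = np.random.default_rng(0)
w0 = rng.normal(size=8)                    # initial excitation intensities (8 rules)
c = conv1d(w0, rng.normal(size=3), 0.1)    # expand with a width-3 kernel
W_fc = rng.normal(size=(8, len(c)))        # flatten back to the initial dimension
w_final = dense(c, W_fc, np.zeros(8))      # final excitation intensities, same dim as w0
```

Note the output has the same dimension as the initial excitation intensities, as the claims require, so it can feed the defuzzification part directly.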
Further, for the final excitation intensities $[W'_1, W'_2, \ldots, W'_N]$ output by the deep convolution calculation part, the fuzzy result representation part calculates normalization weights using a normalization algorithm, showing the activation level of each rule;
the formula of the normalization algorithm is $\bar{W}_j = W'_j \big/ \sum_{k=1}^{N} W'_k$, where $\bar{W}_j$ represents the normalized excitation intensity of rule $j$ and represents the contribution of that rule to the total weight;
the back part of a fuzzy rule is a linear combination of the input data, expressed as $f_j = p_{j0} + p_{j1}x_1 + \cdots + p_{jn}x_n$, where $p_{j0}, \ldots, p_{jn}$ are the back part parameters of the $j$-th rule;
the total output $Z$ is defined as the weighted sum of all rule outputs, with the specific formula $Z = \sum_{j=1}^{N} \bar{W}_j f_j$, where the weight of each rule is its normalized excitation intensity;
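The defuzzification just described, normalising the excitation intensities and taking the weighted sum of the linear rule consequents, reduces to a few array operations. The sketch below assumes the consequent parameters are stored as a matrix $P$ with one row $[p_{j0}, \ldots, p_{jn}]$ per rule, which is an illustrative layout rather than the patent's notation.

```python
import numpy as np

def defuzzify(W_final, x, P):
    """Fuzzy result representation part.

    W_final : final excitation intensities [W'_1, ..., W'_N]
    x       : input vector of length n
    P       : (N, n+1) back part parameters, row j = [p_j0, ..., p_jn]
    Returns the crisp model output Z.
    """
    w_bar = W_final / W_final.sum()        # normalised excitation intensities
    f = P @ np.concatenate(([1.0], x))     # f_j = p_j0 + p_j1 x_1 + ... + p_jn x_n
    return float(w_bar @ f)                # Z = sum_j w_bar_j * f_j

W = np.array([1.0, 3.0])                   # two rules with intensities 1 and 3
P = np.array([[0.0, 1.0],                  # f_1 = x_1
              [1.0, 0.0]])                 # f_2 = 1
z = defuzzify(W, np.array([2.0]), P)       # 0.25 * 2 + 0.75 * 1 = 1.25
```

Because the weights are normalised, the output is a convex combination of the rule consequents, which is what makes the per-rule contributions interpretable.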
for the back part parameters $p_{ji}$, a least squares estimator (LSE) is used to obtain the optimized result: let $\theta_j = [p_{j0}, p_{j1}, \ldots, p_{jn}]^T$ and $\theta = [\theta_1, \theta_2, \ldots, \theta_N]^T$, let $F = [f_1, f_2, \ldots, f_N]$, and let the estimation error be $e$; the correction formula for $Z$ is then $Z = F\theta + e$, and the mean square error function of the result is $E(\theta) = \frac{1}{m}\sum_{t=1}^{m}\big(y_t - Z(x_t)\big)^2$, where $m$ represents the length of the dataset, $y_t$ represents the actual label of the $t$-th data item of the dataset, and $x_t$ represents the $t$-th input data from the dataset;
the optimization objective is to minimize the squared error; when $\partial E / \partial \theta = 0$ is substituted into $Z = F\theta + e$, the data can be fitted with the LSE and the back part parameters in $F$ obtained.
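Because the output is linear in the back part parameters once the normalised excitation intensities are fixed, the LSE step amounts to solving an ordinary least squares problem. The sketch below builds the ANFIS-style design matrix, where row $t$ stacks $\bar{W}_{tj}\,[1, x_t]$ for every rule $j$, and solves it with `numpy.linalg.lstsq`; the matrix layout is an assumption consistent with the formulas above, not a quotation of the patent.

```python
import numpy as np

def lse_consequents(X, y, W_bar):
    """Least squares estimate of the back part parameters.

    X     : (m, n) dataset inputs
    y     : (m,) actual labels
    W_bar : (m, N) normalised excitation intensities per sample
    Builds the linear system Z = F theta and minimises ||y - F theta||^2.
    Returns an (N, n+1) matrix with row j = [p_j0, ..., p_jn].
    """
    m, n = X.shape
    N = W_bar.shape[1]
    ones = np.hstack([np.ones((m, 1)), X])              # [1, x_1, ..., x_n]
    # row t of F: [w_bar_t1 * [1, x_t], ..., w_bar_tN * [1, x_t]]
    F = np.hstack([W_bar[:, [j]] * ones for j in range(N)])
    theta, *_ = np.linalg.lstsq(F, y, rcond=None)
    return theta.reshape(N, n + 1)

# sanity check: a single always-active rule must recover y = 2 + 3x exactly
X = np.linspace(0.0, 1.0, 6).reshape(-1, 1)
y = 2.0 + 3.0 * X[:, 0]
P = lse_consequents(X, y, np.ones((6, 1)))
```

On this noiseless example the estimator recovers the generating coefficients exactly, which is the behaviour the "fit the data and obtain the back part parameters" step relies on.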
Further, for the prediction result $\hat{y}$ output by the fuzzy result representation part, the mean square error (MSE) is used to calculate the loss value and a gradient descent (GD) back propagation algorithm is used to update the parameters of the IP-DCFNN, the loss function being $L = \frac{1}{m}\sum_{t=1}^{m}(y_t - \hat{y}_t)^2$.
Compared with the prior art, the invention has the following advantages:
the deep convolution fuzzy neural network for medical data achieves the capability of interpretability and processing high-dimensional data by adding the concept of the deep neural network on the basis of being provided with a fuzzy reasoning system, particularly, the invention expresses the fuzzification and defuzzification parts of the fuzzy reasoning system by using a network architecture, and inserts a functional module of the deep convolution calculation in the middle of the fuzzification and defuzzification, and improves the data processing capability of the whole model system by the high-latitude data mapping and processing capability of the deep convolution network in the functional module. Although the hidden weight of the deep convolution part still cannot be well explained due to the topological limitation of the neural network, the fuzzified and defuzzified parts can extract richer rule base and membership information, so that the capability of processing data of the system is improved under the condition of not losing the interpretability of the system.
The invention further aims to provide a training method for the medical data-oriented deep convolution fuzzy neural network; by providing a fuzzy membership parameter initialization algorithm based on grid partition, which adaptively fits the membership parameters in the initialized model to the distribution of the data, training efficiency is greatly improved.
In order to achieve the above purpose, the technical scheme of the invention is realized as follows:
the training method of the deep convolution fuzzy neural network facing the medical data comprises two stages of initialization of model parameters and iterative updating of parameters, wherein the training method of the deep convolution fuzzy neural network facing the medical data adopts a hybrid learning method, and specific parameters are learned and updated through different algorithms in the feedforward propagation and counter propagation processes.
Further, the hybrid learning method adopted by the training method of the medical data-oriented deep convolution fuzzy neural network is as follows:
the front part parameters are located in the fuzzy logic front part; their initialization algorithm is the grid partition initialization algorithm, their update time is back propagation, and their update algorithm is the gradient descent algorithm;
the back part parameters are located in the fuzzy result representation part; their initialization algorithm is zero initialization, their update time is feedforward propagation, and their update algorithm is the least squares estimation algorithm;
the node weights are located in the deep convolution calculation part; their initialization algorithm is the He-uniform algorithm, their update time is back propagation, and their update algorithm is the gradient descent algorithm;
the node biases are located in the deep convolution calculation part; their initialization algorithm is the He-uniform algorithm, their update time is back propagation, and their update algorithm is the gradient descent algorithm.
Further, the front part parameters of the fuzzy logic front part section are the $\{\mu, \sigma\}$ pairs in each membership function $\mu_{A_i^k}(x_i)$, where $\mu$ represents the fuzzy center and $\sigma$ represents the fuzzy width;
the front part parameters are initialized with a Grid Partition-based Initialization Algorithm;
grid partition is a method for dividing a data space: it divides the input data space into grid subspaces parallel to the axes according to the membership functions of each feature;
the grid partition-based initialization algorithm initializes the parameters of each membership function according to the predefined partition grid: for the data samples falling in a grid cell, the fuzzy center $\mu$ is initialized as the median of the data in that cell, and the fuzzy width $\sigma$ is initialized as a linear approximation of the data scale in the cell, with coefficients obtained from a linear fit;
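For one feature axis, the grid partition initialization can be sketched as below. The patent states only that $\mu$ is the in-cell median and $\sigma$ a linear approximation of the in-cell data scale whose coefficients come from a linear fit, so the concrete `scale` factor and the equal-width cells used here are assumptions for illustration.

```python
import numpy as np

def grid_partition_init(feature, n_cells=2, scale=0.5):
    """Initialise {mu, sigma} per grid cell for one feature.

    The axis is split into n_cells equal intervals; for the samples
    falling in each cell, mu is the in-cell median and sigma a linear
    function of the in-cell spread (the factor `scale` is an assumption,
    standing in for the coefficient obtained from a linear fit).
    """
    lo, hi = feature.min(), feature.max()
    edges = np.linspace(lo, hi, n_cells + 1)
    params = []
    for a, b in zip(edges[:-1], edges[1:]):
        in_cell = feature[(feature >= a) & (feature <= b)]
        if in_cell.size == 0:              # empty cell: fall back to the cell centre
            params.append(((a + b) / 2.0, (b - a) * scale))
        else:
            params.append((float(np.median(in_cell)),
                           float(scale * (in_cell.max() - in_cell.min()) + 1e-6)))
    return params

data = np.array([0.0, 0.1, 0.2, 0.8, 0.9, 1.0])
p = grid_partition_init(data)              # two cells -> 'low' and 'high' parameters
```

Because the centers track the in-cell medians, the initial membership functions already follow the data distribution, which is what makes this initialization accelerate training.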
the parameters of the deep convolution calculation part are the weights and biases of the neurons of the convolution layers and fully connected layers; they are initialized using the He-uniform algorithm, with the following specific steps:
A. sampling from a uniform distribution on $[-\text{limit}, \text{limit}]$;
B. calculating the limit as $\text{limit} = \sqrt{6 / n_l}$, where $n_l$ represents the number of input neurons of the $l$-th layer;
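The two steps above can be sketched directly; this is a generic He-uniform initializer, with the layer shape chosen arbitrarily for the example.

```python
import numpy as np

def he_uniform(fan_in, shape, rng=None):
    """He-uniform initialisation: sample U(-limit, limit), limit = sqrt(6 / fan_in)."""
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0 / fan_in)
    return rng.uniform(-limit, limit, size=shape)

# a fully connected layer with 64 input neurons and 32 output neurons
W = he_uniform(fan_in=64, shape=(32, 64), rng=np.random.default_rng(0))
```

Scaling the range by the fan-in keeps the variance of the layer activations roughly constant across depth, which is the usual motivation for He initialization.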
the back part parameters of the fuzzy result representation part are the parameters in the output linear expression of each rule, specifically $\text{Rule}_j$: $f_j = p_{j0} + p_{j1}x_1 + \cdots + p_{jn}x_n$;
a simple zero initialization is applied to all the back-piece parameters before training the model (i.e. all the back-piece parameters in the model are assigned a value of 0 at the beginning of training).
Further, the IP-DCFNN model is trained by adopting a hybrid learning method;
the model updates the back-piece parameters through a least squares estimation algorithm (LSE) in the feed-forward process, and updates the front-piece parameters and the hidden layer parameters through a gradient descent algorithm (GD) in the back-propagation process;
in back propagation, the gradients are calculated as $\frac{\partial C}{\partial a_j^{l-1}} = \sum_{i=1}^{\#n} \frac{\partial C}{\partial a_i^{l}} \cdot \frac{\partial a_i^{l}}{\partial a_j^{l-1}}$, where $C$ represents the loss function and $\#n$ represents the number of nodes of the $l$-th layer affected by the node in the $(l-1)$-th layer;
the calculated gradient values are then used to update the node weights in the network backwards, layer by layer.
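As a concrete instance of this backward update, the sketch below performs one gradient descent step on the $\{\mu, \sigma\}$ pair of a single Gaussian membership function, given the gradient flowing back from the rule layer. The learning rate and the scalar upstream gradient are illustrative assumptions; the local derivatives follow from $m(x) = \exp\!\big(-(x-\mu)^2/(2\sigma^2)\big)$.

```python
import numpy as np

def gd_update_membership(x, mu, sigma, upstream_grad, lr=0.01):
    """One gradient-descent step on a Gaussian membership's {mu, sigma}.

    upstream_grad is dC/d(membership) propagated back from the rule layer;
    the chain rule multiplies it by the local derivatives of the membership.
    """
    m = np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
    dm_dmu = m * (x - mu) / sigma ** 2          # d m / d mu
    dm_dsigma = m * (x - mu) ** 2 / sigma ** 3  # d m / d sigma
    return (mu - lr * upstream_grad * dm_dmu,
            sigma - lr * upstream_grad * dm_dsigma)

# input to the right of the center with a positive upstream gradient:
# mu moves down and sigma shrinks slightly
mu2, s2 = gd_update_membership(x=1.0, mu=0.0, sigma=1.0, upstream_grad=1.0)
```

The same chain-rule pattern, upstream gradient times local derivative, is what updates the convolution weights and biases in the deep convolution calculation part.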
Compared with the prior art, the invention has the following advantages:
according to the training method of the deep convolution fuzzy neural network for medical data, disclosed by the invention, parameters of membership functions can be simulated in an initialization model in a self-adaptive mode according to data distribution conditions through a fuzzy membership parameter initialization algorithm based on grid division, so that training efficiency is greatly improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention. In the drawings:
fig. 1 is a schematic structural diagram of a deep convolution fuzzy neural network for medical data and a training method thereof according to an embodiment of the present invention;
FIG. 2 is a membership function of a fuzzy logic front part of a medical data oriented deep convolution fuzzy neural network and a training method thereof according to an embodiment of the present invention;
FIG. 3 is a schematic structure of a depth convolution fuzzy neural network and a training method thereof for medical data according to an embodiment of the present invention;
fig. 4 is a Conv1d layer of a depth convolution fuzzy neural network and a depth convolution calculation part of a training method thereof for medical data according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of meshing of a deep convolutional fuzzy neural network and a training method thereof for medical data according to an embodiment of the present invention;
Detailed Description
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
In the description of the present invention, it should be noted that the azimuth or positional relationship indicated by the terms "upper", "lower", "inner", "back", etc. are based on the azimuth or positional relationship shown in the drawings, and are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or element referred to must have a specific azimuth, be constructed and operated in a specific azimuth, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The invention will be described in detail below with reference to the drawings in connection with embodiments.
The embodiment relates to a medical data-oriented deep convolution fuzzy neural network and a training method thereof, comprising a medical data interpretability prediction model (IP-DCFNN) based on the deep convolutional fuzzy neural network;
as shown in fig. 1, the IP-DCFNN consists of three parts: fuzzy logic front part: the fuzzy logic front part extracts input data, and the input data is converted from a numerical value into a group of membership values for fuzzy language scalar through the operation of membership functions in the fuzzy logic front part; a depth convolution calculation section: the depth convolution part extracts hidden features in the input rule weight and converts the hidden layer weight into high latitude information representation; the fuzzy result representation part: the fuzzy result representation part is used for processing defuzzification in fuzzy inference; specifically, the fuzzy logic front part and the fuzzy result representation part are respectively used as a fuzzification and defuzzification tool of a fuzzy rule in a model, and the depth convolution calculation part is used for processing hidden weights in a high-dimensional representation. Although the deep convolution part is mostly constructed by adopting a neural network, the hidden layer weight is unexplainable, but we can obtain the interpretable information from the other two parts, such as fuzzy membership degree, fuzzy rule, triggering strength of each rule and the like. Thus, the model can process medical data in an interpretable manner and avoid degradation of prediction accuracy.
The fuzzy logic front part forms the basic framework of the fuzzy if-then rules and represents the fuzzy antecedents in the rules. In this section, the input data are extracted by the fuzzy logic and converted from numerical values, through the operation of the membership functions, into a set of membership values over the fuzzy linguistic variables.
As shown in FIG. 2, the membership function $\mu_A(x)$ maps numerical input data to a value within the interval $[0, 1]$ that represents its membership of a defined fuzzy linguistic set. Among membership functions, the Gaussian membership function has stronger expressive power and a better representation effect than other functional forms, so the model proposed in this embodiment calculates membership with Gaussian membership functions. Specifically, for input data $x_i$ of $[x_1, \ldots, x_n]$, the membership function $\mu_{A_i^k}(x_i) = \exp\!\big(-(x_i - \mu)^2 / (2\sigma^2)\big)$ maps it to the fuzzy set $[\text{low}, \text{high}]$, where $k = 1, \ldots, K$ ($K$ represents the size of the fuzzy set, here 2) represents the selection of the fuzzy linguistic variable. The parameters of the membership functions are $\mu$ and $\sigma$, which greatly affect the approximation ability of the fuzzy system; in the IP-DCFNN model, this embodiment applies the grid partition-based initialization algorithm to initialize these parameters and updates them with a gradient descent algorithm in each iteration during model training.
The fuzzy logic front part adopts the fuzzy rules (TS rules) described by the Takagi-Sugeno fuzzy model:

$\text{Rule}_j$: IF $x_1$ is $A_1^{k}$ AND $\ldots$ AND $x_n$ is $A_n^{k}$ THEN $f_j = p_{j0} + p_{j1}x_1 + \cdots + p_{jn}x_n$

TS $\text{Rule}_j$ ($j = 1, \ldots, N$ and $N = K^n$) describes the content of the $j$-th fuzzy rule. The part following "IF" is the rule front part, representing each input datum in the form of a fuzzy linguistic variable, where $x_i$ ($i = 1, 2, \ldots, n$) is the input data and $A_i^k$ is a fuzzy variable in the fuzzy set. The part following "THEN" is the back part of the rule, whose result is represented by a non-fuzzy linear combination of the input variables, the output $f_j$ being the output of rule $j$.
The fuzzy logic front part can adaptively generate $N$ rules (here the front parts of the rules are generated), where $N = K^n$ ($n$ is the dimension of the input data and $K$ is the size of the fuzzy set). The basic method of adaptive generation is to permute and combine the fuzzy variables generated by the different dimensions of the input data to construct the rule front parts; multiplying together the membership degrees corresponding to each fuzzy variable in a rule front part yields the initial excitation intensity of the rule, which represents its functional weight. As in $W_j = \prod_{i=1}^{n} \mu_{A_i^{k}}(x_i)$, the initial excitation intensity of the $j$-th rule is calculated from the membership degrees of all fuzzy variables in the rule front part, i.e. the weight of the fuzzy rule is obtained by performing an AND fuzzy logic operation on each fuzzy variable.
The deep convolution calculation part is responsible for extracting hidden features from the weights of the input rules and converting the hidden-layer weights into a high-dimensional information representation, so as to characterize the hidden relations among different rules. The complex structure of the neural network can represent nonlinear relations, which greatly improves the processing capability of the whole model. Specifically, as shown in fig. 3, the deep convolution part takes the initial excitation intensities generated by the fuzzy logic front part as its initial input and outputs final excitation intensities of the same dimension; the deep convolution calculation part mainly comprises convolution layers and fully connected layers, where the data are expanded to a high dimension by one-dimensional convolution kernels in the convolution layers and flattened back to the initial dimension in the fully connected layers.
The input data are processed by the convolution $c_i^l = \sigma\!\left(\sum_{m} w_m^l\, c_{i+m}^{l-1} + b^l\right)$, where $c_i^l$ represents the $i$-th data unit in the $l$-th convolution layer, $w^l$ represents the weights of the $l$-th layer's convolution kernel, and $b^l$ represents the bias of that layer. In the network, the activation function is defined as $\sigma(x) = 1/(1 + e^{-x})$. As shown in fig. 4, this embodiment adopts one-dimensional convolution to process the excitation intensities of the input, thereby mining the relations between the input weights. The network layers in series after the convolutional layers are fully connected, meaning that each node of the $l$-th layer is connected to all nodes of the $(l-1)$-th layer; the parameter $w$ represents the connection weights, $b$ represents the bias of a node, and the formula of feedforward propagation is $a_i^l = \sigma\!\left(\sum_{j} w_{ij}^l\, a_j^{l-1} + b_i^l\right)$. This layer helps transform the dimensionality of the hidden weights back to the same scale as this part's input and prepares for the defuzzification of the next part.
The fuzzy result representation part handles the defuzzification process of fuzzy inference and helps convert the intermediate quantities of the fuzzy logic operations into the clear result finally output by the model. For the final excitation intensities $[W'_1, W'_2, \ldots, W'_N]$ output by the previous part, this section calculates normalization weights using a normalization algorithm, representing the activation level of each rule. The normalization formula is $\bar{W}_j = W'_j \big/ \sum_{k=1}^{N} W'_k$, where $\bar{W}_j$ represents the normalized excitation intensity of rule $j$ and expresses the contribution of that rule to the total weight. From the normalized excitation intensities, the importance of each rule in the rule base can be deduced interpretably.
In the TS rules described above, the back part of a fuzzy rule is a linear combination of the input data, expressed as $f_j = p_{j0} + p_{j1}x_1 + \cdots + p_{jn}x_n$, where $p_{j0}, \ldots, p_{jn}$ are the back part parameters of the $j$-th rule. The total output $Z = \sum_{j=1}^{N} \bar{W}_j f_j$ is defined as the weighted sum of all rule outputs; in particular, the weight of each rule is its normalized excitation intensity. Thus, in the fuzzy result representation part, the results of the rules are formed and the intermediate quantities of the fuzzy logic are converted into a numerical value for output, so that the model can give a prediction result. For the back part parameters $p_{ji}$, this embodiment uses a least squares estimator (LSE) to obtain the optimized result: let $\theta_j = [p_{j0}, p_{j1}, \ldots, p_{jn}]^T$ and $\theta = [\theta_1, \theta_2, \ldots, \theta_N]^T$, let $F = [f_1, f_2, \ldots, f_N]$, and assume the estimation error is $e$; the correction formula $Z = F\theta + e$ of $Z$ can then be obtained. The mean square error function of the result is $E(\theta) = \frac{1}{m}\sum_{t=1}^{m}\big(y_t - Z(x_t)\big)^2$, where $m$ represents the length of the dataset, $y_t$ represents the actual label of the $t$-th data item of the dataset, and $x_t$ represents the $t$-th input data from the dataset. Thus, the optimization objective is to minimize the squared error; when $\partial E / \partial \theta = 0$ is substituted into $Z = F\theta + e$, the LSE can be used to fit the data and obtain the back part parameters in $F$. For the prediction result $\hat{y}$ output by the fuzzy result representation part, this embodiment uses the mean square error (MSE) to calculate the loss value and uses a gradient descent (GD) back propagation algorithm to update the parameters of the IP-DCFNN, the loss function being $L = \frac{1}{m}\sum_{t=1}^{m}(y_t - \hat{y}_t)^2$.
The IP-DCFNN adopts a hybrid learning method; the training process of the IP-DCFNN model can be divided into two stages: initialization of the model parameters and iterative updating of the parameters. During training, some parameters play a critical role in the final performance of the model; these parameters are shown in Table 1:
TABLE 1

Parameter | Location | Initialization algorithm | Update time | Update algorithm
---|---|---|---|---
Front-piece parameters | Fuzzy logic front part | Grid-partition initialization | Back propagation | Gradient descent (GD)
Back-piece parameters | Fuzzy result representation part | Zero initialization | Feed-forward propagation | Least squares estimation (LSE)
Node weights | Deep convolution calculation part | He-uniform | Back propagation | Gradient descent (GD)
Node biases | Deep convolution calculation part | He-uniform | Back propagation | Gradient descent (GD)
These parameters are distributed among the three components of the IP-DCFNN. The result parameters (back-piece parameters) are updated by the least squares estimation algorithm (LSE) during feed-forward propagation, while the other parameters are updated by the gradient descent (GD) algorithm during back propagation; that is, different parameters are learned and updated by different algorithms in the feed-forward and back-propagation phases.
The initialization algorithms of the parameters are shown in Table 1. The front-piece parameters of the fuzzy logic front part are the pairs {μ, σ} in each membership function, where μ denotes the fuzzy center and σ the fuzzy width. The front-piece parameters are initialized with a grid-partition-based initialization algorithm. Specifically, grid partition is a method of dividing the data space: the input data space is divided into grid subspaces parallel to the axes according to the membership functions of each feature. As shown in fig. 5, taking two dimensions as an example, each dimension is divided along the direction parallel to its axis by the fuzzy functions of the corresponding fuzzy set; dividing multiple dimensions in this way partitions the whole data set into small grids, where each grid cell represents one combination of fuzzy variables. The grid-partition-based initialization algorithm initializes the parameters of each membership function according to this predefined partition grid: for the data samples falling in a grid cell, the fuzzy center μ is initialized to the median of the data in that cell, and the fuzzy width σ is initialized to a linear approximation of the scale of the data in the cell, whose coefficient comes from a linear fit.
Based on the foregoing, the steps of the initialization algorithm for the front piece parameters are as follows:
1) Dividing an input space into different dimensions according to attribute characteristics;
2) Sequencing the data according to each data dimension, and performing equal segmentation on the current single-dimensional feature data according to the quantity of fuzzy variables in the fuzzy set of the corresponding features;
3) For each segmented data unit, calculating the membership function parameters (μ, σ) of the corresponding fuzzy variable,
where μ is the median of the feature data values in the current data unit, σ is ρ times the difference between the maximum and minimum feature data values in the current data unit, and ρ is the linear fitting coefficient relating the membership function parameters to the distribution range of the feature data.
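Steps 1)–3) can be sketched for a single feature as follows (the function name and the choice ρ = 0.5 are illustrative assumptions; the real coefficient comes from the linear fit described above):

```python
import numpy as np

def grid_partition_init(x, n_fuzzy, rho=0.5):
    """Initialize (mu, sigma) for each fuzzy variable of one feature:
    sort the data, split it into n_fuzzy equal segments, then take
    mu = median of the segment and sigma = rho * (max - min) of it."""
    x = np.sort(np.asarray(x, dtype=float))
    params = []
    for seg in np.array_split(x, n_fuzzy):
        mu = np.median(seg)                 # fuzzy center
        sigma = rho * (seg.max() - seg.min())  # fuzzy width
        params.append((mu, sigma))
    return params

# one feature, three fuzzy variables
params = grid_partition_init(np.arange(12.0), n_fuzzy=3)
```

Because the centers follow the medians of the data segments, the initialization adapts automatically to the distribution of each feature.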
The parameters of the deep convolution calculation part are the weights and biases of the neurons in the convolution layers and fully connected layers; they are initialized with the He-uniform algorithm, in the following two steps:
A. sample the parameters from the uniform distribution [−limit, limit];
B. compute limit = sqrt(6 / n_l), where n_l denotes the number of input neurons of layer l.
The result parameters (back-piece parameters) of the fuzzy result representation part are the parameters of the output linear expression of each rule, i.e. the back-parts of the rules in the rule base. Unlike the other parameters, the result parameters are generated during forward propagation; they depend only on the current forward-propagation iteration and are independent of historical iterations, i.e. the back-piece parameters of two adjacent iterations have no direct relation. A simple zero initialization is applied to all back-piece parameters before training the model (i.e. all back-piece parameters in the model are assigned the value 0 at the start of training).
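Steps A and B, together with the zero initialization of the back-piece parameters, can be sketched as follows (the function name and layer sizes are illustrative assumptions):

```python
import numpy as np

def he_uniform(n_in, n_out, rng=None):
    """He-uniform initialization: sample weights from U[-limit, limit]
    with limit = sqrt(6 / n_in), n_in being the layer's input size."""
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0 / n_in)
    return rng.uniform(-limit, limit, size=(n_in, n_out))

W = he_uniform(64, 32, np.random.default_rng(0))  # hidden-layer weights
b = np.zeros(32)                                  # node biases
theta = np.zeros((4, 5))  # back-piece parameters: simple zero init
```

The zero-initialized back-piece parameters are harmless because the LSE overwrites them in the very first feed-forward pass.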
The IP-DCFNN model is trained with the hybrid learning method: the model updates the back-piece parameters through the least squares estimation algorithm (LSE) during the feed-forward pass, and updates the front-piece parameters and hidden-layer parameters through the gradient descent algorithm (GD) during back propagation. In back propagation, the gradient is computed by the chain rule as ∂C/∂o_i^l = Σ_{n=1..#(n)} (∂C/∂o_n^{l+1})·(∂o_n^{l+1}/∂o_i^l), where C denotes the loss function, o_i^l the output of node i in layer l, and #(n) the number of nodes of layer l+1 affected by the nodes in layer l. The weights of the nodes in the network are then updated backwards, layer by layer, with the computed gradient values.
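The chain-rule accumulation over the #(n) affected nodes can be checked numerically on a toy example (the weights, cost, and node values below are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One node o^l of layer l feeds #(n) = 2 nodes of layer l+1 through
# weights w_next; the toy cost C sums the squared layer-(l+1) outputs.
w_next = np.array([0.7, -1.3])

def cost(o_l):
    o_next = sigmoid(w_next * o_l)   # outputs of layer l+1
    return np.sum(o_next ** 2)       # toy loss C

o_l = 0.4
o_next = sigmoid(w_next * o_l)
# chain rule: dC/do^l = sum over the #(n) affected nodes of
#   (dC/do_n^{l+1}) * (do_n^{l+1}/do^l)
grad = np.sum(2.0 * o_next * o_next * (1.0 - o_next) * w_next)

# central finite difference as an independent check
eps = 1e-6
grad_fd = (cost(o_l + eps) - cost(o_l - eps)) / (2.0 * eps)
```

The analytic gradient and the finite-difference estimate agree to high precision, confirming the accumulation formula.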
The training strategy for the model is as follows:
1) Initializing model parameters through an initialization algorithm;
2) Converting the input data into membership values relative to fuzzy variables through membership functions;
3) Generating all rules in a rule base and initial excitation intensities thereof by arranging and combining fuzzy variables;
4) Processing the initial excitation intensity through a convolution kernel in a convolution calculation unit to obtain the final excitation intensity;
5) Normalizing the excitation intensity, and calculating fuzzy back-piece parameters by a least square estimation method;
6) Calculating the final output value through the linear calculation function on the inputs in the fuzzy back part;
7) Calculating a loss value by adopting a mean square error, and calculating gradient values of a fuzzy logic front part and a depth convolution calculation part;
8) And updating fuzzy front part parameters and hidden layer parameters (node weights and node offsets) by using a gradient descent algorithm.
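As a toy, end-to-end illustration of steps 1)–7) (the deep convolution unit is replaced by an identity mapping; the sizes, Gaussian membership form, and sine target are illustrative assumptions; step 8's gradient descent is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy 1-D dataset
x = rng.uniform(0.0, 1.0, size=(100, 1))
y = np.sin(2.0 * np.pi * x[:, 0])

# steps 1-2: two Gaussian fuzzy sets per feature, grid-style init
mus = np.array([0.25, 0.75])
sigmas = np.array([0.25, 0.25])
member = np.exp(-((x - mus) ** 2) / (2.0 * sigmas ** 2))  # (100, 2)

# steps 3-4: excitation intensities (identity in place of the CNN)
w = member

# step 5: normalize, then solve the back-part parameters by LSE;
# the regressor row of rule j is w_bar_j * [1, x] (TS linear back-part)
w_bar = w / w.sum(axis=1, keepdims=True)
F = np.concatenate([w_bar, w_bar * x], axis=1)  # (100, 4)
theta, *_ = np.linalg.lstsq(F, y, rcond=None)

# steps 6-7: final output and mean-square-error loss
z = F @ theta
loss = np.mean((y - z) ** 2)
```

Even this two-rule sketch fits the sine target better than a constant predictor, which is all the hybrid loop needs before the GD stage refines the memberships.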
The medical-data-oriented deep convolution fuzzy neural network and its training method add the concepts of deep neural networks on top of a fuzzy inference system, achieving both interpretability and the ability to process high-dimensional data. Specifically, the fuzzification and defuzzification parts of the fuzzy inference system are represented by a network architecture, and a deep-convolution functional module is inserted between them; the high-dimensional data mapping and processing capability of the deep convolution network in this module improves the data processing capability of the whole model. Although the hidden weights of the deep convolution part still cannot be well explained, owing to the topological limitations of neural networks, the fuzzification and defuzzification parts can extract a richer rule base and membership information, so the system's ability to process data is improved without losing its interpretability.
Meanwhile, by proposing a grid-partition-based fuzzy membership parameter initialization algorithm, the parameters of the membership functions can be initialized adaptively to the distribution of the data. This avoids the situation where the parameter initialization of the fuzzy inference part strongly influences the effectiveness and final capability of the model, and greatly improves training efficiency.
The model processes patient medical data in a targeted way by combining the fuzzy inference network with the deep convolution calculation unit. By performing high-dimensional mapping calculations on the fuzzy rule weights, the convolution unit can further mine the associated information and latent feature expressions in the patients' physiological information, which benefits inference and analysis tasks related to patient medical data.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
Claims (9)
1. A deep convolution fuzzy neural network for medical data, characterized in that: it comprises a medical data interpretability prediction model (IP-DCFNN) based on a deep convolutional fuzzy neural network;
the IP-DCFNN consists of three parts:
fuzzy logic front part: the fuzzy logic front part takes in the input data and, through the operation of its membership functions, converts the input data from numerical values into a group of membership values of fuzzy linguistic variables;
a deep convolution calculation section: the deep convolution part extracts hidden features from the input rule weights and converts the hidden-layer weights into a high-dimensional information representation;
the fuzzy result representation part: the fuzzy result representation part handles the defuzzification process in fuzzy inference.
2. The medical data oriented deep convolutional fuzzy neural network of claim 1, wherein:
the input data x i Is [ x ] 1 ,...,x n ]Within the interval, the membership functionWill input data x i Mapping to fuzzy sets low, high]Wherein k=1,..k and represents the selection of fuzzy linguistic variables, the parameters of the membership functions are μ and σ;
the fuzzy logic front part adopts Takagi-Sugeno fuzzy model to form Rule i :IF x 1 isx 2 is…x n is/>
Fuzzy rules described in (a);
the fuzzy variables generated by different dimensionalities of input data are arranged and combined to construct a regular front part, and the initial excitation intensity of the rule can be obtained by accumulating membership degrees corresponding to each fuzzy variable in the regular front part, wherein the initial excitation intensity represents the functional weight of the rule;
3. The medical data oriented deep convolutional fuzzy neural network of claim 2, wherein:
the depth convolution part takes the initial excitation intensity generated by the fuzzy logic front part as initial input and outputs final excitation intensity, wherein the final excitation intensity is the same as the initial excitation intensity in dimension;
the depth convolution calculation part mainly comprises a convolution layer and a full connection layer;
the convolution layer comprises a one-dimensional convolution layer, a ReLU activation layer and a maximum pooling layer;
the full connection layer comprises a Linear layer and a ReLU activation layer;
the initial input is passed throughThe convolution process is extended to a high dimension by the one-dimensional convolution kernel and is performed on the one-dimensional convolution kernelFlattening in the fully connected layer to an initial dimension, wherein +.>Representing the ith data element in the first convolutional layer,a weight, b representing the convolution kernel of the first layer l Representing the bias of the layer;
the network layers connected in series after the convolution layer are fully connected, each node of the first layer is connected to all nodes of the first-1 layer, the parameter w represents the connection weight, b represents the bias of the nodes, and the formula of feedforward propagation of the neural network is that
In the network layers in series after the convolutional layers, the activation function is defined as σ (x) =1/(1+e) -x ) One-dimensional convolution is used to process the excitation intensity of the input.
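The convolution block and fully connected layer described above can be sketched in plain numpy (all kernel values, weights, and the pooling width of 2 are illustrative assumptions):

```python
import numpy as np

def conv1d_block(x, kernel, bias):
    """One-dimensional convolution followed by ReLU and max-pooling
    (pool width 2), mirroring the convolution layer described above."""
    k = len(kernel)
    out = np.array([np.dot(x[i:i + k], kernel) + bias
                    for i in range(len(x) - k + 1)])
    out = np.maximum(out, 0.0)  # ReLU activation
    # max pool in pairs, dropping a trailing odd element
    return out[: len(out) // 2 * 2].reshape(-1, 2).max(axis=1)

def fc_layer(x, W, b):
    """Fully connected layer with sigmoid activation 1/(1+e^-x)."""
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))

# toy excitation-intensity vector through one conv block and one FC layer
x = np.array([1.0, 2.0, 0.5, 3.0, 1.5, 0.0])
h = conv1d_block(x, kernel=np.array([1.0, -1.0]), bias=0.0)
y = fc_layer(h, W=np.ones((2, len(h))), b=np.zeros(2))
```

A real implementation would stack several such blocks and flatten the result back to the rule-weight dimension, as the claim describes.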
4. The medical data oriented deep convolutional fuzzy neural network of claim 3, wherein:
the fuzzy result representation part calculates the final excitation intensity W 'of the output in the deep convolution calculation part' 1 ,W′ 2 ,...,W′ N ]Calculating a normalization weight by using a normalization algorithm and showing the activation level of each rule;
the formula of the normalization algorithm isWherein (1)>Representing the normalized excitation intensity of rule j and representing the contribution of that rule to the total weight;
the back-part of the fuzzy rule is the input dataIs expressed as a linear combination ofWhileThe j rule is the back part parameter;
the total output Z is defined as the weighted sum of all rule outputs, and the specific formula is Wherein the weight of each rule refers to the excitation intensity normalized by the rule;
for subsequent parametersUsing a Least Squares Estimator (LSE) to obtain an optimized result, let +.>And θ= [ θ ] 1 ,θ 2 ,...,θ N ] T Let F= [ F 1 ,f 2 ,...,f N ]When the estimated error is set to be e, the correction formula of Z is obtained as Z=Fθ+e, and the mean square error function of the result is as followsWherein m represents the length of the dataset and y t Actual tag representing the t-th data of the dataset,/->Representing the t-th input data from the dataset;
5. The medical data oriented deep convolutional fuzzy neural network of claim 4, wherein:
6. A training method of a deep convolution fuzzy neural network for medical data is characterized by comprising the following steps:
the training method of the deep convolution fuzzy neural network for the medical data adopts a hybrid learning method, and specific parameters are learned and updated through different algorithms in the feedforward propagation and the counter propagation processes.
7. The training method of the medical data-oriented deep convolution fuzzy neural network of claim 6, wherein the training method comprises the following steps: the training method of the deep convolution fuzzy neural network for medical data adopts a mixed learning method as follows:
the front part parameters are positioned in the fuzzy logic front part, the initialization algorithm is a grid division initialization algorithm, the updating time is counter-propagation, and the updating algorithm is a gradient descent algorithm;
the back-piece parameter is positioned in the fuzzy result representation part, the initialization algorithm is zero initialization algorithm, the update time is feedforward propagation, and the update algorithm is least square estimation algorithm;
the node weights are located in the deep convolution calculation part, their initialization algorithm is the He-uniform algorithm, the update time is back propagation, and the update algorithm is the gradient descent algorithm;
the node biases are located in the deep convolution calculation part, their initialization algorithm is the He-uniform algorithm, the update time is back propagation, and the update algorithm is the gradient descent algorithm.
8. The training method of the medical data-oriented deep convolution fuzzy neural network of claim 7, wherein the training method comprises the following steps:
the front-piece parameters of the fuzzy logic front part are the pairs {μ, σ} in each membership function, where μ represents the fuzzy center and σ represents the fuzzy width;
the initialization front part parameter adopts an initialization algorithm (Grid Partition-based Initialization Algorithm) based on Grid division;
the Grid Partition (Grid Partition) is a method for dividing a data space, and divides an input data space into Grid subspaces parallel to axes according to membership functions of each feature;
initializing the parameters of each membership function according to the predefined partition grid of the grid-partition-based initialization algorithm: for the data samples falling in a grid cell, the fuzzy center μ is initialized to the median of the data in the cell, and the fuzzy width σ is initialized to a linear approximation of the data scale in the cell, with coefficients obtained from a linear fit;
the parameters of the deep convolution calculation part are weights and deviations of neurons of a convolution layer and a full connection layer, and the parameters are initialized by adopting an He-uniform algorithm, and the method comprises the following specific steps of:
A. sample the parameters from the uniform distribution [−limit, limit];
B. compute limit = sqrt(6 / n_l), where n_l represents the number of input neurons of layer l;
the back-piece parameters of the fuzzy result representation part are the parameters of the output linear expression of each rule, specifically the back-part of Rule_j: IF x_1 is A_1^j AND x_2 is A_2^j AND … AND x_n is A_n^j;
a simple zero initialization is applied to all the back-piece parameters before training the model (i.e. all the back-piece parameters in the model are assigned a value of 0 at the beginning of training).
9. The training method of the medical data-oriented deep convolution fuzzy neural network of claim 8, wherein the training method comprises the following steps:
the IP-DCFNN model is trained by adopting a hybrid learning method;
the model updates the back-piece parameters through a least squares estimation algorithm (LSE) in the feed-forward process, and updates the front-piece parameters and the hidden layer parameters through a gradient descent algorithm (GD) in the back-propagation process;
in back propagation, the gradient is computed as ∂C/∂o_i^l = Σ_{n=1..#(n)} (∂C/∂o_n^{l+1})·(∂o_n^{l+1}/∂o_i^l), where C represents the loss function, o_i^l the output of node i in layer l, and #(n) the number of nodes of layer l+1 affected by the nodes in layer l;
and reversely updating the weights of the nodes in the network layer by layer through the calculated gradient values.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310431951.3A CN116384450A (en) | 2023-04-21 | 2023-04-21 | Medical data-oriented deep convolution fuzzy neural network and training method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116384450A true CN116384450A (en) | 2023-07-04 |
Family
ID=86975046
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117272233A (en) * | 2023-11-21 | 2023-12-22 | 中国汽车技术研究中心有限公司 | Diesel engine emission prediction method, apparatus, and storage medium |
CN117272233B (en) * | 2023-11-21 | 2024-05-31 | 中国汽车技术研究中心有限公司 | Diesel engine emission prediction method, apparatus, and storage medium |
Legal Events

Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |