CN116384450A - Medical data-oriented deep convolution fuzzy neural network and training method thereof - Google Patents


Info

Publication number
CN116384450A
CN116384450A (application CN202310431951.3A)
Authority
CN
China
Prior art keywords
fuzzy
layer
algorithm
convolution
rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310431951.3A
Other languages
Chinese (zh)
Inventor
周文晖
刘晓敏
何丽莉
李熙
白洪涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University filed Critical Jilin University
Priority to CN202310431951.3A priority Critical patent/CN116384450A/en
Publication of CN116384450A publication Critical patent/CN116384450A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/043 Architecture based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for mining of medical data, e.g. analysing previous cases of other patients
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Computational Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Automation & Control Theory (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention provides a medical data-oriented deep convolutional fuzzy neural network and a training method thereof, comprising a medical data interpretability prediction model (IP-DCFNN) based on the deep convolutional fuzzy neural network. The IP-DCFNN comprises three parts. Fuzzy logic front part: extracts the input data, which is converted from numerical values into a group of membership values for fuzzy linguistic variables through the operation of membership functions. Deep convolution calculation part: extracts hidden features from the input rule weights and converts the hidden-layer weights into a high-dimensional information representation. Fuzzy result representation part: handles defuzzification in fuzzy inference. The invention relates to the technical field of computers; the IP-DCFNN adds the concept of a deep convolutional neural network on the basis of a fuzzy inference system to achieve interpretable prediction for medical data.

Description

Medical data-oriented deep convolution fuzzy neural network and training method thereof
Technical Field
The invention relates to the technical field of computers, in particular to a deep convolution fuzzy neural network oriented to medical data and a training method thereof.
Background
A fuzzy inference system is limited by its data path and complex structure: it often incurs extremely high computational cost when facing large-scale, high-dimensional data, and it has difficulty expressing the relationships between data items, so the accuracy of its prediction results is inferior to that of some algorithm models with network structures.
Although neural-network-related algorithms can effectively process high-dimensional data, they construct a feedforward propagation path through the topological relations of a large number of neurons, so the weighting coefficients iterated out by the hidden-layer neurons have no direct connection with the result and cannot represent a real meaning directly related to the task being processed (the so-called black-box property). This leaves the neural network lacking interpretability and affects the credibility and acceptance of its results.
The various attribute features of medical data often carry more potential associations than other types of data, and there are partial logical associations between a patient's different physiological indicators or morphological characterization data, so some common inference models cannot process medical data well.
Disclosure of Invention
In view of this, the present invention aims to propose a deep convolution fuzzy neural network for medical data, so as to add the concept of the deep neural network on the basis of a fuzzy reasoning system to achieve the capability of both interpretability and processing high-dimensional data.
In order to achieve the above purpose, the technical scheme of the invention is realized as follows:
The deep convolution fuzzy neural network for medical data is characterized by comprising a medical data interpretability prediction model (IP-DCFNN) based on the deep convolutional fuzzy neural network;
the IP-DCFNN consists of three parts:
the fuzzy logic front part: extracts the input data, which is converted from numerical values into a group of membership values for fuzzy linguistic variables through the operation of the membership functions in the fuzzy logic front part;
the deep convolution calculation part: extracts hidden features from the input rule weights and converts the hidden-layer weights into a high-dimensional information representation;
the fuzzy result representation part: handles the defuzzification process in fuzzy inference.
Further, the membership function

μ_{A_i^k}(x_i) = exp( -(x_i - μ)² / (2σ²) )

is used to map numerical input data into the interval [0, 1];

for input data x_i in [x_1, ..., x_n], the membership function maps x_i to the fuzzy set [low, high], where k = 1, ..., K indexes the selected fuzzy linguistic variable, and the parameters of the membership function are μ and σ;

the fuzzy logic front part adopts fuzzy rules described by the Takagi-Sugeno fuzzy model:

Rule_j: IF x_1 is A_1^{k_1} AND ... AND x_n is A_n^{k_n} THEN f_j = p_{j0} + p_{j1}·x_1 + ... + p_{jn}·x_n;

the fuzzy variables generated by the different dimensions of the input data are permuted and combined to construct the rule front parts, and the initial excitation intensity of a rule, which represents the functional weight of the rule, is obtained by multiplying together the membership degrees corresponding to each fuzzy variable in the rule front part;

the initial excitation intensity of the j-th rule is calculated from the membership degrees of all fuzzy variables in the rule front part by the formula

W_j = ∏_{i=1}^{n} μ_{A_i^{k_i}}(x_i),

i.e. the weight of the fuzzy rule is obtained by performing an "AND" fuzzy logic operation on each fuzzy variable.
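The excitation-intensity computation above can be sketched in a few lines. This is not the patent's implementation, only a minimal numpy illustration under assumed shapes: Gaussian membership functions (as the embodiment specifies), K fuzzy sets per input dimension, and rule front parts enumerated as all K^n combinations; all variable names and the toy values are hypothetical.

```python
import itertools
import numpy as np

def gaussian_membership(x, mu, sigma):
    # Gaussian membership degree of x in a fuzzy set with center mu and width sigma
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def initial_excitation_intensities(x, centers, widths):
    """Initial excitation intensity W_j of each of the K^n rules.

    x       : input vector of length n
    centers : (n, K) array of mu values, K fuzzy sets per input dimension
    widths  : (n, K) array of sigma values
    """
    n, K = centers.shape
    mships = gaussian_membership(x[:, None], centers, widths)   # (n, K)
    strengths = []
    # permute and combine the fuzzy variables of each dimension: K^n rule front parts
    for combo in itertools.product(range(K), repeat=n):
        # "AND" operation = product of the membership degrees in the front part
        strengths.append(np.prod([mships[i, k] for i, k in enumerate(combo)]))
    return np.array(strengths)

x = np.array([0.2, 0.8])
centers = np.array([[0.0, 1.0], [0.0, 1.0]])   # "low" centered at 0, "high" at 1
widths = np.full((2, 2), 0.5)
W = initial_excitation_intensities(x, centers, widths)          # K^n = 2^2 = 4 rules
```

Each entry of `W` is a product of n membership degrees, so it lies in (0, 1] and directly implements the "AND" operation of the rule front part.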
Further, the deep convolution part takes the initial excitation intensity generated by the fuzzy logic front part as its initial input and outputs a final excitation intensity of the same dimension as the initial excitation intensity;

the deep convolution calculation part mainly comprises a convolution layer and a fully connected layer;

the convolution layer comprises a one-dimensional convolution layer, a ReLU activation layer and a maximum pooling layer;

the fully connected layer comprises a Linear layer and a ReLU activation layer;

the initial input is processed by the convolution

c_i^l = σ( Σ_k w_k^l · c_{i+k}^{l-1} + b^l ),

expanded to a high dimension by the one-dimensional convolution kernel, and flattened back to the initial dimension in the fully connected layer, where c_i^l represents the i-th data unit in the l-th convolution layer, w^l represents the weights of the convolution kernel of the l-th layer, and b^l represents the bias of that layer;

the network layers in series after the convolution layers are fully connected, meaning that each node of the l-th layer is connected to all nodes of the (l-1)-th layer; the parameter w represents the connection weights, b represents the bias of a node, and the formula of the feedforward propagation of the neural network is

a^l = σ( w^l a^{l-1} + b^l );

in the network layers in series after the convolution layers, the activation function is defined as σ(x) = 1/(1 + e^{-x}), and one-dimensional convolution is used to process the excitation intensities of the input.
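The data flow of this part (1-D convolution, ReLU, max pooling, then a fully connected layer restoring the original dimension) can be sketched as follows. This is an assumed minimal version, not the patent's network: layer sizes, the single convolution/FC pair, and the use of the logistic activation at the output are all illustrative choices.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def sigmoid(v):
    # the activation sigma(x) = 1 / (1 + e^-x) from the description
    return 1.0 / (1.0 + np.exp(-v))

def conv1d(x, kernel, bias):
    # valid 1-D convolution: c_i = sum_k w_k * x_{i+k} + b
    m = len(kernel)
    return np.array([x[i:i + m] @ kernel + bias for i in range(len(x) - m + 1)])

def maxpool1d(x, size=2):
    trimmed = x[: len(x) // size * size]
    return trimmed.reshape(-1, size).max(axis=1)

def deep_conv_part(w_init, kernel, bias, W_fc, b_fc):
    """Map initial excitation intensities to final ones of the same dimension."""
    h = maxpool1d(relu(conv1d(w_init, kernel, bias)))   # convolution block
    return sigmoid(W_fc @ h + b_fc)                     # fully connected block

rng = np.random.default_rng(0)
w_init = rng.random(8)                       # initial excitation intensities (N = 8 rules)
kernel, bias = rng.standard_normal(3), 0.1
W_fc, b_fc = rng.standard_normal((8, 3)), rng.standard_normal(8)  # 3 = pooled hidden size
w_final = deep_conv_part(w_init, kernel, bias, W_fc, b_fc)
assert w_final.shape == w_init.shape         # same dimension in and out
```

The fully connected layer is what flattens the hidden representation back to the rule count, so the next part can defuzzify rule by rule.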
Further, for the final excitation intensity [W'_1, W'_2, ..., W'_N] output by the deep convolution calculation part, the fuzzy result representation part calculates normalized weights with a normalization algorithm, representing the activation level of each rule;

the formula of the normalization algorithm is

W̄_j = W'_j / Σ_{k=1}^{N} W'_k,

where W̄_j represents the normalized excitation intensity of rule j and represents the contribution of that rule to the total weight;

the back part of a fuzzy rule is a linear combination of the input data, expressed as

f_j = p_{j0} + p_{j1}·x_1 + ... + p_{jn}·x_n,

where θ_j = [p_{j0}, p_{j1}, ..., p_{jn}] are the back-part parameters of the j-th rule;

the total output Z is defined as the weighted sum of all rule outputs, with the specific formula

Z = Σ_{j=1}^{N} W̄_j · f_j,

where the weight of each rule is its normalized excitation intensity;
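The defuzzification step (normalize the excitation intensities, evaluate each rule's linear back part, then take the weighted sum Z) can be sketched directly; the numbers below are a hypothetical toy case, not values from the patent.

```python
import numpy as np

def defuzzify(w_final, theta, x):
    w_bar = w_final / w_final.sum()        # normalized excitation intensities
    x_aug = np.concatenate(([1.0], x))     # prepend 1 for the constant term p_j0
    f = theta @ x_aug                      # back part of each rule: linear in the inputs
    return w_bar @ f                       # total output Z: weighted sum of rule outputs

# toy values: N = 3 rules, n = 1 input
w_final = np.array([2.0, 1.0, 1.0])
theta = np.array([[0.0, 1.0],
                  [1.0, 0.0],
                  [0.5, 0.5]])            # row j = [p_j0, p_j1]
x = np.array([2.0])
Z = defuzzify(w_final, theta, x)           # 0.5*2 + 0.25*1 + 0.25*1.5 = 1.625
```

Because the weights are normalized to sum to 1, each W̄_j can be read off directly as that rule's share of the prediction, which is where the model's interpretability comes from.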
for the back-part parameters θ_j, a least squares estimator (LSE) is used to obtain the optimized result: let θ = [θ_1, θ_2, ..., θ_N]^T and let F = [f̄_1, f̄_2, ..., f̄_N] collect the weighted rule outputs; with the estimation error denoted e, the corrected formula for Z is obtained as Z = Fθ + e, and the mean square error function of the result is

E = (1/m) Σ_{t=1}^{m} ( y_t - Z(x^(t)) )²,

where m represents the length of the dataset, y_t represents the actual label of the t-th data item of the dataset, and x^(t) represents the t-th input data from the dataset;

the optimization objective is to minimize this squared error; substituting Z = Fθ + e, the data can be fitted with LSE to obtain the back-part parameters in θ.
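The Z = Fθ least-squares fit can be sketched with `numpy.linalg.lstsq`. This is an assumed reading of the design matrix (each row stacks the normalized excitation intensities times the augmented input), not the patent's code; shapes and the synthetic data are illustrative.

```python
import numpy as np

def fit_back_part(W_bar, X, y):
    """Least squares estimate of the back-part parameters.

    W_bar : (m, N) normalized excitation intensities per sample
    X     : (m, n) input data, y : (m,) labels
    Returns theta of shape (N, n + 1) with rows [p_j0, p_j1, ..., p_jn].
    """
    m, N = W_bar.shape
    X_aug = np.hstack([np.ones((m, 1)), X])                    # (m, n + 1)
    # design matrix F: row t stacks W_bar_tj * [1, x_t] over all rules j,
    # so that Z_t = F_t . theta_flat  (the Z = F theta form)
    F = (W_bar[:, :, None] * X_aug[:, None, :]).reshape(m, -1)
    theta_flat, *_ = np.linalg.lstsq(F, y, rcond=None)
    return theta_flat.reshape(N, -1)

# synthetic check: data generated from known back-part parameters is refit exactly
rng = np.random.default_rng(1)
X = rng.random((50, 2))
W_bar = rng.random((50, 3))
W_bar /= W_bar.sum(axis=1, keepdims=True)
true_theta = rng.standard_normal((3, 3))
X_aug = np.hstack([np.ones((50, 1)), X])
y = np.einsum('mj,jk,mk->m', W_bar, true_theta, X_aug)
theta = fit_back_part(W_bar, X, y)
```

Because Z is linear in the back-part parameters once the excitation intensities are fixed, this one-shot solve replaces iterative updates for θ during feedforward propagation.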
Further, for the prediction result ŷ output by the fuzzy result representation part, a loss value is calculated using the mean square error (MSE) and the parameters of the IP-DCFNN are updated using gradient descent (GD) via backpropagation, the loss function being

L = (1/m) Σ_{t=1}^{m} ( y_t - ŷ_t )².
Compared with the prior art, the invention has the following advantages:
the deep convolution fuzzy neural network for medical data achieves the capability of interpretability and processing high-dimensional data by adding the concept of the deep neural network on the basis of being provided with a fuzzy reasoning system, particularly, the invention expresses the fuzzification and defuzzification parts of the fuzzy reasoning system by using a network architecture, and inserts a functional module of the deep convolution calculation in the middle of the fuzzification and defuzzification, and improves the data processing capability of the whole model system by the high-latitude data mapping and processing capability of the deep convolution network in the functional module. Although the hidden weight of the deep convolution part still cannot be well explained due to the topological limitation of the neural network, the fuzzified and defuzzified parts can extract richer rule base and membership information, so that the capability of processing data of the system is improved under the condition of not losing the interpretability of the system.
A further aim of the invention is to provide a training method for the medical data-oriented deep convolutional fuzzy neural network; by providing a fuzzy membership parameter initialization algorithm based on grid partitioning, the membership parameters in the initialized model adapt to the data distribution, which greatly improves training efficiency.
In order to achieve the above purpose, the technical scheme of the invention is realized as follows:
the training method of the deep convolution fuzzy neural network facing the medical data comprises two stages of initialization of model parameters and iterative updating of parameters, wherein the training method of the deep convolution fuzzy neural network facing the medical data adopts a hybrid learning method, and specific parameters are learned and updated through different algorithms in the feedforward propagation and counter propagation processes.
Further, the training method of the deep convolution fuzzy neural network for medical data adopts a mixed learning method as follows:
the front-part parameters are located in the fuzzy logic front part; their initialization algorithm is the grid-partition initialization algorithm, their update time is backpropagation, and their update algorithm is the gradient descent algorithm;
the back-part parameters are located in the fuzzy result representation part; their initialization algorithm is the zero initialization algorithm, their update time is feedforward propagation, and their update algorithm is the least squares estimation algorithm;
the node weights are located in the deep convolution calculation part; their initialization algorithm is the He-uniform algorithm, their update time is backpropagation, and their update algorithm is the gradient descent algorithm;
the node biases are located in the deep convolution calculation part; their initialization algorithm is the He-uniform algorithm, their update time is backpropagation, and their update algorithm is the gradient descent algorithm.
Further, the front-part parameters of the fuzzy logic front part are the {μ, σ} in each membership function μ_{A_i^k}(x_i), where μ represents the fuzzy center and σ represents the fuzzy width;

the front-part parameters are initialized with a grid-partition-based initialization algorithm (Grid Partition-based Initialization Algorithm);

grid partition is a method of dividing the data space: according to the membership functions of each feature, the input data space is divided into grid subspaces parallel to the axes;

the grid-partition-based initialization algorithm initializes the parameters of each membership function according to the predefined partition grid: for the data samples falling in a grid cell, the fuzzy center μ is initialized as the median of the data in that cell, and the fuzzy width σ is initialized as a linear approximation of the data scale in the cell, with coefficients obtained from a linear fit;
the parameters of the deep convolution calculation part are the weights and biases of the neurons of the convolution layer and the fully connected layer, and they are initialized with the He-uniform algorithm, with the following specific steps:
A. sample from the uniform distribution [-limit, limit];
B. the limit is calculated as limit = sqrt(6 / n_l), where n_l represents the number of input neurons in the l-th layer;
the back-part parameters of the fuzzy result representation part are the parameters in the output linear expression of each rule, specifically the parameters of the result part of Rule_j,

f_j = p_{j0} + p_{j1}·x_1 + ... + p_{jn}·x_n;

a simple zero initialization is applied to all back-part parameters before training the model (i.e. all back-part parameters in the model are assigned the value 0 at the start of training).
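The three initialization schemes can be sketched together. This is a hedged reconstruction, not the patent's code: the grid-partition sketch uses equal-width cells and substitutes a simple data-scale proxy for the patent's linear fit of σ, and all names are hypothetical.

```python
import numpy as np

def grid_partition_init(feature, K):
    """Grid-partition initialization of {mu, sigma} for one input feature.

    The feature axis is split into K equal-width cells; mu is the median of the
    samples in each cell, and sigma is a rough data-scale proxy for the cell
    (standing in for the patent's linear approximation of the in-cell scale).
    """
    edges = np.linspace(feature.min(), feature.max(), K + 1)
    mus, sigmas = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        cell = feature[(feature >= lo) & (feature <= hi)]
        if cell.size == 0:                    # empty cell: fall back to cell geometry
            cell = np.array([(lo + hi) / 2.0])
        mus.append(np.median(cell))
        sigmas.append(max(cell.std(), (hi - lo) / 4.0))
    return np.array(mus), np.array(sigmas)

def he_uniform(fan_in, shape, rng):
    # He-uniform: sample from U[-limit, limit] with limit = sqrt(6 / fan_in)
    limit = np.sqrt(6.0 / fan_in)
    return rng.uniform(-limit, limit, size=shape)

def zero_init(n_rules, n_inputs):
    # back-part parameters [p_j0, ..., p_jn] all start at 0
    return np.zeros((n_rules, n_inputs + 1))

rng = np.random.default_rng(0)
mus, sigmas = grid_partition_init(rng.random(200), K=4)
```

Grid-partition initialization places each Gaussian center where the data actually sits, which is what the method credits for the improved training efficiency.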
Further, the IP-DCFNN model is trained by adopting a hybrid learning method;
the model updates the back-piece parameters through a least squares estimation algorithm (LSE) in the feed-forward process, and updates the front-piece parameters and the hidden layer parameters through a gradient descent algorithm (GD) in the back-propagation process;
in backpropagation, the gradients are calculated by the chain rule as

∂C/∂w_{ij}^l = δ_i^l · a_j^{l-1},  with  δ_i^l = σ'(z_i^l) Σ_n w_{ni}^{l+1} δ_n^{l+1},

where C represents the loss function and n ranges over the nodes of the (l+1)-th layer affected by the nodes in the l-th layer;
the weights of the nodes in the network are then updated backwards, layer by layer, using the calculated gradient values.
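One epoch of the hybrid scheme (LSE for the back-part parameters in the feedforward pass, gradient descent for the other parameters in the backward pass) can be sketched on a one-parameter toy model. This is an illustrative analogue, not the IP-DCFNN itself: the model, closures, and the central-difference gradient standing in for analytic backpropagation are all assumptions.

```python
import numpy as np

def hybrid_epoch(params, fit_lse, loss_fn, lr=0.1, eps=1e-5):
    # feedforward: back-part (consequent) parameters from least squares estimation
    params['consequent'] = fit_lse(params['antecedent'])
    # backward pass: gradient descent on the front-part (antecedent) parameters;
    # a central-difference numerical gradient stands in for the analytic one
    a = params['antecedent'].astype(float)
    grad = np.zeros_like(a)
    for i in range(a.size):
        step = np.zeros_like(a)
        step.flat[i] = eps
        grad.flat[i] = (loss_fn(a + step, params['consequent'])
                        - loss_fn(a - step, params['consequent'])) / (2 * eps)
    params['antecedent'] = a - lr * grad
    return params

# toy model: y_hat = c * exp(-(x - mu)^2); mu plays the front-part role (GD),
# c the back-part role (closed-form LSE each feedforward pass)
rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, 100)
y = 2.0 * np.exp(-(x - 0.3) ** 2)           # ground truth: mu = 0.3, c = 2
phi = lambda mu: np.exp(-(x - mu[0]) ** 2)
fit_lse = lambda mu: np.array([phi(mu) @ y / (phi(mu) @ phi(mu))])
loss_fn = lambda mu, c: float(np.mean((y - c[0] * phi(mu)) ** 2))
params = {'antecedent': np.array([0.0]), 'consequent': np.array([0.0])}
for _ in range(500):
    params = hybrid_epoch(params, fit_lse, loss_fn)
```

Alternating the closed-form LSE solve with gradient steps is what lets the linear back parts stay optimal for whatever the nonlinear front parts currently are.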
Compared with the prior art, the invention has the following advantages:
With the training method of the medical data-oriented deep convolutional fuzzy neural network disclosed by the invention, the grid-partition-based fuzzy membership parameter initialization algorithm adaptively fits the parameters of the membership functions in the initialized model to the data distribution, which greatly improves training efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention. In the drawings:
fig. 1 is a schematic structural diagram of a deep convolution fuzzy neural network for medical data and a training method thereof according to an embodiment of the present invention;
FIG. 2 is a membership function of a fuzzy logic front part of a medical data oriented deep convolution fuzzy neural network and a training method thereof according to an embodiment of the present invention;
FIG. 3 is a schematic structure of a depth convolution fuzzy neural network and a training method thereof for medical data according to an embodiment of the present invention;
fig. 4 is a Conv1d layer of a depth convolution fuzzy neural network and a depth convolution calculation part of a training method thereof for medical data according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of meshing of a deep convolutional fuzzy neural network and a training method thereof for medical data according to an embodiment of the present invention.
Detailed Description
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other.
In the description of the present invention, it should be noted that the orientation or positional relationships indicated by the terms "upper", "lower", "inner", "back", etc. are based on the orientations or positional relationships shown in the drawings, and are merely for convenience of describing the present invention and simplifying the description; they do not indicate or imply that the apparatus or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The invention will be described in detail below with reference to the drawings in connection with embodiments.
The embodiment relates to a medical data-oriented deep convolutional fuzzy neural network and a training method thereof, comprising a medical data interpretability prediction model (IP-DCFNN) based on the deep convolutional fuzzy neural network;
As shown in fig. 1, the IP-DCFNN consists of three parts. Fuzzy logic front part: extracts the input data, which is converted from numerical values into a group of membership values for fuzzy linguistic variables through the operation of the membership functions. Deep convolution calculation part: extracts hidden features from the input rule weights and converts the hidden-layer weights into a high-dimensional information representation. Fuzzy result representation part: handles defuzzification in fuzzy inference. Specifically, the fuzzy logic front part and the fuzzy result representation part serve respectively as the fuzzification and defuzzification tools for the fuzzy rules in the model, and the deep convolution calculation part processes the hidden weights in a high-dimensional representation. Although the deep convolution part is mostly constructed with a neural network and its hidden-layer weights are unexplainable, interpretable information can be obtained from the other two parts, such as the fuzzy membership degrees, the fuzzy rules and the triggering strength of each rule. Thus, the model can process medical data in an interpretable manner while avoiding degradation of prediction accuracy.
The fuzzy logic front part forms the basic framework of the fuzzy if-then rules and represents the fuzzy antecedents in the rules. In this part, the input data is extracted by the fuzzy logic and converted from numerical values into a set of membership values for fuzzy linguistic variables through the operation of the membership functions.
As shown in FIG. 2, the membership function μ*(x) maps a numerical input to a value in the interval [0, 1] that represents its membership in a defined fuzzy linguistic set. Among membership functions, the Gaussian membership function has stronger expressive power and a better expression effect than other function forms, so the model proposed in this embodiment calculates membership with the Gaussian membership function. Specifically, for input data x_i of [x_1, ..., x_n], the membership function

μ_{A_i^k}(x_i) = exp( -(x_i - μ)² / (2σ²) )

maps it to the fuzzy set [low, high], where k = 1, ..., K (K represents the size of the fuzzy set, here 2) indexes the fuzzy linguistic variable. The parameters of the membership function are μ and σ, which greatly affect the approximation ability of the fuzzy system; in the IP-DCFNN model, this embodiment applies a grid-partition-based initialization algorithm to initialize these parameters and updates them with a gradient descent algorithm in each iteration during model training.
The fuzzy logic front part adopts fuzzy rules (TS rules) described by the Takagi-Sugeno fuzzy model:

Rule_j: IF x_1 is A_1^{k_1} AND ... AND x_n is A_n^{k_n} THEN f_j = p_{j0} + p_{j1}·x_1 + ... + p_{jn}·x_n.

TS Rule_j (j = 1, ..., N and N = K^n) describes the content of the j-th fuzzy rule. The part following "IF" is the rule front part, representing each input data item in the form of a fuzzy linguistic variable, where x_i (i = 1, 2, ..., n) is the input data and A_i^k is a fuzzy variable in the fuzzy set. The part following "THEN" is the back part of the rule, whose result is represented by an unambiguous linear combination of the input variables, where the output f_j represents the result of rule j.

The fuzzy logic front part can adaptively generate N rules (here, the front parts of the generated rules are specified), where N = K^n (n is the dimension of the input data and K is the size of the fuzzy set). The basic method of this adaptive generation is to permute and combine the fuzzy variables generated by the different dimensions of the input data to construct the rule front parts; multiplying together the membership degrees corresponding to each fuzzy variable in a rule front part yields the initial excitation intensity of the rule, which represents its functional weight. As in

W_j = ∏_{i=1}^{n} μ_{A_i^{k_i}}(x_i),

the initial excitation intensity of the j-th rule is calculated from the membership degrees of all fuzzy variables in the rule front part, i.e. the weight of the fuzzy rule is obtained by performing the "AND" fuzzy logic operation on each fuzzy variable.
The deep convolution calculation part is responsible for extracting hidden features from the weights of the input rules and converting the hidden-layer weights into a high-dimensional information representation, so as to characterize the hidden relationships among different rules. The complex structure of the neural network can represent nonlinear relationships, which greatly improves the processing capability of the whole model. Specifically, as shown in fig. 3, the deep convolution part takes the initial excitation intensity generated by the fuzzy logic front part as its initial input and outputs a final excitation intensity of the same dimension; the deep convolution calculation part mainly comprises a convolution layer and a fully connected layer, where data is expanded to a high dimension by a one-dimensional convolution kernel in the convolution layer and flattened back to the initial dimension in the fully connected layer.
The input data is processed by the convolution

c_i^l = σ( Σ_k w_k^l · c_{i+k}^{l-1} + b^l ),

where c_i^l represents the i-th data unit in the l-th convolution layer, w^l represents the weights of the convolution kernel of the l-th layer, and b^l represents the bias of that layer. In the network, the activation function is defined as σ(x) = 1/(1 + e^{-x}). As shown in fig. 4, this embodiment adopts one-dimensional convolution to process the input excitation intensities, thereby mining the relationships between the input weights. The network layers in series after the convolution layers are fully connected, meaning that each node of the l-th layer is connected to all nodes of the (l-1)-th layer; the parameter w represents the connection weights, b represents the bias of a node, and the formula of the feedforward propagation is

a^l = σ( w^l a^{l-1} + b^l ).

This layer helps transform the dimensionality of the hidden weights back to the same scale as the part's input and prepares for the defuzzification of the next part.
The fuzzy result representation part handles the defuzzification process in fuzzy inference and helps convert the intermediate quantities of the fuzzy logic into the clear result finally output by the model. For the final excitation intensity [W'_1, W'_2, ..., W'_N] output by the previous part, this part calculates normalized weights with a normalization algorithm, representing the activation level of each rule. The normalization formula is

W̄_j = W'_j / Σ_{k=1}^{N} W'_k,

where W̄_j represents the normalized excitation intensity of rule j and indicates the contribution of the rule to the total weight. From the normalized excitation intensities, the importance of each rule in the rule base can be deduced interpretably.
In the TS rule described above, the back part of the fuzzy rule is a linear combination of the input data, expressed as

$$f_j = \theta_{j0} + \theta_{j1}x_1 + \theta_{j2}x_2 + \cdots + \theta_{jn}x_n$$

where $\theta_j = [\theta_{j0}, \theta_{j1}, \ldots, \theta_{jn}]$ are the back-part parameters of the j-th rule. The total output Z is defined as the weighted sum of all rule outputs,

$$Z = \sum_{j=1}^{N} \bar{W}_j f_j$$

where the weight of each rule is its normalized excitation intensity $\bar{W}_j$. Thus, in the fuzzy result representation part, the results of the rules are combined and the intermediate quantities of the fuzzy logic are converted into a numerical value for output, so that the model can give a prediction result. For the back-part parameters $\theta_j$, the present embodiment uses a least squares estimator (LSE) to obtain the optimized result. Let $f_j = \bar{W}_j\,[1, x_1, \ldots, x_n]$, $\theta = [\theta_1, \theta_2, \ldots, \theta_N]^T$, and $F = [f_1, f_2, \ldots, f_N]$; assuming the estimation error is e, the correction formula Z = Fθ + e of Z is obtained. The mean square error function of the result is

$$E = \frac{1}{m}\sum_{t=1}^{m}\left(y_t - Z(\vec{x}_t)\right)^2$$

where m represents the length of the dataset, $y_t$ represents the actual label of the t-th data sample, and $\vec{x}_t$ represents the t-th input data from the dataset. Thus, the optimization objective is to minimize the squared error; when Z = Fθ + e is substituted, the LSE can be used to fit the data and obtain the back-part parameters. The fuzzy result representation part outputs the prediction result $\hat{y}$; the present embodiment uses the mean square error (MSE) to calculate the loss value and uses gradient descent via back propagation to update the parameters of the IP-DCFNN, the loss function being

$$L = \frac{1}{m}\sum_{t=1}^{m}\left(y_t - \hat{y}_t\right)^2$$
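The LSE fit of the back-part parameters can be sketched with a plain least-squares solve (hypothetical helper names; rows of F are built from the normalized rule weights and the inputs, as described above):

```python
import numpy as np

def lse_consequents(x, w_bar, y):
    """Fit TS-rule back-part (consequent) parameters by least squares.

    x:     (m, n) input data
    w_bar: (m, N) normalized excitation intensities per sample
    y:     (m,)   target outputs
    Returns theta with shape (N, n + 1): one [bias, coefficients] row per rule.
    """
    m, n = x.shape
    N = w_bar.shape[1]
    # Row for sample t: concatenation over rules j of w_bar[t, j] * [1, x_t]
    x1 = np.hstack([np.ones((m, 1)), x])                               # (m, n+1)
    F = (w_bar[:, :, None] * x1[:, None, :]).reshape(m, N * (n + 1))   # (m, N*(n+1))
    theta, *_ = np.linalg.lstsq(F, y, rcond=None)
    return theta.reshape(N, n + 1)

# Toy check: 2 rules, outputs generated from known consequent parameters
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))
w = rng.uniform(0.1, 1.0, size=(50, 2))
w_bar = w / w.sum(axis=1, keepdims=True)
true_theta = np.array([[0.5, 1.0, -2.0], [1.5, 0.0, 3.0]])
x1 = np.hstack([np.ones((50, 1)), x])
y = np.einsum('mj,mk,jk->m', w_bar, x1, true_theta)
est = lse_consequents(x, w_bar, y)
```

With noise-free data the solve recovers the generating parameters, which is why the LSE step can run inside a single feed-forward pass.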
The IP-DCFNN adopts a hybrid learning method; the training process of the IP-DCFNN model can be divided into two stages: initialization of the model parameters and iterative updating of the parameters. During training, some parameters play a critical role in the final performance of the model; these parameters are shown in Table 1:

Parameter              Component                          Initialization algorithm       Update time                Update algorithm
Front-part parameters  Fuzzy logic front part             Grid-partition initialization  Back propagation           Gradient descent (GD)
Back-part parameters   Fuzzy result representation part   Zero initialization            Feed-forward propagation   Least squares estimation (LSE)
Node weights           Deep convolution calculation part  He-uniform initialization      Back propagation           Gradient descent (GD)
Node biases            Deep convolution calculation part  He-uniform initialization      Back propagation           Gradient descent (GD)

TABLE 1

These parameters are distributed among the three components of the IP-DCFNN. The result parameters (back-part parameters) are updated by the least squares estimation (LSE) algorithm during feed-forward propagation, while the other parameters are updated by the gradient descent (GD) algorithm during back propagation; that is, the parameters are learned and updated by different algorithms during the feed-forward and back propagation processes.
The initialization algorithms of the parameters are shown in Table 1. The front-part parameters of the fuzzy logic front part are the parameters {μ, σ} of each membership function, where μ represents the fuzzy center and σ represents the fuzzy width. The front-part parameters are initialized with a grid-partition-based initialization algorithm. Specifically, grid partition is a method of dividing the data space: it divides the input data space into grid subspaces parallel to the axes according to the membership functions of each feature. As shown in fig. 5, taking two dimensions as an example, each dimension is divided, along the direction parallel to its axis, by the fuzzy functions corresponding to its fuzzy set; dividing the multiple dimensions of the data partitions the whole data set into small grids, and each grid partition represents one combination of fuzzy variables. Based on the grid-partition initialization algorithm, the parameters of each membership function are initialized according to the predefined partition grid: for the data samples falling in a grid, the fuzzy center μ is initialized to the median of the data in that grid, and the fuzzy width σ is initialized to a linear approximation of the scale of the data in the grid, whose coefficient comes from a linear fit.
Based on the foregoing, the steps of the initialization algorithm for the front piece parameters are as follows:
1) Dividing an input space into different dimensions according to attribute characteristics;
2) Sort the data along each data dimension, and split the current single-dimensional feature data into equal segments according to the number of fuzzy variables in the fuzzy set of the corresponding feature;
3) For each segmented data unit, calculate the membership function parameters (μ, σ) of the corresponding fuzzy variable:
where μ is the median of the feature data values in the current data unit, σ is ρ times the difference between the maximum and minimum feature data values in the current data unit, and ρ is the linear fitting coefficient between the membership-function calculation result and the feature data distribution range.
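A minimal sketch of these three steps for a single feature (hypothetical helper; ρ is the linear fitting coefficient from step 3, taken here as a given constant):

```python
import numpy as np

def init_membership_params(feature, n_fuzzy, rho=0.2):
    """Grid-partition initialization of (mu, sigma) for one feature.

    Sort the feature values, split them into n_fuzzy equal segments
    (one per fuzzy variable), and for each segment set
      mu    = median of the segment,
      sigma = rho * (max - min) of the segment.
    """
    vals = np.sort(np.asarray(feature, dtype=float))
    segments = np.array_split(vals, n_fuzzy)
    params = []
    for seg in segments:
        mu = float(np.median(seg))
        sigma = float(rho * (seg.max() - seg.min()))
        params.append((mu, sigma))
    return params

# Example: 3 fuzzy variables (e.g., "low", "medium", "high") for one feature
params = init_membership_params(np.arange(12), n_fuzzy=3)
```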
The parameters of the deep convolution calculation part are the weights and biases of the neurons in the convolution layers and fully connected layers; these parameters are initialized with the He-uniform algorithm, with the following specific steps:
A. sample from the uniform distribution [-limit, limit];
B. limit is calculated as

$$\text{limit} = \sqrt{\frac{6}{n_l}}$$

where $n_l$ represents the number of input neurons of layer l.
The result parameters (back-part parameters) of the fuzzy result representation part are the parameters in the output linear expression of each rule, i.e., the $\theta_j$ shown in the result part of the TS rule. Unlike the other parameters, the result parameters are generated during forward propagation; they depend on a single iteration of forward propagation and are independent of historical iterations, i.e., the back-part parameters in two adjacent iterations have no direct relation. A simple zero initialization is applied to all the back-part parameters before training the model (i.e., all the back-part parameters in the model are assigned the value 0 at the beginning of training).
The IP-DCFNN model is trained with the hybrid learning method: the model updates the back-part parameters through the least squares estimation (LSE) algorithm during the feed-forward pass, and updates the front-part parameters and the hidden layer parameters through the gradient descent (GD) algorithm during back propagation. In back propagation, the gradients are calculated by the chain rule as

$$\frac{\partial C}{\partial O_i^{l}} = \sum_{n=1}^{\#(n)} \frac{\partial C}{\partial O_n^{l+1}} \cdot \frac{\partial O_n^{l+1}}{\partial O_i^{l}}$$

where C represents the loss function, $O_i^l$ represents the output of node i in layer l, and #(n) represents the number of nodes in layer l+1 affected by the nodes in layer l. The weights of the nodes in the network are then updated backwards, layer by layer, with the calculated gradient values.
The training strategy for the model is as follows:
1) Initializing model parameters through an initialization algorithm;
2) Converting the input data into membership values relative to fuzzy variables through membership functions;
3) Generating all rules in a rule base and initial excitation intensities thereof by arranging and combining fuzzy variables;
4) Processing the initial excitation intensity through a convolution kernel in a convolution calculation unit to obtain the final excitation intensity;
5) Normalizing the excitation intensity, and calculating fuzzy back-piece parameters by a least square estimation method;
6) Calculating the final output value through the linear function of the inputs in the fuzzy back part;
7) Calculating a loss value by adopting a mean square error, and calculating gradient values of a fuzzy logic front part and a depth convolution calculation part;
8) Updating the fuzzy front-part parameters and the hidden layer parameters (node weights and node biases) with the gradient descent algorithm.
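Under stated simplifications (the deep convolution unit replaced by an identity placeholder, Gaussian membership functions assumed, hypothetical names throughout), the data flow of one feed-forward hybrid pass can be sketched as:

```python
import numpy as np
from itertools import product

def gauss(x, mu, sigma):
    return np.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def toy_hybrid_pass(x, y, mf_params):
    """One feed-forward hybrid pass: fuzzify -> rules -> (identity 'conv')
    -> normalize -> LSE consequents -> predict -> MSE loss."""
    m, n = x.shape
    # Steps 2-3: membership values, then one rule per combination of fuzzy variables
    combos = list(product(*[range(len(p)) for p in mf_params]))
    W = np.ones((m, len(combos)))
    for j, combo in enumerate(combos):
        for i, k in enumerate(combo):
            mu, sigma = mf_params[i][k]
            W[:, j] *= gauss(x[:, i], mu, sigma)
    # Step 4: the deep convolution unit would refine W here (identity placeholder)
    # Step 5: normalize excitation intensities and fit consequents by least squares
    W_bar = W / W.sum(axis=1, keepdims=True)
    x1 = np.hstack([np.ones((m, 1)), x])
    F = (W_bar[:, :, None] * x1[:, None, :]).reshape(m, -1)
    theta, *_ = np.linalg.lstsq(F, y, rcond=None)
    # Steps 6-7: final output and MSE loss
    y_hat = F @ theta
    return y_hat, np.mean((y - y_hat) ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(40, 2))
y = 1.0 + 2.0 * x[:, 0] - 3.0 * x[:, 1]
mf = [[(-0.5, 0.5), (0.5, 0.5)], [(-0.5, 0.5), (0.5, 0.5)]]  # (mu, sigma) per fuzzy variable
y_hat, loss = toy_hybrid_pass(x, y, mf)
```

Because the target here is linear and the rule consequents are linear, the LSE step fits it essentially exactly; in the full model, step 8 would follow with a gradient-descent update of the front-part and hidden parameters.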
The medical-data-oriented deep convolution fuzzy neural network and its training method add the concepts of deep neural networks on top of a fuzzy inference system, achieving both interpretability and the ability to process high-dimensional data. Specifically, the fuzzification and defuzzification parts of the fuzzy inference system are represented by a network architecture, and a deep convolution calculation functional module is inserted between fuzzification and defuzzification; the high-dimensional data mapping and processing capability of the deep convolution network in this module improves the data processing capability of the whole model system. Although the hidden weights of the deep convolution part still cannot be well explained, owing to the topological limitations of neural networks, the fuzzification and defuzzification parts can extract a richer rule base and richer membership information, so the system's ability to process data is improved without losing its interpretability.
Meanwhile, by proposing a grid-partition-based fuzzy membership parameter initialization algorithm, the parameters of the membership functions can be initialized in the model adaptively to the data distribution; this avoids the situation in which the parameter initialization of the fuzzy inference system part greatly influences the effect and final capability of the model, and greatly improves training efficiency.
The model combines the fuzzy inference network with the deep convolution calculation unit to process patient medical data in a targeted manner. By performing high-dimensional mapping calculations on the fuzzy rule weights, the convolution unit in the model can further mine the associated information and latent feature representations in the patient physiological information contained in the medical data, which makes the model better suited to inference and analysis tasks on patient medical data.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims (9)

1. A deep convolution fuzzy neural network for medical data, characterized by comprising a medical data interpretability prediction model (IP-DCFNN) based on a deep convolutional fuzzy neural network;
the IP-DCFNN consists of three parts:
fuzzy logic front part: the fuzzy logic front part extracts input data, and the input data is converted from a numerical value into a group of membership values for fuzzy language scalar through the operation of membership functions in the fuzzy logic front part;
a depth convolution calculation section: the depth convolution part extracts the hidden features in the input rule weights and converts the hidden layer weights into a high-dimensional information representation;
the fuzzy result representation part: the fuzzy result representation part is a process for processing defuzzification in fuzzy inference.
2. The medical data oriented deep convolutional fuzzy neural network of claim 1, wherein:
the membership function
Figure FDA0004190591250000011
For mapping numerical input data into 0 and 1 intervals;
the input data x i Is [ x ] 1 ,...,x n ]Within the interval, the membership function
Figure FDA0004190591250000012
Will input data x i Mapping to fuzzy sets low, high]Wherein k=1,..k and represents the selection of fuzzy linguistic variables, the parameters of the membership functions are μ and σ;
the fuzzy logic front part adopts the Takagi-Sugeno fuzzy model to form fuzzy rules of the form

$$\text{Rule}_j:\ \text{IF}\ x_1\ \text{is}\ A_1^j\ \text{AND}\ x_2\ \text{is}\ A_2^j\ \ldots\ \text{AND}\ x_n\ \text{is}\ A_n^j\ \text{THEN}\ f_j = \theta_{j0} + \theta_{j1}x_1 + \cdots + \theta_{jn}x_n;$$
the fuzzy variables generated by the different dimensions of the input data are arranged and combined to construct rule front parts; the initial excitation intensity of a rule is obtained by accumulating the membership degrees corresponding to each fuzzy variable in its front part, and represents the functional weight of the rule;
the initial excitation intensity of the j-th rule is calculated from the membership degrees of all fuzzy variables in the rule front part by the formula

$$W_j = \prod_{i=1}^{n} \mu_{A_i^j}(x_i)$$

i.e., the weight of the fuzzy rule is obtained by performing an AND fuzzy logic operation over its fuzzy variables.
3. The medical data oriented deep convolutional fuzzy neural network of claim 2, wherein:
the depth convolution part takes the initial excitation intensity generated by the fuzzy logic front part as initial input and outputs final excitation intensity, wherein the final excitation intensity is the same as the initial excitation intensity in dimension;
the depth convolution calculation part mainly comprises a convolution layer and a full connection layer;
the convolution layer comprises a one-dimensional convolution layer, a ReLU activation layer and a maximum pooling layer;
the full connection layer comprises a Linear layer and a ReLU activation layer;
the initial input is extended to a high dimension by the convolution process through one-dimensional convolution kernels and is flattened back to the initial dimension in the fully connected layers; the one-dimensional convolution is computed as

$$x_i^{l} = \sigma\!\left(\sum_{k} w_k^{l}\, x_{i+k}^{l-1} + b^{l}\right)$$

where $x_i^l$ represents the i-th data element in the l-th convolution layer, $w^l$ represents the weights of the convolution kernel of the l-th layer, and $b^l$ represents the bias of that layer;
the network layers connected in series after the convolution layers are fully connected; each node of layer l is connected to all nodes of layer l-1, the parameter w represents the connection weights and b represents the node biases, and the feed-forward propagation formula of the neural network is

$$O^{l} = \sigma\!\left(w^{l} O^{l-1} + b^{l}\right)$$

in the network layers in series after the convolution layers, the activation function is defined as σ(x) = 1/(1+e^{-x}); one-dimensional convolution is used to process the input excitation intensities.
4. The medical data oriented deep convolutional fuzzy neural network of claim 3, wherein:
the fuzzy result representation part takes the final excitation intensities $[W'_1, W'_2, \ldots, W'_N]$ output by the deep convolution calculation part, calculates normalization weights with a normalization algorithm, and shows the activation level of each rule;
the formula of the normalization algorithm is

$$\bar{W}_j = \frac{W'_j}{\sum_{k=1}^{N} W'_k}$$

where $\bar{W}_j$ represents the normalized excitation intensity of rule j and represents the contribution of that rule to the total weight;
the back part of the fuzzy rule is a linear combination of the input data, expressed as

$$f_j = \theta_{j0} + \theta_{j1}x_1 + \cdots + \theta_{jn}x_n$$

where $\theta_j$ are the back-piece parameters of the j-th rule;
the total output Z is defined as the weighted sum of all rule outputs, with the specific formula

$$Z = \sum_{j=1}^{N} \bar{W}_j f_j$$

wherein the weight of each rule is the normalized excitation intensity of that rule;
for the back-piece parameters $\theta_j$, a least squares estimator (LSE) is used to obtain the optimized result: let $f_j = \bar{W}_j\,[1, x_1, \ldots, x_n]$, $\theta = [\theta_1, \theta_2, \ldots, \theta_N]^T$, and $F = [f_1, f_2, \ldots, f_N]$; when the estimation error is set to e, the correction formula of Z is obtained as Z = Fθ + e, and the mean square error function of the result is

$$E = \frac{1}{m}\sum_{t=1}^{m}\left(y_t - Z(\vec{x}_t)\right)^2$$

where m represents the length of the dataset, $y_t$ represents the actual label of the t-th data sample, and $\vec{x}_t$ represents the t-th input data from the dataset;
the optimization objective is to minimize the squared error; when Z = Fθ + e is substituted, the data can be fitted with the LSE and the back-piece parameters obtained.
5. The medical data oriented deep convolutional fuzzy neural network of claim 4, wherein:
the fuzzy result representation part outputs the prediction result $\hat{y}$; the loss value is calculated with the mean square error (MSE) and the parameters of the IP-DCFNN are updated with gradient descent via back propagation, the loss function being

$$L = \frac{1}{m}\sum_{t=1}^{m}\left(y_t - \hat{y}_t\right)^2$$
6. A training method of a deep convolution fuzzy neural network for medical data is characterized by comprising the following steps:
the training method of the deep convolution fuzzy neural network for medical data adopts a hybrid learning method, in which specific parameters are learned and updated by different algorithms during the feed-forward propagation and back propagation processes.
7. The training method of the medical data-oriented deep convolution fuzzy neural network of claim 6, wherein the hybrid learning method is as follows:
the front-part parameters are located in the fuzzy logic front part; their initialization algorithm is the grid-partition initialization algorithm, their update time is back propagation, and their update algorithm is the gradient descent algorithm;
the back-piece parameters are located in the fuzzy result representation part; their initialization algorithm is the zero initialization algorithm, their update time is feed-forward propagation, and their update algorithm is the least squares estimation algorithm;
the node weights are located in the deep convolution calculation part; their initialization algorithm is the He-uniform algorithm, their update time is back propagation, and their update algorithm is the gradient descent algorithm;
the node biases are located in the deep convolution calculation part; their initialization algorithm is the He-uniform algorithm, their update time is back propagation, and their update algorithm is the gradient descent algorithm.
8. The training method of the medical data-oriented deep convolution fuzzy neural network of claim 7, wherein the training method comprises the following steps:
the front-piece parameters of the fuzzy logic front part are the parameters {μ, σ} of each membership function, where μ represents the fuzzy center and σ represents the fuzzy width;
the initialization front part parameter adopts an initialization algorithm (Grid Partition-based Initialization Algorithm) based on Grid division;
the Grid Partition (Grid Partition) is a method for dividing a data space, and divides an input data space into Grid subspaces parallel to axes according to membership functions of each feature;
initializing parameters of each membership function according to a predefined partition grid divided by a grid based on a grid partition initialization algorithm, initializing a fuzzy center mu as a median of data in the grid for a data sample falling in the grid, initializing a fuzzy width sigma as a linear approximation of a data scale in the grid, and obtaining coefficients from linear fitting;
the parameters of the deep convolution calculation part are the weights and biases of the neurons in the convolution layers and fully connected layers; these parameters are initialized with the He-uniform algorithm, with the following specific steps:
A. sample from the uniform distribution [-limit, limit];
B. limit is calculated as $\text{limit} = \sqrt{6/n_l}$, where $n_l$ represents the number of input neurons of layer l;
the back-piece parameters of the fuzzy result representation part are the parameters in the output linear expression of each rule, specifically the $\theta_j$ shown in the result part of the rule

$$\text{Rule}_j:\ \text{IF}\ x_1\ \text{is}\ A_1^j\ \text{AND}\ x_2\ \text{is}\ A_2^j\ \ldots\ \text{AND}\ x_n\ \text{is}\ A_n^j\ \text{THEN}\ f_j = \theta_{j0} + \theta_{j1}x_1 + \cdots + \theta_{jn}x_n;$$
a simple zero initialization is applied to all the back-piece parameters before training the model (i.e. all the back-piece parameters in the model are assigned a value of 0 at the beginning of training).
9. The training method of the medical data-oriented deep convolution fuzzy neural network of claim 8, wherein the training method comprises the following steps:
the IP-DCFNN model is trained by adopting a hybrid learning method;
the model updates the back-piece parameters through a least squares estimation algorithm (LSE) in the feed-forward process, and updates the front-piece parameters and the hidden layer parameters through a gradient descent algorithm (GD) in the back-propagation process;
in back propagation, the gradients are calculated by the chain rule as

$$\frac{\partial C}{\partial O_i^{l}} = \sum_{n=1}^{\#(n)} \frac{\partial C}{\partial O_n^{l+1}} \cdot \frac{\partial O_n^{l+1}}{\partial O_i^{l}}$$

wherein C represents the loss function, $O_i^l$ represents the output of node i in layer l, and #(n) represents the number of nodes in layer l+1 affected by the nodes in layer l;
and reversely updating the weights of the nodes in the network layer by layer through the calculated gradient values.
CN202310431951.3A 2023-04-21 2023-04-21 Medical data-oriented deep convolution fuzzy neural network and training method thereof Pending CN116384450A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310431951.3A CN116384450A (en) 2023-04-21 2023-04-21 Medical data-oriented deep convolution fuzzy neural network and training method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310431951.3A CN116384450A (en) 2023-04-21 2023-04-21 Medical data-oriented deep convolution fuzzy neural network and training method thereof

Publications (1)

Publication Number Publication Date
CN116384450A true CN116384450A (en) 2023-07-04

Family

ID=86975046

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310431951.3A Pending CN116384450A (en) 2023-04-21 2023-04-21 Medical data-oriented deep convolution fuzzy neural network and training method thereof

Country Status (1)

Country Link
CN (1) CN116384450A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117272233A (en) * 2023-11-21 2023-12-22 中国汽车技术研究中心有限公司 Diesel engine emission prediction method, apparatus, and storage medium
CN117272233B (en) * 2023-11-21 2024-05-31 中国汽车技术研究中心有限公司 Diesel engine emission prediction method, apparatus, and storage medium


Similar Documents

Publication Publication Date Title
CN106600059B (en) Intelligent power grid short-term load prediction method based on improved RBF neural network
Ebadzadeh et al. IC-FNN: a novel fuzzy neural network with interpretable, intuitive, and correlated-contours fuzzy rules for function approximation
Ivakhnenko et al. The review of problems solvable by algorithms of the group method of data handling (GMDH)
CN110647980A (en) Time sequence prediction method based on GRU neural network
Ma et al. Particle-swarm optimization of ensemble neural networks with negative correlation learning for forecasting short-term wind speed of wind farms in western China
CN114399032B (en) Method and system for predicting metering error of electric energy meter
CN110472280B (en) Power amplifier behavior modeling method based on generation of antagonistic neural network
Han et al. An improved fuzzy neural network based on T–S model
CN113159389A (en) Financial time sequence prediction method based on deep forest generation countermeasure network
CN110874374A (en) On-line time sequence prediction method and system based on granularity intuition fuzzy cognitive map
CN114565021A (en) Financial asset pricing method, system and storage medium based on quantum circulation neural network
CN113836823A (en) Load combination prediction method based on load decomposition and optimized bidirectional long-short term memory network
Hayashi et al. Fuzzy neural expert system with automated extraction of fuzzy if-then rules from a trained neural network
Sa’ad et al. A structural evolving approach for fuzzy systems
Feng et al. Performance analysis of fuzzy BLS using different cluster methods for classification
CN113887717A (en) Method for predicting neural network training duration based on deep learning
CN113298131A (en) Attention mechanism-based time sequence data missing value interpolation method
Li et al. Bayesian robust multi-extreme learning machine
CN113128666A (en) Mo-S-LSTMs model-based time series multi-step prediction method
CN116303786B (en) Block chain financial big data management system based on multidimensional data fusion algorithm
Springer et al. Robust parameter estimation of chaotic systems
Espinós Longa et al. Swarm Intelligence in Cooperative Environments: Introducing the N-Step Dynamic Tree Search Algorithm
CN116384450A (en) Medical data-oriented deep convolution fuzzy neural network and training method thereof
CN111524348A (en) Long-short term traffic flow prediction model and method
CN115081323A (en) Method for solving multi-objective constrained optimization problem and storage medium thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination