CN112201340B - Electrocardiogram disease determination method based on Bayesian network filtering - Google Patents

Publication number
CN112201340B
Authority
CN
China
Prior art keywords
disorder
candidate
disease
anchor
bayesian network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010678145.2A
Other languages
Chinese (zh)
Other versions
CN112201340A (en
Inventor
韩京宇
孙广鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202010678145.2A priority Critical patent/CN112201340B/en
Publication of CN112201340A publication Critical patent/CN112201340A/en
Application granted granted Critical
Publication of CN112201340B publication Critical patent/CN112201340B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155: Bayesian classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/29: Graphical models, e.g. Bayesian networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses an electrocardiogram disease determination method based on Bayesian network filtering, and belongs to the field of electrocardiogram disease diagnosis. On the basis of a trained base classifier, the method adopts a two-layer structure to determine a final disease label: the first layer constructs a voter to screen the results of the base classifier, and generates an anchor disease set and a candidate disease set; and in the second layer, a Bayesian network is constructed by adopting a hill climbing method based on BDe scores, and the Bayesian network filters the anchor disease set and the candidate disease set to determine a final prediction disease set. The method is characterized in that (1) the dependency relationship among disease labels is fully utilized, and the generalization capability of the model is improved; (2) the prediction result of the base classifier can be corrected through two layers of filtering processing, and the accuracy of model prediction is improved; (3) because the causal relationship used for constructing the Bayesian network is a strong correlation, the model has the characteristic of stability and does not show too large difference due to different data distribution.

Description

Electrocardiogram symptom determining method based on Bayesian network filtering
Technical Field
The invention belongs to the technical field of intelligent diagnosis of electrocardiogram disorders based on machine learning, and relates to electrocardiogram disorder determination, in particular to a multi-label disorder determination method based on machine learning.
Background
Multi-label classification refers to assigning labels to a given sample, which may correspond to one or more labels in a label set. Define a feature space $X = \mathbb{R}^d$, where $d$ is the feature dimension, and let $L = \{L_1, L_2, \ldots, L_n\}$ denote a label space with $n$ labels. A training set $D = \{(x_i, L_j) \mid 1 \le i \le q,\ 1 \le j \le n\}$ is constructed, where $q$ is the size of the training set, $i$ is the sample index, $x_i \in X$ is a $d$-dimensional feature vector, and $L_j \in L$ is a label element of $L$. The task of multi-label learning is to learn a multi-label classifier $h(\cdot)$ from the training set $D$ and to use it to predict a new sample $x$; the prediction result $h(x) \subseteq L$ is the set of class labels for sample $x$.
The solutions for multi-label classification are mainly divided into two types at present: one is a strategy based on problem transformation and one is a strategy based on algorithm adaptation. The strategy of problem conversion is to convert the multi-label problem into a plurality of single-label two-classification submodels and then combine the results of the submodels to obtain the final result. And the strategy based on algorithm adaptation is to adjust the popular learning algorithm to adapt to multi-label learning.
Problem transformation strategies include Binary Relevance, Classifier Chains, Label Powerset and others. Binary Relevance is the simplest: its core idea is to decompose the multi-label classification problem into several binary classification problems. It is simple to implement and easy to understand, and the trained model performs well when the labels do not depend on one another; if the labels depend directly on one another, however, the resulting model generalizes poorly and does not achieve the expected effect. The core idea of Classifier Chains is to convert the multi-label classification problem into a chain of binary classifiers, in which each later classifier is built on the prediction results of the preceding ones. During model construction the labels are first shuffled into an order, and the model for each label is then built in that order from head to tail. Classifier Chains are relatively simple to implement and take the relations between labels into account, which strengthens generalization to some extent, but the result depends on the ordering, and a suitable label dependency is hard to find. Label Powerset converts multi-label classification into a multi-class problem: the label set of each sample instance is treated as a single class, and a multi-class classifier is built.
This method considers the combinations of labels but not the dependencies among them, and the number of classes may grow with the number of labels, which makes the model more complex and reduces its generalization ability to some extent.
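To make the two simplest transformations concrete, the following is a minimal sketch of how Binary Relevance and Label Powerset reshape the training targets. The function names and the toy label sets are illustrative assumptions, not part of the patent.

```python
def binary_relevance_targets(label_sets, labels):
    """Binary Relevance: one independent 0/1 target vector per label."""
    return {lab: [1 if lab in s else 0 for s in label_sets] for lab in labels}


def label_powerset_targets(label_sets):
    """Label Powerset: every distinct label set becomes one class id."""
    class_ids = {}
    targets = []
    for s in label_sets:
        key = frozenset(s)
        if key not in class_ids:
            class_ids[key] = len(class_ids)  # next unused class id
        targets.append(class_ids[key])
    return targets, class_ids


# Toy label sets for four samples (hypothetical data).
samples = [{"L1"}, {"L1", "L2"}, {"L2"}, {"L1", "L2"}]
br = binary_relevance_targets(samples, ["L1", "L2"])
lp, classes = label_powerset_targets(samples)
```

With two labels the powerset already produces three classes here, which illustrates why the class count can grow quickly as labels are added.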
The methods adopting the algorithm adaptation strategy mainly include ML-kNN and ML-DT. ML-kNN is an adaptation of the kNN algorithm: for each sample instance it finds the k nearest instances and uses the feature information of those instances to determine the predicted label set of the instance. ML-kNN can identify a different neighborhood for each sample and predict from the neighborhood information, so its accuracy is high, but it is not sensitive to abnormal points. The basic idea of ML-DT is to process multi-label data with decision tree techniques, recursively constructing a decision tree using an information gain criterion based on multi-label entropy. A decision tree model can be derived efficiently from multi-label data in this way, but the labels are assumed to be independent when the information entropy is calculated.
Both the algorithm adaptation strategy and the problem transformation strategy largely ignore the dependencies among labels and do not use the relations between labels to construct the model. Electrocardiogram disorders, however, are correlated with one another, so these methods cannot make good use of the electrocardiogram to determine disorders, and their prediction accuracy is poor.
Causal relationships are important patterns of data mining and can reveal dependencies between tags. Causal relationships explain the cause of an event occurrence and what the event occurrence will cause, emphasizing the strong correlation between variables. Causal relationships among data can be mined through corresponding algorithms, common constraint-based mining algorithms include SGS algorithms, PC algorithms and variants of various PC algorithms, and score-based mining algorithms are mainly search algorithms based on Bayesian Dirichlet likelihood equivalence scores. The mining of causal relationships, while implemented by a number of algorithms, has been rarely used for electrocardiographic disorder determination.
The invention combines the work of the two aspects and utilizes the causal relationship among symptoms to provide the electrocardiogram symptom determining method based on Bayesian network filtration.
Disclosure of Invention
Aiming at the problems, the invention provides an electrocardiogram disease determination method which adopts a Bayesian network for filtering treatment and well realizes the intelligent diagnosis of electrocardiogram diseases.
The technical scheme of the invention is as follows: an electrocardiogram disease determination method based on Bayesian network filtering comprises the following specific operation steps:
Step (1.1): predict the possible disorder labels of the instance ob using several base classifiers;
Step (1.2): construct a voter;
Step (1.3): pass the prediction results obtained in step (1.1) to the voter constructed in step (1.2) for screening; after screening, the voter yields an anchor disorder set AS(ob) and a candidate disorder set CS(ob);
Step (1.4): combine the anchor disorder set AS(ob) with every subset of the candidate disorder set CS(ob) to obtain the anchor disorder support set ASP(ob), each element of which is the union of AS(ob) and a subset of CS(ob) and is denoted as an anchor disorder extension $SL_i(ob)$;
Step (1.5): construct a Bayesian network using a hill-climbing search algorithm based on the Bayesian Dirichlet likelihood-equivalence (BDe) score;
Step (1.6): use the Bayesian network to calculate the joint probabilities of the anchor disorder set AS(ob) and of each anchor disorder extension $SL_i(ob)$, denoted $P(AS(ob))$ and $P(SL_i(ob))$, respectively.
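Step (1.4) can be sketched as a power-set enumeration over the candidate set. This is a minimal illustration only; the function name `anchor_support_set` is a hypothetical helper, and the empty subset is assumed to be included so that AS(ob) on its own is one of the extensions.

```python
from itertools import chain, combinations


def anchor_support_set(anchor, candidate):
    """Return every union of the anchor set with one subset of the candidate set."""
    cand = sorted(candidate)
    # All subsets of the candidate set, from the empty subset up to the full set.
    subsets = chain.from_iterable(
        combinations(cand, r) for r in range(len(cand) + 1)
    )
    return [frozenset(anchor) | frozenset(sub) for sub in subsets]


# Three candidates give 2**3 = 8 anchor disorder extensions.
asp = anchor_support_set({"L1", "L2"}, {"L3", "L4", "L5"})
```

The number of extensions doubles with each candidate disorder, so the voter's screening directly bounds the cost of the later Bayesian filtering.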
Further, in step (1.2), the operation steps of constructing the voter are as follows:
(1.2.1) set the probability thresholds that allow a model to take part in the vote for the anchor disorder set AS(ob) and in the vote for the candidate disorder set CS(ob); preset the vote-count thresholds required to add a disorder to AS(ob) and to CS(ob);
(1.2.2) traverse the prediction results of all base classifier models for a disorder, each prediction result being the probability that the model predicts 1; when a prediction result is not smaller than the preset probability threshold, the model is eligible to vote, and the vote count of the corresponding set, AS(ob) or CS(ob), is increased by 1;
(1.2.3) if the votes obtained for AS(ob) satisfy the vote-count threshold, add the disorder to AS(ob); otherwise check the votes obtained for CS(ob) and, if that threshold is satisfied, add the disorder to CS(ob);
(1.2.4) repeat steps (1.2.1) to (1.2.3) until the attribution of every disorder is determined.
Further, in step (1.3), the anchor disorder set AS(ob) stores the confirmed disorders, and the candidate disorder set CS(ob) stores the disorders that still need to be confirmed.
Further, in step (1.5), the constructed bayesian network is constructed by using causal relationship mining, namely a hill climbing method based on bayesian dirichlet likelihood equivalence score.
Further, in step (1.6), the disorder set $SL_i(ob)$ that satisfies the following formula is the prediction for instance ob and is denoted tls(ob):
[Formula provided as an image in the original publication (Figure BDA0002584803180000031)]
The beneficial effects of the invention are: (1) the invention adopts a two-layer structure to determine the final result for a plurality of trained base classifiers: the first layer constructs a voter which screens the results of the base classifier, and the second layer constructs a Bayesian network which filters the results of the voter, so that the classification effect of the base classifier is enhanced, and the accuracy of determining the electrocardiogram symptoms is improved; (2) the Bayesian network is constructed by using a hill climbing method based on Bayesian Dirichlet likelihood equivalent scores, so that the dependence among symptoms is fully utilized, and the generalization performance of the model is improved; (3) the Bayesian network is constructed by mining the causal relationship, and the causal relationship reveals the dependency relationship with strong relevance, so that the model has the characteristic of stability and cannot show too large difference due to different data distribution.
Drawings
FIG. 1 is a flow chart of the structure of the present invention;
FIG. 2 is a schematic diagram of the structure of the voting machine according to the present invention;
FIG. 3 is a diagram of an exemplary voting architecture of the voter of the present invention;
FIG. 4 is a flow chart of the Bayesian network construction of the present invention;
FIG. 5 is a diagram of an exemplary Bayesian network architecture in accordance with the present invention;
fig. 6 is a schematic partial structure diagram of a bayesian network in an embodiment of the present invention.
Detailed Description
In order to more clearly illustrate the technical solution of the present invention, the following detailed description is made with reference to the accompanying drawings:
Specifically, as shown in FIG. 1, the method for determining electrocardiogram disorders based on Bayesian network filtering uses a two-layer structure to determine the final result from several trained base classifiers. The first layer constructs a voter V, which screens the results of the base classifiers to generate an anchor disorder set AS and a candidate disorder set CS. The second layer constructs a Bayesian network using a causal mining algorithm, namely hill climbing with the BDe score, and the Bayesian network filters the anchor disorder set AS and the candidate disorder set CS a second time to determine the final predicted disorder set tls(ob).
FIG. 2 is a schematic design diagram of the voter, which is implemented in the following steps:
Step 1: for a disorder Li, denote its base classifiers as BC1, BC2, …, BCm, and denote the prediction result of the j-th base classifier as P(BC_j), the probability that the j-th base classifier predicts 1;
Step 2: set six parameters AS_count, CS_count, AS_proba, CS_proba, t1 and t2, where AS_count and CS_count record the votes obtained for AS and CS and are initially 0, AS_proba and CS_proba are the probability thresholds that allow a model to vote for AS and CS, respectively, and t1 and t2 are the vote-count thresholds required to add the disorder Li to AS and CS, respectively;
Step 3: traverse the prediction results of all m base classifiers for the disorder Li; for base classifier BC_j, if P(BC_j) > AS_proba, then BC_j votes for AS and AS_count is increased by 1; otherwise check whether P(BC_j) > CS_proba holds, and if so BC_j votes for CS and CS_count is increased by 1; if neither holds, move on to the voting process of the next base classifier BC_{j+1};
Step 4: if AS_count ≥ t1, add the disorder Li to the set AS; otherwise check whether CS_count ≥ t2 holds, and if so add Li to the set CS; if not, discard Li.
As shown in FIG. 3, there are 6 disorders L1, …, L6, each with 5 base classifier models BC1, …, BC5, and the data in the table are the probabilities that the base classifier models predict 1. AS_proba and CS_proba, the thresholds that allow a model to vote for AS and CS, are set to 0.8 and 0.5, respectively; t1 and t2, the vote-count thresholds required to add a disorder Li to AS and CS, are set to 4 and 2, respectively. Among the 5 base classifiers of disorder L1, 4 give a prediction result not less than AS_proba, so AS_count = 4, which is not less than the threshold t1, and L1 is added to AS. Among the 5 base classifiers of disorder L2, none gives a prediction result not less than AS_proba, so the condition AS_count ≥ 4 is not satisfied and it is necessary to check whether CS_count ≥ 2 holds; 2 base classifiers give a prediction result greater than CS_proba, so the condition holds and L2 is added to CS. The attribution of L3, …, L6 is determined in the same way, giving AS = {L1, L4} and CS = {L2, L5}, while L3 and L6 are discarded.
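The voting rule can be sketched in a few lines. This is a minimal illustration under stated assumptions: the probability table of FIG. 3 is not reproduced in the text, so the values below are hypothetical and chosen only to mirror the L1 and L2 outcomes described above; the comparisons use "not less than" for both thresholds, which is one reading of the text.

```python
def vote(probas, as_proba=0.8, cs_proba=0.5, t1=4, t2=2):
    """Split disorders into an anchor set and a candidate set.

    probas maps each disorder to the list of P(BC_j) values produced by
    its base classifiers.
    """
    anchor, candidate = set(), set()
    for disorder, ps in probas.items():
        as_count = cs_count = 0
        for p in ps:
            if p >= as_proba:      # confident enough to vote for AS
                as_count += 1
            elif p >= cs_proba:    # otherwise it may still vote for CS
                cs_count += 1
        if as_count >= t1:
            anchor.add(disorder)
        elif cs_count >= t2:
            candidate.add(disorder)
        # disorders meeting neither threshold are discarded
    return anchor, candidate


# Hypothetical probabilities (not the values of FIG. 3).
table = {
    "L1": [0.90, 0.85, 0.95, 0.80, 0.30],  # 4 AS votes -> anchor
    "L2": [0.60, 0.55, 0.20, 0.10, 0.30],  # 0 AS votes, 2 CS votes -> candidate
    "L3": [0.10, 0.20, 0.30, 0.40, 0.60],  # 1 CS vote -> discarded
}
anchor_set, candidate_set = vote(table)
```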
FIG. 4 is a flow chart of the Bayesian network construction. The network is constructed mainly by a causal relationship mining method, hill climbing (HC) with the BDe score, in the following main steps:
Step 1: randomly generate an initial network G and define three search operators: an edge-addition operator A, an edge-deletion operator M and an edge-reversal operator R, that is, the three operations of adding an edge, removing an edge and reversing the direction of an edge in the network G;
Step 2: apply the search operators to the current network G to update the network and obtain a series of candidate networks G1, G2, …, Gm;
Step 3: score each of the candidate networks G1, G2, …, Gm with the BDe scoring function, denoting the score of candidate network Gi as S(Gi), and select the network with the highest score as the optimal candidate network structure, denoted G';
Step 4: if the score of G' is greater than that of G, i.e. S(G') > S(G), update the current network G to G' (i.e. let G = G') and return to step 2 to start the next round of search; otherwise do not update the current network, end the search and store the current structure G;
Step 5: obtain the conditional probability table (CPT) among the disorders by statistics;
Step 6: establish the Bayesian network from the network G stored in step 4 (a directed acyclic graph) and the conditional probability table CPT from step 5.
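Steps 1 to 4 can be sketched as a generic greedy search over edge sets. This is a minimal illustration under stated assumptions: the scoring function is passed in as a parameter, and the toy score used below is a stand-in, not the BDe score; the search also starts from an empty graph by default rather than a random one. All function names are hypothetical.

```python
from itertools import permutations


def is_acyclic(nodes, edges):
    """Repeatedly strip nodes with no incoming edge; a cycle leaves none to strip."""
    remaining, live = set(nodes), set(edges)
    while remaining:
        roots = {n for n in remaining if not any(v == n for _, v in live)}
        if not roots:
            return False
        remaining -= roots
        live = {(u, v) for u, v in live if u in remaining and v in remaining}
    return True


def neighbors(nodes, edges):
    """Apply the three operators (add, delete, reverse) to every ordered node pair."""
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            yield edges - {(u, v)}                 # edge deletion
            yield (edges - {(u, v)}) | {(v, u)}    # edge reversal
        elif (v, u) not in edges:
            yield edges | {(u, v)}                 # edge addition


def hill_climb(nodes, score, start=frozenset()):
    """Move to the best-scoring acyclic neighbor until no neighbor improves."""
    current = frozenset(start)
    best = score(current)
    while True:
        candidates = [
            frozenset(g) for g in neighbors(nodes, current) if is_acyclic(nodes, g)
        ]
        if not candidates:
            return current
        top = max(candidates, key=score)
        if score(top) <= best:
            return current
        current, best = top, score(top)


# Toy score (NOT BDe): reward edges of a known target graph, penalise the rest.
target = {("L1", "L2"), ("L2", "L3")}
learned = hill_climb(
    ["L1", "L2", "L3"], lambda g: len(g & target) - len(g - target)
)
```

Because the search accepts only strict improvements, it stops at a local optimum; a random initial network, as in step 1, is one common way to vary the starting point.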
FIG. 5 shows an example of a constructed Bayesian network with 5 disorders L1, …, L5; the solid arrows represent the dependencies among disorders, and the table connected to a node by a dotted line is the conditional probability table of that node. An example of filtering with this Bayesian network is as follows:
if AS = {L1, L2} and CS = {L3, L4, L5}, CS has 8 subsets, so 8 calculations are required, and the $SL_i(ob)$ corresponding to the maximum value is taken. Let $BCS_i$ = {L3, L4} be a subset of CS; then $SL_i(ob)$ = {L1, L2, L3, L4}, so
[Formula provided as an image in the original publication (Figure BDA0002584803180000061)]
where
P(L1_1, L2_1) = 0.5 × 0.4 = 0.2,
P(L1_1, L2_1, L3_1, L4_1, L5_0) = 0.5 × 0.4 × 0.1 × 0.6 × 0.2 = 0.0024. The results for all $SL_i(ob)$ can be calculated in the same way, and the $SL_i(ob)$ corresponding to the maximum value is taken as the final predicted disorder set tls(ob); that is, the required $SL_i(ob)$ satisfies the formula:
[Formula provided as an image in the original publication (Figure BDA0002584803180000062)]
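The filtering step can be sketched as follows. Several things here are assumptions, because the selection formula and the network of FIG. 5 are only available as images: the criterion is taken to be the ratio P(SL_i(ob)) / P(AS(ob)), candidates left out of a subset are fixed to state 0 (as the L5_0 term in the example suggests), and the three-node network with its CPT values is made up for illustration.

```python
from itertools import chain, combinations, product

# Hypothetical network L1 -> L2 -> L3 (not the network of FIG. 5).
PARENTS = {"L1": (), "L2": ("L1",), "L3": ("L2",)}
# CPT[node][parent_states] = P(node = 1 | parent_states); values are invented.
CPT = {
    "L1": {(): 0.5},
    "L2": {(0,): 0.1, (1,): 0.9},
    "L3": {(0,): 0.2, (1,): 0.7},
}


def joint(assign):
    """Chain-rule probability of a complete 0/1 assignment."""
    p = 1.0
    for node, parents in PARENTS.items():
        p1 = CPT[node][tuple(assign[q] for q in parents)]
        p *= p1 if assign[node] == 1 else 1.0 - p1
    return p


def prob(evidence):
    """Marginal probability of a partial assignment by brute-force enumeration."""
    free = [n for n in PARENTS if n not in evidence]
    return sum(
        joint({**evidence, **dict(zip(free, states))})
        for states in product((0, 1), repeat=len(free))
    )


def filter_labels(anchor, candidate):
    """Pick the anchor extension with the largest assumed ratio P(SL_i) / P(AS)."""
    p_anchor = prob({a: 1 for a in anchor})
    cand = sorted(candidate)
    subsets = chain.from_iterable(
        combinations(cand, r) for r in range(len(cand) + 1)
    )
    best, best_score = None, -1.0
    for sub in subsets:
        evidence = {a: 1 for a in anchor}
        evidence.update({c: 1 if c in sub else 0 for c in cand})
        s = prob(evidence) / p_anchor
        if s > best_score:
            best, best_score = set(anchor) | set(sub), s
    return best, best_score


tls, tls_score = filter_labels({"L1"}, {"L2", "L3"})
```

Brute-force enumeration is exponential in the number of unassigned nodes, so this form is only suitable for small networks such as the five-disorder example.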
Specific embodiment of the invention: the processing procedure of the method is described below, taking disorders such as first-degree atrioventricular block, incomplete right bundle branch block, incomplete left bundle branch block, left ventricular hypertrophy and sinus bradycardia as examples:
(1) In this embodiment, the practical feasibility of the invention is verified by comparing the results with filtering (Filter) and without filtering (NotFilter).
(2) Constructing a voter to preliminarily screen the results of the base classifiers:
first, the parameters required by the voter are set: AS_count and CS_count, which record the votes obtained for AS and CS, are initialized to 0; AS_proba and CS_proba, the thresholds that allow a model to vote for AS and CS, are set to AS_proba = 0.8 and CS_proba = 0.6; t1 and t2, the vote-count thresholds required to add a disorder Li to AS and CS, are set to t1 = 4 and t2 = 2;
then all test instances are voted on using the voting rules shown in FIG. 2 and FIG. 3; after screening by the voter, each test instance ob obtains its corresponding anchor label set AS(ob) and candidate label set CS(ob);
finally, AS(ob) is merged with every subset of CS(ob) to obtain the anchor disorder support set, which consists of the unions of AS(ob) with the subsets of CS(ob).
(3) Constructing a Bayesian network:
in this embodiment, a Bayesian network is constructed using the hill climbing (HC) method based on the Bayesian Dirichlet likelihood-equivalence score described with FIG. 4, and FIG. 6 is a partial structure diagram of the Bayesian network constructed in this embodiment. Where two disorder nodes in the graph are connected, the end with the arrow is the child node and the end without the arrow is the parent node, and the child node depends strongly on the parent node. The table connected to a node by a dotted line is the conditional probability table of that node, which gives the probability that the child node is present given the state of its parent node;
the Bayesian network is then used to determine the final set of predicted disorders with the calculation method illustrated in FIG. 5;
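A conditional probability table of the kind described here can be estimated by simple counting. The sketch below assumes plain relative frequencies with no smoothing, which is one possible reading of "by statistics"; the function name and the records are hypothetical.

```python
from collections import defaultdict


def estimate_cpt(records, node, parents):
    """Return {parent_states: P(node = 1 | parent_states)} from frequency counts."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for rec in records:
        key = tuple(rec[p] for p in parents)
        totals[key] += 1
        positives[key] += rec[node]  # adds 1 only when the node is present
    return {key: positives[key] / totals[key] for key in totals}


# Hypothetical 0/1 disorder records for six patients.
records = [
    {"L1": 1, "L2": 1}, {"L1": 1, "L2": 0}, {"L1": 1, "L2": 1},
    {"L1": 1, "L2": 1}, {"L1": 0, "L2": 0}, {"L1": 0, "L2": 1},
]
cpt_l2 = estimate_cpt(records, "L2", ("L1",))
```

Parent states that never occur in the data get no entry in this form, so a practical version would need a fallback or smoothing for unseen configurations.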
(4) Analysis of the embodiment results:
[Table provided as images in the original publication (Figures BDA0002584803180000071 and BDA0002584803180000081): partial results of this embodiment]
The table above shows partial results of this embodiment. The indexes used to measure model quality are precision, recall and fscore. The table shows that after filtering the recall of each disorder is greatly improved and the fscore is also clearly improved, which indicates that the two rounds of filtering strengthen the classification effect of the base classifiers, improve the accuracy of electrocardiogram disorder determination, make full use of the dependencies among disorders and improve the generalization performance of the model.
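The three indexes can be computed per disorder as sketched below. This assumes that fscore means the balanced F1 score, which the text does not state explicitly, and the true and predicted label sets are hypothetical.

```python
def label_metrics(true_sets, pred_sets, label):
    """Per-label precision, recall and F1 over paired true/predicted label sets."""
    pairs = list(zip(true_sets, pred_sets))
    tp = sum(1 for t, p in pairs if label in t and label in p)
    fp = sum(1 for t, p in pairs if label not in t and label in p)
    fn = sum(1 for t, p in pairs if label in t and label not in p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical ground truth and predictions for four test instances.
truth = [{"L1"}, {"L1", "L2"}, {"L2"}, {"L1"}]
preds = [{"L1"}, {"L2"}, {"L1", "L2"}, {"L1"}]
precision_l1, recall_l1, f1_l1 = label_metrics(truth, preds, "L1")
```

Running the same function on predictions made with and without the Bayesian filter would give the kind of Filter versus NotFilter comparison described in this embodiment.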
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of embodiments of the present invention; other variations are possible within the scope of the invention; thus, by way of example, and not limitation, alternative configurations of embodiments of the invention may be considered consistent with the teachings of the present invention; accordingly, the embodiments of the invention are not limited to the embodiments explicitly described and depicted.

Claims (4)

1. A method for determining an electrocardiogram disease based on Bayesian network filtering is characterized by comprising the following specific operation steps:
step (1.1): predicting a probable disorder label for the instance ob using a number of basis classifiers;
step (1.2): constructing a voting machine;
step (1.3): transmitting the prediction result obtained in the step (1.1) into the voter constructed in the step (1.2) for screening, and obtaining an anchor disease set AS (ob) and a candidate disease set CS (ob) by the voter after screening;
step (1.4): combining the anchor disorder set AS(ob) with every subset of the candidate disorder set CS(ob) to obtain the anchor disorder support set ASP(ob), each element of which is the union of AS(ob) and a subset of CS(ob), denoted as an anchor disorder extension $SL_i(ob)$;
Step (1.5): constructing a Bayesian network by using a hill-climbing search algorithm based on Bayesian Dirichlet likelihood equivalence scores;
the Bayesian network construction flow chart is mainly constructed by a causal relationship mining method based on a BDe scoring hill climbing method, and the method mainly comprises the following steps:
step 1: randomly generating an initial network G, and defining three search operators: an edge adding operator A, an edge subtracting operator M and an edge turning operator R, namely three operations of adding edges, cutting edges and converting the directions of the edges are defined for the network G;
step 2: carrying out search operator operation on the current network G, updating the network, and acquiring a series of candidate networks G1, G2, … and Gm;
and step 3: respectively scoring the candidate networks G1, G2, … and Gm by using an BDe scoring function, marking the scores as S (Gi), representing the scores of the candidate networks Gi, and selecting the network with the highest score as an optimal candidate network structure, which is marked as G';
and 4, step 4: if the score of G ' is larger than that of G, namely S (G ') > S (G), updating the current network G to G ', returning to the step 2 and starting the next round of search; otherwise, the current network is not updated, the search is finished, and the current structure G is stored;
and 5: obtaining a conditional probability table CPT among symptoms in a statistical mode;
step 6: establishing a Bayesian network by using the storage network G in the step 4 and the conditional probability table CPT in the step 5;
step (1.6): respectively calculating anchor point disorder set AS (ob) and anchor point disorder extension SL by utilizing Bayesian network i The joint probabilities of (ob) are denoted as P (AS (ob)) and P (SL) i (ob))。
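The structure search of step (1.5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes binary disorder variables, starts from an empty network by default rather than a randomly generated one, and uses the BDeu variant of the BDe score with an assumed equivalent sample size of 1. The names `local_bdeu`, `neighbours` and `hill_climb` are hypothetical, and the estimation of the conditional probability table in step 5 is not shown.

```python
import math
from collections import Counter
from itertools import permutations


def local_bdeu(data, child, parents, ess=1.0, r=2):
    """BDeu local score of `child` given `parents` (assumes r-state variables)."""
    parents = sorted(parents)
    q = r ** len(parents)  # number of parent configurations
    n_j = Counter(tuple(row[p] for p in parents) for row in data)
    n_jk = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    score = 0.0
    for n in n_j.values():
        score += math.lgamma(ess / q) - math.lgamma(ess / q + n)
    for n in n_jk.values():
        score += math.lgamma(ess / (q * r) + n) - math.lgamma(ess / (q * r))
    return score


def total_score(data, graph):
    """S(G): sum of local scores; `graph` maps each node to its parent set."""
    return sum(local_bdeu(data, v, ps) for v, ps in graph.items())


def is_acyclic(graph):
    """Repeatedly strip parentless nodes; a cycle leaves nodes that never qualify."""
    remaining = {v: set(ps) for v, ps in graph.items()}
    while remaining:
        roots = [v for v, ps in remaining.items() if not ps]
        if not roots:
            return False
        for v in roots:
            del remaining[v]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True


def neighbours(graph):
    """Candidate networks G1..Gm reachable by one operator A, M or R."""
    for u, v in permutations(graph, 2):
        cand = {n: set(ps) for n, ps in graph.items()}
        if u in graph[v]:                 # edge u -> v exists
            cand[v].discard(u)
            yield cand                    # operator M: delete u -> v
            rev = {n: set(ps) for n, ps in cand.items()}
            rev[u].add(v)                 # operator R: reverse to v -> u
            if is_acyclic(rev):
                yield rev
        elif v not in graph[u]:           # no edge between u and v
            cand[v].add(u)                # operator A: add u -> v
            if is_acyclic(cand):
                yield cand


def hill_climb(data, n_vars, initial=None):
    """Steps 1 to 4: move to the best-scoring neighbour until none improves S(G)."""
    graph = initial or {v: set() for v in range(n_vars)}
    current = total_score(data, graph)
    while True:
        scored = [(total_score(data, g), g) for g in neighbours(graph)]
        if not scored:
            return graph
        best_score, best = max(scored, key=lambda t: t[0])
        if best_score <= current:         # S(G') > S(G) fails: stop and keep G
            return graph
        graph, current = best, best_score


# Toy data (made up): variable 1 copies variable 0, variable 2 is independent.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 10
learned = hill_climb(rows, n_vars=3)
```

On this toy data the search links variables 0 and 1 and leaves variable 2 without parents or children. The edge direction between 0 and 1 is arbitrary because both orientations receive the same score.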
2. The method for determining an electrocardiogram disease based on Bayesian network filtering according to claim 1, wherein in step (1.2), the voter is constructed by the following steps:
(1.2.1) setting a probability threshold that allows a model to participate in voting for the anchor disorder set AS(ob) and the candidate disorder set CS(ob); presetting the vote thresholds required for adding a disorder to the anchor disorder set AS(ob) and to the candidate disorder set CS(ob);
(1.2.2) traversing the prediction results of all base classifier models corresponding to a disorder, each prediction result being the probability that the model predicts label 1; when a prediction result is not smaller than the preset probability threshold, the model is qualified to vote, and the vote count of the corresponding anchor disorder set AS(ob) or candidate disorder set CS(ob) is increased by 1;
(1.2.3) if the votes obtained for the anchor disorder set AS(ob) meet the vote threshold, adding the disorder to the anchor disorder set AS(ob); otherwise, checking the votes obtained for the candidate disorder set CS(ob), and if the vote threshold is met, adding the disorder to the candidate disorder set CS(ob);
(1.2.4) repeating steps (1.2.1) to (1.2.3) until the attribution of all disorders is determined.
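The voting procedure of steps (1.2.1) to (1.2.4) can be sketched as follows. This is one possible reading only: it assumes separate probability thresholds for anchor and candidate voting, and the function name `vote`, the threshold values and the disorder names are all hypothetical.

```python
def vote(probs, p_anchor=0.8, p_cand=0.5, v_anchor=2, v_cand=1):
    """Split disorders into an anchor set AS(ob) and a candidate set CS(ob).

    probs maps each disorder to the list of P(label = 1) values produced by
    its base classifiers. All threshold defaults are assumed values.
    """
    anchor, candidate = set(), set()
    for disorder, model_probs in probs.items():
        # A model votes only if its probability reaches the threshold.
        anchor_votes = sum(p >= p_anchor for p in model_probs)
        cand_votes = sum(p >= p_cand for p in model_probs)
        if anchor_votes >= v_anchor:
            anchor.add(disorder)
        elif cand_votes >= v_cand:
            candidate.add(disorder)
    return anchor, candidate


# Made-up predictions from three base classifiers per disorder.
predictions = {
    "AF": [0.90, 0.85, 0.40],
    "PVC": [0.60, 0.30, 0.20],
    "LBBB": [0.10, 0.20, 0.30],
}
AS_ob, CS_ob = vote(predictions)
```

With these made-up numbers, "AF" collects two anchor votes and enters AS(ob), "PVC" collects one candidate vote and enters CS(ob), and "LBBB" is assigned to neither set.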
3. The method according to claim 1, wherein in step (1.3), the anchor disorder set AS(ob) stores determined disorders, and the candidate disorder set CS(ob) stores disorders requiring confirmation.
4. The method for determining an electrocardiogram disease based on Bayesian network filtering according to claim 1, wherein in step (1.6), the disorder set SL_i(ob) satisfying the following formula is the prediction result of the instance ob, denoted as TLS(ob):
Figure FDA0003730579620000021
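The anchor disorder extensions among which claim 4 selects are the elements of the anchor disorder support set of step (1.4). A minimal sketch of how they could be enumerated is given below. The function name `anchor_support_set` and the disorder names are hypothetical, and the selection formula of claim 4 is not implemented because it is not reproduced in the text.

```python
from itertools import chain, combinations


def anchor_support_set(anchor, candidate):
    """Return every extension: the anchor set united with one candidate subset."""
    cand = sorted(candidate)
    subsets = chain.from_iterable(
        combinations(cand, k) for k in range(len(cand) + 1)
    )
    return [frozenset(anchor) | frozenset(s) for s in subsets]


# Made-up anchor and candidate sets.
extensions = anchor_support_set({"AF"}, {"PVC", "LBBB"})
```

A candidate set with n disorders yields 2^n extensions, so the enumeration is only practical when the voter keeps the candidate set small.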

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010678145.2A CN112201340B (en) 2020-07-15 2020-07-15 Electrocardiogram disease determination method based on Bayesian network filtering

Publications (2)

Publication Number Publication Date
CN112201340A (en) 2021-01-08
CN112201340B (en) 2022-08-26

Family

ID=74005476

Country Status (1)

Country Link
CN (1) CN112201340B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant