CN1760881A - Modeling method of forecast in device of computer aided diagnosis through using not diagnosed cases - Google Patents

Info

Publication number
CN1760881A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2005100954203A
Other languages
Chinese (zh)
Inventor
周志华
黎铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2005-11-14
Filing date
2005-11-14
Publication date
2006-04-19
Application filed by Nanjing University
Priority to CNA2005100954203A
Publication of CN1760881A

Abstract

The method produces a prediction through the following steps: (1) if the prediction model has not yet been trained, execute step (2); otherwise go to step (6); (2) generate a labeled training set and an unlabeled training set from the diagnosed and undiagnosed cases; (3) train a random forest on the labeled training set; (4) use co-training to improve the accuracy of each individual tree in the random forest with the help of the unlabeled data; (5) combine the trees by majority voting to produce the final prediction model; (6) use the prediction model to predict and output the result. Advantage: by exploiting undiagnosed cases, the predictive modeling method improves on a model built from diagnosed cases alone and thereby helps raise the performance of the computer-aided medical diagnosis device.

Description

Predictive modeling method using undiagnosed cases in a computer-aided diagnosis device
I. Technical field
The present invention relates to computer-aided medical diagnosis devices, and in particular to a method that combines co-training and ensemble learning to build a prediction model from both diagnosed and undiagnosed cases.
II. Background
With the development of computers, the computer-aided medical diagnosis device has become an important auxiliary diagnostic tool because it is not affected by factors such as fatigue or mood. Such a device normally uses a predictive modeling method to analyze cases and build a prediction model, and then uses that model to diagnose new cases. The result is passed to a medical expert for further analysis and a confirmed diagnosis, which reduces the expert's workload to some extent. The predictive modeling method is therefore the key component of a computer-aided medical diagnosis device.
In computer-aided medical diagnosis, a model with high prediction accuracy can usually be obtained only by analyzing a large number of cases that medical experts have already diagnosed. As health care becomes more widely available, large numbers of cases usable for modeling can be collected from routine health examinations. However, asking medical experts to provide a diagnosis for every case would add considerably to their burden, so in practice experts can diagnose only a small number of cases. A prediction model built from these few diagnosed cases alone often fails to reach the required accuracy, which greatly limits the usefulness of the device. At the same time, leaving the collected undiagnosed data unused wastes resources. If the predictive modeling method could build the model from diagnosed and undiagnosed cases together, so that the accuracy of a model built on a small number of diagnosed cases is improved by exploiting undiagnosed cases, the performance of the computer-aided medical diagnosis device would improve.
III. Summary of the invention
1. Object of the invention: Existing predictive modeling methods for computer-aided medical diagnosis can use only a small number of diagnosed cases, so the resulting prediction model cannot reach high accuracy. The main object of the invention is to address this problem by providing a predictive modeling method that builds the model from diagnosed and undiagnosed cases together, so that the accuracy of a model built on a small number of diagnosed cases is improved by exploiting undiagnosed cases, thereby helping to improve the performance of the computer-aided medical diagnosis device.
2. Technical solution: To achieve the above object, the invention provides a method that combines co-training and ensemble learning to perform high-accuracy predictive modeling from diagnosed and undiagnosed cases. The method obtains the symptoms of the subject to be diagnosed through medical symptom detection equipment, forms a symptom vector, and then produces a prediction through the following steps:
(1) if the prediction model has not yet been built, execute step (2); otherwise go to step (6);
(2) produce a labeled training set and an unlabeled training set from the diagnosed cases and the undiagnosed cases, respectively;
(3) train a random forest decision-tree ensemble on the labeled data;
(4) use co-training to improve the prediction accuracy of each individual tree in the random forest with the unlabeled data;
(5) use majority voting to combine the individual trees refined in step (4), producing the prediction model;
(6) use the prediction model to predict and output the result;
(7) end.
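The control flow of steps (1) to (6) can be sketched as a predictor that trains itself on first use. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `LazyDiagnosisModel` and the `train` callback are hypothetical, and the callback merely stands in for steps (2) to (5).

```python
from typing import Callable, Hashable, Optional, Sequence

SymptomVector = Sequence[float]
Model = Callable[[SymptomVector], Hashable]


class LazyDiagnosisModel:
    """Hypothetical wrapper: train on first use (steps 1-5), then predict (step 6)."""

    def __init__(self, train: Callable[[], Model]) -> None:
        self._train = train                      # stands in for steps (2)-(5)
        self._model: Optional[Model] = None
        self.training_runs = 0

    def predict(self, symptoms: SymptomVector) -> Hashable:
        if self._model is None:                  # step (1): model not built yet
            self._model = self._train()
            self.training_runs += 1
        return self._model(symptoms)             # step (6): predict and return the result


def toy_train() -> Model:
    """Toy stand-in for training: a fixed rule on the first symptom value."""
    return lambda symptoms: "fever" if symptoms[0] >= 38.0 else "healthy"


device = LazyDiagnosisModel(toy_train)
first = device.predict([38.6, 120.0])
second = device.predict([36.8, 118.0])
```

Training runs once, on the first call; later calls go straight to prediction, which mirrors the branch in step (1).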
3. Beneficial effects: The invention gives the computer-aided medical diagnosis device a predictive modeling method that exploits the large number of easily obtained undiagnosed cases to improve on the accuracy of a model built from diagnosed cases alone, thereby helping to improve the performance of the device.
The preferred embodiment is described in detail below with reference to the accompanying drawings.
IV. Description of the drawings
Fig. 1 is the workflow diagram of the computer-aided medical diagnosis device.
Fig. 2 is the flowchart of the method of the invention.
Fig. 3 shows the training process of the random forest.
Fig. 4 is the flowchart of using co-training to improve the accuracy of the individual trees in the random forest with undiagnosed cases.
V. Detailed description
As shown in Fig. 1, the computer-aided medical diagnosis device uses medical symptom detection equipment, such as body temperature and blood pressure measuring devices, to obtain the symptoms of the subject to be diagnosed, such as body temperature and blood pressure. The symptoms are then quantized into a symptom vector, for example [t_1, t_2, ..., t_n], where t_1 is the first symptom value, t_2 is the second symptom value, and so on. The symptom vector is passed to the prediction model, which produces a prediction; after text processing, this yields the diagnosis presented to the user.
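As a small illustration of the quantization step, the sketch below turns named measurements into a symptom vector with a fixed feature order. The feature names and their order are assumptions made for this example; the text only mentions body temperature and blood pressure as examples.

```python
from typing import List, Mapping, Sequence

# Hypothetical feature order: t_1, t_2, t_3 in the symptom vector.
FEATURES: Sequence[str] = ("body_temperature_c", "systolic_mmhg", "diastolic_mmhg")


def to_symptom_vector(
    measurements: Mapping[str, float], features: Sequence[str] = FEATURES
) -> List[float]:
    """Quantize named measurements into a vector [t_1, ..., t_n] in a fixed order."""
    missing = [name for name in features if name not in measurements]
    if missing:
        raise KeyError(f"missing measurements: {missing}")
    return [float(measurements[name]) for name in features]


vector = to_symptom_vector(
    {"systolic_mmhg": 118, "body_temperature_c": 37.2, "diastolic_mmhg": 76}
)
```

Fixing the order matters because each position in the vector must mean the same symptom for every case the model sees.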
The method of the invention is shown in Fig. 2. Step 10 is the start. Step 11 checks whether the prediction model has been trained: if it has, the model can handle diagnosis tasks and step 16 is executed; otherwise training is required and step 12 is executed. Step 12 uses the diagnosed and undiagnosed cases to produce the labeled training set L and the unlabeled training set U. Each case is one sample; the label states that the sample belongs to a particular disease or is "healthy", and unlabeled samples have no label.
Step 13 uses bootstrap sampling (sampling with replacement) to train N random decision trees from L, forming a special decision-tree ensemble, the random forest. N is a user-preset integer, for example 6, that determines the number of random decision trees in the forest. A random decision tree can be obtained by modifying a decision tree learning algorithm described in machine learning textbooks, such as C4.5 or CART. Specifically, when choosing a feature for an internal node of the tree, C4.5, CART, and similar algorithms select, according to some criterion, the most discriminative feature from all features of the data to split the data. A random decision tree instead first picks a random subset of the available features and then selects the most discriminative feature from that subset. It is therefore enough to change the candidate feature set used at each selection in C4.5, CART, or a similar algorithm from the full feature set to a random subset of it.
Step 14 uses co-training to improve the accuracy of each individual tree in the random forest with the unlabeled samples in U; this step is described in detail with reference to Fig. 4. After co-training finishes, step 15 uses majority voting to combine the predictions of all the random decision trees refined with the unlabeled data: the final prediction of the random forest is the one that most individual trees agree on. Step 16 receives the symptom vector to be diagnosed. Step 17 submits the symptom vector to the random forest refined with the unlabeled data for prediction. Step 18 outputs the prediction produced by the random forest. Step 19 is the end state.
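Two pieces of the description above lend themselves to a short sketch: choosing a split feature from a random subset (step 13) and majority voting (step 15). The function names, the `score` callback, and the `subset_size` parameter are illustrative assumptions; the text does not fix a scoring criterion or a subset size.

```python
import random
from collections import Counter
from typing import Callable, Hashable, Sequence


def pick_split_feature(
    features: Sequence[int],
    score: Callable[[int], float],
    subset_size: int,
    rng: random.Random,
) -> int:
    """Random-tree split choice: score only a random subset of the features."""
    candidates = rng.sample(list(features), min(subset_size, len(features)))
    return max(candidates, key=score)  # most discriminative feature in the subset


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label predicted by the most trees (first-seen label wins ties)."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
# With subset_size equal to the number of features this reduces to the usual
# C4.5/CART choice over the full feature set.
best_overall = pick_split_feature(range(5), lambda f: float(f), 5, rng)
best_in_subset = pick_split_feature(range(5), lambda f: float(f), 2, rng)
forest_label = majority_vote(["flu", "healthy", "flu"])
```

Restricting each split to a random subset is what makes the trees differ from one another, which the later co-training step relies on.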
Fig. 3 details step 13 of Fig. 2, showing how the random forest is produced from L. Step 1300 of Fig. 3 is the start state. Steps 1301, 1302, and 1309 form a loop; each pass builds one random decision tree, N trees in total. Step 1303 sets L_i, the training set for the i-th random decision tree, to the empty set. Steps 1304, 1305, and 1307 form an inner loop that produces L_i from L by bootstrap sampling, where m is the size of L_i, usually the same as the size of L. Step 1306 copies a sample selected at random from L into L_i. Because the sample is copied rather than moved, a sample that has just been selected may be selected again in a later pass, which is why the technique is called sampling with replacement. Step 1308 builds a random decision tree from the bootstrap training set L_i. Step 1310 is the end state.
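The sampling part of Fig. 3 can be sketched in a few lines. This is a minimal illustration, with hypothetical function names and toy data; tree construction (step 1308) is left out.

```python
import random
from typing import List, Sequence, Tuple

Case = Tuple[Sequence[float], str]  # (symptom vector, diagnosis label)


def bootstrap_sample(labeled: Sequence[Case], m: int, rng: random.Random) -> List[Case]:
    """Steps 1303-1307: copy m randomly chosen cases from L into a new L_i."""
    sample: List[Case] = []                  # step 1303: L_i starts empty
    for _ in range(m):                       # steps 1304, 1305, 1307: repeat m times
        sample.append(rng.choice(labeled))   # step 1306: copy; L itself is unchanged
    return sample


def bootstrap_training_sets(
    labeled: Sequence[Case], n_trees: int, rng: random.Random
) -> List[List[Case]]:
    """Outer loop (steps 1301, 1302, 1309): one bootstrap set per tree, each of size |L|."""
    return [bootstrap_sample(labeled, len(labeled), rng) for _ in range(n_trees)]


L = [([36.6, 118.0], "healthy"), ([38.9, 125.0], "flu"), ([37.8, 140.0], "hypertension")]
training_sets = bootstrap_training_sets(L, n_trees=6, rng=random.Random(42))
```

Each of the six training sets has the same size as L but generally contains some cases more than once and omits others.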
Fig. 4 details step 14 of Fig. 2, whose purpose is to use co-training to improve the accuracy of the individual trees in the random forest with the unlabeled samples in U. Step 1400 of Fig. 4 is the start state. Step 1401 initializes, for each H_i (i = 1, ..., N), the previous-round error e_i' to 0.5 and the previous-round total weight w_i' of the unlabeled samples used to 0. Here H_i is the random forest that remains after the individual tree h_i is removed from the random forest.
Steps 1402, 1403, and 1411 form a loop that traverses all individual trees; steps 1404 to 1410 lie inside this loop body. Step 1404 estimates, from the training data, the error e_i of H_i in the current round. Step 1405 checks whether the current-round error e_i of H_i is smaller than the previous-round error e_i'; if so, step 1406 is executed, otherwise control goes to step 1411. Step 1406 randomly selects samples from U and copies them into the set L_i, such that the total weight of all samples in L_i does not exceed e_i' w_i' / e_i. Every sample in U has an initial weight of 1. Note that in this step the selected samples are not removed from U, so a sample that has already been selected may be selected again later. Step 1407 uses H_i to predict the samples in L_i, assigns each prediction to the sample as its label, and records the prediction confidence of each sample. Step 1408 checks whether the prediction confidence exceeds a preset threshold; if so, step 1409 weights the sample by its confidence, otherwise step 1410 deletes the low-confidence sample from L_i.
Steps 1412, 1413, and 1417 form a second loop that traverses each individual tree in the random forest; steps 1414 to 1416 lie inside this loop body. Step 1414 checks whether the current-round error e_i of H_i is smaller than the previous-round error e_i' and whether the current-round total weight w_i of L_i is greater than the previous-round total weight w_i'. If both hold, L_i can help improve the accuracy of the current version of h_i, so step 1415 is executed; otherwise L_i is discarded and control goes to step 1417. Step 1415 retrains the individual tree h_i on the original labeled data set L together with the newly labeled data set L_i. Step 1416 assigns the current-round e_i and w_i to e_i' and w_i' in preparation for the next round. Step 1418 checks whether no individual tree was updated in the current round. If at least one was updated, control returns to step 1402 and a new round begins; if none was, step 1419 is entered. Step 1419 is the end state.
Here i, j, N, and m are natural numbers; h_i is the i-th random decision tree; H_i is the random forest formed by all random decision trees in the random forest H other than h_i. L is the labeled training set; L_i is the set of originally unlabeled samples that H_i has selected and labeled for h_i; e_i' and e_i are the errors of H_i in the previous and current rounds, respectively; w_i' and w_i are the sums of the sample weights in L_i in the previous and current rounds, respectively.
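The per-tree operations of Fig. 4 can be sketched as small helper functions. This is an illustration under stated assumptions, not the patented implementation: all function names are hypothetical, and the confidence measure (the share of trees in H_i that agree with the majority label) is an assumption, since the text only speaks of a "prediction confidence". The text also does not say how the first round proceeds when w_i' is 0 and the bound e_i' w_i' / e_i is therefore 0, so the sketch leaves that choice to the caller.

```python
import random
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

Sample = Sequence[float]
Tree = Callable[[Sample], Hashable]
Weighted = Tuple[Sample, Hashable, float]  # (sample, label assigned by H_i, weight)


def vote_with_confidence(trees: Sequence[Tree], x: Sample) -> Tuple[Hashable, float]:
    """Assumed confidence: fraction of the trees in H_i that vote for the majority label."""
    label, votes = Counter(tree(x) for tree in trees).most_common(1)[0]
    return label, votes / len(trees)


def weight_budget(prev_err: float, prev_weight: float, err: float) -> float:
    """Upper bound on the total weight of L_i in step 1406: e_i' * w_i' / e_i."""
    if err <= 0:
        raise ValueError("err must be positive")
    return prev_err * prev_weight / err


def subsample(unlabeled: Sequence[Sample], budget: float, rng: random.Random) -> List[Sample]:
    """Step 1406: draw with replacement; each draw weighs 1, so take floor(budget) draws."""
    if not unlabeled:
        return []
    return [rng.choice(unlabeled) for _ in range(int(budget))]


def label_and_filter(
    samples: Sequence[Sample], others: Sequence[Tree], threshold: float
) -> List[Weighted]:
    """Steps 1407-1410: label with H_i, keep confident samples weighted by confidence."""
    kept: List[Weighted] = []
    for x in samples:
        label, confidence = vote_with_confidence(others, x)
        if confidence > threshold:               # step 1408
            kept.append((x, label, confidence))  # step 1409
    return kept                                  # step 1410: the rest are dropped


def should_retrain(err: float, prev_err: float, weight: float, prev_weight: float) -> bool:
    """Step 1414: retrain h_i only if e_i < e_i' and w_i > w_i'."""
    return err < prev_err and weight > prev_weight


# Toy H_i made of three threshold rules on body temperature.
others: List[Tree] = [
    lambda x: "flu" if x[0] > 38.0 else "healthy",
    lambda x: "flu" if x[0] > 37.5 else "healthy",
    lambda x: "flu" if x[0] > 39.0 else "healthy",
]
newly_labeled = label_and_filter([[38.5], [36.5]], others, threshold=0.75)
```

In the toy run, the sample [38.5] gets only two of three votes and is dropped, while [36.5] is labeled "healthy" unanimously and kept with weight 1.0.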

Claims (3)

1. A predictive modeling method using undiagnosed cases in a computer-aided diagnosis device, characterized in that the method obtains the symptoms of the subject to be diagnosed through medical symptom detection equipment, forms a symptom vector, and then produces a prediction through the following steps:
(1) if the prediction model has not yet been built, execute step (2); otherwise go to step (6);
(2) produce a labeled training set and an unlabeled training set from the diagnosed cases and the undiagnosed cases, respectively;
(3) train a random forest decision-tree ensemble on the labeled data;
(4) use co-training to improve the prediction accuracy of each individual tree in the random forest with the unlabeled cases;
(5) use majority voting to combine the individual trees refined in step (4), producing the prediction model;
(6) use the prediction model to predict and output the result;
(7) end.
2. The predictive modeling method using undiagnosed cases in a computer-aided diagnosis device according to claim 1, characterized in that the method in step (3) of training a random forest decision-tree ensemble on the labeled data comprises the following steps:
(1301) set i to 0;
(1302) if i ≤ N, execute step (1303); otherwise go to step (1310);
(1303) set L_i to the empty set;
(1304) set j to 0;
(1305) if j ≤ m, execute step (1306); otherwise go to step (1308);
(1306) copy a sample selected at random from L into L_i;
(1307) set j = j + 1 and go to step (1305);
(1308) train a random decision tree on L_i;
(1309) set i = i + 1 and go to step (1302);
(1310) end;
where i, j, N, and m are natural numbers, L is the labeled training set, and L_i is the training set of the i-th random decision tree.
3. The predictive modeling method using undiagnosed cases in a computer-aided diagnosis device according to claim 1, characterized in that the method in step (4) of using co-training to improve the prediction accuracy of each individual tree in the random forest with the unlabeled cases comprises the following steps:
(1401) for each individual tree, set e_i' to 0.5 and w_i' to 0;
(1402) set i to 1;
(1403) if i ≤ N, execute step (1404); otherwise go to step (1412);
(1404) estimate the error e_i of H_i;
(1405) if e_i < e_i', execute step (1406); otherwise go to step (1411);
(1406) subsample from the unlabeled data set U to produce L_i;
(1407) use H_i to predict the samples in L_i;
(1408) if the prediction confidence is greater than the threshold, execute step (1409); otherwise go to step (1410);
(1409) weight the sample by its confidence;
(1410) delete from L_i the samples whose confidence does not exceed the threshold;
(1411) increase the counter i by 1;
(1412) set i to 1;
(1413) if i ≤ N, execute step (1414); otherwise go to step (1418);
(1414) if e_i < e_i' and w_i' < w_i, execute step (1415); otherwise go to step (1417);
(1415) retrain h_i using L and L_i;
(1416) set e_i' to e_i and w_i' to w_i;
(1417) increase the counter i by 1;
(1418) if no h_i was updated, execute step (1419); otherwise go to step (1402);
(1419) end;
where i, j, N, and m are natural numbers; h_i is the i-th random decision tree; H_i is the random forest formed by all random decision trees in the random forest H other than h_i; L is the labeled training set; L_i is the set of originally unlabeled samples that H_i has selected and labeled for h_i; e_i' and e_i are the errors of H_i in the previous and current rounds, respectively; and w_i' and w_i are the sums of the sample weights in L_i in the previous and current rounds, respectively.
CNA2005100954203A 2005-11-14 2005-11-14 Modeling method of forecast in device of computer aided diagnosis through using not diagnosed cases Pending CN1760881A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2005100954203A CN1760881A (en) 2005-11-14 2005-11-14 Modeling method of forecast in device of computer aided diagnosis through using not diagnosed cases

Publications (1)

Publication Number Publication Date
CN1760881A true CN1760881A (en) 2006-04-19

Family

ID=36706949

Country Status (1)

Country Link
CN (1) CN1760881A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011110491A1 (en) * 2010-03-09 2011-09-15 Sabirmedical, S.L. A non-invasive system and method for diagnosing and eliminating white coat hypertention and white coat effect in a patient
CN102298318A (en) * 2011-05-25 2011-12-28 中国人民解放军防化指挥工程学院 Biological hazard monitoring, predicting and optimal controlling system for emergency (BMPOSE)
CN103092762A (en) * 2013-02-19 2013-05-08 南京大学 Real-time software defect detection method applicable to rapid software development model
CN106339593A (en) * 2016-08-31 2017-01-18 青岛睿帮信息技术有限公司 Kawasaki disease classification and prediction method based on medical data modeling
CN106339593B (en) * 2016-08-31 2023-04-18 北京万灵盘古科技有限公司 Kawasaki disease classification prediction method based on medical data modeling
CN108648183A (en) * 2018-05-04 2018-10-12 北京雅森科技发展有限公司 A method of to SPECT brain three-dimensional image analysis
CN108648183B (en) * 2018-05-04 2021-07-27 北京雅森科技发展有限公司 Method for analyzing SPECT brain three-dimensional image
CN111477321A (en) * 2020-03-11 2020-07-31 北京大学第三医院(北京大学第三临床医学院) Treatment effect prediction system with self-learning capability and treatment effect prediction terminal
WO2022073244A1 (en) * 2020-10-10 2022-04-14 Roche Diagnostics Gmbh Method and system for diagnostic analyzing
CN115151182A (en) * 2020-10-10 2022-10-04 豪夫迈·罗氏有限公司 Method and system for diagnostic analysis
CN115151182B (en) * 2020-10-10 2023-11-14 豪夫迈·罗氏有限公司 Method and system for diagnostic analysis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication