EP0879449A2 - Method for selecting medical and biochemical diagnostic tests using neural network-related applications - Google Patents

Method for selecting medical and biochemical diagnostic tests using neural network-related applications

Info

Publication number
EP0879449A2
Authority
EP
European Patent Office
Prior art keywords
variables
variable
decision
layer
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP97915835A
Other languages
German (de)
English (en)
French (fr)
Inventor
Jerome Lapointe
Duane D. Desieno
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Adeza Biomedical Corp
Original Assignee
Adeza Biomedical Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Adeza Biomedical Corp filed Critical Adeza Biomedical Corp
Publication of EP0879449A2 publication Critical patent/EP0879449A2/en

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 - ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G16H10/00 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/20 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
    • G16H10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G16H50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H50/70 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • BIOCHEMICAL DIAGNOSTIC TESTS USING NEURAL NETWORKS
  • The subject matter of the invention relates to the use of prediction technology, particularly nonlinear prediction technology, for the development of medical diagnostic aids.
  • training techniques operative on neural networks and other expert systems with inputs from patient historical information for the development of medical diagnostic tools and methods of diagnosis are provided.
  • a number of computer decision-support systems have the ability to classify information and identify patterns in input data, and are particularly useful in evaluating data sets having large quantities of variables and complex interactions between variables.
  • These computer decision systems, which are collectively identified as "data mining" or "knowledge discovery in databases" (and herein as decision-support systems), rely on similar basic hardware components, e.g., personal computers (PCs) with a processor, internal and peripheral devices, memory devices and input/output interfaces.
  • Paradigms that provide decision-support functions include regression methods, decision trees, discriminant analysis, pattern recognition, Bayesian decision theory, and fuzzy logic.
  • One of the more widely used decision-support computer systems is the artificial neural network.
  • Artificial neural networks or "neural nets" are parallel information processing tools in which individual processing elements called neurons are arrayed in layers and furnished with a large number of interconnections between elements in successive layers.
  • the functioning of the processing elements is modeled to approximate biologic neurons, where the output of the processing element is determined by a typically non-linear transfer function.
  • the processing elements are arranged into an input layer of elements which receive inputs, an output layer containing one or more elements which generate an output, and one or more hidden layers of elements therebetween.
  • the hidden layers provide the means by which non-linear problems may be solved.
  • the input signals to the element are weighted arithmetically according to a weight coefficient associated with each input.
  • the resulting weighted sum is transformed by a selected non-linear transfer function, such as a sigmoid function, to produce, for each processing element, an output whose value ranges from 0 to 1.
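  • As an illustration of the weighted-sum and sigmoid transfer just described, the following is a minimal Python sketch of a single processing element; the function name and example values are illustrative and are not taken from the patent.

```python
import math

def processing_element(inputs, weights, bias=0.0):
    """Weighted sum of the inputs passed through a sigmoid transfer function.

    `inputs` and `weights` are equal-length sequences; the output lies
    between 0 and 1, as described above.
    """
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))

# Example: three weighted inputs produce one bounded output (here 0.5).
print(processing_element([0.5, 1.0, 0.0], [0.8, -0.4, 0.3]))
```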
  • the learning process, called "training", is a trial-and-error process involving a series of iterative adjustments to the processing element weights so that a particular processing element provides an output which, when combined with the outputs of other processing elements, generates a result which minimizes the resulting error between the outputs of the neural network and the desired outputs as represented in the training data. Adjustment of the element weights is triggered by error signals. Training data are described as a number of training examples, in which each example contains a set of input values to be presented to the neural network and an associated set of desired output values.
  • a common training method is backpropagation or "backprop", in which error signals are propagated backwards through the network.
  • the error signal is used to determine how much any given element's weight is to be changed, based on the error gradient, with the goal being to converge to a global minimum of the mean squared error.
  • the path toward convergence, i.e., the gradient descent, is taken in steps, each step being an adjustment of the input weights of the processing element.
  • the size of each step is determined by the learning rate.
  • the error surface traversed during gradient descent includes flat and steep regions and valleys that act as local minima, giving the false impression that convergence has been achieved and leading to an inaccurate result.
  • Some variants of backprop incorporate a momentum term in which a proportion of the previous weight-change value is added to the current value.
  • the Quickprop algorithm is publicly accessible, and may be downloaded via the Internet, from the Artificial Intelligence Repository maintained by the School of Computer Science at Carnegie Mellon University.
  • a dynamic momentum rate is calculated based upon the slope of the gradient. If the slope is smaller but has the same sign as the slope following the immediately preceding weight adjustment, the weight change will accelerate. The acceleration rate is determined by the magnitude of successive differences between slope values. If the current slope is in the opposite direction from the previous slope, the weight change decelerates.
  • the Quickprop method improves convergence speed, giving the steepest possible gradient descent while helping to prevent convergence to a local minimum.
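  • The following sketch illustrates, under simplifying assumptions, the style of weight update described above: a Quickprop-like quadratic step that falls back to a plain gradient-descent step when no usable history exists. The function name, parameters and the growth cap are illustrative; Fahlman's full algorithm includes additional safeguards omitted here.

```python
def quickprop_update(slope, prev_slope, prev_delta,
                     learning_rate=0.1, max_growth=1.75):
    """Simplified Quickprop-style step for a single weight.

    `slope` and `prev_slope` are the current and previous error gradients
    for this weight; `prev_delta` is the previous weight change.
    """
    if prev_delta != 0.0 and prev_slope != slope:
        # Quadratic step toward the estimated minimum of the error curve.
        delta = prev_delta * slope / (prev_slope - slope)
        # Cap the step so a small slope difference cannot make it explode.
        limit = max_growth * abs(prev_delta)
        if abs(delta) > limit:
            delta = limit if delta > 0 else -limit
    else:
        # No usable history: take an ordinary gradient-descent step.
        delta = -learning_rate * slope
    return delta

# Example: the slope shrank but kept its sign, so the update continues
# in the same direction as the previous weight change.
print(quickprop_update(slope=0.4, prev_slope=1.0, prev_delta=-0.05))
```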
  • the neural network acts as an associative memory that is able to generalize to a correct solution for sets of new input data that were not part of the training data.
  • Neural networks have been shown to be able to operate even in the absence of complete data or in the presence of noise. It has also been observed that the performance of the network on new or test data tends to be lower than the performance on training data. The difference in the performance on test data indicates the extent to which the network was able to generalize from the training data. A neural network, however, can be retrained and thus learn from the new data, improving the overall performance of the network.
  • Neural nets thus have characteristics that make them well suited for a large number of different problems, including areas involving prediction, such as medical diagnosis. Neural Nets and Diagnosis
  • In diagnosing and/or treating a patient, a physician will use the patient's condition, symptoms, and the results of applicable medical diagnostic tests to identify the disease state or condition of the patient. The physician must carefully determine the relevance of the symptoms and test results to the particular diagnosis and use judgement based on experience and intuition in making a particular diagnosis. Medical diagnosis involves integration of information from several sources, including a medical history, a physical exam and biochemical tests. Based upon the results of the exam and tests and answers to the questions, the physician, using his or her training, experience, knowledge and expertise, formulates a diagnosis. A final diagnosis may require subsequent surgical procedures to verify or to formulate. Thus, the process of diagnosis involves a combination of decision-support, intuition and experience. The validity of a physician's diagnosis is very dependent upon his/her experience and ability.
  • neural networks have been used to aid in the diagnosis of cardiovascular disorders (see, e.g., Baxt (1991) "Use of an Artificial Neural Network for the Diagnosis of Myocardial Infarction," Annals of Internal Medicine 115:843; Baxt (1992) "Improving the Accuracy of an Artificial Neural Network Using Multiple Differently Trained Networks," Neural Computation 4:772; Baxt (1992) "Analysis of the clinical variables that drive decision in an artificial neural network trained to identify the presence of myocardial infarction," Annals of Emergency Medicine 21:1439; and Baxt (1994) "Complexity, chaos and human physiology: the justification for non-linear neural computational analysis," Cancer Letters 77:85).
  • Neural networks have also been used for cancer diagnosis (see, e.g., Maclin, et al. (1991) "Using Neural Networks to Diagnose Cancer," Journal of Medical Systems 15:11-9; Rogers, et al. (1994) "Artificial Neural Networks for Early Detection and Diagnosis of Cancer," Cancer Letters 77:79-83; Wilding, et al. (1994) "Application of Backpropagation Neural Networks to Diagnosis of Breast and Ovarian Cancer," Cancer Letters 77:145-53), neuromuscular disorders (Pattichis, et al. (1995) "Neural Network Models in EMG Diagnosis," IEEE Transactions on Biomedical Engineering 42:5:486-495), and chronic fatigue syndrome (Solms, et al. (1996) "A Neural Network Diagnostic Tool for the Chronic Fatigue Syndrome," International Conference on Neural Networks, Paper No.
  • MYCIN (Davis, et al., "Production Systems as a Representation for a Knowledge-based Consultation Program," Artificial Intelligence, 1977; 8:1:15-45)
  • MYCIN is an interactive program that diagnoses certain infectious diseases and prescribes anti-microbial therapy.
  • Such knowledge-based systems contain factual knowledge and rules or other methods for using that knowledge, with all of the information and rules being pre-programmed into the system's memory rather than the system developing its own procedure for reaching the desired result based upon input data, as in neural networks.
  • Another computerized diagnosis method is the Bayesian network, also known as a belief or causal probabilistic network, which classifies patterns based on probability density functions from training patterns and a priori information.
  • Bayesian decision systems are reported for uses in interpretation of mammograms for diagnosing breast cancer (Roberts, et al., "MammoNet: A Bayesian Network diagnosing Breast Cancer," Midwest Artificial Intelligence and Cognitive Science Society Conference, Carbondale, IL, April 1995) and hypertension (Blinowska, et al. (1993) "Diagnostica -- A Bayesian Decision-Aid System -- Applied to Hypertension Diagnosis," IEEE Transactions on Biomedical Engineering 40:230-35). Bayesian decision systems are somewhat limited in their reliance on linear relationships and in the number of input data points that can be handled, and may not be as well suited for decision-support involving non-linear relationships between variables.
  • Bayesian methods using the processing elements of a neural network can overcome some of these limitations (see, e.g., Penny, et al. (1996) "Neural Networks in Clinical Medicine," Medical Decision Making, 1996; 16:4:386-98). These methods have been used, by mimicking the physician, to diagnose disorders in which important variables are input into the system. It, however, would be of interest to use these systems to improve upon existing diagnostic procedures. Endometriosis
  • Endometriosis is the growth of uterine-like tissue outside of the uterus. It affects about 15-30 percent of reproductive age women. The cause(s) of endometriosis are not known, but it may result from retrograde menstruation, the reflux of endometrial tissue and cells (menstrual debris) from the uterus into the peritoneal cavity. While retrograde menstruation is thought to occur in most or all women, it is unclear why some women develop endometriosis and others do not.
  • endometriosis represents an example of a disease state in which a physician must draw upon experience using a complex set of information to formulate a diagnosis. The validity of the diagnosis is related to the experience and ability of the physician.
  • the methods provided herein include a method of using patient history data and identification of important variables to develop a diagnostic test; a method for identification of important selected variables; a method of designing a diagnostic test; a method of evaluating the usefulness of a diagnostic test; a method of expanding the clinical utility of a diagnostic test; and a method of selecting a course of treatment by predicting the outcome of various possible treatments.
  • The methods use disease parameters or variables to aid in the diagnosis of disorders, including any disorders that are difficult to diagnose, such as endometriosis, the prediction of pregnancy-related events, such as the likelihood of delivery within a particular time period, and other such disorders relevant to women's health. It is understood that although women's disorders are exemplified herein, the methods herein are applicable to any disorder or condition.
  • a method for identifying variables or sets of variables that aid in the diagnosis of disorders or conditions is provided herein.
  • patient data or information, typically patient history or clinical data, are collected, and variables based on these data are identified.
  • the data includes information for each patient regarding the number of pregnancies each patient has had.
  • the extracted variable is, thus, number of pregnancies.
  • the variables are analyzed by the decision-support systems, exemplified by neural networks, to identify important or relevant variables.
  • Methods are provided for developing medical diagnostic tests using computer-based decision-support systems, such as neural networks and other adaptive processing systems (collectively, "data mining tools") .
  • the neural networks or other such systems are trained on the patient data and observations collected from a group of test patients in whom the condition is known or suspected; a subset or subsets of relevant variables are identified through the use of a decision-support system or systems, such as a neural network or a consensus of neural networks; and another set of decision-support systems is trained on the identified subset(s) to produce a consensus decision-support system based test, such as a neural net-based test for the condition.
  • consensus systems, such as consensus neural networks, minimize the negative effects of local minima in decision-support systems, such as neural network-based systems, thereby improving the accuracy of the system.
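  • A minimal sketch of forming such a consensus by averaging the outputs of several trained networks; `networks` and `example` are illustrative stand-ins for trained models run in forward mode and a single input record.

```python
def consensus_output(networks, example):
    """Equally weighted average of the outputs of several trained networks.

    `networks` is any sequence of callables mapping an input record to a
    single output; averaging damps the effect of any one network that
    settled in a poor local minimum.
    """
    outputs = [net(example) for net in networks]
    return sum(outputs) / len(outputs)
```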
  • the patient data can be augmented by increasing the number of patients used.
  • biochemical test data and other data may be included as part of additional examples or by using the data as additional variables prior to the variable selection process.
  • the resulting systems are used as an aid in diagnosis.
  • patient data can be stored and then used to further train the systems and to develop systems that are adapted for a particular genetic population.
  • This inputting of additional data into the system may be implemented automatically or done manually. By doing so the systems continually learn and adapt to the particular environment in which they are used.
  • the resulting systems have numerous uses in addition to diagnosis, which include assessing the severity of a disease or disorder and predicting the outcome of a selected treatment protocol.
  • the systems may also be used to assess the value of other data in a diagnostic procedure, such as biochemical test data and other such data, and to identify new tests that are useful for diagnosing a particular disease.
  • the disorders and conditions that are of particular interest herein and to which the methods herein may be readily applied are gynecological conditions and other conditions that impact on fertility, including but not limited to endometriosis, infertility, prediction of pregnancy-related events, such as the likelihood of delivery within a particular time period, and pre-eclampsia. It is understood, however, that the methods herein are applicable to any disorder or condition.
  • the methods are exemplified with reference to neural networks; however, it is understood that other data mining tools, such as expert systems, fuzzy logic, decision trees, and other statistical decision-support systems, which are generally non-linear, may be used.
  • although the variables provided herein are intended to be used with decision-support systems, once the variables are identified, a person, typically a physician, armed with knowledge of the important variables can use them to aid in diagnosis in the absence of a decision-support system or using a less complex linear system of analysis.
  • patient history data, without supplementing biochemical test data, can be used to diagnose or aid in diagnosing a disorder or condition when used with the decision-support systems, such as the neural nets provided herein.
  • the accuracy of the diagnosis with or without biochemical data may be sufficient to obviate the need for invasive surgical diagnostic procedures.
  • the results of a particular test, particularly one that had heretofore not been considered of clinical utility with respect to the disorder or condition of interest, are combined with the variables and used with the decision-support system, such as a neural net. If the performance, i.e., the ability to correctly diagnose a disorder, of the system is improved by addition of the results of the test, then the test will have clinical utility or a new utility.
  • the resulting systems can be used to identify new utilities for drugs or therapies and also to identify uses for particular drugs and therapies.
  • the systems can be used to select subpopulations of patients for whom a particular drug or therapy is effective.
  • neural networks are employed to evaluate specific observation values and test results, to guide the development of biochemical or other diagnostic tests, and to provide the decision-support functionality for the test.
  • a method for identification of important variables (parameters) or sets thereof for use in the decision-support systems is also provided.
  • This method, while exemplified herein with reference to medical diagnosis, has broad applicability in any field, such as financial analysis, in which important parameters or variables are selected from among a plurality.
  • a method for selecting effective combinations of variables involves: (1) providing a set of "n" candidate variables and a set of "selected important variables", which initially is empty; (2) ranking all candidate variables based on a chi square and sensitivity analysis; (3) taking the highest "m" ranked variables one at a time, where m is from 1 up to n, and evaluating each by training a consensus of neural nets on that variable combined with the current set of important variables; (4) selecting the best of the m variables, where the best variable is the one that gives the highest performance, and, if it improves performance in comparison to the performance of the selected important variables, adding it to the "selected important variable" set, removing it from the candidate set and continuing processing at step (3), otherwise going to step (5); (5) if all variables on the candidate set have been evaluated, the process is complete; otherwise, continuing to take the next highest "m" ranked variables one at a time and evaluating each by training a consensus of neural nets on that variable combined with the current set of selected important variables.
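  • The following is a minimal sketch of the selection loop just described. The `evaluate` callable stands in for training a consensus of neural nets on a candidate subset and returning its performance (higher is better); the names and the default value of m are illustrative, not taken from the patent.

```python
def select_variables(ranked_candidates, evaluate, m=2):
    """Greedy selection of important variables from a ranked candidate list."""
    selected = []                       # the "selected important variables" set
    best_score = float("-inf")
    candidates = list(ranked_candidates)
    improved = True
    while improved and candidates:
        improved = False
        # Work through the remaining ranking m candidates at a time.
        for start in range(0, len(candidates), m):
            batch = candidates[start:start + m]
            # Each candidate is evaluated together with the already selected set.
            scored = [(evaluate(selected + [v]), v) for v in batch]
            score, best = max(scored, key=lambda pair: pair[0])
            if score > best_score:
                best_score = score
                selected.append(best)
                candidates.remove(best)
                improved = True
                break                   # restart from the top of the ranking
    return selected
```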
  • the sensitivity analysis involves: (k) determining an average observation value for each of the variables in an observation data set; (l) selecting a training example and running the example through a decision-support system to produce an output value, designated and stored as the normal output; (m) selecting a first variable in the selected training example, replacing the observation value with the average observation value of that variable, running the modified example through the decision-support system in the forward mode and recording the output as the modified output; (n) squaring the difference between the normal output and the modified output and accumulating it as a total for each variable, in which this total is designated the selected variable total for each variable; (o) repeating steps (m) and (n) for each variable in the example; (p) repeating steps (l)-(n) for each example in the data set, where each total for a selected variable represents the relative contribution of that variable to the determination of the decision-support system output. This total is used to rank each variable according to its relative contribution to the determination of the decision-support system output.
  • computer-based decision-support systems such as neural networks reveal that certain input factors, which were not initially considered to be important, can influence an outcome.
  • This ability of a neural network to reveal the relevant input factors permits its use in guiding the design of diagnostic tests.
  • a method of designing a diagnostic test and a method of evaluating the utility of a diagnostic test are also provided.
  • the data from the test or possible test is added to the input of the decision-support system. If the results are improved when the data are included in the input, then the diagnostic test may have clinical utility. In this manner, tests that heretofore were not known to be of value in diagnosis of a particular disorder are identified, or new tests can be developed.
  • Neural networks can add robustness to diagnostic tests by discounting the effects of spurious data points and by identifying other data points that might be substituted, if any.
  • Networks are trained on one set of variables, and then clinical data from diagnostic or biochemical test data and/or additional patient information are added to the input data. Any variable that improves the results, compared to its absence, is selected.
  • particular tests that heretofore were of unknown value in diagnosing a particular disorder can be shown to have relevance. For example, the presence or absence of particular spots on a western blot of serum antibodies can be correlated with a disease state. Based on the identity of particular spots (i.e., antigens) new diagnostic tests can be developed.
  • An example of the application of the prediction technology to aid in the diagnosis of disease and more particularly the use of neural network techniques with inputs from various information sources to aid in the diagnosis of the disease endometriosis is provided.
  • a trained set of neural networks operative according to a consensus of networks in a computer system is employed to evaluate specific clinical associations, for example obtained by survey, some of which may not generally be associated with a disease condition. This is demonstrated with an exemplary disease condition endometriosis, and factors used to aid in the diagnosis of endometriosis are provided.
  • the neural network training is based on correlations between answers to questions furnished by physicians of a significant number of clinical patients whose disease condition has been surgically verified, herein termed clinical data.
  • a plurality of factors, twelve to about sixteen, particularly a set of fourteen factors, in a specific trained neural network extracted from a collection of over forty clinical data factors have been identified as primary indicia for endometriosis.
  • the following set of parameters was identified as being significant: age, parity (number of births), gravidity (number of pregnancies), number of abortions, smoking (packs/day), past history of endometriosis, dysmenorrhea, pelvic pain, abnormal pap/dysplasia, history of pelvic surgery, medication history, pregnancy hypertension, genital warts and diabetes. Other similar sets of parameters were also identified. Subsets of these variables may also be employed in diagnosing endometriosis.
  • any subset of the selected set of parameters, particularly the set of fourteen variables, that contains one (or more) of the following combinations of three variables can be used with a decision-support system for diagnosis of endometriosis: a) number of births, history of endometriosis, history of pelvic surgery; b) diabetes, pregnancy hypertension, smoking; c) pregnancy hypertension, abnormal pap smear/dysplasia, history of endometriosis; d) age, smoking, history of endometriosis; e) smoking, history of endometriosis, dysmenorrhea; f) age, diabetes, history of endometriosis; g) pregnancy hypertension, number of births, history of endometriosis; h) smoking, number of births, history of endometriosis; i) pregnancy hypertension, history of endometriosis, history of pelvic surgery; j) number of pregnancies, history of endometriosis, history of pelvic surgery.
  • Diagnostic software and exemplary neural networks that use the variables for diagnosis of endometriosis are also provided.
  • the software generates a clinically useful endometriosis index.
  • the performance of a diagnostic neural network system used to test for endometriosis is enhanced by including variables based on biochemical test results from a relevant biochemical test as part of the factors (herein termed biochemical test data, which includes tests from analyses and data such as vital signs, such as pulse rate and blood pressure) used for training the network.
  • An exemplary network that results therefrom is an augmented neural network that employs fifteen input factors, including results of the biochemical test and the fourteen clinical parameters.
  • the set of weights of the eight augmented neural networks differ from the set of weights of the eight clinical data neural networks.
  • the exemplified biochemical test employs an immuno-diagnostic test format, such as the ELISA diagnostic test format.
  • the methodology applied to endometriosis as exemplified herein can be similarly applied and used to identify factors for other disorders, including, but not limited to gynecological disorders and female- associated disorders, such as, for example, infertility, prediction of pregnancy related events, such as the likelihood of delivery within a particular time period, and pre-eclampsia.
  • Neural networks thus, can be trained to predict the disease state based on the identification of factors important in predicting the disease state and combining them with biochemical data.
  • the resulting diagnostic systems may be adapted and used not only for diagnosing the presence of a condition or disorder, but also the severity of the disorder and as an aid in selecting a course of treatment.
  • FIGURE 1 is a flow chart for developing a patient-history-based diagnostic test process.
  • FIGURE 2 is a flow chart for developing a biochemical diagnostic test.
  • FIGURE 3 is a flow chart of the process for isolating important variables.
  • FIGURE 4 is a flow chart on the process of training one or a set of neural networks involving a partitioning of variables.
  • FIGURE 5 is a flow chart for developing a biochemical diagnostic test.
  • FIGURE 6 is a flow chart for determining the effectiveness of a biochemical diagnostic test.
  • FIGURE 7 is a schematic diagram of a neural network trained on clinical data of the form used for the consensus network of a plurality of neural networks.
  • FIGURE 8 is a schematic diagram of a second embodiment of a neural network trained on clinical data augmented by test results data of the form used for the consensus of eight neural networks.
  • FIGURE 9 is a schematic diagram of a processing element at each node of the neural network.
  • FIGURE 10 is a schematic diagram of a consensus network of eight neural networks using either the first or second embodiment of the neural network.
  • FIGURE 11 is a depiction of an exemplary interface screen of the user interface in the diagnostic endometriosis index. DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS Definitions
  • a decision-support system also referred to as a "data mining system” or a “knowledge discovery in data system” is any system, typically a computer-based system, that can be trained on data to classify the input data and then subsequently used with new input data to make decisions based on the training data.
  • These systems include, but are not limited to, expert systems, fuzzy logic, non-linear regression analysis, multivariate analysis, decision tree classifiers, Bayesian belief networks and, as exemplified herein, neural networks.
  • an adaptive machine learning process refers to any system whereby data are used to generate a predictive solution. Such processes include those effected by expert systems, neural networks, and fuzzy logic.
  • expert system is a computer-based problem solving and decision-support system based on knowledge of its task and logical rules or procedures for using the knowledge. Both the knowledge and the logic are entered into the computer from the experience of human specialists in the area of expertise.
  • a neural network is a parallel computational model comprised of densely interconnected adaptive processing elements.
  • the processing elements are configured into an input layer, an output layer and at least one hidden layer.
  • Suitable neural networks are known to those of skill in this art (see, e.g. , U.S.
  • a processing element which may also be known as a perceptron or an artificial neuron, is a computational unit which maps input data from a plurality of inputs into a single binary output in accordance with a transfer function.
  • Each processing element has an input weight corresponding to each input which is multiplied with the signal received at that input to produce a weighted input value.
  • the processing element sums the weighted input values of each of the inputs to generate a weighted sum which is then compared to the threshold defined by the transfer function.
  • a transfer function, also known as a threshold function or an activation function, is a mathematical function which creates a curve defining two distinct categories. Transfer functions may be linear but, as used in neural networks, are more typically non-linear, including quadratic, polynomial, or sigmoid functions.
  • backpropagation is a training method for neural networks for correcting errors between the target output and the actual output.
  • the error signal is fed back through the processing layer of the neural network, causing changes in the weights of the processing elements to bring the actual output closer to the target output.
  • Quickprop is a backpropagation method that was proposed, developed and reported by Fahlman ("Fast Learning Variations on Back-Propagation: An Empirical Study," Proceedings of the 1988 Connectionist Models Summer School, Pittsburgh, 1988, D. Touretzky, et al., eds., pp. 38-51, Morgan Kaufmann, San Mateo, CA; and, with
  • diagnosis refers to a predictive process in which the presence, absence, severity or course of treatment of a disease, disorder or other medical condition is assessed.
  • diagnosis will also include predictive processes for determining the outcome resulting from a treatment.
  • biochemical test data refers to the results of any analytical methods, which include, but are not limited to, immunoassays, bioassays, chromatography, data from monitors and imagers, and measurements; the term also includes data related to vital signs and body function, such as pulse rate, temperature and blood pressure, and the results of, for example, EKG, ECG and EEG, biorhythm monitors and other such information.
  • the analysis can assess for example, analytes, serum markers, antibodies, and other such material obtained from the patient through a sample.
  • patient historical data refers to data obtained from a patient, such as by questionnaire format, but typically does not include biochemical test data as used herein, except to the extent such data are historical. A desired solution is one that generates a number or result whereby a diagnosis of a disorder can be generated.
  • a training example includes the observation data for a single diagnosis, typically the observation data related to one patient.
  • patient data will include information with respect to individual patient's smoking habits. The variable associated with that will be smoking.
  • partition means to select a portion of the data, such as 80%, and use it for training a neural net and to use the remaining portion as test data.
  • the network is trained on all but one portion of the data.
  • the process can then be repeated and a second network trained.
  • the process is repeated until all partitions are used as test data and training data.
  • the method of training by partitioning the available data into a plurality of subsets is generally referred to as the "holdout method" of training.
  • the holdout method is particularly useful when the data available for network training is limited.
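  • A minimal sketch of the holdout partitioning described above, assuming five partitions so that each network trains on 80% of the data and is tested on the held-out 20%; the helper names in the usage comment are hypothetical.

```python
def partitions(examples, n_parts=5):
    """Yield (train, test) splits with each fold held out once as test data."""
    folds = [examples[i::n_parts] for i in range(n_parts)]
    for i, test in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        yield train, test

# Usage sketch (train_network and score are hypothetical helpers):
# nets = []
# for train, test in partitions(patient_examples):
#     net = train_network(train)
#     print(score(net, test))
#     nets.append(net)        # keep every trained network for a consensus
```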
  • training refers to the process in which input data are used to generate a decision-support system.
  • training is a trial-and-error process involving a series of iterative adjustments to the processing element weights so that a particular processing element provides an output which, when combined with the outputs of other processing elements, generates a result which minimizes the resulting error between the outputs of the neural network and the desired outputs as represented in the training data.
  • a variable selection process is a systematic method whereby combinations of variables that yield predictive results are selected from any available set. Selection is effected by maximizing predictive performance of subsets such that addition of additional variables does not improve the result.
  • the preferred methods provided herein advantageously permit selection of variables without considering all possible combinations.
  • a candidate variable is a selected item from collected observations from a group of test patients for the diagnostic embodiments or other records, such as financial records, that can be used with the decision-support system.
  • Candidate variables will be obtained by collecting data, such as patient data, and categorizing the observations as a set of variables.
  • important selected variables refer to variables that enhance the network performance of the task at hand. Inclusion of all available variables does not result in the optimal neural network; some variables, when included in network training, lower the network performance. Networks that are trained only with relevant parameters result in increased network performance. These variables are also referred to herein as a subset of relevant variables.
  • ranking refers to a process in which variables are listed in an order for selection.
  • Ranking may be arbitrary or, preferably, is ordered. Ordering may be effected, for example, by a statistical analysis that ranks the variables in order of importance with respect to the task, such as diagnosis, or by a decision-support system based analysis. Ranking may also be effected, for example, by human experts, by rule based systems, or any combination of any of these methods.
  • a consensus of neural networks refers to the linear combination of outputs from a plurality of neural networks, where the weight on each output is determined arbitrarily or set to an equal value.
  • a greedy algorithm is a method for optimizing a data set by determining whether to include or exclude a point from a given data set.
  • the set begins with no elements and sequentially selects an element from the feasible set of remaining elements by myopic optimization, in which, given any partial solution, another value that improves the objective the most is selected.
  • a genetic algorithm is a method that begins with an initial population of randomly generated neural networks which are run through a training cycle and ranked according to their performance in reaching the desired target. The poor-performing networks are removed from the population, with the fitter networks being retained and selected for the crossover process to produce offspring that retain the desirable characteristics of the parent networks.
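  • A toy sketch of the genetic-algorithm cycle just described: rank a population by fitness, discard the poor performers, and refill the population with crossover offspring subject to occasional mutation. Here the genomes are bit strings (for example, masks over candidate input variables) and `fitness` is an assumed stand-in for training and scoring a network; none of these names come from the patent.

```python
import random

def genetic_search(fitness, genome_length, pop_size=20, generations=30,
                   keep=0.5, mutation_rate=0.02):
    """Evolve bit-string genomes toward higher values of `fitness`."""
    population = [[random.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness and keep only the better-performing genomes.
        population.sort(key=fitness, reverse=True)
        survivors = population[:max(2, int(keep * pop_size))]
        # Refill the population with crossover offspring of the survivors.
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, genome_length)
            child = a[:cut] + b[cut:]
            # Mutation helps the search avoid trapping in local minima.
            child = [bit ^ 1 if random.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)
```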
  • performance of a system is said to be improved or higher when the results more accurately predict or determine a particular outcome. It is also to be understood that the performance of a system will typically be better as more training examples are used. Thus, the systems herein will improve over time as they are used and more patient data are accumulated and then added to the systems as training data.
  • Clinical sensitivity measures how well a test detects patients with the disease; clinical specificity measures how well a test correctly identifies those patients who do not have the disease.
  • positive predictive value (PPV) is TP/(TP + FP), and negative predictive value (NPV) is TN/(TN + FN).
  • Positive predictive value is the likelihood that a patient with a positive test actually has the disease, and negative predictive value is the likelihood that a patient with a negative test result does not have the disease.
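  • The four figures of merit defined above, computed from the counts in a 2x2 table of test results; the function name and the example counts are illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic figures of merit from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # fraction of diseased patients detected
        "specificity": tn / (tn + fp),  # fraction of disease-free patients correctly identified
        "ppv": tp / (tp + fp),          # likelihood that a positive test means disease
        "npv": tn / (tn + fn),          # likelihood that a negative test means no disease
    }

# Example with made-up counts: 80 TP, 10 FP, 85 TN, 20 FN.
print(diagnostic_metrics(80, 10, 85, 20))
```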
  • fuzzy logic is an approach to deal with systems that cannot be described precisely. Membership functions (membership in a data set) are not binary in fuzzy logic systems; instead, membership functions may take on fractional values. Therefore, an element can be simultaneously contained in two contradictory sets, albeit with different coefficients of set membership. Thus, this type of approach is useful for answering questions to which there is no yes or no answer, and this type of logic is suitable for categorizing responses from patient historical questionnaires, in which the answer is often one of degree.
  • neural networks may also reveal that certain input factors that were not initially considered to be important can influence an outcome, as well as reveal that presumably important factors are not outcome determinative.
  • the ability of neural networks to reveal the relevant and irrelevant input factors permit their use in guiding the design of a diagnostic test.
  • neural networks and other such data mining tools are a valuable advance in diagnostics, providing an opportunity to increase the sensitivity and specificity of a diagnostic test. As shown herein, care must be taken to avoid the potential of a poor-accuracy answer due to the phenomenon of local minima. The methods herein provide a means to avoid this problem or at least minimize it.
  • Methods for diagnosis based solely on patient history data are provided. As demonstrated herein, it is possible to provide decision-support systems that rely only on patient history information but that aid in diagnosis. Consequently, the resulting systems can then be used to improve the predictive ability of biochemical test data, to identify new disease markers, to develop biochemical tests, and to identify tests that heretofore were not thought to be predictive of a particular disorder.
  • the methods may also be used to select an appropriate course of treatment by predicting the result of selected course of treatment and to predict status following therapy.
  • the input variables for training would be derived from, for example, electronic patient records, that indicate diagnoses and other available data, including selected treatments and outcomes.
  • the resulting decision-support system would then be used with all available data to, for example, categorize women into different classes that will respond to different treatments and predict the outcome of a particular treatment. This permits selection of a treatment or protocol most likely to be successful.
  • the systems can be used to identify new utilities for drugs or therapies and also to identify uses for particular drugs and therapies. For example, the systems can be used to select subpopulations of patients for whom a particular drug or therapy is effective. Thus, methods for expanding the indication for a drug or therapy and identifying new drugs and therapies are provided.
  • Fig. 1 sets forth a flow chart for developing a patient-history-based diagnostic test process.
  • the process begins with collection of patient history data (Step A) .
  • Patient history data or observation values are obtained from patient questionnaires, clinical results, in some instances diagnostic test results, and patient medical records and supplied in computer-readable form to a system operating on a computer.
  • the patient history data are categorized into a set of variables of two forms: binary (such as true/false) values and quantitative (continuous) values.
  • binary-valued variable might include the answer to the question, "Do you smoke?"
  • a quantitative-valued variable might be the answer to the question, "How many packs per day do you smoke?"
  • Other values, such as membership functions, may also be useful as input vehicles.
  • the patient history data will also include a target or desired outcome variable that would be assumed to be indicative of the presence, absence, or severity of the medical condition to be diagnosed.
  • This desired outcome information is useful for neural network training.
  • the selection of data to be included in the training data can be made with the knowledge or assumption of the presence, severity, or absence of the medical condition to be diagnosed.
  • diagnosis may also include assessment of the progress and/or effectiveness of a therapeutic treatment.
  • the number of variables, which can be defined and thus generated, can be unwieldy. Binary variables are typically sparse in that the number of positive (or negative) responses is often a small percentage of the overall number of responses.
  • Step B steps are taken to isolate from the available variables a subset of variables important to the diagnosis.
  • the specific choice of the subset of variables from among the available variables will affect the diagnostic performance of the neural network.
  • the process outlined herein has been found to produce a subset of variables which is comparable or superior in sensitivity and reliability to the subset of variables typically chosen by a trained human expert, such as a physician.
  • the variables are prioritized or placed in order of rank or relevance.
  • the final neural networks to be used in the diagnostic procedure are trained (Step C).
  • a consensus i.e. a plurality
  • the resulting networks form the decision-support functionality for the completed patient history diagnostic test (Step D).
  • the method permits sets of effective variables to be selected without comparing every possible combination of variables.
  • the important variables may be used as the inputs for the decision-support systems.
  • Figure 3 provides a flow chart of the process for isolating the important or relevant variables (Step E) within a diagnostic test.
  • a process is typically conducted using a digital computer system to which potentially relevant information has been provided.
  • This procedure ranks the variables in order of importance using two independent methods, then selects a subset of the available variables from the uppermost of the ranking.
  • other ranking methods can be used by those of skill in the art in place of chi square or sensitivity analysis.
  • x is set to N (the total number of candidate variables)
  • ranking can be arbitrary.
  • the system trains a plurality of neural networks on the available data (Step I), as explained hereinafter, then generates a sensitivity analysis over all trained networks to determine to what extent each input variable was used in the network to perform the diagnosis (Step J).
  • a consensus sensitivity analysis of each input variable is determined by averaging the individual sensitivity analysis results for each of the networks trained.
  • a ranking order for each of the variables available from the patient history information is determined (Step K). Ranking the variables
  • the variables are ranked using a statistical analysis, such as a chi square analysis, and/or a decision- support system-based analysis, such as a sensitivity analysis.
  • a sensitivity analysis and a chi-square analysis are used, in the exemplary embodiment, to rank variables.
  • Other statistical and/or decision-support system-based methods, including but not limited to regression analysis, discriminant analysis and other methods known to those of skill in the art, may be used.
  • the ranked variables may be used to train the networks, or, preferably, used in the method of variable selection provided herein.
  • the method employs a sensitivity analysis in which each input is varied and the corresponding change in output is measured (see, also, Modai, et al. (1993) "Clinical Decisions for Psychiatric Inpatients and Their Evaluation by Trained Neural Networks," Methods of Information in Medicine 32:396-99; Wilding et al. (1994) "Application of Backpropagation Neural Networks to Diagnosis of Breast and Ovarian Cancer," Cancer Letters 77:145-53; Ruck et al. (1990) "Feature Selection in Feed-Forward Neural Networks," Neural Network Computing 20:40-48; and Utans, et al. (1993) "Selecting Neural Network Architectures Via the Prediction Risk: Application to Corporate Bond Rating Prediction," Proceedings of the First International Conference on Artificial Intelligence Applications on Wall Street, Washington, D.C., IEEE Computer Society Press, pp. 35-41; Penny et al.
  • Step K of Fig. 3 provides an outline of the sensitivity analysis.
  • Each network of a plurality of trained neural networks (networks N1 through Nn) is run in the forward mode (no training) for each training example Sx (an input data group for which the true output is known or suspected; there must be at least two training examples), where "x" is the number of training examples.
  • the output of each network N1 through Nn for each training example Sx is recorded, i.e., stored in memory.
  • a new training example is defined containing the average value for each input variable within all training examples.
  • each input variable within each original training example Sx is replaced with its corresponding average value V1(avg) through Vy(avg), where "y" is the number of variables, and the modified training example Sx' is again executed through the multiple networks to produce a modified output for each network for each variable.
  • the differences between the output from the original training example Sx and the modified output for each input variable are then squared and summed (accumulated) to obtain individual sums corresponding to each input variable.
  • For example, with 10 networks N1 through N10 and 5 different training examples S1 through S5, each having 15 variables V1-V15, each of the 5 training examples would be run through the 10 networks to produce 50 total outputs.
  • For variable V1 from each of the training examples, an average value V1(avg) is calculated.
  • This averaged variable V1(avg) is substituted into each of the 5 training examples to create modified training examples S1'-S5', and they are again run through the 10 networks.
  • Fifty modified output values are generated by the networks N1-N10 for the 5 training examples, the modification being the result of using the average value variable V1(avg).
  • the difference between each of the fifty original and modified output values is calculated, i.e., the original output from training example S4 in network N6, OUT(S4N6), is subtracted from the modified output from training example S4 in network N6, OUT(S4'N6). That difference value is squared, [OUT(S4'N6) - OUT(S4N6)]^2, and this squared value is summed with the squared difference values for all combinations of networks and training examples for the iteration in which variable V1 was substituted with its average value V1(avg).
  • The process is then repeated for variable #2: finding the differences between the original and modified outputs for each combination of network and training example, squaring, then summing the differences. This process is repeated for each variable until all 15 variables have been completed.
  • Each of the resultant sums is then normalized so that if all variables contributed equally to the single resultant output, the normalized value would be 1.0.
  • the summed squared differences for each variable are summed to obtain a total summed squared difference for all variables.
  • the value for each variable is divided by the total summed square difference to normalize the contribution from each variable. From this information, the normalized value for each variable can be ranked in order of importance, with higher relative numbers indicating that the corresponding variable has a greater influence on the output.
  • the sensitivity analysis of the input variables is used to indicate which variables played the greatest roles in generating the network output.
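  • A compact sketch of the consensus sensitivity analysis described above: each variable is replaced in turn by its average value, the squared change in each network's output is accumulated over all networks and training examples, and the totals are normalized so that equal contributions from all variables would come out at 1.0. The function and argument names are illustrative.

```python
def sensitivity_scores(networks, examples):
    """Return one normalized sensitivity score per input variable.

    `networks` are trained models run in forward mode (callables mapping a
    list of input values to one output); `examples` is a list of such
    input-value lists. Higher scores indicate greater influence.
    """
    n_vars = len(examples[0])
    averages = [sum(ex[v] for ex in examples) / len(examples)
                for v in range(n_vars)]
    totals = [0.0] * n_vars
    for net in networks:
        for ex in examples:
            normal = net(ex)                    # unmodified ("normal") output
            for v in range(n_vars):
                modified = list(ex)
                modified[v] = averages[v]       # substitute the variable's average
                totals[v] += (net(modified) - normal) ** 2
    grand_total = sum(totals) or 1.0
    # Scale so that equal contributions from every variable would give 1.0 each.
    return [n_vars * t / grand_total for t in totals]

# To rank: sort the variable indices by their scores in descending order.
```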
  • Chi-square contingency table When dealing with sparse binary data, a positive response on a given variable might be highly correlated to the condition being diagnosed, but occur so infrequently in the training data that the importance of the variable, as indicated by the neural network sensitivity analysis, might be very low. In order to catch these occurrences, the Chi-square contingency table is used as a secondary ranking process.
  • a 2X2 contingency table Chi-square test on the binary variables, where each cell of the table is the observed frequency for the combination of the two variables (Fig. 3, Step F) is performed.
  • a 2X2 contingency table Chi-square test is performed on the continuous variables using optimal thresholds (which might be empirically determined) (Step G).
  • the binary and continuous variables are then ranked based on the Chi-square analysis (Step H).
  • the standard Chi-square 2X2 contingency table operative on the binary variables (Step F) is used to determine the significance of the relationship between a specific binary input variable and the desired output (as determined by comparing the training data with the known single output result) . Variables that have a low Chi-square value are typically unrelated to the desired output.
  • a 2X2 contingency table can be constructed (Step G) by comparing the continuous variable to a threshold value.
  • the threshold value is modified experimentally to yield the highest possible Chi-square value.
  • the Chi-square values of the continuous variables and of the binary variables can then be combined for common ranking (Step H).
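  • A minimal sketch of the 2X2 contingency Chi-square computation and the experimental threshold search for continuous variables described above; the names are illustrative and no continuity correction is applied.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 table of observed counts
    (rows: variable positive/negative; columns: outcome positive/negative)."""
    n = a + b + c + d
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / den if den else 0.0

def best_threshold(values, outcomes, candidate_thresholds):
    """Binarize a continuous variable at each candidate threshold and keep
    the threshold that yields the highest Chi-square value."""
    best_chi, best_t = -1.0, None
    for t in candidate_thresholds:
        a = sum(1 for v, o in zip(values, outcomes) if v > t and o)
        b = sum(1 for v, o in zip(values, outcomes) if v > t and not o)
        c = sum(1 for v, o in zip(values, outcomes) if v <= t and o)
        d = sum(1 for v, o in zip(values, outcomes) if v <= t and not o)
        chi = chi_square_2x2(a, b, c, d)
        if chi > best_chi:
            best_chi, best_t = chi, t
    return best_chi, best_t
```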
  • a second level of ranking can then be performed that combines the Chi-square-ranked variables with the sensitivity-analysis-ranked variables (Step L).
  • This combining of rankings allows variables that are significantly related to the output but that are sparse (i.e., values that are positive or negative in only a small percentage of cases) to be included in the set of important variables. Otherwise, important information in such a non-linear system could easily be overlooked. Selection of important variables from among the ranked variables
  • important variables are selected from among the identified variables.
  • the selection is effected after ranking the variables at which time a second level ranking process is invoked.
  • a method for identification of important variables (parameters) or sets thereof for use in the decision-support systems is also provided. This method, while exemplified herein with reference to medical diagnosis, has broad applicability in any field, such as financial analysis and other endeavors that involve statistically-based prediction, in which important parameters or variables are selected from among a plurality.
  • a method for selecting effective combinations of variables involves: (3) taking the highest “m” ranked variables one at a time, where m is from 1 up to n, and evaluating each by training a consensus of neural nets on that variable combined with the current set of important variables; (4) selecting the best of the m variables, where the best variable is the one that most improves performance, and if it improves performance, adding it to the "selected important variable” set, removing it from the candidate set and continuing processing at step (3) otherwise continuing by going to step (5); (5) if all variables on the candidate set have been evaluated, the process is complete, otherwise continue taking the next highest "m” ranked variables one at a time, and evaluating each by training a consensus of neural nets on that variable combined with the current set of important selected variables and performing
  • the second level ranking process starts by adding the highest ranked variable from the sensitivity analysis (Step K) to the set of important variables (Step H).
  • the second level ranking process could be started with an empty set and then testing the top several (x) variables from each of the two sets of ranking.
  • This second level ranking process uses the network training procedure (Step I) on a currently selected partition or subset of variables from the available data to train a set of neural networks.
  • the ranking process is a network training procedure using the current set of "important" variables (which generally will initially be empty) plus the current variable being ranked or tested for ranking, and uses a greedy algorithm to optimize the set of input variables by myopically optimizing the input set based upon the previously identified important variable(s), to identify the remaining variable(s) which improve the output the most.
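• A minimal sketch of this greedy, second-level selection is given below; it assumes a single merged candidate ranking and an evaluate_consensus callable that stands in for the consensus network training of Fig. 4, so it is an illustration of the described procedure rather than the exact implementation.

    def greedy_select(candidates_ranked, evaluate_consensus, window=2):
        """candidates_ranked: variables ordered by the first-level rankings.
        evaluate_consensus: trains a consensus of nets on a variable set and
        returns a performance score (higher is better)."""
        important, remaining = [], list(candidates_ranked)
        best_score = evaluate_consensus(important)          # baseline (initially empty set)
        while remaining:
            trial = remaining[:window]                      # next x top-ranked candidates
            scored = [(evaluate_consensus(important + [v]), v) for v in trial]
            score, var = max(scored)
            if score > best_score:                          # improvement: keep the single best variable
                important.append(var)
                remaining.remove(var)
                best_score = score
            else:                                           # no improvement: move on to the next x
                remaining = remaining[window:]
        return important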
  • This training process is illustrated in Fig. 4.
  • the number of inputs used by the neural network is controlled by excluding inputs which are found to not contribute significantly to the desired output, i.e., the known target output of the training data.
• a commercial computer program such as ThinksPro™ neural networks for Windows™ (or TrainDos™, the DOS version) by Logical Designs Consulting, Inc., La Jolla, California, or any other such program that one of skill in the art can develop may be used to vary the inputs and train the networks.
• Brainmaker, which is available from California Scientific Software Co., Nevada City, CA; Adaptive Solutions, Beaverton, OR; Neural Network Utility/2™, from NeuralWare, Inc., Pittsburgh, PA; NeuroShell™ and Neuro-Windows™, from Ward Systems Group, Inc., Frederick, MD.
• Other types of data mining tools, i.e., decision-support systems, that will provide the function of variable selection and network optimization may be designed, or other commercially available systems may be used.
• NeuroGenetic Optimizer, from BioComp Systems, Inc., Redmond, WA, and Neuro Forecaster/GENETICA, from New Wave Intelligent Business Systems (NIBS) Pte Ltd., Republic of Singapore, use genetic algorithms that are modelled on natural selection to eliminate poor-performing nodes within a network population while passing on the best-performing traits to offspring nodes to "grow" an optimized network and to eliminate input variables that do not contribute significantly to the outcome.
  • Networks based on genetic algorithms use mutation to avoid trapping in local minima and use crossover processes to introduce new structures into the population.
  • KDD Knowledge discovery in data
• a number of KDD systems are commercially available, including Darwin™, from Thinking Machines, Bedford, MA; MineSet™, from Silicon Graphics, Mountain View, CA; and Eikoplex™, from Ultragem Data Mining Company, San Francisco, CA. (Eikoplex™ has been used to provide classification rules for determining the probability of the presence of heart disease.) Others may be developed by those of skill in the art.
  • Step T the top two variables from each of the two ranking sets will be tested by the process (Fig. 3, Steps L, S), and results are checked to see if the test results show improvement (Step T). If there is an improvement, the single best performing variable is added to the set of "important" variables, and then that variable is removed from the two rankings (Fig. 3, Step U) for further testing (Step S). If there is no improvement, then the process is repeated with the next x variables from each set until an improvement is found or all of the variables from the two sets have been tested.
  • the final set of networks is trained to perform the diagnosis (Fig. 4, Steps M, N, Q, R).
• a number of final neural networks are trained to perform the diagnosis. It is this set of neural networks that can form the basis of a deliverable product to the end user. Since different initial conditions (initial weights) can produce differing outputs for a given network, it is useful to seek a consensus. (The different initial weights are used to avoid error from trapping in local minima.) The consensus is formed by averaging the outputs of each of the trained networks, and this average then becomes the single output of the diagnostic test.
Training a consensus of networks
Fig. 4 illustrates the procedure for the training of a consensus of neural networks.
• one of the partitions, e.g., P1, representing 20% of the total data set, is set aside as test data.
• The remaining four files, P2-P5, are identified as training data.
  • a group of N neural networks is trained using the training partitions, each network having different starting weights (Step Q) .
• N = 20 in this example.
  • the output values of all 20 networks are averaged to provide the average performance on the test data for the trained networks.
• the data in the test file (partition P1) is then run through the trained networks to provide an estimate of the performance of the trained networks.
  • the performance is typically determined as the mean squared error of prediction, or misclassification rate.
  • a final performance estimate is generated by averaging the individual performance estimates of each network to produce a completed consensus network (Step R) .
  • This method of training by partitioning the available data into a plurality of subsets is generally referred to as the "holdout method" of training. The holdout method is particularly useful when the data available for network training is limited.
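• The holdout/consensus procedure of Fig. 4 (Steps M-R) might be sketched as follows, with scikit-learn's MLPClassifier standing in for the networks trained with ThinksPro™; the five partitions and four networks per partition (20 networks in total) follow the text, while everything else (function names, network parameters) is assumed for illustration.

    import numpy as np
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import KFold

    def train_consensus(X, y, nets_per_partition=4, n_partitions=5, **net_params):
        """Hold out one partition at a time; train nets_per_partition networks with
        different starting weights on the rest (4 x 5 = the 20 networks mentioned above)."""
        params = dict(hidden_layer_sizes=(3,), max_iter=2000)
        params.update(net_params)                     # e.g. alpha (weight decay)
        networks, fold_errors = [], []
        for train_idx, test_idx in KFold(n_splits=n_partitions, shuffle=True, random_state=0).split(X):
            fold_nets = [MLPClassifier(random_state=seed, **params).fit(X[train_idx], y[train_idx])
                         for seed in range(nets_per_partition)]
            # consensus on the held-out 20%: average the individual network outputs
            consensus = np.mean([n.predict_proba(X[test_idx])[:, 1] for n in fold_nets], axis=0)
            fold_errors.append(np.mean((consensus > 0.5).astype(int) != y[test_idx]))  # misclassification rate
            networks.extend(fold_nets)
        return networks, float(np.mean(fold_errors))  # trained nets and the performance estimate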
  • Test set performance can be empirically maximized by performing various experiments that identify network parameters that maximize test set performance.
• the parameters that can be modified in this set of experiments are 1) the number of hidden processing elements, 2) the amount of noise added to the inputs, 3) the amount of error tolerance, 4) the choice of learning algorithm, 5) the amount of weight decay, and 6) the number of variables.
  • a complete search of all possible combinations is typically not practical, due to the amount of processing time that is required.
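• As an illustration only, such experiments could be organized as a small search over, say, the number of hidden processing elements and the amount of weight decay, scoring each configuration with the train_consensus sketch given earlier; the parameter grids below are arbitrary, and the remaining parameters listed above would be varied in the same way.

    from itertools import product

    def search_configuration(X, y, hidden_sizes=(2, 3, 5), weight_decays=(1e-4, 1e-3, 1e-2)):
        results = []
        for h, alpha in product(hidden_sizes, weight_decays):
            # reuse the holdout/consensus routine sketched above to score this configuration
            _, err = train_consensus(X, y, hidden_layer_sizes=(h,), alpha=alpha)
            results.append((err, h, alpha))
        return min(results)   # (holdout error, hidden units, weight decay) of the best configuration tried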
• test networks are trained with training parameters chosen empirically via a computer program, such as ThinksPro™ or a user-developed program, or from the results of existing test results generated by others who are working in the field of interest. Once a "best" configuration is determined, a final set of networks can be trained on the complete data set.
3. Development of biochemical diagnostic test
  • a similar technique for isolating variables may be used to build or validate a biochemical diagnostic test, and also to combine a biochemical diagnostic test data with the patient history diagnostic test to enhance the reliability of a medical diagnosis.
  • the selected biochemical test can include any test from which useful diagnostic information may be obtained in association with a patient and/or patient's condition.
  • the test can be instrument or non- instrument based and can include the analysis of a biological specimen, a patient symptom, a patient indication, a patient status, and/or any change in these factors. Any of a number of analytical methods can be employed and can include, but are not limited to, immunoassays, bioassays, chromatography, monitors, and imagers.
  • the analysis can assess analytes, serum markers, antibodies, and the like obtained from the patient through a sample. Further, information concerning the patient can be supplied in conjunction with the test. Such information includes, but is not limited to, age, weight, blood pressure, genetic history, and the other such parameters or variables.
• the exemplary biochemical test developed in this embodiment employs a standardized test format, such as the Enzyme Linked Immunosorbent Assay or ELISA test, although the information provided herein may apply to the development of other biochemical or diagnostic tests and is not limited to the development of an ELISA test (see, e.g., Molecular Immunology: A Textbook, edited by Atassi et al., Marcel Dekker Inc., New York and Basel, 1984, for a description of ELISA tests). Information important to the development of the ELISA test can be found in the Western Blot test, a test format that determines antibody reactivity to proteins in order to characterize antibody profiles and extract their properties.
• a Western Blot is a technique used to identify, for example, particular antigens in a mixture by separating these antigens on polyacrylamide gels, blotting onto nitrocellulose, and detecting with labeled antibodies as probes. (See, for example, Basic and Clinical Immunology, Seventh Edition, edited by Stites and Terr, Appleton and Lange, 1991, for information on Western Blots.) It is, however, sometimes undesirable to employ the Western Blot test as a diagnostic tool. If, instead, ranges of molecular weight that contain information relevant to the diagnosis can be pre-identified, then this information can be "coded" into an equivalent ELISA test.
  • Step W Western Blot data are used as a source
• Step X the first step in processing the Western Blot data is to pre-process the Western Blot data for use by the neural network.
  • Images are digitized and converted to fixed dimension training records by using a computer to perform the spline interpolation and image normalization. It is necessary to align images on a given gel based only on information in the image in order to use data from multiple Western Blot tests.
  • Each input of a neural network needs to represent a specific molecular weight or range of molecular weights accurately.
  • each gel produced contains a standards image for calibration, wherein the proteins contained are of a known molecular weight, so that the standards image can also be used for alignment of images contained within the same Western Blot.
  • a standard curve can be used to estimate the molecular weight range of other images on the same Western Blot and thereby align the nitrocellulose strips.
  • the process for alignment of images is cubic spline interpolation. This is a method which guarantees smooth transitions at the data points represented by the standards. To avoid possible performance problems due to extrapolation, termination conditions are set so that extrapolation is linear. This alignment step minimizes the variations in the estimates of molecular weight for a given band on the output of the Western Blot.
• the resultant scanned image is then processed to normalize the density of the image by scaling the density so that the darkest band has a scaled density of 1.0 and the lightest band is scaled to 0.0.
  • the image is then processed into a fixed length vector of numbers which become the inputs to a neural network, which at the outset must be trained as hereinafter explained.
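• A hedged sketch of this alignment and normalization follows (assuming scipy's CubicSpline, and clipping rather than extrapolating outside the standards, which is a simplification of the linear-extrapolation termination condition described above); the function and argument names are illustrative.

    import numpy as np
    from scipy.interpolate import CubicSpline

    def align_and_normalize(lane_density, std_positions, std_log_mw, n_inputs=128):
        """lane_density: 1-D densitometry scan of one nitrocellulose strip;
        std_positions / std_log_mw: pixel positions and log10 molecular weights of the standards."""
        pos_to_logmw = CubicSpline(std_positions, std_log_mw)   # smooth curve through the standards
        pixels = np.clip(np.arange(len(lane_density)), std_positions[0], std_positions[-1])
        log_mw = pos_to_logmw(pixels)                           # log MW assigned to every pixel
        # scale densities so the darkest band is 1.0 and the lightest is 0.0
        d = (lane_density - lane_density.min()) / (lane_density.max() - lane_density.min())
        # resample onto a fixed-length grid of log molecular weights (the network inputs)
        order = np.argsort(log_mw)
        grid = np.linspace(log_mw.min(), log_mw.max(), n_inputs)
        return grid, np.interp(grid, log_mw[order], d[order])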
  • a training example is built in a process similar to that previously described where the results generated from the processing of the Western Blot data are trained (Step Y) .
  • Step Y the results generated from the processing of the Western Blot data are trained.
  • Step AA regions of significantly contributing molecular weights (MW) can be determined and identified.
• inputs in contiguous regions are preferably combined into "bins" as long as the sign of the correlation between the input and the desired output is the same. This process reduces the typical 100-plus inputs produced by the Western Blot, plus the other inputs, to a much more manageable number of inputs of less than about twenty.
  • a correlation may be either positive or negative.
  • a reduced input representation may be produced by using a Gaussian region centered on each of the peaks found in the Western Blot training, with a standard deviation determined so that the value of the Gaussian was below 0.5 at the edges of the region.
  • the basic operation to generate the neural network input is to perform a convolution between the Gaussian and the Western Blot image, using the log of the molecular weight for calculation.
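• The bin construction described above might look like the following sketch, in which sigma is chosen so the Gaussian falls to 0.5 at the edge of each region and each bin input is the Gaussian-weighted sum of the aligned image over the log molecular weight grid (for example, the grid and density returned by the align_and_normalize sketch above); the function and argument names are assumptions.

    import numpy as np

    def gaussian_bins(log_mw_grid, aligned_density, peak_log_mws, region_half_widths):
        inputs = []
        for center, half_width in zip(peak_log_mws, region_half_widths):
            # choose sigma so that exp(-half_width^2 / (2 sigma^2)) = 0.5 at the region edge
            sigma = half_width / np.sqrt(2.0 * np.log(2.0))
            weights = np.exp(-0.5 * ((log_mw_grid - center) / sigma) ** 2)
            # convolution of the Gaussian with the image, expressed here as a weighted average
            inputs.append(np.sum(weights * aligned_density) / np.sum(weights))
        return np.array(inputs)   # one network input per bin, in place of the 100-plus raw inputs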
  • the data may be tested using the holdout method, as previously described. For example, five partitions might be used where, in each partition, 80% of the data are used for training and 20% of the data are used for testing. The data are shuffled so that each of the partitions is likely to have examples from each of the gels.
  • Step AA Once the molecular weight regions important to diagnosis have been identified (Step AA), one or more tests for the selected region or regions of molecular weight may be built (Step AB) .
  • the ELISA biochemical test is one example.
  • the selected region or regions of molecular weight identified as important to the diagnosis may then be physically identified and used as a component of the ELISA biochemical test. Whereas regions of the same correlation sign may, or may not, be combined into a single ELISA test, regions of differing correlation signs should not be combined into a single test. The value of such a biochemical test may then be determined by comparing the biochemical test result with the known or suspected medical condition.
  • the development of a biochemical diagnostic test may be enhanced by combining patient data and biochemical data in a process shown in Fig. 2.
  • the patient history diagnostic test is the basis for the biochemical diagnostic test.
  • the variables that are identified as important variables are combined with data derived from the Western Blot data in order to train a set of neural networks to be used to identify molecular weight regions that are important to a diagnosis.
• Western Blot data are used as a source (Step W) and are pre-processed for use by the neural network as described previously (Step X).
• A training example is built in a process similar to that previously described, wherein the important variables from the patient history data and the results generated from the processing of the Western Blot data are combined and are trained using the combined data (Step Y).
  • networks are trained on patient history data, as described above (Step Z).
• a set of neural networks (consensus set) is trained on the data by the partitioning method. From the sensitivity analysis of the training runs on patient history data alone and on combined data, regions of significantly contributing molecular weights can be determined and identified as previously described (Step AA). As a further step in the isolation process, a set of networks is thereafter trained using as inputs the combined patient history and bin information in order to isolate the important bins for the Western Blot data.
  • the "important bins" represent the important regions of molecular weight related to the diagnosis considering the contribution of patient history information. These bins are either positively or negatively correlated with the desired output of the diagnosis.
  • Step AA one or more tests for the selected region or regions may be built and validated as previously described (Step AB).
  • the designed ELISA tests are then produced and used to generate ELISA data for each patient in the database (Step AC).
  • a set of networks is trained using the partition approach as described above (Step AE).
  • the partition approach can be used to obtain an estimate of the lower bound of the biochemical test.
  • the final training (Step AE) of a set of networks, i.e. , the networks to be used as a deliverable product, is made using all available data as part of the training data.
  • Step AF new data may be used to validate the performance of the diagnostic test.
  • the performance on all the training data becomes the upper bound on the performance estimate for the biochemical test.
• the consensus of the networks represents the intended diagnostic test output (AG). This final set of neural networks can then be used for diagnosis.
4. Improvement of neural network performance
  • An important feature of the decision-support systems, as exemplified with the neural networks, and methods provided herein is the ability to improve performance.
  • the training methodology outlined above may be repeated as more information becomes available. During operation, all input and output variables are recorded and augment the training data in future training sessions. In this way, the diagnostic neural network may adapt to individual populations and to gradual changes in population characteristics.
  • the process of improving performance through use may be automated. Each entry and corresponding output is retained in memory. Since the steps for retraining the network can be encoded into the apparatus, the network can be re-trained at any time with data that are specific to the population.
• Method for evaluating the effectiveness of a diagnostic test or course of treatment
Typically, the effectiveness or usefulness of a diagnostic test is determined by comparing the diagnostic test result with the patient medical condition that is either known or suspected.
  • a diagnostic test is considered to be of value if there is good correlation between the diagnostic test result and the patient medical condition; the better the correlation between the diagnostic test result and the patient medical condition, the higher the value placed on the effectiveness of the diagnostic test. In the absence of such a correlation, a diagnostic test is considered to be of lesser value.
  • the systems provided herein provide a means to assess the effectiveness of a biochemical test by determining whether the variable that corresponds to that test is an important selected variable. Any test that yields data that improves the performance of the system is identified.
  • a method by which the effectiveness of a diagnostic test may be determined, independent of the correlation between the diagnostic test result and the patient medical condition is described below.
  • a similar method may be used to assess the effectiveness of a particular treatment.
  • the method compares the performance of a patient history diagnostic neural network trained on patient data alone, with the performance of a combined neural network trained on the combination of patient historical data and biochemical test data, such as ELISA data.
• Patient history data are used to isolate important variables for the diagnosis (Step AH), and final neural networks are trained (Step AJ), all as previously described.
  • biochemical test results are provided for all or a subset of the patients for whom the patient data are known (Step AK), and a diagnostic neural network is trained on the combined patient and biochemical data by first isolating important variables for the diagnosis (Step AL), and subsequently training the final neural networks (Step AM), all as previously described.
  • the performance of the patient history diagnostic neural network derived from Step AJ is then compared with the performance of the combined diagnostic neural network derived from Step AM, in Step AN.
• the performance of a diagnostic neural network may be measured by any number of means. In one example, the correlations between each diagnostic neural network output and the known or suspected medical condition of the patient are compared. Performance can then be measured as a function of this correlation. There are many other ways to measure performance. In this example, any increase in the performance of the diagnostic neural network derived from Step AM over that derived from Step AJ is used as a measure of the effectiveness of the biochemical test.
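• One possible, simplified realization of this comparison (Step AN) is sketched below, using the Pearson correlation between each network's consensus output and the known (0/1) condition; the function name and the use of a simple correlation difference are assumptions, not the only way to measure the gain.

    import numpy as np

    def effectiveness_gain(history_outputs, combined_outputs, known_condition):
        """history_outputs: consensus outputs of the patient-history-only network (Step AJ);
        combined_outputs: consensus outputs of the combined network (Step AM)."""
        r_history = np.corrcoef(history_outputs, known_condition)[0, 1]
        r_combined = np.corrcoef(combined_outputs, known_condition)[0, 1]
        return r_combined - r_history   # > 0 suggests the biochemical test adds diagnostic value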
  • a biochemical test in this example, and any diagnostic test in general, that lacks sufficient correlation between that test result and the known or suspected medical condition, is traditionally considered to be of limited utility. Such a test may be shown to have some use through the method described above, thereby enhancing the effectiveness of that test which otherwise might be considered uninformative.
• the method described herein serves two functions: it provides a means of evaluating the usefulness of a diagnostic test, and also provides a means of enhancing the effectiveness of a diagnostic test.
6.
• the methods and networks provided herein provide a means to, for example, identify important variables, improve upon existing biochemical tests, develop new tests, assess therapeutic progress, and identify new disease markers. To demonstrate these advantages, the methods provided have been applied to endometriosis and to pregnancy related events, such as the likelihood of labor and delivery during a particular period.
Endometriosis
  • the methods described herein have provided a means to develop a non-invasive methodology for the diagnosis of endometriosis.
  • the methods herein provide means to develop biochemical tests that provide data indicative of endometriosis, and also to identify and develop new biochemical tests.
  • a decision-support system in this instance, a consensus of neural networks, has been developed for the diagnosis of endometriosis.
  • neural networks capable of aiding in the diagnosis of endometriosis that only rely on patient historical data, i.e., data that can be obtained from a patient by questionnaire format. It was found that biochemical test data could be used to enhance the performance of a particular network, but it was not essential to its value as a diagnostic tool.
• the variable selection protocol and neural nets provide a means to select sets of variables that can be input into the decision-support system to provide a means to diagnose endometriosis. While some of the identified variables include those that have traditionally been associated with endometriosis, others of the variables have not. In addition, as noted above, variables such as pelvic pain and dysmenorrhea that have been associated with endometriosis are not sufficiently linearly correlated with it to permit diagnosis.
• Example 14 Comparison of the output of the pat07 network with the probability of having endometriosis yields a positive correlation (see Table 1).
  • the pat07 network can predict the likelihood of a woman having endometriosis based on her pat07 score. For example, if a woman has a pat07 score of 0.6, then she has a 90% probability of having endometriosis; if her pat07 score is 0.4, then she has a 10% probability of having endometriosis.
  • the dynamic range of pat07 output when applied to the database was about 0.3 to about 0.7. Theoretically, the output values can range from 0 to 1 , but values below 0.3 or above 0.7 were not observed. Over 800 women have been evaluated using the pat07 network, and its performance can be summarized as follows: TABLE 1
  • the pat07 network score is interpreted as the likelihood of having endometriosis, and not whether or not a woman is diagnosed with endometriosis.
  • the likelihood is based on the relative incidence of endometriosis found in each score group. For example, in the group of women with pat07 network score of 0.6 or greater, 90% of these women had endometriosis, and 10% of these women did not. This likelihood relates to the population of women at infertility clinics.
  • Software programs have been developed that contain the pat07 network.
• One program, referred to as adezacrf.exe, provides a single-screen windows interface that allows the user to obtain the pat07 network score for a woman. The user enters values for all 14 variables, and the pat07 network score is calculated following every keystroke.
  • adzcrf2.exe is almost exactly the same as adezacrf.exe, except that it allows for one additional input: the value of an ELISA test.
  • This program and network is a specific example of a method of expanding clinical utility of a diagnostic test.
  • the ELISA test results did not correlate with endometriosis. By itself, the ELISA test does not have clinical utility.
  • the ELISA test improved network performance, so that one may assert that incorporating the ELISA result as an input for network analysis expanded the clinical utility of that ELISA test.
• Another program (provided herein in Appendix II), designated adzcrf2.exe, provides a multiple-screen windows interface that allows the user to obtain the pat07 network score for a woman.
• the multiple data entry screens guide the user to enter all patient historical data, and not just those parameters required as inputs for pat07. The pat07 score is calculated after all data are entered and accepted as correct by the user.
• This program also saves the data entered in *.fdb files, can import data, calculate pat07 scores on imported data, and export data. The user can edit previously entered data. All three of the above programs serve as specific examples of the diagnostic software for endometriosis.
• Figure 11 illustrates an exemplary interface screen used in the diagnostic software.
• the display 1100, which is provided as a Microsoft Windows™-type display, provides a template for entry of numerical values for each of the important variables which have been determined for diagnosis of endometriosis.
  • Input of data to perform a test is accomplished using a conventional keyboard alone, or in combination with a computer mouse, a trackball or joystick. For purposes of this description, a mouse/keyboard combination will be used.
  • Each of the text boxes 1101-1106 is for entry of numerical values representative of the important variables Age (box 1101); Number of Pregnancies (box 1102); Number of Births (box 1103); Number of Abortions (box 1104); Number of Packs of Cigarettes Smoked per Day (box 1105); and ELISA test results (box 1106).
  • Boxes 1107-1115 are important selected variables for which the data are binary, i.e., either "yes” or "no".
  • the boxes and the variables are correlated as follows:
  • a "yes” to any one of these variables can be indicated by pointing at the corresponding box and clicking the mouse button to indicate an "X" within the box.
• the network automatically processes the data after every keystroke, so changes will be seen in the output values displayed in text boxes 1118-1120 after every entry into the template 1100.
• Text box 1118, labelled "Endo", provides consensus network output for the presence of endometriosis;
• text box 1119, labelled "No Endo", provides consensus network output for the absence of endometriosis;
• text box 1120 provides a relative score indicative of whether or not the patient has endometriosis.
• the score in the text box 1120 is an artificial number derived from boxes 1118 and 1119 that makes it easier for the physician to interpret results.
  • a value in this box in the positive range up to 25 is indicative of having endometriosis, and a value in the negative range down to -25 will be indicative of not having endometriosis.
• the selected transformation permits the physician to interpret the pat07 output more readily.
  • the pat07 is not the only network that is predictive of endometriosis.
  • Other networks, designated pat08 through pat23a have been developed. These are also predictive of endometriosis. All these networks perform very similarly, and can readily be used in place of pat07.
• Pat08 and pat09 are the most similar to pat07: these networks were developed by following the protocol outlined above, and were allowed to select important variables from the same set as that used for development of pat07. It was found that the initial weighting of variables can have effects on the outcome of the variable selection procedure, but not on the ultimate diagnostic result. Pat08 and pat09 used the same database of patient data as pat07 to derive the disease-relevant parameters. Pat10 through pat23a were training runs originally designed to elucidate the importance of certain parameters: history of endometriosis, history of pelvic surgery, dysmenorrhea and pelvic pain. For development of these, the importance of a variable was assessed by withholding that variable from the variable selection process. It was found that, after the variable selection process and training of the final consensus networks, network performance did not significantly deteriorate.
• the data include patient history data, Western Blot data and ELISA data.
• the variable selection protocol was applied to the set of 14 variables. From among the 14, 5 variables were selected. These are pregnancy hypertension, number of births, abnormal PAP/dysplasia, history of endo and history of pelvic surgery. This combination ranked as the 68th best-performing combination out of the 16,384 possible combinations (99.6th percentile), thereby demonstrating the effectiveness of the variable selection protocol. Also, the combination that includes all 14 variables was ranked 718th out of the 16,384 possible combinations (95.6th percentile).
• any subset of the selected set of parameters, particularly the set of fourteen variables, that contains one (or more) of the following combinations of three variables can be used with a decision-support system for diagnosis of endometriosis: a) number of births, history of endometriosis, history of pelvic surgery; b) diabetes, pregnancy hypertension, smoking; c) pregnancy hypertension, abnormal pap smear/dysplasia, history of endometriosis; d) age, smoking, history of endometriosis; e) smoking, history of endometriosis, dysmenorrhea; f) age, diabetes, history of endometriosis; g) pregnancy hypertension, number of births, history of endometriosis; h) smoking, number of births, history of endometriosis; i) pregnancy hypertension, history of endometriosis, history of pelvic surgery; j) number of preg
  • Predicting pregnancy related events such as the likelihood of delivery within a particular time period
  • the methods herein may be applied to any disorder or condition, and are particularly suitable for conditions in which no diagnostic test can be adequately correlated or for which no biochemical test or convenient biochemical test is available.
  • the methods herein have been applied to predicting pregnancy related events, such as the likelihood of delivery within a particular time period.
• Determination of impending birth is of importance, for example, for increasing neonatal survival of infants born before 34 weeks.
  • the presence of fetal fibronectin in secretion samples from the vaginal cavity or the cervical canal from a pregnant patient after week 20 of pregnancy is associated with a risk of labor and delivery before 34 weeks.
• Methods and kits for screening for fetal fibronectin in secretion samples from the vaginal cavity or the cervical canal for a pregnant patient after week 20 of pregnancy are available (see U.S. Patent Nos. 5,516,702, 5,468,619, 5,281,522, and 5,096,830; see also U.S. Patent Nos. 5,236,846, 5,223,440, and 5,185,270).
  • the methods herein have been applied to development of a decision-support system that assesses the likelihood of certain pregnancy related events.
  • neural nets for predicting delivery before (and after) 34 weeks of gestation have been developed.
  • Neural networks and other decision-support systems developed as described herein can improve the performance of the fetal fibronectin (fFN) test by lowering the number of false positives.
  • these methods may be used to identify tests previously not thought to be relevant to a disease, condition or disorder, and to design new tests and identify new disease markers.
  • Evaluation of the patient history to determine which variables are relevant to the diagnosis is performed by performing a sensitivity analysis on each of the variables to be used in the diagnosis.
  • Two methods can be used to perform this analysis. The first is to train a network on all the information and determine from the network weights, the influence of each input on the network output.
• the second method is to compare the performance of two networks, one trained with the variable included and the second trained with the variable eliminated. This training would be performed for each of the suspected relevant variables. Those that do not contribute to the performance are eliminated. These operations are performed to lower the dimension of the inputs to the network. When training with limited amounts of data, a lower input dimension will increase the generalization capabilities of the network.
Analysis of Data
  • the data used for this example included 510 patient histories. Each record contained 120 text and numeric fields. Of these fields, 45 were identified as being known before surgery and always containing information. These fields were used as the basic available variables for the analysis and training of networks. A summary of the variables used in this example was as follows:
  • PID Pelvic inflammatory disease
• the most commonly used method for determining the importance of variables is to train a neural network on the data with all the variables included. Using the trained network as the basis, a sensitivity analysis is performed on the network and the training data. For each training example, the network is run in the forward mode (no training) and the network outputs are recorded. Then, for each input variable, the network is rerun with that variable replaced by its average value over the training examples. The difference in output values is squared and accumulated. This process is repeated for each training example. The resulting sums are then normalized so that the sum of the normalized values equals the number of variables. In this way, if all variables contribute equally to the output, their normalized value would be 1.0. The normalized values can then be ranked in order of importance.
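• The sensitivity analysis just described can be sketched as follows; predict stands in for a forward pass of the trained network, and the function name is illustrative.

    import numpy as np

    def sensitivity_ranking(predict, X):
        """X: training examples (rows) by input variables (columns)."""
        baseline = predict(X)                        # forward mode, no training
        means = X.mean(axis=0)
        sums = np.empty(X.shape[1])
        for j in range(X.shape[1]):
            X_mod = X.copy()
            X_mod[:, j] = means[j]                   # replace variable j by its average value
            sums[j] = np.sum((predict(X_mod) - baseline) ** 2)
        # normalize so the values sum to the number of variables (equal contribution -> 1.0 each)
        normalized = sums * (X.shape[1] / sums.sum())
        return np.argsort(-normalized), normalized   # variable indices ranked by importance, and scores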
• the rankings of variables are as follows and are reported for networks trained in run pat05.
  • the rankings for the pat07 networks are as follows:
  • the set of variables identified in this example appear to be reasonable based on the testing and information.
  • the holdout method is effective in providing test information useful in determining network configuration and parameter settings.
• a 20% holdout was used instead of the proposed 25%. This produced 5 partitions of the data instead of 4 and made 80% of the data available for training in each partition.
• Test networks were trained with parameters chosen based on parameters known by those of skill in the art to be important in this area and based on results of prior tests. Other sets of variables would also have been suitable. Also, as shown elsewhere herein, combinations of all of the selected 14 variables have been tested. Once the best configuration was determined, a final set of networks was trained on the complete data set of 510 patients. In the final set of networks, a consensus of eight networks was made to produce the final statistics.
Results
  • the final holdout training run was pat06 with 14 variables.
  • the performance on the test data was 68.23%.
  • the full training run was pat07 with the same network configuration as pat06.
  • the performance on the training data was 72.9%.
  • Statistics were generated on the last training run based on the use of a cutoff in the network output values. If the network output was below the cutoff, the example was rejected from consideration.
  • the following table is a summary of the results for the consensus of eight networks in the pat07. A test program named adzcrf was produced to demonstrate this final training.
• the antigen data from Western Blots for the patients, as originally delivered to Logical Designs, provided information on only the peak molecular weights and their associated intensities. Analysis of these data, and of the original images from which the data were taken, suggests that it may be possible to use the original image, digitized in a way that could provide more information to the neural network.
• Preprocessing of the image data decreases the variability of the position of a specific molecular weight in the image. This preprocessing will use a polynomial fit through the standards image to produce a modified image. Preprocessing of the images will also include steps to normalize the background level and contrast of the images.
  • the image data could be used as is, or the peak molecular weights could be extracted.
  • inputs to the neural network will be generated.
• because a typical image is about 1000 pixels long, methods to reduce the number of inputs will be investigated.
  • a neural network will be trained with supervised learning to aid in the determination of the ranges of molecular weights that are related to the determination of the disease. This Example focuses on using the image as a whole in the input to the network.
  • the preprocessed image network had a training example performance of 97% while the performance for no preprocessing was 79%.
• the two alternatives gave similar results.
• the choice was made to use the preprocessed images for further training runs. This choice ensured that a given network input would consistently be associated with a specific molecular weight within the tolerances achievable using the Western Blot method.
• the weights of the network were then averaged together to generate a consensus value for each weight. Since the interconnection weight from the hidden element to the output could be either positive or negative, the weights were transformed so that all the output connections had the same sign. The weights were then averaged and the results plotted using Excel.
Results
The analysis of the Western Blot data was performed using a cubic spline interpolation for image alignment to the network inputs and Max/Min image preprocessing. Given that a certain amount of variability can be expected in the accuracy of alignment of the images, due to the Western Blot methodology, this approach is believed to give better results than the polynomial fit originally used.
  • the neural network was able to find regions on the Western Blot that correlate with the presence of the disease.
• Example 4 From the results of Example 4, several ranges in molecular weight were determined to correlate with the disease. A reduced input representation was produced by using a Gaussian region centered on each of the peaks found in Example 5. The standard deviation of the Gaussian was determined so that the value of the Gaussian was below 0.5 at the edges of the region. The basic operation performed to generate the neural network input was to perform a convolution between the Gaussian and the Western Blot image. The calculations were all performed using the log of the molecular weight.
  • a separate software program was produced.
  • the program performed the convolution on the normalized images with respect to molecular weight and intensity.
  • the parameters for calculation of the network inputs are contained in a table in the binproc program.
  • binproc the mean and std. deviation are stored in the table.
  • the program is recompiled when the table values are changed.
• the program has a test mode that produces an output file that allows the Gaussians to be plotted and compared to the Western Blot images, using Excel. Plots of regions are included in the documentation.
• binproc.c was again modified to translate the positions of the fractions into table values for binproc. This modified program is called fproc.d.
  • Binproc2.c was produced from binproc, replacing the mean and std. deviation tables with min. and max. tables which correspond to the endpoints of the fractions in the files supplied.
• the holdout method was used with 80% of the data being used for training and the remaining 20% to be used for testing.
  • Example 5 networks trained on all the data were used to determine what ranges of molecular weights were important to the classification process.
  • the holdout method was used to train networks so that the test set performance could be estimated.
• the first set of tests was based on regions identified in Example 5.
• the second set of tests was made using the fractions identified in the four ishgel files.
  • the initial consensus runs based on the top six regions found in Example 5 yielded poor performance (50%).
  • the regions were widened and the top ten regions from Example 5 were included instead of the top six.
  • a test on the ten wider regions indicated slightly better performance.
• the purpose of this example was to train a set of networks to determine the performance estimate for the diagnosis using only Western Blot data. Experiments were run to determine the best configuration and parameters for training of the networks. The method described in Example 2 above was used for this performance estimate. A final network was trained using all of the available data as training data. The output of this trained network (antigen index) was used as an input to the network generated in the combined data phase.
Methodology Used
• the training runs made during the development of the automated procedure were chosen from these rankings. At the time that the training runs were made, an automated procedure had not been formulated. To save on overall processing time, only one partition of the training data was used. Combinations of variables that performed well in the first partition of the training and test data were then tried on the remaining partitions.
  • One method suggested in the literature for finding the best set of inputs has been to use a genetic algorithm to determine the highest performing set of inputs. Genetic algorithms typically require thousands of iterations to converge to a good solution. In working with the Western Blot data, this would represent a large amount of computer time, even with the small training example size. For 10 variables, an enumeration of all combinations would require 1024 training runs. An alternative to the genetic algorithm was attempted.
  • a neural network was trained to predict the test set RMS Error based on the set of inputs chosen.
  • the training examples used for this experiment were the results of training runs on the first partition of the Western Blot data.
  • the predictor network was then tested with all combinations to determine the predicted minimum combination.
  • the input combination was then used to train a network on the Western Blot data.
• the main drawback of this method and of the genetic algorithm approach is that the sensitivity analysis information, which was found to be very effective, is ignored in the process.
Results
  • the process of using the sensitivity and contingency table rankings of variables is an effective and efficient technique for picking a set of variables to maximize the neural network performance.
  • the top 3 variables under both rankings were the same, indicating that these methods are performing well.
  • This method appears to work with the Western Blot data but should work well on any form of data, making this a general purpose neural network technique that can also be applied to patient history data.
  • the above results indicate more data would improve the level of performance.
  • the sensitivity analysis shows little variation in the relative values of variables. Most of the variables contribute to the solution. This should be expected since the bins were chosen based on an analysis of neural network weights trained on the full Western Blot images. By using all or most of the variables, however, the neural networks quickly get into an overtraining situation. This can be avoided by adding data to the training example.
• EXAMPLE 6 Combine Patient History and ELISA Data
Requirements
Using the processing developed in the above examples, train a set of networks on the combination of Patient History Data and ELISA Data. An index generated from an ELISA test, based on the use of the entire set of antigens, will be used to determine the improvement in performance achieved by combining this information with the patient historical data.
Additional Requirements
• Run 1: ELISA 100, ELISA 200, log(ELISA 2) and the original 14 variables.
• Run 2: (ELISA 2) and the original 14 variables.
• Run 3: the original 14 variables.
  • a test program named adzcrf2.exe was produced as a demo of this final training. This program permits the running of pat07 and CRFEL2 based on the value input in the ELISA field. A value of 0 in the field causes pat07 to be used.
  • the analysis of variable relationships was performed. Based on the analysis of the relationships, the variables which showed Endo Present as a contributing factor were compared to the variables used in predicting Endo. Results of training two networks (PATVARSA and PATVARS3) showed that in the case of Endo, relationships were not symmetric, as they are when using correlation. CRFVARSA.XLS was built from the sensitivity analysis results to summarize the results. These results show the nonlinear nature of the relationships. The importance of a variable is affected by the other variables in the training run. This suggests that a means of eliminating unimportant variables in an automated fashion may be required to increase the usefulness of this analysis.
  • the ELISA 2 test adds to the predictive power of the neural network.
  • the ELISA 2 test has eliminated the need for the original ELISA tests. Based on this result it is likely that results of the work with the Western Blot data will further improve the power of the neural network diagnostic test.
• AFS score (desired output): There were 7 patients missing Stage information and 28 patients missing Score information.
• For the stage variable, the average value of 2.09 was used where the data were missing.
• For score, the missing data were replaced with a value depending on the value of the stage variable.
• For stage 1, a score of 3 was used; for stage 2, 10.5 was used.
• For stage 3, 28 was used, and for stage 4, the value 55 was used.
• Stage and score were reprocessed so that the desired output would fall in the range of 0.0 to 1.0. Stage was translated linearly. Two methods were used for score. The first was the square root of the score divided by 12.5. The second was the log of (score + 1) divided by the log of 150.
  • the holdout method was used to train networks on stage, square root score and log of score. These networks were trained using 45 variables. The results were compared to determine which variable and processing would be used for the remainder of the Example. The log of score was chosen.
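• A minimal sketch of this stage/score pre-processing, with the imputation values and the two candidate transforms taken directly from the text (the helper names are assumptions), is:

    import numpy as np

    STAGE_TO_SCORE = {1: 3.0, 2: 10.5, 3: 28.0, 4: 55.0}   # imputation for missing scores, by stage
    AVERAGE_STAGE = 2.09                                    # used where stage itself is missing

    def scale_score_sqrt(score):
        return np.sqrt(score) / 12.5              # first transform considered

    def scale_score_log(score):
        return np.log(score + 1) / np.log(150)    # second transform; this is the one that was chosen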
  • the comparison of the score network to the Endo present network can be performed by forcing a threshold on the desired score output to produce an Endo present comparison.
  • the results for the score and the pat07 networks are shown below.
  • Training data for the adhesions variable was generated in the same manner as for EXAMPLE 7.
  • the adhesions variable generated two output variables in a manner similar to that used for Endo present.
  • the procedure for isolating the set of important variables was begun. Eight networks were trained on the full training example and the consensus sensitivity analysis was generated to produce the first ranking of the variables. Then the Chi Square contingency table was generated to produce the second ranking of the variables.
  • the procedure for isolating important variables was started manually, but was found to be too time consuming. The procedure was implemented as a computer program and was run on a computer for about one week before completing.
• PID Pelvic inflammatory disease 23
  • variables chosen in the variable selection procedure were as follows, showing the ranking from the final sensitivity analysis:
  • the comparison of the Score network to the Endo present network can be performed by forcing a threshold on the desired score output to produce an Endo present comparison.
  • the results for the score and the pat07 networks are shown below.
• EXAMPLE 9
This example shows the reproducibility of the process provided herein.
Methodology Used
Software used for the selection of important variables for Adhesions and Score was modified to operate with the Endo present desired output.
  • the software was further modified to allow it to be run in the general case instead of needing to be recompiled for each specific test.
  • the run was made on the Endo Present variable in the same fashion as the runs for Adhesion and score. This included using a consensus of 4 networks during the variable selection process.
  • the training data was partitioned into five partitions during the training process, generating a total of 20 networks for each evaluation of the current set of variables being tested.
  • variable selection process found a different set of important variables.
  • pat08 and pat09 consensus networks are shown below along with their sensitivity analysis rankings:
• the variable selection process appears to work well and has produced two alternative networks that work as well as or better than the pat07 nets.
• the reason for this conclusion is that the performance statistics generated only on the training data appear slightly better for pat07 than for pat08 and pat09. Since the variable selection process carefully picks variables based on test set performance, the associated networks are not likely to have been overtrained. As a network becomes overtrained, the typical characteristic is that the training example performance increases and the test set performance decreases. Thus the higher performance of pat07 may be the result of slight overtraining.
  • variable selection process appears to have produced two alternative selections on the same training data
• performance of the two selections appears to be very similar. This is based on the test set performance of the final variable selections for the two runs. It has become clear that when two variables are close in their relative performance, random factors can influence their relative ranking.
  • the random factors in the variable selection runs included the random starting points and the use of added noise on the inputs during training. The random noise has been shown to aid in producing better generalization (translation: test set performance). As the number of networks in the consensus increases, the effects of the random influences are decreased.
  • variable selection process The determination of a set of variables that produces a high quality network seems to be addressed by the variable selection process. As more combinations of variables that work successfully are enumerated, it is evident that certain variables or combinations of variables are essential to good performance.
  • the purpose of this Example was to determine the importance of 'Past history of endometriosis' and 'Past history of pelvic surgery' variables in evaluating a patient's risk of having endometriosis, and to provide an alternative means (different from sensitivity analysis) to measure the importance of any given variable in predicting the outcome.
• the variable selection software developed in Example 9 was used as the basis to generate results for Example 10.
  • the software was modified so that the user could identify variables that would be excluded from consideration based on the requirements of Example 10.
  • This software was also modified to allow the reporting of classification performance for each of the sets of variables tested so that the effect of an eliminated variable could be more easily understood.
  • parameters for the variable selection process were set as follows:
  • Training example size 510
• the networks trained for this Example are identified as follows (the two nets have different random seeds): Past Hist. of Endo eliminated: pat10, pat11.
  • the typical performance of a consensus of networks was estimated using the holdout method with a partition of 5. When all variables were available, as in pat08 and pat09, the classification performance was estimated to be 65.23%.
• Consensus networks 10
  • Training example size 510
  • the ordering of database variables in the variable selection process was based on the sensitivity analysis and Chi square analysis. This ordering was the same as used in pat08 and pat09.
  • the networks trained for this task are identified as follows (the two nets have different random seeds);
• the variable selection software developed in EXAMPLE 10 and modified in EXAMPLE 11 was used as the basis to generate results for each of the tasks in this example.
• parameters for the variable selection process were set as follows: Number of partitions: 5
• Training example size 510 (290 for step (6)); Number of passes: 999
• the ordering of database variables in the variable selection process was based on the sensitivity analysis and Chi-square analysis run specifically for the new target output as described in Example 1.
  • the networks trained for this Example are identified as follows (the two nets have different random seeds);
  • the count of variables found in the reduced subset run was smaller than for the runs on the full training example.
  • the typical performance of a consensus of networks was estimated using the holdout method with a partition of 5.
  • the typical classification performance for the AFS run using the full training example was 77.22549%.
  • the typical classification performance on the endo present subset was 63.008621 %. If all examples were classified as negative, the performance for the full training example would be 78.82% and 65.29% for the subset.
• the variable selection runs for the full training example and the subset of endo present examples suggest that the size of the training example is of importance in the determination of the important variables. It is clear that as the size of the training example increases, more variables will be considered important. This result can also be interpreted as an indication that more training data will improve the variable selection process and also the overall performance of the consensus networks used in building the diagnostic test.
  • Variable selection was performed without fetal fibronectin (fFN) test data.
  • EGA1 -EGA4 represent neural networks used for variable selection.
• the variable selection protocol was performed using a network architecture with 8 inputs in the input layer, three processing elements in the hidden layer, and one output in the output layer.
• EGA2 is the same as EGA1, except that it has 9 inputs in the input layer.
• EGA3 has 7 inputs in the input layer, three processing elements in the hidden layer, and one output in the output layer.
• EGA4 is the same as EGA1, except that it has 8 inputs in the input layer.
  • the variables selected are as follows:
• Ethnic Origin 1 (Caucasian); Ethnic Origin 4 (Hispanic)
• Vaginal bleeding (at time of sampling)
  • SN sensitivity
  • SP specificity
  • PPV positive predictive value
  • NPV negative predictive value
• OR odds ratio (total number correct/total number of answers)
  • fFN the results from the ELISA assay for fFN.
• the results show that the network EGA4, the neural net that includes seven patient variables plus the fFN ELISA assay and that predicts delivery at less than 34 weeks, has far fewer false positives than the fFN ELISA assay alone; the number of false positives was reduced by 50%. Incorporation of the fFN test into a neural net improved the performance of the fFN ELISA assay. All of the neural nets performed better than the fFN test alone.
  • the methods herein can be used to develop neural nets, as well as other decision-support systems, that can be used to predict pregnancy related events.
• this example shows the results of a task designed to quantitate the contribution of pat07 variables to pat07 performance, and to develop endometriosis networks using minimal numbers of pat07 variables.
• training examples were generated for each of the combinations of variables to be evaluated. These training examples contained only the variables required for the given consensus run. TrainDos™ was used in batch mode to train a set of eight neural networks for each of the combinations of variables to be evaluated. The networks were trained using the same parameters as the pat07 training runs. The only difference was the setting of the random number seeds for each network. Each network was trained on the full 510-record database. From these training runs, a consensus of the outputs was generated in an Excel spreadsheet so that the performance of each of the networks could be evaluated.
Results
  • Fig. 7 is a schematic diagram of an embodiment of one type of neural network 10 trained on clinical data of the form used for the consensus network (Fig. 10) of a plurality of neural networks.
  • the structure is stored in digital form, along with weight values and data to be processed in a digital computer.
• This first type of neural network 10 contains three layers: an input layer 12, a hidden layer 14 and an output layer 16.
• the input layer 12 has fourteen input preprocessors 17-30, each of which is provided with a normalizer (not shown) which generates a mean and standard deviation value to weight the clinical factors that are input into the input layer.
  • the mean and standard deviation values are unique to the network training data.
  • the input layer preprocessors 17-30 are each coupled to first and second processing elements 48, 50 of the hidden layer 14 via paths 51-64 and 65-78, so that each hidden layer processing element 48, 50 receives a value or signal from each input preprocessor 17-30.
  • Each path is provided with a unique weight based on the results of training on training data.
  • the unique weights 80-93 and 95-108 are non-linearly related to the output and are unique for each network structure and the initial values used in training. The final values of the weights are based on the initialized values assigned for network training.
  • the combination of the weights that results from training comprises a functional apparatus whose description, as expressed in weights, produces a desired solution, or more specifically a preliminary indicator of a diagnosis of endometriosis.
  • the factors used to train the neural network and upon which the output is based are the past history of the disease, number of births, dysmenorrhea, age, pelvic pain, history of pelvic surgery, smoking quantity per day, medication history, number of pregnancies, number of abortions, abnormal PAP/dysplasia, pregnancy hypertension, genital warts and diabetes. These fourteen factors have been determined to be a set of the most influential (greatest sensitivity) from the original set of over forty clinical factors. (Other sets of influential factors have been derived; see the EXAMPLES above.)
  • the hidden layer 14 is biased by bias weights 94, 119 provided via paths 164 and 179 to the processing elements 48 and 50.
  • the output layer 16 contains two output processing elements 120, 122.
  • the output layer 16 receives input from both hidden layer processing elements 48, 50 via paths 123, 124 and 125, 126.
  • the output layer processing elements 120, 122 are weighted by weights 110, 112 and 114, 116.
  • the output layer 16 is biased by bias weights 128, 130 provided via paths 129 and 131 to the processing elements 120 and 122.
  • the preliminary indication of the presence, absence or severity of endometriosis is the output pair of values A and B from the two processing elements 120, 122.
  • the values are always positive between zero and one.
  • One of the indicators is indicative that endometriosis is present.
  • the other one of the indicators is indicative that endometriosis is absent.
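  • The forward computation through this structure can be sketched as follows. This is a minimal illustration assuming a sigmoid transfer function and randomly initialized (untrained) weights; the function name, the placeholder normalization statistics and the example values are not taken from the specification.

```python
import numpy as np

def forward(raw_factors, mean, std, W1, b1, W2, b2):
    """One forward pass: fourteen normalized inputs, two hidden processing
    elements (48, 50) with bias weights, and two output processing elements
    (120, 122) with bias weights, using sigmoid transfer functions."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    x = (raw_factors - mean) / std        # input-layer normalization
    hidden = sigmoid(W1 @ x + b1)         # hidden layer 14
    a, b = sigmoid(W2 @ hidden + b2)      # output layer 16: preliminary pair (A, B)
    return a, b

rng = np.random.default_rng(1)
raw = rng.normal(size=14)                 # fourteen clinical factors for one patient
mean, std = np.zeros(14), np.ones(14)     # placeholder normalization statistics
A, B = forward(raw, mean, std,
               rng.normal(size=(2, 14)), rng.normal(size=2),
               rng.normal(size=(2, 2)), rng.normal(size=2))
print(f"A (present): {A:.3f}   B (absent): {B:.3f}")
```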
  • While the output pair A, B provides a generally valid indication of the disease, a consensus network of trained neural networks provides a higher confidence index.
  • a final indicator pair C, D is based on an analysis of a consensus of preliminary indicator pairs from a plurality of trained neural networks, specifically the eight networks 10A-10H (Fig. 10).
  • Each preliminary indicator pair A, B is provided to one of two consensus processors 150, 152 via paths 133-140 and 141-148.
  • the first consensus processor 150 processes all positive indicators.
  • the second consensus processor 152 processes all negative indicators.
  • Each consensus processor 150, 152 is an averager, i.e., it merely forms a linear combination, such as an average, of the collection of like preliminary indicator pairs A, B.
  • the resultant confidence indicator pair is the desired result, where the inputs are the set of clinical factors for the patient under test.
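  • A minimal sketch of this averaging step, assuming eight illustrative preliminary indicator pairs (the function name and numerical values are hypothetical):

```python
def consensus_indicators(preliminary_pairs):
    """Average the preliminary indicator pairs (A, B) from the trained networks.
    One consensus processor averages the positive indicators (A values); the
    other averages the negative indicators (B values)."""
    a_values = [a for a, _ in preliminary_pairs]
    b_values = [b for _, b in preliminary_pairs]
    c = sum(a_values) / len(a_values)   # consensus confidence that the condition is present
    d = sum(b_values) / len(b_values)   # consensus confidence that the condition is absent
    return c, d

# Eight illustrative pairs, one per trained network 10A-10H
pairs = [(0.82, 0.15), (0.76, 0.22), (0.91, 0.08), (0.68, 0.30),
         (0.79, 0.18), (0.85, 0.12), (0.73, 0.25), (0.88, 0.10)]
print(consensus_indicators(pairs))   # -> final indicator pair (C, D)
```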
  • Fig. 9 illustrates a typical processor element 120. Similar processors 48 and 50 have more input elements, and processor element 122 is substantially identical.
  • Typical processor element 120 comprises a plurality of weight multipliers 110, 114, 128 on respective input paths (numbering in total herein 15, 16 or 3 per element and shown herein as part of the processor element 120).
  • the weighted values from the weight multipliers are coupled to a summer 156.
  • the summer 156 output is coupled to an activation function 158, such as a sigmoid transfer function or an arctangent transfer function.
  • the processor elements can be implemented as dedicated hardware or in a software function.
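  • In a software implementation, such a processing element reduces to a weighted sum plus a bias weight followed by the transfer function. A minimal sketch (the class name and numerical values are illustrative only):

```python
import math

class ProcessingElement:
    """Weight multipliers feeding a summer, followed by an activation function
    (sigmoid here; an arctangent transfer function is an alternative)."""
    def __init__(self, weights, bias_weight):
        self.weights = weights
        self.bias_weight = bias_weight

    def output(self, inputs):
        total = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias_weight
        return 1.0 / (1.0 + math.exp(-total))   # sigmoid transfer function

# An output processing element such as 120 has two weighted inputs plus a bias weight
pe = ProcessingElement(weights=[0.4, -1.2], bias_weight=0.1)
print(pe.output([0.7, 0.3]))
```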
  • a sensitivity analysis can be performed to determine the relative importance of the clinical factors.
  • the sensitivity analysis is performed on a digital computer as follows: A trained neural network is run in the forward mode (no training) for each training example (input data group for which the true output is known or suspected). The output of the network for each training example is then recorded. Thereafter, the network is rerun with each input variable, in turn, replaced by the average value of that input variable over the entire training example. The difference in values for each output is then squared and summed (accumulated) to obtain individual sums.
  • This sensitivity analysis process is performed for each training example.
  • Each of the resultant sums is then normalized according to conventional processes so that, if all variables contributed equally to the single resultant output, each normalized value would be 1.0. From this information, the normalized values can be ranked in order of importance.
  • the order of sensitivity of factors for this neural network system is: the past history of the disease, number of births, dysmenorrhea, age, pelvic pain, history of pelvic surgery, smoking quantity per day, medication history, number of pregnancies, number of abortions, abnormal PAP/dysplasia, pregnancy hypertension, genital warts and diabetes.
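  • The sensitivity procedure described above can be sketched as follows; the helper name, the toy network and the random data are assumptions used only to make the sketch self-contained.

```python
import numpy as np

def sensitivity_ranking(net, X):
    """Rank input variables by the sensitivity procedure described above.
    net : callable mapping one input vector to the network output
    X   : (n_examples, n_variables) array of training examples"""
    baseline = np.array([net(x) for x in X])          # forward mode, no training
    means = X.mean(axis=0)                            # average of each input variable
    sums = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        X_mod = X.copy()
        X_mod[:, j] = means[j]                        # replace variable j by its average value
        modified = np.array([net(x) for x in X_mod])
        sums[j] = np.sum((modified - baseline) ** 2)  # squared differences, accumulated
    normalized = sums * len(sums) / sums.sum()        # equal contributions would give 1.0 each
    order = np.argsort(normalized)[::-1]              # variable indices ranked by importance
    return order, normalized

# Toy network (a fixed logistic function) and random data, for illustration only
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
toy_net = lambda x: 1.0 / (1.0 + np.exp(-(x @ np.array([2.0, 0.5, 0.0, -1.0, 0.1]))))
order, scores = sensitivity_ranking(toy_net, X)
print(order, np.round(scores, 2))
```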
  • a specific neural network system has been trained and has been found to be an effective diagnostic tool.
  • the neural network system as illustrated by Figs. 7 and 10, is described as follows:
  • Dysmenorrhea are as follows for each of eight of the first type of neural networks 10: First neural network A:
  • the results of biochemical tests may be used to produce trained augmented neural network systems that provide a relatively higher confidence level in terms of sensitivity and specificity.
  • These second type neural networks are illustrated in Fig. 8. The numbering is identical to Fig. 7, except for the addition of a node 31 in the input layer 12 and a pair of weights 109 and 111. All weights in the network, however, change upon training with the additional biochemical result. The exact weight set depends on the specific biochemical test training example.
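  • A minimal sketch of this augmentation, reusing the forward structure sketched earlier; the widened weight matrix, the random seed and the normalized biochemical value 0.62 are purely illustrative assumptions.

```python
import numpy as np

# Widen the input layer from fourteen clinical factors to fifteen inputs by appending
# the biochemical result (e.g. the fFN ELISA value) as the additional node; the weight
# matrix gains one column (the added pair of weights), and the whole network is retrained,
# so the weights shown here are illustrative only.
rng = np.random.default_rng(3)
n_inputs = 14 + 1
W1 = rng.normal(size=(2, n_inputs))       # untrained hidden-layer weights, including the new column
b1 = rng.normal(size=2)
clinical = rng.normal(size=14)            # normalized clinical factors
biochemical = 0.62                        # normalized biochemical test result (illustrative)
x = np.append(clinical, biochemical)
hidden = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))
print(hidden)
```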
  • the training system described above may be used.
  • Alternative training techniques may also be used (see, e.g., Baxt, "Use of an Artificial Neural Network for the Diagnosis of Myocardial Infarction," Annals of Internal Medicine 115, p. 843 (1 December 1991); "Improving the Accuracy of an Artificial Neural Network Using Multiple Differently Trained Networks," Neural Computation 4, p. 772 (1992)).

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
EP97915835A 1996-02-09 1997-02-07 Method for selecting medical and biochemical diagnostic tests using neural network-related applications Withdrawn EP0879449A2 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US59927596A 1996-02-09 1996-02-09
US1144996P 1996-02-09 1996-02-09
US11449P 1996-02-09
US599275 1996-02-09
PCT/US1997/002104 WO1997029447A2 (en) 1996-02-09 1997-02-07 Method for selecting medical and biochemical diagnostic tests using neural network-related applications

Publications (1)

Publication Number Publication Date
EP0879449A2 true EP0879449A2 (en) 1998-11-25

Family

ID=26682401

Family Applications (1)

Application Number Title Priority Date Filing Date
EP97915835A Withdrawn EP0879449A2 (en) 1996-02-09 1997-02-07 Method for selecting medical and biochemical diagnostic tests using neural network-related applications

Country Status (5)

Country Link
EP (1) EP0879449A2 (ja)
JP (6) JP3480940B2 (ja)
AU (1) AU2316297A (ja)
CA (1) CA2244913A1 (ja)
WO (1) WO1997029447A2 (ja)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11978208B2 (en) 2019-01-31 2024-05-07 Fujifilm Corporation Trained model, learning method, learning program, medical information acquisition device, medical information acquisition method, and medical information acquisition program

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6678669B2 (en) 1996-02-09 2004-01-13 Adeza Biomedical Corporation Method for selecting medical and biochemical diagnostic tests using neural network-related applications
US6622036B1 (en) * 2000-02-09 2003-09-16 Cns Response Method for classifying and treating physiologic brain imbalances using quantitative EEG
US6601034B1 (en) 1998-03-05 2003-07-29 American Management Systems, Inc. Decision management system which is cross-function, cross-industry and cross-platform
US6609120B1 (en) 1998-03-05 2003-08-19 American Management Systems, Inc. Decision management system which automatically searches for strategy components in a strategy
US8364578B1 (en) 1998-03-05 2013-01-29 Cgi Technologies And Solutions Inc. Simultaneous customer/account strategy execution in a decision management system
US6546545B1 (en) 1998-03-05 2003-04-08 American Management Systems, Inc. Versioning in a rules based decision management system
CA2326579C (en) * 1998-04-03 2011-01-18 Triangle Pharmaceuticals, Inc. Systems, methods and computer program products for guiding the selection of therapeutic treatment regimens
AU4971499A (en) * 1998-07-30 2000-02-21 Arcturus Engineering, Inc. Medical diagnostic and treatment information system and method
US6708155B1 (en) 1999-07-07 2004-03-16 American Management Systems, Inc. Decision management system with automated strategy optimization
JP4726177B2 (ja) * 2000-06-08 2011-07-20 Virco BVBA Methods and systems for predicting therapeutic drug resistance using neural networks and for defining the genetic basis of drug resistance
AU6378501A (en) * 2000-06-09 2001-12-17 Medimage Aps A fail-sure computer aided method for anticoagulant treatment
WO2002021423A2 (en) * 2000-09-06 2002-03-14 Cellomics, Inc. Method and system for obtaining knowledge based recommendations
US6920439B1 (en) * 2000-10-10 2005-07-19 Hrl Laboratories, Llc Method and apparatus for incorporating decision making into classifiers
US6988088B1 (en) * 2000-10-17 2006-01-17 Recare, Inc. Systems and methods for adaptive medical decision support
US7315784B2 (en) * 2001-02-15 2008-01-01 Siemens Aktiengesellschaft Network for evaluating data obtained in a biochip measurement device
US20020155587A1 (en) 2001-04-20 2002-10-24 Sequenom, Inc. System and method for testing a biological sample
US7840421B2 (en) 2002-07-31 2010-11-23 Otto Carl Gerntholtz Infectious disease surveillance system
US20040103001A1 (en) * 2002-11-26 2004-05-27 Mazar Scott Thomas System and method for automatic diagnosis of patient health
US7529394B2 (en) 2003-06-27 2009-05-05 Siemens Medical Solutions Usa, Inc. CAD (computer-aided decision) support for medical imaging using machine learning to adapt CAD process with knowledge collected during routine use of CAD system
CN101416191A (zh) * 2003-12-02 2009-04-22 什拉加·洛特姆 母婴状况的诊断、筛选、预防和治疗的人工智能和设备
US20090083768A1 (en) * 2007-09-20 2009-03-26 Hatalkar Atul N Context platform framework for aggregation, analysis and use of contextual information
MX2013001557A (es) * 2010-08-13 2013-06-28 Respiratory Motion Inc Dispositivos y metodos para el control de la variacion respiratoria mediante la medicion de volumenes, movimiento y variabilidad respiratoria.
NZ730197A (en) * 2014-09-11 2022-07-01 Berg Llc Bayesian causal relationship network models for healthcare diagnosis and treatment based on patient data
JP6481335B2 (ja) * 2014-11-06 2019-03-13 NEC Corporation Information processing system, information processing apparatus, information processing method, and information processing program
JP6280997B1 (ja) * 2016-10-31 2018-02-14 Preferred Networks, Inc. Disease affliction determination device, disease affliction determination method, disease feature extraction device, and disease feature extraction method
JP2019101902A (ja) * 2017-12-06 2019-06-24 Groovenauts, Inc. Data processing apparatus, data processing method, and data processing program
JP2021519479A (ja) * 2018-03-19 2021-08-10 Ambry Genetics Corporation Artificial intelligence and machine learning platform for identifying genetic and genomic tests
KR102043376B1 (ko) * 2018-08-16 2019-11-11 Korea Institute of Science and Technology Real-time stress analysis method using a deep neural network algorithm
EP3844722A4 (en) 2018-08-29 2022-06-01 Movidius Ltd. COMPUTER VISION SYSTEM
KR102166441B1 (ko) * 2018-10-02 2020-10-16 Daegu Gyeongbuk Institute of Science and Technology Lesion detection apparatus and control method
JPWO2020189235A1 (ja) * 2019-03-20 2020-09-24
CN110164549A (zh) * 2019-05-20 2019-08-23 Nantong Yilin Smart Medical Technology Co., Ltd. Pediatric triage method and system based on a neural network classifier
KR102187344B1 (ko) * 2019-09-03 2020-12-04 Medisapiens Inc. Method and apparatus for companion animal diagnosis using a decision tree
US20220383111A1 (en) * 2019-09-27 2022-12-01 D5Ai Llc Selective training of deep learning modules
WO2022190891A1 (ja) * 2021-03-11 2022-09-15 Sony Group Corporation Information processing system and information processing method
KR20230027810A (ko) * 2021-08-20 2023-02-28 The Catholic University of Korea Industry-Academic Cooperation Foundation Prognosis prediction apparatus, method, and recording medium
US20230169168A1 (en) * 2021-11-29 2023-06-01 Microsoft Technology Licensing, Llc. Detect anomalous container deployment at a container orchestration service
CN115116594B (zh) * 2022-06-06 2024-05-31 Institute of Automation, Chinese Academy of Sciences Method and apparatus for detecting the effectiveness of a medical device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3231810B2 (ja) * 1990-08-28 2001-11-26 Arch Development Corporation Differential diagnosis support method using a neural network
JP3081043B2 (ja) * 1991-12-27 2000-08-28 Sysmex Corporation Method for diagnosing cerebral infarction
JPH05277119A (ja) * 1992-03-31 1993-10-26 Nuclear Fuel Ind Ltd Cancer diagnosis apparatus
CA2161655A1 (en) * 1993-04-30 1994-11-10 James David Keeler Method and apparatus for determining the sensitivity of inputs to a neural network on output parameters
JP3226400B2 (ja) * 1993-12-06 2001-11-05 Fujitsu Limited Diagnostic apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9729447A2 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11978208B2 (en) 2019-01-31 2024-05-07 Fujifilm Corporation Trained model, learning method, learning program, medical information acquisition device, medical information acquisition method, and medical information acquisition program

Also Published As

Publication number Publication date
JP2008065836A (ja) 2008-03-21
AU2316297A (en) 1997-08-28
CA2244913A1 (en) 1997-08-14
JP3480940B2 (ja) 2003-12-22
JP4168187B2 (ja) 2008-10-22
JP3782792B2 (ja) 2006-06-07
JP2008136874A (ja) 2008-06-19
JP4139822B2 (ja) 2008-08-27
WO1997029447A3 (en) 1998-05-07
JP2005319301A (ja) 2005-11-17
JP2006172461A (ja) 2006-06-29
JP2000501869A (ja) 2000-02-15
WO1997029447A2 (en) 1997-08-14
JP2004041713A (ja) 2004-02-12

Similar Documents

Publication Publication Date Title
EP0879449A2 (en) Method for selecting medical and biochemical diagnostic tests using neural network-related applications
US20030004906A1 (en) Method for selecting medical and biochemical diagnostic tests using neural network-related applications
US6527713B2 (en) Automated diagnostic system and method including alternative symptoms
CN113838577B (zh) Convenient hierarchical early mortality risk assessment model for elderly MODS, device, and establishment method
Steen Approaches to predictive modeling
Yulhendri et al. Correlated Naïve Bayes Algorithm to Determine Healing Rate of Hepatitis Patients
Ramya et al. Identification of Heart Disease Using Fs Algorithm
Saleena Analysis of machine learning and deep learning prediction models for sepsis and neonatal sepsis: A systematic review
Park et al. Original Research Article Study on prediction and diagnosis AI model of frequent chronic diseases based on health checkup big data
Wagachchi Machine learning-based system to predicting the diagnosis of coronary artery disease
Sekar et al. Feature selection and weighing for case-based reasoning system using random forests
Xu A significance test-based feature selection method for the detection of prostate cancer from proteomic patterns
Reina et al. Novel computational model for survey and trend analysis of patients with endometriosis: a decision aid tool for EBM

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19980730

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI NL PT SE

17Q First examination report despatched

Effective date: 20000105

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20000516