CN115886818B - Depression anxiety disorder prediction system based on gastrointestinal electric signal and construction method thereof - Google Patents

Depression anxiety disorder prediction system based on gastrointestinal electric signal and construction method thereof

Info

Publication number
CN115886818B
Authority
CN
China
Prior art keywords
data
meal
anxiety disorder
depression
gastrointestinal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211496983.3A
Other languages
Chinese (zh)
Other versions
CN115886818A (en)
Inventor
陈蕾
李百川
季舒铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
West China Hospital of Sichuan University
Original Assignee
West China Hospital of Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by West China Hospital of Sichuan University
Priority to CN202211496983.3A
Publication of CN115886818A
Application granted
Publication of CN115886818B
Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention relates to the field of disease prediction, and in particular to a depression anxiety disorder prediction system based on gastrointestinal electrical signals and a construction method thereof. The invention provides a depression anxiety disorder prediction system comprising: a database for storing data, the types of data comprising gastrointestinal electrical signal data and clinical data, the gastrointestinal electrical signal data comprising pre-meal gastrointestinal electrical signal data and post-meal gastrointestinal electrical signal data, wherein the pre-meal gastrointestinal electrical signal data comprises the reaction area of the pre-meal intestine and the lead time difference of the pre-meal intestine, the post-meal gastrointestinal electrical signal data comprises the normal slow wave percentage of the post-meal intestine and the coupling percentage of the post-meal intestine, and the clinical data comprises drinking and blood glucose; a data acquisition module; a model training module; and a prediction module for predicting the probability that depressive anxiety disorder occurs in a subject. In another aspect, the invention also provides a method for constructing a depression anxiety disorder prediction model based on the system.

Description

Depression anxiety disorder prediction system based on gastrointestinal electric signal and construction method thereof
Technical Field
The invention relates to the field of disease prediction, and in particular to a depression anxiety disorder prediction system based on gastrointestinal electrical signals and a construction method thereof.
Background
Early detection of disease is of paramount importance. It is believed that the earlier the disease is diagnosed, the greater the likelihood that the disease will be cured (or successfully controlled) and that the patient will have a better prognosis. If the disease can be screened and treated early, further exacerbation of the disease can be prevented or delayed, the therapeutic effect can be improved (e.g., prolonging patient life, improving patient quality of life), etc.
In recent years, the number of patients with chronic nervous system diseases in China has risen continuously. However, because early symptoms are not obvious, patient awareness is low, and examination procedures are cumbersome and costly, some chronic neurological disorders (e.g., cognitive dysfunction, sleep disorders, anxiety or depression) are often diagnosed only after they have progressed to a stage that is more difficult to intervene in and treat. Large-scale screening of target populations (e.g., middle-aged and elderly people) is therefore required to achieve early detection. However, large-scale screening requires large amounts of data to be processed and typically relies on manual analysis; the overall process is time-consuming, laborious and costly, and the results of the data analysis are subjective, complex and difficult to quantify. As a result, large-scale screening is difficult to popularize for many diseases, particularly for the early detection of depressive anxiety disorders.
It is estimated that approximately one quarter of adults worldwide suffer from a mental health disorder each year. Common mental health disorders include depression and anxiety, which are characterized by strong emotional distress that can seriously affect the social and occupational functioning of the patient. Depression is characterized by persistently low mood and/or loss of pleasure (anhedonia) in most activities, together with a series of related emotional, cognitive, physical and behavioral symptoms such as fatigue or loss of energy, feelings of worthlessness or of excessive or inappropriate guilt, recurrent thoughts of death, diminished ability to think or concentrate, indecisiveness, psychomotor agitation or retardation, and insomnia or hypersomnia. Anxiety is characterized by excessive fear, worry and avoidance in response to potential threats in the environment (e.g., social situations, unfamiliar situations) or in oneself (e.g., unusual bodily sensations). However, patients with anxiety or depression mostly focus on their somatic symptoms (e.g., systemic symptoms or multiple symptoms of autonomic dysfunction) and generally do not volunteer their emotional distress (e.g., after certain symptoms appear, most patients choose to visit a neurology department and complain of symptoms unrelated to emotion, such as headache, dizziness or tiredness), which makes early screening for depression or anxiety disorders more difficult. Anxiety disorders are sometimes very distressing and can ultimately lead to depression. Anxiety and depression may also coexist, and it is also possible for depression to occur first, followed by symptoms and signs of anxiety.
In the prior art, depression anxiety disorder is diagnosed by means such as basic clinical symptom assessment and neuropsychological state assessment. Specifically, when a patient exhibits symptoms of depressive anxiety disorder, the patient or the patient's family members visit a hospital. During the consultation, the doctor gives the patient a paper rating scale (or a combination of scales), makes a preliminary evaluation according to the patient's test results, and screens out patients suspected of depression or anxiety disorder in combination with other examination results. The whole process is therefore complex and tedious, takes a long time, involves a large workload with low efficiency, and is influenced by the personal experience of the doctor. In addition, patients may have little awareness of depression or anxiety disorder and may cooperate poorly with diagnosis, and some patients cannot communicate with doctors or complete the scales independently because of poor hearing, poor eyesight, limited comprehension, a low educational level or other reasons. Early detection (particularly large-scale screening at the primary-care level and in the general population) and timely targeted treatment and prevention of depression or anxiety disorder are therefore difficult.
Although the prior art includes scale-free psychological detection methods for anxiety and depression, these methods require the subject to take part in psychological interviews during which audio and video data are collected. They do not solve the problems of complex procedures, time consumption and low acceptance, and they are particularly unsuitable for people with a low educational level, limited comprehension or no access to electronic equipment, so their range of application remains limited.
Disclosure of Invention
In a first aspect, the present invention provides a depression anxiety disorder prediction system comprising: a database for storing data, the types of data comprising gastrointestinal electrical signal data and clinical data, the gastrointestinal electrical signal data comprising pre-meal gastrointestinal electrical signal data and post-meal gastrointestinal electrical signal data; the pre-meal gastrointestinal electrical signal data comprises a reaction area of a pre-meal intestine portion and a lead time difference of the pre-meal intestine portion, the post-meal gastrointestinal electrical signal data comprises a normal slow wave percentage of a post-meal intestine portion and a coupling percentage of the post-meal intestine portion, and the clinical data comprises drinking and blood glucose; the data includes sample data from a sample population and subject data from a subject;
the data acquisition module is used for acquiring the data and storing the data in the database;
the model training module is used for training and learning the sample data by using a machine learning algorithm so as to determine a depression anxiety disorder prediction model;
and the prediction module is used for acquiring the subject data through the data acquisition module, and calling the depression anxiety disorder prediction model to analyze the subject data so as to predict the probability of depression anxiety disorder occurrence of the subject.
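As a minimal sketch of what the prediction module computes, assuming the trained model is a logistic regression over the six predictors named above: the identifier names, the intercept and the coefficient values below are hypothetical placeholders chosen only to make the example run, not values from the patent.

```python
import math

# Hypothetical field names for the six predictors listed in the text.
PREDICTORS = (
    "premeal_intestinal_reaction_area",
    "premeal_intestinal_lead_time_difference",
    "postmeal_intestinal_normal_slow_wave_pct",
    "postmeal_intestinal_coupling_pct",
    "drinking",
    "blood_glucose",
)


def predict_probability(subject, intercept, coefficients):
    """Return the logistic-model probability 1 / (1 + exp(-z)) for one subject."""
    missing = [name for name in PREDICTORS if name not in subject]
    if missing:
        raise ValueError(f"missing predictors: {missing}")
    z = intercept + sum(coefficients[name] * subject[name] for name in PREDICTORS)
    # Numerically stable form of the logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Placeholder inputs: all coefficients are zero, so the linear term is zero.
example_coefficients = {name: 0.0 for name in PREDICTORS}
example_subject = {name: 1.0 for name in PREDICTORS}
print(predict_probability(example_subject, 0.0, example_coefficients))
```

With a fitted model, the placeholder intercept and coefficients would be replaced by the values estimated by the model training module.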
In some embodiments, the sample data is divided into training and validation sets at a ratio of 7:3.
In some embodiments, the gastrointestinal electrical signal data is acquired simultaneously by leads placed at positions corresponding to the gastric body, gastric antrum, lesser curvature, greater curvature, ascending colon, transverse colon, descending colon, and rectum, respectively.
In some embodiments, the system further comprises a verification module for evaluating the accuracy of the depressive anxiety disorder prediction model using the validation set.
In some embodiments, the evaluation index of the evaluation includes calibration or discrimination.
In a second aspect, the invention provides a method for constructing a depression anxiety disorder prediction model, which is characterized by comprising the following steps:
S1, acquiring sample data from a sample population, wherein the types of the sample data comprise gastrointestinal electrical signal data and clinical data;
S2, pre-training the sample data to screen out prediction variables, wherein the screened prediction variables comprise reaction area of the pre-meal intestine, lead time difference of the pre-meal intestine, normal slow wave percentage of the post-meal intestine, coupling percentage of the post-meal intestine, drinking and blood glucose;
S3, training and learning the sample data by using a machine learning algorithm based on the screened prediction variables so as to establish a depression anxiety disorder prediction model.
In some embodiments, the sample data is divided into training and validation sets at a ratio of 7:3.
In some embodiments, the pre-training comprises a first round of variable screening and a second round of variable screening; the first round of variable screening includes LASSO regression analysis and the second round of variable screening includes logistic regression analysis and stepwise regression analysis.
In some embodiments, the method further comprises S4: evaluating the accuracy of the depression anxiety disorder prediction model using the validation set, the evaluation index comprising calibration or discrimination.
In some embodiments, the pre-training includes a ridge regression or random forest model.
Compared with the prior art, the invention has the beneficial technical effects that:
The invention provides a gastrointestinal electrical signal-based depression anxiety disorder prediction system and a construction method thereof. The invention performs pre-training (feature screening) on 46 feature variables in the sample data through LASSO regression analysis, logistic regression analysis and stepwise regression analysis, finally retains 6 prediction variables (reaction area of the pre-meal intestine, lead time difference of the pre-meal intestine, normal slow wave percentage of the post-meal intestine, coupling percentage of the post-meal intestine, drinking and blood glucose), and constructs a depression anxiety disorder prediction model based on these 6 prediction variables.
The prior art mainly relies on behavior scales that evaluate the behavior of a patient in order to diagnose depression and anxiety. By the time the patient or the people around the patient realize that the patient may suffer from a mental disorder (e.g., depression, anxiety), the disorder may already have progressed to a stage that is more difficult to intervene in and treat (e.g., it has developed into severe depression or anxiety, or has become complicated by other mental disorders). The symptoms exhibited in the early stages are often hard to discern, so the best window for intervention may be missed (it is believed that the shorter the time between the onset of a mental illness such as depression or anxiety and the intervention, the greater the chance that the patient will recover). On the other hand, some patients, even if aware of certain symptoms, do not regard or acknowledge themselves as having a mental disorder, and instead ignore the symptoms or seek help from other departments (e.g., neurology). With the depression anxiety disorder prediction model of the invention, the risk that a subject develops depressive anxiety disorder can be predicted from only some of the subject's gastrointestinal electrical indices and basic clinical information (such as drinking and blood glucose), without multiple examinations and scales. This facilitates early screening for depressive anxiety disorder and gains time for its intervention and treatment.
The depression anxiety disorder prediction system and method provided by the invention do not involve high-cost (and possibly invasive) examinations such as imaging, and do not require the subject to fill out a scale. The procedure is therefore simple and inexpensive, so subject acceptance and cooperation are high. The system and method are not limited by age or educational level and are not affected by communication and comprehension difficulties or by the personal experience of doctors during a consultation. They can analyze and predict from the subject's data relatively objectively and noninvasively, and are especially suitable for primary screening and early detection of depression anxiety disorder in larger populations (such as communities and physical examination centers). In summary, the depression anxiety disorder prediction system and method provided by the invention are useful both for assisting clinical evaluation and for individual prediction, and are suitable for various application scenarios (such as primary medical institutions, homes, hospitals and physical examination centers) and populations.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. Like elements or portions are generally identified by like reference numerals throughout the several figures. In the drawings, elements or portions thereof are not necessarily drawn to scale. It will be apparent to those of ordinary skill in the art that the drawings in the following description are of some embodiments of the invention and that other drawings may be derived from these drawings without inventive faculty.
FIG. 1 shows the LASSO regression coefficient paths (a) and the ten-fold cross-validation results (b) for the first embodiment of the present invention;
FIG. 2 shows a nomogram (a) and a dynamic nomogram (b) of the optimal logistic regression model according to the first embodiment of the present invention;
FIG. 3 shows a predictive model ROC curve for test set (a) and validation set (b) of a first embodiment of the invention;
FIG. 4 shows calibration curves of test set (a) and validation set (b) according to a first embodiment of the present invention;
FIG. 5 shows a block diagram of a prediction system provided by an embodiment of the present invention;
FIG. 6 illustrates placement of leads in an embodiment of the invention;
FIG. 7 shows a schematic architecture diagram of a prediction system according to an embodiment of the present invention.
Reference numerals: 100 is a prediction system, 102 is a data acquisition module, 104 is a model construction module, 106 is a database, 108 is a model training module, 110 is a verification module, 112 is a prediction module, 202 is a first terminal, 204 is a second terminal, 206 is a network, 601 is the gastric body, 602 is the lesser curvature, 603 is the greater curvature, 604 is the gastric antrum, 605 is the ascending colon, 606 is the transverse colon, 607 is the descending colon, and 608 is the rectum.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Herein, "and/or" includes any and all combinations of one or more of the associated listed items.
Herein, "plurality" means two or more, i.e., it includes two, three, four, five, etc.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
As used in this specification, the term "about" is typically expressed as +/-5% of the value, more typically +/-4% of the value, more typically +/-3% of the value, more typically +/-2% of the value, even more typically +/-1% of the value, and even more typically +/-0.5% of the value.
In this specification, certain embodiments may be disclosed in a format that is within a certain range. It should be appreciated that such a description of "within a certain range" is merely for convenience and brevity and should not be construed as an inflexible limitation on the disclosed ranges. Accordingly, the description of a range should be considered to have specifically disclosed all possible sub-ranges and individual numerical values within that range. For example, the description of the range 1-6 should be considered to have specifically disclosed sub-ranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6, etc., as well as individual numbers within this range, e.g., 1, 2, 3, 4, 5, and 6. The above rule applies regardless of the breadth of the range.
Embodiment one: gastrointestinal electric signal and depression anxiety disorder prediction model
1.1 method
Subjects: In 60 communities in western China, community residents aged over 40 who volunteered to participate in the study and signed informed consent were recruited. General data such as age, sex, marital status, educational level, lifestyle and eating habits were collected, and depression and anxiety scale assessment and gastrointestinal electrogram (EGEG) detection were carried out. The exclusion criteria were: persons with gastrointestinal diseases such as gastritis, gastric ulcer, diarrhea or constipation, and persons who were deaf or blind; patients with serious cardiac, hepatic or renal dysfunction, metabolic diseases such as diabetes, or serious mental illness; and, to reduce the effect of drugs on the gastrointestinal electrical recording, persons who had taken any drug within 1 week before the examination.
Depression and anxiety assessment: The Patient Health Questionnaire-9 (PHQ-9) was used to assess depressive state and the Generalized Anxiety Disorder-7 scale (GAD-7) was used to assess anxiety state. Both scales have good reliability and validity for assessing depression or anxiety. The two scales were completed by the subjects in a quiet environment under the guidance of two professionals using uniform instructions. Because anxiety and depression are closely related (they are frequently comorbid), in this example a subject diagnosed with either depression or anxiety was counted as having the disorder (i.e., anxiety and depression were predicted as a single outcome).
EGEG recording: Gastrointestinal myoelectric activity signals were measured and acquired with an 8-channel electrogastroenterograph (XDJ-S8, Hefei Kaili Co., Hefei, China). All subjects were told to avoid alcohol and spicy or irritating foods for 3 days and to fast for at least 6 hours before the examination. The measurement was performed in the supine position. Four stomach electrodes (leads placed at the gastric body 601, lesser curvature 602, greater curvature 603 and gastric antrum 604, respectively) and four intestine electrodes (leads placed at the ascending colon 605, transverse colon 606, descending colon 607 and rectum 608, respectively) (Hanjie Co. Ltd., Shanghai, China) were placed on the abdominal skin (FIG. 6). During the examination, the subject was instructed to avoid any movement and speaking. After 6 minutes of pre-meal EGEG recording, a test-meal load experiment was performed: after ingestion of about 200 kcal of standard food, 6 minutes of post-meal gastrointestinal electrical signals were recorded. The placement of the leads is shown in FIG. 6. Gastric body 601: three to five centimeters to the left of, and one centimeter above, the midpoint of the line connecting the xiphoid process and the umbilicus; gastric antrum 604: two to four centimeters to the right of the midpoint of the xiphoid-umbilicus line; lesser curvature 602: halfway upward from the midpoint of the xiphoid-umbilicus line; greater curvature 603: halfway downward from the midpoint of the xiphoid-umbilicus line; ascending colon 605: two to four centimeters to the right of the umbilicus, level with it; transverse colon 606: one centimeter below the umbilicus; descending colon 607: two to four centimeters to the left of the umbilicus, level with it; rectum 608: on the back, below the coccyx.
Gastrointestinal electrical indices: The EGEG sampling frequency was 1 Hz, and a filter band of 0.008-0.1 Hz was used to remove background noise, including heartbeat. After artifacts were detected, the raw EGEG potential data were processed by the software supplied with the instrument, which performed spectral analysis and derived the following parameters: (1) average waveform amplitude; (2) average waveform frequency; (3) percentage of gastric (intestinal) electrical rhythm disorder; (4) waveform reaction area; (5) lead time difference; (6) primary frequency; (7) main power ratio; (8) normal slow wave percentage; (9) coupling percentage.
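The instrument's software is not described in the text, so the following is only a generic sketch of how a primary (dominant) frequency can be read off a spectrum. It uses the stated 1 Hz sampling rate, the stated 0.008-0.1 Hz band (0.48 to 6 cycles per minute) and a 6-minute record; the synthetic signal and the function name are assumptions made for illustration.

```python
import cmath
import math

FS_HZ = 1.0             # sampling frequency stated in the text
BAND_HZ = (0.008, 0.1)  # filter band stated in the text
DURATION_S = 360        # one 6-minute recording, as in the protocol


def dominant_frequency(samples, fs_hz, band_hz):
    """Return the frequency (Hz) with the largest DFT power inside band_hz."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the DC offset
    low, high = band_hz
    best_freq, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fs_hz / n
        if freq < low or freq > high:
            continue  # skip bins outside the band of interest
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq


# Synthetic signal (not patient data): a 0.05 Hz slow wave plus a stronger
# 0.25 Hz component that lies outside the band and should be ignored.
signal = [math.sin(2 * math.pi * 0.05 * t) + 2.0 * math.sin(2 * math.pi * 0.25 * t)
          for t in range(int(DURATION_S * FS_HZ))]

peak_hz = dominant_frequency(signal, FS_HZ, BAND_HZ)
print(f"dominant frequency: {peak_hz:.3f} Hz = {peak_hz * 60:.1f} cycles/min")
```

Restricting the search to the band plays the role of the filter here: the stronger 0.25 Hz component is never considered, so the in-band slow wave is reported.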
Other indexes:each subject received an assay for blood glucose and lipid levels, including glucose, triglycerides, cholesterol, high density lipoproteins, low density lipoproteins. Basic personal characteristics including gender, age, smoking history, drinking history, BMI were also collected for each subject.
Building the prediction model: First, to obtain a subset of predictors, a first round of variable screening was performed using LASSO regression analysis, a regularization algorithm. The LASSO regression analysis was run with 10-fold cross-validation, the included variables were centered and standardized, and "lambda.min" was selected as the best-performing penalty value. Subjects were randomly assigned to training and validation sets in a 7:3 ratio. Next, the predictors retained by the LASSO regression model underwent a second round of variable screening by stepwise multivariable logistic regression analysis, and the retained, statistically significant predictors (in the invention, "predictor" and "prediction variable" have the same meaning) were used to build the prediction model. Finally, the established model was applied to predict the risk of depressive anxiety disorder and was presented as a nomogram prediction model. It should be appreciated that other suitable algorithms known in the art may also be used, such as random forests, other regularization methods (e.g., ridge regression), neural networks, and the like.
Furthermore, using the data of the training set and of the validation set respectively, several validation methods were employed to evaluate the accuracy of the risk prediction model: the ROC curve, whose area under the curve (AUC) measures how well the depressive anxiety disorder risk nomogram separates true positives from false positives (i.e., discrimination); and the calibration curve, together with the Hosmer-Lemeshow test, which evaluates the calibration of the nomogram. All analyses used the glmnet and rms packages in R version 4.1.3, with the significance level set at two-tailed α < 0.1.
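The discrimination measure can be illustrated with a short sketch. It computes the area under the ROC curve through its rank interpretation (the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case). The labels and scores are made-up illustration values, and the calibration side (calibration curve, Hosmer-Lemeshow test) is not covered here.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks for six subjects (1 = case, 0 = non-case).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
print(round(roc_auc(labels, scores), 3))  # 7 of 9 case/non-case pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.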
1.2 results
Subject data information:
A total of 662 subjects completed all relevant examinations, including 204 men and 458 women, of whom 90 were diagnosed with anxiety or depression (69 men, 21 women). Each gastrointestinal electrical index of a subject was obtained by averaging the pre-meal or post-meal parameter values from the 8 leads described above, giving four groups of indices: pre-meal gastric lead signal indices, post-meal gastric lead signal indices, pre-meal intestinal lead signal indices and post-meal intestinal lead signal indices. Acquiring signals simultaneously from several leads at several positions and then averaging them captures the overall motility pattern of the stomach and intestine better, and so yields signals that more faithfully reflect their overall state. In addition, preliminary experiments showed that the signal indices obtained by this multi-point acquisition are stable, that the model is easy to build, and that the resulting model generalizes better to large populations.
All subjects were randomly assigned in a 7:3 ratio, with 464 subjects in the training set and 198 in the validation set.
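The text does not say how the 7:3 split was rounded. As a worked check under one assumption (a split stratified by outcome that takes the ceiling of 70% within each class), the reported counts are reproduced exactly, whereas simply rounding 0.7 × 662 = 463.4 would give 463. The function name is hypothetical.

```python
def stratified_train_size(class_counts, numerator=7, denominator=10):
    """Training-set size when ceil(numerator/denominator * n) is drawn per class."""
    # Integer ceiling division avoids floating-point rounding issues.
    return sum(-((-numerator * n) // denominator) for n in class_counts)


TOTAL = 662              # subjects who completed all examinations (from the text)
CASES = 90               # diagnosed with anxiety or depression (from the text)
NON_CASES = TOTAL - CASES

train = stratified_train_size([CASES, NON_CASES])
validation = TOTAL - train
print(train, validation)
```

This only shows that a per-class ceiling is consistent with 464 and 198; other procedures could produce the same counts.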
Screening of independent risk factors:
Using LASSO regression to screen for feature variables with non-zero coefficients, the solution of the present invention retains 8 of the 46 candidate feature variables (also referred to herein as independent variables (IV); Table 1) as potential prediction variables of the model (FIGS. 1a and 1b). The response variable (also referred to herein as the outcome or dependent variable (DV)) is the risk of developing depressive anxiety disorder. The 8 retained feature variables are: drinking, blood glucose, normal slow wave percentage of the post-meal intestine, coupling percentage of the post-meal intestine, reaction area of the pre-meal intestine, lead time difference of the pre-meal intestine, main power ratio of the pre-meal stomach and normal slow wave percentage of the pre-meal stomach (Table 2).
Table 1: 46 related feature variables (parameter indices)
Table 2: First-round variable screening of Embodiment one
The LASSO (Least Absolute Shrinkage and Selection Operator) regression analysis adopted by the technical scheme of the invention is a shrinkage and feature variable selection method for regression models. To obtain a subset of predictors, LASSO regression analysis minimizes the prediction error of the response variable while applying a constraint to the model parameters, which shrinks the regression coefficients of some feature variables toward zero. After shrinkage, feature variables whose regression coefficients equal zero are excluded from the model, while the feature variables with non-zero regression coefficients are those most strongly associated with the response variable. The parameter λ adjusts the complexity of the LASSO regression: the larger λ is, the heavier the penalty on models with many feature variables, so that the final model has fewer feature variables, each more strongly associated with the response variable. In FIG. 1a, each curve is the trajectory of the regression coefficient of one feature variable; the ordinate is the value of the regression coefficient, the lower abscissa is log(λ), and the upper abscissa is the number of non-zero regression coefficients in the model at that λ. In other words, the first round of feature variable screening mainly excludes, from the 46 related feature variables, those whose regression coefficients are easily shrunk to zero, while retaining the above 8 feature variables as prediction variables of the prediction model.
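The shrink-to-zero behavior described above can be sketched with the soft-thresholding operator. In the special case of squared-error loss with standardized, uncorrelated predictors, the LASSO coefficient is the soft-thresholded unpenalized coefficient; the same operator appears inside coordinate-descent solvers for the logistic case. The starting coefficients and λ values below are illustrative numbers, not values from the patent.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficients for three standardized predictors.
unpenalized = [2.0, -0.6, 0.3]

nonzero_counts = []
for lam in (0.0, 0.5, 1.0):
    shrunk = [soft_threshold(b, lam) for b in unpenalized]
    nonzero_counts.append(sum(1 for b in shrunk if b != 0.0))
    print(lam, shrunk)

print(nonzero_counts)  # fewer predictors survive as lam grows
```

This mirrors the upper axis of a coefficient-path plot, where the count of non-zero coefficients falls as λ increases.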
Further, in order to evaluate more accurately the performance of the prediction model based on the 8 prediction variables, the technical scheme of the invention runs the LASSO regression analysis with 10-fold cross-validation, centers and standardizes the 46 included feature variables, and then selects the optimal λ value using the deviance (-2 log-likelihood) as the target parameter (i.e., the quantity to be minimized when the model is selected by cross-validation) for a binary (yes/no) dependent variable. As shown in FIG. 1b, for each λ value the black dot is the mean of the target parameter, and the solid lines above and below the black dot show the interval of the target parameter arising from cross-validation. The two dashed lines mark two particular λ values, lambda.min and lambda.1se; any λ between them can be considered suitable. lambda.min is the λ at which the mean of the target parameter is smallest among all λ values, and lambda.1se is the largest λ whose mean target parameter lies within one standard error of that minimum. The model built with lambda.1se is the simplest (it uses the fewest prediction variables), whereas the model built with lambda.min has higher accuracy. The technical scheme of the invention therefore uses lambda.min to build the prediction model with the best performance and highest accuracy.
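A minimal sketch of how the two dashed-line values can be picked from cross-validation output, following the usual convention that lambda.1se is the largest λ whose mean error is within one standard error of the minimum. The λ grid, mean errors and standard errors are hypothetical numbers, and the function name is an assumption.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from per-lambda CV means and standard errors."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    # Largest penalty whose mean error stays within one SE of the minimum.
    lambda_1se = max(lam for lam, m in zip(lambdas, cv_mean) if m <= threshold)
    return lambdas[i_min], lambda_1se


# Hypothetical cross-validation summary (deviance per lambda).
lambdas = [0.100, 0.050, 0.020, 0.010, 0.005]
cv_mean = [0.80, 0.74, 0.71, 0.70, 0.75]
cv_se = [0.03, 0.03, 0.02, 0.02, 0.02]

lambda_min, lambda_1se = select_lambdas(lambdas, cv_mean, cv_se)
print(lambda_min, lambda_1se)
```

Because lambda.1se is never smaller than lambda.min, it never keeps more predictors; choosing lambda.min, as the text does, favors the lowest cross-validated error over the sparsest model.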
Development of a prediction model:
According to the technical scheme, the feature variables selected by the LASSO regression model are introduced, and the prediction model is built using stepwise multivariable logistic regression analysis. The statistical significance of each introduced feature variable is then analyzed, and the feature variables that are statistically significant are used as prediction variables (predictors) for establishing the prediction model of the risk of depressive anxiety disorder.
According to the technical scheme, the 8 prediction variables are analyzed with a logistic regression model, the optimal prediction variables are selected stepwise, and 6 prediction variables are finally retained (each statistically significant at the 0.1 test level). These 6 prediction variables are drinking, blood glucose, the normal slow wave percentage of the post-meal intestine portion, the coupling percentage of the post-meal intestine portion, the reaction area of the pre-meal intestine portion, and the lead time difference of the pre-meal intestine portion (table 3). The technical scheme of the invention tests the 6 feature variables with several statistical measures, chief among them the odds ratio (OR). The OR is a statistic that quantifies the strength of the association between two events: it is the ratio of the odds that the outcome occurs with exposure (here, the feature variable being tested; the same applies below) to the odds that it occurs without that exposure. In the present invention the OR is understood specifically as the strength of the association between depressive anxiety disorder (i.e., the response variable) and the exposure, indicating how many times the risk of developing depressive anxiety disorder (also understood as the risk of disease) for an exposed subject is that for a non-exposed subject.
If the OR of the examined feature variable is >1, exposure increases the risk of developing depressive anxiety disorder, and the feature variable is "positively" associated with depressive anxiety disorder; if the OR is <1, exposure reduces the risk, and the feature variable is "negatively" associated with depressive anxiety disorder; if the OR = 1, depressive anxiety disorder is not associated with that feature variable. The 95% confidence interval (95% CI) provides an estimate of the precision of the OR obtained by the test: the true population value is expected to lie within the 95% confidence interval of the OR obtained, and the narrower the interval, the more precise and robust the OR. OR (upper) and OR (lower) in table 3 are the upper and lower limits of the 95% CI of the OR of each examined feature variable. Table 3 shows the odds ratios (OR) of the 6 retained prediction variables and their 95% confidence intervals, indicating that these prediction variables are all associated to some degree with depressive anxiety disorder; they are therefore applied in the depressive anxiety disorder prediction model of the present invention.
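For a logistic regression model, the OR and its 95% confidence interval are conventionally obtained by exponentiating the regression coefficient and its Wald limits. The sketch below shows that standard calculation; the function names are hypothetical, and the coefficient and standard error in the example are invented, not values from table 3.

```python
import math


def odds_ratio_with_ci(beta, se, z_crit=1.96):
    """Return (OR, lower limit, upper limit) for a logistic coefficient.

    beta:   estimated regression coefficient
    se:     standard error of beta
    z_crit: normal quantile for the interval (1.96 gives about 95%)
    """
    return (math.exp(beta),
            math.exp(beta - z_crit * se),
            math.exp(beta + z_crit * se))


def association(or_value):
    """Classify the direction of association from the OR."""
    if or_value > 1:
        return "positive"
    if or_value < 1:
        return "negative"
    return "none"


# Hypothetical coefficient: beta = ln(2), so the OR is 2.
or_value, or_lower, or_upper = odds_ratio_with_ci(math.log(2), 0.2)
print(round(or_value, 3), round(or_lower, 3), round(or_upper, 3),
      association(or_value))
```

An interval that excludes 1, as in this invented example, is the usual sign that the association is statistically distinguishable from no association at the chosen level.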
In table 3, the terms have the following meanings. Regression coefficient β: the partial regression coefficient between the prediction variable and the response variable obtained from the logistic regression model; it represents the magnitude and direction of the effect on the response variable of a one-unit increase in the prediction variable (partial regression coefficients can be compared with one another after standardization; the reported β is an estimate). Standard error: the standard deviation of the estimated regression coefficient β of the examined prediction variable, which indicates the precision of the regression coefficient (the larger the standard error, the lower the precision of the estimate). Z value: the z statistic, i.e., the regression coefficient β of the examined prediction variable divided by its standard error, used mainly to determine the P value of the examined prediction variable. P value: the P value corresponding to the z statistic of the examined prediction variable (the smaller the P value, the stronger the evidence that the prediction variable is associated with the response variable).
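The relationship between β, the standard error, the Z value, and the P value can be written out in a few lines. This is a sketch of the usual two-sided Wald test under a standard normal reference distribution; `wald_test` is a hypothetical name and the inputs are invented.

```python
import math


def wald_test(beta, se):
    """Return (z, two-sided P value) for a regression coefficient.

    z is the coefficient divided by its standard error; the P value is the
    probability that a standard normal variable exceeds |z| in absolute value.
    """
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical coefficient and standard error.
z, p_value = wald_test(0.50, 0.25)
print(round(z, 2), round(p_value, 4))
```

A coefficient twice its standard error, as here, gives a P value a little below 0.05, which would also pass the 0.1 test level used in the second round of screening.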
Table 3: second round variable screening of example one
Based on the 6 prediction variables described above, this example constructs a risk prediction model of depressive anxiety disorder and visualizes the constructed model by plotting the corresponding nomograms, see fig. 2a and 2b (in fig. 2b, P < 0.05 is considered statistically significant). Fig. 2a and 2b are two presentations of the nomogram of the depressive anxiety disorder prediction model constructed in accordance with the present invention. In fig. 2a and 2b, the line segment corresponding to each variable (e.g., "drinking", "blood glucose") is marked with a scale that represents the value range of the variable, while the length of the line segment reflects the magnitude of the contribution of the variable to the outcome event (i.e., the occurrence of depressive anxiety disorder). For each value it takes, a variable is assigned a single score, read from the uppermost axis of fig. 2a or 2b ("score" or "β(X-m) term"; the latter can also be understood in terms of the regression coefficient β). The single scores of all variables are then added to obtain the total score. From the total score, the probability of depressive anxiety disorder is read from the lowest axis of fig. 2a or 2b ("risk of anxiety and depression" or "probability of anxiety and depression"). By way of example, in fig. 2a, a subject with a total score of 300 has a risk of developing depressive anxiety disorder of about 0.22 (22%). In addition, in the dynamic nomogram shown in fig. 2b, the black point on the line segment of each variable represents the actual value for a particular subject, and the distribution plot above each line segment shows the distribution of that variable.
The single score of each variable is found on the upper "β(X-m) term" axis from the position of its black point; the total score is then calculated and the corresponding probability of depressive anxiety disorder obtained (in the example of fig. 2b, the probability that the subject develops depressive anxiety disorder is 0.408).
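The way a nomogram turns variable values into single scores, a total score, and a probability can be sketched as below. This follows one common points convention (the variable with the largest possible contribution spans 100 points) and is only an assumption about how such a chart may be built; it is not the scale of fig. 2a or 2b. The variable names `var_a` and `var_b`, the intercept, the coefficients, and the ranges are all invented and do not correspond to table 3.

```python
import math


def build_nomogram(intercept, coefs, ranges):
    """Precompute what is needed to convert values to points.

    coefs:  {variable name: regression coefficient beta}
    ranges: {variable name: (lowest value, highest value)}
    """
    # Reference value of each variable: the end of its range that contributes
    # least to the linear predictor, so that scores are never negative.
    ref = {name: (ranges[name][0] if beta >= 0 else ranges[name][1])
           for name, beta in coefs.items()}
    # The largest contribution any single variable can make defines 100 points.
    max_effect = max(abs(beta) * (ranges[name][1] - ranges[name][0])
                     for name, beta in coefs.items())
    # Linear predictor when every variable sits at its reference value.
    base_lp = intercept + sum(beta * ref[name] for name, beta in coefs.items())
    return ref, max_effect, base_lp


def single_score(name, value, coefs, ref, max_effect):
    """Points contributed by one variable at the given value."""
    return 100.0 * coefs[name] * (value - ref[name]) / max_effect


def probability_from_total(total_score, max_effect, base_lp):
    """Map the total score back to a probability through the logistic function."""
    linear_predictor = base_lp + total_score * max_effect / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical two-variable model and subject.
intercept = -0.5
coefs = {"var_a": 0.8, "var_b": -0.3}
ranges = {"var_a": (0.0, 1.0), "var_b": (3.0, 10.0)}
subject = {"var_a": 1.0, "var_b": 5.0}

ref, max_effect, base_lp = build_nomogram(intercept, coefs, ranges)
scores = {name: single_score(name, subject[name], coefs, ref, max_effect)
          for name in coefs}
total_score = sum(scores.values())
risk = probability_from_total(total_score, max_effect, base_lp)
print(scores, round(total_score, 2), round(risk, 4))
```

Because the points are a linear rescaling of each term β·x, the probability read from the total score equals the probability computed directly from the logistic regression equation; the chart only makes the calculation possible by hand.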
Validation of the prediction model:
The invention uses the data of the training set and the validation set to plot the corresponding receiver operating characteristic (ROC) curves in order to evaluate the sensitivity (also known as the true positive rate) and specificity (also known as the true negative rate) of the constructed prediction model. In fig. 3a and 3b, the abscissa indicates the false positive rate, i.e., 1 - specificity, and the ordinate indicates the true positive rate, i.e., sensitivity. The area under the ROC curve (AUC; the area enclosed by the ROC curve, shown as the solid line in fig. 3a and 3b, and the coordinate axes) is analyzed to assess how well the risk nomogram distinguishes true positives from false positives. For the established prediction model, the AUC of the nomogram is above 0.6 in both data sets (i.e., greater than the area under the dashed line): 66.03% (95% CI: 59.29%-72.76%) in the training set (fig. 3a) and 67.84% (95% CI: 56.97%-78.71%) in the validation set (fig. 3b), demonstrating that the model constructed according to the present invention exhibits good robustness.
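The quantities named above can be computed directly from a set of labels and predicted probabilities. The sketch below uses the rank interpretation of the AUC (the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case); the function names are hypothetical and the six labelled scores are invented, not data from the training or validation set.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold means positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical outcomes (1 = disorder present) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]

auc = roc_auc(labels, scores)
sensitivity, specificity = sensitivity_specificity(labels, scores, 0.5)
print(round(auc, 3), round(sensitivity, 3), round(specificity, 3))
```

Each point on an ROC curve is one such (1 - specificity, sensitivity) pair at a particular threshold, and an AUC of 0.5 corresponds to the diagonal reference line.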
The calibration curve is used to check whether the predicted probability is close to the actual probability. The calibration curves of the two data sets also show good agreement (fig. 4a and 4b; the dashed curve represents the actually observed probability of depressive anxiety disorder, and the solid curve represents the probability of depressive anxiety disorder predicted by the prediction model). The validation results show that the depression anxiety disorder prediction model constructed by the invention has good prediction capability.
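A calibration check of this kind can be approximated by grouping subjects by predicted probability and comparing the mean prediction in each group with the observed event rate. The sketch below uses equal-width bins, which is one simple choice and an assumption here; the smoothing actually used for fig. 4a and 4b is not stated in the text. The helper name and the data are hypothetical.

```python
def calibration_table(labels, probs, n_bins=5):
    """Return (mean predicted, observed rate, count) for each non-empty bin.

    Predictions are grouped into n_bins equal-width bins over [0, 1].
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((y, p))
    table = []
    for members in bins:
        if members:
            mean_predicted = sum(p for _, p in members) / len(members)
            observed_rate = sum(y for y, _ in members) / len(members)
            table.append((mean_predicted, observed_rate, len(members)))
    return table


# Hypothetical outcomes and predicted probabilities.
labels = [0, 0, 1, 1]
probs = [0.1, 0.15, 0.8, 0.9]
table = calibration_table(labels, probs, n_bins=2)
for mean_predicted, observed_rate, count in table:
    print(round(mean_predicted, 3), observed_rate, count)
```

In a well-calibrated model the two columns stay close to each other across bins, which is what closeness of the predicted and observed curves expresses graphically.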
Embodiment two: predicting the probability that a subject develops depressive anxiety disorder using the prediction model described above
Fig. 7 is a schematic diagram of an alternative architecture of a prediction system according to an embodiment of the present invention. To support an exemplary application, terminals (a first terminal 202 and a second terminal 204 are shown by way of example) are connected to the prediction system via a network. The network to which the present invention relates may be a wide area network, a local area network, or a combination of both, and uses wireless links for data transmission. The terminal to which the invention relates may be any of various user terminals, such as a smartphone, a tablet computer, or a notebook computer. The terminal may be used to display an interface for inputting subject data and/or sample data, and an interface for displaying the prediction results of the prediction system.
An exemplary architecture of the prediction system is described below. In some embodiments, as shown in fig. 5, the prediction system 100 may include:
a database 106 for storing data, the types of data including gastrointestinal electrical signal data and clinical data, the data including sample data from a sample population and subject data from a subject;
a data acquisition module 102, configured to acquire the data and store the data in the database 106;
a model training module 108, the model training module 108 training on the sample data using a machine learning algorithm to determine a prediction model (e.g., a depression anxiety disorder prediction model);
a prediction module 112, the prediction module 112 obtaining the subject data via the data acquisition module 102 and invoking a determined prediction model (e.g., a depressive anxiety disorder prediction model) to analyze the subject data to predict a probability of the subject developing a depressive anxiety disorder.
The prediction system 100 may also include a verification module 110, the verification module 110 being configured to evaluate the accuracy of the determined prediction model (e.g., the depression anxiety disorder prediction model), the evaluation indices including the degree of calibration or the degree of discrimination.
Wherein the database 106, the model training module 108, and the verification module 110 may be integrated as a model building module 104.
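The division of responsibilities among the modules can be sketched in code. The classes below mirror the database 106, the data acquisition module 102, and the prediction module 112 in name only; their methods, the record layout, and the field names are assumptions made for illustration. The model training module 108 and verification module 110 are omitted, and an already fitted logistic model is simply passed in. The coefficients are placeholders set to zero, not the fitted values of the invention.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Database:
    """Stands in for database 106: holds sample data and subject data."""
    sample_data: list = field(default_factory=list)
    subject_data: dict = field(default_factory=dict)


class DataAcquisitionModule:
    """Stands in for data acquisition module 102."""

    def __init__(self, database):
        self.database = database

    def store_subject(self, subject_id, record):
        self.database.subject_data[subject_id] = record

    def get_subject(self, subject_id):
        return self.database.subject_data[subject_id]


class PredictionModule:
    """Stands in for prediction module 112; applies a fitted logistic model."""

    def __init__(self, acquisition, intercept, coefs):
        self.acquisition = acquisition
        self.intercept = intercept
        self.coefs = coefs

    def predict(self, subject_id):
        record = self.acquisition.get_subject(subject_id)
        linear_predictor = self.intercept + sum(
            beta * record[name] for name, beta in self.coefs.items())
        return 1.0 / (1.0 + math.exp(-linear_predictor))


# Placeholder coefficients (all zero) for the six prediction variables;
# a real system would load the coefficients produced by model training.
PLACEHOLDER_COEFS = {
    "drinking": 0.0,
    "blood_glucose": 0.0,
    "post_meal_normal_slow_wave_pct": 0.0,
    "post_meal_coupling_pct": 0.0,
    "pre_meal_reaction_area": 0.0,
    "pre_meal_lead_time_difference": 0.0,
}

acquisition = DataAcquisitionModule(Database())
acquisition.store_subject("subject-1", {name: 1.0 for name in PLACEHOLDER_COEFS})
predictor = PredictionModule(acquisition, intercept=0.0, coefs=PLACEHOLDER_COEFS)
print(predictor.predict("subject-1"))
```

Routing every read through the acquisition module, as in this sketch, keeps the prediction module independent of how and where the subject data is stored.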
As an example, in the depression anxiety disorder prediction system of the first embodiment, the gastrointestinal electrical signal data specifically includes the reaction area of the pre-meal intestine portion, the lead time difference of the pre-meal intestine portion, the normal slow wave percentage of the post-meal intestine portion, and the coupling percentage of the post-meal intestine portion, and the clinical data specifically includes drinking and blood glucose.
A specific application scenario of the present invention is given below.
Community A carries out a large-scale screening for depressive anxiety disorder: it collects gastrointestinal electrical signal data, blood glucose and blood lipid data, and individual characteristic data of a target population (for example, middle-aged and elderly people) within the community (the blood glucose and blood lipid data and the individual characteristic data are hereinafter collectively referred to as "clinical data"), and inputs the subject data through the first terminal 202. The subject data is transmitted to the data acquisition module 102 of the prediction system 100 via the network 206. The data acquisition module 102 acquires the subject data from a subject and stores it in the database 106.
The predictive module 112 obtains the subject data and invokes an established predictive model of depressive anxiety disorder to analyze the subject data and predict the probability of the subject developing depressive anxiety disorder.
As an output, the prediction module 112 may generate a report indicating the subject's risk of developing depressive anxiety disorder and transmit the prediction result to the first terminal 202 via the network 206. Community A may set a risk threshold for the occurrence of depressive anxiety disorder in advance (e.g., an occurrence probability of 30%). When the predicted risk of depressive anxiety disorder for a patient (e.g., patient B, with a predicted occurrence probability of 40%) exceeds the set risk threshold, community A should alert patient B or the patient's family members and recommend a visit to a designated or affiliated hospital. When patient B visits the hospital, the hospital conducts a further, more detailed examination to determine whether patient B has the indicated depressive anxiety disorder. The doctor may transmit the diagnosis of patient B to the prediction system 100 through the second terminal 204. The data of patient B (subject data + diagnosis) can be used as new sample data for further training of the depression anxiety disorder prediction model. Of course, the diagnosis of patient B may also be transmitted to the prediction system 100 through the first terminal 202; in other words, the terminal that inputs the subject data and the terminal that transmits the diagnosis of patient B may be the same or different.
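The threshold rule and the feedback of confirmed diagnoses into the sample data can be sketched as follows. The function names, the `diagnosis` field, and the record layout are assumptions for illustration; the 30% threshold and 40% prediction are the example figures from the scenario above.

```python
def needs_referral(predicted_probability, risk_threshold=0.30):
    """Return True when the predicted risk exceeds the preset threshold."""
    return predicted_probability > risk_threshold


def to_sample_record(subject_record, diagnosed):
    """Combine subject data with the hospital diagnosis into a new sample.

    The original subject record is left unchanged; 'diagnosis' is a
    hypothetical field name (1 = disorder confirmed, 0 = not confirmed).
    """
    sample = dict(subject_record)
    sample["diagnosis"] = 1 if diagnosed else 0
    return sample


# Patient B from the scenario: predicted probability 40%, threshold 30%.
patient_b = {"drinking": 1.0, "blood_glucose": 6.1}  # invented values
if needs_referral(0.40):
    print("Alert patient B and recommend a hospital visit")

# After the hospital visit, the diagnosis is fed back as new sample data.
new_sample = to_sample_record(patient_b, diagnosed=True)
print(new_sample)
```

Keeping the threshold as a parameter lets each community choose its own trade-off between missed cases and unnecessary referrals.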
From the above description of the embodiments, it will be clear to those skilled in the art that the above embodiment method may be implemented by means of software plus necessary general hardware platform, or of course by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising several instructions for causing a computer terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to execute the method according to the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present invention and the scope of the claims, which are to be protected by the present invention.

Claims (10)

1. A depression anxiety disorder prediction system, comprising:
a database for storing data, the types of data comprising gastrointestinal electrical signal data and clinical data, the gastrointestinal electrical signal data comprising pre-meal gastrointestinal electrical signal data and post-meal gastrointestinal electrical signal data; the pre-meal gastrointestinal electrical signal data comprises a reaction area of a pre-meal intestine portion and a lead time difference of the pre-meal intestine portion, the post-meal gastrointestinal electrical signal data comprises a normal slow wave percentage of a post-meal intestine portion and a coupling percentage of the post-meal intestine portion, and the clinical data comprises drinking and blood glucose; the data includes sample data from a sample population and subject data from a subject;
a data acquisition module for acquiring the data and storing the data in the database;
a model training module that trains on the sample data using a machine learning algorithm to determine a depression anxiety disorder prediction model for predicting depression and anxiety as a whole, the prediction variables of the depression anxiety disorder prediction model including the reaction area of the pre-meal intestine portion, the lead time difference of the pre-meal intestine portion, the normal slow wave percentage of the post-meal intestine portion, the coupling percentage of the post-meal intestine portion, the drinking, and the blood glucose;
and a prediction module for acquiring the subject data through the data acquisition module and invoking the depression anxiety disorder prediction model to analyze the subject data so as to predict the probability that the subject develops depression anxiety disorder.
2. The system of claim 1, wherein the sample data is divided into a training set and a validation set at a ratio of 7:3.
3. The system of claim 1, wherein the gastrointestinal electrical signal data is acquired simultaneously by leads located at the stomach, gastric antrum, lesser curvature, greater curvature, ascending colon, transverse colon, descending colon, and rectum, respectively.
4. The system of claim 2, further comprising a verification module for evaluating accuracy of the depression anxiety disorder predictive model using the verification set.
5. The system of claim 4, wherein the evaluation index of the evaluation comprises a degree of calibration or a degree of discrimination.
6. A method for constructing a depression anxiety disorder prediction model, comprising the following steps:
S1, acquiring sample data from a sample population, wherein the types of the sample data comprise gastrointestinal electrical signal data and clinical data;
S2, pre-training the sample data to screen out prediction variables of the depression anxiety disorder prediction model, wherein the screened prediction variables comprise the reaction area of the pre-meal intestine portion, the lead time difference of the pre-meal intestine portion, the normal slow wave percentage of the post-meal intestine portion, the coupling percentage of the post-meal intestine portion, drinking, and blood glucose;
and S3, training on the sample data using a machine learning algorithm based on the screened prediction variables to establish a depression anxiety disorder prediction model, wherein the depression anxiety disorder prediction model is used for predicting depression and anxiety as a whole.
7. The method of claim 6, wherein the sample data is divided into a training set and a validation set at a ratio of 7:3.
8. The method of claim 6, wherein the pre-training comprises a first round of variable screening and a second round of variable screening; the first round of variable screening includes LASSO regression analysis and the second round of variable screening includes logistic regression analysis and stepwise regression analysis.
9. The method of claim 7, further comprising S4, evaluating the accuracy of the depression anxiety disorder prediction model using the validation set, the evaluation index of the evaluation comprising a degree of calibration or a degree of discrimination.
10. The method of claim 6, wherein the pre-training comprises a ridge regression or a random forest model.
CN202211496983.3A 2022-11-25 2022-11-25 Depression anxiety disorder prediction system based on gastrointestinal electric signal and construction method thereof Active CN115886818B (en)

Publications (2)

Publication Number Publication Date
CN115886818A CN115886818A (en) 2023-04-04
CN115886818B true CN115886818B (en) 2024-02-09

Family

ID=86489194

Country Status (1)

Country Link
CN (1) CN115886818B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105147248A (en) * 2015-07-30 2015-12-16 华南理工大学 Physiological information-based depressive disorder evaluation system and evaluation method thereof
CN106407695A (en) * 2016-09-28 2017-02-15 湖南老码信息科技有限责任公司 Anxiety disorder prediction method and prediction system based on incremental neural network model
KR20190122429A (en) * 2018-04-20 2019-10-30 고려대학교 산학협력단 Item Response Theory Algorithm Based Computerized Screening System for Depressive and Anxiety disorders
KR20190133581A (en) * 2018-05-23 2019-12-03 한국과학기술원 Method and apparatus for item selection based on machine learning for rapid screening of anxiety and depression in multiple psychological test sites
WO2021184412A1 (en) * 2020-03-18 2021-09-23 浙江大学 Enteric microorganism-based bipolar affective disorder biomarkers, and application thereof in screening

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant