CN116110582A - Health risk assessment method based on pre-training and multitasking bidirectional regulation mechanism

Info

Publication number
CN116110582A
CN116110582A CN202310119327.XA CN202310119327A CN116110582A CN 116110582 A CN116110582 A CN 116110582A CN 202310119327 A CN202310119327 A CN 202310119327A CN 116110582 A CN116110582 A CN 116110582A
Authority
CN
China
Prior art keywords
training
data
task
model
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310119327.XA
Other languages
Chinese (zh)
Inventor
林绍福
王梦真
陈建辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202310119327.XA priority Critical patent/CN116110582A/en
Publication of CN116110582A publication Critical patent/CN116110582A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G16H10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a health risk assessment method based on pre-training and a multi-task bidirectional regulation mechanism, in the fields of intelligent healthcare and deep learning. The method makes use of the large amount of single-visit physical examination data that is usually discarded and uses a pre-trained model to learn representations of the text, which reduces the need for a large training dataset. It also develops a multi-task learning framework with a bidirectional regulation mechanism that fuses pre-training and fine-tuning, narrowing the upstream-downstream gap caused by the model forgetting its pre-trained parameters. The model was trained and tested on data from a medical institution in Hainan Province, and the experiments show that it can effectively improve the accuracy of health risk assessment.

Description

Health risk assessment method based on pre-training and multitasking bidirectional regulation mechanism
Technical Field
The invention relates to the fields of intelligent healthcare, pre-training techniques, and multi-task learning, and in particular to a health risk assessment method based on pre-training and a multi-task bidirectional regulation mechanism.
Background
With the rapid development of internet-based healthcare, the variety and volume of medical data are growing at an unprecedented rate. This makes better health management possible and opens new approaches to preventing, predicting, and treating chronic diseases. In internet applications built on medical data, an effective chronic disease risk prediction model is of great value for chronic disease management. Despite continuing advances in medicine and growing interest in precision medicine, most diagnoses are still made only after patients show obvious signs of disease. Early detection and diagnosis give patients and caregivers the opportunity for early intervention, better disease management, and more efficient allocation of medical resources. For people in the preclinical stage of a chronic disease, or at risk of one, lifestyle changes or effective drug treatment can significantly slow disease progression, so prevention is particularly important.
In recent years, deep learning models have been widely applied in fields such as finance, materials science, and environmental science, while digital medical records such as electronic health records (EHRs) have become more complete, creating more opportunities for deep learning in medicine. A growing body of research models and analyzes real EHR data, advancing clinical question-answering systems, health early-warning models, assisted diagnosis, and electronic prescription recommendation. However, because medical systems are complex and not standardized, data quality is uneven, and cleaning and labeling are time-consuming and labor-intensive, large amounts of well-organized EHR data are difficult to obtain. Transfer learning can mitigate the data dependence of deep learning models and the shortage of training samples. The current trend is to pre-train on a large general-purpose dataset and transfer the acquired knowledge to the target task through fine-tuning, which improves the accuracy of models trained on small samples; many studies have shown that pre-training followed by fine-tuning is effective on natural language processing tasks. However, large-scale pre-training is costly, so many studies build on published pre-trained models. For specialized domains such as medicine, academic papers, and legal documents, general-purpose pre-trained language models can be less effective, because these texts contain many technical terms and sentence patterns that differ from the general text used for pre-training.
In electronic health record data, homogeneous data from different institutions are difficult to obtain, and single-visit records in patients' physical examination histories far outnumber multi-visit records. Most current research focuses on longitudinal visit records and discards single-visit data. Using this large volume of single-visit physical examination data for pre-training keeps the pre-training and fine-tuning datasets similar, which can improve the transfer effect. Existing pre-training schemes usually design several tasks to learn features of the data along different dimensions, but the tasks differ in learning strategy and should carry different weight coefficients, a point that existing pre-training models do not consider. This study uses a pre-trained model to learn representations of the text and applies the large amount of otherwise discarded single-visit physical examination data, which provides a better model initialization and helps avoid overfitting on small samples in the subsequent fine-tuning task. It also reduces the need for large training datasets and may yield more general models. In addition, a dynamic adjustment mechanism for the loss weights is added to the pre-training tasks so that each task is optimized as fully as possible without interfering with the others.
To this end, this patent proposes a health risk assessment model based on pre-training and a multi-task bidirectional adjustment mechanism. The pre-training tasks of the Transformer model are adapted: a random masking operation is incorporated, and a self-composition prediction task and a target prediction task are designed so that the model learns both the interrelationships among the elements and the relationship between those elements and the target element. A multi-task learning framework with a bidirectional adjustment mechanism is then developed to fuse pre-training and fine-tuning, narrowing the upstream-downstream gap caused by the model forgetting its pre-trained parameters and thereby preventing the overfitting caused by local optima in the small-sample setting.
Disclosure of Invention
The invention aims to provide a health risk assessment model, based on pre-training and a multi-task bidirectional regulation mechanism, that outputs an estimated probability of health risk. First, adapted pre-training tasks are proposed and a Transformer model is used for pre-training. Second, a multi-task framework with a bidirectional adjustment mechanism is proposed to fuse pre-training and fine-tuning and to address the overfitting caused by the model forgetting its pre-trained parameters: the pre-trained parameters are adjusted through several gradient updates, and the whole model is then trained on the downstream prediction task with the updated parameters, which speeds up convergence.
The specific steps of the invention are as follows:
(1) Acquire a personal health record dataset, preprocess the data, divide it into single-visit and multi-visit records, and store the single-visit and multi-visit data as PKL files.
(2) Build dictionary-based embedded representations from the PKL files of step (1), and pre-train a Transformer on the single-visit PKL data to construct the pre-training module, so that the information in the single-visit data is fully exploited.
(3) Using the multi-visit PKL data, aggregate all visit embeddings, build a health risk assessment model on top of the pre-trained model of step (2), and predict the current risk state.
(4) Connect the pre-training module and the health risk assessment module with a multi-task learning framework and perform bidirectional adjustment between them. This prevents the overfitting caused by local optima in the small-sample setting and ensures that, in the overall framework, risk assessment is the main task and pre-training is a subtask, so that the otherwise discarded data assists the main task.
In step (1), the dataset is constructed from personal health examination data (person_health_exam) obtained from a medical institution in Hainan Province. The examination data include general physical examination results, laboratory tests, lifestyle habits, and other information. Hypertension is selected as the chronic disease for health risk assessment. According to the 2018 revision of the Chinese guidelines for the prevention and treatment of hypertension, the risk factors associated with the onset of hypertension include overweight and obesity, excessive drinking, age, lack of physical activity, and dyslipidemia. The experiment therefore joins the de-identified health examination records with the personal records and retains 20 attributes (e.g., age, waist circumference, BMI). Height, weight, waist circumference, and BMI reflect overweight and obesity; total cholesterol (TCHO), triglycerides (TG), serum low-density lipoprotein cholesterol (LDLC), and serum high-density lipoprotein cholesterol (HDLC) reflect dyslipidemia; and exercise frequency (EXERCISE_FREQ_CODE) and drinking frequency (DRINKING_FREQ_CODE) reflect lack of physical activity and excessive drinking. In addition, body temperature, heart rate, pulse, systolic pressure, diastolic pressure, and similar measurements are included to reflect the patient's current physical state objectively. The model in this study relies mainly on objective anthropometric data and also incorporates lifestyle data such as smoking, drinking, and exercise frequency.
Whether an examination record was taken before the onset of illness, and within three years of it, is determined from two fields in the basic personal health record information (PERSON_INFO), "whether the person has hypertension" and "date of hypertension diagnosis", combined with the "examination time" field of the personal health examination data. An examination record taken within the three years before diagnosis indicates that the patient became ill within three years and is a positive sample. Records from people never diagnosed with hypertension, and records taken more than three years before diagnosis, are negative samples.
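The labeling rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `label_exam`, the parameter names, and the choice of 3 × 365 days for "three years" are hypothetical and do not come from the patent, and examinations taken on or after the diagnosis date are simply reported as unusable.

```python
from datetime import date
from typing import Optional

# Assumption: "within three years" is approximated as 3 * 365 days.
WINDOW_DAYS = 3 * 365


def label_exam(exam_date: date,
               has_hypertension: bool,
               diagnosis_date: Optional[date]) -> Optional[int]:
    """Return 1 (positive), 0 (negative), or None (record cannot be labeled)."""
    if not has_hypertension:
        return 0  # never diagnosed with hypertension: negative sample
    if diagnosis_date is None:
        return None  # diagnosed but no date recorded: cannot be labeled
    if exam_date >= diagnosis_date:
        return None  # not a pre-illness examination
    days_before = (diagnosis_date - exam_date).days
    # Within three years before diagnosis: positive; earlier: negative.
    return 1 if days_before <= WINDOW_DAYS else 0
```

For example, an examination on 2019-01-01 for a patient diagnosed on 2020-06-01 would be labeled positive, while one on 2015-01-01 for the same patient would be labeled negative.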
In step (2), each electronic health record is converted into a set of multidimensional embeddings that serve as input to the subsequent modules. Each record contains the patient ID, physical examination ID, age, body temperature, pulse rate, respiratory rate, left systolic pressure, left diastolic pressure, right systolic pressure, right diastolic pressure, height, weight, waist circumference, BMI, heart rate, total cholesterol, triglycerides, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, exercise frequency code, smoking status code, and drinking frequency code. The embedded representation of a record is built from the one-hot encoding of each dictionary: every column has its own dictionary, and the embedding input for a value is its index in that dictionary, which is used in the subsequent model pre-training.
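A minimal sketch of the per-column dictionary encoding described above, assuming each record is a plain Python dict keyed by column name. The helper names `build_vocabs`, `encode`, and `one_hot` are hypothetical, and the sorted ordering of dictionary entries is an arbitrary choice made here for reproducibility.

```python
from typing import Dict, Hashable, List, Sequence


def build_vocabs(records: Sequence[dict],
                 columns: Sequence[str]) -> Dict[str, Dict[Hashable, int]]:
    """Build one dictionary per column, mapping each distinct value to an index."""
    vocabs: Dict[str, Dict[Hashable, int]] = {}
    for col in columns:
        values = sorted({rec[col] for rec in records})
        vocabs[col] = {value: idx for idx, value in enumerate(values)}
    return vocabs


def encode(record: dict, columns: Sequence[str],
           vocabs: Dict[str, Dict[Hashable, int]]) -> List[int]:
    """Replace each attribute value with its index in that column's dictionary."""
    return [vocabs[col][record[col]] for col in columns]


def one_hot(index: int, size: int) -> List[int]:
    """One-hot vector equivalent to looking up `index` in an embedding table."""
    return [1 if j == index else 0 for j in range(size)]
```

The index list returned by `encode` is what an embedding layer would consume; the one-hot form is shown only to make the correspondence with the text explicit.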
Similar to BERT, we pre-train on single-visit data. First, a random masking operation is applied to the patient's complete physical examination record
[Formula image BDA0004079480820000051]
Because the data are single-visit data, the visit index t can be ignored, and each attribute value of patient n is obtained as:
[Formula image BDA0004079480820000052]
where 0 ≤ i ≤ |P|, i is an integer, and |P| is the number of attributes in one physical examination record. random_mask() is the random masking function, which can be expressed as:
[Formula image BDA0004079480820000061]
The complete visit record, admissions, can thus be expressed as:
[Formula image BDA0004079480820000062]
that is, as the concatenation adm[0], adm[1], …, adm[|P|].
Unlike BERT, the position embedding is removed because the attribute values have no inherent order, and the segment embedding has no practical meaning because the single-visit records are not correlated with one another. Only the word embedding is therefore retained, and the model's prediction depends on the attribute values. In this study, some vital-sign or lifestyle data are missing because examination items were recorded incompletely or incorrectly. Although we fill missing items with neighboring values from the existing data, the model may still have difficulty adapting to downstream tasks when important examination records are absent. Therefore, building on the random masking operation, we design two pre-training tasks to strengthen the model's self-prediction and target-prediction performance.
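The random masking step can be illustrated as follows. The 80% / 10% / 10% split between `[MASK]`, unchanged, and random replacement is taken from claim 3; everything else is an assumption made for this sketch, including the function signatures, the `select_prob` parameter that decides which attributes are candidates for masking, and the use of raw attribute values rather than embeddings.

```python
import random
from typing import List, Sequence

MASK = "[MASK]"


def random_mask(value, vocab: Sequence, rng: random.Random):
    """Apply the 80/10/10 rule to one selected attribute value."""
    r = rng.random()
    if r < 0.8:
        return MASK                   # 80%: replace with the mask token
    if r < 0.9:
        return value                  # 10%: keep the original value
    return rng.choice(list(vocab))    # 10%: random value from the same dictionary


def mask_record(record: List, vocabs: List[Sequence],
                select_prob: float, rng: random.Random) -> List:
    """Mask each attribute independently with probability `select_prob` (assumed)."""
    return [
        random_mask(value, vocabs[i], rng) if rng.random() < select_prob else value
        for i, value in enumerate(record)
    ]
```

Because no position or segment embedding is used, the masked list can be fed to the encoder as an unordered set of per-column tokens.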
Self-composition prediction task: because the patient visit records are randomly masked during pre-training, a self-composition prediction task is designed to give the model stronger self-prediction capability. The loss function of the self-composition prediction task is as follows:
[Formula image BDA0004079480820000063]
[Formula image BDA0004079480820000071]
where P^(n)[i] denotes the i-th attribute value in the visit record of the n-th patient; adm^(n)[i] denotes the embedding of the i-th attribute value of the n-th patient's visit record after random masking and processing by the Transformer model; and p_i ∈ {VOC_i \ P^(n)[i]} denotes any value in the i-th dictionary VOC_i other than P^(n)[i]. The smaller the value of
[Formula image BDA0004079480820000074]
the more likely the predicted attribute value is to equal the true value, and the less likely it is to equal any other value.
Target prediction task: the pre-training is set up so that the model can ultimately predict whether the patient has hypertension, so this task defines a loss function for the final prediction target as follows:
[Formula image BDA0004079480820000072]
where
[Formula image BDA0004079480820000073]
denotes the probability of predicting "is_hyper" from each attribute, obtained from the embedding of the n-th patient's attributes after random masking and processing by the Transformer model; the objective is to minimize the target prediction task loss.
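The exact loss formulas are given only as images in the source, so the following is a hedged stand-in rather than the patent's definition: it assumes the self-composition prediction task uses a per-attribute softmax cross-entropy over each dictionary and the target prediction task uses a binary cross-entropy on "is_hyper". All function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one dictionary's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def self_prediction_loss(logits_per_attr: List[Sequence[float]],
                         true_idx: List[int]) -> float:
    """Assumed form: mean cross-entropy of recovering each attribute's true index."""
    losses = [-math.log(softmax(logits)[t])
              for logits, t in zip(logits_per_attr, true_idx)]
    return sum(losses) / len(losses)


def target_prediction_loss(prob_hyper: float, is_hyper: int) -> float:
    """Assumed form: binary cross-entropy on the predicted probability of is_hyper."""
    eps = 1e-12
    p = min(max(prob_hyper, eps), 1.0 - eps)
    return -(is_hyper * math.log(p) + (1 - is_hyper) * math.log(1.0 - p))
```

Both functions decrease as the model assigns more probability to the true value, which matches the qualitative behavior the text describes for the two pre-training tasks.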
The self-composition prediction task and the target prediction task are combined into the final optimization objective. The weights of the total loss function are adjusted dynamically so that both pre-training tasks are optimized as fully as possible without interfering with each other, which prevents one task from dominating and leaving the other insufficiently optimized.
The total loss function is expressed as follows:
[Formula image BDA0004079480820000081]
Let the dynamically adjusted weight parameters be a and b. Suppose the gradient of
[Formula image BDA0004079480820000082]
is greater than that of
[Formula image BDA0004079480820000083]
so that the value of
[Formula image BDA0004079480820000084]
is also greater than that of
[Formula image BDA0004079480820000085]
Then, at the first parameter update, because the value of
[Formula image BDA0004079480820000086]
is itself larger and the negative gradient of a is -loss_self, a decreases by more than b does. At the next update, the update direction of the model parameters is
[Formula image BDA0004079480820000087]
Because a is now smaller than b, the influence of
[Formula image BDA0004079480820000088]
on the other gradients is also smaller, which raises the probability that the other loss terms move toward their current minima, so that the minimum of the final loss lies closer to the minimum of each individual loss.
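The weight dynamics described above can be reproduced with a toy calculation. This sketch assumes the total pre-training loss has the form a · loss_self + b · loss_target with a and b treated as learnable scalars; that form is inferred from the statement that the negative gradient of a is -loss_self and is not confirmed by the source, whose formula is an image. The names `total_pretrain_loss`, `update_weights`, and `lr` are hypothetical.

```python
from typing import Tuple


def total_pretrain_loss(a: float, b: float,
                        loss_self: float, loss_target: float) -> float:
    """Assumed form of the combined objective: a * loss_self + b * loss_target."""
    return a * loss_self + b * loss_target


def update_weights(a: float, b: float,
                   loss_self: float, loss_target: float,
                   lr: float) -> Tuple[float, float]:
    """One gradient-descent step on the weights themselves.

    d(total)/da = loss_self and d(total)/db = loss_target, so the weight
    attached to the larger loss shrinks faster.
    """
    return a - lr * loss_self, b - lr * loss_target
```

Starting from a = b = 1 with loss_self = 2.0 and loss_target = 0.5, one step at lr = 0.1 gives a smaller than b, so the larger loss contributes less to the next model update. Note that under this assumed form nothing stops the weights from shrinking toward zero; the source does not say how that is handled, so a practical implementation would need an additional constraint or regularizer.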
In step (3), the final objective is to predict health risk from multiple sequential physical examination records. We aggregate all visit embeddings and add a prediction layer for the hypertension risk prediction task. Specifically, the risk assessment for a patient's t-th physical examination depends on the previous t-1 examination records, their disease outcome labels, and the t-th examination record, so the current risk state can be predicted more effectively from the patient's multiple examination records:
[Formula image BDA0004079480820000089]
The attribute embeddings of the previous t-1 examination records are averaged and concatenated with the embedding of the t-th examination record, and the predicted risk probability is output. The true label of the t-th physical examination is defined as
[Formula image BDA00040794808200000810]
and the total loss function is defined as follows:
[Formula image BDA0004079480820000091]
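The aggregation described in step (3), averaging the earlier visit embeddings and concatenating the result with the current visit, can be sketched as below. The function name `aggregate_visits` and the representation of each visit as a flat list of floats are assumptions for illustration; the prediction layer that consumes the result is not shown.

```python
from typing import List


def aggregate_visits(visit_embeddings: List[List[float]]) -> List[float]:
    """Mean of the first t-1 visit embeddings, concatenated with the t-th one."""
    *previous, current = visit_embeddings
    if not previous:
        raise ValueError("at least two visits are needed for multi-visit prediction")
    dim = len(current)
    mean_prev = [sum(v[j] for v in previous) / len(previous) for j in range(dim)]
    return mean_prev + list(current)
```

The output has twice the dimension of a single visit embedding, so the input width of the prediction layer stays fixed regardless of how many earlier visits a patient has.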
In step (4), the study uses multi-task learning to connect the pre-training module and the disease risk prediction module and performs bidirectional adjustment between them. This prevents the overfitting caused by local optima in the small-sample setting, eliminates the gap between the pre-trained model and the prediction model, and ensures that, in the overall framework, disease prediction is the main task and pre-training is a subtask, so that the otherwise discarded data assists the main task.
To bias the model as a whole toward the hypertension prediction task, this study uses a MAML update strategy and incorporates the idea of "eliminating the gap between the pre-trained model and the prediction model". For the pre-training task, the prior parameters of the Transformer model
[Formula image BDA0004079480820000092]
are adjusted through one or several gradient descent steps. With the learning rate for the dual adaptation of the pre-training task set to α, the new prior parameters
[Formula image BDA0004079480820000093]
can be expressed as:
[Formula image BDA0004079480820000094]
The parameters ω' of the hypertension prediction model can then be expressed as:
[Formula image BDA0004079480820000095]
Further, we define the total loss function as follows:
[Formula image BDA0004079480820000096]
where λ_total_pre denotes the weight of the pre-training task loss function; it is typically set to a value less than 1 so that the multi-task framework is biased toward the hypertension prediction module.
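A toy sketch of the update scheme described above, under explicit assumptions: the inner step is the standard MAML-style rule φ' = φ − α · ∇L_pre(φ), and the overall objective is taken to be L_pred + λ_total_pre · L_pre. Neither formula is visible in the source (both are images), so these forms are assumptions, and scalar parameters with a hand-written gradient stand in for the real Transformer parameters.

```python
from typing import Callable


def inner_adapt(phi: float, alpha: float,
                grad_pre: Callable[[float], float], steps: int = 1) -> float:
    """Adapt the prior parameter with `steps` gradient-descent steps on the pre-training loss."""
    for _ in range(steps):
        phi = phi - alpha * grad_pre(phi)
    return phi


def total_loss(loss_pred: float, loss_pre: float, lam_total_pre: float) -> float:
    """Assumed combined objective; lam_total_pre < 1 favors the prediction task."""
    return loss_pred + lam_total_pre * loss_pre


def toy_grad_pre(phi: float) -> float:
    """Gradient of the toy pre-training loss (phi - 1)^2, used only for illustration."""
    return 2.0 * (phi - 1.0)
```

With α = 0.25 and φ = 0, one inner step moves φ to 0.5 and a second to 0.75, approaching the toy optimum at 1. In a real implementation the adapted parameters would initialize the prediction model, and the combined loss would be backpropagated through both modules.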
Drawings
Fig. 1 is a diagram of the overall architecture of the present invention. The overall architecture is divided into three modules, namely an input module, model details and an output module. The input is divided into single-time diagnosis data input and multiple-time diagnosis data input, the single-time diagnosis data is input into a pre-training model, two pre-training tasks are set to be self-composition prediction tasks and target prediction tasks respectively, the pre-trained model is sent into a downstream prediction task and is trained by using multiple-time diagnosis data, the model obtained after training is sent into a multi-task frame with combined pre-training and prediction, training is carried out again, the purpose of double adjustment is achieved, and finally a model prediction result is output.
Detailed Description
For a better understanding of the technical solution of the present invention, the following detailed description of the embodiments of the present invention refers to the accompanying drawings.
It should be understood that the described embodiments are merely some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Example 1
Fig. 1 shows the overall architecture of the present invention.
The overall architecture is divided into three modules, namely an input module, model details and an output module. The input is divided into single-time diagnosis data input and multiple-time diagnosis data input, the single-time diagnosis data is input into a pre-training model, two pre-training tasks are set to be self-composition prediction tasks and target prediction tasks respectively, the pre-trained model is sent into a downstream prediction task, multiple-time diagnosis data are used for training, the model obtained after training is sent into a multi-task frame combining the pre-training with the downstream prediction for training again, the purpose of double adjustment is achieved, and finally a model prediction result is output.
The foregoing has described in detail embodiments of the invention, which are presented herein with particular reference to the drawings and are presented solely to aid in the understanding of the invention; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present invention, the present description should not be construed as limiting the present invention in view of the above.

Claims (5)

1. A health risk assessment method based on a pre-training and multi-task bidirectional regulation mechanism is characterized by comprising the following steps of:
step (1) acquiring a personal health record data set, preprocessing the personal health record data, dividing the personal health record data into single diagnosis and multiple diagnosis, and storing the single diagnosis and multiple diagnosis data as PKL files;
step (2) building dictionary-based embedded representations from the PKL files of step (1), pre-training a Transformer on the single-visit PKL data, constructing a pre-training module, and fully mining the single-visit data information;
step (3) using the multi-visit PKL data, aggregating all visit embeddings, constructing a health risk assessment model based on the pre-trained model of step (2), and predicting the current risk state;
step (4) connecting the pre-training module and the health risk assessment module with a multi-task learning framework, and performing bidirectional adjustment between the pre-training module and the risk assessment module to prevent overfitting caused by local optima in the small-sample setting, so that in the overall framework risk assessment is the main task, pre-training is a secondary task, and the discarded data is used to assist the main task.
2. The method for assessing health risk based on a pretraining and multitasking bidirectional adjustment mechanism of claim 1, wherein in step (1), the step of preprocessing the personal health profile data comprises:
(1) The hypertension is selected as a chronic disease for risk assessment, and the risk factors related to the onset of hypertension are determined as follows: overweight and obese, excessive drinking, age, lack of physical labor and dyslipidemia;
(2) Deduplicating the extracted data and filling blank values with adjacent values;
(3) Determining whether a physical examination record was taken before the onset of illness and within three years of it, according to the two fields "whether hypertension is present" and "date of hypertension diagnosis", combined with the examination time of the physical examination data;
(4) Dividing the dataset into a single-visit dataset and a multi-visit dataset, stored as single.pkl and multi.pkl respectively;
(5) Building a dictionary for each coded attribute in the single-visit and multi-visit files (an age dictionary, a body temperature dictionary, a pulse rate dictionary, and so on), the dictionary data being divided into that of the multi-visit data and that covering both multi-visit and single-visit data;
(6) The "patient ID" in the multi-visit dataset multi.pkl was randomly divided into training, validation and test sets at 4:1:1.
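The 4:1:1 split in item (6) is done by patient ID rather than by record, so that all visits of a patient fall into the same subset. A minimal sketch follows, in which the function name `split_patient_ids`, the fixed seed, and the rounding rule for sizes that are not multiples of six are assumptions.

```python
import random
from typing import Hashable, Iterable, List, Tuple


def split_patient_ids(patient_ids: Iterable[Hashable],
                      seed: int = 0) -> Tuple[List, List, List]:
    """Randomly split unique patient IDs into train/validation/test at 4:1:1."""
    ids = sorted(set(patient_ids))        # deduplicate; sort for reproducibility
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * 4 // 6
    n_val = len(ids) // 6
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]          # remainder goes to the test set
    return train, val, test
```

Records are then assigned to a subset by looking up their patient ID, which avoids leaking a patient's earlier visits into the evaluation data.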
3. The method of claim 1, wherein in step (2), pre-training is performed on the single-visit data so that the originally discarded data is put to use; a visit embedding is generated for each EHR record from its dictionary embeddings, [CLS] is used as the start marker of each visit embedding sequence, and each visit embedding may need to be padded so that all input vectors have the same length; based on the dictionary embedded representation, the single-visit data is pre-trained with a Transformer to fully mine the single-visit data information, as follows:
(1) Converting each electronic health record into a set of multidimensional embeddings as input to the subsequent modules; the embedded representation of a record is built from the one-hot encoding of each dictionary, that is, each column has its own dictionary, and the embedding input is the index value in that dictionary, for use in subsequent model pre-training;
(2) Pre-training on the single-visit data with a Transformer; two pre-training tasks are designed to strengthen the model's self-prediction and target-prediction performance:
self-composition prediction task: during pre-training, a random masking operation is applied to the patient visit records, in which some embeddings are randomly masked: a selected code is replaced by [MASK] 80% of the time, left unchanged 10% of the time, and replaced by a random code 10% of the time; the self-composition prediction task is designed on this basis so that the model has stronger self-prediction capability;
target prediction task: the pre-training is set up so that whether the patient has hypertension can ultimately be predicted, so this task performs model prediction for the final prediction target.
4. The health risk assessment method based on the pre-training and multi-task bidirectional regulation mechanism according to claim 1, wherein in the step (3), the final chronic disease early warning using the MLP module proceeds as follows:
(1) Converting the EHR data of multiple visits into embedded health examination features: obtaining, over the earlier visits in the multi-visit data, the average values of patient ID, physical examination ID, age, body temperature, pulse rate, respiratory rate, left systolic pressure, left diastolic pressure, right systolic pressure, right diastolic pressure, height, weight, waist circumference, BMI, heart rate, total cholesterol, triglycerides, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, exercise frequency code, smoking status code and drinking frequency code, concatenating these averages with the embedding of the most recent health examination features, and inputting the result into the prediction module;
(2) Obtaining whether hypertension is present at prediction time t and using it as the classification label; the Jaccard, F1 and PR-AUC metrics are computed jointly from the predicted and true values to evaluate the model.
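The feature construction and evaluation in this claim can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the helper names are hypothetical, each visit is assumed to be a plain list of numeric features, a single-visit patient falls back to that visit for the average, and PR-AUC is approximated by average precision.

```python
def build_features(visits):
    """Concatenate the mean of the earlier visits with the latest visit.

    visits: list of per-visit numeric feature lists, oldest first.
    """
    *earlier, last = visits
    if earlier:
        n = len(earlier)
        mean = [sum(col) / n for col in zip(*earlier)]
    else:
        mean = list(last)  # assumption: no earlier visits available
    return mean + list(last)

def binary_metrics(y_true, y_pred):
    """Return (jaccard, f1) for binary labels, positive class = 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    jaccard = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return jaccard, f1

def average_precision(y_true, scores):
    """Step-wise estimate of the area under the precision-recall curve."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, ap = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            ap += tp / rank  # precision at each recalled positive
    positives = sum(y_true)
    return ap / positives if positives else 0.0
```

The vector returned by `build_features` corresponds to the input of the prediction module, and the three metric values correspond to the Jaccard, F1 and PR-AUC indexes named in step (2).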
5. The health risk assessment method based on the pre-training and multi-task bidirectional regulation mechanism according to claim 1, wherein in the step (4), the pre-training module and the health risk assessment module are connected by multi-task learning, and bidirectional adjustment is performed between the pre-training module and the risk assessment module, as follows:
(1) Adopting a MAML update strategy that incorporates the idea of eliminating the gap between the pre-training model and the prediction model; for a pre-training task, the prior parameters φ of the Transformer model are adjusted through one or more gradient descent steps with learning rate α, performing dual adaptation of the pre-training task;
(2) Obtaining the pre-trained prior parameters and the hypertension early-warning model parameters;
(3) Updating the parameters by back-propagating the overall loss function defined by the multi-task learning framework, in which dynamic weights are used to balance the loss functions of the pre-training task and the final task.
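The three steps above can be illustrated on a toy problem with scalar parameters. Everything specific in this sketch is an assumption: the quadratic losses, the targets `a` and `b`, the outer learning rate `beta`, and in particular the dynamic weighting rule (each loss weighted by its share of the total), since the claim does not specify how the dynamic weights are computed.

```python
def inner_adapt(phi, alpha, a):
    """One gradient step on the toy pre-training loss (phi - a)^2."""
    return phi - alpha * 2.0 * (phi - a)

def train(steps=200, alpha=0.1, beta=0.1, a=1.0, b=3.0, eps=1e-12):
    """Jointly update the prior phi and a task parameter theta.

    Toy losses (assumed): L_pre = (phi - a)^2 and
    L_task = (phi_adapted + theta - b)^2.
    """
    phi, theta = 0.0, 0.0
    history = []
    for _ in range(steps):
        # Step (1): adapt the prior with one inner gradient step.
        phi_adapted = inner_adapt(phi, alpha, a)
        # Step (2): evaluate both tasks with the current parameters.
        loss_pre = (phi - a) ** 2
        residual = phi_adapted + theta - b
        loss_task = residual ** 2
        total = loss_pre + loss_task
        history.append(total)
        # Assumed dynamic weights, treated as constants in the gradient.
        w_pre = loss_pre / (total + eps)
        w_task = loss_task / (total + eps)
        # Step (3): back-propagate the weighted overall loss; the task
        # gradient flows through the inner step (d phi_adapted / d phi).
        d_adapted_d_phi = 1.0 - 2.0 * alpha
        grad_phi = (w_pre * 2.0 * (phi - a)
                    + w_task * 2.0 * residual * d_adapted_d_phi)
        grad_theta = w_task * 2.0 * residual
        phi -= beta * grad_phi
        theta -= beta * grad_theta
    return phi, theta, history
```

Because the task loss is evaluated at the adapted parameter, its gradient also updates the prior φ, while the pre-training loss shapes the starting point for the task; this is one way to read the bidirectional adjustment between the two modules.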
CN202310119327.XA 2023-02-15 2023-02-15 Health risk assessment method based on pre-training and multitasking bidirectional regulation mechanism Pending CN116110582A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310119327.XA CN116110582A (en) 2023-02-15 2023-02-15 Health risk assessment method based on pre-training and multitasking bidirectional regulation mechanism

Publications (1)

Publication Number Publication Date
CN116110582A true CN116110582A (en) 2023-05-12

Family

ID=86265322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310119327.XA Pending CN116110582A (en) 2023-02-15 2023-02-15 Health risk assessment method based on pre-training and multitasking bidirectional regulation mechanism

Country Status (1)

Country Link
CN (1) CN116110582A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116509350A (en) * 2023-07-03 2023-08-01 云天智能信息(深圳)有限公司 Medical monitoring system based on intelligent bracelet
CN116509350B (en) * 2023-07-03 2023-09-29 云天智能信息(深圳)有限公司 Medical monitoring system based on intelligent bracelet
CN117116476A (en) * 2023-07-04 2023-11-24 中国医学科学院阜外医院 Downstream task prediction method and device and computer readable storage medium
CN117116476B (en) * 2023-07-04 2023-12-19 中国医学科学院阜外医院 Downstream task prediction method and device and computer readable storage medium

Similar Documents

Publication Publication Date Title
Zhang et al. Patient2vec: A personalized interpretable deep representation of the longitudinal electronic health record
Kittredge et al. Where is the effect of frequency in word production? Insights from aphasic picture-naming errors
WO2023202508A1 (en) Cognitive graph-based general practice patient personalized diagnosis and treatment scheme recommendation system
CN116110582A (en) Health risk assessment method based on pre-training and multitasking bidirectional regulation mechanism
Zeng et al. Identifying breast cancer distant recurrences from electronic health records using machine learning
Chen et al. Early short-term prediction of emergency department length of stay using natural language processing for low-acuity outpatients
CN112182168B (en) Medical record text analysis method and device, electronic equipment and storage medium
CN114783603A (en) Multi-source graph neural network fusion-based disease risk prediction method and system
Sharma et al. A diabetes monitoring system and health-medical service composition model in cloud environment
Wang et al. Development and evaluation of novel ophthalmology domain-specific neural word embeddings to predict visual prognosis
Gavrilov et al. Feature extraction method from electronic health records in Russia
Fu et al. A hybrid model to identify fall occurrence from electronic health records
Mohammadi et al. Learning to identify patients at risk of uncontrolled hypertension using electronic health records data
RU2752792C1 (en) System for supporting medical decision-making
CN116959715B (en) Disease prognosis prediction system based on time sequence evolution process explanation
Bertl et al. Evaluation of deep learning-based depression detection using medical claims data
He et al. A multi-attention collaborative deep learning approach for blood pressure prediction
Abd Elkader et al. A framework for chronic kidney disease diagnosis based on case based reasoning
Zhang et al. Survival topic models for predicting outcomes for trauma patients
CN113658688B (en) Clinical decision support method based on word segmentation-free deep learning
Shahn et al. G-computation and hierarchical models for estimating multiple causal effects from observational disease registries with irregular visits
Chattopadhyay MLMI: A machine learning model for estimating risk of myocardial infarction
Saif et al. Deep-kidney: an effective deep learning framework for chronic kidney disease prediction
Krishnaveni et al. An ANN based Screener for the early diagnose of Polycystic Ovarian Syndrome in adolescent and young women
Samuel et al. Computational intelligence enabling the development of efficient clinical decision support systems: case study of heart failure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination