WO2023083485A1

WO2023083485A1 - A method for operating a patient survival assessment system and a corresponding data processing system

Info

Publication number: WO2023083485A1
Application number: PCT/EP2021/085950
Authority: WO
Inventors: Ammar Shaker; Bhushan Kotnis
Original assignee: NEC Laboratories Europe GmbH
Priority date: 2021-11-15
Filing date: 2021-12-15
Publication date: 2023-05-19

Abstract

An efficient method for operating a patient survival assessment system by means of a data processing system is developed, comprising the following steps: collecting health data from a patient suffering from a disease; collecting patient's historical data for at least one disease; and performing a survivability analysis of the patient comprising transfer learning considering the health data and/or the historical data and using conditional von Neumann loss. Further, a corresponding data processing system is provided comprising: collecting means for collecting health data from a patient suffering from a disease; collecting means for collecting patient's historical data for at least one disease; and performing means for performing a survivability analysis of the patient comprising transfer learning considering the health data and/or the historical data and using conditional von Neumann loss.

Description

A METHOD FOR OPERATING A PATIENT SURVIVAL ASSESSMENT SYSTEM AND A CORRESPONDING DATA PROCESSING SYSTEM

The present invention relates to a method for operating a patient survival assessment system by means of a data processing system.

Further, the present invention relates to a corresponding data processing system for patient survival assessment.

Corresponding prior art documents are listed as follows:

[1] Ramakanth, G. S. H., et al. "A randomized, double blind placebo controlled study of efficacy and tolerability of Withaina somnifera extracts in knee joint pain." Journal of Ayurveda and integrative medicine 7.3, 2016: 151-157.

[2] Y. Ganin, E. Ustinova, et al. Domain-adversarial training of neural networks. JMLR, 17(1):2096-2030, 2016.

[3] Siddhant, Aditya, et al. "XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization.", 2020.

Further, “Predicting life expectancy with a long short-term memory recurrent neural network using electronic medical records”, February 28, 2019, Merijn Beeksma, Suzan Verberne, Antal van den Bosch, Enny Das, Iris Hendrickx, Stef Groenewoud, discloses a long short-term memory, LSTM, recurrent neural network system or machine learning for predicting life expectancy of patients using electronic medical records by comparing the Electronic Medical Record, EMR, data - patient disease history - with medical data/text extracted from medical scientific literature. Furthermore, a learning module provides automatic medical decision support and automatic diagnostics.

US 10 867702 B2 discloses a method and system for utilizing machine learning and statistical techniques to predict drug response phenotypes for patients for bearing medical treatment, patient longevity and quality of life. Furthermore, the method discloses the use of patient EMR data including patient disease history to predict drug response phenotypes for patients by using machine learning. The prediction score and the EMR data are compared with the data as stored in a pharmaceutical research database for making clinical decisions.

US 2015/0339442 A1 discloses a computer software or machine learning method for analyzing patient EMR data and generating a survival score. Further, the learning module compares the EMR data of the patient with a global medical database and generates treatment decisions and plans for the patient to improve patient survivability.

Finally, “Transfer learning for decision support in Covid-19 detection from a few images in big data”, May 14, 2021, Divydharshini Karthikeyan, Aparna S. Varde, Weitian Wang, discloses a transfer learning model for automatic decision support in Covid-19 detection by extracting X-ray images from an electronic health record of the patient and then comparing the images with benchmark data to obtain a medical diagnosis and thereby improve the patient survival score.

Numerous countries in Europe and Asia, such as Lithuania, Japan, Germany and Spain, are aging rapidly, which creates massive pressure on the health care systems in these countries. Additionally, the pace of research on the effects of diet and drug supplements on longevity, aging, immunity, cognitive decline, etc., has increased tremendously. Thus, it is difficult for medical caregivers to keep abreast of the research for providing optimal care to their terminally ill patients. For example, recent research [1] suggests that supplements popularly known as “Ashwagandha” can reduce osteoarthritic pain. Additionally, diagnostics such as heart rate variability, HRV, and certain blood biomarkers can allow for predicting the 1-, 2- and 5-year survivability of patients with terminal conditions. Keeping on top of this research for providing optimal care, pain management and the best possible quality of life to patients can be daunting and expensive, especially considering the rapidly aging populations in many countries. Current survival analysis and treatment recommendation systems use historic patient data from EHR/EMR records. The problem with this approach is that it is unable to improve treatment plans based on new advances in biomedical research. This makes the treatment plans recommended by such systems outdated.

It is an object of the present invention to improve and further develop a method for operating a patient survival assessment system and a corresponding data processing system for providing an extra efficient survival assessment and operation of the system by simple means.

In accordance with the invention, the aforementioned object is accomplished by a method for operating a patient survival assessment system by means of a data processing system, comprising the following steps:

- collecting health data from a patient suffering from a disease;

- collecting patient’s historical data for at least one disease; and

- performing a survivability analysis of the patient comprising transfer learning considering the health data and/or the historical data and using conditional von Neumann loss.

Further, the aforementioned object is accomplished by a data processing system for patient survival assessment, comprising:

- collecting means for collecting health data from a patient suffering from a disease;

- collecting means for collecting patient’s historical data for at least one disease; and

- performing means for performing a survivability analysis of the patient comprising transfer learning considering the health data and/or the historical data and using conditional von Neumann loss.

According to the invention it has been recognized that it is possible to provide an extra efficient survival assessment and operation of the system by a particular way of performing a survivability analysis of the patient. It has been further recognized that this advantage can concretely be achieved by using transfer learning considering the health data and/or the historical data and using conditional von Neumann loss in performing the survivability analysis. In this way transfer learning is facilitated.

Thus, on the basis of the invention an extra efficient survival assessment and operation of the system are provided by simple means.

According to an embodiment of the invention the health data can comprise information about the current drug usage and dosage, blood and other medical test results and measurement results and/or electronic health records, EHR. A lot of different collected data helps in providing a very efficient survival assessment.

Within a further embodiment the conditional von Neumann loss can comprise a loss function using conditional von Neumann divergence to create invariant representations between a target and source domain or target and source domains.
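For orientation, the von Neumann divergence between two symmetric positive definite matrices is commonly written as below, together with its symmetrised (Jeffreys-type) form. These are standard definitions from the matrix-divergence literature and are given only as background. The symbols $\sigma$ and $\rho$ (for example, covariance-type matrices computed on source and target representations) are illustrative assumptions, and the exact conditional form used by the invention is not spelled out in this text.

$$D_{vN}(\sigma \,\|\, \rho) = \operatorname{Tr}\left(\sigma \log \sigma - \sigma \log \rho - \sigma + \rho\right)$$

$$J_{vN}(\sigma, \rho) = D_{vN}(\sigma \,\|\, \rho) + D_{vN}(\rho \,\|\, \sigma)$$

Here $\log$ denotes the matrix logarithm. $D_{vN}$ is non-negative and vanishes when $\sigma = \rho$, which is what makes it usable as a penalty that pulls source and target representations together.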

According to a further embodiment the transfer learning can be used for imputing missing or incomplete patient data, for example health data or historical data. Thus, incomplete patient data can be completed for providing an efficient survival assessment or survivability score.

In a further embodiment the transfer learning can comprise multilingual transfer learning for information extraction with von Neumann loss function using joint high and low resource pair training, wherein that training can learn language and domain invariant embeddings. Such a multilingual capacity of the transfer learning provides a wide scope of applications of the invention.

According to a further embodiment sentence pairs can be used as training data, wherein these sentence pairs can be fed to a pre-trained transformer model for training, wherein this training can use stochastic gradient descent. A very efficient training is the result of this feature.

Within a further embodiment the transfer learning can operate with a symmetric discordance index, SDI, and/or with finding of a feature extractor. Depending on individual situations the transfer learning can be adapted to different requirements in a flexible way.

According to a further embodiment a text extraction system can extract information from the health data, for example vital information such as diseases and medical conditions and/or drug names. All patient related information can be extracted.

In a further embodiment the method can further comprise collecting medical literature, wherein this medical literature can cover multiple languages. Any kind of medical literature can be used in the invention for providing a high degree of information.

According to a further embodiment a multilingual information extraction system can extract information from the text extraction system and/or from the collected medical literature and/or from the survivability analysis, which can comprise a survival score.

In a further embodiment the multilingual information extraction system can be based on Artificial Intelligence, AI, wherein different kinds of AI can be used.

According to a further embodiment the multilingual information extraction system can extract an open knowledge graph containing information on drug supplements connected to the health data and/or historical data and/or extracts drug safety information. As a result, the efficiency of the survival assessment system can be enhanced.

Within a further embodiment the method can further comprise making a decision, statement or recommendation regarding further treatment or therapy of the patient by a - for example automated - decision making module, which can comprise diagnostics, drug and dietary recommendations. This can help in prolonging survivability of the patient in critical situations.

According to a further embodiment the decision making module can receive information from an information extraction module, for example from the multilingual information extraction system. As a result a very efficient operation of the survival assessment system is possible.

In a further embodiment the decision making module can keep track of a current survivability score of the patient and pair it with the supplements and drugs the patient is currently taking and with the recommended supplements and drugs.

Advantages and aspects of embodiments of the present invention are summarized as follows:

1. Embodiments of the method can be used to transfer useful knowledge from information-rich to information-poor domains.

2. In embodiments in the field of medicine, with the support of the defined loss functions, the found actions can lead to increasing the life expectancy of patients.

3. Further embodiments of the invention can learn from external textual knowledge for improving the decision making process.

Further advantages and aspects of embodiments of the present invention are summarized as follows:

1) Embodiments of this invention can comprise the formulation of a transfer learning for survival analysis loss function using conditional von Neumann divergence to create invariant representations between the target and source domains.

2) Embodiments of this invention can comprise a system for multilingual transfer learning for information extraction with von Neumann loss function using joint high and low resource pair training that learns language and domain invariant embeddings. Further embodiments can comprise a system and method for predicting automated survival assessment for patients and improving short/long term patient survivability by leveraging transfer learning using conditional von Neumann loss along with multilingual information extraction.

Further embodiments can comprise collecting multilingual medical literature covering multiple languages and/or a multilingual information extraction module applying an AI-based multilingual information extraction system that can extract information on new drug supplements and dosage from the latest research, clinical trials, and/or medical case studies.

Further embodiments of the invention can comprise a method and system for automated survival assessment for patients and then preferably extracting relevant drug supplement modulations and other diagnostics from medical case studies and protocols literature and then further preferably using this information to change the drug supplements, diet, dosage and diagnostics for improving short and long term patient survivability. The survivability analysis can use the current drug dosage, EHR data and diagnostics such as blood tests. It can use transfer learning for imputing missing patient data and can compute a survivability score. A text extraction system can extract vital information from EHR data, such as diseases and medical conditions, and it can also extract drug names from prescriptions. A multilingual system can then extract relevant information connected to these diseases, conditions and current drug supplements from the latest medical literature. The relevant information can be in the form of new drug and dietary supplements, changes in the dosage of supplements and new diagnostics. Consequently, these new supplements, diagnostic tests and dosages can then automatically be administered to the patient. The multilingual extraction system can also extract safety information from case studies and clinical trials to ensure that the administered drugs and dosages are not toxic to the patients.

Embodiments of the invention can solve the problem of automated monitoring and management of patients by computing their survival chance over a given period of time based on their clinical data and then preferably administering appropriate drug supplements with optimum dosage for gradually improving the survivability, further preferably based on the latest research, case studies and/or clinical trials.

Embodiments of the invention i) solve the problem of predicting patients' survivability using transfer learning on preferably different illness domains and applying an AI-based multilingual information extraction system that extracts information on new drug supplements and dosage from the latest research, clinical trials, and medical case studies and/or ii) take actions such as changing the patient's therapy and/or administering or modifying the supplement or drug dosage given to patients in a controlled manner, preferably informed by current research. Transfer learning, here, can leverage information-rich illnesses - e.g., with a large number of use cases, studies, etc. - to improve the treatment of rare diseases.

Further embodiments of the present invention can provide methods for multilingual information-assisted prolonged survivability.

There are several ways to design and further develop the teaching of the present invention in an advantageous way. To this end, reference is made to the following explanation of examples of embodiments of the invention, illustrated by the drawing. In the drawing

Fig. 1 shows in a diagram an overview of an embodiment of the present invention,

Fig. 2 shows in a diagram a multilingual/multidomain information extraction architecture of an embodiment of the present invention and

Fig. 3 shows in a diagram the Jefferey von Neumann loss of an embodiment of the proposed invention.

As illustrated in Fig. 1, an embodiment of the invention or a corresponding system consists of various process modules, notable among them: (i) a transfer learning for survival analysis module 1, (ii) a multilingual information extraction module 3, and (iii) an automated decision making module 4. Measurement and data collection components collect drug usage and dosage data along with blood test results, vitals, and patient EHR data. An EHR text extractor 2 extracts historical data from the EHR records, especially diseases, symptoms, and drugs. It uses a simple information extraction module that extracts the named entities and classifies them as drugs, diseases, or symptoms.

In the following, we define a generic method according to an embodiment of the invention that involves main process components and resources; after that, we describe in more detail how these components interact to improve the patient’s survivability and health.

Generic Method:

1) Data collection from at least one patient suffering from a disease, potentially a not very well studied disease such as a rare cancer type. This data includes information about the current drug usage and dosage, blood and other medical tests and measurements, and electronic health records, EHR. This resource is depicted as component (A) in Fig. 1.

2) Collection of patient’s historical data for different diseases varying in terms of information richness, depicted as component (B) in Fig. 1.

3) The process module transfer learning for survival analysis, depicted as module (1) in Fig. 1.

4) The process module EHR text extractor, depicted as module (2) in Fig. 1.

5) Collection of multilingual medical literature covering multiple languages. This collection of documents can also vary between information-rich domains and languages and rare domains and languages, depicted as component (C) in Fig. 1.

6) The process module multilingual information extraction, depicted as module (3) in Fig. 1.

7) The process module automated decision making, depicted as module (4) in Fig. 1.

Re 3: Details of the “transfer learning for survival analysis” process module

This is one of the key modules in the system. Apart from performing a survivability analysis, it also performs data imputation for missing data. It is possible that some of the patient EHR data is missing; in that case, this module automatically imputes the missing data using transfer learning and uses the completed data for predicting the survival score.
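The text does not specify how the transfer-learning imputation works internally. As a loudly simplified stand-in, the following sketch uses nearest-neighbour imputation: missing entries of a target-domain patient record are copied from the closest complete record of an information-rich source domain. The function names, the feature layout and the toy values are all hypothetical.

```python
import math
from typing import List, Optional, Sequence


def _distance(target: Sequence[Optional[float]], row: Sequence[float]) -> float:
    """Euclidean distance computed only over the features observed in `target`."""
    return math.sqrt(
        sum((u - v) ** 2 for u, v in zip(target, row) if u is not None)
    )


def impute_from_source(
    target: Sequence[Optional[float]], source: List[Sequence[float]]
) -> List[float]:
    """Fill None entries of `target` with values from its nearest source row.

    This is plain nearest-neighbour imputation, used here only as a
    placeholder for the transfer-learning imputation named in the text.
    """
    if not source:
        raise ValueError("source domain must contain at least one record")
    nearest = min(source, key=lambda row: _distance(target, row))
    return [n if t is None else t for t, n in zip(target, nearest)]


# Hypothetical records: (age, systolic blood pressure, lab value).
source_records = [[60.0, 120.0, 5.1], [75.0, 140.0, 6.8]]
incomplete = [74.0, None, 6.5]  # blood pressure is missing
completed = impute_from_source(incomplete, source_records)
```

A learned imputation model would replace the `min(...)` lookup; the surrounding interface (incomplete record in, completed record out) could stay the same.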

We assume that each individual or patient is represented by a triple $(x_i, t_i, \delta_i)$, where $x_i \in \mathbb{R}^d$ is the vector of covariates and $t_i$ is either the observation time of the event or the censoring time for right-censored individuals. Censoring, here, means that the target event for that individual was not observed before the termination of the study or before the individual left the study; the individual thus contributes only the partial information of having survived at least until $t_i$. $\delta_i$ is the event indicator, 0 for censored cases and 1 otherwise.
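The patient representation described above (covariates, a time, and an event indicator) can be sketched as a small data structure. The class and function names are hypothetical, the toy values are illustrative only, and the convention that the indicator is 1 for observed events is the usual survival-analysis convention assumed here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SurvivalRecord:
    """One patient: covariates x, time t, and event indicator delta."""
    x: Tuple[float, ...]  # covariate vector in R^d
    t: float              # event time, or censoring time if delta == 0
    delta: int            # 1 = event observed, 0 = right-censored


def split_by_censoring(records: List[SurvivalRecord]):
    """Return (non-censored, censored) sub-lists of the dataset."""
    observed = [r for r in records if r.delta == 1]
    censored = [r for r in records if r.delta == 0]
    return observed, censored


# Hypothetical toy cohort.
cohort = [
    SurvivalRecord(x=(63.0, 1.2), t=14.0, delta=1),
    SurvivalRecord(x=(71.0, 0.8), t=30.0, delta=0),  # known to survive until t=30
    SurvivalRecord(x=(58.0, 1.9), t=7.5, delta=1),
]
observed, censored = split_by_censoring(cohort)
```

The split into observed and censored samples mirrors the two sample subsets that the symmetric discordance index treats separately.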

Next, we define the symmetric discordance index, SDI, for two risk functions $r_1$ and $r_2$ and show that it is a metric satisfying the triangle inequality.

In this definition, $D_{ev} \subset D$ is the subset of non-censored samples, and $D \setminus D_{ev}$ contains the censored samples. $C_{r,x}$ is the set of samples from $D$ that are assumed to be outlived, according to the risk function $r$, by the censored sample $x$, and $\triangle$ is the set symmetric difference (disjunctive union). SDI is composed of two parts: (i) the disagreements in ranking each pair of non-censored samples, and (ii) the disagreements in ranking pairs of censored and non-censored instances. $\alpha_1$ and $\alpha_2$ transform SDI into a convex combination of these two parts.

Moreover, SDI's symmetry follows from counting the discordance with censored cases twice, once for each of the risk functions while considering the other as the ground truth. Notice that the SDI is equivalent to the Kendall's tau distance between two rankings when (i) counting 0.5 as a score for ties on the survival times, and (ii) no censoring.
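The no-censoring special case mentioned above can be illustrated with a short sketch: a normalised Kendall-tau-style distance between the rankings induced by two risk functions, where a pair that is tied under exactly one of the two functions contributes 0.5. This is not the full SDI (the censored part is omitted), and the function name and the exact tie handling are assumptions made for illustration.

```python
from itertools import combinations
from typing import Sequence


def _sign(v: float) -> int:
    """Return -1, 0 or +1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def pairwise_discordance(r1: Sequence[float], r2: Sequence[float]) -> float:
    """Fraction of sample pairs that two risk score vectors rank differently.

    Opposite orderings count 1.0; a tie under exactly one of the two
    risk functions counts 0.5; everything else counts 0.
    """
    if len(r1) != len(r2):
        raise ValueError("both risk vectors must cover the same samples")
    pairs = list(combinations(range(len(r1)), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        s1 = _sign(r1[i] - r1[j])
        s2 = _sign(r2[i] - r2[j])
        if s1 * s2 < 0:      # the two functions order the pair oppositely
            total += 1.0
        elif s1 != s2:       # exactly one function sees a tie
            total += 0.5
    return total / len(pairs)


risks_a = [0.9, 0.1, 0.5]
risks_b = [0.2, 0.2, 0.7]
distance = pairwise_discordance(risks_a, risks_b)
```

Because the pair test treats both arguments identically, swapping `risks_a` and `risks_b` gives the same value, which matches the symmetry property claimed for SDI.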

Finally, we perform distribution matching by an adversarial min-max game, in which we explicitly combine a feature extractor $f_\theta: \mathcal{X} \to \mathcal{T}$ and a class of predictors $\mathcal{H}: \mathcal{T} \to \mathcal{Y}$ in a unified learning framework:

The first term of Eq. (1) enforces $h$ to be a good predictor for all source tasks; the second term is an explicit instantiation of finding a feature extractor for which the embeddings of the source and target problems cannot be differentiated. The general idea is to find a feature extractor $f_\theta$ such that, for any given pair of $h$ and $h'$, it is hard to discriminate the target domain from the weighted combination of source distributions.

Re 4: Details of the “EHR text extractor” process module

This module extracts all patient related information from the EHR. The extracted data is considered to be the target task or target domain that is information-poor.
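The entity extraction performed on EHR text (finding mentions and classifying them as drugs, diseases or symptoms) can be sketched with a dictionary lookup. This is a minimal illustration under the assumption of a fixed lexicon; the lexicon contents, the function name and the sample note are hypothetical, and a real extractor would more likely use a trained named entity recognition model.

```python
import re
from typing import Dict, List, Set

# Hypothetical mini-lexicon; a real system would use a much larger resource.
LEXICON: Dict[str, Set[str]] = {
    "drug": {"metformin", "ibuprofen"},
    "disease": {"diabetes", "osteoarthritis"},
    "symptom": {"fatigue", "knee pain"},
}


def extract_entities(text: str, lexicon: Dict[str, Set[str]] = LEXICON) -> Dict[str, List[str]]:
    """Return the lexicon terms found in `text`, grouped by entity class."""
    lowered = text.lower()
    found: Dict[str, List[str]] = {}
    for label, terms in lexicon.items():
        hits = []
        for term in sorted(terms):
            # Whole-word match so that short terms do not fire inside other words.
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                hits.append(term)
        found[label] = hits
    return found


note = "Patient with osteoarthritis reports knee pain and fatigue; currently on Ibuprofen."
entities = extract_entities(note)
```

The grouped output is the kind of structured record that the later modules can consume as the information-poor target domain.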

Re 6: Details of the “multilingual/MultiDomain information extraction” process module

This is a critical module because it extracts new information from the current scientific literature so that it can modify the patient care with the goal of improving patient survivability. Therefore, it is critical that the extracted information is accurate, useful and safe enough for the patient. The Multilingual Information extraction module uses the symptoms, diseases and other data such as age extracted from the EHR patient data and extracts an open Knowledge Graph containing information on drug supplements connected to these symptoms and diseases. The system also extracts safety information, such as age restrictions, drug interactions, interactions with pre-existing conditions. The extracted drug supplement interventions are compared with the drugs/supplements the patients are taking and other patient data to check for safety.
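The comparison of extracted supplement interventions against patient data can be sketched as a lookup over knowledge graph triples followed by a safety filter. The triple layout, the predicate names (`may_help_with`, `interacts_with`, `min_age`) and all entity names are hypothetical placeholders, not a schema taken from the invention.

```python
from typing import List, NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical open knowledge graph extracted from literature.
KG = [
    Triple("supplement_a", "may_help_with", "osteoarthritis"),
    Triple("supplement_b", "may_help_with", "osteoarthritis"),
    Triple("supplement_b", "interacts_with", "drug_x"),
    Triple("supplement_a", "min_age", "18"),
]


def candidates_for(kg: List[Triple], condition: str) -> List[str]:
    """Supplements linked to the given disease or symptom."""
    return sorted(
        {t.subject for t in kg if t.predicate == "may_help_with" and t.obj == condition}
    )


def is_safe(kg: List[Triple], supplement: str, age: int, current_drugs: Set[str]) -> bool:
    """Check age restrictions and interactions with drugs the patient takes."""
    for t in kg:
        if t.subject != supplement:
            continue
        if t.predicate == "interacts_with" and t.obj in current_drugs:
            return False
        if t.predicate == "min_age" and age < int(t.obj):
            return False
    return True


def safe_candidates(kg: List[Triple], condition: str, age: int, current_drugs: Set[str]) -> List[str]:
    """Candidates for the condition that pass the safety check."""
    return [s for s in candidates_for(kg, condition) if is_safe(kg, s, age, current_drugs)]


result = safe_candidates(KG, "osteoarthritis", age=70, current_drugs={"drug_x"})
```

Keeping the safety facts in the same graph as the recommendations means one extraction pass can feed both the candidate list and the filter.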

The challenge in extracting information from medical literature in multiple languages, including English, is the lack of training data in the biomedical domain. The present invention proposes an innovative method that leverages transfer learning using a novel loss function and training strategy that allows the system to extract knowledge from the medical domain even when it is not trained using biomedical data.

The system is described in Fig. 2. The first step is to create sentence pairs for training, where the first sentence comes with labels, i.e., from a high resource language or domain such as English labelled Wikipedia data, while the second sentence in the pair is from a low resource language or domain, in this case the biomedical domain.

The sentence pairs are constructed by matching sentences across the low resource domain and the high resource domain in both directions. Thus, for every sentence in the labelled data extracted from Wikipedia, we find the most similar sentence in the low resource domain using cosine similarity computed with a pretrained sentence similarity model [3], and vice versa.
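The bidirectional pairing step can be sketched as follows. To keep the example self-contained, a toy bag-of-words vector stands in for the pretrained sentence similarity model, which is a deliberate simplification; all function names and sample sentences are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(sentence: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a pretrained sentence encoder."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_pairs(queries: List[str], candidates: List[str]) -> List[Tuple[str, str]]:
    """Pair every query sentence with its most similar candidate sentence."""
    cand_vecs = [(c, embed(c)) for c in candidates]
    pairs = []
    for q in queries:
        qv = embed(q)
        best = max(cand_vecs, key=lambda cv: cosine(qv, cv[1]))
        pairs.append((q, best[0]))
    return pairs


def build_training_pairs(high: List[str], low: List[str]) -> List[Tuple[str, str]]:
    """Pairs as (high resource sentence, low resource sentence), both directions."""
    forward = most_similar_pairs(high, low)
    backward = [(h, l) for l, h in most_similar_pairs(low, high)]
    return list(dict.fromkeys(forward + backward))  # drop duplicates, keep order


wiki = ["aspirin is a drug used to reduce pain", "paris is the capital of france"]
bio = ["arginine is a supplement used to reduce blood pressure",
       "the trial was conducted in france"]
pairs = build_training_pairs(wiki, bio)
```

Swapping `embed` for a real multilingual encoder would leave the pairing logic unchanged, since only the similarity scores differ.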

The sentence pairs are used as training data and fed to a pre-trained transformer model, which is trained using stochastic gradient descent. To facilitate transfer learning, the invention proposes a novel loss function for training that encourages the model to learn domain/language agnostic embeddings. The loss function is described in Fig. 3. We use BIO tagging for extracting the subject, predicate, object and arguments from the text to create an open Knowledge Graph, i.e., open information extraction. Additionally, we ensure that the subject is an entity, i.e., we perform named entity recognition for all subjects. This is because we want to extract a knowledge graph related to diseases and symptoms, which are typically named entities.
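The BIO tagging step can be illustrated by decoding a tagged token sequence into subject, predicate, object and argument spans, from which a knowledge graph triple is formed. The tag names (`SUBJ`, `PRED`, `OBJ`, `ARG`) and the example sentence are assumptions for illustration; the tagger that produces the tags is outside this sketch.

```python
from typing import Dict, List


def decode_bio(tokens: List[str], tags: List[str]) -> Dict[str, List[str]]:
    """Group tokens into text spans keyed by role, given tags like 'B-SUBJ'."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    spans: Dict[str, List[str]] = {}
    role, buf = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == role:
            buf.append(token)          # continue the currently open span
            continue
        if role is not None:           # close the currently open span
            spans.setdefault(role, []).append(" ".join(buf))
        if prefix in ("B", "I"):       # open a new span (tolerates a stray I- tag)
            role, buf = label, [token]
        else:                          # 'O' tag: outside any span
            role, buf = None, []
    if role is not None:
        spans.setdefault(role, []).append(" ".join(buf))
    return spans


tokens = ["Arginine", "supplementation", "reduces", "blood", "pressure", "in", "adults"]
tags = ["B-SUBJ", "I-SUBJ", "B-PRED", "B-OBJ", "I-OBJ", "B-ARG", "I-ARG"]
spans = decode_bio(tokens, tags)
triple = (spans["SUBJ"][0], spans["PRED"][0], spans["OBJ"][0])
```

The additional requirement that the subject be a named entity would be applied as a filter on `spans["SUBJ"]` before the triple is added to the graph.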

Re 7: Details of the “Automated Decision Making” process module

This module receives the drug and dietary recommendations from the information extraction module. It keeps track of the current survivability score and pairs it with the supplements and drugs the patient is currently taking and with the newly recommended supplements. It automatically checks for safety by comparing the extracted safety information with the patient data. It then automatically administers the supplement by scheduling/ordering a specific diet and meal plan or by placing an order for a specific supplement to be delivered to the patient.

This is done using a simple decision tree. Each recommendation obtained from the multilingual extraction module is compared to the current vitals of the patient, and if the supplement recommendation promises an increase or decrease in the patient vitals, then it is administered. For example, if the blood pressure of the patient is slightly above normal and the multilingual extraction system recommends an arginine supplement because it improves blood flow and reduces blood pressure, it is checked whether the patient is suffering from high blood pressure in addition to other conditions affected by arginine; based on the decision tree, the supplement is then either recommended or rejected.
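A hand-written decision tree of the kind described above might look like the following sketch. The node order, the rule that a supplement is accepted only when it moves an out-of-range vital back toward its normal range, the numeric range and the placeholder condition name are all assumptions for illustration and carry no clinical meaning.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class Recommendation:
    supplement: str
    vital: str       # vital sign the supplement is reported to affect
    direction: str   # "decrease" or "increase"
    contraindications: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Patient:
    vitals: Dict[str, float]
    normal_ranges: Dict[str, Tuple[float, float]]
    conditions: FrozenSet[str] = frozenset()


def decide(patient: Patient, rec: Recommendation) -> str:
    """Return 'recommended' or 'rejected' for one supplement recommendation."""
    # Node 1: reject if the patient has a condition affected by the supplement.
    if patient.conditions & rec.contraindications:
        return "rejected"
    # Node 2: reject if the relevant vital is not available for this patient.
    if rec.vital not in patient.vitals or rec.vital not in patient.normal_ranges:
        return "rejected"
    value = patient.vitals[rec.vital]
    low, high = patient.normal_ranges[rec.vital]
    # Node 3: accept only if the effect moves the vital toward the normal range.
    if value > high and rec.direction == "decrease":
        return "recommended"
    if value < low and rec.direction == "increase":
        return "recommended"
    return "rejected"


# Illustrative values only; the range below is not clinical guidance.
patient = Patient(
    vitals={"systolic_bp": 135.0},
    normal_ranges={"systolic_bp": (90.0, 120.0)},
    conditions=frozenset({"osteoarthritis"}),
)
arginine = Recommendation(
    supplement="arginine",
    vital="systolic_bp",
    direction="decrease",
    contraindications=frozenset({"condition_x"}),  # hypothetical placeholder
)
decision = decide(patient, arginine)
```

Placing the safety node first means an unsafe supplement is rejected regardless of how promising its effect on the vitals is.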

Further Embodiment:

Improving Quality of Life of Hospice Patients:

Use Case: Optimizing diet and supplements for improving quality of life and pain management for terminally ill patients.

Data Source: Patient EHR data, routine blood test results, heart rate, blood pressure and other physiological data, patient prescriptions.

Present Method: The present invention ingests the patient data, predicts patient survivability, extracts dietary and drug supplements appropriate to the patient's medical condition taking safety into consideration, and finally checks against historical survivability data, if available, and ranks the supplements.

Output: A ranked list of health interventions based on dietary and drug supplements.

Physical Change - Technicity: Generate a list of dietary requirements for the cooks along with placing an order for the food ingredients, automatically purchasing dietary supplements for required dosage, administering supplements intravenously for patients with IV lines.

Many modifications and other embodiments of the invention set forth herein will come to mind to the one skilled in the art to which the invention pertains having the benefit of the teachings presented in the foregoing description and the associated drawings. Therefore, it is to be understood that the invention is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Claims

1. A method for operating a patient survival assessment system by means of a data processing system, comprising the following steps:

- collecting health data from a patient suffering from a disease;

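- collecting patient’s historical data for at least one disease; and

- performing a survivability analysis of the patient comprising transfer learning considering the health data and/or the historical data and using conditional von Neumann loss.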

2. A method according to claim 1, wherein the health data comprise information about the current drug usage and dosage, blood and other medical test results and measurement results and/or electronic health records, EHR.

3. A method according to claim 1 or 2, wherein the conditional von Neumann loss comprises a loss function using conditional von Neumann divergence to create invariant representations between a target and source domain or target and source domains.

4. A method according to any one of claims 1 to 3, wherein the transfer learning is used for imputing missing or incomplete patient data, for example health data or historical data.

5. A method according to any one of claims 1 to 4, wherein the transfer learning comprises multilingual transfer learning for information extraction with von Neumann loss function using joint high and low resource pair training, wherein that training can learn language and domain invariant embeddings.

6. A method according to claim 5, wherein sentence pairs are used as training data, wherein these sentence pairs can be fed to a pre-trained transformer model for training, wherein this training can use stochastic gradient descent.

7. A method according to any one of claims 1 to 6, wherein the transfer learning operates with a symmetric discordance index, SDI, and/or with finding of a feature extractor.

8. A method according to any one of claims 1 to 7, wherein a text extraction system extracts information from the health data, for example vital information such as diseases and medical conditions and/or drug names.

9. A method according to any one of claims 1 to 8, wherein the method further comprises collecting medical literature, wherein this medical literature can cover multiple languages.

10. A method according to any one of claims 1 to 9, wherein a multilingual information extraction system extracts information from the text extraction system and/or from the collected medical literature and/or from the survivability analysis, which can comprise a survival score, wherein the multilingual information extraction system can be based on Artificial Intelligence, AI.

11. A method according to claim 10, wherein the multilingual information extraction system extracts an open knowledge graph containing information on drug supplements connected to the health data and/or historical data and/or extracts drug safety information.

12. A method according to any one of claims 1 to 11, wherein the method further comprises making a decision, statement or recommendation regarding further treatment or therapy of the patient by a - for example automated - decision making module, which can comprise diagnostics, drug and dietary recommendations.

13. A method according to claim 12, wherein the decision making module receives information from an information extraction module, for example from the multilingual information extraction system.

14. A method according to claim 12 or 13, wherein the decision making module keeps track of a current survivability score of the patient and pairs it with the supplements and drugs the patient is currently taking and with recommended supplements and drugs.

15. A data processing system for patient survival assessment, preferably for carrying out a method for operating a patient survival assessment system according to any one of claims 1 to 14, comprising: