CN116830207A - Oncology workflow for clinical decision support - Google Patents

Oncology workflow for clinical decision support

Info

Publication number: CN116830207A
Authority: CN (China)
Prior art keywords: data, patient, medical, information, unified
Legal status: Pending
Application number: CN202280010017.8A
Other languages: Chinese (zh)
Inventors:
辛迪·K·巴尔纳德
桑巴西瓦劳·比拉普尼
迪瓦卡尔·查帕盖
阿尔卡纳·P·多尔奇
凯瑟琳·M·朱
伦加拉·基萨万
考沙尔·D·帕里克
拉曼·拉马纳森
大卫·M·斯戈罗斯曼
维沙卡·沙尔马
Current Assignee: F HOFFMAN-ROCHE AG
Original Assignee: F HOFFMAN-ROCHE AG
Application filed by F HOFFMAN-ROCHE AG
Priority claimed from PCT/US2022/012814 (WO2022155607A1)
Publication of CN116830207A

Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present disclosure provides systems and methods for managing patient data. The system integrates medical data from multiple sources into a unified patient database. Structured and unstructured medical data is acquired, enriched (e.g., by specifying data field types, standardizing data types or terms, etc.), and stored to the unified patient database. Data retrieved from different sources is stored to data elements in the unified patient database in a network of connected objects, including data on tumor masses, treatments, reports, medical history, and diagnoses. The data in the unified patient database is used to display patient data in user-friendly interface views, including a patient lineage view that chronologically displays patient data organized by data type. Different interface views may be traversed to easily display patient data originating from different sources, thereby improving the clinical decision making process.

Description

Oncology workflow for clinical decision support
Cross-reference to related patent applications
The present application claims priority from U.S. patent application Ser. No. 63/138,275, filed January 15, 2021, and U.S. patent application Ser. No. 63/256,476, filed October 15, 2021, both of which are incorporated herein by reference for all purposes.
Background
Hospitals worldwide create large amounts of clinical data each day. Analysis of this data is critical to gaining detailed insights into healthcare delivery and care quality and to providing a basis for improving personalized healthcare. Unfortunately, most recorded data is difficult to access and analyze because it is captured in unstructured form. Unstructured data may include, for example, healthcare provider notes, imaging or pathology reports, or any other data that is neither associated with a structured data model nor organized in a predefined manner that defines the context and/or meaning of the data. Data is typically stored in multiple data sources. A clinician attempting to analyze patient data to make a decision may need to acquire data from multiple data sources and then manually parse the data to retrieve the information necessary to make a clinical decision. This way of obtaining data is laborious, slow, and costly, and makes clinical decisions error-prone.
Disclosure of Invention
Techniques for improving clinician access to patient data for making clinical decisions, such as clinical decisions related to oncology, are disclosed herein. In some examples, a medical data processing system is provided. The medical data processing system may collect patient medical data from multiple data sources, convert the medical data to structured data, and present the structured data in a variety of forms, such as a summary format and a longitudinal time view report format. The medical data processing system may also support oncology workflow protocols that may support or conduct diagnostic operations on the collected medical data and present diagnostic results to a clinician. Oncology workflow protocols may enable a clinician (such as an oncologist or his/her representative) to longitudinally manage patients suspected of having cancer through treatment and follow-up. Oncology workflow protocols may also support other medical applications, such as quality of care assessment tools for assessing the quality of care administered to a patient, medical research tools for determining correlations between various patient information (e.g., demographic information) and tumor information (e.g., prognosis or expected survival), and the like. The techniques can also be applied to other disease areas, not just oncology.
In some embodiments, a method for managing medical data includes the following, performed by a server computer: creating a patient record for a patient in a unified patient database, the patient record including an identifier of the patient and one or more data objects related to medical data associated with the patient, the unified patient database including data from a plurality of sources; retrieving a medical record for the patient from an external database; receiving, via a Graphical User Interface (GUI), an identification of a primary cancer associated with the medical record; in response to receiving the identification of the primary cancer, creating a primary cancer object in the patient record, the primary cancer object having a field including the primary cancer; storing the medical record, linked to the primary cancer object, in the patient record in the unified patient database; receiving medical data for the patient via user input to the GUI; determining that the medical data for the patient is associated with the primary cancer; and storing the medical data for the patient, linked to the primary cancer object, in the patient record in the unified patient database.
In some aspects, the medical record for the patient is in a first format including a set of data elements related to corresponding data types; and receiving the identification of the primary cancer includes: identifying the primary cancer by analyzing the data elements and data types; displaying, in the GUI, a prompt for the user to confirm the primary cancer identification; and receiving, via the GUI, a user confirmation of the primary cancer identification.
In some aspects, the medical record is a first medical record, the method further comprising: receiving a second medical record for the patient, wherein the second medical record is in a second format comprising unstructured data; identifying data elements associated with the primary cancer from the unstructured data; analyzing the unstructured data to assign data elements to data types; and storing the data elements linked to the primary cancer object in a patient record in a unified patient database based on the assigned data type and identifying the data elements as being associated with the primary cancer.
In some aspects, receiving an identification of a primary cancer associated with the medical record includes: displaying, via the GUI, the medical record and a menu configured to receive user input selecting one or more primary cancers; and receiving, via the GUI, a user input selecting the primary cancer.
In some aspects, the method further comprises storing the medical record in the patient record; and analyzing the medical record to determine that the patient record is not associated with a particular primary cancer, wherein displaying the medical record and the menu is responsive to determining that the patient record is not associated with a particular primary cancer.
In some aspects, the medical record includes unstructured data; the method further comprises the steps of: applying a first machine learning model to identify text in the medical record; and applying a second machine learning model to associate a portion of the identified text with the corresponding field, wherein storing the medical record further comprises storing the identified text in association with the field to a unified patient database. In some aspects, the first machine learning model includes an Optical Character Recognition (OCR) model; and the second machine learning model includes a Natural Language Processing (NLP) model.
In some aspects, the method further comprises retrieving at least a subset of the medical data for the patient from a unified patient database; and causing display of the at least a subset of the medical data for the patient via the user interface for clinical decision making. In some aspects, the external database corresponds to at least one of: EMR (electronic medical records) systems, PACS (picture archiving and communication systems), digital Pathology (DP) systems, LIS (laboratory information systems) and RIS (radiology information systems). In some aspects, the medical records are retrieved based on an identifier of the patient.
In some embodiments, a method for managing a unified patient database includes the following by a server computer: storing patient records comprising a network of interconnected data objects to a unified patient database, the unified patient database comprising data from a plurality of sources; storing a first data object corresponding to a data element of a tumor mass for a primary cancer to a patient record in a unified patient database, the first data object including attributes specifying a location of the tumor mass; receiving diagnostic information corresponding to the primary cancer from a diagnostic computer; analyzing the diagnostic information to identify a correlation between the diagnostic information and the tumor mass; based on identifying the correlation between the diagnostic information and the tumor mass, storing a second data object corresponding to the diagnostic information to a unified patient database, the second data object being connected to the first data object via a network of interconnected data objects; receiving treatment information corresponding to the primary cancer from the diagnostic computer; analyzing the treatment information to identify a correlation between the treatment information and the tumor mass; and based on identifying the correlation between the treatment information and the tumor mass, storing a third data object corresponding to the treatment information to a unified patient database, the third data object being connected to the first data object via a network of interconnected data objects.
In some aspects, the method further comprises retrieving, from the unified patient database, one or more of the attribute specifying the location of the tumor mass, the diagnostic information, and/or the treatment information; and causing display, via the user interface, of the retrieved attribute specifying the location of the tumor mass, diagnostic information, and/or treatment information for clinical decision making.
In some aspects, the method further comprises receiving patient history data from a diagnostic computer; analyzing the patient history data to identify correlations between the patient history data and tumor masses; and based on identifying the correlation between the patient history data and the tumor mass, storing a fourth data object corresponding to the patient history data to a unified patient database, the fourth data object being connected to the first data object via a network of interconnected data objects.
In some aspects, the method further comprises receiving tumor mass information from the diagnostic computer corresponding to a tumor mass at the metastatic site of the primary cancer; analyzing the tumor mass information to identify correlations between diagnostic information and tumor masses; and based on receiving the tumor mass information and identifying the first data object, storing a fifth data object corresponding to the tumor mass information, connected to the first data object via a network of interconnected data objects, to a unified patient database. In some aspects, the second data object includes one or more attributes selected from the group consisting of: stage of primary cancer, biomarker, and tumor size.
In some aspects, the method further includes identifying data elements and data types associated with the patient from a unified patient database; and transmitting the data elements and data types in a structured form to an external system. In some aspects, the method further includes, upon generating each of the first data object and the second data object, generating a first timestamp stored in association with the first data object indicating a time of creation of the first data object and a second timestamp stored in association with the second data object indicating a time of creation of the second data object.
In some aspects, the method further comprises updating the unified patient database by: importing medical data from an external database; parsing the imported medical data to identify specific data elements associated with the patient and the primary cancer; and storing the particular data element in association with the first data object to a sixth data object.
In some aspects, the external database corresponds to at least one of: EMR (electronic medical records) systems, PACS (picture archiving and communication systems), digital Pathology (DP) systems, LIS (laboratory information systems) and RIS (radiology information systems).
In some embodiments, a method of processing medical data to facilitate clinical decisions includes performing, by a server computer: receiving, via a graphical user interface, identification data identifying a patient; receiving user input selecting a mode of a set of selectable modes of a graphical user interface; retrieving a set of medical data associated with the patient from a unified patient database based on the identification data and the user input, the set of medical data corresponding to the selected mode; and displaying, via the graphical user interface, a user-selectable set of objects in a timeline, the objects organized in rows, each row corresponding to a different one of a plurality of categories including pathology, diagnosis, and therapy.
In some aspects, retrieving the set of medical data includes: querying a unified patient database to identify a patient record for a patient from the unified patient database, the patient record including a patient object; identifying each of a set of objects connected to the patient object; and retrieving a predetermined subset of the identified set of objects for display.
In some aspects, the set of medical data corresponds to one or more of: a treatment object in the unified patient database, the treatment object storing treatment type, date, and response to treatment; a diagnostic finding object in the unified patient database, the diagnostic finding object storing biomarker data, staging data, and/or tumor size data; and a history object in the unified patient database, the history object storing surgical history, allergies, and/or family medical history.
In some aspects, the method further includes detecting a user interaction with an object in the set of objects; identifying and retrieving corresponding reports from a unified patient database; and displaying the report via a graphical user interface. In some aspects, the graphical user interface further includes a functional area displayed above the timeline, the functional area displaying a subset of the objects marked as important.
In some aspects, the graphical user interface further comprises an element for navigating to a second interface view, the method further comprising: detecting a user interaction with an element for navigating to the second interface view; and transition to a second interface view that displays tumor summary data.
In some embodiments, a method for managing patient data includes storing a patient record to a unified patient database, the unified patient database including data from a plurality of sources, the patient record including a plurality of data objects including a first primary cancer data object storing data elements corresponding to a first tumor mass of a patient and a second primary cancer data object storing data elements corresponding to a second tumor mass of the patient; presenting and causing display of a graphical user interface comprising a patient summary that summarizes patient data in the patient record in the unified patient database; detecting a user interaction with an element of the graphical user interface; in response to detecting the user interaction, retrieving, from the patient record in the unified patient database, data elements of the first primary cancer data object and the second primary cancer data object; presenting a first modality corresponding to a first primary cancer of the patient and a second modality corresponding to a second primary cancer of the patient; and causing a side-by-side display of the first modality and the second modality in the graphical user interface.
In some aspects, each of the modalities displays a set of biomarkers with timestamps, staging information, and metastatic site information. In some aspects, the plurality of sources includes two or more of the following: EMR (electronic medical records) systems, PACS (picture archiving and communication systems), Digital Pathology (DP) systems, LIS (laboratory information systems), RIS (radiology information systems), patient-reported outcomes, wearable devices, or social media websites.
In some embodiments, a method of processing medical data to facilitate clinical decisions includes receiving, via a portal, input medical data for a patient associated with a plurality of data categories associated with oncology workflow operations; generating structured medical data for the patient based on the input medical data, the structured medical data being generated to support oncology workflow operations that generate a diagnostic result including one of: the patient does not have cancer, the patient has a primary cancer, the patient has multiple primary cancers, or the patient has a cancer whose primary site is unknown; and displaying, via the portal, the structured medical data and a history of the patient's diagnostic results over time to enable clinical decisions to be made based on the history of diagnostic results.
In some aspects, the portal includes a data input interface to receive the input medical data and map the input medical data to fields to generate the structured medical data; and the data input interface organizes the structured medical data into one or more pages, each of the one or more pages being associated with a particular primary tumor site. In some aspects, the method further comprises receiving, via the data input interface, a first indication that a first subset of the medical data entered into a first page of the data input interface associated with a first primary tumor site belongs to a second primary tumor site; and, based on the first indication: creating a second page for the second primary tumor site; and populating the second page with the first subset of the medical data.
In some aspects, the method further comprises receiving, via the data input interface, a second indication that a second subset of the medical data input into the first page relates to metastasis of a second primary tumor site; and populating the second page with a second subset of the medical data based on the second indication. In certain aspects, the method further comprises importing the document file from a unified patient database; and retrieving the input medical data from the document file based on at least one of a Natural Language Processing (NLP) operation or a rule-based retrieval operation on text included in the document file.
In some aspects, the method further comprises displaying the document file in a document browser of the portal; and highlighting one or more portions of the document file from which the input medical data was retrieved. In some aspects, the method further includes displaying one or more data fields alongside the document browser; and displaying an indication of a subset of the one or more data fields to be populated with input medical data retrieved from the highlighted one or more portions of the document file to indicate correspondence between the subset of the one or more data fields and the highlighted one or more portions of the document file.
In some aspects, the indication includes emphasizing the subset of the one or more data fields and placing a highlighting marker around the highlighted portion or portions of the document file. In some aspects, the indication is displayed based on receiving input from a user via the portal. In some aspects, the highlighted portion or portions are determined based on detecting input from a user via the portal. In some aspects, the highlighted portion or portions are determined based on at least one of a Natural Language Processing (NLP) operation or a rule-based retrieval operation.
In some aspects, the method further includes determining one or more medical data categories of the captured input medical data; determining a mapping between one or more fields in the structured medical data and one or more medical data categories based on a Structured Data List (SDL); and populating one or more fields with the retrieved input medical data based on the mapping.
In some aspects, the mapping includes mapping the input medical data to a normalized value. In some aspects, the input medical data is received from one or more sources including at least one of: EMR (electronic medical records) systems, PACS (picture archiving and communication systems), Digital Pathology (DP) systems, LIS (laboratory information systems), RIS (radiology information systems), patient-reported outcomes, wearable devices, or social media websites.
These and other embodiments of the invention are described in detail below. For example, other embodiments relate to systems, apparatuses, computer products, and computer-readable media associated with the methods described herein.
The nature and advantages of embodiments of the invention may be better understood by reference to the following detailed description and the accompanying drawings.
Drawings
A detailed description is given with reference to the accompanying drawings.
Fig. 1 illustrates a conventional clinical decision making process to be improved by examples of the present disclosure.
FIG. 2 illustrates a medical data processing system to facilitate clinical decisions according to certain aspects of the present disclosure.
Fig. 3A, 3B, 3C, 3D, 3E, 3F, 3G, and 3H illustrate examples of data input interfaces of the medical data processing system of fig. 2 in accordance with certain aspects of the present disclosure.
Fig. 4A, 4B, and 4C illustrate examples of data extraction interfaces of the medical data processing system of fig. 2, in accordance with certain aspects of the present disclosure.
Fig. 5A, 5B, 5C, and 5D illustrate examples of the operation of the data extraction interface of fig. 4A-4C.
FIGS. 6A, 6B, 6C, and 6D illustrate additional examples of data extraction interfaces and operations of the medical data processing system of FIG. 2, according to certain aspects of the present disclosure.
Fig. 7A and 7B illustrate examples of data coordination interfaces and operations of the medical data processing system of fig. 2, according to certain aspects of the present disclosure.
Fig. 8A, 8B, and 8C illustrate examples of portal summary views that improve access to medical data of a patient in accordance with certain aspects of the present disclosure.
Fig. 9A, 9B, 9C, 9D, and 9E illustrate examples of portal patient lineage views that improve access to a patient's medical data according to certain aspects of the present disclosure.
Fig. 10 illustrates an example of a portal report view that improves access to medical data of a patient in accordance with certain aspects of the present disclosure.
Fig. 11 illustrates an example of a portal performance metrics view that improves access to medical data of a patient in accordance with certain aspects of the present disclosure.
Fig. 12 illustrates an example of a data schema for patient data in accordance with certain aspects of the present disclosure.
Fig. 13 illustrates another example of a data schema for patient data in accordance with certain aspects of the present disclosure.
Fig. 14A, 14B, 14C, and 14D illustrate an example overview workflow for patient data management in accordance with certain aspects of the present disclosure.
Fig. 15 illustrates a method of managing patient data from different sources in a unified manner in accordance with certain aspects of the present disclosure.
Fig. 16 illustrates another method of managing patient data to improve access to patient data in accordance with certain aspects of the present disclosure.
Fig. 17 illustrates a method of displaying patient data via a graphical user interface to improve access to the patient data, in accordance with certain aspects of the present disclosure.
Fig. 18 illustrates a method of managing and displaying patient data in accordance with certain aspects of the present disclosure.
Fig. 19A and 19B illustrate examples of an oncology workflow implemented by the medical data processing system of FIG. 2 according to certain aspects of the present disclosure.
Fig. 20A and 20B illustrate another example of oncology workflow implemented by the medical data processing system of fig. 2 in accordance with certain aspects of the present disclosure.
Fig. 21 illustrates a method of processing medical data to facilitate clinical decisions in accordance with certain aspects of the present disclosure.
FIG. 22 illustrates an exemplary computer system that can be used to implement the techniques disclosed herein.
Detailed Description
Techniques are described for improving clinician access to patient data for clinical decision making, such as clinical decisions related to oncology. In some examples, the medical data processing system may collect medical data of a patient from multiple data sources, convert the medical data into structured data, and present the structured data in a variety of forms, such as a summary format, a longitudinal time view report format, and the like. The medical data processing system may also support oncology workflow protocols that may support or perform diagnostic operations on the collected medical data and present diagnostic results to a clinician. Oncology workflow protocols may enable a clinician (such as an oncologist or his/her representative) to longitudinally manage patients suspected of having cancer through treatment and follow-up. A database and a graphical user interface for accessing the database are provided for updating and viewing patient data in oncology, e.g., representing the patient's history of diagnosis and/or treatment. For example, oncologists may use the graphical user interface to manage patient data and clearly understand the progression of cancer and the response to treatment over time.
In some examples, a medical data processing system includes a data collection module, a data extraction module, an enrichment module, a data access module, and a data coordination module. The medical data collection module may receive or retrieve medical data of a patient. Patient data may originate from a variety of data sources (at one or more healthcare institutions) including, for example, EMR (electronic medical records) systems, PACS (picture archiving and communication systems), Digital Pathology (DP) systems, LIS (laboratory information systems) including genomic data, RIS (radiology information systems), patient-reported outcomes, wearable and/or digital technology, social media, and the like.
The database system may ingest data from a number of sources. For example, data may be ingested from one or more external databases, such as an Electronic Medical Records (EMR) repository, a Picture Archiving and Communication System (PACS), and the like, as described above. The data may also be entered manually via fields in the user interface. The ingested data may include structured and unstructured data. Unstructured data may come from unstructured reports, such as PDF files. In the case of unstructured reports, machine learning (e.g., Optical Character Recognition (OCR) and/or Natural Language Processing (NLP)) is used to recognize the data and populate fields. A database system that ingests data from multiple sources and stores the data in such a new schema may be referred to as a unified patient database.
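As a rough illustration of the field-population step described above, the following Python sketch uses simple regular-expression rules as a stand-in for the OCR and NLP models; the report text, field names, and patterns are illustrative assumptions, not the system's actual models or schema.

```python
import re

# Stand-in for OCR output from a scanned PDF report; a real pipeline would obtain
# this text with an OCR model and refine the extraction with an NLP model.
report_text = """
Pathology report, 2021-02-05.
Primary site: right upper lung lobe. Tumor size: 2.4 cm.
Biomarker: EGFR mutation detected.
"""

# Rule-based field extraction, used here purely for illustration.
FIELD_PATTERNS = {
    "primary_site": r"Primary site:\s*([^.]+)\.",
    "tumor_size": r"Tumor size:\s*([\d.]+\s*cm)",
    "biomarker": r"Biomarker:\s*([^.]+)\.",
    "report_date": r"(\d{4}-\d{2}-\d{2})",
}

def populate_fields(text):
    """Populate structured fields from unstructured report text."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            fields[name] = match.group(1).strip()
    return fields

structured = populate_fields(report_text)
print(structured)
# e.g. {'primary_site': 'right upper lung lobe', 'tumor_size': '2.4 cm', ...}
```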
In a unified patient database, data may be stored in a graph structure in which data elements are linked to relate different cancers or other conditions of a patient to different treatments, observations, and so forth. The graph structure may also be used to relate different cancers to one another (e.g., a primary cancer and its metastases).
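To make the graph structure concrete, here is a minimal sketch (not the patented implementation) of a patient record whose diagnostic and treatment information is matched to a tumor mass and stored as connected objects; the matching rule and all field names are assumptions for illustration.

```python
patient_record = {
    "patient_id": "P-001",
    "objects": [
        {"id": "mass-1", "type": "tumor_mass", "location": "right upper lung lobe", "links": []},
    ],
}

def find_correlated_mass(record, site_text):
    """Identify the tumor mass object whose location matches the reported site (naive keyword match)."""
    for obj in record["objects"]:
        if obj["type"] == "tumor_mass" and obj["location"].lower() in site_text.lower():
            return obj
    return None

def store_connected_object(record, mass, obj_type, payload):
    """Store a new data object and connect it to the correlated tumor mass object."""
    new_obj = {"id": f"{obj_type}-{len(record['objects'])}", "type": obj_type, **payload}
    record["objects"].append(new_obj)
    mass["links"].append(new_obj["id"])
    return new_obj

diagnosis = {"site": "Right upper lung lobe", "stage": "IIIA", "biomarker": "EGFR+", "size_mm": 24}
mass = find_correlated_mass(patient_record, diagnosis["site"])
if mass is not None:
    # Diagnostic and treatment information is linked to the tumor mass it correlates with.
    store_connected_object(patient_record, mass, "diagnosis", diagnosis)
    store_connected_object(patient_record, mass, "treatment",
                           {"site": diagnosis["site"], "therapy": "chemotherapy"})
```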
The data may be ingested and enriched via a user interface. In particular, an interface is provided for data extraction. During the data extraction process, information may be extracted from a report and used to populate fields of the interface, which a user may confirm or edit to generate structured medical data. In the data enrichment process, enrichment operations are performed to improve the quality of the acquired medical data. Examples of enrichment operations include normalizing various values (e.g., body weight, tumor size, etc.), replacing non-standard terms provided by the patient with standardized terms, and filling in missing fields to characterize or supplement the data, which may involve displaying a drop-down menu of categories, data normalization formats, and so on. Fields are filled in or updated automatically and/or via user input. For example, the user may interact with interface elements to classify a tumor as a primary cancer (also referred to as a primary tumor) or a metastasis, or to fill in other fields such as date, time, doctor's records, etc.
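A minimal sketch of the kinds of enrichment operations described above, normalizing values to standard units and mapping non-standard terms to standardized vocabulary; the conversion factors are standard, but the tiny term dictionary and field names are illustrative assumptions (a production system would draw on full ICD/SNOMED mappings).

```python
# Unit normalization: convert assorted weight/size units to a standard unit.
UNIT_TO_STANDARD = {
    ("weight", "lb"): ("kg", 0.45359237),
    ("weight", "kg"): ("kg", 1.0),
    ("tumor_size", "cm"): ("mm", 10.0),
    ("tumor_size", "mm"): ("mm", 1.0),
}

# Tiny stand-in for a standardized terminology map (e.g., SNOMED-style terms).
TERM_MAP = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}

def normalize_value(field, value, unit):
    """Normalize a numeric value to the standard unit for its field."""
    standard_unit, factor = UNIT_TO_STANDARD[(field, unit.lower())]
    return round(value * factor, 3), standard_unit

def standardize_term(term):
    """Replace a non-standard, patient-supplied term with a standardized term if known."""
    return TERM_MAP.get(term.lower().strip(), term)

print(normalize_value("weight", 154, "lb"))        # (69.853, 'kg')
print(normalize_value("tumor_size", 2.4, "cm"))    # (24.0, 'mm')
print(standardize_term("Heart attack"))            # myocardial infarction
```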
Another interface view may be used for the coordination process. If data has been uploaded to the database but information is missing from the record, such as an associated primary cancer, stage, or surgery type, a coordination interface view may be triggered. For example, in the coordination process, a tumor may be associated with one or more primary cancers, which may trigger storing the data record for the tumor with an updated mapping in the unified patient database.
Patient history can be viewed at any time during the data ingestion, extraction, and coordination processes. The patient history is a timeline showing various multimodal elements of the patient's oncology history and medical history in chronological order. This facilitates visualizing patient cancer milestones and cancer progression (e.g., metastasis or recurrence). The patient history includes a set of objects in a timeline. The objects may correspond to categories such as pathology, diagnosis, and treatment. Each category may have a row in the timeline in which objects in the category are displayed chronologically. Each object may be selectable by a user. Upon detecting user interaction with an object, the system may retrieve and display supplemental information, reports, etc. via the graphical user interface.
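The sketch below shows one way such timeline rows could be assembled from stored objects: group the objects by category and order each row chronologically. The object layout and category names are assumptions for illustration.

```python
from collections import defaultdict

# Objects retrieved from a patient record; layout is illustrative.
objects = [
    {"category": "pathology", "date": "2021-02-05", "label": "Biopsy report"},
    {"category": "diagnosis", "date": "2021-02-10", "label": "Stage IIIA NSCLC"},
    {"category": "treatment", "date": "2021-03-01", "label": "Chemotherapy cycle 1"},
    {"category": "treatment", "date": "2021-03-22", "label": "Chemotherapy cycle 2"},
]

def build_timeline_rows(objs, categories=("pathology", "diagnosis", "treatment")):
    """Group objects into one row per category and order each row chronologically."""
    rows = defaultdict(list)
    for obj in objs:
        if obj["category"] in categories:
            rows[obj["category"]].append(obj)
    return {cat: sorted(rows[cat], key=lambda o: o["date"]) for cat in categories}

for category, row in build_timeline_rows(objects).items():
    print(category, "->", [f'{o["date"]}: {o["label"]}' for o in row])
```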
Furthermore, the techniques may improve clinician access to patient data for clinical decisions, such as clinical decisions related to oncology. In some examples, the medical data processing system may collect medical data of a patient from multiple data sources, convert the medical data into structured data, and present the structured data in a variety of forms, such as a summary format, a longitudinal time view report format, and the like. The medical data processing system may support oncology workflows in which a clinician may make various diagnoses at different stages of the workflow. The medical data processing system may facilitate the clinician's entry of diagnostic results at different stages of the workflow and perform post-processing of the data, both of which enable the clinician to longitudinally manage patients suspected of having cancer through treatment and follow-up. The medical data processing system may also support other medical applications, such as a quality of care assessment tool for assessing the quality of care administered to a patient, a medical research tool for determining correlations between various patient information (e.g., demographic information) and tumor information (e.g., prognosis or expected survival), and the like. The techniques can also be applied to other disease areas, not just oncology.
In some examples, the medical data collection module also provides a portal to allow structured medical data to be entered into and displayed by the system. The structured medical data may include various information related to tumor diagnosis, such as tumor location, stage, pathology information (e.g., biopsy results), diagnostic procedures, and biomarkers for the primary tumor as well as for additional tumor locations (e.g., due to metastasis of the primary tumor). The portal may display the structured data in the form of a patient summary. The portal may also organize the display of structured data into pages, each page being associated with a particular primary tumor site, including information fields for the associated primary tumor site, and accessible through a tab. The data input interface may allow a user to manually input medical data. Based on detecting user input to certain fields in the page of a first primary tumor (e.g., designating an additional tumor site as a new primary tumor), the medical data collection module may create an additional page for the second primary tumor and populate the fields of the newly created page for the second primary tumor based on the additional tumor site information entered into the page of the first primary tumor. In some examples, the medical data collection module also allows the user to select additional tumor masses found in the diagnostic procedure for the primary tumor and associate the masses with the second primary tumor to represent the condition of metastasis. Based on detecting the association, the medical data collection module may transfer all diagnostic results for the additional tumor from the page of the first primary tumor to the page created for the second primary tumor.
In addition, the portal allows the user to import document files (e.g., pathology reports, doctor records, etc.) from the data sources. The medical data extraction module may then perform data extraction operations to extract various medical data from the document file and to populate fields of the patient summary to generate structured medical data. In some examples, the medical data may be retrieved based on performing, for example, natural Language Processing (NLP) operations, rule-based retrieval operations, and the like on text included in the document file. In some examples, medical data may also be retrieved from metadata of the document file, such as the date of the file, the category of the document file (e.g., pathology report and clinician's record), the clinician writing/signing the document file, and the type of procedure associated with the document file content (e.g., biopsy, imaging, or other diagnostic step). The retrieved medical data may then be used to automatically populate various fields of the patient summary. The medical data extraction module may also highlight portions of the document file from which the structured medical data was extracted, as well as fields to be populated with structured medical data, to allow a user to track/verify the results of the data extraction. In some examples, the medical data extraction module may also support manual retrieval of structured medical data from a document file via a portal.
In addition, the enrichment module may perform various enrichment operations to improve the quality of the captured medical data. One enrichment operation may include a normalization operation to normalize various values (e.g., body weight, tumor size, etc.) contained in the captured medical data into standardized units to correct data errors, or to replace non-standard terms provided by the patient with standardized terms from various medical standards/protocols, such as the International Classification of Diseases (ICD) and the Systematized Nomenclature of Medicine (SNOMED). The enriched captured medical data may then be stored in the unified patient database as part of the structured medical data (e.g., structured tumor data) for the patient. In addition, in the event that the portal receives medical data manually entered by the user, the enrichment module may also control the portal to display a drop-down menu that includes standardized alternatives (e.g., SNOMED terms) the user may select as input, to ensure that the user enters standardized medical data into the medical data processing system.
The medical data extraction module and the enrichment module may be continually adjusted to improve the extraction and normalization processes. For example, some of the original unstructured patient data from a data source may be manually labeled to indicate the mapping of certain data elements as a ground truth. For example, a text sequence in a physician's record may be labeled as a ground truth indication of an adverse effect of a treatment. The labeled physician records may be used to train the NLP model of, e.g., the data extraction module, enabling the NLP model to extract text indicating adverse effects from other unlabeled physician records. The NLP model can also be trained with other training data sets including, for example, generic data models, data dictionaries, and hierarchical data (i.e., dependencies between/among text) to extract data elements based on semantic and contextual understanding of the extracted data. For example, the natural language processor may be trained to select, from a standardized set of candidates for cancer registry data elements, the candidate whose meaning is closest to the extracted data. In addition, some of the extracted data (such as numeric data) may also be updated or validated as part of the process to be consistent with one or more data normalization rules.
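As a simplified stand-in for the NLP training described above, the sketch below fits a small bag-of-words classifier (scikit-learn) on a few hand-labeled note snippets marked as describing treatment adverse effects or not; the snippets, labels, and model choice are illustrative assumptions, not the system's actual training data or architecture.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hand-labeled physician note snippets (ground truth); 1 = describes an adverse effect.
snippets = [
    "Patient reports severe nausea and fatigue after the second chemotherapy cycle.",
    "Grade 2 neuropathy noted following oxaliplatin administration.",
    "Follow-up imaging scheduled for next month.",
    "Patient tolerated the infusion well with no complaints.",
]
labels = [1, 1, 0, 0]

# Simple text classifier used as a stand-in for a trained NLP extraction model.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(snippets, labels)

new_note = "Developed mucositis and vomiting after radiation therapy."
print(model.predict([new_note]))  # expected to flag the note as describing adverse effects
```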
Furthermore, the oncology workflow module may conduct or support diagnostic operations based on the structured medical data provided by the medical data collection module. In one example, a diagnostic operation can be performed to confirm whether biopsy results are for the same primary tumor or for different tumors, and to track the size of the primary tumor to assess the tumor's response to a particular treatment. In another example, a diagnostic operation may be performed to determine whether a patient has a single primary tumor site, multiple primary tumor sites, or an unknown primary site. The results of the diagnostic operations may then be recorded and/or displayed over time in the portal as part of the patient's medical history, enabling oncologists or their representatives to longitudinally manage patients suspected of having cancer through treatment and follow-up. The diagnostic results can also be used to support other medical applications, such as a quality of care assessment tool for assessing the quality of care administered to a patient, a medical research tool for determining correlations between various patient information (e.g., demographic information) and tumor information (e.g., prognosis or expected survival), and the like.
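One way to picture the diagnostic-result determination is a simple rule over the structured tumor data, as in the sketch below; the decision rule and data layout are assumptions for illustration, not the claimed diagnostic logic.

```python
def determine_diagnostic_result(primary_tumors):
    """Classify the patient's status from the list of recorded primary tumor sites.

    Each entry is a dict with a 'site' key, where None represents a confirmed
    cancer whose primary site could not be identified.
    """
    if not primary_tumors:
        return "patient does not have cancer"
    if any(t["site"] is None for t in primary_tumors):
        return "patient has cancer with unknown primary site"
    if len(primary_tumors) == 1:
        return "patient has a primary cancer"
    return "patient has multiple primary cancers"

print(determine_diagnostic_result([]))
print(determine_diagnostic_result([{"site": "right upper lung lobe"}]))
print(determine_diagnostic_result([{"site": "right upper lung lobe"}, {"site": "ascending colon"}]))
print(determine_diagnostic_result([{"site": None}]))
```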
The disclosed technology enables medical data to be aggregated and retrieved to generate a patient summary and to display the data in a portal. By providing all relevant medical data in the portal, organized according to tumor site, clinician access to the medical data can be significantly improved, which in turn can facilitate clinician decision making and care management for the patient. Furthermore, as part of the oncology workflow, automated diagnostic operations may be performed that replicate part of a clinician's diagnostic process, which may reduce the clinician's workload. Furthermore, displaying the diagnostic results in the portal as part of the patient history, rather than the raw medical data, may provide the clinician with better visualization of the patient's medical state. This enables the oncologist or his/her representative to longitudinally manage patients suspected of having cancer through treatment and follow-up. All of these aspects can improve the quality of care provided to the patient.
I. Clinical decision making
FIG. 1 is a chart 100 illustrating a conventional clinical decision making process. As shown in FIG. 1, a clinician 102 may obtain medical data 104 of a patient, which may include structured medical data 106 and unstructured medical data 108, to generate clinical decisions 110. Structured medical data 106 may include different categories of data including, for example, demographic information (age, gender, etc.) of the patient, diagnostic results described according to various standardized codes such as International Classification of Diseases (ICD), Diagnosis-Related Group (DRG), Current Procedural Terminology (CPT), and SNOMED codes, medication history (e.g., Anatomical Therapeutic Chemical (ATC) classification), clinical chemistry and immunochemistry results, and the like. Furthermore, unstructured medical data 108 may include different categories of data including various medical reports, such as pathology reports, radiology reports, sequencing laboratory reports, surgical reports, admission reports, discharge reports, doctor records, and the like. Clinical decisions 110 may include, for example, medications administered to a patient, physical treatments (e.g., radiation), and surgery. Medical data 104 is typically stored in different data sources, such as EMR (electronic medical records) systems, PACS (picture archiving and communication systems), Digital Pathology (DP) systems, and LIS (laboratory information systems).
The clinician 102 may need to access each category of data listed in the medical data 104 to make a decision. For example, the clinician 102 may need to access pathology reports and surgical reports to obtain information about a tumor. The clinician 102 may also need to access the radiology report to determine whether the tumor is localized or the cancer cells have spread, and to access the sequencing laboratory report to obtain biomarker information. The clinician 102 may also need to access the physician record to obtain information about, for example, another clinician's treatment history for the patient. All of this data is critical to determining the treatment of the patient. For example, based on the radiology report, a clinician may determine that the tumor is localized, and may administer certain physical therapies (e.g., radiation therapy) for the localized tumor. Furthermore, certain drugs may be administered to the site based on the presence of certain biomarkers.
While the clinician 102 may access a large number of different medical data sets to make clinical decisions, acquiring medical data from different data sources can be very laborious. The lack of structured and standardized medical data also makes acquisition difficult. For example, the clinician 102 needs to read through and interpret a large number of medical reports to obtain the information they are looking for. The clinician 102 may also need to consider the physician's habit in composing the report in order to properly interpret the report. All of this is laborious and error-prone, which can affect the clinician's ability to determine and administer high quality care to the patient.
II. Medical data processing system
FIG. 2 illustrates an example of a medical data processing system 200 that may address at least some of the problems described above. Medical data processing system 200 may collect medical data 242 for a patient and convert medical data 242 into structured patient data 202. The medical data processing system 200 may also store the structured patient data 202 to a unified patient database 204. The unified patient database 204 may store data retrieved from various sources in a unified manner. The data may originate from one or more patient data sources 240. Patient data sources 240 may include one or more external databases or other sources, such as an Electronic Medical Records (EMR) repository, a Picture Archiving and Communication System (PACS), a Digital Pathology (DP) system, an LIS (laboratory information system) including genomic data, an RIS (radiology information system), patient-reported outcomes, wearable and/or digital technology, social media, and so forth. The data stored to the unified patient database 204 may include unstructured data, such as PDFs or images of scanned documents, as well as information entered directly into the medical data processing system 200 via portal 220. The unified patient database 204 may store a plurality of records, each record corresponding to a particular patient. Each patient record may include a network of interconnected data objects. The data schemas used in the unified patient database 204 are described in further detail below with reference to FIGS. 12 and 13.
Where the medical data relates to oncology, the structured patient data 202 may include various data categories, such as patient biographical information 212, oncological diagnostic information 214, treatment history 216, and biomarkers 218. The tumor diagnostic information 214 may also include various data subcategories or data types within a particular data category, such as tumor sites 214a, stage 214b, pathology information 214c (e.g., biopsy results), and diagnostic procedures 214d. Medical data processing system 200 further includes portal 220, which may present structured data in various forms, such as in a summary format, in a longitudinal time view reporting format, and the like, as shown in FIGS. 3A-11. In some implementations, portal 220 is displayed on a display component of a computing device separate from medical data processing system 200. For example, a diagnostic computer (not shown) displays portal 220 and receives user input, such as medical data 242.
In addition, medical data processing system 200 can support oncology workflow application 222. Oncology workflow application 222 may determine data to be collected by medical data processing system 200 to support oncology workflow. In addition, as described below, oncology workflow application 222 may perform (or support) analysis on the collected medical data and generate analysis results 224. The analysis may include determining a tumor status of the patient based on the structured patient data 202, e.g., whether the patient has a single tumor or multiple tumors, whether the patient has metastasis, etc. The analysis results may be updated each time new data (e.g., new diagnostic results, new biopsy results, etc.) is added for the patient. In some embodiments, oncology workflow application 222 executes on a diagnostic computer.
The analysis results presented in portal 220 may enable a clinician, such as an oncologist or his/her representative, to longitudinally manage patients suspected of having cancer through treatment and follow-up. The results of the diagnostic operations may be recorded and/or displayed in the portal over time as part of the patient's medical history. Portal 220 may thereby enable oncologists or their representatives to longitudinally manage patients suspected of having cancer through treatment and follow-up. The analysis results may also be used to support other medical applications, such as a quality of care assessment tool for assessing the quality of care administered to a patient, a medical research tool for determining correlations between various patient information (e.g., demographic information) and tumor information (e.g., prognosis or expected survival), and the like. Medical data processing system 200 may store structured patient data 202 and analysis results 224 in unified patient database 204, from which other medical applications may access the structured data and analysis results.
As shown, medical data processing system 200 includes portal 220, data collection module 230, data extraction module 232, enrichment module 234, and data access module 236. The data collection module 230 may receive medical data 242 from a user via a data input interface of the portal 220, wherein the user may input data into various fields, and may create structured patient data 202 via a mapping between the fields and the input data.
In addition, the data collection module 230 may also receive medical data 242 directly from the portal 220, which may provide a document extraction interface that allows a user to import a document file 244 (e.g., a pathology report, doctor record, etc.) from a patient data source 240. The data extraction module 232 may perform extraction operations on the document file 244, in which the data extraction module 232 retrieves medical data from the document file and maps the retrieved data to various data categories. The mapping may be based on a master Structured Data List (SDL) 246 that defines a list of data categories for the document type of the document file 244 to support the oncology workflow application 222. Patient data sources 240 (at one or more healthcare institutions) may include, for example, EMR (electronic medical records) systems, PACS (picture archiving and communication systems), Digital Pathology (DP) systems, LIS (laboratory information systems) including genomic data, RIS (radiology information systems), patient-reported outcomes, wearable and/or digital technology, social media, and the like. After the extraction operation, the user may edit and/or confirm the data retrieved from the document.
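To illustrate how a master Structured Data List might drive the mapping, the sketch below keys the expected data categories on document type and keeps only extracted values the SDL lists for that type; the SDL contents and document types shown are assumptions.

```python
# Stand-in for a master Structured Data List (SDL): expected data categories per document type.
SDL = {
    "pathology_report": ["primary_site", "histology", "tumor_size", "biomarker"],
    "physician_note": ["diagnosis", "medication", "adverse_effect"],
}

def map_extracted_to_categories(document_type, extracted):
    """Keep only the extracted items whose category the SDL defines for this document type."""
    allowed = set(SDL.get(document_type, []))
    mapped = {k: v for k, v in extracted.items() if k in allowed}
    dropped = sorted(set(extracted) - allowed)
    return mapped, dropped

extracted = {"primary_site": "right upper lung lobe", "tumor_size": "2.4 cm", "clinic_phone": "555-0100"}
mapped, dropped = map_extracted_to_categories("pathology_report", extracted)
print(mapped)    # categories defined by the SDL for pathology reports
print(dropped)   # items not covered by the SDL (here: clinic_phone)
```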
In addition, the enrichment module 234 may perform various enrichment operations to improve the quality of the captured medical data, such as performing normalization operations. For example, a normalization operation may be performed to normalize various values (e.g., body weight, tumor size, etc.) contained in the captured medical data into standardized units to correct data errors, or to replace non-standard terms provided by the patient with standardized terms from various medical standards/protocols, such as the International Classification of Diseases (ICD) and the Systematized Nomenclature of Medicine (SNOMED). As described below, the enrichment module 234 can normalize data received from the data collection module 230 and/or the data extraction module 232. The enriched captured medical data may then be stored to the unified patient database 204 as part of the structured patient data 202 (e.g., structured tumor data) for the patient. The enrichment module 234 may also operate with the portal 220 to provide interface elements, such as drop-down menus, that include standardized alternatives the user may select as input, to ensure that the user enters standardized medical data into the medical data processing system.
The data access module 236 may provide temporary storage of data received from the data collection module 230 and the data extraction module 232 and may update the data in the temporary storage based on edits made to the data by the user through the portal 220. The data access module 236 may release the data as structured patient data 202 to the unified patient database 204 after receiving confirmation from the user, via the portal 220, that the data is complete and may be released to the unified patient database 204. In addition, the data access module 236 may provide access to the data in temporary storage to various applications, such as the oncology workflow application 222. This allows applications that support the workflow to provide users of the data collection module 230 and the data extraction module 232 with information for tracking and managing data input and data extraction operations.
The data coordination module 238 may identify data elements in the unified patient database 204 that are missing information required for proper storage and display of patient data. For example, if a data record for a particular cancer mass is not associated with a primary cancer site, the cancer mass may be marked for coordination. The data coordination module 238 may provide UI elements that prompt the user to enter the necessary information (e.g., associating the cancer mass with a primary cancer, either as a new primary cancer or as a metastasis of another primary cancer). The data coordination module 238 may retrieve the user input and modify the data record for the cancer mass to associate the cancer mass with the primary cancer identified via the user input to the UI.
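A minimal sketch of this coordination step: cancer-mass records lacking a primary cancer association are flagged, and the association supplied by the user through the coordination UI is written back. The record fields and flagging rule are illustrative assumptions.

```python
# Cancer-mass data records; 'primary_cancer_id' is None when the association is missing.
records = [
    {"id": "mass-1", "site": "right upper lung lobe", "primary_cancer_id": "pc-1"},
    {"id": "mass-2", "site": "liver", "primary_cancer_id": None},
]

def find_records_needing_coordination(recs):
    """Mark records that are missing the primary cancer association."""
    return [r for r in recs if r["primary_cancer_id"] is None]

def apply_user_coordination(record, primary_cancer_id, relationship):
    """Apply the association the user selected in the coordination UI (e.g., 'metastasis_of')."""
    record["primary_cancer_id"] = primary_cancer_id
    record["relationship"] = relationship
    return record

for rec in find_records_needing_coordination(records):
    # In the real workflow this choice would come from the user via the coordination interface.
    apply_user_coordination(rec, primary_cancer_id="pc-1", relationship="metastasis_of")

print(records[1])  # mass-2 is now linked to primary cancer pc-1 as a metastasis
```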
III. Example interfaces
Figures 3A-11 illustrate various interfaces that may be used to display patient data and facilitate ingestion and organization of patient data for clinical decision making. The data entry interfaces of FIGS. 3A-7B may be used to import and organize data to be stored in a unified patient database. The view interfaces of FIGS. 8A-11 may be used to retrieve and display data from the unified patient database for use in clinical decision making.
A. Data input interface
Fig. 3A, 3B, 3C, 3D, 3E, 3F, 3G, and 3H illustrate examples of portal 220. These examples provide an interface for managing medical data of an example patient.
1. Summary page
As shown in fig. 3A, portal 220 may provide a data input interface 300 to input data to support oncology workflow applications 222. The data input interface 300 may guide a user in manually entering data and/or approving or editing automatically retrieved data. The data received via the data input interface 300 of the portal 220 may be stored to the unified patient database 204 in an appropriate manner based on the fields 308 of the data input interface 300 using data schemas such as those described below with reference to fig. 12 and 13. The data from the unified patient database 204 may then be retrieved to display further interface views, such as a patient history view, that show a longitudinal time view report of patient data over time, as shown in fig. 9A-9E.
The data input interface 300 includes various fields for information related to tumor diagnosis, such as a field 302 for tumor sites, a field 304 for stage, a field 306 for pathology information (e.g., biopsy results), a field 308 for diagnostic procedures, and a field 310 for biomarkers. Fields 302-310 may form a patient summary page 311 for a particular tumor site. In addition to the patient summary page 311, the data input interface 300 may also include fields for other information, such as patient reports 312, tumor therapy information 314 regarding a set of tumor therapies that the patient has received, current medication information 316 regarding current medications the patient is receiving, and patient history information 318 regarding various histories of the patient (e.g., medical history, surgical history, family history, social history, and substance use history). The data entry interface 300 provides an interface to aggregate the different modalities of patient data and then convert the data into structured patient data 202. The fields and various options provided in the data entry interface 300 may be defined based on the oncology workflow application 222.
Each of the patient summary page 311, patient report 312, tumor therapy information 314, current medication information 316, and patient history information 318 further includes a publish button. For example, the patient summary page 311 includes a publish button 319. As described above, when the data input interface 300 receives data entered into the various fields, the data access module 236 may store the data in temporary storage and withhold it from the unified patient database 204. Activation of the publish button 319 may prompt the data access module 236 to send the data as structured patient data 202 to the unified patient database 204.
The data input interface 300 may provide a variety of ways to input data for most fields, including manually inputting data and extracting data from a document file. For example, links 315a and 315b may be provided in the field for tumor therapy information 314. Activation of link 315a may cause display of a data extraction portal (e.g., as described below with reference to fig. 4A-6D) to retrieve the tumor treatment information from a document file, while activation of link 315b may cause display of a text box and/or a drop-down menu to allow a user to manually enter the tumor treatment information, as now described with reference to fig. 3B-3F.
2. Operation of summary page
Fig. 3B to 3F show an example of the operation of the patient summary page 311 when receiving data manually input by the user. Referring to operation 320 of fig. 3B, the primary tumor field 302 may receive the input text "right upper lung lobe" (e.g., the location), but because the diagnosis has not yet been confirmed and is still pending, a "pending diagnosis" flag 321 is displayed. The title of the patient summary page 311 is still "unnamed primary". In addition, the diagnostic procedure field 308 may receive input text indicating that positron emission tomography-computed tomography (PET-CT) was performed as part of a diagnostic procedure and that the findings were consistent with a lung tumor and liver metastasis. The input text further indicates the size of the masses found in the lung and liver. In operation 322, in the primary tumor field 302, the "pending diagnosis" flag 321 is cleared to confirm that the tumor in the right upper lung lobe is the primary tumor. Further, additional information is entered into pathology field 306. Such designations may be imported by medical data processing system 200 and stored in unified patient database 204 according to the structured fields established via the interface.
Referring to operation 324 of fig. 3C, upon detecting that the "pending diagnosis" flag has been cleared, the data input interface 300 may change the title of the patient summary page 311 from "unnamed primary" to "right upper lung lobe" to reflect that the information in fields 302-310 pertains to a tumor of the right upper lung lobe. Further, referring to operation 326 of fig. 3C, upon detecting that the add icon 325 is activated, the data input interface 300 may display an additional set of fields for the user to enter information regarding a new diagnostic procedure. Such information may include, for example, the date of the new diagnostic procedure, the name of the procedure, and the findings. In addition, a drop-down menu 332 is provided to select, in field 334, the location of the tumor mass found in the new diagnostic procedure. The candidates listed in the drop-down menu 332 may be provided by the enrichment module 234 as standardized terms, such that only standardized terms are entered into field 334. As shown in fig. 3C, in operation 326, an additional tumor mass (an ascending colon mass) is added as a result of the new diagnostic procedure.
Fig. 3D, 3E, and 3F illustrate examples of operations to create a new page for a second primary tumor after page 311 (for the primary tumor of the right upper lung lobe) is populated with data. Referring to fig. 3D, in operation 340, the data input interface 300 may provide a drop-down menu 342 upon detecting that the additional tumor mass listed in the new diagnostic procedure is selected. Drop-down menu 342 includes an option 344 that allows the user to designate the newly added tumor mass (the ascending colon mass) as a new primary tumor. Referring to fig. 3E, in operation 350, upon detecting a selection designating the newly added tumor site in the colon as a new primary tumor, the data input interface 300 may create a new page 352 for the primary tumor of the ascending colon, in addition to page 311 for the primary tumor at the right upper lung lobe. The enrichment module 234 may also add the standardized term "adenocarcinoma" to supplement the primary tumor site information (ascending colon) of page 352. In addition, fields 302-310 of page 352 are populated with information from page 311, such as the new diagnostic procedure added in operation 326 of fig. 3C. As a result of operation 340, the data collection module 230 may create a first data structure for the primary tumor site in the right upper lung lobe and a second data structure for the primary tumor site in the ascending colon, each data structure including a set of tumor diagnostic information, treatment history, and biomarkers, as part of the structured patient data 202 for the patient.
After creating page 352 for the second primary tumor site (ascending colon), certain diagnostic results of page 311 (for the primary tumor of the right upper lung lobe) may be associated with the second primary tumor site. For example, referring to fig. 3F, the diagnostic results of page 311 include information 360 about an additional tumor mass in the right upper lung lobe. In operation 362, the data input interface 300 can detect selection of the information 360 and output a menu 364 that includes an option 366 to associate the additional tumor mass with the second primary tumor site (ascending colon). Upon detecting selection of option 366, the data collection module 230 may move information 360 into page 352 for the second primary tumor site to indicate that the additional tumor mass in the right upper lung lobe is a metastasis of the second primary tumor site in the ascending colon.
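The per-primary data structures and the reassignment of a tumor mass as a metastasis described above can be sketched as follows. This is a minimal illustration under assumed names (TumorMass, PrimaryTumorRecord, and reassign_as_metastasis are hypothetical), not the actual format of structured patient data 202:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TumorMass:
    site: str                      # standardized anatomical site, e.g., "ascending colon"
    size_cm: Optional[str] = None  # e.g., "1.9x1.6x1.4"
    is_metastasis: bool = False    # True if this mass is a metastasis of the primary

@dataclass
class PrimaryTumorRecord:
    primary_site: str                              # e.g., "right upper lung lobe"
    histology: Optional[str] = None                # e.g., "adenocarcinoma"
    masses: List[TumorMass] = field(default_factory=list)
    treatments: List[str] = field(default_factory=list)
    biomarkers: List[str] = field(default_factory=list)

def reassign_as_metastasis(mass: TumorMass,
                           source: PrimaryTumorRecord,
                           target: PrimaryTumorRecord) -> None:
    """Move a tumor mass from one primary's record to another, marking it as a metastasis."""
    source.masses.remove(mass)
    mass.is_metastasis = True
    target.masses.append(mass)

# Example mirroring operations 340-362: two primaries, then the additional lung mass
# is reassigned as a metastasis of the ascending colon primary.
lung = PrimaryTumorRecord(primary_site="right upper lung lobe")
colon = PrimaryTumorRecord(primary_site="ascending colon", histology="adenocarcinoma")
extra_lung_mass = TumorMass(site="right upper lung lobe")
lung.masses.append(extra_lung_mass)
reassign_as_metastasis(extra_lung_mass, source=lung, target=colon)
assert extra_lung_mass.is_metastasis and extra_lung_mass in colon.masses
```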
3. Adding various categories of medical data
A patient summary view 370 of portal 220 is shown in fig. 3G. The patient summary view 370 is a view of a graphical user interface for viewing and modifying patient data. The patient summary view 370 includes an add button 372. In response to detecting a user interaction with the add button 372, an add data modality 374 is displayed. The add data modality 374 may be a web page element displayed in front of other page content. The add data modality 374 may deactivate page content outside of the add data modality 374 when displayed. The add data modality 374 includes a list of data types and data categories for which data can be entered and stored. The data types and data categories shown in fig. 3G include allergen, biomarker, environmental risk, family history, current medical history, drug, metastatic site, tumor treatment, radiation, surgery, systemic antineoplastic agent 375, tumor profile, physical stamina, primary cancer, social history, stage, substance use history, and surgical history. A data category may include one or more data types. For example, in this example, radiation, surgery, and systemic antineoplastic agent 375 are data types within the tumor treatment data category. The data types and data categories shown in the add data modality 374 may correspond to data objects stored in a mapping structure in the unified patient database, where the data types and data categories map and organize the corresponding data elements. For example, the data objects may include a patient root data object 1201 that maps to associated data objects, including a tumor mass data object 1202, a diagnostic findings data object 1205, a treatment data object 1208, and a history data object 1210, as depicted in fig. 12. This data schema facilitates the display of the patient summary view, and information entered via the patient summary view can be used to modify data in the unified patient database, as described further below in section IV.
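As a rough illustration of such a patient-rooted mapping structure, the sketch below links a patient root object to tumor mass, diagnostic findings, treatment, and history objects. The object and attribute names are hypothetical and follow the figure reference numbers only for readability; they are not the actual schema of fig. 12:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DataObject:
    object_type: str                                   # e.g., "tumor_mass", "treatment"
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List["DataObject"] = field(default_factory=list)

    def add_child(self, child: "DataObject") -> "DataObject":
        self.children.append(child)
        return child

# Patient root object (cf. object 1201) with its associated data objects.
patient_root = DataObject("patient", {"patient_id": "P-0001"})
patient_root.add_child(DataObject("tumor_mass", {"site": "right breast"}))                      # cf. 1202
patient_root.add_child(DataObject("diagnostic_findings", {"biomarker": "ER positive"}))         # cf. 1205
patient_root.add_child(DataObject("treatment", {"category": "systemic antineoplastic agent"}))  # cf. 1208
patient_root.add_child(DataObject("history", {"category": "family history"}))                   # cf. 1210

def objects_of_type(root: DataObject, object_type: str) -> List[DataObject]:
    """Route reads and writes for a data type or category to the corresponding objects."""
    return [c for c in root.children if c.object_type == object_type]

print([o.attributes for o in objects_of_type(patient_root, "treatment")])
```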
Each of these data types and data categories may correspond to a different set of configured data fields. In response to a user interaction with one of the displayed data types or data categories, portal 220 may transition to data input view 380, which includes the data fields corresponding to the selected data type, as depicted in fig. 3H. As shown in fig. 3G, cursor 376 indicates a user interaction with the displayed data type systemic antineoplastic agent 375. Upon hovering, systemic antineoplastic agent 375 is highlighted. Clicking on systemic antineoplastic agent 375 causes the interface to transition to data input view 380, which includes the data fields corresponding to systemic antineoplastic agent 375.
FIG. 3H illustrates the data input view 380 of portal 220 in accordance with some embodiments. The data input view 380 may be used to receive medical data of a patient via portal 220. The data is stored to a unified patient database in the patient record, which may be organized in a data map in which data elements (e.g., entered via the interface) are mapped to each other based on their configured data types, as shown in fig. 12. Menu 382 includes a set of fields that can accept user input to manually provide information corresponding to the respective fields. These fields may include drop-down menus from which the type of treatment, the primary cancer, the status, or the outcome may be selected, as well as fields configured to accept typed user input, such as the number of cycles, start date, end date, responsible party, and additional description. In response to detecting a user interaction with save button 384, the system saves the data entered into the fields. For example, the data elements entered into each field may be saved to the unified patient database 204, categorized based on the data type corresponding to that field.
B. Interface for managing ingestion of data from unstructured reports
Fig. 4A-6D illustrate examples of interfaces for managing data from unstructured reports. Fig. 4A-4C illustrate examples of document extraction interfaces for importing information from a report file. Fig. 5A-5D illustrate examples of operations for retrieving data from a report using an extraction interface. Fig. 6A-6D illustrate further examples of interfaces for retrieving fields from a report.
1. Retrieving data from a report
In addition to manually entering data, portal 220 also allows a user to import a document file 244 (e.g., pathology report, doctor record, etc.) from patient data source 240, from which data extraction module 232 may retrieve various structured medical data. Fig. 4A, 4B, 4C illustrate examples of a document extraction interface 400 that may be part of portal 220.
FIG. 4A illustrates a document extraction interface 400 that may be used to guide a user in validating or updating data retrieved from a document. As shown in fig. 4A, the document extraction interface 400 includes a document catalog 402, a document browser 404, and a retrieved medical data portion 406. The document catalog 402 may display a list of selectable icons, including an icon 407 that represents documents to be selected (or documents that have been selected) for medical data retrieval and extraction operations. In addition, the document browser 404 may display the selected document. As described below, the document extraction interface 400 may highlight the portion of the document from which medical data was retrieved from the document browser 404, which allows the user to track the source of the retrieved medical data. The retrieved medical data portion 406 may include a report page 408 and a results page 410. Report page 408 may include a list of metadata retrieved from the selected document, including, for example, document name 408a, report date 408b, and document type 408c. Results page 410 includes a set of fields corresponding to a set of data categories to be retrieved from a selected document or entered by a user. In some examples, results page 410 may be part of a patient summary as described in fig. 3A-3H.
As described above, the set of fields included in the results page 410 may be defined based on a master Structured Data List (SDL) 246, which the data extraction module 232 may select based on the document type 408c. Fig. 4B and 4C show examples of data categories to be retrieved for different document types. Fig. 4B shows an example results page 411 for a pathology report, which provides information about a cancer diagnosis. As shown in fig. 4B, various categories of data may be extracted from the pathology report, including diagnostic information 412, staging information 414, and additional description 416. Further, the diagnostic information 412 may include various fields such as, for example, tumor site information 412a, histological type 412b, histological grading 412c, biomarker information 412d, etc., while the staging information 414 may include various fields to describe the tumor stage. In addition, fig. 4C shows an example results page 420 for a cytology report, which provides information regarding the examination of cells from the patient's body. As shown in fig. 4C, various categories of data, such as tumor site information 420a and biomarker information 420b, may be retrieved from the cytology report. The data categories shown in fig. 4B may be defined based on the SDL 246 that the data extraction module 232 selects when the document type 408c of the selected document indicates that the document is a pathology report, while the data categories shown in fig. 4C may be defined based on the SDL 246 selected when the document type 408c of the selected document indicates that the document is a cytology report.
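The selection of a field set based on document type can be sketched as a simple lookup; the registry contents, field names, and fallback behavior below are hypothetical examples rather than the actual contents of SDL 246:

```python
# Hypothetical SDL registry keyed by document type.
SDL_REGISTRY = {
    "pathology report": [
        "tumor_site", "histological_type", "histological_grade",
        "biomarkers", "stage", "additional_description",
    ],
    "cytology report": [
        "tumor_site", "biomarkers",
    ],
}

def fields_for_document(document_type: str) -> list[str]:
    """Return the data categories (results-page fields) to extract for a document type."""
    try:
        return SDL_REGISTRY[document_type.lower()]
    except KeyError:
        # Fall back to a generic field set when the document type is unrecognized.
        return ["report_date", "document_type", "additional_description"]

print(fields_for_document("Pathology Report"))
# ['tumor_site', 'histological_type', 'histological_grade', 'biomarkers', 'stage', 'additional_description']
```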
2. Capturing the result
Fig. 5A, 5B, 5C, and 5D illustrate example operations of the document extraction interface 400 on a pathology report. The document extraction interface 400 may be used to guide a user in validating the data types of data to be integrated into a unified patient database, such as in fields automatically populated using machine learning. Referring to fig. 5A, the data extraction module 232 may parse text strings of a selected document (e.g., obtained from an Optical Character Recognition (OCR) process applied to the document) and detect text strings containing data to be retrieved, including metadata and various categories of medical data. The data extraction module 232 may then populate corresponding fields in the report page 408 and the results page 410 with the retrieved data. The data extraction module 232 may also cause the document browser 404 to display highlighting markers, such as highlighting markers 502, 504, 506, 508, 510, and 512. Highlighting marker 502 may correspond to text indicating the document type 408c (e.g., pathology report), while highlighting marker 504 may correspond to text indicating the report date, both of which may be retrieved from the metadata of the pathology report. Fields 520 (report date) and 522 (document type) of results page 410 are then filled with the report date and the retrieved document type 408c, respectively.
Further, highlighting marker 506 may correspond to text describing the procedure involved (e.g., lumpectomy of the right breast), highlighting marker 508 may correspond to text describing clinical data (e.g., a 2.5 cm right breast tumor was noted via diagnostic mammography, and fine needle aspiration (FNA) of the right breast tumor was performed), highlighting marker 510 may correspond to text describing the right breast specimen (e.g., a single soft tissue fragment received in formalin), and highlighting marker 512 may correspond to details of the microscopic examination of the right breast tumor (e.g., a tumor size of 1.9x1.6x1.4 cm). Fields 524 (procedure), 526 (clinical data), and 528 (tumor size) of results page 410 are then filled with the text highlighted by highlighting markers 506, 508, and 512, respectively. Additional display effects may also be provided to display links between fields and the highlighted portions of the document. For example, in fig. 5A, based on a user selection of field 524, highlighting marker 506 may be surrounded by a line border, and the line of field 524 is also emphasized to indicate the correspondence between field 524 and the data covered by highlighting marker 506. After the user confirms the filled data and activates the publish button 529, the data access module 236 may publish the data to the unified patient database 204.
The data extraction module 232 may detect text containing medical data and retrieve the medical data from the text based on various techniques. For example, the detection may be based on Natural Language Processing (NLP) operations on text included in the document file, rule-based retrieval operations, and so forth. As another example, the data extraction module 232 may detect select-and-drag actions on a document via the document browser 404, and the detection may be based on the text selected by the user. Upon detecting a text string containing medical data, the data extraction module 232 may determine the data category of the medical data and its associated data value, and the enrichment module 234 may convert the data value to a standardized and/or normalized value, or provide the standardized/normalized value as an option for selection by the user. The NLP models and rules may be obtained from training operations based on other medical documents that include ground-truth labels for data categories. For example, a document used for training may include the text sequence "breast, right side, lumpectomy" labeled as a procedure, which allows the data extraction module 232 to determine that such text also refers to the procedure in the document shown in fig. 5A. As another example, a document used for training may include one text sequence "total tumor size" followed by another text sequence recording the tumor size. This allows the data extraction module 232 to determine that the text sequence "1.9x1.6x1.4 cm" under highlighting marker 512 represents the size of the tumor. The enrichment module 234 may then convert the data values to standardized and/or normalized values, if needed. For example, if the text sequence under highlighting marker 512 were "1.9x1.6x1.4 m," the enrichment module 234 may determine that the unit (meters) is not a standard unit and may replace it with another unit (e.g., centimeters (cm), millimeters (mm), etc.) established as the standard unit.
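A minimal sketch of such rule-based extraction and unit standardization is shown below; the regular expression, the treatment of meters as a data-entry error to be converted to centimeters, and the function names are assumptions for illustration only:

```python
import re

# Hypothetical rule: a "total tumor size" phrase is followed by dimensions and a unit.
TUMOR_SIZE_PATTERN = re.compile(
    r"total tumor size[:\s]*([\d.]+\s*x\s*[\d.]+\s*x\s*[\d.]+)\s*(cm|mm|m)",
    re.IGNORECASE,
)

STANDARD_UNITS = {"cm", "mm"}

def extract_tumor_size(report_text: str) -> dict | None:
    """Rule-based extraction of a tumor-size data value from report text."""
    match = TUMOR_SIZE_PATTERN.search(report_text)
    if not match:
        return None
    dimensions, unit = match.group(1).replace(" ", ""), match.group(2).lower()
    return {"category": "tumor_size", "value": dimensions, "unit": unit}

def normalize_unit(data_value: dict) -> dict:
    """Replace a non-standard unit (e.g., meters) with an established standard unit."""
    if data_value["unit"] not in STANDARD_UNITS and data_value["unit"] == "m":
        dims = [float(d) * 100 for d in data_value["value"].split("x")]
        data_value["value"] = "x".join(f"{d:g}" for d in dims)
        data_value["unit"] = "cm"
    return data_value

text = "Microscopic examination: total tumor size 1.9x1.6x1.4 cm."
print(normalize_unit(extract_tumor_size(text)))
# {'category': 'tumor_size', 'value': '1.9x1.6x1.4', 'unit': 'cm'}
```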
The data collection module 230 may then populate fields in the results page 410 with the extracted and/or normalized values for the corresponding data categories. In some examples, the population of fields may be automated based on a mapping between the data categories and the fields defined in the SDL 246. In some examples, the population of fields may be based on user selections.
FIG. 5B illustrates an example sequence of operations performed on the document extraction interface 400 to select text from the document browser 404. Referring to fig. 5B, operation 530 begins with displaying a document in document browser 404. In operation 532, the document browser 404 may receive a select-and-drag action to select a portion of the document for a data extraction operation, and display a highlight 533 to show the extent of the select-and-drag action and the portion of the document selected at a given point. In operation 534, the document browser 404 may receive a click action from the user indicating that the select-and-drag action is complete and the selected document portion is confirmed. The document browser 404 may then display a border 535 around the highlight 533 to indicate that the selected text is to be processed by the data extraction module 232 to retrieve medical data. In operation 536, field 526 of results page 410 may receive a click action from the user indicating that highlight 533 is mapped to field 526 and that the medical data retrieved from the document portion under highlight 533 is to fill field 526. The document extraction interface 400 may also display a line 537 in field 526 to indicate that the field is selected to map to the highlight 533. After completion of the population, in operation 538, the document browser 404 may remove the border 535 from the highlight 533 and the line 537 from field 526. While the user determines the mapping, the display of the border 535 and the line 537 allows the user to easily visualize which highlighted portion of the document maps to which field in the results page 410, which may help the user track mapping decisions and reduce mapping errors, particularly when portions of the document are mapped to multiple fields, as shown in fig. 5A.
FIG. 5C illustrates examples of operations on the document extraction interface 400, after text in a highlighted portion of the document is mapped to a field in the results page 410, that help a user track the data sources of the fields. As shown in fig. 5C, in operation 540, the document extraction interface 400 detects a click action on field 526. The document extraction interface 400 may display the line 537 in field 526 when the click action is detected. In addition, the document browser 404 may also automatically scroll to the highlight 533 and display the border 535 around the highlight 533 to indicate that the text in field 526 comes from the highlight 533. Further, in operation 542, the document extraction interface 400 detects a click action on the highlight 533. The document extraction interface 400 may display the border 535 around the highlight 533 when the click action is detected. In addition, the retrieved medical data portion 406 may also automatically scroll the results page 410 to field 526, likewise indicating that the text in field 526 comes from the highlight 533.
In some examples, the data extraction module 232 may automatically detect text that may include medical data and retrieve the medical data from the text, as described further below with reference to fig. 15. Based on the SDL 246, the enrichment module 234 can determine one or more candidate data values of the retrieved medical data for a particular field. The document extraction interface 400 may then provide the candidate data values as options for the user to select for that field.
Fig. 5D illustrates a series of operations on the document extraction interface 400 involving automatic detection of text. The document extraction interface 400 may guide the user in providing or validating information for populating a unified patient database with structured data. As shown in fig. 5D, in operation 550, the data extraction module 232 detects the text "cm" (centimeters) and causes the document browser 404 to display a highlight 552 and a border 554 over the text "cm" to indicate that the text has been processed by the data extraction module 232. As a result of the processing, field 556 of results page 410 may display a drop-down menu 558 that includes two candidate values, "cm" and "mm" (millimeters), for selection by the user. The document extraction interface 400 may display a line 560 in field 556 to indicate that the field maps to the text under the highlight 552. In operation 570, the document extraction interface 400 may receive a selection of the candidate value "cm" to populate field 556.
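The candidate values offered in such a drop-down can be derived from the permitted standardized values configured for a field; the following sketch assumes a hypothetical candidate registry and a simple ranking rule, which are not part of the described SDL 246:

```python
# Illustrative only: a hypothetical mapping from a field to its permitted standardized values.
FIELD_CANDIDATES = {
    "size_unit": ["cm", "mm"],
    "laterality": ["left", "right", "bilateral"],
}

def candidate_values(field_name: str, detected_text: str) -> list[str]:
    """Rank permitted values for a field, placing an exact match with the detected text first."""
    candidates = list(FIELD_CANDIDATES.get(field_name, []))
    detected = detected_text.strip().lower()
    if detected in candidates:
        candidates.remove(detected)
        candidates.insert(0, detected)   # the detected value is preselected in the drop-down
    return candidates

print(candidate_values("size_unit", "cm"))   # ['cm', 'mm']
```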
Further, referring again to FIG. 2, medical data processing system 200 may support the oncology workflow application 222. The oncology workflow application 222 may determine what data the medical data processing system 200 is to collect to support the oncology workflow, which in turn may determine the fields displayed in the results page 410 and the categories of data to be received. In addition, as described below, the oncology workflow application 222 may analyze the collected medical data and generate analysis results 224.
3. Additional interfaces for retrieving data from reports
Fig. 6A-6D illustrate additional examples of interface views for retrieving and ingesting data from unstructured reports, according to some embodiments. As shown in fig. 6A and 6B, different types of reports are associated with different fields that may be automatically filled in by the system using machine learning, filled in by the user via a side-by-side view displaying the fields and the report, or a combination of both.
These reports may come from an external system, such as an electronic medical record (EMR) system. Some of the information used to ultimately generate the patient history interfaces of fig. 9A-9E and the patient summary interfaces of fig. 8A-8B may come from the EMR in structured form. In other cases, the information is embedded in the reports. The information embedded in these reports may not be available for visualization or analysis because it is not in structured fields. Using the interfaces of fig. 6A-6D, a list of data fields is displayed to the user, allowing the user to enter information into the structured data set. The user may manually enter some information while viewing the report, information ingested in structured form may be used to populate fields automatically, and/or machine learning may be used to scan the document and match information to the corresponding fields.
All information from these different sources can be consolidated into one place, i.e., the unified patient database. The data may come from external sources, such as an EMR, may be manually entered, and/or may be suggested using machine learning, such as NLP, with the suggested values presented to the user for confirmation. All of this data is consolidated and enriched in medical data processing system 200.
Fig. 6A shows an interface view 600 including a report 602 alongside a data entry panel 603. The data entry panel 603 includes a set of fields 604-620 that are identified by the system based on the identified report type, which may be indicated in the report itself, such as in document type 606. As shown in fig. 6A, report 602 is a surgical pathology report, which is associated with a particular set of fields corresponding to surgical pathology. As shown in the example of fig. 6A, these fields may be accessed via a drop down 604 labeled report information. Other selection mechanisms besides drop-down lists may be used. These fields include document type 606, document title 608, report ID 610, report date 612, sample collection date 614, sample collection method 616, author 618, and analysis site 620. As described above with reference to fig. 3G and 3H, each of these fields may correspond to a data category or data type used for organizing and managing patient data. Based on the fields, the provided data may be stored to corresponding data objects in the data map. This may include a data object for the report itself. Examples of such data objects are depicted in fig. 12 and 13 and described below with reference to those figures.
In the example interface view 600 depicted in FIG. 6A, the fields are configured to accept user input via interface elements including drop-down menus 606-616, text input field 618, and radio buttons 620. Interface view 600 displays report 602 side by side with data entry panel 603 so that a user can easily enter information to fill in the fields while viewing the report. For example, drop down 606 may be populated with each possible report type that has been previously configured for the system (e.g., radiology reports, pathology reports, etc.). The user may click on drop down 606, view the possible report types, and select surgical pathology report, which is then used to populate the corresponding data object on the back end.
Once the user has entered information, save button 622 may be activated and in response to detecting a user interaction with save button 622, the entered data is saved to unified patient database 204.
Fig. 6B shows an interface view 625 that includes a report 626 alongside a data input panel 627. The data input panel 627 includes a set of fields 628-644. These fields may be identified by the system based on the identified type of the report. For example, an MRI report may be expected to have certain fields, while a mammography examination report may be expected to have other fields. As described above, the fields for a given report may be identified based on a master Structured Data List (SDL) 246 that defines a list of data categories for the document type of the document file 244.
In the example depicted in fig. 6B, information has been retrieved from an external system, such as an EMR that includes structured data. Some of the information has been provided in structured form in the EMR or other external system. This information can be analyzed and associated with the report (e.g., by matching report metadata with the structured data when the data is retrieved from the EMR). The interface may include an indication that data corresponding to certain fields was received from the EMR and that the given report is associated with those fields. In some implementations, data associated with the report based on information retrieved from a trusted source, such as the EMR, may be locked against editing, but the user may fill in the missing pieces of information. The UI shown in fig. 6B facilitates augmenting or enriching the data sets retrieved from the external system by allowing a user to add missing information for incorporation into medical data processing system 200.
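One way to implement the matching and locking behavior described above is sketched below; the matching keys (report ID and date), the record layout, and the field names are assumptions for illustration, not the system's actual matching logic:

```python
from dataclasses import dataclass

@dataclass
class ReportField:
    name: str
    value: str | None = None
    source: str = "user"      # "emr" for trusted external data, "user" otherwise
    locked: bool = False      # locked fields cannot be edited in the UI

def associate_emr_data(report_metadata: dict, emr_records: list[dict]) -> list[ReportField]:
    """Match structured EMR data to a report by its metadata and lock the matched fields."""
    fields: list[ReportField] = []
    for record in emr_records:
        # Assumption: report ID and report date are sufficient to match an EMR record.
        if (record.get("report_id") == report_metadata.get("report_id")
                and record.get("report_date") == report_metadata.get("report_date")):
            for name, value in record.get("fields", {}).items():
                fields.append(ReportField(name=name, value=value, source="emr", locked=True))
    return fields

emr = [{"report_id": "R-123", "report_date": "2020-11-27",
        "fields": {"report_type": "radiology report", "author": "Dr. A"}}]
matched = associate_emr_data({"report_id": "R-123", "report_date": "2020-11-27"}, emr)
# The anatomical site is left unlocked for the user to fill in manually.
matched.append(ReportField(name="anatomical_site", value=None, source="user", locked=False))
print([(f.name, f.locked) for f in matched])
```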
As shown in fig. 6B, report 626 is a radiological report that is associated with a particular set of fields corresponding to radiology. As shown in fig. 6B, these fields are accessible for viewing through a drop down 629 labeled report information. These fields include report type 628, report header 630, report ID 632, report date 634, sample collection date 636, sample collection method 638, author 640, and anatomical site 644.
In the example depicted in FIG. 6B, fields 628-640 are highlighted. The fields may be highlighted in a particular color to indicate that the system has retrieved the data filling those fields from an EMR or other external database. Such fields may be locked against editing by the user. Field 644 is depicted as white, meaning that it should be filled in manually by the user. Assigning the anatomical site is a diagnostic task and may be most appropriate for a user such as a doctor. The user may use radio buttons 642 to select an existing anatomical site or create a new one. In the example depicted in fig. 6B, "select existing" has been selected, and an anatomical site drop-down menu is displayed with which the user can interact to select an existing anatomical site. Alternatively, the user may choose the create-new option, and a text entry field will then be displayed for entering the name of the new anatomical site.
FIG. 6C shows interface view 645 including report 646 alongside data input panel 647. Data input panel 647 includes fields identified by the system based on report 646. As shown in fig. 6C, these fields may be accessed via a drop down 604 labeled report information. The first set of fields 650, 654, 656, and 658 are configured to be filled in via user input (e.g., as described above with reference to fig. 6A and 6B). Once the user has entered information, save button 659 may be activated and in response to detecting a user interaction with save button 659, the entered data is saved to unified patient database 204.
In the example depicted in fig. 6C, the second set of fields 652 is depicted as highlighted. The highlighting may be a different color than that used to highlight the fields shown in fig. 6B, to indicate a different state of the data filling these fields. The highlighted fields 652 correspond to the highlighted fields 648 in the report 646. These are fields whose values are suggested using machine learning; the suggestions can be reviewed and confirmed or altered by the user. In some implementations, the fields are configured to display data automatically retrieved from the report 646. One or more machine learning models, including Optical Character Recognition (OCR) and Natural Language Processing (NLP) models, may be used to identify the text data of the report, analyze the report, and identify data corresponding to certain fields. Medical data processing system 200 may utilize models that have been trained on labeled data identifying different terms associated with given predetermined fields. In the example shown in fig. 6C, biomarkers have been automatically detected by the system. Medical data processing system 200 may populate data elements detected using machine learning. The user may be prompted via the interface for confirmation and, in some cases, may modify the data elements that populate a given field. Over time, medical data processing system 200 may learn and update the machine learning models used for detecting data. Using these techniques, the system may provide suggestions to reduce the data entry burden on the user. Techniques for applying machine learning to capture and categorize medical data are described in further detail in PCT publication WO 2021/046536, entitled "automated information capture and enrichment in pathology reports using natural language processing," filed on September 8, 2020, which is incorporated herein by reference.
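The suggest-then-confirm flow can be sketched as follows; the model is abstracted as a callable, and the function names, field names, and confidence threshold are hypothetical rather than part of the described system:

```python
from typing import Callable, Iterable

# The extraction model is represented abstractly as a callable returning
# (field_name, value, confidence) tuples; any real OCR + NLP pipeline could sit behind it.
Prediction = tuple[str, str, float]

def suggest_fields(report_text: str,
                   ner_model: Callable[[str], Iterable[Prediction]],
                   min_confidence: float = 0.8) -> dict[str, dict]:
    """Populate fields from model predictions, marking each one as needing user review."""
    suggestions: dict[str, dict] = {}
    for field_name, value, confidence in ner_model(report_text):
        if confidence >= min_confidence:
            suggestions[field_name] = {
                "value": value,
                "status": "suggested",   # highlighted in the UI until confirmed
            }
    return suggestions

def confirm_field(suggestions: dict[str, dict], field_name: str,
                  corrected_value: str | None = None) -> None:
    """Record user confirmation, optionally replacing the suggested value."""
    entry = suggestions[field_name]
    if corrected_value is not None:
        entry["value"] = corrected_value
    entry["status"] = "confirmed"

# Toy model standing in for the OCR + NLP pipeline.
fake_model = lambda text: [("biomarker", "ER positive", 0.93), ("stage", "IIA", 0.55)]
fields = suggest_fields("...report text...", fake_model)
confirm_field(fields, "biomarker")
print(fields)   # {'biomarker': {'value': 'ER positive', 'status': 'confirmed'}}
```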
Fig. 6D shows a set of interface elements depicting a data entry workflow 660 that may be performed using interfaces such as those depicted in fig. 6A-6C. The interface elements depicted in fig. 6D include interface element 662 for filling in primary tumor information, interface elements 666, 668, and 669 for reading primary tumor information, and interface elements 670 and 674 for editing primary tumor information.
In fig. 6D, interface element 662 for filling in primary tumor information includes a set of fields for accepting user input of information associated with the primary tumor. These fields include interface elements for adding information about the anatomical site (e.g., right upper lung lobe, selected from a drop-down menu when the "select existing" radio button is selected). These fields further include histological type and histological grading. The user may fill in the diagnostic information. Interface element 662 further includes a user-selectable checkbox 663 that can be checked to set the diagnosed primary tumor as the patient's condition for discussion. As indicated by the cursor and the highlighting on the save button 664, the user may click on the save button 664 to save the entered information to the unified patient database 204.
In fig. 6D, interface elements 666, 668, and 669 for reading the primary tumor information display the information input via interface element 662. In interface element 666, the most recently entered information is temporarily highlighted. In interface element 668, after 5 seconds (or another suitable time frame), the entered information is no longer highlighted. In interface elements 666 and 668, the diagnosis is marked as pending diagnosis. In interface element 669, the primary tumor is not marked as pending diagnosis and a pending diagnosis flag is not present.
In fig. 6D, interface elements 670 and 674 are used to edit primary tumor information. The user may interact with an interface element, such as interface element 666, for reading primary tumor information. As shown in interface element 670, the cursor is clicking on the highlighted primary tumor diagnosis. The highlighting remains until editing is complete. Upon clicking, the element comes into focus and the edit drawer 674 opens. Edit drawer 674 is an interface element, such as a modality, that opens upon detecting a user interaction, such as a click. Edit drawer 674 includes fields for accepting user input to edit previously entered information (e.g., information entered via interface element 662). The components of the drawer 674 include data entry fields that may be used to edit fields such as the date, diagnosis, pending diagnosis 676, anatomical site 680, and histological type 682. Edit drawer 674 further includes radio buttons 678 to select an existing anatomical site or create a new one. These interface elements may be used to receive data for updating the data stored in the unified patient database 204. The received data may additionally or alternatively be used to train the machine learning model for automatically retrieving data from a document (e.g., if the medical data processing system identifies, based on user modifications, that the model has populated fields incorrectly, this may be used to update the training data for the model).
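As a simple illustration of feeding user corrections back into model training, the sketch below logs each correction as a labeled example; the file format and field names are assumptions and not part of the described system:

```python
import json
import datetime

def log_correction(field_name: str, model_value: str, user_value: str, report_id: str,
                   path: str = "corrections.jsonl") -> None:
    """Append a user correction as a labeled example for future model retraining."""
    example = {
        "report_id": report_id,
        "field": field_name,
        "model_value": model_value,      # what the model suggested
        "user_value": user_value,        # what the user saved (treated as ground truth)
        "timestamp": datetime.datetime.utcnow().isoformat(),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(example) + "\n")

log_correction("histological_type", "ductal carcinoma", "invasive ductal carcinoma", "R-123")
```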
C. Interface for reconciling unmapped data
Fig. 7A and 7B illustrate examples of interface views for reconciling unmapped data in accordance with some embodiments. Reconciliation may be initiated if data is not mapped to a data field deemed necessary, such as the association of a cancer mass with a primary cancer (e.g., as a new primary cancer or as a metastasis of another primary cancer). The interfaces depicted in fig. 7A and 7B may be used to manage this reconciliation process, prompting the user to enter the necessary information even after the record for the cancer mass in question has been stored. For example, the missing information may be marked in the unified patient database 204 for reconciliation, prompting the workflow described below.
For data from an EMR or other source system, the relationships between certain data elements may be missing. In one example, the patient has two primary cancers and one metastatic site. The primary cancers and the metastatic site have been retrieved from reports via the EMR, but which primary cancer the metastatic site is associated with is unknown. The clinician may know this information even though it was not retrieved from the EMR. For such use cases, the system provides functionality for reconciling unmapped data.
In reconciliation, some data has been extracted, but the system still needs to determine where in the UI the data belongs, e.g., to which primary condition the data should be mapped. The reconciliation UI may prompt the user to provide input to associate a particular anatomical site with, for example, the correct primary cancer. For a given primary cancer, certain fields may be associated with that primary cancer. The reconciliation UI prompts the user to associate different types of information, such as primary sites, with relevant observations, such as histology, biomarkers, stage, and metastatic sites, which are unique to the primary cancer or to other data elements. The reconciliation UI may also be used to map certain medical interventions, such as tumor treatments or non-tumor surgical history, or certain drugs, e.g., antineoplastic or non-cancer medications.
In some cases, an external system, such as an EMR, will provide information indicating the primary cancer and the location of the primary cancer's metastasis. In this case, the association is known, and no additional work may be required. In other cases, where reconciliation is required, the external system either does not capture the association or does not send this information to medical data processing system 200. If such an association is desired, medical data processing system 200 may use a reconciliation process to determine where to display the metastasis (e.g., for the right or left breast) in an interface such as the patient summary or the patient lineage view. To be able to present the information in a clinically accurate manner, the reconciliation process enables the user to provide guidance as to where the associated data should be displayed, which can affect the data mapping applied in the unified patient database 204.
In some cases, reconciliation may be triggered when an external database, such as an EMR, sends data related to a particular site but the information indicating which other sites are associated with that site is missing. Using the reconciliation interface, the user may provide information to associate the site with a particular cancer, and after the update, the site will be displayed in association with the correct primary cancer in the interface views and in the unified patient database. In other cases, reports may be received from an external database without any structured information, in which case details at multiple levels of granularity may be missing. Such details may be provided by a user via an interface such as that depicted in fig. 7A.
FIG. 7A illustrates an interface summary view 700 that includes data reconciliation elements 702, 704, and 706. In some implementations, in the interface summary view 700, a data reconciliation element 702, such as a button or drop-down menu, is provided for interacting with unmapped data for reconciliation. In the upper right-hand corner of the screen, the data reconciliation element 702, an "unmapped" button, allows the user to open unreconciled items that are not associated with any cancer or that otherwise lack mapping information. The user may provide data specifying the missing relationships and save the updated data. Once the user reconciles the data, the data will then begin to appear in portal 220.
User interaction with reconciliation element 702 may trigger the display of data reconciliation elements 704 and 706. The data reconciliation element 704 includes a notification that is displayed in a conspicuous manner (e.g., highlighted and displayed with a warning sign). In the example depicted in fig. 7A, the notification displayed in the data reconciliation element 704 states that "we do not have enough information to place these items in the patient summary and lineage view." Information about the items that need reconciliation is displayed in the data reconciliation element 706. In this example, a cancer mass, an iliac crest structure, is missing necessary information and has not been added to the patient summary and lineage views. The data reconciliation element 706 further provides additional information about the cancer mass, e.g., "right" and "obtained from the 11/27/2020 integration". Such information may be retrieved from the unified patient database according to the data types mapped therein (e.g., the integration date may be based on a timestamp, and the side may be based on a location data type). Upon user interaction with the data reconciliation element 706, the interface can transition to the interface view depicted in fig. 7B for reconciliation.
FIG. 7B illustrates an interface view 720 for data reconciliation. In some implementations, a report 721 and a drawer 723 (e.g., a modality having elements for accepting user input) are included in interface view 720 for accepting data for reconciliation. Included in drawer 723 is a header 722 labeled "map anatomy". The drawer 723 indicates that the missing information to reconcile is the association of the iliac crest structure 726 with a primary cancer or a metastasis, or its designation as benign. Drawer 723 also includes an alert 724 similar to the data reconciliation element 704 described above with reference to fig. 7A. The drawer 723 of interface view 720 further includes a set of checkboxes that the user can use to associate the iliac crest structure 726 with a particular primary cancer or metastasis, or to mark the iliac crest structure 726 as benign. In some embodiments, the unified patient database stores patient records having objects corresponding to different cancer sites. Based on the anatomical site mapping established using interface view 720, the object for the iliac crest structure 726 may be correspondingly linked to other objects. For example, if the iliac crest structure 726 is marked by the user as a metastasis of right breast cancer, the iliac crest structure object will be linked to the right breast cancer object. The received designation of the iliac crest structure as primary, metastatic, or benign may be stored to the unified patient database in association with a "performance" data type in the data object for the tumor mass, as further described below with reference to fig. 12.
As shown, possible choices include designating the structure as the primary site or a metastasis of a new primary cancer, which may trigger the display of additional interface elements for establishing the new primary cancer. Possible choices further include setting the iliac crest structure itself as a primary site. This will cause the iliac crest structure to be stored as a primary cancer object in the unified patient database, which will have its own set of linked objects, as shown in fig. 12. Alternatively, the iliac crest structure may be designated as a metastasis of a pre-established cancer, i.e., right breast cancer or left breast cancer. This will cause the iliac crest structure to be stored in the unified patient database as an object linked to a metastasis-type object and to another data object corresponding to the selected primary cancer. Another checkbox is provided to mark the iliac crest structure 726 as benign, which will cause it to be hidden in the summary. In this case, the iliac crest structure 726 may be stored in the unified patient database as an object linked to a benign-type object and not linked to any object corresponding to a primary cancer. Once the user selects the cancer association, update button 730 will be activated. The user may interact with the update button to trigger the system to store the provided reconciliation data to the unified patient database 204. The data schema for storing the data objects in response to the selected anatomical site mapping is described in further detail in section IV below with reference to fig. 12 and 13.
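A minimal sketch of applying such a reconciliation choice to the object links is shown below; the object model, link representation, and function names are hypothetical and intended only to illustrate the primary, metastasis, and benign branches described above:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class CancerObject:
    object_id: str
    object_type: str                                  # "primary_cancer", "tumor_mass", ...
    links: List[str] = field(default_factory=list)    # IDs of linked objects
    designation: Optional[str] = None                 # "primary", "metastasis", or "benign"

def reconcile_site(site: CancerObject, designation: str,
                   primary: Optional[CancerObject] = None) -> None:
    """Apply a user's reconciliation choice to an unmapped anatomical-site object."""
    site.designation = designation
    if designation == "primary":
        site.object_type = "primary_cancer"           # becomes its own primary cancer object
    elif designation == "metastasis":
        if primary is None:
            raise ValueError("a primary cancer must be selected for a metastasis")
        site.links.append(primary.object_id)          # link to the selected primary
        primary.links.append(site.object_id)
    elif designation == "benign":
        site.links.clear()                            # hidden from the summary; no primary link

right_breast = CancerObject("obj-rb", "primary_cancer")
iliac_crest = CancerObject("obj-ic", "tumor_mass")
reconcile_site(iliac_crest, "metastasis", primary=right_breast)
print(iliac_crest.designation, iliac_crest.links)     # metastasis ['obj-rb']
```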
D. Patient portal interface
Fig. 8A, 8B, 8C, 9A, 9B, 9C, 9D, 9E, 10, and 11 illustrate examples of views of portal 220 that provide focused views of patient data. Portal 220 may display various interface views including a patient summary interface view as shown in fig. 8A-8C, a patient history interface view as shown in fig. 9A-9E, a report interface view as shown in fig. 10, and a care quality metrics interface view as shown in fig. 11.
1. Patient summary interface
Figures 8A-8C illustrate examples of patient summary interface views according to some embodiments. The patient summary interface displays summary data for the patient. The patient summary interface view may be used to display data enabling a user to view detailed information about different primary cancer sites and tumor summary information, as well as to provide a starting point for data entry and reconciliation via portal 220.
Referring to fig. 8A, portal 220 may display the patient's current diagnosis (lung adenocarcinoma), diagnosis date, last visit record, upcoming visits, and current treatment. Portal 220 may receive structured patient data 202 from unified patient database 204, either manually entered via data input interface 300 or automatically retrieved and extracted from the medical report by data extraction module 232.
Fig. 8B illustrates another embodiment of a patient summary interface 800 of portal 220. The patient summary interface 800 displays summary information about a particular patient, which may be retrieved from the unified patient database 204 for display. The top functional area 801 may display patient information such as the patient's name, age, date of birth, gender, and identifier of the patient.
The patient summary interface 800 includes a primary cancer element 802. The primary cancer element 802 includes tabs 803A and 803B corresponding to different primary cancers: breast cancer (in tab 803A) and lung cancer (in tab 803B). In primary cancer element 802, information about each primary cancer is displayed, including events, relevant biomarkers, stage, and metastatic sites.
In fig. 8B, the patient summary interface 800 further includes a tumor summary element 804, a tumor treatment element 806, and a medication element 808, displaying information about each element. The patient summary interface includes a patient history element 810 that displays patient history information including medical history, surgical history, family history, and social history.
Patient summary interface 800 also includes user-selectable elements that may be used to navigate to other interface views. The patient history element 811 may be selected to transition to the patient history view shown in fig. 9A-9E. Reporting element 812 may be selected to transition to the report view 1000 shown in fig. 10. Unmapped data element 813 may be selected to transition to the reconciliation view depicted in fig. 7B. The add element 814 may be selected to transition to a view for adding data (e.g., directly into portal 220 or through an uploaded report). A summary element 815 is also included and may be used to transition from other views back to the patient summary view. Thus, the patient summary view may be used to transition to various views of portal 220. The primary information element 805 may be selected to cause a modality to be displayed overlaid on the patient summary view 800 and including additional information about one or more primary cancers, as shown in fig. 8C. Based on the selected view or mode, data is retrieved from the unified patient database 204 according to the mapping of data connections and types therein.
Referring to fig. 8C, an example of a patient summary view 820 having two modalities 822 and 824 corresponding to two primary cancers is shown. In response to detecting a user interaction with the primary information element 805, in this example, two modalities for the primary cancers associated with the patient are displayed. The modality 822 (e.g., a first modality) is directed to right breast cancer and includes information about the right breast primary cancer, including a set of relevant biomarkers with timestamps. Modality 824 (e.g., a second modality) is directed to left breast cancer and includes information about the left breast primary cancer, including a set of relevant biomarkers with timestamps.
The first modality and the second modality are displayed side by side in the graphical user interface. Advantageously, the side-by-side view allows the user to view more detailed information about multiple primary cancers at a time without navigating away from the summary interface screen. Furthermore, since each modality corresponds to a different primary site, data can be efficiently retrieved for each site so that the side-by-side analysis can be provided. This organization of the database (e.g., the data schema described in section IV) enables such data retrieval and visualization via the graphical interface.
In some embodiments, the unified patient database stores a data object corresponding to the right breast primary cancer and a data object corresponding to the left breast primary cancer. These data objects are linked to various other data objects that are timestamped and that describe different events associated with the primary cancer. For example, the right breast primary cancer object is linked to one or more biomarker data objects, and the left breast primary cancer object is linked to one or more biomarker data objects. In response to detecting a user interaction with the primary information element 805, the system queries the unified patient database to identify the right and left breast primary cancer data objects. The objects linked to each primary cancer data object are identified based on a mapping between the primary cancer data objects and the corresponding child data objects identified in the unified patient database. Information associated with the identified linked objects is retrieved. The identified linked objects are used to populate modalities 822 and 824 with the retrieved information, as further described below with reference to method 1800 of fig. 18.
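The query pattern described above can be sketched as follows; the in-memory dictionary standing in for the unified patient database, and all object and field names, are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DataObject:
    object_id: str
    object_type: str                     # e.g., "primary_cancer", "biomarker"
    attributes: Dict[str, str] = field(default_factory=dict)
    child_ids: List[str] = field(default_factory=list)

def primary_cancer_summaries(db: Dict[str, DataObject], primary_ids: List[str]) -> List[dict]:
    """Collect each primary cancer and its linked, timestamped child objects for display."""
    summaries = []
    for primary_id in primary_ids:
        primary = db[primary_id]
        children = [db[cid] for cid in primary.child_ids]
        summaries.append({
            "primary": primary.attributes.get("site"),
            "biomarkers": sorted(
                (c.attributes for c in children if c.object_type == "biomarker"),
                key=lambda a: a.get("timestamp", ""),
            ),
        })
    return summaries   # one entry per modality, rendered side by side

db = {
    "p1": DataObject("p1", "primary_cancer", {"site": "right breast"}, ["b1"]),
    "p2": DataObject("p2", "primary_cancer", {"site": "left breast"}, ["b2"]),
    "b1": DataObject("b1", "biomarker", {"name": "ER", "result": "positive", "timestamp": "2020-01-10"}),
    "b2": DataObject("b2", "biomarker", {"name": "HER2", "result": "negative", "timestamp": "2020-02-03"}),
}
print(primary_cancer_summaries(db, ["p1", "p2"]))
```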
The right and left breast cancers may be different, unrelated cancers located in different parts of the body. The patient summary view 820 may be used to display information associated with each primary cancer, with different information about each primary displayed side by side. Using the interface view 820, a clinician can view information about multiple primary cancers and compare information such as the diagnosis, the date of onset, the location of each primary site, key biomarkers for the patient, the stage, and any metastases. This may be accomplished by organizing the data according to the primary cancer designation using the proprietary data schema described herein, which can then be queried to display interface view 820 with the primaries side by side.
2. Patient history interface
Figures 9A-9E illustrate examples of patient lineage interface views according to some embodiments. The patient history interface displays data associated with the patient in a chronological manner. Using the patient lineage interface depicted in fig. 9A-9E, the progression of a cancer and the available information about the cancer can be viewed in an organized and chronological manner. The user can click on an object in the timeline to see how the cancer evolves.
The patient lineage interface view can display patient information from suspected cancer through diagnosis, treatment planning, monitoring, survivorship, and the like. Cancer care is multidisciplinary and multi-institutional in nature. In general, patient information may be dispersed among different systems. By retrieving and integrating information from reports and other data retrieved from different sources, the medical data processing system may construct a cross-institution patient history that enables a user to view the overall patient history across data points collected by different service providers and service types.
Once this information appears in the patient history, any other user who is interested in the patient history can see the information depicted in exactly the same way. This is advantageous from a care collaboration point of view. With prior approaches, heavy reliance on free-text medical records results in different providers often taking away different information about what is happening with the patient, based on different recording styles. Thus, it is difficult to know the patient's actual status using existing systems. The patient history interface shown in fig. 9A-9E addresses these and other problems by allowing any user to view a unified view of the patient's treatment history. The patient history may be reviewed and populated by a cross-role team, such as a radiologist, a physician, a surgical oncologist, and/or an attending physician. These users will be able to interact with the patient summary UI and view patient data in a user-friendly, unified manner to observe and understand the evolution of patient conditions such as cancer.
To display the patient lineage view, the system can receive data identifying the patient (e.g., a patient ID, name, etc.) via the graphical user interface. Based on the data identifying the patient, the system may retrieve medical data associated with the patient from the unified patient database and display, via the graphical user interface, a set of user-selectable objects in a timeline, the objects falling into a plurality of categories organized by rows, the categories including pathology, diagnosis, and treatment, as shown in fig. 9A-9E. Techniques for populating the patient lineage view are further described below with reference to method 1700 of fig. 17.
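The retrieval-and-grouping step can be sketched as below; the flat record layout, category names, and row order are assumptions used only to illustrate grouping timeline objects into chronologically sorted category rows:

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical flat records as they might be retrieved from the unified patient database;
# each record carries a display category and a date.
RECORDS = [
    {"patient_id": "P-0001", "category": "events", "label": "Breast cancer, invasive", "date": "2020-01-05"},
    {"patient_id": "P-0001", "category": "diagnostic imaging and procedures", "label": "MRI", "date": "2020-01-14"},
    {"patient_id": "P-0001", "category": "pathology", "label": "Biopsy report", "date": "2020-01-20"},
    {"patient_id": "P-0001", "category": "treatments", "label": "Chemotherapy cycle 1", "date": "2020-02-01"},
]

ROW_ORDER = ["events", "pathology", "diagnostic imaging and procedures",
             "treatments", "biomarkers", "response assessment"]

def timeline_rows(patient_id: str, records: List[dict]) -> Dict[str, List[dict]]:
    """Group a patient's records into timeline rows by category, sorted chronologically."""
    rows: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        if record["patient_id"] == patient_id:
            rows[record["category"]].append(record)
    return {category: sorted(rows.get(category, []), key=lambda r: r["date"])
            for category in ROW_ORDER}

for category, items in timeline_rows("P-0001", RECORDS).items():
    print(category, [i["label"] for i in items])
```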
Referring to fig. 9A, portal 220 may display a timeline view of the patient history. The timeline view may display various laboratory test and imaging results over time, as well as diagnostic results provided by the oncology workflow application 222.
Fig. 9B is another embodiment of a patient history interface view 900. The patient history interface view 900 of portal 220 includes a summary function area 902 and an adjustable timeline 908.
The summary function area 902 may be a function area displayed above the timeline. Summary function area 902 may display a subset of objects marked as important, along with associated information. The user can bookmark an object to have it displayed in summary function area 902. Given that a patient history may be long and include many objects, summary function area 902 is useful for surfacing important objects at the forefront. When an object is no longer important, the user can also remove its bookmark, and the object will be removed from the function area. The summary function area 902 itself may serve as a mini patient history that displays the key events that have occurred for the patient.
The adjustable timeline 908 includes information about the patient's tumor medical history. The adjustable timeline 908 displays information chronologically, with older objects to the left and newer objects to the right. The time period of display may be controlled with start and end date elements 904 and 905 and a scroll bar 906 that is adjustable to select a time window in which the object is displayed in the timeline 908.
The information in the timeline 908 is displayed in a set of rows corresponding to different data categories, including events 910, pathology 912, diagnostic imaging and procedures 914, treatments 916, biomarkers 918, and response assessment 920. For each category, the associated information may be color coded (e.g., orange for events, red for pathology, etc.). Each row may display the information about the patient collected in the corresponding data category. A given row may include multiple entries at a given time, as shown in fig. 9B. For example, in the events 910 row, multiple events in the month of January correspond to two different cancers.
The events 910 data category includes events (e.g., a category of displayed objects) that correspond to information about the progression of the cancer itself. For example, events 910 include breast cancer, invasive 922, dated January 2020. This may correspond to the date the primary cancer was diagnosed and added to the patient record. If the user clicks on event 922, the system will display additional information about event 922, as shown in FIGS. 9C and 9D. The events 910 row may show the diagnosis of each different cancer the patient has, as well as the progression of each cancer. For example, as shown in fig. 9C, event 922 is invasive ductal carcinoma of the right breast located at the 6 o'clock position. When the user clicks on these different items in the interface view 900, the user will be able to see how the cancer evolves. For example, when the cancer first appears, there is no metastasis. The user may scroll the timeline to see whether the cancer has metastasized elsewhere after one or two years; if so, it will be visible in the corresponding box. Thus, the patient history interface view 900 allows the user to see the progression of the cancer over time. This information may be retrieved from tumor mass data objects 1202 and/or cancer status data objects in the unified patient database according to the data schema depicted in fig. 12.
The pathology 912 data category includes objects corresponding to pathology reports, displayed in chronological order. If multiple reports are associated with one date, the pathology reports may be displayed in a stacked manner. A pathology report may be associated with an event 910, for example, for diagnosing a particular cancer mass. Examples of pathology reports include biopsy reports, cytology reports, genomic reports, surgical resection reports, and the like. Via the patient lineage interface view 900, a user can drill down into a specific pathology report to discern information such as how far the cancer has spread, its size, its stage, key biomarkers tested from the obtained samples, and so forth. This information may be retrieved from diagnostic findings data objects 1205 in a unified patient database according to the data schema depicted in fig. 12.
The diagnostic imaging and procedures 914 data category includes objects corresponding to diagnostic imaging such as MRIs, CT scans, and the like. For example, diagnostic imaging and procedures 914 includes an MRI 924 on January 14, 2020, and so on, as shown in FIG. 9B. The objects are displayed in chronological order. These objects may be linked to diagnostic imaging reports, such as MRI reports, or to certain lesions. A clinician viewing this information should be able to see that, on a given date in the timeline, an MRI was performed on the patient, and can review the MRI results in depth by opening the report. For example, if the user clicks on the MRI 924 of January 14, 2020, the report may be opened directly from the patient lineage interface view 900. Advantageously, the user does not need to navigate to another system to find the report, as would be required without the techniques of this disclosure. This information may be retrieved from diagnostic findings data objects 1205 in a unified patient database according to the data schema depicted in fig. 12.
The treatment 916 data category includes objects corresponding to treatments administered to the patient. As shown in fig. 9B, a treatment may span months. This information may be retrieved from treatment data objects 1208 in a unified patient database according to the data schema depicted in fig. 12.
The biomarker 918 data category includes objects corresponding to biomarkers associated with the patient. These may include genomic markers, diagnostic markers, prognostic markers, therapeutic markers, and the like. The biomarkers may originate from various types of reports but are handled similarly in the system. For example, a biomarker may come from a cytology report, a genomic report, or the like. These various types of biomarker objects are shown in the biomarker 918 row. This information may be retrieved from diagnostic findings data objects 1205 (e.g., molecular/biomarker objects) in a unified patient database according to the data schema shown in fig. 12.
The response assessment 920 data category includes objects corresponding to the clinician's assessments of the patient's response. At each step of disease management, the clinician evaluates the patient's tumor status and clinical condition to determine the effect of the treatment and decide whether and how to continue with the current treatment plan. Determining the therapeutic effect is a complex decision based on elements such as clinical response, radiological response, molecular response, and serological response. The patient lineage interface view 900 condenses this into a single, at-a-glance icon view on the timeline so that the clinician can review it chronologically alongside treatment, scan, and other data. For example, a physician may notice that the patient has received a certain number of cycles of a particular medication and radiation therapy, and that the patient has had a partial response. Using the patient history view, if the clinician wants to record the progress of this particular cancer at any time, the user can input an assessment of the response, such as whether the response is partial, whether the cancer is stable, how the patient feels, any adverse events, whether the cancer has progressed, and so forth. In some implementations, the patient history is completely read-only, except for this response assessment field. A key part of an oncologist's work is managing toxic side effects and monitoring the patient's response while on therapy, so in many cases the clinician's response assessment is critical.
In fig. 9C, a cursor 923 hovers over the breast cancer, invasive element 922. This causes the system to expand the view so that additional text is visible in the breast cancer, invasive element 922: ductal carcinoma, right breast (6:00).
If the user subsequently clicks on the breast cancer, invasive element 922, a pop-up 927 is displayed showing more information such as date, location, and the like, as shown in fig. 9D.
In fig. 9E, the adjustable timeline 908 has been adjusted (e.g., via a slider) to display a different time window. In fig. 9E, a time window from October 1, 2020 to January 1, 2021 is shown. Thus, via user interaction with the GUI, the user can move the slider 906 to display different objects in the timeline, viewing the timeline over a longer period or focusing on a period of interest.
In some implementations, the report can be previewed from the patient summary view. The system may detect user interactions with objects displayed in the patient summary view, then identify and retrieve corresponding reports from the unified patient database, and display the reports via the graphical user interface (e.g., as a pop-up on the patient summary view). The user may navigate to the report view to obtain a more detailed view of the report.
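Such a preview interaction reduces to resolving the clicked object's report link in the unified patient database and returning the report's display fields. A hedged sketch; the dict-based record layout, the relation label, and the field names are assumptions for illustration:

```python
def preview_report(patient_record: dict, clicked_object_id: str):
    """Given the id of an object clicked in the patient summary view, find the
    report data object linked to it and return what the GUI needs for a pop-up."""
    links = patient_record["links"]      # e.g. [{"source": ..., "target": ..., "relation": ...}]
    reports = patient_record["reports"]  # report data objects keyed by id
    for link in links:
        if link["source"] == clicked_object_id and link["relation"] == "documented_by":
            report = reports[link["target"]]
            return {
                "title": report["title"],
                "date": report["date"],
                "attachment": report.get("attachment"),  # e.g., a PDF blob for the pop-up
            }
    return None  # no report linked to this object
```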
The patient history view may be used to see how the patient's cancer evolves over time. For example, at a first time, the patient has one primary cancer site (e.g., in the example shown in fig. 9A). At a second time, one primary cancer is still visible in the patient history view. At a third time, two different primary cancers can be seen (e.g., in the example shown in fig. 9B). Thus, in the particular example depicted in fig. 9B-9E, two primary cancers, left and right breast cancers, are shown in the patient history.
3. Report viewing interface
FIG. 10 illustrates an example of a report interface view 1000 in accordance with some embodiments. Report interface view 1000 includes a list 1001 of reports associated with the patient. One of the listed reports 1002 has been selected and the report 1003 is displayed on the right hand side. An interface element is also provided that a user can click on to open a complete report.
In some implementations, the patient history, summary, and reporting tab at the top (1005) may be used to navigate between the respective interface views. For example, in the patient history view, the system detects user interaction with the summary tab and transitions to the summary view, displaying tumor summary data.
4. Quality of care metrics interface
Fig. 11 illustrates another interface view for displaying quality of care metrics. As shown in fig. 11, portal 220 may display quality of care metrics, such as Quality Oncology Practice Initiative (QOPI) metrics, for different patients over time. Metrics may be calculated based on structured patient data 202 at different points in time.
Example schema of unified patient database
Fig. 12 and 13 illustrate example data schemas for structuring data stored to a unified patient database. The patient summary and patient history interfaces described above are implemented by retrieving interconnected data elements associated with the patient, which are time stamped and hierarchically bound together. These data elements are dynamically updated and enriched. This may be accomplished using a proprietary data schema of the unified patient database. FIG. 12 shows examples of different types of data objects connected together in a patient data map. FIG. 13 illustrates an example of particular data objects that may be stored and modified.
A. Data schema for patient data elements
Fig. 12 illustrates an example data schema 1200 of patient data elements according to some embodiments. Using data schema 1200, the different data elements retrieved by medical data processing system 200 are decomposed and used to generate discrete data objects (also referred to as data entities). The data objects store various data elements associated with patient data. The relationships between these data entities are maintained and updated. This data schema allows the system to continuously maintain an up-to-date picture of the patient and store detailed data elements in a structured manner. In some implementations, the data schema is based on the HL7 FHIR (Fast Healthcare Interoperability Resources) standard, as described in "Welcome to FHIR," https://www.hl7.org/fhir/ (2019).
In this example, the data schema 1200 includes a set of data objects (illustrated as boxes, e.g., tumor mass data object 1202, diagnostic findings data object 1205, etc.). In the unified patient database, one patient record may be stored for each patient. The patient record includes a network of interconnected data objects, each of which may include a set of configured data elements corresponding to the data types of a given data object. Each of the boxes depicted in fig. 12 is a data object that can be implemented as a resource using the HL7 FHIR standard. Alternatively, the data objects may be implemented as tables using a relational database.
As shown in FIG. 12, each data object has associated attributes in the form of a set of data types corresponding to the class of the data object. For example, the data schema 1200 includes a tumor mass data object 1202 that can store data elements corresponding to data types such as histology, anatomical site, site description, and performance, as shown in fig. 12. The data types may correspond to fields shown in the interface views (e.g., fields 606-620 shown in FIG. 6A, fields 628-644 shown in FIG. 6B, etc.).
Each data object, such as tumor mass data object 1202, diagnostic findings data object 1205, and patient root data object 1201, represents one clinical data entity. These data objects may be related to each other, which helps manage the map of patient data as a network of interconnected data objects. For example, a given data object may include a data element such as "colon" stored in relation to a data type, such as "location," that characterizes or classifies the data element.
In fig. 12, a line 1220 connecting data objects indicates a relationship between elements. For example, the cancer status data object must be linked to the patient data object and one or more tumor mass data objects, and may optionally be linked to one or more tumor therapy data objects. A circle 1222 indicates that a relationship is optional, a single solid bar 1224 indicates a one-to-one relationship, a V-shaped symbol indicates a one-to-many relationship, and a circle together with a V-shaped symbol (e.g., for the intermediate connector of report 1204) indicates a zero-or-more relationship. Links (connections) between objects may be specified in the unified patient database in various ways. For example, a master list may be stored for a patient record that identifies each object linked to another object. The direction of a link may also be specified, for example, indicating the report created for a tumor mass.
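One way to realize such a master list is to store each connection as a directed edge with a relation label. The sketch below (Python 3.10+) illustrates the idea under assumed names; it does not reproduce the cardinality notation of fig. 12 or the disclosed implementation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Link:
    source_id: str   # e.g., a report data object
    target_id: str   # e.g., the tumor mass the report was created for
    relation: str    # e.g., "created_for", "finding_of", "treated_by"

class PatientRecordLinks:
    """Master list of directed links for one patient record."""

    def __init__(self) -> None:
        self.links: list[Link] = []

    def add(self, source_id: str, target_id: str, relation: str) -> None:
        # Record a directed connection between two data objects.
        self.links.append(Link(source_id, target_id, relation))

    def linked_to(self, object_id: str, relation: str | None = None) -> list[str]:
        # Traverse outgoing links from one object, optionally filtered by relation.
        return [
            link.target_id
            for link in self.links
            if link.source_id == object_id and (relation is None or link.relation == relation)
        ]
```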
Root data object 1201 is a data object for a patient and may include information such as the patient's name, date of birth, gender, and identifier. As indicated by connections 1220, root data object 1201 is connected to various other data objects corresponding to the patient's tumor data. The data objects may be categorized according to diagnosis, treatment, history, or other suitable categories of data. Each of the other data objects may be tied back to the patient root data object 1201. Information from the patient root data object 1201 may be displayed along the top of the interface views of the portal 220 (e.g., patient function area 801 of FIG. 8B). The patient root data object 1201 may be used to identify and traverse the patient data record to identify additional information for display and editing via portal 220.
Various data objects corresponding to different data types are connected to the root data object 1201 for the patient. Each data object is related in some way to the patient root data object. For example, the diagnosis-related data objects 1203 are data objects for describing diagnosis information for the patient. Each diagnosis-related data object 1203 is connected to the patient root data object 1201. The diagnostic findings data object 1205 is a data object corresponding to a diagnosis and is connected to the tumor mass data object 1202. There are various types of diagnostic findings data objects 1205, including TNM staging data objects, molecular/biomarker data objects, tumor size data objects, and other pathology/imaging findings data objects, as shown at the top of fig. 12.
Each of these data objects may store corresponding data elements of the configured data types. For example, the tumor size data object is configured to store data elements corresponding to the data types maximum dimension, additional dimensions, unit, and date. Another pathology/imaging findings data object may be configured to store data elements corresponding to a data type, value, and date. A finding may be any type of information about an anatomical site, obtained from one or more samples of the site, which may be derived from a report. For example, findings such as histological grading may be extracted from pathology reports; from an imaging report, findings such as tumor size can be retrieved. In the data schema 1200, findings are typically associated with a particular site, although some findings may relate directly to the cancer condition itself rather than to a particular site. For example, cancer stage may be defined at a higher level than at an individual site. In the patient history UI shown in fig. 9B-9E, these diagnostic data objects correspond to the events 910 data category displayed in the top row, one example being the cancer diagnosis event 922.
Another data object in the diagnosis 1203 category is the tumor mass data object 1202. The tumor mass data object stores data elements characterizing the tumor, organized according to the data types histology, anatomical site, site description, and performance. For example, the tumor mass data object 1202 includes a structured field for the data type "performance" that indicates whether a tumor is a primary tumor, a metastatic tumor, or a benign tumor.
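A minimal sketch of such a tumor mass data object follows; the class and field names are assumptions, and the field rendered as "performance" in the translation is modeled here as a presentation enum with the three values named above:

```python
from dataclasses import dataclass
from enum import Enum

class Presentation(Enum):
    # The "performance" field: whether the mass is primary, metastatic, or benign.
    PRIMARY = "primary"
    METASTATIC = "metastatic"
    BENIGN = "benign"

@dataclass
class TumorMass:
    """Illustrative data elements of a tumor mass data object (cf. object 1202)."""
    histology: str            # e.g., "invasive ductal carcinoma"
    anatomical_site: str      # e.g., "right breast"
    site_description: str     # e.g., "6 o'clock position"
    presentation: Presentation

mass = TumorMass(
    histology="invasive ductal carcinoma",
    anatomical_site="right breast",
    site_description="6 o'clock position",
    presentation=Presentation.PRIMARY,
)
```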
For multiple tumor masses, there may be separate data objects connected to the root data object 1201. There may be multiple instances of the tumor mass object, each instance corresponding to a different tumor mass identified at a different location in the patient. A tumor mass can be designated as a primary cancer or a metastasis, which will affect its network interconnections with other objects. Thus, the data objects may include a data object corresponding to a primary cancer, another data object corresponding to a metastasis of the primary cancer, and so on. As shown in fig. 12, for each tumor mass, the data object may include information such as histology, anatomical location, location description, and performance. The data object is linked to various other diagnosis-related data objects 1203, including cancer status, diagnostic findings 1205, and reports 1204.
One or more treatment-related data objects correspond to treatments and are connected to the patient root data object 1201 and/or the tumor mass data object 1202. The treatment-related data objects include a tumor therapy data object 1208. The tumor therapy data object 1208 is configured to store data elements of the types therapy type, date, and response, and may be linked to an associated report. The tumor therapy data object 1208 may be used to populate the treatment 916 row of the patient history interface of fig. 9B-9E.
One or more report data objects 1204 may be connected to the patient root data object 1201 and/or the diagnostic findings data object 1205. Reports from the EMR or other sources may also be stored as report data objects 1204. As shown in fig. 12, report data object 1204 is configured to store the data types status, category, title, date, and attachment. Report data object 1204 may include attachments or addenda in the form of PDFs or images. When changes are made to the patient's clinical files and medical records, an addendum is issued. Addenda may include information that was not available at the time of initial entry, or corrections to previously published medical information. It is important for a clinician to know whether a particular patient report has been updated or added to, and to be able to view the complete report. Report data object 1204 may also include text data retrieved from the report (e.g., using OCR).
One or more history related data objects 1210 may be stored and connected to the patient root data object 1201 and/or the tumor mass data object 1202. The history-related data objects 1210 may include various different types of data objects having corresponding attributes, as shown in FIG. 12. For example, the data schema 1200 may include a medication data object, a complications data object, a family history data object, a surgical history data object, an allergy data object, a substance abuse data object, a physical fitness status data object, an environmental risk data object, a social history data object, and other historical discovery data objects, as depicted in fig. 12. History-related data elements of various data types, as shown in fig. 12, may be stored to history-related data object 1210. The data map as shown in FIG. 12 may be used to establish where corresponding data elements will be displayed in the various interface views.
In data schema 1200, each data object can be stored in association with one or more timestamps. The time stamp may track the time at which the event occurred. For example, a given data object may include a timestamp corresponding to a date and/or time of diagnosis, treatment, sample collection, procedure date, report issue, or other event. The time stamp may further track when the data is integrated into a unified patient database. For example, when data is stored to a unified patient database, medical data processing system 200 generates and stores a timestamp indicating the time at which the data was incorporated into the unified patient database.
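Because each object carries both an event time and an ingestion time, a thin wrapper that stamps the record when it is written to the unified patient database is enough to illustrate the idea. The field and function names below are assumptions, not the disclosed implementation:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class StoredObject:
    payload: dict         # the clinical data elements themselves
    event_time: datetime  # when the diagnosis, treatment, procedure, or report occurred
    ingested_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

def store(database: list, payload: dict, event_time: datetime) -> StoredObject:
    """Write a data object to the unified patient database with both timestamps."""
    obj = StoredObject(payload=payload, event_time=event_time)
    database.append(obj)
    return obj

# Usage: the event time comes from the record; the ingestion time is stamped automatically.
db: list = []
store(db, {"type": "tumor_therapy", "therapy_type": "radiation"}, datetime(2020, 1, 14))
```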
B. Data pattern example
FIG. 13 illustrates an example data pattern 1300 according to some embodiments. The data schema 1300 includes data objects for different cancer sites. Cancer 1 1302 and cancer 2 1307 are primary cancers. Each of these is stored as its own data object with associated information such as staging, diagnostics, etc., stored to the data object.
At a first time T1, cancer 1 may be associated with a plurality of data objects in the data schema 1200 of the unified patient database shown in fig. 12, including tumor mass 1202. Other objects for findings associated with cancer 1 are linked to the tumor mass, including TNM stage objects, biomarker objects, tumor size objects, and the like.
Later, other sites can be found and associated with the primary cancer, such as cancer 1 or cancer 2. When a new site (tumor mass) is identified, a new tumor mass object may be linked in the data model. For example, as shown in fig. 13, the tumor 1 data object 1306 is stored in association with the primary cancer 1 data object 1302. The tumor 2 data object 1308 and the tumor 3 data object 1310 may be two data objects stored in association with the cancer 2 data object 1304. The tumor mass 1 data object 1306, tumor mass 2 data object 1308, and tumor mass 3 data object 1310 may correspond to multiple tumor mass objects 1202 linked to the same patient object 1201, as shown in fig. 12.
The example depicted in fig. 13 shows how the data schema of fig. 12 may be used to represent the diagnostic history that a cancer patient may experience, which may include various tests, imaging, and other diagnostics, with new information appearing over time. The data schema is designed to be updatable while maintaining the complex relationships among different data types from different data sources at different times.
As shown on the right, each of the tumor mass objects 1306, 1308, and 1310 has associated data elements for storing information such as location (e.g., liver), size, and the like, and if the data is from a report, the report is also stored as a data element of the data object (e.g., reports 1312, 1314, 1316, and 1318 and the associated data attributes that may be retrieved from those reports). For example, information is extracted from the reports using the interfaces shown and described above with reference to FIGS. 3A-7B. NLP can be used to assign data categories to data elements, which can then be used to populate the appropriate data objects.
Each of data objects 1306, 1308, and 1310 may correspond to three hypothetical points in time. Each point in time represents a time at which data filling the corresponding data object is obtained. For example, data object 1306 is populated with data from a radiation report PDF 1312 obtained at a given date, data object 1308 is populated with data from a pathology report 1314 obtained at a later time, and data object 1310 is populated with data from another pathology report 1316 obtained at a given date. Each of these may be ingested into a unified patient database at a different respective time and tracked with a time stamp stored in the respective data object.
For example, at a first point in time, a lung tumor is found in the patient's lung based on the radiation report 1312. At this point, other tests are pending. The data schema 1300 may be updated when additional information becomes available. The initial data objects may correspond to an initial assumption about the patient's diagnosis. For example, there is a two-centimeter mass in the lung and a one-centimeter mass in the liver. The initial diagnosis entered is primary lung cancer that may have metastasized to the liver. The doctor may order additional examinations. In this example, report 1312 is connected to the two tumors 1306 and 1308, indicating that when report 1312 was obtained, both tumors 1306 and 1308 were included in the radiological analysis and the corresponding data was retrieved.
At time two, when the additional test results 1314 and 1316 are returned, two more reports are added to data pattern 1300. Report 1314 is a pathology report related to tumor mass 1 1306. At time two, pathology report 1314 is ingested into the system, and NLP is used to identify the data categories corresponding to the data fields retrieved from pathology report 1314 and populate the corresponding data objects, including the tumor mass data object 1202 corresponding to tumor mass 1 1306 and the linked data objects corresponding to the relevant findings. Report 1316 is a pathology report related to tumor mass 2 1308. At time two, pathology report 1316 is also ingested into the system, and NLP is used to identify the data categories corresponding to the data fields retrieved from pathology report 1316 and populate the corresponding data objects, including the tumor mass data object 1202 corresponding to tumor mass 2 1308 and the linked data objects corresponding to the relevant findings.
At time three, once the medical data processing system 200 retrieves the colonoscopy report 1318, additional colonoscopy findings are extracted from the report. This facilitates additional diagnosis by the user, such as confirming that the liver mass corresponds to a colon mass found during the colonoscopy. A final picture of the patient's diagnosis may then be created. In this example, the diagnosis includes lung cancer and colon cancer that has metastasized, that is, two different primary diseases at the same time.
The data schemas depicted in fig. 12 and 13 help represent all three states as snapshots in time, while also allowing the user to alter the relationships between entities as new information from a new report becomes available. The data schema provides a representation of the report itself, as well as representations of the individual findings extracted from the report. The data schema also provides a representation of each cancer and anatomical site, as well as the attributes of those sites. Each data object is associated with one or more timestamps, so that the patient's history can be tracked over time to better facilitate the clinical decision-making process. The data schema links sites, findings, and reports, while allowing each site to be related to the most up-to-date information. Some of these relationships may be modified individually without affecting the rest of the graph of data elements and attributes. When these associations are created, a timestamp is associated with each association. Thus, the data schema facilitates interface views that provide visibility not only into when reports were created, but also into new associations, old associations, and how the associations change over time. The data schema can also track provenance information (e.g., who edited what, and when).
V. Methods
A. Medical data workflow overview
Fig. 14A-14D show an overview of oncology workflow for ingestion, modification and display of patient data. The workflow of fig. 14A-14D includes collecting and storing data to a unified patient database 1409 (e.g., the unified patient database 204 depicted in fig. 2). The data may include radiographic, procedural, and pathological findings relating to one or more primary tumors and their associated metastatic lesions, which may be updated through the course of cancer treatment and other aspects of patient history. The data may also be retrieved from the unified patient database 204 and displayed in a series of interface views that facilitate clinical patient management, care, and diagnosis. Fig. 14A-14D provide an overview of the operations described in further detail with reference to the methods of fig. 15-21.
In fig. 14A, data is collected and stored to a unified patient database 1409. First, a patient record is created, which may be generated via input from the user 1401 and/or the EMR integration 1401. The user may manually create a new patient at 1403. The EMR may send a selected patient to the system at 1404. The data may come from an external database of the EMR or from other data systems, such as a laboratory system or a hospital system. This may yield patient data, such as a patient identifier and other data types, that may be stored to a patient root data object 1201 as shown in fig. 12. The data collected at 1403 and 1404 includes patient data identifying the patient. If there is no pre-existing record for the patient, a new record is created.
The additional data may then be stored to the unified patient database, for example, as it is collected and/or on a periodic basis. At 1408, the EMR sends a report to the system. The system generates structured data from the report and sends the structured data to the unified patient database 1409 for storage in association with the patient record. This may be done using the interfaces shown and described above with reference to fig. 4A-7B. At 1406, the user manually adds structured data that is stored to the unified patient database 1409. This may be done using the interfaces shown and described above with reference to fig. 3A-3H. The data may be stored according to the data schemas described above with reference to fig. 12 and 13. Thus, the system can collect structured and unstructured data from different sources and store it in a unified manner in the unified patient database 1409.
The data stored to the unified patient database 1409 may include identifying information about the patient and the patient demographics. The data stored to the unified patient database may include structured data regarding a patient's diagnosis, medication, medical history, and the like. The data stored to the unified patient database may also include unstructured data such as pathology reports, imaging reports, clinical notes, and the like. For example, as shown in FIG. 12, data may be stored into data objects that map to each other, and may be updated and modified over time. As shown in FIG. 13, specific instances of these data objects may store the reports themselves in association with data that has been retrieved from these reports.
Once all of the data stored to the unified patient database is in structured form, the data may be used to generate various analyses or visualizations, such as the patient summary and patient lineage views, as described above with reference to section III. Before this can be achieved, data enrichment operations are performed on data from the EMR or other external databases/systems.
In fig. 14B, data extraction of report 1412 is performed. As shown in fig. 14B, report 1412 may include a pathology report, treatment, and the like. At 1414, the user opens the report. The report may or may not contain structured data. The user may open the report for display. Based on the open report, the list of fields that can be populated with information in this report may vary. For example, as shown in fig. 6A, an interface 600 for data extraction displays a surgical pathology report and a corresponding set of fields associated with the surgical pathology report to be populated. As shown in fig. 6B, interface 625 for data extraction displays a radiation report and a corresponding set of fields associated with the radiation report to be populated.
Extraction occurs at 1416. Certain fields or medical concepts may be highlighted in the data extraction UI for the user to provide information such as diagnosis, records, and the like. At 1418, the user fills in the missing information. Where possible, OCR and NLP assist in mapping the structured data to standardized terms. This process generates structured data from the unstructured report, and the structured data is persisted at 1419. Once the user has saved all of this information, it is immediately sent to the unified patient database. The data is thereby enriched by adding more structured information extracted from the report and sending it back to the unified patient database.
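The extraction loop of fig. 14B, ingest a report, pre-fill what OCR/NLP can recognize, let the user complete the rest, then persist, can be sketched as below. The helper names (run_ocr, suggest_fields, ask_user) are placeholders standing in for the models and UI described in this disclosure, not real library calls:

```python
def extract_report(report_pdf: bytes, field_list: list,
                   run_ocr, suggest_fields, ask_user) -> dict:
    """Turn an unstructured report into structured data for the unified database.

    run_ocr        -- callable returning the report text (stands in for the OCR model)
    suggest_fields -- callable mapping text to {field: value} suggestions (stands in for NLP)
    ask_user       -- callable prompting the clinician to fill or confirm a field
    """
    text = run_ocr(report_pdf)
    structured = suggest_fields(text)              # pre-filled where possible
    for field_name in field_list:                  # the field list depends on the report type
        if structured.get(field_name) is None:
            structured[field_name] = ask_user(field_name, text)  # user fills missing info
    return structured                              # persisted to the unified patient database
```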
In fig. 14C, further details regarding the data extraction process are shown. At 1421, the user extracts findings about an anatomical site from the report. At 1422, the system determines whether the site has already been associated with any primary cancer. This may be accomplished via, for example, an interface in which user input provides or confirms an association. If the site has been associated with a primary cancer, the system stores the anatomical site findings in association with the primary cancer at 1424 while preserving the existing association. If the site has not been associated with a primary cancer, at 1423 the system displays the anatomical site in the reconciliation area and allows the user to associate the anatomical site with a primary cancer. The flow then proceeds to the reconciliation process described below with reference to fig. 14D.
At 1425, the system determines whether the user wants to add/update the association. If the user does not want to add or update the association, the add/update process is skipped. If the user does want to add or update the association, then at 1426, the system displays an association menu for each anatomical-site-related finding. Via the association menu, the user may mark a site as the primary site of any one of the primary cancers, or associate the site as a metastasis of one or more of the primary cancers. In some embodiments, at a given time, an anatomical site may be the primary cancer site of only one primary cancer. For various reasons, such as a pending diagnosis, medical judgment, or because it is not important to the progress of the patient's treatment, the anatomical site may be associated with multiple primary cancers at a given time.
These association updates have a number of options. At 1428, a site previously marked as a primary cancer site is thereafter marked as a metastasis. The user is allowed to proceed only if there is no stage associated with the primary, at 1432. At 1433, the system displays the finding as a metastasis of the newly associated primary cancer and updates the data objects, including biomarkers and pathology/radiology reports, with the latest information about the finding. Any tracked biomarkers will be displayed accordingly. The system also allows a user to select and track which of a number of biomarkers are critical to the description of the cancer. The biomarker information may be presented in the patient summary view. In addition to the patient summary interface views depicted in fig. 8A-8C, the patient summary interface may further include an interface view that displays relevant biomarkers and accepts user input to add, modify, or drill into more detailed biomarker data.
At 1429, the site is re-associated, as depicted at 1427, with each finding/site association updated. For example, a site previously designated as a metastasis is updated so that it is associated with a different primary cancer. At 1434, the anatomical site is displayed in the metastasis section of the newly associated primary cancer. The system moves the anatomical site and the corresponding biomarker and pathology/radiology report findings to the correct primary cancer. By using the connections between data objects described above with reference to fig. 12 and 13, information such as biomarkers and findings can be stored in relation to the different primary cancer objects. Any biomarkers will be displayed in association with the updated primary cancer object accordingly. For example, if a stage is found to be associated with the site, the new primary cancer will be updated to that stage.
At 1430, the user retains the current association for the finding. With the current association maintained, the system updates, at 1435, any information about the finding from the new report. The primary cancer association is preserved. Any tracked biomarkers or stage will be displayed accordingly. If the pathology report has any stage associated with the finding, the stage information may not be displayed in the patient summary unless the site is the primary site.
At 1431, the user marks the anatomical site as benign. If the user marks the anatomical site as benign, at 1436 the benign site disappears from the patient summary and patient history visualizations because it is no longer relevant to the cancer diagnosis. At 1437, the report is exited and the interface transitions back to the patient summary view.
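The update branches at 1428-1436 all amount to rewriting the link between an anatomical-site object and a primary-cancer object while the site's dependent findings follow it. A hedged sketch of the "re-associate as a metastasis" case, reusing the illustrative dict-style link list assumed earlier (relation labels are assumptions):

```python
def reassociate_as_metastasis(links: list, site_id: str, new_primary_id: str) -> list:
    """Record an anatomical site as a metastasis of a different primary cancer,
    dropping its previous primary/metastasis association."""
    updated = []
    for link in links:
        if link["source"] == site_id and link["relation"] in ("primary_site_of", "metastasis_of"):
            # Drop the old primary association; it is replaced below.
            continue
        updated.append(link)
    updated.append({"source": site_id, "target": new_primary_id, "relation": "metastasis_of"})
    # Findings and biomarkers remain linked to site_id, so they follow the site to the new primary.
    return updated
```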
FIG. 14D shows a data visualization 1410 portion of a workflow. This may include retrieving data associated with the patient from a unified patient database and displaying a user interface, such as a patient history interface or a patient summary interface.
At 1442, data is retrieved from the unified patient database 1409 and displayed in the patient history. The patient record in the unified patient database is identified based on the patient's identifier. This may include a patient root data object 1201 that may be identified by querying the unified patient database for the patient object corresponding to the identifier. As shown in fig. 12, the patient root data object 1201 is mapped to various other data objects that may be time stamped and used to visualize the patient's history over time. Examples of patient history interfaces are shown in fig. 9A-9E.
At 1440, data is retrieved from the unified patient database 1409 and displayed in the patient summary. The patient record in the unified patient database is identified based on the patient's identifier. This may include a patient root data object 1201 that may be identified by querying the unified patient database for the patient object corresponding to the identifier. As shown in FIG. 12, the patient root data object 1201 is mapped to various other data objects, which may be time stamped and used to populate the patient summary interface. Examples of patient summary interfaces are shown in fig. 8A-8C. The user may make updates from the patient summary, for example, if the user wants to update an association from the metastasis section of the patient summary at 1444. This may trigger display of the modification UI at 1446. If the site was created manually, no report is displayed; only the association UI is displayed. If the site originates from a report, the report is also visible.
Data reconciliation occurs at 1450. The user may interact with the GUI to establish a missing relationship (e.g., associate an identified cancer mass with a particular primary site, etc.). Unassociated findings are marked for later reconciliation. During reconciliation, unmapped data is identified. In one example, a cancer mass does not specify an associated primary site. In another example, a surgical history record does not specify whether it is a tumor-related or non-tumor-related surgical history. As another example, reconciliation may be used to identify the stage of a cancer. Reconciliation may be used to identify missing relationships and fill in those missing relationships, as well as to determine where in the UI it is appropriate to display a particular piece of information. Reconciliation may be guided using the interfaces depicted in fig. 7A-7B.
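Identifying unmapped data is essentially a scan for objects that are missing an expected link. A minimal sketch under the same illustrative dict-based representation (not the disclosed implementation):

```python
def findings_needing_reconciliation(tumor_masses: list, links: list) -> list:
    """Return cancer masses not yet associated with any primary cancer, so the GUI
    can surface them in the reconciliation area."""
    associated = {
        link["source"]
        for link in links
        if link["relation"] in ("primary_site_of", "metastasis_of")
    }
    return [mass for mass in tumor_masses if mass["id"] not in associated]

# Usage: masses returned here are flagged for the user to associate with a primary site.
masses = [{"id": "mass-1"}, {"id": "mass-2"}]
links = [{"source": "mass-1", "target": "cancer-1", "relation": "metastasis_of"}]
print(findings_needing_reconciliation(masses, links))  # [{'id': 'mass-2'}]
```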
B. Data management techniques
Fig. 15 illustrates a method 1500 of managing patient data from different sources in an integrated manner. Method 1500 may be performed by, for example, medical data processing system 200 of fig. 2. The method 1500 may be used to integrate structured and unstructured data from various sources into a unified patient database in a unified and organized manner so that the data may be used to generate useful visualizations as described herein.
In step 1502, the medical data processing system 200 creates a patient record for a patient in a unified patient database. The patient record includes an identifier of the patient and one or more data objects related to medical data associated with the patient. The identifier of the patient may be, for example, the patient's name, the patient's alphanumeric identifier, etc. As described above with reference to fig. 12, a unified patient database may store a plurality of different types of data objects that collate different types of medical data associated with patients. For example, the patient record may include data objects corresponding to tumor masses, data objects corresponding to treatments administered to the patient, and so forth.
The unified patient database includes data from multiple sources (e.g., data may be ingested into the unified patient database from EMR, RIS, user entries, wearable devices, etc.). As described above with reference to fig. 14A, patient records may be created via user input or information retrieved from an external database, such as an EMR. Creating a patient record may include generating and storing a data object, table, or other record for the patient. The stored data may include patient identifier, demographic information, date of birth, and the like.
In step 1504, the medical data processing system 200 retrieves medical records for the patient from an external database. The medical records may include unstructured data, such as reports in PDF or image format. Alternatively or additionally, the medical records may include structured data, such as tables. Medical records may be retrieved from one or more external databases including, for example, EMR (electronic medical record) systems, PACS (picture archiving and communication systems), digital pathology (DP) systems, LIS (laboratory information systems) including genomic data, RIS (radiology information systems), patient-reported outcomes, wearable and/or digital technology, social media, and the like. The medical records may include information such as a name identifying a particular cancer mass, a timestamp associated with the report, and other information, as described herein. In some embodiments, the medical data processing system 200 retrieves the medical records based on the identifier of the patient. For example, the medical data processing system 200 queries a unified patient database to identify records that include or are indexed by the patient identifier. Alternatively or additionally, medical data processing system 200 may periodically retrieve medical records (e.g., by downloading data from an external database in batches).
The medical records may include structured data and/or unstructured data. For example, a medical record for the patient may be structured (e.g., in a first format). The structured data may include a set of data elements associated with corresponding data types. The data elements may include a word or set of words corresponding to elements in the medical record, examples of which may include "right breast tumor," "MRI of May 1, 2021," and so forth. Each data element may be tagged and/or stored in association with a corresponding data type that characterizes the data element, such as "primary tumor," "treatment," and the like. Alternatively or additionally, a medical record for the patient may be unstructured (e.g., in a second format). Unstructured data may include data elements of unspecified data types.
In some embodiments, the medical record includes unstructured data. Medical data processing system 200 may identify text from unstructured data, such as PDF or images. The medical data processing system 200 may apply a first machine learning model to identify text in a medical record. For example, the first machine learning model is or includes an Optical Character Recognition (OCR) model and recognizes text using OCR.
Medical data processing system 200 may apply a second machine learning model to associate portions of the identified text with corresponding fields. Medical data processing system 200 may use the second machine learning model to identify data elements, such as a word or a set of words, and analyze the unstructured data to assign the data elements to data types. For example, after identifying the data element "colon cancer," the surrounding words and phrases are analyzed to assign the data type "diagnosis" to the data element.
In some aspects, the second machine learning model is or includes a Natural Language Processing (NLP) model. The trained NLP model identifies the data types of text in the unstructured report (e.g., the NLP model determines that the text "January 10, 2020" corresponds to the "date" data type/field, and the text "radiation" corresponds to the "treatment type" data type/field). Medical data processing system 200 may, for example, identify entities from an input text string using NLP. The NLP model may identify entities corresponding to predefined medical categories and classifications, such as medical diagnoses, procedures, medications, specific locations/organs within the patient, and the like. In some embodiments, this may use a named entity recognizer trained on medical data to identify entities corresponding to the data types of interest. Each entity may be tagged with a data type indicating a category/classification and specify a data element or value corresponding to the classified data. Medical data processing system 200 may then generate structured medical data that associates data types with data elements based on the mapping. Techniques for using machine learning to process unstructured medical data are described in more detail in PCT publication WO 2021/046536, mentioned above.
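For illustration only, the tagging step can be approximated with a toy rule-based tagger; the actual system uses trained OCR/NLP models (e.g., a medical named entity recognizer), which are not reproduced here:

```python
import re

# Toy patterns standing in for a trained medical named entity recognizer.
DATA_TYPE_PATTERNS = {
    "date": re.compile(
        r"\b(January|February|March|April|May|June|July|August|September|October|"
        r"November|December)\s+\d{1,2},\s+\d{4}\b"
    ),
    "treatment_type": re.compile(r"\b(radiation|chemotherapy|surgery|immunotherapy)\b", re.IGNORECASE),
    "diagnosis": re.compile(r"\b(\w+ cancer|carcinoma|adenocarcinoma)\b", re.IGNORECASE),
}

def tag_data_elements(text: str) -> list:
    """Map free text to (data element, data type) pairs, mimicking the NLP tagging step."""
    tagged = []
    for data_type, pattern in DATA_TYPE_PATTERNS.items():
        for match in pattern.finditer(text):
            tagged.append({"data_element": match.group(0), "data_type": data_type})
    return tagged

# Example: "Radiation" is tagged as a treatment type, the date as a date, "colon cancer" as a diagnosis.
print(tag_data_elements("Radiation therapy started January 10, 2020 for colon cancer."))
```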
In some embodiments, medical data processing system 200 is communicatively coupled to a plurality of external databases/systems, including EMR, PACS, DP and the like. When these systems make changes to data associated with one or more patients managed by medical data processing system 200, the data is transmitted to medical data processing system 200. Medical data processing system 200 may periodically pull medical records from one or more external databases to periodically update a unified patient database.
In step 1506, medical data processing system 200 receives an identification of the primary cancer associated with the medical record via a Graphical User Interface (GUI). For example, the extraction process may be performed using an associated interface such as that shown in FIG. 7B. The user may associate the cancer mass identified in the report with a particular primary cancer location. In some embodiments, receiving an identification of a primary cancer associated with the medical record includes displaying the medical record and a menu via the GUI, the menu configured to receive user input selecting one or more primary cancers; and receiving user input selecting the primary cancer via the graphical user interface.
In some cases, such a user selection is made during the reconciliation process, as described above with reference to fig. 7A and 7B. For example, the medical record is stored in the patient record. The medical data processing system 200 parses the medical record to determine that the medical record is not associated with a particular primary cancer. In response to determining that the record is not associated with a particular primary cancer, the medical data processing system 200 displays the medical record and prompts the user to reconcile the data via an interface such as that depicted in fig. 7A and 7B.
Alternatively or additionally, the medical data processing system 200 receives an identification of a potential primary cancer associated with the medical record from an external database (e.g., EMR). In some cases, such identification received from the remote database may be confirmed via user input to the GUI. For example, medical data processing system 200 identifies a primary cancer by analyzing data elements and data types. For example, a particular data element (e.g., "left breast cancer") may be labeled as a data type, indicating that the data element corresponds to a primary cancer (e.g., "primary cancer"). In some embodiments, the data extraction module 232 retrieves medical data from a document file and maps the retrieved data to a specific primary cancer. The mapping may be based on a master Structured Data List (SDL) defining a list of data categories for document types of the document.
Medical data processing system 200 may display a GUI prompting the user to confirm the primary cancer identification (e.g., with pre-filled fields that may be highlighted and/or marked with text prompting the user to confirm or modify the primary cancer designation). Medical data processing system 200 may then receive a user confirmation of the primary cancer identification via the GUI. Alternatively or additionally, medical data processing system 200 may, in some cases, identify a primary cancer without user intervention. For example, the data elements may be stored to a unified patient database and marked in association with a data type (e.g., structured field) that indicates tumor "manifestation," as shown in fig. 12, which may indicate whether a tumor is a primary tumor, a metastatic tumor, or a benign tumor.
In some cases, identifying the primary cancer may include analysis of unstructured data by medical data processing system 200. For example, the medical records are received in an unstructured format including unstructured data. Medical data processing system 200 identifies data elements associated with a primary cancer from unstructured data and analyzes the unstructured data to assign the data elements to data types. This may be done using one or more machine learning models as described above with reference to step 1504.
In step 1508, the medical data processing system 200 stores the medical records linked to the primary cancer subject in patient records in a unified patient database. Storing the medical records may include storing the identified text in association with the identified fields to a unified patient database using the data patterns described above with reference to fig. 12 and 13.
In step 1510, the medical data processing system 200 receives medical data for the patient via user input to the GUI. This may be the direct input of medical data using the interface shown with reference to fig. 3A-3H and described above with reference to fig. 3A-3H. For example, the user may input therapy information, diagnostic information, information regarding primary cancer metastasis, etc., into corresponding fields of the GUI.
In step 1512, the medical data processing system 200 determines that medical data for the patient is associated with the primary cancer. For example, data entered into the GUI by the user may be entered into a field specified for the primary cancer. As another example, the data retrieved from the external database indicates that the medical data for the patient is associated with the primary cancer. The medical data processing system 200 may compare the fields received at 1510 with corresponding stored data elements in the unified patient database corresponding to the medical records retrieved at 1504.
In step 1514, the medical data processing system 200 stores the patient-specific medical data linked to the primary cancer object in a patient record in a unified patient database. The data elements may be linked using the data patterns described above with reference to fig. 12 and 13.
Data stored in a unified patient database can be efficiently retrieved and displayed for the user. For example, the medical data processing system 200 retrieves at least a subset of the medical data for the patient from a unified patient database. Via the user interface, the medical data processing system 200 causes display of the at least a subset of medical data for the patient for clinical decision making. Causing display may include displaying a user interface on a display component of medical data processing system 200 itself, or transmitting instructions that may be used by an external computing device to display the user interface. The displayed information is displayed in a user-friendly manner via interfaces such as those depicted in fig. 7A-10 to facilitate clinical decision-making.
C. Techniques for data management
Fig. 16 illustrates a method 1600 of managing a unified patient database using a data schema such as that depicted in fig. 12. The data schema can be used to manage patient data to facilitate efficient generation of interface views as described herein to facilitate clinical decision making and to facilitate output of structured medical data. Method 1600 may be performed by, for example, medical data processing system 200 of fig. 2.
In step 1602, medical data processing system 200 stores a patient record including a network of interconnected data objects to a unified patient database. As described above, the unified patient database may include data from multiple sources, such as data integrated from an EMR system, provided to an interface on a remote computer via user input, data collected from a wearable device, and so forth.
In step 1604, the medical data processing system 200 stores a first data object corresponding to a data element of a tumor mass for the primary cancer to a patient record in a unified patient database, the first data object including attributes specifying a location of the tumor mass. In some embodiments, the initial data is uploaded from one or more of a plurality of sources to a unified patient database. When a given data element (e.g., information corresponding to a particular field, such as information characterizing a tumor mass) is ingested into the system, the medical data processing system 200 creates a data object for storing the information. A data object may be created in response to data obtained from different sources. For example, a user may input data from a user interface. Some data may be automatically ingested from an external system such as an EMR. Additional structured data may be automatically extracted from the document (e.g., PDF) and verified by the user. The data object may also include one or more data attributes, including attributes that specify the location of a tumor mass (e.g., right lung, left breast, etc.).
In step 1606, medical data processing system 200 receives diagnostic information corresponding to the primary cancer from the diagnostic computer. Medical data processing system 200 may receive information that a physician has entered into a user interface provided by medical data processing system 200, for example, over a network. As a specific example, using a GUI such as that depicted in fig. 4B and 4C, a physician may enter diagnostic information such as findings and biomarkers. Such information may be collected in a structured manner based on input data fields of the GUI.
In step 1608, the medical data processing system 200 analyzes the diagnostic information to identify correlations between the diagnostic information and tumor mass. For example, this may involve traversing data received from the GUI. As a specific example, as depicted in fig. 4C, the GUI includes fields for tumor site information 420a and biomarker information 420 b. When the medical data processing system 200 receives data from such a GUI, it can determine that a tumor site (e.g., a primary tumor mass) is associated with a biomarker. Alternatively or additionally, the diagnostic information may be from unstructured reports and medical data processing system 200 may apply one or more machine learning models to identify data types and correlations, as described above with reference to step 1504 of fig. 15.
In step 1610, based on identifying the correlation between the diagnostic information and the tumor mass, medical data processing system 200 stores a second data object corresponding to the diagnostic information to a unified patient database, the second data object being connected to the first data object via a network of interconnected data objects. The second data object may include one or more attributes, such as stage, biomarker, and/or tumor size of the primary cancer. Medical data processing system 200 may store data objects connected to a first data object using the data schema described above with reference to fig. 12 and 13.
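Steps 1606 through 1610 can be summarized as: receive the structured payload, find the tumor mass it refers to, and write a findings object linked to that mass. The sketch below uses assumed names and a dict-based record layout; it is an illustration of the flow, not the claimed implementation:

```python
def store_diagnostic_findings(patient_record: dict, gui_payload: dict) -> dict:
    """Store diagnostic information as a data object connected to the tumor mass it
    relates to (cf. steps 1606-1610)."""
    # Step 1608: the GUI payload pairs a tumor-site field with biomarker/stage fields,
    # which is the correlation used to pick the first data object (the tumor mass).
    mass_id = gui_payload["tumor_site_id"]

    findings = {
        "id": f"findings-{len(patient_record['objects']) + 1}",
        "type": "diagnostic_findings",
        "stage": gui_payload.get("stage"),
        "biomarkers": gui_payload.get("biomarkers", []),
        "tumor_size_mm": gui_payload.get("tumor_size_mm"),
    }
    # Step 1610: connect the second data object to the first via the link network.
    patient_record["objects"][findings["id"]] = findings
    patient_record["links"].append(
        {"source": findings["id"], "target": mass_id, "relation": "finding_of"}
    )
    return findings
```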
In step 1612, medical data processing system 200 receives therapy information corresponding to the primary cancer from the diagnostic computer. Treatment information may be received from the diagnostic computer in a similar manner to the diagnostic information, as described above in step 1606. For example, treatment information may be retrieved from a diagnostic computer via input to a GUI, analysis of unstructured reports, or other suitable means.
In step 1614, medical data processing system 200 analyzes the therapy information to identify correlations between the therapy information and tumor mass. Medical data processing system 200 may analyze the structured fields and/or NLP the text data, for example, in a similar manner as described above with reference to step 1608.
In step 1616, based on identifying the correlation between the therapy information and the tumor mass, medical data processing system 200 stores a third data object corresponding to the therapy information to the unified patient database, the third data object being connected to the first data object via a network of interconnected data objects.
Medical data processing system 200 may also receive and store patient history data such as surgical history, complications, medications, and other family history, as described above with reference to fig. 12. For example, medical data processing system 200 receives patient history data. Patient history data may be received from a diagnostic computer (e.g., via direct user input). Alternatively or additionally, patient history data may be received from an external computing system, such as an EMR. The medical data processing system 200 analyzes the patient history data to identify correlations between the patient history data and the tumor mass (e.g., in a similar manner as described above with reference to step 1608). Based on identifying the correlation between the patient history data and the tumor mass, the medical data processing system 200 stores a fourth data object corresponding to the patient history data to a unified patient database. The fourth data object is connected to the first data object via a network of interconnected data objects.
The medical data processing system 200 may also receive and store information about additional tumor masses, such as a tumor mass at a metastatic site of the primary cancer, a tumor mass associated with another primary cancer, and so forth. For example, the medical data processing system 200 receives, from the diagnostic computer, tumor mass information corresponding to a tumor mass at a metastatic site of the primary cancer. The user may enter the tumor mass information into the GUI or upload a document, and the data may be transmitted to the medical data processing system 200 via the GUI in a similar manner as described above with reference to step 1606. The medical data processing system 200 analyzes the tumor mass information to identify correlations between the tumor mass information and the tumor mass (e.g., in a similar manner as described above with reference to step 1608). Based on receiving the tumor mass information and identifying the first data object, the medical data processing system 200 stores a fifth data object corresponding to the tumor mass information, connected to the first data object via the network of interconnected data objects, to the unified patient database.
Medical data processing system 200 may then update the unified patient database. For example, medical data processing system 200 imports medical data from an external database. The external database may correspond to one or more of an EMR (electronic medical records) system, a PACS (picture archiving and communication system), a Digital Pathology (DP) system, an LIS (laboratory information system), and/or an RIS (radiology information system), for example. In some examples, medical data processing system 200 parses the imported data to identify specific data elements associated with the patient and the primary cancer. The medical data processing system 200 can, for example, parse structured data received from the EMR or another source to identify fields indicating that a data element describes a treatment, a tumor mass, or another type of medical data. The structured data can also identify the primary cancer corresponding to the first data object (e.g., a field in the ingested structured data can indicate that a treatment was applied for the primary cancer). Medical data processing system 200 may then store the particular data element to a sixth data object in association with the first data object. For example, the sixth data object is linked to the first data object in a data schema similar to that depicted in fig. 12.
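As a further non-limiting illustration, the sketch below shows one way imported structured records might be filtered to find the data elements that should be linked to a given primary cancer. The field names "record_type" and "applies_to" are assumptions made for the example and do not reflect any particular EMR export format.

```python
# Illustrative sketch only: selecting imported structured records that describe
# a given primary cancer so they can be linked to its data object.
def select_records_for_primary(records, primary_cancer_label):
    """Return the imported data elements to be stored in association with the
    primary cancer's data object in the unified patient database."""
    selected = []
    for record in records:
        # A field in the structured data indicates which primary cancer the
        # element (treatment, tumor mass, report, ...) was recorded against.
        if record.get("applies_to") == primary_cancer_label:
            selected.append({
                "element_type": record.get("record_type"),   # e.g., "treatment"
                "data": {k: v for k, v in record.items()
                         if k not in ("record_type", "applies_to")},
            })
    return selected


records = [
    {"record_type": "treatment", "applies_to": "breast_primary",
     "name": "doxorubicin", "date": "2021-03-02"},
    {"record_type": "lab", "applies_to": "lung_primary", "name": "CBC"},
]
print(select_records_for_primary(records, "breast_primary"))
```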
As described above with reference to fig. 12, the data stored to the unified patient database may be indexed using time stamps. The time stamp may track the time at which the event occurred (e.g., the day and/or time at which MRI or treatment was administered or diagnosis was given). The time stamp may further track when the data is integrated into a unified patient database. For example, upon generating each of the first data object and the second data object, medical data processing system 200 generates a first timestamp stored in association with the first data object indicating a time of creation of the first data object. Medical data processing system 200 generates a second timestamp stored in association with the second data object indicating a time of creation of the second data object. These timestamps are then stored to the corresponding data objects and can be used to display the history of the database entries. The time stamps that track when each event occurs may further be used to generate a chronological visualization, such as the patient lineage views shown in fig. 9A-9E.
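A minimal sketch of this dual time-stamping, assuming each stored element is a simple dictionary, is shown below; the field names "event_time" and "created_at" are illustrative.

```python
# Illustrative sketch only: attaching two timestamps to a stored data element --
# the time the clinical event occurred (used for the chronological patient
# lineage view) and the time the element was created in the unified patient
# database (used to show the history of database entries).
from datetime import datetime, timezone


def timestamp_element(element, event_time_iso):
    element = dict(element)
    element["event_time"] = event_time_iso                          # when the MRI/treatment/diagnosis occurred
    element["created_at"] = datetime.now(timezone.utc).isoformat()  # when the record was integrated
    return element


mri = timestamp_element({"type": "imaging", "modality": "MRI"},
                        "2021-04-17T09:30:00+00:00")
print(mri["event_time"], mri["created_at"])
```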
Data stored in the unified patient database can be efficiently retrieved and displayed for the user. For example, the medical data processing system 200 retrieves, from the unified patient database, one or more of the attribute specifying the location of the tumor mass, the diagnostic information, and/or the treatment information. Retrieving attributes may include querying the unified patient database. In some aspects, medical data processing system 200 traverses connections between data objects to identify associated data objects. For example, medical data processing system 200 may identify a pointer from one data object corresponding to the tumor mass to another data object corresponding to a treatment of the tumor mass and retrieve treatment information therefrom.
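The following sketch illustrates, under the assumption that objects are plain dictionaries holding a list of linked object identifiers, how such connections might be traversed to collect the treatments linked to a tumor mass; the identifiers and field names are hypothetical.

```python
# Illustrative sketch only: traversing connections between data objects to
# collect the treatments linked to a tumor mass.
def linked_objects_of_type(db, start_id, wanted_type):
    """Follow the pointers stored on one object to its connected objects and
    return those of the requested type (e.g., treatments for a tumor mass)."""
    start = db[start_id]
    return [db[obj_id] for obj_id in start["links"]
            if db[obj_id]["type"] == wanted_type]


db = {
    "tumor-1": {"type": "tumor_mass", "site": "right breast 2:00",
                "links": ["treat-1", "dx-1"]},
    "treat-1": {"type": "treatment", "name": "lumpectomy", "links": ["tumor-1"]},
    "dx-1": {"type": "diagnostic_finding", "stage": "IIA", "links": ["tumor-1"]},
}
print(linked_objects_of_type(db, "tumor-1", "treatment"))
```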
The medical data processing system 200 may cause display of one or more of the attribute specifying the site of the tumor mass, the diagnostic information, and/or the treatment information for clinical decision making via a user interface (e.g., such as the GUIs depicted in figs. 8A-9E). For example, referring to fig. 8B, patient summary interface 800 displays the attribute "right breast, 2:00 position" at 802, which specifies the location of a tumor mass. GUI 800 also displays diagnostic information, such as the stage and "invasive ductal carcinoma," on the left side. Patient summary interface 800 also displays treatment information, such as tumor treatments, at 806. Similarly, in the patient lineage views of figs. 9A-9E, information including tumor mass site information, diagnostic information, treatment information, and other information may be displayed in the timeline view. Causing display may include displaying the GUI on a display component of the medical data processing system 200 itself, or transmitting instructions that may be used by an external computing device to display the GUI. The information is displayed in a user-friendly manner to facilitate clinical decisions, as the medical professional can view, in one location and in an organized manner, all information showing the patient's response over time.
The data stored in the unified patient database can also be efficiently provided to an external system, such as an EMR, in structured form. For example, medical data processing system 200 identifies data elements and data types associated with the patient from the unified patient database. Medical data processing system 200 transmits the data elements and the data types in a structured form to the external system. As described above, some data objects or data fields may be populated via integrations, but such data may initially be unstructured or semi-structured. Using the techniques described herein, users and/or machine learning may add more detail or relationships between data objects (e.g., via coordination or extraction tools). This facilitates storing the data to the unified patient database with structured information (e.g., characterizing different data elements as different data types). Such structured data can then be sent to an external system such as an EMR, if desired.
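By way of illustration, the sketch below serializes identified data elements and their data types into a simple JSON payload for transmission to an external system. The payload layout is an assumption for the example and does not correspond to any specific interchange standard.

```python
# Illustrative sketch only: serializing data elements identified in the unified
# patient database into a structured payload for an external system (e.g., EMR).
import json


def build_structured_export(patient_id, elements):
    return json.dumps({
        "patient_id": patient_id,
        "elements": [
            {"data_type": e["type"],          # characterizes the element
             "value": e["value"],
             "event_time": e.get("event_time")}
            for e in elements
        ],
    }, indent=2)


payload = build_structured_export("patient-0042", [
    {"type": "biomarker", "value": "HER2 negative", "event_time": "2021-02-10"},
    {"type": "treatment", "value": "lumpectomy", "event_time": "2021-05-01"},
])
print(payload)
```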
D. Techniques for displaying patient data via a patient lineage interface
Fig. 17 illustrates a method 1700 of displaying patient data via a patient lineage interface view, such as that depicted in figs. 9A-9E, to facilitate navigation and presentation. The patient lineage view may show how the patient responded to treatment over time, with different types of data organized in rows, which helps the clinician better understand and manage the treatment of the patient. Method 1700 may be performed by, for example, medical data processing system 200 of fig. 2.
In step 1702, medical data processing system 200 receives data identifying a patient via a graphical user interface. For example, the user may enter a patient name or identifier in the portal, or select a patient identifier from a displayed menu.
In step 1704, medical data processing system 200 receives a user input selecting a mode of a set of selectable modes of a graphical user interface. For example, as shown in FIGS. 8A-10, various modes or interface views may be used, including a patient history mode, a summary view, and a report view. For example, the user selects a patient history view. The medical data processing system 200 detects that the user clicks on the patient history tab shown near the top of the interface 800 depicted in fig. 8B, causing the view to transition to the patient history view.
In step 1706, based on the identification data and the user input, medical data processing system 200 retrieves a set of medical data associated with the patient from the unified patient database. The set of medical data corresponds to the selected mode. For example, the set of medical data corresponds to the patient history mode. The medical data processing system 200 may query the unified patient database to identify records for the patient (e.g., by identifying the patient data record as shown in fig. 12).
Retrieving the set of medical data may include querying the unified patient database to identify a patient record for the patient. The patient record may include a patient object, such as root data object 1201 depicted in fig. 12. Based on the patient object, objects connected to the patient object are identified. Some or all of these objects may then be retrieved for display. For example, as shown in fig. 12, the patient root data object 1201 is connected to a number of different data objects, each of which may store various data elements. The patient lineage view can be configured to display some of this information, which the system can identify based on preconfigured data object types and/or elements. A list of the data object types to be displayed in the patient history may be stored in a configuration file. Based on the object types in the list, instances of these object types may be retrieved, for example, as long as they have timestamps within a specified time window. For example, as shown in figs. 9A-9E, some data is not within the currently displayed time window and is not retrieved for display at a given time. Other data, such as benign tumors, may be stored in the unified patient database but not displayed in the patient history UI.
The retrieved data may include the various data objects and elements described herein, e.g., with reference to fig. 12. For example, the set of medical data may correspond to (e.g., be retrieved from or associated with) treatment objects in the unified patient database that store treatment types, dates, and responses. The set of medical data may alternatively or additionally correspond to diagnostic finding objects in the unified patient database that store biomarker data, staging data, and/or tumor size data. The set of medical data may alternatively or additionally correspond to a history object in the unified patient database that stores surgical, allergy, and/or family medical history.
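A minimal sketch of this selection step is shown below, assuming a configured list of displayable object types and objects carrying ISO-formatted event timestamps; the type names and dates are illustrative.

```python
# Illustrative sketch only: selecting which connected objects to retrieve for
# the patient lineage view, based on a configured list of displayable object
# types and the currently displayed time window.
DISPLAYABLE_TYPES = ["treatment", "diagnostic_finding", "imaging", "biomarker"]


def select_for_lineage_view(connected_objects, window_start, window_end):
    return [obj for obj in connected_objects
            if obj["type"] in DISPLAYABLE_TYPES
            and window_start <= obj["event_time"] <= window_end]


objects = [
    {"type": "imaging", "label": "MRI", "event_time": "2021-04-17"},
    {"type": "benign_tumor", "label": "fibroadenoma", "event_time": "2021-04-20"},
    {"type": "treatment", "label": "lumpectomy", "event_time": "2022-01-05"},
]
# The benign tumor stays in the database but is not shown; the 2022 treatment
# falls outside the displayed window and is not retrieved at this time.
print(select_for_lineage_view(objects, "2021-01-01", "2021-12-31"))
```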
In step 1708, medical data processing system 200 displays a set of user-selectable objects in a timeline via the graphical user interface, the objects organized in rows, each row corresponding to a different category of a plurality of categories, the categories including pathology, diagnosis, and treatment. This may correspond to the patient history views shown in figs. 9A-9E. Medical data processing system 200 may retrieve this information from the unified patient database and use it to display the patient lineage view. For example, based on the object types defined above with reference to fig. 12, a corresponding row in the patient lineage interface is identified for a particular object. Based on the timestamp associated with the object, the object is placed in that row at the corresponding time on the timeline of the patient lineage view. As a specific example, a biomarker object is placed in a biomarker row at a particular time. This is repeated for each element of the retrieved medical data, which may result in a GUI view such as that depicted in figs. 9A-9E.
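The sketch below illustrates one way retrieved objects could be organized into rows and ordered along the timeline by event timestamp; the mapping from object type to row label is a hypothetical example.

```python
# Illustrative sketch only: organizing retrieved objects into the rows of a
# timeline view. The mapping from object type to row label is hypothetical.
ROW_FOR_TYPE = {
    "pathology_finding": "Pathology",
    "diagnostic_finding": "Diagnosis",
    "biomarker": "Biomarkers",
    "treatment": "Treatment",
}


def layout_timeline(objects):
    rows = {label: [] for label in ROW_FOR_TYPE.values()}
    for obj in objects:
        row = ROW_FOR_TYPE.get(obj["type"])
        if row is not None:
            # The timestamp of the event determines the object's horizontal
            # position on the timeline within its row.
            rows[row].append((obj["event_time"], obj["label"]))
    for row in rows.values():
        row.sort()
    return rows


print(layout_timeline([
    {"type": "biomarker", "label": "ER+", "event_time": "2021-02-10"},
    {"type": "treatment", "label": "lumpectomy", "event_time": "2021-05-01"},
]))
```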
The graphical user interface further includes a functional area displayed above the timeline, the functional area displaying a subset of the objects marked as important. For example, as shown in fig. 9B, there is a summary functional area 902 that highlights key events. The summary functional area may highlight key events in an easy-to-view location, and the user may use the underlying timeline view to examine the events and their order of occurrence in detail. This provides an improved user experience and may help facilitate clinical decision making by providing the user with the key events alongside a temporal view of all events.
The graphical user interface may receive user interactions that prompt the display of additional information, including reports. Reports can be viewed in detail or in simplified form. In some implementations, the user can hover over an object in the timeline (e.g., MRI 924 shown in fig. 9B), and the medical data processing system 200 retrieves and displays the corresponding report. Medical data processing system 200 may detect a user interaction with an object in the set of objects, such as MRI 924 shown in fig. 9B. Medical data processing system 200 identifies and retrieves the corresponding report from the unified patient database. For example, as shown in fig. 12, the data record may include reports 1204 linked to different data objects, such as the patient root data object 1201 and a diagnostic finding 1205. Medical data processing system 200 may traverse such connections in the unified patient database to identify the report associated with the object in the patient summary graphical user interface. Medical data processing system 200 may then display the report via the graphical user interface. This provides a convenient way to gain a thorough understanding of the different objects displayed in the patient summary view.
From the patient lineage view, the user can switch to other available views, such as a patient summary view or a report view. For example, the graphical user interface further includes elements for navigating to a second interface view, such as selectable reporting element 812 and summary element 815 depicted in FIG. 8B. Medical data processing system 200 detects a user interaction with an element for navigating to the second interface view. For example, the second interface view is a summary view, as shown in fig. 7A and 8A-8B, and displays tumor summary data. As another example, the second interface view is a report view, displaying a particular report or list of reports, for example, as depicted in fig. 10.
E. Techniques for managing and displaying multiple tumor mass data
Fig. 18 illustrates a method 1800 of displaying patient data via a side-by-side tumor mass view such as depicted in fig. 8C for navigation and presentation. Method 1800 may be performed by, for example, medical data processing system 200 of fig. 2.
In step 1802, the medical data processing system 200 stores a patient record to the unified patient database. The patient record includes a plurality of data objects, including a first primary cancer data object storing data elements corresponding to a first tumor mass of the patient and a second primary cancer data object storing data elements corresponding to a second tumor mass of the patient. For example, one data object may be stored in association with a primary cancer in the right breast, and another data object may be stored in association with another primary cancer in the right lung. As shown in fig. 13, the data schema used by medical data processing system 200 may include a plurality of data objects for a plurality of primary cancers, such as cancer 1 1302 and cancer 2 1304.
As described above, the unified patient database includes data from a plurality of sources, which may include EMR (electronic medical records) systems, PACS (picture archiving and communication systems), digital Pathology (DP) systems, LIS (laboratory information systems), RIS (radiology information systems), patient report results, wearable devices, social media websites, and so forth.
In step 1804, medical data processing system 200 presents and causes display of a graphical user interface. The graphical user interface includes a patient summary. As shown in fig. 8A-8C, the patient summary view may include information summarizing patient data in patient records in a unified patient database. The patient summary view may be displayed as described above with reference to fig. 17.
In step 1806, medical data processing system 200 detects a user interaction with an element of a graphical user interface. For example, the patient summary view displays information about a primary cancer, as well as elements for displaying more information about one or more primary cancers. As a specific example, the patient summary view shown in fig. 8B includes a box 802 with information about two primary cancers, "breast cancer" and "lung cancer," and an element 805 with which a user can interact to display more information. In some embodiments, there are elements configured to initiate display of information about multiple primary cancers. Alternatively, the graphical user interface may display a first element 805 when viewing a first primary cancer (e.g., breast cancer) and a second element 805 when viewing a second primary cancer (e.g., lung cancer). In this case, the user may click on each of the two buttons in turn.
In some embodiments, medical data processing system 200 identifies a plurality of primary cancers and displays information about each identified primary cancer. For example, medical data processing system 200 stores each tumor mass as an independent data object with structured data fields indicating the tumor mass's manifestation, as shown in figs. 12 and 13. This data schema allows the medical data processing system 200 to count the number of primary or metastatic tumors as necessary.
In response to detecting the user interaction, the medical data processing system 200 retrieves data elements from the first and second primary cancer data objects of the patient record from the unified patient database in step 1808. The medical data processing system 200 may identify one or more primary cancer data objects based on the elements interacted with at step 1806 and query a unified patient database to retrieve data elements associated with the corresponding primary cancer data objects. In some embodiments, each tumor mass is represented as a separate data object having structured data fields indicating information about the tumor mass, such as its manifestation (e.g., primary, metastasis, etc.), as shown in fig. 12. This allows the medical data processing system 200 to identify primary cancer data objects in a unified patient database.
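As a non-limiting illustration, the sketch below uses such a structured field to pick out the primary cancer data objects from a patient's tumor-mass objects; the field name "manifestation" and the example records are assumptions made for the example.

```python
# Illustrative sketch only: using a structured field on each tumor-mass data
# object to distinguish primary cancers from metastases.
def primary_cancer_objects(tumor_objects):
    return [t for t in tumor_objects if t.get("manifestation") == "primary"]


tumors = [
    {"id": "t1", "site": "right breast", "manifestation": "primary"},
    {"id": "t2", "site": "right lung", "manifestation": "primary"},
    {"id": "t3", "site": "liver", "manifestation": "metastasis", "primary": "t1"},
]
primaries = primary_cancer_objects(tumors)
print(len(primaries), "primary cancers found")  # -> 2 primary cancers found
```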
In step 1810, medical data processing system 200 presents a first modal corresponding to the first primary cancer of the patient and a second modal corresponding to the second primary cancer of the patient. Presenting a modal may include generating a graphic to overlay the current GUI (e.g., as a pop-up over the patient summary view).
In step 1812, medical data processing system 200 causes a side-by-side display of the first modal and the second modal in the graphical user interface. The side-by-side modals may include two pop-up windows overlaid on the patient summary view, as shown in fig. 8C. As shown in fig. 8C, the first modal and the second modal may provide a summary of key information about each primary cancer. The information displayed in the first modal and the second modal may include a set of time-stamped biomarkers, staging information, and metastatic site information, as shown in fig. 8C. Displaying the primary cancers side by side in summary form can help a clinician, such as an oncologist, see how multiple primary cancers are progressing simultaneously. Causing display of the modals may include displaying them on a display component of the medical data processing system 200 itself, or transmitting instructions that may be used by an external computing device to display the GUI.
F. Overview of diagnostic workflow
Figs. 19A and 19B illustrate examples of oncology workflows that may be implemented by the oncology workflow application 222. The goal of the workflow of figs. 19A and 19B is to maintain detailed planning tables of radiographic, procedural, and pathological findings for the primary tumor and its associated metastatic lesions, which can be updated throughout the course of cancer treatment and other aspects of the patient history. Primary tumors and metastatic lesions are considered target lesions. Measurement results may be captured as structured data, which allows for a more objective determination of tumor response or progression and better informs the determination of the patient's clinical status. As changes are found, they are recorded iteratively. Fig. 19A illustrates an example chart 1900 showing changes in lesion size over time for different target lesions, which can be obtained from an example workflow.
Fig. 19B shows a flowchart 1901 of an example oncology workflow that allows an oncologist to select target lesions for monitoring and response assessment. Referring to fig. 19B, in step 1902, diagnostic procedure findings, characteristics of the findings, and the procedure specialist's comments/diagnostic interpretations of the findings are recorded based on the data received from the medical data processing system 200. In step 1904, it is determined whether the finding (e.g., a lesion, etc.) indicates a primary tumor. If the lesion is neither a primary tumor (in step 1904) nor a metastasis (in step 1906), the iteration may end, and step 1902 is then repeated at a later time to record new diagnostic procedure findings, characteristics of the findings, and the procedure specialist's comments/diagnostic interpretations of the findings. If the lesion is a metastasis (in step 1906), the finding may be assigned as a metastasis in the data input interface 300 in step 1908. The assignment of the metastasis may be made in the patient summary page 311, as shown in fig. 3F, and the tumor data of the patient may be updated; the iteration may then end, and step 1902 may be repeated.
On the other hand, if the lesion is a primary tumor (in step 1904), then in step 1910 the lesion is assigned as the primary tumor via data input interface 300, as shown in operation 340 in fig. 3D. If the diagnosis is confirmed (in step 1912), the patient's tumor data may be updated. If the diagnosis is not confirmed (in step 1912), then in step 1914 the "pending diagnosis" flag 321 of fig. 3B may remain displayed as a prompt. In both cases, the iteration may end and step 1902 may be repeated. Further, from step 1910, it may be determined in step 1920 whether a biopsy has been performed. If the finding has been biopsied (in step 1920), then in step 1922 the pathology findings may be recorded as part of the patient's structured data. Step 1902 is then repeated at a later time to record new diagnostic procedure findings, characteristics of the findings, and the procedure specialist's comments/diagnostic interpretations of the findings.
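A simplified, non-limiting sketch of the assignment logic of flowchart 1901 is shown below; the input keys and returned labels are hypothetical, and the sketch omits the interface operations described above.

```python
# Illustrative sketch only: assigning a recorded finding per flowchart 1901.
def assign_finding(finding):
    """Return a hypothetical assignment for one recorded finding."""
    if finding.get("is_primary"):
        assignment = {
            "assigned_as": "primary_tumor",
            # The "pending diagnosis" flag stays set until the diagnosis is confirmed.
            "pending_diagnosis": not finding.get("diagnosis_confirmed", False),
        }
        if finding.get("biopsied"):
            # Pathology findings from the biopsy are recorded as structured data.
            assignment["pathology_recorded"] = True
        return assignment
    if finding.get("is_metastasis"):
        return {"assigned_as": "metastasis"}
    # Neither primary nor metastasis: nothing is assigned; new findings are
    # recorded in a later iteration.
    return {"assigned_as": None}


print(assign_finding({"is_primary": True, "diagnosis_confirmed": False, "biopsied": True}))
print(assign_finding({"is_metastasis": True}))
```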
Figs. 20A and 20B illustrate a flowchart 2000 of another example oncology workflow. The oncology workflow of flowchart 2000 enables oncologists (and their representatives) to longitudinally manage patients suspected of having cancer through treatment and follow-up by taking advantage of the complete context of the patient's information. Referring to fig. 20A, in step 2002, the data collection module 230 may collect medical data of a patient suspected of having cancer via the portal 220. In step 2004, the oncologist may analyze the data to determine whether the patient has cancer. If no cancer is confirmed (in steps 2006 and 2008), the oncology workflow may end. If, however, a cancer is identified, a determination is made in step 2010 as to whether the clinical findings suggest the presence of a single primary cancer.
Referring to fig. 20B, if the clinical findings suggest a single primary cancer (in step 2010), the biopsy and examination data may be analyzed in step 2012 to confirm the primary tumor. If there is no evidence of metastasis (in step 2014), then a conclusion may be drawn in step 2016 that the patient has a single primary cancer. On the other hand, if there is evidence of metastasis (in step 2014), and all metastases are associated with a known primary (in step 2018), then a conclusion can be drawn in step 2020 that there is metastasis from a single primary cancer.
If the clinical findings do not suggest a single primary cancer (in step 2010), or the metastases are not associated with a known primary (in step 2018), the biopsy and examination data may be analyzed in step 2022 to determine whether multiple primary sites are present. If the biopsy and examination data of step 2022 confirm that there is only one primary site, it may be determined in step 2020 that the metastasis is from a single primary cancer. But if the biopsy and examination data of step 2022 do not confirm that there is only one primary site, the workflow may proceed along different routes. For example, if the clinical data suggest a cancer of unknown primary (in step 2026) and the biopsy shows similar histology (in step 2028), it may be determined in step 2020 that the metastasis is from a single primary cancer. However, if the clinical data suggest a cancer of unknown primary (in step 2026) and the biopsy shows different histology (in step 2028), then the patient may be determined in step 2030 to have a cancer of unknown primary. Further, returning to step 2024, if the biopsy shows histology suggestive of two different primary sites (step 2032), and the user marks two primary cancers in step 2034 (e.g., via the assignment of a tumor to a primary cancer site, as shown in figs. 3D and 3E), then in step 2036 the patient may be determined to have two primary cancers. Certain diagnostic results (e.g., the discovery of tumor masses) may be associated with the second primary tumor, as shown in fig. 3F.
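By way of illustration only, the sketch below approximates the classification reached at the end of flowchart 2000 as a sequence of boolean checks; the parameter names and returned labels are hypothetical, and the sketch simplifies the branching described above.

```python
# Illustrative sketch only: a simplified approximation of the outcome reached
# at the end of flowchart 2000.
def classify_diagnosis(cancer_confirmed, single_primary_suggested,
                       metastases_present, metastases_match_known_primary,
                       single_primary_site_confirmed, similar_histology,
                       two_primary_sites_marked):
    if not cancer_confirmed:
        return "no cancer"
    if single_primary_suggested and not metastases_present:
        return "single primary cancer"
    if single_primary_suggested and metastases_match_known_primary:
        return "metastasis from a single primary cancer"
    # Clinical findings do not suggest a single primary, or the metastases are
    # not associated with a known primary: fall back to biopsy/examination data.
    if single_primary_site_confirmed:
        return "metastasis from a single primary cancer"
    if two_primary_sites_marked:
        return "two primary cancers"
    if similar_histology:
        return "metastasis from a single primary cancer"
    return "cancer of unknown primary"


print(classify_diagnosis(True, True, True, True, False, False, False))
# -> metastasis from a single primary cancer
```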
G. Method of processing medical data to facilitate clinical decisions
Fig. 21 illustrates a method 2100 of processing medical data to facilitate clinical decisions. Method 2100 may be performed by, for example, medical data processing system 200 of fig. 2.
In step 2102, medical data processing system 200 receives input medical data of a patient via a portal (e.g., portal 220). Patient data may originate from a variety of data sources (at one or more healthcare institutions) including, for example, EMR (electronic medical records) systems, PACS (picture archiving and communication systems), digital Pathology (DP) systems, LIS (laboratory information systems) including genomic data, RIS (radiology information systems), patient report results, wearable and/or digital technology, social media, and the like.
In some examples, the portal may provide a data input interface that includes various fields for receiving the input medical data, and may generate structured medical data based on a mapping between the fields and the data. The structured medical data may include various information related to tumor diagnosis, such as tumor location, stage, pathology information (e.g., biopsy results), diagnostic procedures, and biomarkers of the primary tumor, as well as additional tumor locations (e.g., due to metastasis of the primary tumor). The portal may display the structured data in the form of a patient summary. The portal may also organize the display of the structured data into pages, each page being associated with a particular primary tumor site, including information fields for the associated primary tumor site, and being accessible through a tab. Based on detecting user input to certain fields in a first primary tumor page (e.g., designating an additional tumor site as a new primary tumor), the portal may create an additional page for the second primary tumor and populate the fields of the newly created page based on the additional tumor site information entered into the first primary tumor page. In some examples, the portal also allows the user to select additional tumor masses found during the diagnostic procedures for the primary tumor and associate those masses with a second primary tumor to represent a metastatic condition. Based on detecting the association, the medical data processing system may transfer all diagnostic results for the additional tumor from the first primary tumor page to the page created for the second primary tumor.
In some examples, the portal also allows a user to import a document file (e.g., pathology report, doctor record, etc.) from the data source described above. The medical data extraction module may then retrieve various structured medical data from the document file. The structured medical data may be retrieved based on, for example, natural Language Processing (NLP) operations, rule-based retrieval operations, etc., on text included in the document file. The medical data extraction module also allows for manual retrieval of structured medical data from the document file via the portal. In addition to the document file, the portal may also display the retrieved medical data.
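A minimal sketch of one such rule-based extraction step is shown below: a regular expression pulls a tumor-size measurement out of free text. Real deployments would combine many rules with NLP models; the pattern and example sentence are illustrative only.

```python
# Illustrative sketch only: a rule-based extraction of a tumor-size mention.
import re

SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)


def extract_tumor_size(report_text):
    """Return the first tumor-size mention found in the text, if any."""
    match = SIZE_PATTERN.search(report_text)
    if match is None:
        return None
    value, unit = float(match.group(1)), match.group(2).lower()
    # The character offsets can be used to highlight the matched span when the
    # document is displayed in the portal's document browser.
    return {"value": value, "unit": unit, "span": match.span()}


print(extract_tumor_size("Invasive ductal carcinoma, 2.3 cm in greatest dimension."))
```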
For example, the portal may overlay the text of the file with highlighting indicia. The portal may also display a text box on the highlighted text that includes the medical data retrieved from the text. In addition, structured medical data may also be retrieved from various metadata of the document file, such as the date of the file, the category of the document file (e.g., pathology report or clinician's note), the clinician who wrote/signed the document file, and the type of procedure associated with the document file content (e.g., biopsy, imaging, or another diagnostic step). The portal may then populate the various fields of the page based on the retrieved data. Various enrichment operations can be performed on the captured data to improve its quality. One enrichment operation may include a normalization operation to normalize various values (e.g., body weight, tumor size, etc.) contained in the captured medical data into standardized units, to correct data errors, or to replace non-standard terms provided by the patient with standardized terms from various medical standards/protocols, such as the International Classification of Diseases (ICD) and the Systematized Nomenclature of Medicine (SNOMED). The enriched captured medical data may then be stored in the unified patient database as part of the structured medical data (e.g., structured tumor data) for the patient.
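The following sketch illustrates two such enrichment operations, unit normalization and term standardization, under the assumption of a small in-memory term map; production systems would instead draw on vocabularies such as ICD or SNOMED, which are not reproduced here.

```python
# Illustrative sketch only: unit normalization and term standardization.
UNIT_FACTORS_TO_MM = {"mm": 1.0, "cm": 10.0}

# Toy term map; a real system would rely on standardized vocabularies
# (e.g., ICD or SNOMED) rather than this hard-coded dictionary.
STANDARD_TERMS = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}


def normalize_size_to_mm(value, unit):
    """Convert a size measurement into millimetres."""
    return value * UNIT_FACTORS_TO_MM[unit.lower()]


def standardize_term(term):
    """Replace a non-standard or patient-provided term with a standard one."""
    return STANDARD_TERMS.get(term.lower(), term)


print(normalize_size_to_mm(2.5, "cm"))   # -> 25.0
print(standardize_term("Heart attack"))  # -> myocardial infarction
```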
In step 2104, structured medical data is generated based on the input medical data. The structured medical data is generated to support oncology workflow operations that produce a diagnostic result comprising one of: the patient does not have cancer, the patient has a primary cancer, the patient has multiple primary cancers, or the patient has a cancer with an unknown primary site. An example of an oncology workflow is depicted in figs. 19A-20B. The oncology workflow may also perform diagnostic operations based on the structured medical data. In one example, a diagnostic operation can be performed to confirm whether biopsy results are for the same primary tumor or for different tumors, and to track the size of the primary tumor to assess the tumor's response to a particular treatment. In another example, a diagnostic operation may be performed to determine whether the patient has a single primary tumor site, multiple primary tumor sites, or an unknown primary site. The results of the diagnostic operations may then be recorded and/or displayed over time in the portal as part of the patient's medical history, to enable oncologists or their representatives to longitudinally manage patients suspected of having cancer through treatment and follow-up. The diagnostic results can also be used to support other medical applications, such as a quality-of-care assessment tool for assessing the quality of care administered to the patient, a medical research tool for determining correlations between various patient information (e.g., demographic information) and the patient's tumor information (e.g., prognosis or expected survival), and the like.
In step 2106, the portal may display a history of the patient's diagnostic results over time to enable clinical decisions to be made based on the diagnostic history. For example, the portal may display a timeline representing the patient's medical history, as shown in figs. 9A-9E, which may include a history of primary tumor size, a history of other diagnostic results, and so on. This allows the clinician to make clinical decisions, for example, regarding treatment of the patient.
V. example computer System
Any of the computer systems mentioned herein may utilize any suitable number of subsystems. An example of such subsystems in a computer system 2200 is shown in fig. 22. In some embodiments, a computer system comprises a single computer device, wherein the subsystems may be components of the computer device. In other embodiments, a computer system may include multiple computer devices, each being a subsystem with internal components. Computer systems may include desktop and portable computers, tablet computers, mobile phones, and other mobile devices. In some implementations, a cloud infrastructure (e.g., Amazon Web Services), a Graphics Processing Unit (GPU), and the like may be used to implement the disclosed techniques.
The subsystems shown in fig. 22 are interconnected via a system bus 75. Additional subsystems such as a printer 74, a keyboard 78, a storage device 79, a monitor 76, and other components coupled to a display adapter 82 are shown. Peripheral devices and input/output (I/O) devices coupled to I/O controller 71 may be connected to the computer system by any number of means known in the art, such as an input/output (I/O) port 77 (e.g., USB). For example, I/O port 77 or external interface 81 (e.g., Ethernet, Wi-Fi, etc.) may be used to connect the computer system to a wide area network such as the Internet, a mouse input device, or a scanner. The interconnection through system bus 75 allows central processor 73 to communicate with each subsystem and to control the execution of a plurality of instructions from system memory 72 or storage device 79 (e.g., a fixed disk, such as a hard drive, or an optical disk), as well as the exchange of information between subsystems. The system memory 72 and/or the storage device 79 may comprise a computer-readable medium. Another subsystem is a data acquisition device 85, such as a camera, microphone, accelerometer, and the like. Any of the data mentioned herein may be output from one component to another and may be output to a user.
The computer system may include multiple identical components or subsystems, connected together, for example, through external interfaces 81 or through internal interfaces. In some embodiments, the computer systems, subsystems, or devices may communicate over a network. In this case, one computer may be regarded as a client and another computer may be regarded as a server, wherein each computer may be regarded as a part of the same computer system. The client and server may each include multiple systems, subsystems, or components.
Aspects of the embodiments may be implemented in a modular or integrated manner, in the form of control logic, using hardware (e.g., an application specific integrated circuit or field programmable gate array) and/or using computer software with a general programmable processor. As used herein, a processor includes a single-core processor, a multi-core processor on the same integrated chip, or multiple processing units on a single circuit board or networked. Based on the disclosure and teachings provided herein, one of ordinary skill in the art will know and appreciate other ways and/or methods of implementing embodiments of the invention using hardware and combinations of hardware and software.
Any of the software components or functions described in this application may be implemented as software code executed by a processor using any suitable computer language, such as, for example, Java, C, C++, C#, Objective-C, Swift, or a scripting language such as Perl or Python, using, for example, conventional or object-oriented techniques. The software code may be stored as a series of instructions or commands on a computer-readable medium for storage and/or transmission. Suitable non-transitory computer-readable media may include Random Access Memory (RAM), Read-Only Memory (ROM), magnetic media such as a hard drive or floppy disk, optical media such as a Compact Disc (CD) or DVD (Digital Versatile Disc), flash memory, and the like. The computer-readable medium may be any combination of such storage or transmission devices.
Such programs may also be encoded and transmitted using carrier signals adapted for transmission via a wired, optical, and/or wireless network conforming to various protocols including the internet. As such, a computer readable medium may be created using a data signal encoded with such a program. The computer readable medium encoded with the program code may be packaged with a compatible device or provided separately from other devices (e.g., downloaded over the internet). Any such computer-readable medium may reside on or within a single computer product (e.g., a hard drive, CD, or entire computer system), and may reside on or within different computer products within a system or network. The computer system may include a monitor, printer, or other suitable display for providing any of the results mentioned herein to the user.
Any of the methods described herein may be performed in whole or in part by a computer system comprising one or more processors, which may be configured to perform steps. Thus, embodiments may be directed to a computer system configured to perform the steps of any of the methods described herein, possibly with different components performing the corresponding steps or groups of steps. Although the steps of the methods herein are numbered, the steps may be performed simultaneously or in a different order. Furthermore, part of the steps may be used together with part of the steps in other methods. In addition, all or part of the steps may be optional. In addition, any steps of any method may be performed using modules, units, circuits, or other means for performing the steps.
The particular details of the particular embodiments may be combined in any suitable manner without departing from the spirit and scope of the embodiments of the invention. However, other embodiments of the invention may be directed to specific embodiments relating to each individual aspect, or specific combinations of these individual aspects.
The foregoing description of the exemplary embodiments of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form described, and many modifications and variations are possible in light of the above teaching.
References to "a," "an," or "the" are intended to mean "one or more" unless specifically indicated to the contrary. Unless specifically indicated to the contrary, the use of "or" is intended to mean "comprising or" rather than "excluding or". References to "a first" component do not necessarily require that a second component be provided. Furthermore, reference to a "first" or "second" component does not limit the referenced component to a particular location unless explicitly stated otherwise.
All patents, patent applications, publications, and descriptions mentioned herein are incorporated by reference in their entirety for all purposes. None is admitted to be prior art.

Claims (47)

1. A method for managing medical data, comprising, by a server computer:
creating a patient record for a patient in a unified patient database, the patient record comprising an identifier of the patient and one or more data objects related to medical data associated with the patient, the unified patient database comprising data from a plurality of sources;
retrieving medical records for the patient from an external database;
receiving, via a Graphical User Interface (GUI), an identification of a primary cancer associated with the medical record;
In response to receiving the identification of the primary cancer, creating a primary cancer object in the patient record, the primary cancer object having a field including the primary cancer;
storing the medical record linked to the primary cancer object in the patient record in the unified patient database;
receiving medical data for the patient via user input to the GUI;
determining that the medical data for the patient is associated with the primary cancer; and
the medical data for the patient linked to the primary cancer object is stored in the patient record in the unified patient database.
2. The method according to claim 1, wherein:
the medical record for the patient is in a first format, the first format including a set of data elements related to a corresponding data type; and
receiving the identification of the primary cancer includes:
identifying the primary cancer by analyzing the data elements and the data types;
displaying the GUI including prompting a user to confirm a primary cancer identification; and
user confirmation of the primary cancer identification is received via the GUI.
3. The method of claim 2, wherein the medical record is a first medical record, the method further comprising:
receiving a second medical record for the patient, wherein the second medical record is in a second format comprising unstructured data;
identifying, from the unstructured data, a data element associated with the primary cancer;
analyzing the unstructured data to assign the data elements to data types; and
based on the assigned data type and identifying that the data element is associated with the primary cancer, the data element linked to the primary cancer object is stored in the patient record in the unified patient database.
4. The method of claim 1, wherein receiving the identification of the primary cancer associated with the medical record comprises:
displaying, via the GUI, the medical records and a menu configured to receive user input selecting one or more primary cancers; and
user input is received via the GUI selecting the primary cancer.
5. The method as in claim 4, further comprising:
storing the medical record in the patient record; and
Parsing the medical record to determine that the patient record is not associated with a particular primary cancer,
wherein displaying the medical record and the menu is responsive to determining that the patient record is unassociated with a particular primary cancer.
6. The method according to claim 1, wherein:
the medical record includes unstructured data; and
the method further comprises:
applying a first machine learning model to identify text in the medical record; and
a second machine learning model is applied to associate a portion of the identified text with the corresponding field,
wherein storing the medical record further comprises storing the identified text in association with the field to the unified patient database.
7. The method according to claim 6, wherein:
the first machine learning model includes an Optical Character Recognition (OCR) model; and
the second machine learning model includes a Natural Language Processing (NLP) model.
8. The method as recited in claim 1, further comprising:
retrieving at least a subset of the medical data for the patient from the unified patient database; and
the display of the at least a subset of the medical data for the patient is caused via a user interface for clinical decision making.
9. The method of claim 1, wherein the external database corresponds to at least one of: EMR (electronic medical records) systems, PACS (picture archiving and communication systems), digital Pathology (DP) systems, LIS (laboratory information systems) and RIS (radiology information systems).
10. The method of claim 1, wherein the medical record is retrieved based on the identifier of the patient.
11. A method for managing a unified patient database, the method comprising the following by a server computer:
storing patient records comprising a network of interconnected data objects to the unified patient database, the unified patient database comprising data from a plurality of sources;
storing a first data object corresponding to a data element of a tumor mass for a primary cancer to the patient record in the unified patient database, the first data object including an attribute specifying a location of the tumor mass;
receiving diagnostic information corresponding to the primary cancer from a diagnostic computer;
analyzing the diagnostic information to identify a correlation between the diagnostic information and the tumor mass;
based on identifying the correlation between the diagnostic information and the tumor mass, storing a second data object corresponding to the diagnostic information to the unified patient database, the second data object being connected to the first data object via the network of interconnected data objects;
Receiving treatment information corresponding to the primary cancer from the diagnostic computer;
analyzing the treatment information to identify a correlation between the treatment information and the tumor mass; and
based on identifying the correlation between the therapy information and the tumor mass, a third data object corresponding to the therapy information is stored to the unified patient database, the third data object being connected to the first data object via the network of interconnected data objects.
12. The method as recited in claim 11, further comprising:
retrieving one or more of the attribute specifying the location of the tumor mass, the diagnostic information, and/or the treatment information from the unified patient database; and
display of one or more of the attribute, the diagnostic information, and/or the treatment information specifying the location of the tumor mass is caused via a user interface for clinical decision making.
13. The method as recited in claim 11, further comprising:
receiving patient history data from the diagnostic computer;
analyzing the patient history data to identify correlations between the patient history data and the tumor mass; and
Based on identifying the correlation between the patient history data and the tumor mass, a fourth data object corresponding to the patient history data is stored to the unified patient database, the fourth data object being connected to the first data object via the network of interconnected data objects.
14. The method as recited in claim 11, further comprising:
receiving tumor mass information from the diagnostic computer corresponding to a tumor mass at a metastatic site of the primary cancer;
analyzing the tumor mass information to identify a correlation between the diagnostic information and the tumor mass; and
based on receiving the tumor mass information and identifying the first data object, a fifth data object corresponding to the tumor mass information, connected to the first data object via the network of interconnected data objects, is stored to the unified patient database.
15. The method of claim 11, wherein the second data object comprises one or more attributes selected from the group consisting of: stage, biomarker, and tumor size of the primary cancer.
16. The method as recited in claim 11, further comprising:
Identifying data elements and data types associated with the patient from the unified patient database; and
the data elements and the data types are transmitted in a structured form to an external system.
17. The method as recited in claim 11, further comprising: upon generating each of the first data object and the second data object, a first timestamp stored in association with the first data object indicating a time of creation of the first data object and a second timestamp stored in association with the second data object indicating a time of creation of the second data object are generated.
18. The method of claim 11, further comprising updating the unified patient database by:
importing medical data from an external database;
parsing the imported medical data to identify specific data elements associated with the patient and the primary cancer; and
the particular data element is stored in association with the first data object to a sixth data object.
19. The method of claim 18, wherein the external database corresponds to at least one of: EMR (electronic medical records) systems, PACS (picture archiving and communication systems), digital Pathology (DP) systems, LIS (laboratory information systems) and RIS (radiology information systems).
20. A method of processing medical data to facilitate clinical decisions, the method comprising, by a server computer:
receiving, via a graphical user interface, identification data identifying a patient;
receiving user input selecting a mode of a set of selectable modes of the graphical user interface;
retrieving a set of medical data associated with the patient from a unified patient database based on the identification data and the user input, the set of medical data corresponding to the selected mode; and
a user-selectable set of objects is displayed in a timeline via the graphical user interface, the objects organized in rows, each row corresponding to a different one of a plurality of categories including pathology, diagnosis, and treatment.
21. The method of claim 20, wherein retrieving the set of medical data comprises:
querying a unified patient database to identify patient records for the patient from the unified patient database, the patient records including patient objects;
identifying each of a set of objects connected to the patient object; and
a predetermined subset of the identified set of objects is retrieved for display.
22. The method of claim 20, wherein the set of medical data corresponds to one or more of:
a treatment object in the unified patient database, the treatment object storing treatment type, date, and response to treatment;
a diagnostic discovery object in the unified patient database, the diagnostic discovery object storing biomarker data, staging data, and/or tumor size data; and
a history object in the unified patient database, the history object storing a surgical history, allergy and/or family medical history.
23. The method as recited in claim 20, further comprising:
detecting a user interaction with an object of the set of objects;
identifying and retrieving corresponding reports from the unified patient database; and
the report is displayed via the graphical user interface.
24. The method of claim 20, the graphical user interface further comprising:
a functional area is displayed above the timeline, the functional area displaying a subset of the objects marked as important.
25. The method of claim 20, the graphical user interface further comprising an element for navigating to a second interface view, the method further comprising:
Detecting a user interaction with the element for navigating to the second interface view; and
and switching to the second interface view, wherein the second interface view displays tumor summary data.
26. A method for managing patient data, the method comprising:
storing a patient record to a unified patient database, the unified patient database comprising data from a plurality of sources, the patient record comprising a plurality of data objects including a first primary cancer data object storing data elements corresponding to a first tumor mass of a patient and a second primary cancer data object storing data elements corresponding to a second tumor mass of the patient;
presenting and causing display of a graphical user interface comprising a patient summary comprising information summarizing patient data in the patient records in the unified patient database;
detecting a user interaction with an element of the graphical user interface;
in response to detecting the user interaction, retrieving the data elements from the first and second primary cancer data objects of the patient record from the unified patient database; and
Presenting:
a first modal corresponding to a first primary cancer of the patient; and
a second modal corresponding to a second primary cancer of the patient; and
causing a side-by-side display of the first modal and the second modal in the graphical user interface.
27. The method according to claim 26, wherein:
each of the modals displays a set of time-stamped biomarkers, staging information, and metastatic site information.
28. The method of claim 26, wherein the plurality of sources comprises two or more of: EMR (electronic medical records) systems, PACS (picture archiving and communication systems), digital Pathology (DP) systems, LIS (laboratory information systems), RIS (radiology information systems), patient report results, wearable devices, or social media websites.
29. A method of processing medical data to facilitate clinical decisions, the method comprising:
receiving, via a portal, input medical data for a patient associated with a plurality of data categories associated with oncology workflow operations;
generating structured medical data of the patient based on the input medical data, the structured medical data being generated to support the oncology workflow operation to generate a diagnostic result comprising one of: patients not suffering from cancer, patients suffering from primary cancer, patients suffering from multiple primary cancers, or patients suffering from cancer whose primary site is unknown; and
a history of the structured medical data and the diagnostic results of the patient over time is displayed via the portal to enable a clinical decision to be made based on the history of the diagnostic results.
30. The method of claim 29, wherein the portal includes a data input interface for receiving the input medical data and mapping the input medical data to fields to generate the structured medical data; and
wherein the data input interface organizes the structured medical data into one or more pages, each of the one or more pages being associated with a particular primary tumor location.
31. The method of claim 30, further comprising:
receiving, via the data input interface, a first indication that a first subset of the medical data entered into a first page of the data input interface associated with a first primary tumor site belongs to a second primary tumor site; and
based on the first indication:
creating a second page for the second primary tumor site; and
the second page is populated with the first subset of the medical data.
32. The method of claim 31, further comprising:
receiving, via the data input interface, a second indication that a second subset of the medical data entered into the first page relates to metastasis of the second primary tumor site; and
based on the second indication, the second page is populated with the second subset of the medical data.
33. The method of claim 29, further comprising:
importing a document file from a unified patient database; and
the input medical data is extracted from the document file based on at least one of a Natural Language Processing (NLP) operation or a rule-based extraction operation of text included in the document file.
34. The method of claim 33, further comprising:
displaying the document file in a document browser of the portal; and
one or more portions of the document file from which the input medical data is extracted are highlighted.
35. The method as recited in claim 34, further comprising:
displaying one or more data fields alongside the document browser; and
Displaying an indication of a subset of the one or more data fields to be populated with the input medical data to be extracted from the highlighted one or more portions of the document file to indicate a correspondence between the subset of the one or more data fields and the highlighted one or more portions of the document file.
36. The method of claim 35, wherein the indication comprises emphasizing the subset of the one or more data fields and surrounding the highlighted one or more portions of the document file with a highlighting marker.
37. The method of claim 36, wherein the indication is displayed based on receiving input from a user via the portal.
38. The method of claim 35, wherein the highlighted one or more portions are determined based on detecting input from a user via the portal.
39. The method of claim 35, wherein the highlighted one or more portions are determined based on at least one of the Natural Language Processing (NLP) operation or the rule-based extraction operation.
40. The method of claim 33, further comprising:
determining one or more medical data categories of the extracted input medical data;
determining a mapping between one or more fields in the structured medical data and the one or more medical data categories based on a Structured Data List (SDL); and
filling the one or more fields with the extracted input medical data based on the mapping.
41. The method of claim 40, wherein the mapping comprises mapping the input medical data to a normalized value.
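The sketch below approximates the SDL-driven mapping and normalization of claims 40 and 41, assuming the structured data list can be represented as a lookup from medical data category to target field together with a table of standardized values; the category names, field names, and synonym table are hypothetical.

```python
from typing import Dict

# Hypothetical structured data list (SDL): medical data category -> target field.
SDL: Dict[str, str] = {
    "estrogen receptor status": "er_status",
    "tumor dimension": "tumor_size_cm",
}

# Hypothetical normalization table: raw extracted values -> standardized values.
NORMALIZED: Dict[str, str] = {
    "pos": "positive",
    "positive": "positive",
    "neg": "negative",
    "negative": "negative",
}


def fill_fields(extracted: Dict[str, str]) -> Dict[str, str]:
    """Map each extracted (category, value) pair onto a structured field,
    normalizing the value where a standardized form is known."""
    structured: Dict[str, str] = {}
    for category, raw_value in extracted.items():
        field_name = SDL.get(category)
        if field_name is None:
            continue  # category not covered by the SDL; leave it out
        structured[field_name] = NORMALIZED.get(raw_value.lower(), raw_value)
    return structured


print(fill_fields({"estrogen receptor status": "Pos", "tumor dimension": "2.3"}))
# -> {'er_status': 'positive', 'tumor_size_cm': '2.3'}
```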
42. The method of claim 29, wherein the input medical data is received from one or more sources including at least one of: EMR (electronic medical record) systems, PACS (picture archiving and communication systems), digital pathology (DP) systems, LIS (laboratory information systems), RIS (radiology information systems), patient-reported results, wearable devices, or social media websites.
43. A computer product comprising a computer-readable medium storing a plurality of instructions for controlling a computer system to perform the operations of any of the above methods.
44. A system, comprising:
the computer product of claim 43; and
one or more processors configured to execute instructions stored on the computer-readable medium.
45. A system comprising means for performing any of the above methods.
46. A system configured to perform any of the above methods.
47. A system comprising modules that individually perform the steps of any of the above methods.
CN202280010017.8A 2021-01-15 2022-01-18 Oncology workflow for clinical decision support Pending CN116830207A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/138,275 2021-01-15
US202163256476P 2021-10-15 2021-10-15
US63/256,476 2021-10-15
PCT/US2022/012814 WO2022155607A1 (en) 2021-01-15 2022-01-18 Oncology workflow for clinical decision support

Publications (1)

Publication Number Publication Date
CN116830207A (en) 2023-09-29

Family

ID=88141529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280010017.8A Pending CN116830207A (en) 2021-01-15 2022-01-18 Oncology workflow for clinical decision support

Country Status (1)

Country Link
CN (1) CN116830207A (en)

Similar Documents

Publication Publication Date Title
US11133094B2 (en) Systems and methods for visualizing data
AU2020202337B2 (en) Characterizing states of subject
CN108475538B (en) Structured discovery objects for integrating third party applications in an image interpretation workflow
US8671118B2 (en) Apparatus, method and program for assisting medical report creation and providing medical information
CN116344071A (en) Informatics platform for integrating clinical care
US20070237378A1 (en) Multi-input reporting and editing tool
US20240021280A1 (en) Oncology workflow for clinical decision support
JP6826039B2 (en) Communication system for dynamic checklists to support radiation reporting
US20130339051A1 (en) System and method for generating textual report content
CN102971763A (en) Medical care support system and method of supporting medical care
US11062448B2 (en) Machine learning data generation support apparatus, operation method of machine learning data generation support apparatus, and machine learning data generation support program
JP2017191457A (en) Report creation apparatus and control method thereof
US10734102B2 (en) Apparatus, method, system, and program for creating and displaying medical reports
US20110033093A1 (en) System and method for the graphical presentation of the content of radiologic image study reports
US20230051982A1 (en) Methods and systems for longitudinal patient information presentation
CN116830207A (en) Oncology workflow for clinical decision support
Lazic et al. Information extraction from clinical records: An example for breast cancer
US20230048252A1 (en) Methods and systems for treatment guideline display
WO2007143084A2 (en) Multi-input reporting and editing tool
Halling-Brown et al. Development of a national electronic interval cancer review for breast screening
Wu NLM tele-education system for radiology residents interpreting mammography

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination