CN111161881A - Method and device for identifying disease co-occurrence relationship and storage medium - Google Patents

Info

Publication number
CN111161881A
Authority
CN
China
Prior art keywords
disease
disease association
clinical
association model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910509243.0A
Other languages
Chinese (zh)
Inventor
郎超
王少博
刘水清
梁玮
李文琪
杜辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Yiji Cloud Medical Data Research Institute Co ltd
Original Assignee
Nanjing Yiji Cloud Medical Data Research Institute Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Yiji Cloud Medical Data Research Institute Co ltd filed Critical Nanjing Yiji Cloud Medical Data Research Institute Co ltd
Priority to CN201910509243.0A priority Critical patent/CN111161881A/en
Publication of CN111161881A publication Critical patent/CN111161881A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/20 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention relates to the technical field of medical data mining and provides a method, a device, and a storage medium for identifying disease co-occurrence relationships. The method comprises: establishing a standardized clinical research database from clinical medical data; obtaining a disease association model from the standardized clinical research database to obtain disease association relationships; and correcting the disease association relationships in the model that do not meet preset conditions to obtain an optimized disease association model. The clinical medical data come from a large volume of real data from medical institutions and conform to the conventions of clinical research, which makes them easy to connect with clinical research; optimizing the disease association model ensures that the resulting model better reflects clinical significance.

Description

Method and device for identifying disease co-occurrence relationship and storage medium
Technical Field
The invention belongs to the technical field of medical data mining, and particularly relates to a method and a device for identifying disease co-occurrence relationship and a storage medium.
Background
With the spread of electronic medical records, more and more clinical records are stored digitally and become data sources that a computer can process directly. As big data technology develops, computational methods are increasingly used to analyze large volumes of health care data from patients and populations in order to extract valuable implicit information, which supports clinical researchers, clinicians, administrators, and health policy makers and ultimately benefits patients. In particular, with the rapid development of "Internet Plus", medical big data not only involve many data types and complex relationships but also grow explosively, and they are difficult to present effectively with common data visualization methods, so the visualization of medical data faces major challenges.
Comorbidity refers to any additional clinical entity that may occur during the clinical course of a patient who has the index disease under study. In medicine, comorbidity describes the effect of all other diseases an individual patient may have besides the primary disease of interest. Some diseases tend to occur together, so finding associations between diseases helps clinicians and medical researchers explore and identify potential modifiers or new risk factors associated with a particular condition more effectively, which improves the quality and speed of research. For example, cancer is a common major disease with many subtypes, and cancer comorbidity has already been studied systematically. The presence and severity of comorbidities may alter diagnostic and therapeutic recommendations, so research on comorbidity is of great interest.
However, although current comorbidity research offers some methods for revealing the relationships between diseases, these methods do not follow the conventions of clinical research in medicine, so they are difficult to connect with clinical research, or they do not reflect the relationships between diseases well.
Disclosure of Invention
In view of this, the embodiment of the present invention provides a method for identifying a disease co-occurrence relationship, so as to solve the technical problem in the prior art that the disease co-occurrence relationship cannot be revealed in a medically approved manner.
A first aspect of an embodiment of the present invention provides a method for identifying a disease co-occurrence relationship, including:
establishing a standardized clinical research database according to clinical medical data;
acquiring a disease association model according to the standardized clinical research database to obtain a disease association relation;
and correcting the disease association relation which does not meet the preset conditions in the disease association model to obtain an optimized disease association model.
A second aspect of the embodiments of the present invention provides an apparatus for identifying a disease co-occurrence relationship, including:
the database module is used for establishing a standardized clinical research database according to clinical medical data;
the disease association model acquisition module is used for acquiring a disease association model according to the standardized clinical research database so as to acquire a disease association relation;
and the disease association model optimization module is used for correcting the disease association relation which does not accord with the preset conditions in the disease association model to obtain an optimized disease association model.
A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method for identifying a disease co-occurrence relationship when executing the computer program.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above-described method for identifying a disease co-occurrence relationship.
Compared with the prior art, the embodiments of the invention have the following beneficial effects: the clinical medical data of patients are derived from a large volume of real data from medical institutions, which helps to explore the common treatment patterns of a disease efficiently and objectively; the clinical medical data conform to the conventions of clinical research and are easy to connect with clinical research; the disease co-occurrence relationships in the disease association model are objective, real, and accurate; and optimizing the disease association model ensures that it better reflects clinical significance.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a first schematic flow chart of an implementation of the method for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of establishing a standardized clinical research database in the method for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of obtaining a disease association relationship in the method for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 4 is a second schematic flow chart of an implementation of the method for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating an example of adjusting chord graph parameters in the method for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating an example of a chord graph with diseases of interest defined in the method for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating an example of a chord graph with diseases not of interest defined in the method for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 8 is a first schematic diagram of a device for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of the database module in the device for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of the disease association model acquisition module in the device for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 11 is a second schematic diagram of a device for identifying a disease co-occurrence relationship according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Referring to fig. 1, the present embodiment provides a method for identifying a disease co-occurrence relationship, including:
Step S10: establishing a standardized clinical research database according to clinical medical data.
With the spread of medical informatization and digitization, health institutions such as large medical institutions generate a large number of electronic medical records containing patient demographic information, diagnoses, medical history, and a large number of examination results and other clinical information. A great deal of important information is hidden in this rapidly growing body of clinical medical data, which is also an important source of clinical research evidence; used properly, it can be of great value for clinical decision-making and for improving the level of treatment.
Referring to fig. 2, in one embodiment, step S10 includes:
Step S101: acquiring clinical medical data of patients from a clinical research multimodal information database.
In this embodiment, the clinical medical data may be derived from a large clinical research multimodal information database, such as the big data of a medical center or the data of an intelligent database platform. Such data are highly accurate and valid, and they help to explore the common treatment patterns of a disease objectively. Because clinical medical data usually contain sensitive information, the data acquired from the large clinical research multimodal information database must be desensitized to prevent that information from leaking, and natural language processing technology is used to acquire the clinical medical data of patients efficiently. The clinical medical data extracted from the large clinical research multimodal information database are then processed to construct the standardized clinical research database.
Step S102: performing structured integration on the clinical medical data according to a unique patient identifier to obtain structured clinical data.
When a standardized clinical research database is established, a single data source with little data is prone to bias, incompleteness, and noise, so a large amount of related data needs to be integrated; systematically integrating and comparing multi-dimensional information allows the analysis results to be understood in depth. The clinical medical data of patients acquired from a large clinical research multimodal information database are scattered, and to facilitate subsequent analysis and processing the data need to be structured so that they can be classified. For example, the scattered patient information may be integrated through a unique patient identifier, which may be chosen according to actual needs, such as an identification number of the patient, a hospital identification number, an electronic medical record, and the like. Because patient identifiers are unique, the scattered information corresponding to the same patient identifier can be linked, which achieves the structured integration of the clinical medical data.
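The integration step can be sketched in Python as below. This is only an illustration under stated assumptions: the record fields (`patient_id`, `date`, `diagnosis`, `medication`) and the function name `integrate_by_patient` are hypothetical and do not come from the source, which does not specify a record layout.

```python
from collections import defaultdict


def integrate_by_patient(records):
    """Group scattered records under their unique patient identifier."""
    patients = defaultdict(lambda: {"diagnoses": [], "medications": []})
    for rec in records:
        entry = patients[rec["patient_id"]]
        # Each record may carry a diagnosis, a medication, or both.
        if "diagnosis" in rec:
            entry["diagnoses"].append((rec["date"], rec["diagnosis"]))
        if "medication" in rec:
            entry["medications"].append((rec["date"], rec["medication"]))
    return dict(patients)


# Hypothetical scattered records from different source systems.
records = [
    {"patient_id": "P001", "date": "2015-03-01", "diagnosis": "I10"},
    {"patient_id": "P002", "date": "2016-07-12", "diagnosis": "E11"},
    {"patient_id": "P001", "date": "2017-01-20", "medication": "metformin"},
]
structured = integrate_by_patient(records)
```

Keying everything on the identifier means that records arriving in any order, and from any source, end up in the same per-patient entry.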
Step S103: performing semantic normalization on the structured clinical data to obtain complete patient information, so as to establish the standardized clinical research database.
Each medical institution extends the International Classification of Diseases (ICD) codes according to its own requirements, so the medical record codes of different institutions are inconsistent, their medical data are isolated from one another, and the data cannot be directly shared or mutually recognized. In this embodiment, the ICD-10 standardized vocabulary is used to perform semantic normalization on the medical data, so that the medical data of each medical institution are integrated and relatively complete information about a patient in the medical institution is obtained. This information includes what is recorded in the electronic medical record, such as the patient's identity information, hospital identification number, gender, age, race, diagnoses, medication, pulse, and the like. The information may of course also include other items and is not limited to those listed. A standardized clinical research database can be constructed by integrating the structured medical data of all patients.
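One simple way to picture the normalization is sketched below. It assumes, purely for illustration, that an institution's extension only appends characters to a standard ICD-10 code, so that the three-character category (a letter followed by two digits) can be recovered by truncation; real mappings would rely on the ICD-10 vocabulary itself. The function name `normalize_code` and the sample codes are hypothetical.

```python
import re

# Assumed shape of an ICD-10 category: one letter followed by two digits.
_CATEGORY = re.compile(r"^([A-Z][0-9]{2})")


def normalize_code(local_code):
    """Map an institution-specific extended code to its ICD-10 category.

    Returns None when the code does not start with a recognizable category.
    """
    match = _CATEGORY.match(local_code.strip().upper())
    return match.group(1) if match else None


# Two institutions extend the same category differently.
codes = ["E11.900", "e11.x01", "I10.x00", "unknown"]
normalized = [normalize_code(c) for c in codes]
```

Codes that cannot be mapped are returned as `None` rather than guessed, so they can be reviewed separately.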
Referring to FIG. 1, step S20: obtaining a disease association model from the standardized clinical research database to obtain disease association relationships.
To obtain the co-occurrence relationships between diseases, a disease association model needs to be established from the structured medical data; the model captures the association relationships between diseases. Referring to FIG. 3, the disease association model may be established as follows:
Step S201: determining a preset data range, acquiring the data within the preset data range from the standardized clinical research database, and establishing a database to be processed.
The preset data range may be the diseases that occur in the same patient within a preset time period. For example, if the same patient has disease A and disease B within five years, disease A and disease B are considered to be related and to have a co-occurrence relationship. The preset time period should be neither too short nor too long: if it is too short, diseases with a co-occurrence relationship may be missed; if it is too long, diseases that are not actually related may be mistaken for co-occurring. It should be understood that in other embodiments the preset data range may be of another type as required and is not limited to a time period.
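The time-window rule can be sketched as follows. The sketch makes two assumptions that the source does not state: each disease is dated by its first recorded diagnosis, and "within five years" is measured as the gap between those first dates. The function name `cooccurring_pairs` and the sample codes are illustrative.

```python
from datetime import date
from itertools import combinations


def cooccurring_pairs(diagnoses, window_years=5):
    """Return disease pairs first diagnosed within the window of each other.

    diagnoses: list of (date, disease_code) tuples for one patient.
    """
    first_seen = {}
    for day, code in diagnoses:
        if code not in first_seen or day < first_seen[code]:
            first_seen[code] = day
    max_days = window_years * 365.25
    pairs = set()
    for a, b in combinations(sorted(first_seen), 2):
        if abs((first_seen[a] - first_seen[b]).days) <= max_days:
            pairs.add((a, b))
    return pairs


history = [
    (date(2010, 1, 1), "E11"),
    (date(2013, 6, 1), "I10"),
    (date(2019, 1, 1), "C50"),
]
pairs = cooccurring_pairs(history)
```

Here only E11 and I10 fall within five years of each other; widening `window_years` would pull in more pairs, which mirrors the trade-off described above.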
Step S202: acquiring the frequency of disease co-occurrence in the database to be processed.
For example, if the same patient has disease A and disease B within five years, this counts as one co-occurrence of disease A and disease B. Across patients, some have both disease A and disease B within five years, while others have only one of them or neither, and the number of patients in whom two diseases occur together differs from one pair of diseases to another. The frequency of disease co-occurrence is calculated in this step.
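A minimal sketch of the counting step is given below. It assumes that each patient's set of diseases has already been restricted to the preset time window, and it summarizes a pair of diseases as a 2×2 table (both, A only, B only, neither); this tabulation and the function name `contingency_table` are illustrative choices rather than details given in the source.

```python
def contingency_table(patient_diseases, disease_a, disease_b):
    """Count patients with both diseases, A only, B only, and neither.

    patient_diseases: {patient_id: set of disease codes within the window}.
    """
    both = a_only = b_only = neither = 0
    for diseases in patient_diseases.values():
        has_a = disease_a in diseases
        has_b = disease_b in diseases
        if has_a and has_b:
            both += 1
        elif has_a:
            a_only += 1
        elif has_b:
            b_only += 1
        else:
            neither += 1
    return both, a_only, b_only, neither


patients = {
    "P1": {"E11", "I10"},
    "P2": {"E11"},
    "P3": {"I10"},
    "P4": set(),
    "P5": {"E11", "I10"},
}
table = contingency_table(patients, "E11", "I10")
```

The first entry of the table is the co-occurrence count itself; the other three are kept because a measure such as the odds ratio needs them.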
Step S203: establishing the degree of disease association according to the frequency of disease co-occurrence.
After the disease co-occurrence frequency is acquired, an odds ratio (OR) and a confidence interval (CI) may be calculated. The OR is the ratio of exposed to unexposed persons in the case group divided by the ratio of exposed to unexposed persons in the control group, and it is a common index in epidemiological case-control studies. An OR equal to 1 indicates that the factor has no effect on disease co-occurrence; an OR greater than 1 indicates that the factor is a risk factor, that is, it is associated with disease co-occurrence; an OR less than 1 indicates that the factor is a protective factor, that is, it is not associated with disease co-occurrence.
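As a worked illustration, the OR for a pair of diseases can be computed from a 2×2 table. The sketch assumes that one disease plays the role of the exposure and the other the role of the outcome; the variable names and the example counts are hypothetical.

```python
def odds_ratio(both, a_only, b_only, neither):
    """Odds ratio (both * neither) / (a_only * b_only) for a 2x2 table.

    Returns None when the ratio is undefined because a_only or b_only is 0.
    """
    if a_only == 0 or b_only == 0:
        return None
    return (both * neither) / (a_only * b_only)


# 20 patients with both diseases, 80 with A only, 10 with B only, 90 with neither.
or_value = odds_ratio(20, 80, 10, 90)  # (20 * 90) / (80 * 10) = 2.25
```

With these counts the odds of disease B are 2.25 times higher among patients with disease A than among those without it.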
Like other sample statistics, an OR estimated from real data carries uncertainty, which can be characterized by a confidence interval for the OR. For example, the 95% CI is the 95% confidence interval of the OR value: "95% CI L to R" means that, for a given pair of diseases, the interval from L to R is expected to contain the true OR with 95% confidence. For instance, "95% CI 1.1 to 6.0" means that the true OR lies between 1.1 and 6.0 with 95% confidence.
In one embodiment, after the OR value is obtained from the disease co-occurrence frequency calculation, the CI value may be obtained from the OR value. The CI value may be calculated by the following steps:
first, calculating $\ln(\mathrm{OR})$ from the OR value;
second, calculating the standard error $\mathrm{SE}(\ln(\mathrm{OR}))$ of the result obtained in the first step;
third, calculating the 95% confidence interval of $\ln(\mathrm{OR})$, namely $\ln(\mathrm{OR}) \pm 1.96 \times \mathrm{SE}(\ln(\mathrm{OR}))$;
fourth, obtaining the 95% confidence interval of the OR, $\mathrm{CI} = \exp\left(\ln(\mathrm{OR}) \pm 1.96 \times \mathrm{SE}(\ln(\mathrm{OR}))\right)$.
Of course, in other embodiments the confidence level may take other values and is not limited to 95%. In general, the confidence interval should not span 1: an interval that contains 1 is compatible with two conflicting possibilities (increased risk and decreased risk), which makes the conclusion difficult to interpret. Therefore, after the CI is obtained, it is necessary to judge whether it spans 1; if it does, the association is excluded, and otherwise it is retained.
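A minimal Python sketch of the four steps and the exclusion check follows. The source does not give a formula for SE(ln(OR)); the sketch assumes the commonly used Woolf (logit) estimate sqrt(1/a + 1/b + 1/c + 1/d) computed from a 2×2 co-occurrence table. It also reads the exclusion rule as dropping associations whose interval contains 1, since such an interval is compatible with both increased and decreased risk. The function names and counts are illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z_crit=1.96):
    """Return (OR, lower, upper) for a 2x2 table with cells a, b, c, d.

    a: both diseases, b: first only, c: second only, d: neither.
    Assumes every cell is non-zero (a zero cell raises ZeroDivisionError).
    """
    or_value = (a * d) / (b * c)
    log_or = math.log(or_value)                      # step 1
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)    # step 2 (Woolf, assumed)
    lower = math.exp(log_or - z_crit * se)           # steps 3 and 4
    upper = math.exp(log_or + z_crit * se)
    return or_value, lower, upper


def spans_one(lower, upper):
    """True when the interval contains 1, i.e. the direction is ambiguous."""
    return lower <= 1.0 <= upper


weak = odds_ratio_ci(20, 80, 10, 90)     # OR 2.25, interval just reaches below 1
strong = odds_ratio_ci(40, 60, 10, 90)   # OR 6.0, interval entirely above 1
kept = [r for r in (weak, strong) if not spans_one(r[1], r[2])]
```

The first table shows why the interval matters: its OR of 2.25 looks like a clear association, yet the interval still includes 1, so under this reading it would be excluded.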
Step S204: obtaining the disease association model according to the degree of disease association.
After the degree of association between diseases is obtained, unreasonable data are excluded and the disease association model is established, from which the association relationships between diseases can be obtained.
Further, after the OR value and the confidence interval CI are obtained, the P value corresponding to the OR value may be obtained from the standard normal distribution table. In this embodiment, once the P value is obtained, neither the P value nor any other hypothesis-testing tool is used to screen the data. The data can therefore reflect the information they actually contain without subjective interpretation; introducing such a selection step could cause medically significant information to be missed.
In one embodiment, the P value may be calculated as:
first, taking the CI obtained above, the 95% confidence interval of the OR is $(L, R) = \exp\left(\ln(\mathrm{OR}) \pm 1.96 \times \mathrm{SE}(\ln(\mathrm{OR}))\right)$;
second, obtaining the standard error $\mathrm{SE} = (\ln R - \ln L) / (2 \times 1.96)$;
third, calculating the Z value, $Z = \ln(\mathrm{OR}) / \mathrm{SE}$;
fourth, looking up the Z value in the standard normal distribution table to obtain the corresponding P value.
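These steps can be sketched in Python as below, with the table lookup replaced by the complementary error function. The sketch assumes a two-sided P value, which the source does not state explicitly; the function name and the example interval are illustrative.

```python
import math


def p_value_from_ci(or_value, lower, upper, z_crit=1.96):
    """Recover SE from a CI, then return (Z, two-sided P) for the OR."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(or_value) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical association: OR 2.25 with 95% CI 0.9943 to 5.0916.
z, p = p_value_from_ci(2.25, 0.9943, 5.0916)
```

For this example Z is a little under 1.96, so the P value comes out slightly above 0.05, which is consistent with the interval just reaching below 1.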
Referring to fig. 1, after the disease association model is obtained, the model needs to be further optimized.
Step S30: correcting the disease association relationships in the disease association model that do not meet the preset conditions to obtain an optimized disease association model.
The preset condition here may be clinical significance. Because a statistical result does not necessarily imply clinical significance, the disease association model can be reviewed manually by medical personnel to ensure that the association relationships do not contain basic errors that contradict medical logic as a result of confounding. For example, several medical personnel can each review the association data in the disease association model; when several of them flag the same association data, that data is deemed not to conform to clinical significance and is removed. This ensures that the disease association model optimized in this step better conforms to clinical significance.
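The review rule can be sketched as a simple vote count. The threshold `min_flags`, the function name `remove_flagged`, and the sample associations are hypothetical; the source only says that an association is removed when several reviewers flag it.

```python
from collections import Counter


def remove_flagged(associations, reviewer_flags, min_flags=2):
    """Drop associations flagged by at least min_flags reviewers.

    associations: {(disease_a, disease_b): OR value}.
    reviewer_flags: one iterable of flagged pairs per reviewer.
    """
    # set() guards against one reviewer flagging the same pair twice.
    counts = Counter(pair for flags in reviewer_flags for pair in set(flags))
    return {
        pair: value
        for pair, value in associations.items()
        if counts[pair] < min_flags
    }


associations = {("E11", "I10"): 2.3, ("C50", "J00"): 4.1, ("I10", "N18"): 3.0}
reviewer_flags = [
    [("C50", "J00")],
    [("C50", "J00"), ("I10", "N18")],
    [],
]
optimized = remove_flagged(associations, reviewer_flags)
```

An association flagged by only one reviewer is kept, which limits the effect of a single reviewer's judgment.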
Referring to fig. 4, after step S30, the method further includes:
Step S40: visually displaying the optimized disease association model.
Once the disease association model is established, the association relationships between diseases are available. The next step is to visualize the knowledge contained in the model so that the relationships between diseases are intuitive and easy to understand, and a user can obtain and understand the disease co-occurrence relationships quickly.
This embodiment displays the associations between diseases in the optimized disease association model through a chord graph, which supports effective and interactive visualization of multivariate relational data. The chord graph as a whole is a circle whose size can be adjusted as needed. Along the edge of the circle are a number of arcs, each representing an entity (a specific disease); different arcs can be given different colors to distinguish different diseases. The more connections a disease has with other diseases, the wider its arc. A line connecting two arcs indicates that the two diseases are associated, and the width of the line represents the strength of the association: the wider the line, the closer the connection between the two diseases, meaning that they co-occur more often within the preset time period.
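The geometry described above can be sketched without committing to any plotting library. The sketch assumes one reading of the text: an arc's share of the circle is proportional to the number of diseases it is connected to, and a line's width is proportional to the pair's co-occurrence count, scaled so that the largest count has width 1. The function name `chord_layout` and the sample counts are illustrative.

```python
def chord_layout(cooccurrence):
    """Compute arc spans (degrees) and relative line widths for a chord graph.

    cooccurrence: {(disease_a, disease_b): co-occurrence count}.
    """
    if not cooccurrence:
        return {}, {}
    degree = {}
    for a, b in cooccurrence:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    total = sum(degree.values())
    # Arc span: share of the full circle, proportional to connection count.
    arcs = {disease: 360.0 * n / total for disease, n in degree.items()}
    largest = max(cooccurrence.values())
    # Line width: relative to the most frequent pair.
    lines = {pair: count / largest for pair, count in cooccurrence.items()}
    return arcs, lines


arcs, lines = chord_layout({("E11", "I10"): 50, ("E11", "N18"): 25})
```

The resulting spans and widths could then be passed to whatever drawing component renders the circle.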
The optimized disease association model displayed in the chord graph can contain complex information (for example, large-scale multivariate relationships among thousands of diseases). To help the user view the content of interest, this embodiment also allows interaction with the chord graph through an external device, so that the displayed content can be adjusted in a simple, intuitive, and clear way.
The appearance of the chord graph can be controlled by adjusting the relevant parameters. For example, this embodiment may set the size, position, color, meaning, and so on of the arcs in the chord graph, so that they can be adjusted as needed to show different features of the data. This embodiment also allows the user to control the appearance of the circle within different threshold ranges (for example, the number of disease associations, the confidence, and so on).
Through an external device (such as a keyboard or a mouse), the user can define one or more diseases of interest in the chord graph, so that only the association relationships of the defined diseases are displayed and the other content becomes transparent or is hidden.
Likewise, through an external device (such as a keyboard or a mouse), the user can define one or more diseases that are not of interest, so that the defined range is hidden from the chord graph; the arcs on the edge of the chord graph and the related association relationships change accordingly, while the other content is displayed normally.
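The two selection modes amount to filtering the set of associations before drawing. The sketch below assumes that focusing on a disease keeps every association involving it, and that marking a disease as not of interest drops every association involving it; the function names and sample data are illustrative.

```python
def focus_on(cooccurrence, focused):
    """Keep only associations that involve at least one focused disease."""
    return {
        pair: count
        for pair, count in cooccurrence.items()
        if pair[0] in focused or pair[1] in focused
    }


def hide(cooccurrence, unfocused):
    """Drop every association that involves a disease marked as not of interest."""
    return {
        pair: count
        for pair, count in cooccurrence.items()
        if pair[0] not in unfocused and pair[1] not in unfocused
    }


data = {("E11", "I10"): 50, ("E11", "N18"): 25, ("C50", "I10"): 5}
focused_view = focus_on(data, {"N18"})
hidden_view = hide(data, {"E11"})
```

Because both functions return a new dictionary, the full model stays intact and the user can switch between views freely.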
The method for identifying the disease co-occurrence relationship provided by the embodiment at least has the following beneficial effects:
(1) The clinical medical data of patients obtained in this embodiment are derived from a large volume of real data from medical institutions, which helps to explore the common treatment patterns of each disease type efficiently and objectively.
(2) When the clinical medical data of patients are obtained, semantic normalization is performed on the medical data through the ICD-10 standardized vocabulary to obtain structured clinical medical data, which conform to the conventions of clinical research and are easy to connect with clinical research.
(3) When the disease association relationships are obtained, the disease co-occurrence relationships are identified automatically through the OR value and the confidence interval CI. The relationships are objective, real, and accurate, labor and time costs are greatly reduced, and the efficiency of obtaining disease co-occurrence relationships is improved.
(4) Medical personnel review the association data in the disease association model and remove data that do not conform to clinical significance, which ensures that the resulting disease association model better conforms to clinical significance.
(5) The chord graph displays the disease association model visually, so the disease co-occurrence relationships are shown clearly and intuitively and the user can understand them from the graph. By interacting with the chord graph, the user can also select the disease co-occurrence relationships of interest, and operations such as academic literature queries and case storage and organization can be performed through graphical interaction.
A specific example of the method for identifying disease co-occurrence relationships is provided below. It should be understood that the following example only illustrates the method for identifying disease co-occurrence relationships provided in this embodiment and is not intended to limit it.
The server of each medical institution holds the clinical medical data of the patients treated at that institution, and each server can be connected to a public server, so that a large clinical research multimodal information database can be constructed. Such data are highly accurate and valid, and they help to explore the common treatment patterns of a disease objectively. In this embodiment, clinical medical data are first obtained from the large clinical research multimodal information database and desensitized, and natural language processing technology is used to obtain the clinical medical data of patients efficiently. The scattered patient information is integrated through the unique patient identifier, and semantic normalization is performed on the medical data through the ICD-10 standardized vocabulary, so that the medical data of each medical institution are structurally integrated and relatively complete information about each patient in the medical institution is obtained. This information includes what is recorded in the electronic medical record, such as the patient's identity information, hospital identification number, sex, age, race, diagnoses, medication, pulse, and the like. A standardized clinical research database is constructed by integrating the structured medical data of all patients.
After the standardized clinical research database is constructed, a disease association model needs to be obtained. In this specific example, a 5-year window (i.e., the preset time period) is used to observe and calculate the associations between cancer and other diseases. All patient records are stratified by visit type (outpatient, emergency, and inpatient) and sex (male and female), so that the data set comprises 6 subsets (3 visit types, each split by 2 sexes), and each subset is used to quantify the strength of disease-disease associations. After the Disease-Disease (DD) associations have been calculated for all diseases in the 6 subsets, the resulting associations are stored in a disease relationship database, in which each pair of diseases is uniquely associated and has a corresponding OR (odds ratio) value, thereby yielding the disease association model.
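The stratification and the per-pair OR calculation can be sketched as below. The embodiment does not state how the OR value or its confidence interval is computed, so this sketch assumes the standard 2x2 contingency table and a Woolf (log-based) 95% interval; the field names and function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def stratify(visits):
    """Group visit records into subsets keyed by (visit_type, sex).

    Each subset maps patient_id -> set of disease codes seen in that stratum.
    """
    subsets = defaultdict(lambda: defaultdict(set))
    for v in visits:
        subsets[(v["visit_type"], v["sex"])][v["patient_id"]].add(v["disease"])
    return subsets


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: patients with both diseases, b: first only, c: second only, d: neither.
    Returns None when any cell is zero (assumed handling, no correction applied).
    """
    if min(a, b, c, d) == 0:
        return None
    or_value = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    log_or = math.log(or_value)
    return or_value, math.exp(log_or - z * se), math.exp(log_or + z * se)


def pairwise_associations(patient_diseases):
    """Compute (OR, CI low, CI high) for every disease pair in one subset.

    patient_diseases is a list of sets, one set of disease codes per patient.
    """
    n = len(patient_diseases)
    diseases = sorted(set().union(*patient_diseases))
    results = {}
    for x, y in combinations(diseases, 2):
        a = sum(1 for s in patient_diseases if x in s and y in s)
        b = sum(1 for s in patient_diseases if x in s and y not in s)
        c = sum(1 for s in patient_diseases if y in s and x not in s)
        d = n - a - b - c
        stats = odds_ratio(a, b, c, d)
        if stats is not None:
            results[(x, y)] = stats
    return results
```

Because the disease pairs are generated from a sorted list, each pair appears exactly once as a key, which mirrors the requirement that every two diseases are uniquely associated in the disease relationship database.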
Since a statistically significant result does not necessarily have clinical significance, and in order to ensure that the associations contain no low-level errors that contradict medical logic because of confounding, in this embodiment several medical personnel independently review the association data in the disease association model. When three medical personnel mark the same association data, that association data is regarded as lacking clinical significance and is removed, which ensures that the optimized disease association model better conforms to clinical significance.
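The review rule can be sketched as a simple vote count. The function name, the data layout, and the choice to treat "three reviewers" as "at least `min_flags` reviewers" are assumptions made for illustration only.

```python
from collections import Counter


def prune_flagged(associations, reviewer_flags, min_flags=3):
    """Drop associations that enough reviewers marked as clinically implausible.

    associations: dict mapping a disease pair to its statistics.
    reviewer_flags: one iterable of flagged pairs per reviewer.
    min_flags: number of reviewers required to remove a pair (assumed to be 3).
    """
    # Count each reviewer at most once per pair.
    counts = Counter(pair for flags in reviewer_flags for pair in set(flags))
    return {pair: v for pair, v in associations.items() if counts[pair] < min_flags}


associations = {("C50", "E11"): 1.8, ("C50", "S52"): 2.4, ("E11", "I10"): 3.1}
reviewer_flags = [
    [("C50", "S52")],
    [("C50", "S52"), ("E11", "I10")],
    [("C50", "S52")],
]
optimized = prune_flagged(associations, reviewer_flags)
```

In this toy input only the pair flagged by all three reviewers is removed; the pair flagged by a single reviewer stays in the optimized model.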
After the disease association model is established, this embodiment displays the associations between diseases in the optimized disease association model through a chord graph, which supports efficient, interactive visualization of multivariate relationship data. Each arc of the chord graph represents one disease, and different arcs are given different colors to distinguish the diseases. The association between two diseases is represented by a line whose width indicates how closely the two diseases are connected: the wider the line, the closer the connection, that is, the more times the two diseases co-occur within the preset time period. Referring to fig. 5, during use the user may adjust the parameters of the chord graph through an external device (e.g., a keyboard or a mouse), such as the number of disease co-occurrences, the number of diseases, the total number of diseases, the OR value, the confidence interval, the Z value, and the P value, and may also control the appearance of the circle under different threshold ranges (number of disease associations, confidence, etc.). Moreover, to make the display of the associations of interest more concise and intuitive, the user may use the external device to select the associations of one or more diseases of interest in the chord graph (for example, fig. 6), so that only the selected range is retained and the remaining content becomes transparent. Alternatively, the user may select the associations of one or more diseases that are not of interest (for example, fig. 7), so that the selected range is hidden from the chord graph and the remaining content is displayed normally.
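The data behind such a chord graph can be sketched without committing to any plotting library. The function below is a hypothetical helper: it turns pairwise co-occurrence counts into a symmetric matrix (one row per arc, cell value as line width), applies a user-chosen threshold, and marks links outside the focused diseases as transparent. The opacity values 1.0 and 0.0 are an assumed encoding of "retained" and "transparent".

```python
def chord_data(co_counts, min_count=1, focus=None):
    """Build chord-graph inputs from pairwise co-occurrence counts.

    co_counts: dict mapping (disease_x, disease_y) -> number of co-occurrences.
    min_count: threshold below which a link is not drawn.
    focus: optional set of diseases of interest; other links become transparent.
    Returns (labels, matrix, opacity).
    """
    kept = {pair: n for pair, n in co_counts.items() if n >= min_count}
    labels = sorted({d for pair in kept for d in pair})
    index = {d: i for i, d in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    opacity = {}
    for (x, y), n in kept.items():
        # The matrix is symmetric because co-occurrence has no direction.
        matrix[index[x]][index[y]] = n
        matrix[index[y]][index[x]] = n
        in_focus = focus is None or x in focus or y in focus
        opacity[(x, y)] = 1.0 if in_focus else 0.0
    return labels, matrix, opacity


co_counts = {("E11", "I10"): 40, ("C50", "E11"): 5, ("C50", "I10"): 12}
labels, matrix, opacity = chord_data(co_counts, min_count=10, focus={"C50"})
```

With a threshold of 10 the weak link between `C50` and `E11` disappears, and focusing on `C50` leaves only its link to `I10` opaque. The resulting matrix has the square, symmetric shape that chord layouts generally take as input.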
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Referring to fig. 8, the present embodiment further aims to provide a device for identifying disease co-occurrence relationship, which includes a database module 51, a disease association model obtaining module 52, and a disease association model optimizing module 53. Wherein, the database module 51 is used for establishing a standardized clinical research database according to clinical medical data; the disease association model obtaining module 52 is configured to obtain a disease association model according to the standardized clinical research database to obtain a disease association relationship; the disease association model optimization module 53 is configured to correct a disease association relation that does not meet a preset condition in the disease association model, and obtain an optimized disease association model.
Referring to fig. 9, the database module 51 further includes a medical data obtaining unit 511, a structured clinical data unit 512, and a database unit 513, wherein the medical data obtaining unit 511 is configured to obtain clinical medical data of a patient according to a clinical research multimodal information database; the structured clinical data unit 512 is configured to perform structured integration on the clinical medical data according to the unique patient identifier to obtain structured clinical data; the database unit 513 is configured to perform semantic normalization on the structured clinical data to obtain complete information of the patient, thereby establishing a standardized clinical research database.
Referring to fig. 10, further, the disease association model obtaining module 52 includes a to-be-processed database unit 521, a disease co-occurrence frequency obtaining unit 522, a disease association degree obtaining unit 523, and a disease association model unit 524, where the to-be-processed database unit 521 is configured to determine a preset data range, obtain data in the preset data range in the standardized clinical research database, and establish a to-be-processed database; the disease co-occurrence frequency acquiring unit 522 is configured to acquire a disease co-occurrence frequency in the database to be processed; the disease association degree obtaining unit 523 is configured to establish a disease association degree according to the frequency of co-occurrence of diseases; the disease association model unit 524 is used to obtain a disease association model according to the disease association degree.
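The work of units 521 and 522 can be sketched as below: restrict the data to a preset range and count how many patients show each disease pair within it. The interpretation of the window as "two diagnoses of the same patient at most `window_days` apart", the 5 x 365 day approximation of five years, and all names are assumptions for illustration.

```python
from collections import Counter
from datetime import date, timedelta
from itertools import combinations


def co_occurrence_counts(diagnoses, window_days=5 * 365):
    """Count patients in whom two different diseases occur within the window.

    diagnoses: dict mapping patient_id -> list of (disease_code, date) tuples.
    Returns a Counter keyed by alphabetically ordered disease pairs.
    """
    window = timedelta(days=window_days)
    counts = Counter()
    for events in diagnoses.values():
        pairs = set()  # each pair is counted at most once per patient
        for (d1, t1), (d2, t2) in combinations(events, 2):
            if d1 != d2 and abs(t1 - t2) <= window:
                pairs.add(tuple(sorted((d1, d2))))
        counts.update(pairs)
    return counts


diagnoses = {
    "p1": [("C50", date(2010, 1, 1)), ("E11", date(2012, 6, 1)), ("I10", date(2018, 1, 1))],
    "p2": [("C50", date(2011, 1, 1)), ("E11", date(2011, 2, 1))],
}
counts = co_occurrence_counts(diagnoses)
```

These counts are the co-occurrence frequencies from which unit 523 would derive the disease association degree.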
Referring to fig. 11, further, the apparatus for identifying a disease co-occurrence relationship further includes a visualization display module 54, where the visualization display module 54 is configured to visually display the optimized disease association model. The visualization display module 54 can be used for displaying the association between diseases in the optimized disease association model through the chord graph, and also allowing the user to interact with the chord graph through external equipment, so that the displayed content of the chord graph is adjusted, and the user can intuitively and clearly know the co-occurrence relationship of diseases.
Fig. 12 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 12, the terminal device 6 of this embodiment includes: a processor 60, a memory 61, and a computer program 62 stored in the memory 61 and executable on the processor 60. The processor 60 executes the computer program 62 to implement the steps in the above-mentioned embodiment of the method for identifying co-occurrence relationships of diseases, such as the steps S10 to S30 shown in fig. 1.
Illustratively, the computer program 62 may be divided into one or more modules/units, which are stored in the memory 61 and executed by the processor 60 to implement the present invention. One or more of the modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 62 in the terminal device 6.
The terminal device 6 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The terminal device 6 may include, but is not limited to, a processor 60 and a memory 61. Those skilled in the art will appreciate that fig. 12 is merely an example of the terminal device 6 and does not constitute a limitation on it; the terminal device 6 may include more or fewer components than those shown, may combine certain components, or may use different components. For example, the terminal device 6 may also include input/output devices, network access devices, buses, and the like.
The Processor 60 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 61 may be an internal storage unit of the terminal device 6, such as a hard disk or a memory of the terminal device 6. The memory 61 may also be an external storage device of the terminal device 6, such as a plug-in hard disk provided on the terminal device 6, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), and the like. Further, the memory 61 may include both an internal storage unit of the terminal device 6 and an external storage device. The memory 61 is used for storing the computer program and other programs and data required by the terminal device 6. The memory 61 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the apparatus/terminal device embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation. For instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow in the methods of the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (12)

1. A method for identifying disease co-occurrence relationships, comprising:
establishing a standardized clinical research database according to clinical medical data;
acquiring a disease association model according to the standardized clinical research database to obtain a disease association relation;
and correcting the disease association relation which does not meet the preset conditions in the disease association model to obtain an optimized disease association model.
2. The method of identifying disease co-occurrence relationships according to claim 1, wherein said building a standardized clinical study database from clinical medical data comprises:
acquiring clinical medical data of a patient according to a clinical research multimodal information database;
performing structured integration on the clinical medical data according to the unique patient identification to obtain structured clinical data;
and carrying out semantic normalization processing on the structured clinical data to obtain complete information of the patient so as to establish a standardized clinical research database.
3. The method of claim 2, wherein the information comprises information recorded in an electronic medical record.
4. The method for identifying disease co-occurrence relationships according to claim 1, wherein obtaining a disease association model from the standardized clinical research database to obtain disease association relationships comprises:
determining a preset data range, acquiring data in the preset data range in the standardized clinical research database, and establishing a database to be processed;
acquiring the co-occurrence frequency of diseases in the database to be processed;
establishing disease association degree according to the disease co-occurrence frequency;
and obtaining a disease association model according to the disease association degree.
5. The method according to claim 4, wherein the predetermined data range includes diseases occurring in the same patient within a predetermined time period.
6. The method for identifying disease co-occurrence relationship according to claim 4, wherein establishing the disease association degree according to the disease co-occurrence frequency comprises:
obtaining a ratio according to the co-occurrence frequency of the diseases;
and obtaining a confidence interval according to the ratio.
7. The method for identifying disease co-occurrence relationship according to claim 1, wherein the correcting the disease association relationship that does not meet the preset condition in the disease association model to obtain the optimized disease association model comprises:
judging whether the disease association relation in the disease association model accords with medical significance;
if the disease association relationship accords with the medical significance, the disease association relationship is reserved;
and if the disease association does not accord with the medical significance, deleting the disease association.
8. The method for identifying disease co-occurrence relationship according to any one of claims 1 to 7, wherein the correcting the disease association relationship that does not meet the preset condition in the disease association model to obtain the optimized disease association model further comprises:
and visually displaying the optimized disease association model, and displaying the association between the diseases in the optimized disease association model through a chord chart.
9. The method for identifying disease co-occurrence relationships according to claim 8, wherein the visually displaying the optimized disease association model further comprises:
and interacting with the chord graph to adjust the display content of the chord graph.
10. An apparatus for identifying disease co-occurrence relationship, comprising:
the database module is used for establishing a standardized clinical research database according to clinical medical data;
the disease association model acquisition module is used for acquiring a disease association model according to the standardized clinical research database so as to acquire a disease association relation;
and the disease association model optimization module is used for correcting the disease association relation which does not accord with the preset conditions in the disease association model to obtain an optimized disease association model.
11. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the method for identifying disease co-occurrence relationship according to any one of claims 1 to 9 when executing the computer program.
12. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements a method for identifying disease co-occurrence relationship according to any one of claims 1 to 9.
CN201910509243.0A 2019-06-13 2019-06-13 Method and device for identifying disease co-occurrence relationship and storage medium Pending CN111161881A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910509243.0A CN111161881A (en) 2019-06-13 2019-06-13 Method and device for identifying disease co-occurrence relationship and storage medium

Publications (1)

Publication Number Publication Date
CN111161881A true CN111161881A (en) 2020-05-15

Family

ID=70555774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910509243.0A Pending CN111161881A (en) 2019-06-13 2019-06-13 Method and device for identifying disease co-occurrence relationship and storage medium

Country Status (1)

Country Link
CN (1) CN111161881A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221162A (en) * 2021-04-28 2021-08-06 健康数据(北京)科技有限公司 Private disease-specific big data privacy protection method and system based on block chain
CN117809827A (en) * 2024-03-01 2024-04-02 吉林大学 Nursing information management system based on Internet of things

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066783A (en) * 2016-11-16 2017-08-18 哈沙斯特医学研发有限公司 A kind of cross-platform clinical big data analysis and display system
CN107887036A (en) * 2017-11-09 2018-04-06 北京纽伦智能科技有限公司 Construction method, device and the clinical decision accessory system of clinical decision accessory system
CN108346474A (en) * 2018-03-14 2018-07-31 湖南省蓝蜻蜓网络科技有限公司 The electronic health record feature selection approach of distribution within class and distribution between class based on word
CN108538395A (en) * 2018-04-02 2018-09-14 上海市儿童医院 A kind of construction method of general medical disease that calls for specialized treatment data system
CN108573752A (en) * 2018-02-09 2018-09-25 上海米因医疗器械科技有限公司 A kind of method and system of the health and fitness information processing based on healthy big data
CN109119134A (en) * 2018-08-09 2019-01-01 脉景(杭州)健康管理有限公司 Medical history data processing method, medical data recommender system, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination