CN112635072A - ICU (intensive care unit) similar case retrieval method and system based on similarity calculation and storage medium

Publication number
CN112635072A
CN112635072A CN202011635403.5A CN202011635403A CN112635072A CN 112635072 A CN112635072 A CN 112635072A CN 202011635403 A CN202011635403 A CN 202011635403A CN 112635072 A CN112635072 A CN 112635072A
Authority
CN
China
Prior art keywords
icu
medical
case
matched
similarity calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011635403.5A
Other languages
Chinese (zh)
Inventor
包一平
李雪
于丹
来关军
孙箫宇
孙永樯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Neusoft Education Technology Group Co ltd
Original Assignee
Dalian Neusoft Education Technology Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Neusoft Education Technology Group Co ltd
Priority to CN202011635403.5A
Publication of CN112635072A

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3344: Query execution using natural language analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/338: Presentation of query results
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/237: Lexical tools
    • G06F40/247: Thesauruses; Synonyms
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/284: Lexical analysis, e.g. tokenisation or collocates

Abstract

The invention discloses an ICU similar case retrieval method based on similarity calculation, comprising the following steps: acquiring an input text and parsing it to obtain the medical features it contains and the feature attribute corresponding to each medical feature; searching a case retrieval system on each medical feature contained in the input text to obtain the historical ICU cases that match the feature attribute of that medical feature; for each historical ICU case, scoring the matching degree of each matched medical feature using the similarity calculation method that corresponds to the type of that feature; computing a weighted sum of the scores of all matched medical features in the historical ICU case to obtain its total score; and sorting the historical ICU cases by total score to obtain and output an ICU case retrieval result set. The invention achieves high-precision, high-efficiency retrieval of ICU cases.

Description

ICU (intensive care unit) similar case retrieval method and system based on similarity calculation and storage medium
Technical Field
The invention relates to the technical field of data processing, in particular to an ICU (intensive care unit) similar case retrieval method and system based on similarity calculation.
Background
As medical and health services become increasingly digitized, large medical institutions such as hospitals and physical examination centers generate large volumes of electronic health records. The data comes mainly from hospital electronic medical records and includes a large amount of unstructured and semi-structured data. Case queries based on patient similarity can serve as a technical aid to doctors, who can use them to make a preliminary diagnosis; patient similarity can also be applied to patient cohort identification, patient risk stratification, and similar tasks.
ICU cases contain more feature data, and more complex feature data, than ordinary cases. For example, the record of a patient admitted to an ICU ward contains a large amount of medication, diagnosis, nursing, imaging, and ventilator information that is absent from ordinary cases or present only in small amounts. Patients in ordinary cases typically have one or two conditions, whereas patients admitted to the ICU typically have more than 10 complex conditions, which makes the other features correspondingly complex.
Currently, the following solutions exist for general case retrieval:
1. Training on the text data in a case database to build a case word vector model, converting the input text into word vectors, calculating the similarity between the word vectors of the input text and all case word vectors in the case database, sorting by similarity, and returning the N cases with the highest similarity.
2. Building a knowledge graph for the case database and using it to connect different kinds of medical features. The input text is normalized and then associated with the corresponding cases according to how it matches the knowledge graph nodes, and the N cases with the highest degree of association are returned.
3. Performing word segmentation on the input text, searching the database for all cases that contain the relevant fields of the input text, scoring them by search matching degree, and returning the N cases with the highest matching degree.
However, these schemes have several problems. The preparatory work for schemes 1 and 2 is too complex: a word vector model or a knowledge graph can be built for an ordinary case system, but the features of an ICU case system are more numerous and more complex, so training a model or building a knowledge graph up front is more difficult. Moreover, new cases are added every day, so the model or knowledge graph must be updated repeatedly, which consumes considerable resources and makes these schemes unsuitable for practical use. Scheme 3 retrieves at a coarse granularity, so its retrieval precision is low, and it must traverse all cases during a search, so it is also slow.
In summary, existing solutions for general medical case retrieval have shortcomings and are not suitable for ICU case retrieval. There is an urgent need for a retrieval system that ensures retrieval accuracy and speed, consumes few resources, and is tailored to the characteristics of ICU cases.
Disclosure of Invention
In view of this, the present invention provides an ICU similar case retrieval method, system, and storage medium based on similarity calculation, so as to retrieve similar ICU cases quickly and accurately.
To achieve the above object, the following solutions are proposed:
The invention provides an ICU similar case retrieval method based on similarity calculation, which comprises the following steps:
acquiring an input text, and parsing the input text to obtain the medical features it contains and the feature attribute corresponding to each medical feature;
searching a case retrieval system on each medical feature contained in the input text, and obtaining the historical ICU cases that match the feature attribute of that medical feature;
for each historical ICU case, scoring the matching degree of each matched medical feature using the similarity calculation method that corresponds to the type of that feature;
computing a weighted sum of the scores of all matched medical features in the historical ICU case to obtain the total score of that case;
and sorting the historical ICU cases by total score to obtain and output an ICU case retrieval result set.
Further, parsing the input text comprises:
performing word segmentation on the input text;
removing words that may cause ambiguity to obtain structured data;
finding the synonyms and near-synonyms of the segmented input text and adding them to the input text;
analyzing the processed set of words using natural language understanding to obtain the medical features contained in the input text and the initial feature attribute corresponding to each, and converting them into expressions of the form "medical feature: feature attribute".
Further, parsing the input text comprises:
finding the synonyms and near-synonyms of all segmented words contained in the input text and adding them to the input text;
analyzing the processed set of words using natural language understanding to obtain the medical features contained in the input text and the feature attribute corresponding to each, and converting them into expressions of the form "medical feature: feature attribute".
Further, the medical features include one or more of: disease type, medication, admission symptoms, and other features, where other features are features other than disease type, medication, and admission symptoms.
Further, the weights of the weighted sum are determined based on doctors' experience.
Further, disease type and admission symptoms are weighted higher than medication, and medication is weighted higher than other features.
Further, the similarity calculation method corresponding to disease type includes:
Figure BDA0002881006460000031
Figure BDA0002881006460000032
where $w_i = 2/[e^{(n-1)} + 1]$ is the parameter representing the importance of the disease; $f_i$ indicates whether the disease type is matched (1 if matched, 0 if not); and $n$ is a positive integer denoting the total number of disease types;
the similarity calculation method corresponding to medication includes:
Figure BDA0002881006460000041
where
Figure BDA0002881006460000042
is the parameter representing the importance of the medication; $f_j$ indicates whether the medication is matched (1 if matched, 0 if not); $m$ is a positive integer denoting the total number of drug types; and $a_j$ is the number of times each drug was administered;
the similarity calculation method corresponding to admission symptoms includes: the TF-IDF formula;
the similarity calculation method corresponding to other features includes: if the feature is a structured feature, scoring by the sum of the number of matches; if the feature is a natural-language feature, scoring with the TF-IDF formula.
In another aspect, the present invention further provides an ICU similar case retrieval system based on similarity calculation, the system including:
an input module, configured to acquire an input text and parse it to obtain the medical features it contains and the feature attribute corresponding to each medical feature;
a case retrieval module, configured to search a case retrieval system on each medical feature obtained by the input module and obtain the historical ICU cases that match the feature attribute of that medical feature;
a similarity scoring module, configured to score, for each historical ICU case found by the case retrieval module, the matching degree of each matched medical feature using the similarity calculation method that corresponds to the type of that feature, and to compute a weighted sum of the scores of all matched medical features in the historical ICU case to obtain the total score of that case;
and an output module, configured to sort the historical ICU cases by the total scores obtained by the similarity scoring module to obtain and output an ICU case retrieval result set.
Further, the medical features include one or more of: disease type, medication, admission symptoms, and other features, where other features are features other than disease type, medication, and admission symptoms;
the similarity calculation method corresponding to disease type includes:
Figure BDA0002881006460000051
where $w_i = 2/[e^{(n-1)} + 1]$ is the parameter representing the importance of the disease; $f_i$ indicates whether the disease type is matched (1 if matched, 0 if not); and $n$ is a positive integer denoting the total number of disease types;
the similarity calculation method corresponding to medication includes:
Figure BDA0002881006460000052
where
Figure BDA0002881006460000053
is the parameter representing the importance of the medication; $f_j$ indicates whether the medication is matched (1 if matched, 0 if not); $m$ is a positive integer denoting the total number of drug types; and $a_j$ is the number of times each drug was administered;
the similarity calculation method corresponding to admission symptoms includes: the TF-IDF formula;
the similarity calculation method corresponding to other features includes: if the feature is a structured feature, scoring by the sum of the number of matches; if the feature is a natural-language feature, scoring with the TF-IDF formula.
In still another aspect, the present invention further provides a computer-readable storage medium storing a set of computer instructions which, when executed by a processor, implement the ICU similar case retrieval method based on similarity calculation described above.
According to the above technical solution, the invention takes into account the ways in which ICU cases differ from ordinary cases and provides a finer-grained similarity calculation, with a separate similarity formula for each type of feature. This makes ICU case matching more accurate and meets doctors' accuracy requirements for retrieved similar ICU cases.
In addition, the invention extracts features first and then performs a local search by feature type. This avoids the traditional approach of searching the entire case library on every query, which reduces the amount of computation and the difficulty of retrieval and improves retrieval efficiency, especially for common diseases with a very large number of documents.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a similarity calculation-based ICU similar case retrieval method according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of a medical feature disclosed in an embodiment of the present invention;
FIG. 3 is a diagram of an application scenario of the ICU similar case retrieval method according to the embodiment of the present invention;
FIG. 4 is another application scenario diagram of the ICU similar case retrieval method according to the embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
FIG. 1 shows a schematic flow chart of an ICU similar case retrieval method based on similarity calculation in an embodiment of the present invention. The method includes the following steps:
Step 1: Acquire an input text and parse it to obtain the medical features it contains and the feature attribute corresponding to each medical feature.
Step 1 comprises the following sub-steps:
S101: perform word segmentation on the input text;
S102: remove words that may cause ambiguity, such as stop words, to obtain structured data;
If the input text is already structured data, steps S103 and S104 may be performed directly.
S103: find the synonyms and near-synonyms of the segmented input text and add them to the input text;
S104: analyze the processed set of words using natural language understanding (NLU) to obtain the medical features contained in the input text and the feature attribute corresponding to each, and convert them into expressions of the form "medical feature: feature attribute".
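Sub-steps S101 to S104 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the stop-word list, the synonym table, and the term-to-feature lookup (`STOP_WORDS`, `SYNONYMS`, `FEATURE_OF`) are hypothetical toy data, a regular-expression split stands in for a real word segmenter, and a dictionary lookup stands in for the NLU step.

```python
import re

# Hypothetical toy resources; a real system would use a medical lexicon.
STOP_WORDS = {"the", "a", "with", "and", "of", "on", "patient", "has"}
SYNONYMS = {"mi": ["myocardial infarction"], "htn": ["hypertension"]}
# Hypothetical lookup from a term to the medical feature it belongs to.
FEATURE_OF = {
    "myocardial infarction": "disease type",
    "hypertension": "disease type",
    "aspirin": "medication",
    "dyspnea": "admission symptoms",
}


def expand_and_label(terms):
    """S103 + S104: add synonyms, then emit 'medical feature: feature attribute'."""
    expanded = list(terms)
    for term in terms:
        expanded.extend(SYNONYMS.get(term, []))  # S103: synonym expansion
    pairs = []
    for term in expanded:
        feature = FEATURE_OF.get(term)  # S104: dictionary stand-in for NLU
        if feature is not None:
            pairs.append(f"{feature}: {term}")
    return pairs


def parse_input(text):
    """S101 + S102 for unstructured text, followed by S103 + S104."""
    tokens = re.findall(r"[a-z]+", text.lower())         # S101: segmentation
    tokens = [t for t in tokens if t not in STOP_WORDS]  # S102: stop words
    return expand_and_label(tokens)
```

Structured input, which the text says can skip S101 and S102, would call `expand_and_label` directly on its list of terms.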
As shown in FIG. 2, the medical features include one or more of: disease type, admission symptoms, medication, and other features. Other features are those other than disease type, medication, and admission symptoms, such as surgery, imaging diagnosis, nursing information, and physical examination diagnosis.
Step 2: and searching based on each medical characteristic included in the input text in a case searching system, and acquiring historical ICU cases matched with the characteristic attribute of the medical characteristic.
In practical application, the case data can be searched in case data warehouses comprising various ICU cases besides the case searching system.
Wherein, the step 2 comprises the following steps:
s201, aiming at the input text which is obtained in the step 1 and subjected to analysis processing, assuming that M medical characteristics exist after the analysis processing, and respectively searching the M medical characteristics in a case searching system, wherein M is a positive integer.
S202, aiming at each medical feature, only the medical feature field in the case database is searched, and N historical ICU cases which are matched with the feature attribute of the medical feature are obtained, wherein N is a positive integer. Thus, a total of S ICU cases can be obtained, wherein S is a positive integer and 0< S ≦ M N.
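One way to picture the field-restricted lookup in S201 and S202 is an index keyed by (feature, attribute), so that a query feature only ever touches its own field. The sketch below is an assumption about how such a lookup could be organized; the function names, the in-memory dictionary layout, and the optional per-feature cap are all hypothetical, and the source does not say how the case retrieval system is built.

```python
from collections import defaultdict


def build_field_index(cases):
    """Map (feature, attribute) -> set of case ids (hypothetical layout)."""
    index = defaultdict(set)
    for case_id, features in cases.items():
        for feature, attributes in features.items():
            for attribute in attributes:
                index[(feature, attribute)].add(case_id)
    return index


def retrieve(index, query_pairs, limit_per_feature=None):
    """Search each of the M query features in its own field; return the union.

    With at most N hits per feature, the union holds at most M * N cases.
    """
    candidates = set()
    for pair in query_pairs:
        hits = sorted(index.get(pair, set()))
        if limit_per_feature is not None:
            hits = hits[:limit_per_feature]  # keep at most N cases per feature
        candidates.update(hits)
    return candidates
```

Because the key includes the feature name, a drug name that happens to appear in a disease field is never matched, which is the point of searching only the relevant field.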
Step 3: For each historical ICU case, score the matching degree of each matched medical feature using the similarity calculation method that corresponds to the type of that feature.
The matching degree of every medical feature matched by each historical ICU case is scored. If a historical ICU case matches N medical features of the input text, that case has N scores.
Because each kind of medical feature has its own characteristics, the invention defines several similarity calculation methods, one per kind of feature, and applies the method that corresponds to the type of the medical feature, so that similarity is calculated more accurately. For example:
Similarity calculation for disease type. Disease type is a structured feature with the following characteristic: each ICU case generally records more than 10 disease types, whose importance decreases in order of their recorded serial number. Accordingly, the similarity calculation formula for disease type is:
Figure BDA0002881006460000081
where $w_i = 2/[e^{(n-1)} + 1]$ is the parameter representing the importance of the disease; $f_i$ indicates whether the disease type is matched (1 if matched, 0 if not); and $n$ is a positive integer denoting the total number of disease types.
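The disease-type score can be sketched as below, under two loudly stated assumptions. First, the formula image is not reproduced in the text, so the sketch assumes the score is the sum of $w_i f_i$ over the case's recorded disease types. Second, the extracted weight is written with $n$ in the exponent; since the text says importance decreases with the recorded serial number, the sketch assumes the exponent uses the rank $i$ of the disease in the case record (starting at 1). The function names are hypothetical.

```python
import math


def disease_weight(rank):
    """Assumed weight 2 / (e^(rank - 1) + 1); rank 1 is the first recorded disease."""
    return 2.0 / (math.exp(rank - 1) + 1.0)


def disease_similarity(case_diseases, query_diseases):
    """Assumed score: sum of w_i * f_i, where f_i is 1 if disease i is in the query."""
    query = set(query_diseases)
    return sum(
        disease_weight(rank)
        for rank, disease in enumerate(case_diseases, start=1)
        if disease in query
    )
```

Under this reading the first recorded disease has weight 1 and later entries decay quickly, so matching a primary diagnosis counts far more than matching a tenth-listed comorbidity.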
Similarity calculation for medication. Medication is a structured feature with the following characteristic: each ICU case contains medication data spanning multiple days. The administrations in each case are therefore counted, giving the total number of administrations $A$ and the number of administrations $a_j$ of each drug. The similarity calculation formula for medication is:
Figure BDA0002881006460000082
where $w_j = a_j / A$ is the parameter representing the importance of the medication; $f_j$ indicates whether the medication is matched (1 if matched, 0 if not); $m$ is a positive integer denoting the total number of drug types; and $a_j$ is the number of times each drug was administered.
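A small sketch of the medication score follows. The formula image is missing from the text, so the summation form (sum of $w_j f_j$ over the $m$ drugs in the case) is an assumption, as is reading the weight as the fraction $a_j / A$; the function name and the dictionary input are hypothetical.

```python
def medication_similarity(case_admin_counts, query_drugs):
    """Assumed score: sum over matched drugs of a_j / A.

    case_admin_counts maps drug name -> number of administrations a_j in the case.
    """
    total = sum(case_admin_counts.values())  # A: total administrations in the case
    if total == 0:
        return 0.0
    query = set(query_drugs)
    return sum(
        count / total  # w_j = a_j / A (assumed reading of the weight)
        for drug, count in case_admin_counts.items()
        if drug in query  # f_j = 1 only for matched drugs
    )
```

With this reading the score lies between 0 and 1 and equals the share of the case's administrations that involve drugs named in the query.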
Similarity calculation for admission symptoms. Admission symptoms are a natural-language feature that describes the patient's state at admission, and their similarity is calculated with the TF-IDF formula. The TF-IDF of a word $i$ used to describe a medical feature in medical record document $j$ is $\mathrm{TF\text{-}IDF}_{ij} = \mathrm{TF}_{ij} \times \mathrm{IDF}_i$, where $\mathrm{TF}_{ij}$ is the term frequency (TF), the ratio of the number of occurrences of word $i$ in document $j$ to the total number of words in document $j$, that is, $\mathrm{TF}_{ij} = (\text{number of occurrences of word } i \text{ in document } j) / (\text{total number of words in document } j)$, and $\mathrm{IDF}_i$ is the inverse document frequency (IDF) of word $i$, calculated by the following formula:
Figure BDA0002881006460000091
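The TF-IDF calculation can be illustrated as follows. The TF part follows the definition given in the text. The IDF formula is an image that is not reproduced, so the sketch assumes one common textbook form, log(total documents / (1 + documents containing the word)); the patent's exact form may differ. Summing TF-IDF over the query words to score a document is also an assumption, and all function names are hypothetical.

```python
import math


def term_frequency(word, doc_tokens):
    """TF: occurrences of the word in the document / total words in the document."""
    if not doc_tokens:
        return 0.0
    return doc_tokens.count(word) / len(doc_tokens)


def inverse_document_frequency(word, corpus):
    """Assumed IDF form: log(N / (1 + number of documents containing the word))."""
    containing = sum(1 for doc_tokens in corpus if word in doc_tokens)
    return math.log(len(corpus) / (1 + containing))


def tfidf_score(query_words, doc_tokens, corpus):
    """Assumed document score: sum of TF-IDF over the query words."""
    return sum(
        term_frequency(w, doc_tokens) * inverse_document_frequency(w, corpus)
        for w in query_words
    )
```

Words that appear in most admission notes contribute little under this form, while rarer symptom words dominate the score.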
Similarity calculation for other features, such as surgery, physical examination diagnosis, imaging diagnosis, and nursing information. If the medical feature is a structured feature, its similarity score is the sum of the number of matches; if it is a natural-language feature, its similarity is calculated with the TF-IDF formula.
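The branch between structured and natural-language features might look like the sketch below. It assumes that a structured feature is stored as a list of values and a natural-language feature as a string, which the source does not specify; `structured_match_score`, `score_other_feature`, and the `text_scorer` callback (any TF-IDF based scorer) are hypothetical names.

```python
from collections import Counter


def structured_match_score(case_values, query_values):
    """Sum of match counts: how often each queried value occurs in the case."""
    counts = Counter(case_values)
    return sum(counts[q] for q in set(query_values))


def score_other_feature(case_value, query_value, text_scorer):
    """Dispatch on feature kind (assumed: str = natural language, list = structured)."""
    if isinstance(case_value, str):
        # Natural-language feature: delegate to a TF-IDF based scorer.
        return text_scorer(query_value, case_value)
    # Structured feature: score by the sum of the number of matches.
    return structured_match_score(case_value, query_value)
```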
Step 4: Compute a weighted sum of the scores of all matched medical features in the historical ICU case to obtain the total score of that case.
Step 5: Sort the historical ICU cases by total score to obtain and output an ICU case retrieval result set.
The scores of all medical features are combined by weighted sum into a final total score, which is the basis for ordering the final retrieval results. Finally, either a preset number of top-ranked ICU cases or all S sorted ICU cases are output as the ICU case retrieval result set.
The weights are assigned mainly according to doctors' experience. In general, other things being equal, disease type and admission symptoms have the highest priority and the largest weight, medication has the second priority and the second-largest weight, and other features have the lowest priority and the smallest weight.
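Steps 4 and 5 amount to a weighted sum followed by a sort, as in the sketch below. The numeric weights are hypothetical placeholders chosen only to respect the stated ordering (disease type and admission symptoms highest, medication next, other features lowest); the source gives no values and leaves them to doctors' experience.

```python
# Hypothetical weights; only their ordering comes from the text.
WEIGHTS = {
    "disease type": 0.35,
    "admission symptoms": 0.35,
    "medication": 0.20,
    "other": 0.10,
}


def total_score(feature_scores, weights=WEIGHTS):
    """Step 4: weighted sum of the scores of the matched features of one case."""
    return sum(weights[feature] * score for feature, score in feature_scores.items())


def rank_cases(scores_by_case, top_k=None, weights=WEIGHTS):
    """Step 5: sort case ids by total score, highest first; optionally keep top_k."""
    ranked = sorted(
        scores_by_case,
        key=lambda case_id: total_score(scores_by_case[case_id], weights),
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]
```

Passing `top_k` corresponds to outputting a preset number of top-ranked cases, and omitting it corresponds to outputting all S sorted cases.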
The embodiment of the invention takes into account the ways in which ICU cases differ from ordinary cases and provides a finer-grained similarity calculation, with a separate similarity formula for each type of feature. This makes ICU case matching more accurate and meets doctors' accuracy requirements for retrieved similar ICU cases. In addition, the embodiment extracts features first and then performs a local search by feature type. This avoids the traditional approach of searching the entire case library on every query, which reduces the amount of computation and the difficulty of retrieval and improves retrieval efficiency, especially for common diseases with a very large number of documents.
ICU case retrieval has many application scenarios. For example, a doctor can enter various medical features and search for similar cases to generate a result set, which can serve as material for the doctor's research. As another example, during a patient's treatment, a clinician selects the early-warning types to monitor for the patient; the patient's physiological indicators are then monitored and alerts are raised. When an alert is raised, similar cases are matched in the data warehouse according to the condition that triggered the alert and the patient's features from ICU admission up to that moment, and the doctor reviews the treatment and progression of the historical cases as an aid to diagnosis.
For the convenience of understanding, the ICU similar case retrieval method based on similarity calculation in the present invention is described below by taking two specific application scenarios of ICU case retrieval as examples.
Example 1:
FIG. 3 is an application scenario diagram of the ICU similar case retrieval method according to the embodiment of the present invention. Starting from a single ICU case with characteristic features, a clinician can review the data of that case during the hospital stay, select one or more medical feature indicators, and generate a result set of similar ICU cases through the case retrieval system; the result set can be used as a scientific research topic. The specific steps are as follows:
Perform structured parsing on the input text. Here the input text is unstructured data, so steps S101 to S104 of step 1 described in the summary of the invention need to be executed to convert the input text into expressions of the form "medical feature: feature attribute".
Search the case retrieval system for ICU cases matching the input text according to step 2; S ICU cases are obtained in total, where S is a positive integer.
Score the matching degree of each medical feature in the S ICU cases according to step 3, obtaining N scores.
Perform a weighted summation of the scores of all medical features to obtain a final total score, which serves as the basis for ranking the final retrieval results. The weights are assigned mainly according to the physician's experience. Finally, the S ranked ICU cases are output as the final result set.
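The weighted summation and ranking in the last step can be sketched as follows. This is a minimal illustration only: the feature names, the weight values in `WEIGHTS`, and the function names are hypothetical, since the text states only that the weights come from the physician's experience.

```python
from typing import Dict, List, Tuple

# Hypothetical weights; the text only says they are set from physician experience.
WEIGHTS: Dict[str, float] = {
    "disease_type": 0.35,
    "admission_symptoms": 0.35,
    "medication": 0.20,
    "other": 0.10,
}


def total_score(feature_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the per-feature matching scores of one historical case."""
    return sum(weights.get(name, 0.0) * score for name, score in feature_scores.items())


def rank_cases(
    cases: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (case_id, total_score) pairs sorted from most to least similar."""
    scored = [(case_id, total_score(scores, weights)) for case_id, scores in cases.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_cases` on the per-feature scores of the S retrieved cases yields the ordered result set described above.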
Example 2:
FIG. 4 is another application scenario diagram of the ICU similar case retrieval method according to the embodiment of the present invention. During a patient's treatment, a clinician selects the early-warning types to be monitored for the patient; the patient's physiological indicators are then monitored and warnings are raised. When an early warning is generated, similar ICU cases are matched in the case retrieval system according to the condition that triggered the warning and the patient's characteristics from ICU admission up to that moment, and the doctor reviews the treatment and progression of the historical ICU cases. The specific steps are as follows:
In this embodiment, the input text is the early-warning information, which is structured data, so only steps S103 and S104 of step 1 described in the summary of the invention need to be executed to convert the input text into expressions of the form "medical feature: feature attribute".
Search the case retrieval system for ICU cases matching the input text according to step 2; S ICU cases are obtained in total, where S is a positive integer.
Score the matching degree of each medical feature in the S ICU cases according to step 3, obtaining N scores.
Perform a weighted summation of the scores of all medical features to obtain a final total score, which serves as the basis for ranking the final retrieval results. The weights are assigned mainly according to the physician's experience. This yields S ranked ICU cases.
Before output, the doctor can set a TopN value so that only the N ICU cases with the highest similarity and reference value are finally output, as a reference for the doctor's subsequent diagnosis.
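The TopN cut-off can be illustrated with a short sketch. The function name `top_n` and the `(case_id, total_score)` pair layout are assumptions made for this example; the text does not prescribe a data structure.

```python
import heapq
from typing import List, Tuple


def top_n(scored_cases: List[Tuple[str, float]], n: int) -> List[Tuple[str, float]]:
    """Keep only the n cases with the highest total score, best first."""
    return heapq.nlargest(n, scored_cases, key=lambda pair: pair[1])
```

Using `heapq.nlargest` avoids sorting all S cases when only a few are shown to the doctor; a full sort followed by slicing would give the same result.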
Corresponding to the ICU similar case retrieval method based on similarity calculation in the application, the application also provides an ICU similar case retrieval system based on similarity calculation, and the system comprises:
an input module, configured to acquire an input text and parse it to obtain the medical features in the input text and the feature attributes corresponding to those medical features;
a case retrieval module, configured to search a case retrieval system based on each medical feature obtained by the input module and to acquire historical ICU cases matching the feature attributes of the medical features;
a similarity scoring module, configured to, for each historical ICU case retrieved by the case retrieval module, score the matching degree of each matched medical feature according to the similarity calculation mode corresponding to the type of that feature, and to compute a weighted sum of the scores of all matched medical features in the historical ICU case to obtain the total score of the historical ICU case;
and an output module, configured to rank the historical ICU cases according to the total scores obtained by the similarity scoring module, and to obtain and output an ICU case retrieval result set.
Further, the medical features include: one or more of the type of illness, medication, admission symptoms, other characteristics; other characteristics are those other than the type of illness, the condition of medication, and the symptoms of admission;
the similarity calculation mode corresponding to the disease type is:
Figure BDA0002881006460000121
where $w_i = 2/[e^{(n-1)} + 1]$ is a parameter representing the degree of importance of the disease; $f_i$ indicates whether the disease type is matched (1 if matched, 0 if not matched); and $n$ is a positive integer representing the total number of disease types;
the similarity calculation mode corresponding to the medication condition is:
Figure BDA0002881006460000122
where
Figure BDA0002881006460000123
is a parameter indicating the degree of importance of the medication; $f_j$ indicates whether the medication condition is matched (1 if matched, 0 if not matched); $m$ is a positive integer representing the total number of medication types; and $a_j$ represents the number of times each drug was administered;
the similarity calculation mode corresponding to the admission symptoms is the TF-IDF formula;
the similarity calculation mode corresponding to the other features is as follows: if the feature is a structured feature, it is scored by the total number of matches; if the feature is a natural-language feature, it is scored with the TF-IDF formula.
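The two scoring rules for admission symptoms and other features can be sketched as below. The text names TF-IDF without fixing a variant, so this example assumes a smoothed inverse document frequency, log((1 + N) / (1 + df)) + 1, and term frequency normalized by document length; the function names and the token-list inputs are likewise hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def tfidf_score(
    query_terms: Sequence[str],
    doc_terms: Sequence[str],
    corpus: Sequence[Sequence[str]],
) -> float:
    """Sum the TF-IDF weights of the query terms inside one case document."""
    if not doc_terms:
        return 0.0
    counts = Counter(doc_terms)
    n_docs = len(corpus)
    score = 0.0
    for term in set(query_terms):
        tf = counts[term] / len(doc_terms)          # term frequency in this case
        df = sum(1 for doc in corpus if term in doc)  # cases containing the term
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # assumed smoothed variant
        score += tf * idf
    return score


def structured_match_score(query_values: Sequence[str], case_values: Sequence[str]) -> int:
    """Score a structured feature by the total number of matching occurrences."""
    counts = Counter(case_values)
    return sum(counts[value] for value in set(query_values))
```

With this variant a query term that is rare across the historical cases contributes more to the score than a common one, which is the usual motivation for TF-IDF.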
Further, parsing of the input text by the input module comprises:
(1) performing word segmentation on the input text;
(2) removing words that may cause ambiguity to obtain structured data;
(3) finding synonyms and near-synonyms of the segmented input text and adding them to the input text;
(4) analyzing the processed set of text words with a natural language understanding technique to obtain the medical features in the input text and the feature attributes corresponding to those medical features, and converting them into expressions of the form "medical feature: feature attribute".
If the input text is structured data, steps (3) and (4) are executed directly.
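The four parsing steps, including the shortcut for structured input, can be sketched as follows. Everything named here is a stand-in chosen for illustration: whitespace splitting replaces a real word segmenter, the `AMBIGUOUS`, `SYNONYMS`, and `FEATURE_OF` tables are hypothetical, and a dictionary lookup replaces the natural language understanding step.

```python
from typing import Dict, List, Sequence, Union

AMBIGUOUS = {"possibly", "maybe"}                 # hypothetical ambiguity list
SYNONYMS = {"dyspnea": ["breathlessness"]}        # hypothetical synonym table
FEATURE_OF = {                                    # hypothetical term-to-feature map
    "sepsis": "disease_type",
    "dyspnea": "admission_symptoms",
    "breathlessness": "admission_symptoms",
    "norepinephrine": "medication",
}


def parse_input(
    text_or_tokens: Union[str, Sequence[str]], structured: bool = False
) -> Dict[str, List[str]]:
    """Turn an input into a {medical feature: [feature attributes]} mapping."""
    if structured:
        # Structured input skips steps (1) and (2).
        tokens = list(text_or_tokens)
    else:
        tokens = str(text_or_tokens).lower().split()          # step (1), simplified
        tokens = [t for t in tokens if t not in AMBIGUOUS]    # step (2)
    expanded = list(tokens)
    for token in tokens:                                      # step (3)
        expanded.extend(SYNONYMS.get(token, []))
    features: Dict[str, List[str]] = {}
    for token in expanded:                                    # step (4), simplified
        feature = FEATURE_OF.get(token)
        if feature is not None:
            features.setdefault(feature, []).append(token)
    return features
```

A production system would replace each stand-in with a medical word segmenter, a curated synonym resource, and a trained extraction model; the control flow between the steps is the part this sketch is meant to show.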
Further, the weight of the weighted sum is determined based on the experience of the doctor.
Further, the type of illness and the admission symptoms are weighted higher than the medication condition, which in turn is weighted higher than the other features.
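Since the weights are entered by hand, a small check can confirm that a chosen set respects the stated priority order. The dictionary keys and the function name below are hypothetical and serve only to illustrate the ordering constraint.

```python
from typing import Dict


def weights_respect_priority(weights: Dict[str, float]) -> bool:
    """True if disease type and admission symptoms outweigh medication,
    and medication outweighs the other features."""
    high = min(weights["disease_type"], weights["admission_symptoms"])
    return high > weights["medication"] > weights["other"]
```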
Because the ICU similar case retrieval system based on similarity calculation according to the embodiment of the present invention corresponds to the ICU similar case retrieval method based on similarity calculation of the above embodiment, it is described only briefly; for the relevant details, refer to the description in the above embodiment, which is not repeated here.
The embodiment of the application also discloses a computer-readable storage medium, wherein a computer instruction set is stored in the computer-readable storage medium, and when being executed by a processor, the computer instruction set realizes the similarity calculation-based ICU similar case retrieval method provided by any one of the above embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and various other media capable of storing program code.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An ICU similar case retrieval method based on similarity calculation is characterized by comprising the following steps:
acquiring an input text, and analyzing the input text to obtain medical features and feature attributes corresponding to the medical features included in the input text;
searching based on each medical characteristic included in the input text in a case searching system, and acquiring a historical ICU case matched with the characteristic attribute of the medical characteristic;
for each historical ICU case, respectively scoring the matching degree of each matched medical feature according to a similarity calculation mode corresponding to the type of each matched medical feature;
weighting and summing the scores of all the matched medical characteristics in the historical ICU case to obtain the total score of the historical ICU case;
and sequencing all the historical ICU cases according to the total scores of the historical ICU cases to obtain and output an ICU case retrieval result set.
2. The ICU similar case retrieval method based on similarity calculation according to claim 1, wherein parsing the input text comprises:
performing word segmentation processing on an input text;
removing words which can cause ambiguity to obtain structured data;
finding out synonyms and near synonyms of the input text after word segmentation processing, and adding the synonyms and near synonyms into the input text;
adopting a natural language understanding technology to analyze the processed text word set to obtain medical features and feature attributes corresponding to the medical features included in the input text, and converting them into expressions of the form "medical feature: feature attribute".
3. The ICU similar case retrieval method based on similarity calculation according to claim 1, wherein parsing the input text comprises:
finding out synonyms and near synonyms of all the participles included in the input text, and adding the synonyms and near synonyms into the input text;
adopting a natural language understanding technology to analyze the processed text word set to obtain medical features and feature attributes corresponding to the medical features included in the input text, and converting them into expressions of the form "medical feature: feature attribute".
4. The ICU similar case retrieval method based on similarity calculation according to claim 1, wherein the medical features include: one or more of the type of illness, medication, admission symptoms, other characteristics; other characteristics are those other than the type of illness, the condition of medication, and the symptoms of admission.
5. The ICU similar case retrieval method based on similarity calculation according to claim 1, wherein the weight of the weighted sum is determined according to the doctor's experience.
6. The ICU similar case retrieval method based on similarity calculation of claim 4, wherein the type of illness and the admission symptoms are weighted higher than the medication condition, which is weighted higher than other characteristics.
7. The ICU similar case retrieval method based on similarity calculation of claim 4, wherein the similarity calculation manner corresponding to the disease type comprises:
Figure FDA0002881006450000021
Figure FDA0002881006450000022
wherein $w_i = 2/[e^{(n-1)} + 1]$ is a parameter representing the degree of importance of the disease; $f_i$ indicates whether the disease type is matched (1 if matched, 0 if not matched); $n$ is a positive integer representing the total number of disease types;
the similarity calculation mode corresponding to the medication condition comprises the following steps:
Figure FDA0002881006450000023
wherein
Figure FDA0002881006450000024
is a parameter indicating the degree of importance of the medication; $f_j$ indicates whether the medication condition is matched (1 if matched, 0 if not matched); $m$ is a positive integer representing the total number of medication types; $a_j$ represents the number of times each drug was administered;
the similarity calculation method corresponding to the admission symptoms comprises the following steps: TF-IDF calculation formula;
the similarity calculation method corresponding to other features includes: if the feature is a structural feature, scoring according to the sum of the matching times; if the features are natural language type features, the TF-IDF calculation formula is used for scoring.
8. An ICU similar case retrieval system based on similarity calculation, characterized in that the system comprises:
the input module is used for acquiring an input text and analyzing the input text to obtain medical features and feature attributes corresponding to the medical features in the input text;
the case retrieval module is used for retrieving in a case retrieval system based on each medical characteristic in the input text obtained by the input module and acquiring a historical ICU case matched with the characteristic attribute of the medical characteristic;
the similarity scoring module is used for scoring the matching degree of each matched medical characteristic according to a similarity calculation mode corresponding to the type of each matched medical characteristic, for each historical ICU case retrieved by the case retrieval module; and for weighting and summing the scores of all the matched medical characteristics in the historical ICU case to obtain the total score of the historical ICU case;
and the output module is used for sequencing each historical ICU case according to the total score of the historical ICU case obtained by the similarity scoring module to obtain and output an ICU case retrieval result set.
9. The ICU similar case retrieval system based on similarity calculation according to claim 8, characterized in that said medical features include: one or more of the type of illness, medication, admission symptoms, other characteristics; other characteristics are those other than the type of illness, the condition of medication, and the symptoms of admission;
the similarity calculation mode corresponding to the disease type comprises the following steps:
Figure FDA0002881006450000031
wherein $w_i = 2/[e^{(n-1)} + 1]$ is a parameter representing the degree of importance of the disease; $f_i$ indicates whether the disease type is matched (1 if matched, 0 if not matched); $n$ is a positive integer representing the total number of disease types;
the similarity calculation mode corresponding to the medication condition comprises the following steps:
Figure FDA0002881006450000032
wherein
Figure FDA0002881006450000033
is a parameter indicating the degree of importance of the medication; $f_j$ indicates whether the medication condition is matched (1 if matched, 0 if not matched); $m$ is a positive integer representing the total number of medication types; $a_j$ represents the number of times each drug was administered;
the similarity calculation method corresponding to the admission symptoms comprises the following steps: TF-IDF calculation formula;
the similarity calculation method corresponding to other features includes: if the feature is a structural feature, scoring according to the sum of the matching times; if the features are natural language type features, the TF-IDF calculation formula is used for scoring.
10. A computer-readable storage medium, wherein a computer instruction set is stored in the computer-readable storage medium, and when the computer instruction set is executed by a processor, the method for retrieving the ICU similar case based on the similarity calculation according to any one of claims 1 to 7 is implemented.
CN202011635403.5A 2020-12-31 2020-12-31 ICU (intensive care unit) similar case retrieval method and system based on similarity calculation and storage medium Pending CN112635072A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011635403.5A CN112635072A (en) 2020-12-31 2020-12-31 ICU (intensive care unit) similar case retrieval method and system based on similarity calculation and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011635403.5A CN112635072A (en) 2020-12-31 2020-12-31 ICU (intensive care unit) similar case retrieval method and system based on similarity calculation and storage medium

Publications (1)

Publication Number Publication Date
CN112635072A true CN112635072A (en) 2021-04-09

Family

ID=75290419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011635403.5A Pending CN112635072A (en) 2020-12-31 2020-12-31 ICU (intensive care unit) similar case retrieval method and system based on similarity calculation and storage medium

Country Status (1)

Country Link
CN (1) CN112635072A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113241136A (en) * 2021-05-17 2021-08-10 哈尔滨工业大学(深圳) Similar case analysis method and system
CN116564539A (en) * 2023-07-10 2023-08-08 神州医疗科技股份有限公司 Medical similar case recommending method and system based on information extraction and entity normalization
CN117690581A (en) * 2023-12-13 2024-03-12 江苏济远医疗科技有限公司 Disease inquiry process auxiliary information generation method based on large language model

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572675A (en) * 2013-10-16 2015-04-29 中国人民解放军南京军区南京总医院 Similar medical history searching system and method
CN106682397A (en) * 2016-12-09 2017-05-17 江西中科九峰智慧医疗科技有限公司 Knowledge-based electronic medical record quality control method
CN107799160A (en) * 2017-10-26 2018-03-13 医渡云(北京)技术有限公司 Medication aid decision-making method and device, storage medium, electronic equipment
CN109473152A (en) * 2018-09-07 2019-03-15 大连诺道认知医学技术有限公司 Lookup method, device and the electronic equipment of similar case history
CN109545382A (en) * 2018-10-30 2019-03-29 平安科技(深圳)有限公司 A kind of identical case recognition methods and calculating equipment based on big data
CN110517785A (en) * 2019-08-28 2019-11-29 北京百度网讯科技有限公司 Lookup method, device and the equipment of similar case
CN111402973A (en) * 2020-03-02 2020-07-10 平安科技(深圳)有限公司 Information matching analysis method and device, computer system and readable storage medium
CN111414393A (en) * 2020-03-26 2020-07-14 湖南科创信息技术股份有限公司 Semantic similar case retrieval method and equipment based on medical knowledge graph

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113241136A (en) * 2021-05-17 2021-08-10 哈尔滨工业大学(深圳) Similar case analysis method and system
CN116564539A (en) * 2023-07-10 2023-08-08 神州医疗科技股份有限公司 Medical similar case recommending method and system based on information extraction and entity normalization
CN116564539B (en) * 2023-07-10 2023-10-24 神州医疗科技股份有限公司 Medical similar case recommending method and system based on information extraction and entity normalization
CN117690581A (en) * 2023-12-13 2024-03-12 江苏济远医疗科技有限公司 Disease inquiry process auxiliary information generation method based on large language model

Similar Documents

Publication Publication Date Title
CN111414393B (en) Semantic similar case retrieval method and equipment based on medical knowledge graph
CN109299239B (en) ES-based electronic medical record retrieval method
CN107341264B (en) Electronic medical record retrieval system and method supporting user-defined entity
CN107656952B (en) The modeling method of parallel intelligence case recommended models
Alicante et al. Unsupervised entity and relation extraction from clinical records in Italian
CN110109887B (en) Data retrieval method, electronic device, and computer storage medium
CN109753516B (en) Method for sorting medical record search results and related device
CN112635072A (en) ICU (intensive care unit) similar case retrieval method and system based on similarity calculation and storage medium
CN104572675B (en) A kind of system and method for similar case history retrieval
US20030167252A1 (en) Topic identification and use thereof in information retrieval systems
US20140344274A1 (en) Information structuring system
CN110299209B (en) Similar medical record searching method, device and equipment and readable storage medium
JP7464800B2 (en) METHOD AND SYSTEM FOR RECOGNITION OF MEDICAL EVENTS UNDER SMALL SAMPLE WEAKLY LABELING CONDITIONS - Patent application
Gerstmair et al. Intelligent image retrieval based on radiology reports
Cao et al. Multi-information source hin for medical concept embedding
Névéol et al. Automatic indexing of online health resources for a French quality controlled gateway
EP3262533A1 (en) Method and system for context-sensitive assessment of clinical findings
Wijewickrema et al. Selecting a text similarity measure for a content-based recommender system: A comparison in two corpora
CN112071431B (en) Clinical path automatic generation method and system based on deep learning and knowledge graph
Gobeill et al. Question answering for biology and medicine
CN115631823A (en) Similar case recommendation method and system
CN114098638A (en) Interpretable dynamic disease severity prediction method
Zhang et al. Extraction of English Drug Names Based on Bert-CNN Mode.
CN112712866A (en) Method and device for determining text information similarity
Deshpande et al. Multimodal Ranked Search over Integrated Repository of Radiology Data Sources.

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 116000 room 206, no.8-9, software garden road, Ganjingzi District, Dalian City, Liaoning Province

Applicant after: Neusoft Education Technology Group Co.,Ltd.

Address before: 116000 room 206, no.8-9, software garden road, Ganjingzi District, Dalian City, Liaoning Province

Applicant before: Dalian Neusoft Education Technology Group Co.,Ltd.