CN111414393B - Semantic similar case retrieval method and equipment based on medical knowledge graph - Google Patents
- Publication number
- CN111414393B (application CN202010221246.7A)
- Authority
- CN
- China
- Prior art keywords
- case
- entity
- similarity
- matching
- representing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/313—Selection or weighting of terms for indexing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Abstract
The invention discloses a semantic similar case retrieval method and equipment based on a medical knowledge graph. The method comprises the following steps: acquiring an electronic case that meets the requirements of the case content specification; structuring the electronic case text in combination with a medical knowledge graph to obtain a structured electronic case with unified standard terms; calculating the similarity between the structured electronic case and each case in the library from the content matching degree and the scale similarity; and sorting and outputting the cases in the library according to the calculated similarity. The similarity calculation combines alignment against the medical knowledge graph with a semantic similarity calculation model built on the semantic network of the knowledge graph. It considers both the number of matches and the matching metric value, so similarity reflects both how many entities match and how closely they match, which improves the granularity and accuracy of similar-case matching.
Description
Technical Field
The invention relates to the field of similar case retrieval, in particular to a semantic similar case retrieval method and semantic similar case retrieval equipment based on a medical knowledge graph.
Background
With the development of computer technology, retrieval has become a common means of acquiring information in daily life. In the medical field, similar case retrieval is of great significance to both scientific research and clinical practice. Past similar cases help doctors diagnose and analyze the current case more accurately, and the treatment schemes of similar cases can inform the treatment plan for the current case, shortening the patient's recovery period and improving treatment efficiency.
A knowledge graph, also called a scientific knowledge map and known in library and information science as knowledge domain visualization or knowledge domain mapping, is a family of graphs that show how knowledge develops and how it is structured. It uses visualization techniques to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the relations among knowledge items. Since knowledge graphs were proposed and established, information can be searched and queried more conveniently, clearly and accurately, and more and more industries are building domain-specific knowledge graphs, such as medical knowledge graphs.
The traditional similar case retrieval method searches the library using medical features extracted from the input text and returns the matched similar cases. However, the complicated relations among medical features often make the definition of similarity inaccurate, so the retrieval granularity is coarse and the results are inaccurate.
Disclosure of Invention
The invention provides a semantic similar case retrieval method based on a medical knowledge graph, which aims to solve the problems of inaccurate definition, coarse retrieval granularity and inaccurate retrieval in the conventional similar case retrieval.
The technical scheme adopted by the invention is as follows:
A semantic similar case retrieval method based on a medical knowledge graph comprises the following steps:
acquiring an electronic case meeting the requirements of the case content specification;
structuring the electronic case text in combination with a medical knowledge graph to obtain a structured electronic case with unified standard terms;
calculating the similarity between the structured electronic case and the case in the library according to the content matching degree and the scale similarity;
and sorting and outputting the cases in the library according to the calculated similarity.
As a possible embodiment, the electronic case meeting the requirements of the case content specification includes basic information of the patient including the name, sex, age and marital status of the patient and basic health information including chief complaints, current medical history, past medical history, personal history, family history and physical examination.
As a possible embodiment, structuring the electronic case text in combination with the medical knowledge graph to obtain a structured electronic case with unified standard terms specifically includes the steps of:
extracting a medical entity from the basic health information of the patient by using an entity extraction model;
aligning and standardizing the extracted medical entity with the medical knowledge graph, and aligning the non-professional term expression with the professional term expression to obtain the medical entity with standard terms;
and classifying the medical entities with the standard terms according to preset entity categories to obtain the structured electronic cases with the uniform standard terms.
As a feasible embodiment, the entity extraction model is the named entity recognition model BiLSTM-CRF, trained on electronic case text; when the extracted medical entities are aligned with and standardized against the medical knowledge graph, an encoder-decoder translation model, BiLSTM-attention, is used, trained on the unified, standardized medical term system in the medical knowledge graph.
As a possible embodiment, the preset entity categories are obtained by classifying several kinds of medical clinical features according to the source of the entity and whether the entity is negative or positive. The categories include: chief symptoms, chief signs, non-chief symptoms, non-chief signs, current illness, historical illness, current causes, historical causes, familial illness, current medications, historical medications, current surgery, historical surgery, current examination items, historical examination items, current examination results, historical examination results, current physical examination, historical physical examination, current occupation, historical occupation, physical constitution, and physical condition. The medical clinical features include chief symptoms, chief signs, non-chief symptoms, non-chief signs, illness, causes, surgery, medication, physical condition, physical constitution, occupation, physical examination, examination items, and examination results. The negative or positive status of an entity indicates whether the entity is present: positive means present, and negative means absent.
As a possible embodiment, the calculating the similarity between the structured electronic case and the case in the library according to the content matching degree and the scale similarity specifically comprises the following steps:
calculating the content matching degree between the structured electronic case and the case in the library, wherein the content matching degree is obtained by dividing the entity matching score between the structured electronic case and the case in the library by the total entity score of the structured electronic case:
$$M = \frac{S_1}{S_2} = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n_i} w_i f_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n_i} w_i}$$
where $M$ represents the content matching degree; $S_1$ represents the entity matching score between the structured electronic case and the case in the library; $S_2$ represents the total entity score of the structured electronic case; $w_i$ represents the weight of the $i$-th entity category; $m$ represents the total number of entity types; $i$ represents the ordinal of the entity type currently traversed; $n_i$ represents the total number of entities in the $i$-th entity type; $j$ represents the ordinal of the entity currently traversed; and $f$ is the matching factor, representing the result of entity matching, with a value from 0 to 1: $f$ equals 1 on a complete match and 0 on a complete mismatch. The matching factor $f$ between any two entities is calculated from the tree structure formed by the subordination relations between entities in the medical knowledge graph:
$$f_{ab} = \frac{1}{1+n}$$
where $n$ is the distance at which b is found when walking from entity a toward the root node, or at which a is found when walking from entity b toward the root node (here $n$ denotes a distance, not the entity count used above). If neither is found, the distance $n$ is infinite and the matching factor between entity a and entity b is 0; if a = b, the distance $n$ is 0 and the matching factor is 1;
calculating the similarity of the scale of the structured electronic case and the scale of the case in the library, wherein the calculation formula is as follows:
$$C = \frac{N_1}{N_2}, \quad N_2 \ge N_1$$
where $C$ represents the scale similarity, $N_1$ represents the total number of entities in the case with fewer entities, and $N_2$ represents the total number of entities in the case with more entities;
calculating the similarity between the structured electronic case and the case in the library according to the formula
As a possible embodiment, the sorting and outputting the cases in the library according to the calculated similarity includes:
obtaining, according to the calculated similarity, a list of the cases in the library sorted from highest to lowest similarity to the structured electronic case;
and traversing the case list, filtering the case list according to a preset similarity threshold t, and sequentially storing cases in the library with the similarity greater than or equal to the similarity threshold t into a final return list and outputting the cases.
A semantic similar case retrieval device based on a medical knowledge graph comprises:
the case acquisition module, used for acquiring the electronic case meeting the requirements of the case content specification;
the case structuring module, used for structuring the electronic case text in combination with a medical knowledge graph to obtain a structured electronic case with unified standard terms;
the similarity calculation module, used for calculating the similarity between the structured electronic case and the case in the library according to the content matching degree and the scale similarity;
and the output module, used for sorting and outputting the cases in the library according to the calculated similarity.
A storage medium comprising a stored program which, when executed, controls a device on which the storage medium is located to perform the method for semantic similar case retrieval based on medical knowledge-graph.
An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the medical knowledge-graph based semantic similar case retrieval method as described when executing the program.
Compared with the prior art, the invention has the following beneficial effects:
The invention extracts the clinical manifestation information from the case content with an entity extraction model and structures the electronic case text in combination with the medical knowledge graph, yielding a structured electronic case with unified standard terms. It then calculates the similarity between the structured electronic case and each case in the library from the content matching degree and the scale similarity, and obtains and outputs the similar cases in the library. This ensures that unstructured cases are converted into structured cases correctly and in a standardized way. In addition, because the similarity is calculated with a semantic similarity calculation model built on the semantic network of the knowledge graph, both the number of matches and the matching metric value are considered. Similarity therefore reflects not only how many entities match but also how closely they match, which improves the granularity and accuracy of similar-case matching.
In addition to the objects, features and advantages described above, other objects, features and advantages of the present invention are also provided. The present invention will be described in further detail below with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
Fig. 1 is a flow chart of the semantic similar case retrieval method based on a medical knowledge graph according to the preferred embodiment of the present invention.
Fig. 2 is a sample of the basic health information in an electronic case.
Fig. 3 is a schematic diagram of the structured extraction of an electronic case according to the preferred embodiment of the present invention.
Fig. 4 is a diagram illustrating the effect of normalizing a structured electronic case according to the present invention.
Fig. 5 is a schematic diagram of the interface that outputs similar cases according to the preferred embodiment of the present invention.
Fig. 6 is a schematic diagram of the tree structure of subordination relations between entities in a medical knowledge graph.
Detailed Description
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
As shown in Fig. 1, a semantic similar case retrieval method based on a medical knowledge graph includes the steps:
S1, acquiring the electronic case meeting the requirements of the case content specification;
S2, structuring the electronic case text in combination with a medical knowledge graph to obtain a structured electronic case with unified standard terms;
S3, calculating the similarity between the structured electronic case and the case in the library according to the content matching degree and the scale similarity;
S4, sorting and outputting the cases in the library according to the calculated similarity.
This embodiment extracts the clinical manifestation information from the case content with an entity extraction model and structures the electronic case text in combination with the medical knowledge graph, yielding a structured electronic case with unified standard terms. It then calculates the similarity between the structured electronic case and each case in the library from the content matching degree and the scale similarity, and obtains and outputs the similar cases in the library. This ensures that unstructured cases are converted into structured cases correctly and in a standardized way. In addition, because the similarity is calculated with a semantic similarity calculation model built on the semantic network of the knowledge graph, both the number of matches and the matching metric value are considered. Similarity therefore reflects not only how many entities match but also how closely they match, which improves the granularity and accuracy of similar-case matching.
As a possible embodiment, as shown in Table 1, the electronic case meeting the requirements of the case content specification includes basic patient information and basic health information. The basic patient information includes the patient's name, sex, age, marital status, and occupation; the basic health information includes chief complaints, current medical history, past medical history, personal history, family history, and physical examination. The basic patient information is structured, whereas the basic health information is unstructured text that requires further structured extraction.
Table 1. Basic patient information and basic health information
An example of the basic health information in an electronic case is shown in Fig. 2.
As a possible embodiment, structuring the electronic case text in combination with the medical knowledge graph to obtain a structured electronic case with unified standard terms specifically includes the steps of:
S21, extracting medical entities from the basic health information of the patient with an entity extraction model, wherein the entity extraction model is the named entity recognition model BiLSTM-CRF, trained on electronic case text;
S22, aligning the extracted medical entities with the medical knowledge graph and standardizing them, so that non-professional expressions are aligned with professional terms and medical entities with standard terms are obtained, wherein the alignment uses an encoder-decoder translation model, BiLSTM-attention, trained on the unified, standardized medical term system in the medical knowledge graph;
S23, classifying the medical entities with standard terms according to the preset entity categories to obtain the structured electronic case with unified standard terms.
As shown in Table 2, the preset entity categories are obtained by classifying several kinds of medical clinical features according to the source of the entity and whether the entity is negative or positive. The categories include: chief symptoms, chief signs, non-chief symptoms, non-chief signs, current illness, historical illness, current causes, historical causes, familial illness, current medications, historical medications, current surgery, historical surgery, current examination items, historical examination items, current examination results, historical examination results, current physical examination, historical physical examination, current occupation, historical occupation, physical constitution, and physical condition. The medical clinical features include chief symptoms, chief signs, non-chief symptoms, non-chief signs, illness, causes, surgery, medication, physical condition, physical constitution, occupation, physical examination, examination items, and examination results. The negative or positive status of an entity indicates whether the entity is present: positive means present, and negative means absent.
Table 2 entity classes for structured extraction
In the above embodiment, the structured extraction process is mainly divided into two steps, the first step is to extract an entity from the basic health information of the patient; and secondly, classifying the extracted entities according to the classification rules in the table 2.
After structured extraction is performed on the basic health information of the patient, the extraction result shown in fig. 3 is obtained.
To normalize the extracted entities, all entities are further structured and aligned (non-professional expressions are aligned with professional expressions, since every entity in the knowledge graph is expressed in professional terms). The effect of this normalization is shown in Fig. 4: first, the entities are converted into a structured representation in JSON format; second, the entities are converted into the standard terms of the medical knowledge graph by alignment. For example, an expression is described as "iterating" because the knowledge graph has no "reappearance" term. After all entities have been extracted and normalized, they are classified and organized according to the classification in Table 2, which finally yields the structured electronic case with normalized terms.
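The normalization and classification steps can be illustrated with a small Python sketch. It is not the method of the embodiment: the learned BiLSTM-attention alignment model is replaced here by a plain lookup table, and the alias entries, the `term` and `positive` field names, and the function names are all hypothetical. Only the idea of mapping lay expressions to standard terms, grouping entities by category, and emitting JSON comes from the text.

```python
import json
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical alias table standing in for the learned alignment model.
ALIASES: Dict[str, str] = {
    "keeps coughing": "repeated cough",
    "dry hacking cough": "dry cough",
}


def normalize(term: str, aliases: Dict[str, str]) -> str:
    """Map a lay expression to its standard term; unknown terms pass through."""
    cleaned = term.strip()
    return aliases.get(cleaned.lower(), cleaned)


def structure_case(
    entities: List[Tuple[str, str, bool]], aliases: Dict[str, str]
) -> Dict[str, List[dict]]:
    """Group (category, term, positive) triples by entity category."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for category, term, positive in entities:
        grouped[category].append(
            {"term": normalize(term, aliases), "positive": positive}
        )
    return dict(grouped)


extracted = [
    ("chief symptoms", "Keeps coughing", True),
    ("historical illness", "hypertension", False),
]
structured = structure_case(extracted, ALIASES)
print(json.dumps(structured, indent=2))
```

A lookup table cannot generalize to unseen expressions, which is presumably why the embodiment trains a translation model instead; the table only makes the data flow visible.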
As a possible embodiment, the calculating the similarity between the structured electronic case and the case in the library according to the content matching degree and the scale similarity specifically comprises the following steps:
calculating the content matching degree between the structured electronic case and the case in the library, wherein the content matching degree is obtained by dividing the entity matching score between the structured electronic case and the case in the library by the total entity score of the structured electronic case:
$$M = \frac{S_1}{S_2} = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n_i} w_i f_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n_i} w_i}$$
where $M$ represents the content matching degree; $S_1$ represents the entity matching score between the structured electronic case and the case in the library; $S_2$ represents the total entity score of the structured electronic case; $w_i$ represents the weight of the $i$-th entity category; $m$ represents the total number of entity types; $i$ represents the ordinal of the entity type currently traversed; $n_i$ represents the total number of entities in the $i$-th entity type; $j$ represents the ordinal of the entity currently traversed; and $f$ is the matching factor, representing the result of entity matching, with a value from 0 to 1: $f$ equals 1 on a complete match and 0 on a complete mismatch. Entities in the knowledge graph are linked by subordination relations and therefore form a tree structure, as shown in Fig. 6, so the matching factor $f$ between any two entities is calculated from the tree structure formed by the subordination relations between entities in the medical knowledge graph:
$$f_{ab} = \frac{1}{1+n}$$
where $n$ is the distance at which b is found when walking from entity a toward the root node, or at which a is found when walking from entity b toward the root node (here $n$ denotes a distance, not the entity count used above). If neither is found, the distance $n$ is infinite and the matching factor between entity a and entity b is 0; if a = b, the distance $n$ is 0 and the matching factor is 1. In the tree structure shown in Fig. 6, irritant dry cough belongs to dry cough, dry cough belongs to cough, cough belongs to respiratory system symptoms, and repeated cough belongs to cough. If entity a is irritant dry cough and entity b is cough, the distance $n$ from a to b is 2 and the matching factor is $f = 1/(1+2) = 1/3$. If entity a is repeated cough and entity b is cough, the distance $n$ from a to b is 1 and the matching factor is $f = 1/(1+1) = 1/2$;
calculating the similarity of the scale of the structured electronic case and the scale of the case in the library, wherein the calculation formula is as follows:
$$C = \frac{N_1}{N_2}, \quad N_2 \ge N_1$$
where $C$ represents the scale similarity, $N_1$ represents the total number of entities in the case with fewer entities, and $N_2$ represents the total number of entities in the case with more entities;
calculating the similarity between the structured electronic case and the case in the library according to the formula
As a possible embodiment, the sorting and outputting the cases in the library according to the calculated similarity includes:
obtaining, according to the calculated similarity, a list of the cases in the library sorted from highest to lowest similarity to the structured electronic case;
and traversing the case list, filtering the case list according to a preset similarity threshold t (default to 0.5), and sequentially storing cases with similarity greater than or equal to the similarity threshold t in a final return list and outputting the cases.
In the above embodiment, the weight w of each entity category is determined by several professional doctors on the basis of years of medical experience, and the accuracy of the model can be tuned through these weight parameters. The weights corresponding to the entity categories are shown in Table 3.
TABLE 3 weight definition and initialization values for entity classes
The similarity in this embodiment is calculated in two parts, the content matching degree and the scale similarity, which are the two dimensions the similarity calculation considers: both the number of matches and the matching metric value are taken into account, so similarity is defined not only by how many entities match but also by how closely they match.
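The two quantities can be sketched in Python as follows. This is an illustration under stated assumptions rather than the embodiment's code: it reads the definitions as S1 being the weighted sum of matching factors and S2 being the weighted entity count, it takes the matching factors as already computed, and it returns 0 for empty inputs. The function names and the example weights are hypothetical (the Table 3 values are not reproduced in the text), and the sketch deliberately stops short of combining M and C into a final score.

```python
from typing import Dict, List


def content_matching_degree(
    weights: Dict[str, float], factors: Dict[str, List[float]]
) -> float:
    """M = S1 / S2 with S1 = sum_i w_i * sum_j f_ij and S2 = sum_i w_i * n_i.

    `factors[category]` holds one matching factor (0..1) per query entity.
    """
    s1 = sum(weights[cat] * sum(fs) for cat, fs in factors.items())
    s2 = sum(weights[cat] * len(fs) for cat, fs in factors.items())
    return s1 / s2 if s2 else 0.0  # assumption: M is 0 for a case with no entities


def scale_similarity(count_a: int, count_b: int) -> float:
    """C = N1 / N2, where N1 is the smaller and N2 the larger entity count."""
    n1, n2 = sorted((count_a, count_b))
    return n1 / n2 if n2 else 0.0  # assumption: C is 0 when both cases are empty


# Hypothetical weights; the factors 1/3 and 1/2 are the two values worked out in the text.
weights = {"chief symptoms": 2.0, "current illness": 1.0}
factors = {"chief symptoms": [1 / 3, 1 / 2], "current illness": [1.0]}

m = content_matching_degree(weights, factors)
c = scale_similarity(3, 4)
print(round(m, 4), c)  # 0.5333 0.75
```

Here S1 = 2 × (1/3 + 1/2) + 1 × 1 = 8/3 and S2 = 2 × 2 + 1 × 1 = 5, so M = 8/15. Raising the weight of a category makes partial matches in that category count for more, which is the tuning knob the embodiment describes.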
The advantages of similarity calculation in this embodiment include:
(1) Based on the similarity model, 17 classes of matching factors are provided, and chief-complaint symptoms are distinguished from non-chief-complaint symptoms.
(2) The matching value of two entities such as "cough" and "repeated cough" is calculated from semantic relations in the knowledge graph, so it is neither 1 nor 0. Repeated cough belongs to cough: because the former carries the additional attribute "repeated", the two stand in an upper-lower (hypernym-hyponym) relation in the knowledge graph.
(3) Abundant weight parameters are provided. Different people understand similarity differently, and the accuracy of the model can be tuned through the weight parameters.
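The knowledge-graph-based matching value in item (2) can be made concrete with a short Python sketch of the matching factor f = 1/(1+n). It is a minimal illustration, not the embodiment's implementation: storing the tree as a child-to-parent dictionary and the names `PARENT`, `steps_up_to`, and `matching_factor` are assumptions, and the tree contains only the four subordination relations named in the description of Fig. 6.

```python
import math
from typing import Dict, Optional

# Assumed storage of the subordination tree: child term -> parent term.
PARENT: Dict[str, str] = {
    "irritant dry cough": "dry cough",
    "dry cough": "cough",
    "repeated cough": "cough",
    "cough": "respiratory system symptoms",
}


def steps_up_to(parent: Dict[str, str], start: str, target: str) -> float:
    """Count upward steps from `start` until `target` is reached; inf if never reached."""
    steps = 0
    node: Optional[str] = start
    while node is not None:
        if node == target:
            return float(steps)
        node = parent.get(node)  # becomes None once the root has been passed
        steps += 1
    return math.inf


def matching_factor(parent: Dict[str, str], a: str, b: str) -> float:
    """f_ab = 1 / (1 + n); 0 when neither entity is an ancestor of the other."""
    n = min(steps_up_to(parent, a, b), steps_up_to(parent, b, a))
    return 0.0 if math.isinf(n) else 1.0 / (1.0 + n)


print(matching_factor(PARENT, "irritant dry cough", "cough"))  # n = 2, f = 1/3
print(matching_factor(PARENT, "repeated cough", "cough"))      # n = 1, f = 1/2
print(matching_factor(PARENT, "dry cough", "repeated cough"))  # siblings, f = 0
```

Note that siblings such as dry cough and repeated cough score 0 under this definition, because the distance is only measured along the path to the root and not through a common ancestor. The sketch assumes the parent links contain no cycles.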
In the above embodiment, every output similar case carries a similarity value in the range 0 to 1, where 1 indicates that the cases are completely consistent and 0 indicates that they are completely different. The higher the threshold, the more similar the output cases are to the query. Because cases with higher similarity have higher reference value, the above embodiment filters the similar cases with a similarity threshold t, both to reduce the number of output cases and to let the doctor focus on the top-ranked cases: cases whose similarity is lower than t are filtered out, and only cases whose similarity is greater than or equal to t are output. The similarity threshold t selected in the above embodiment is 0.5. The filtered similar case information is output as shown in Fig. 5.
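The sorting and threshold filtering step can be sketched in a few lines of Python. The function name `rank_and_filter`, the case identifiers, and the scores below are hypothetical; the default threshold of 0.5 and the "greater than or equal to" comparison follow the description.

```python
from typing import Dict, List, Tuple


def rank_and_filter(
    similarities: Dict[str, float], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Sort library cases by similarity (high to low) and keep those >= threshold."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [(case_id, score) for case_id, score in ranked if score >= threshold]


scores = {"case-a": 0.4, "case-b": 0.9, "case-c": 0.5}
print(rank_and_filter(scores))  # [('case-b', 0.9), ('case-c', 0.5)]
```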
Another embodiment of the present invention provides a semantic similar case retrieval apparatus based on a medical knowledge graph, including:
the case acquisition module is used for acquiring the electronic case meeting the case content standard requirement;
the case structuring module is used for structuring the electronic case text in combination with a medical knowledge graph to obtain a structured electronic case with unified standard terms;
the similarity calculation module is used for calculating the similarity between the structured electronic case and the case in the library according to the content matching degree and the scale similarity;
and the output module is used for sorting and outputting the cases in the library according to the calculated similarity.
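One way to picture how the four modules fit together is the Python sketch below. The class name `RetrievalDevice`, its field names, and the toy stand-ins in the usage example are all hypothetical. In particular, the Jaccard overlap used as the similarity function is only a placeholder so that the example runs; it is not the content matching degree and scale similarity calculation of this embodiment.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class RetrievalDevice:
    acquire: Callable[[str], str]            # case acquisition module
    structure: Callable[[str], Any]          # case structuring module
    similarity: Callable[[Any, Any], float]  # similarity calculation module
    library: Dict[str, Any]                  # structured cases already in the library
    threshold: float = 0.5

    def retrieve(self, case_id: str) -> List[Tuple[str, float]]:
        """Output module: score every library case, sort, and filter by threshold."""
        query = self.structure(self.acquire(case_id))
        scores = {cid: self.similarity(query, case) for cid, case in self.library.items()}
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        return [(cid, score) for cid, score in ranked if score >= self.threshold]


# Toy stand-ins so the sketch runs end to end (all hypothetical).
device = RetrievalDevice(
    acquire=lambda case_id: "cough fever chills",
    structure=lambda text: set(text.split()),
    similarity=lambda q, c: len(q & c) / len(q | c),  # placeholder, not the patent's measure
    library={"case-1": {"cough", "fever"}, "case-2": {"headache"}},
)
print(device.retrieve("query-1"))
```

Passing the modules in as callables keeps each one replaceable, for example swapping the placeholder similarity for a function built from the content matching degree and the scale similarity.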
Another embodiment of the present invention provides a storage medium comprising a stored program, wherein when the program runs, a device in which the storage medium is located is controlled to execute the semantic similar case retrieval method based on the medical knowledge-graph.
Another embodiment of the present invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the medical knowledge-graph-based semantic similar case retrieval method.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.
The functions of the method of the present embodiment, if implemented in the form of software functional units and sold or used as independent products, may be stored in one or more storage media readable by a computing device. Based on this understanding, the part of the embodiments of the present invention that contributes to the prior art, or part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device, a network device, or the like) to execute all or part of the steps of the method described in the embodiments of the present invention. The aforementioned storage media include a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
The embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be understood by reference to one another.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (9)
1. A semantic similar case retrieval method based on a medical knowledge graph is characterized by comprising the following steps:
acquiring an electronic case meeting the requirements of case content specifications;
structuring the electronic case text in combination with a medical knowledge graph to obtain a structured electronic case with unified standard terms;
calculating the similarity between the structured electronic case and the case in the library according to the content matching degree and the scale similarity;
sorting and outputting the cases in the library according to the calculated similarity;
the step of calculating the similarity between the structured electronic case and the case in the library according to the content matching degree and the scale similarity specifically comprises the following steps:
calculating a content matching degree of the structured electronic case and the case in the library, wherein the content matching degree is obtained by dividing the entity matching score of the structured electronic case and the case in the library by the total entity score of the structured electronic case:
$M = S_1 / S_2$
wherein $M$ represents the content matching degree, $S_1$ represents the entity matching score of the structured electronic case and the case in the library, $S_2$ represents the total entity score of the structured electronic case, $w$ represents the entity category weight, $m$ represents the total number of entity types, $i$ represents the ordinal number of the entity type currently traversed, $n$ represents the total number of entities of the $i$-th entity type, $j$ represents the ordinal number of the entity currently traversed, and $f$ represents the entity matching factor, whose value lies between 0 and 1: the matching factor is 1 if the entities match completely and 0 if the match fails completely; the matching factor $f$ between any two entities is calculated based on a tree structure formed by the subordinate relations between the entities in the medical knowledge graph:
$f_{ab} = 1/(1+n)$
wherein $n$ is the distance traversed when b is found by searching from entity a toward the root node, or when a is found by searching from entity b toward the root node; if neither search succeeds, the distance $n$ is infinite and the matching factor between entity a and entity b is 0; if a = b, the distance $n$ is 0 and the matching factor is 1;
calculating the scale similarity of the structured electronic case and the case in the library, wherein the calculation formula is as follows:
$C = N_1 / N_2, \quad N_2 \ge N_1$
wherein $C$ represents the scale similarity, $N_1$ represents the total number of entities in the case with fewer entities, and $N_2$ represents the total number of entities in the case with more entities;
calculating the similarity between the structured electronic case and the case in the library according to the formula
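As an illustration of the calculations recited in claim 1, the following Python sketch computes the matching factor f_ab, the content matching degree M and the scale similarity C. It rests on several assumptions that are not in the claim text: the knowledge-graph hierarchy is given as a hypothetical child-to-parent dictionary `PARENT`; S1 is read as the sum of w_i * f_ij over the entities of the query case and S2 as the sum of w_i, which is only one plausible reading of the symbol definitions; each query entity is matched against the best-scoring library entity of the same category; and all function names are invented for the example. The formula that combines M and C into the final similarity is not reproduced in this text, so the sketch stops at M and C.

```python
from typing import Dict, List, Optional

# Hypothetical toy hierarchy (child -> parent); the root has parent None.
PARENT: Dict[str, Optional[str]] = {
    "disease": None,
    "respiratory disease": "disease",
    "pneumonia": "respiratory disease",
    "viral pneumonia": "pneumonia",
    "asthma": "respiratory disease",
}

Case = Dict[str, List[str]]  # assumed layout: entity category -> standard terms


def steps_up(start: str, target: str, parent: Dict[str, Optional[str]]) -> Optional[int]:
    """Edges climbed from `start` toward the root until `target` is met, else None."""
    node: Optional[str] = start
    steps = 0
    while node is not None:
        if node == target:
            return steps
        node = parent.get(node)  # unknown entities have no parent
        steps += 1
    return None


def matching_factor(a: str, b: str, parent: Dict[str, Optional[str]]) -> float:
    """f_ab = 1 / (1 + n); 0 when neither entity lies on the other's path to the root."""
    for n in (steps_up(a, b, parent), steps_up(b, a, parent)):
        if n is not None:
            return 1.0 / (1 + n)
    return 0.0


def content_matching_degree(query: Case, library_case: Case,
                            weights: Dict[str, float],
                            parent: Dict[str, Optional[str]]) -> float:
    """M = S1 / S2 under the assumed reading S1 = sum(w_i * f_ij), S2 = sum(w_i)."""
    s1 = 0.0
    s2 = 0.0
    for category, entities in query.items():
        w = weights.get(category, 1.0)  # assumption: unweighted categories count as 1
        candidates = library_case.get(category, [])
        for entity in entities:
            best = max((matching_factor(entity, c, parent) for c in candidates),
                       default=0.0)
            s1 += w * best
            s2 += w
    return s1 / s2 if s2 else 0.0


def scale_similarity(query: Case, library_case: Case) -> float:
    """C = N1 / N2 with N2 >= N1; returns 0 for two empty cases (an assumption)."""
    n_query = sum(len(v) for v in query.values())
    n_library = sum(len(v) for v in library_case.values())
    n1, n2 = sorted((n_query, n_library))
    return n1 / n2 if n2 else 0.0
```

With this toy hierarchy, "viral pneumonia" and "pneumonia" are one step apart, so their matching factor is 1/2, while "pneumonia" and "asthma" are siblings rather than ancestor and descendant, so their factor is 0.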
2. The semantic similar case retrieval method based on a medical knowledge graph according to claim 1, wherein the electronic case meeting the case content specification comprises basic patient information and basic health information; the basic patient information comprises the patient's name, gender, age and marital status, and the basic health information comprises the chief complaint, history of present illness, past medical history, personal history, family history and physical examination.
3. The semantic similar case retrieval method based on a medical knowledge graph according to claim 2, wherein the step of structuring the electronic case text in combination with the medical knowledge graph to obtain the structured electronic case with unified standard terms comprises:
extracting a medical entity from the basic health information of the patient by using an entity extraction model;
aligning and standardizing the extracted medical entities against the medical knowledge graph, mapping non-professional expressions to the corresponding professional terms, to obtain medical entities with standard terms;
and classifying the medical entities with the standard terms according to preset entity categories to obtain the structured electronic cases with the uniform standard terms.
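The three steps of claim 3 can be pictured with a small Python sketch of the data flow. It is only a stand-in: a hypothetical lookup table `STANDARD_TERMS` replaces the learned alignment against the knowledge graph, the extraction step is assumed to have already produced (mention, category) pairs, and the function names are invented for the example.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical lookup table standing in for knowledge-graph term alignment.
STANDARD_TERMS: Dict[str, str] = {
    "high blood pressure": "hypertension",
    "heart attack": "myocardial infarction",
}


def normalize_term(mention: str) -> str:
    """Map a lay expression to its standard term; unknown mentions pass through."""
    key = " ".join(mention.lower().split())  # lowercase and collapse whitespace
    return STANDARD_TERMS.get(key, key)


def structure_case(mentions: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group normalized entities by preset category, dropping repeats within a category."""
    structured: Dict[str, List[str]] = {}
    for mention, category in mentions:
        term = normalize_term(mention)
        bucket = structured.setdefault(category, [])
        if term not in bucket:
            bucket.append(term)
    return structured
```

For example, the mentions "High blood pressure" and "hypertension" under the category "historical illness" collapse into the single standard entity "hypertension".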
4. The medical knowledge graph-based semantic similar case retrieval method according to claim 3, wherein the entity extraction model is a BiLSTM-CRF named entity recognition model trained on electronic case texts; and the alignment and standardization of the extracted medical entities against the medical knowledge graph uses a BiLSTM-attention translation model based on an encoder-decoder architecture, trained on the unified, standardized medical terminology system in the medical knowledge graph.
5. The medical knowledge-graph-based semantic similar case retrieval method according to claim 4, wherein the preset entity categories are obtained by subdividing a plurality of medical clinical feature classes according to the source of each entity and the polarity (negative or positive) of each entity, and comprise: chief symptoms, chief signs, non-chief symptoms, non-chief signs, current illness, historical illness, current causes, historical causes, familial illness, current medications, historical medications, current surgery, historical surgery, current examination items, historical examination items, current examination results, historical examination results, current physical examination, historical physical examination, current occupation, historical occupation, physical constitution and physical condition; the plurality of medical clinical feature classes comprise chief symptoms, chief signs, non-chief symptoms, non-chief signs, illness, causes, surgery, medication, physical condition, physical constitution, occupation, physical examination, examination items and examination results; the polarity of an entity indicates whether the entity exists: positive indicates that the entity is present, and negative indicates that it is absent.
6. The semantic similar case retrieval method based on medical knowledge-graph according to claim 1, wherein the sorting and outputting the cases in the library according to the calculated similarity comprises:
obtaining, according to the calculated similarity, a list of the cases in the library ordered from highest to lowest similarity to the structured electronic case;
and traversing the case list, filtering it according to a preset similarity threshold t, and storing, in order, the cases in the library whose similarity is greater than or equal to the threshold t in a final return list, which is then output.
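The sorting and threshold filtering of claim 6 can be sketched in a few lines of Python. The function name `rank_and_filter` and the representation of results as (case identifier, similarity) pairs are assumptions made for the example.

```python
from typing import List, Tuple


def rank_and_filter(scored: List[Tuple[str, float]], t: float) -> List[Tuple[str, float]]:
    """Sort cases from highest to lowest similarity, then keep those with similarity >= t."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    final: List[Tuple[str, float]] = []
    for case_id, similarity in ranked:  # traverse the ranked list in order
        if similarity >= t:
            final.append((case_id, similarity))
    return final
```

Because the list is already sorted, an implementation could also stop at the first case below t; the full traversal is kept here to mirror the wording of the claim.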
7. A semantic similar case retrieval device based on a medical knowledge graph, characterized by comprising:
the case acquisition module is used for acquiring the electronic case meeting the case content standard requirement;
the case structuring module is used for structuring the electronic case text in combination with a medical knowledge graph to obtain a structured electronic case with unified standard terms;
and the similarity calculation module is used for calculating the similarity between the structured electronic case and the case in the library according to the content matching degree and the scale similarity: calculating a content matching degree of the structured electronic case and the case in the library, wherein the content matching degree is obtained by dividing the entity matching score of the structured electronic case and the case in the library by the total entity score of the structured electronic case:
$M = S_1 / S_2$
wherein $M$ represents the content matching degree, $S_1$ represents the entity matching score of the structured electronic case and the case in the library, $S_2$ represents the total entity score of the structured electronic case, $w$ represents the entity category weight, $m$ represents the total number of entity types, $i$ represents the ordinal number of the entity type currently traversed, $n$ represents the total number of entities of the $i$-th entity type, $j$ represents the ordinal number of the entity currently traversed, and $f$ represents the entity matching factor, whose value lies between 0 and 1: the matching factor is 1 if the entities match completely and 0 if the match fails completely; the matching factor $f$ between any two entities is calculated based on a tree structure formed by the subordinate relations between the entities in the medical knowledge graph:
$f_{ab} = 1/(1+n)$
wherein $n$ is the distance traversed when b is found by searching from entity a toward the root node, or when a is found by searching from entity b toward the root node; if neither search succeeds, the distance $n$ is infinite and the matching factor between entity a and entity b is 0; if a = b, the distance $n$ is 0 and the matching factor is 1;
calculating the scale similarity of the structured electronic case and the case in the library, wherein the calculation formula is as follows:
$C = N_1 / N_2, \quad N_2 \ge N_1$
wherein $C$ represents the scale similarity, $N_1$ represents the total number of entities in the case with fewer entities, and $N_2$ represents the total number of entities in the case with more entities;
calculating the similarity between the structured electronic case and the case in the library according to the formula
and the output module is used for sorting and outputting the cases in the library according to the calculated similarity.
8. A storage medium comprising a stored program, characterized in that when the program is run, the apparatus on which the storage medium is located is controlled to execute the medical knowledge-graph based semantic similar case retrieval method according to any one of claims 1 to 6.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the medical knowledge graph-based semantic similar case retrieval method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010221246.7A CN111414393B (en) | 2020-03-26 | 2020-03-26 | Semantic similar case retrieval method and equipment based on medical knowledge graph |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111414393A (en) | 2020-07-14 |
CN111414393B (en) | 2021-02-23 |
Family
ID=71491424
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010221246.7A Active CN111414393B (en) | 2020-03-26 | 2020-03-26 | Semantic similar case retrieval method and equipment based on medical knowledge graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111414393B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN111414393A (en) | 2020-07-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||