CN116779137A - Data processing method and system based on medical knowledge graph - Google Patents
Data processing method and system based on medical knowledge graph
- Publication number
- CN116779137A (application number CN202310470069.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
Abstract
The application provides a data processing method and system based on a medical knowledge graph. Audio-visual data, medical examination data and disease diagnosis results recorded during a patient's medical treatment are acquired, and patient key information is extracted from the audio-visual data; a preset retrieval model determines a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data; a predicted evaluation value of the preset retrieval model is determined according to the disease diagnosis results and the diagnosis result set to be selected; and if the predicted evaluation value is lower than a preset threshold, parameters of the preset retrieval model are adjusted according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold. This reduces manual calibration, trains the retrieval model automatically, improves retrieval efficiency and retrieval accuracy, and addresses the technical problem of how to accurately search for a target entity in a complex medical knowledge graph.
Description
Technical Field
The application relates to the technical field of knowledge graphs, and in particular to a data processing method and system based on a medical knowledge graph.
Background
A knowledge graph is an advanced development of the semantic network. As a basic data service, it is widely used in many technical fields and provides underlying data support for various higher-level intelligent applications. As digitization of the medical field advances, the medical knowledge graph has become an important data support service for new medical systems, such as online Internet medical systems.
At present, most research focuses on techniques for constructing medical knowledge graphs, so medical knowledge graphs are becoming increasingly complete and complex. This poses new challenges for their application: how to search for a target entity quickly and accurately in a complex medical knowledge graph has become a technical problem to be solved.
Disclosure of Invention
The application provides a data processing method and a data processing system based on a medical knowledge graph, which aim to solve the technical problem of how to quickly and accurately search a target entity in a complex medical knowledge graph.
In a first aspect, the present application provides a data processing method based on a medical knowledge graph, including:
acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in the medical treatment process, and extracting key information of the patient in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of a preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than the preset threshold value, adjusting parameters of a preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
In one possible design, determining the set of candidate diagnostic results from the medical knowledge-graph, the patient key information, and the medical examination data using a preset retrieval model includes:
extracting texts corresponding to the key information of the patient by using a voice text converter, and extracting features of the texts by using a preset neural network model to obtain a plurality of word vectors;
combining the word vector and the medical examination data into a retrieval subgraph, matching the retrieval subgraph with a medical knowledge graph, and determining a plurality of matching entities and corresponding matching degrees;
and combining the matching entities with the matching degree larger than or equal to a preset threshold value into a diagnosis result set to be selected.
In one possible design, the patient key information includes: dialogue information between doctor and patient in the inquiry process;
extracting text corresponding to the key information of the patient by using a voice text converter, and extracting features of the text by using a preset neural network model to obtain a plurality of word vectors, wherein the method comprises the following steps:
extracting a first text set and a second text set in the key information of the patient by using a voice text converter, wherein the first text set is used for representing a disease description text of the patient, and the second text set is used for representing a doctor's inquiry text;
judging whether a preset trigger word appears in the second text set;
if yes, segmenting the first text set according to the triggering time corresponding to the preset triggering word to obtain at least two first sub-text sets;
and extracting features of the first sub-text set after the triggering time by using a preset neural network model, and determining a plurality of word vectors.
In one possible design, when there are multiple rounds of dialogues between the doctor and the patient in the inquiry process, the disease diagnosis result includes a plurality of intermediate diagnosis results, the intermediate diagnosis results correspond to the inquiry content of the doctor in the next round of dialogues, and the inquiry content contains new preset trigger words;
according to the disease diagnosis result and the to-be-selected diagnosis result set, determining a predicted evaluation value of a preset retrieval model, including:
and determining a predicted evaluation value corresponding to each dialog according to the intermediate diagnosis result corresponding to each dialog and the candidate diagnosis result set corresponding to each dialog.
In one possible design, the disease diagnosis results include: at least one true diagnosis and at least one false diagnosis;
according to the disease diagnosis result and the to-be-selected diagnosis result set, determining a predicted evaluation value of a preset retrieval model, including:
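calculating a first probability that the true diagnostic result appears in the set of candidate diagnostic results [formula not reproduced in this text],
wherein $P_1$ is the first probability, $N_{true}$ is the number of true diagnostic results contained in the set of candidate diagnostic results, and $N$ is the total number of results in the set of candidate diagnostic results;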
determining a second probability according to the number of true and false diagnostic results present in the set of candidate diagnostic results;
and determining a predicted evaluation value according to the first probability and the second probability.
In one possible design, determining the second probability based on the number of true and false diagnostic results present in the set of candidate diagnostic results includes calculating the second probability by either of two alternative formulas [formulas not reproduced in this text],
wherein $P_2$ is the second probability, $N_{true}$ is the number of true diagnostic results contained in the set of candidate diagnostic results, $N_{false}$ is the number of false diagnostic results contained in the set of candidate diagnostic results, and $N$ is the total number of results in the set of candidate diagnostic results.
In one possible design, determining the predicted evaluation value from the first probability and the second probability includes calculating it by a formula [formula not reproduced in this text],
wherein $S$ is the predicted evaluation value, $P_1$ is the first probability, $P_2$ is the second probability, and $C$ is a preset adjustment coefficient.
In a second aspect, the present application provides a data processing system based on a medical knowledge graph, comprising:
the acquisition module is used for acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in the medical treatment process;
a processing module for:
extracting patient key information in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of the preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than a preset threshold value, adjusting parameters of the preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
In a third aspect, the present application provides a data processing apparatus based on a medical knowledge graph, including: a processor, a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory to implement any one of the possible medical knowledge-graph-based data processing methods provided in the first aspect.
In a fourth aspect, the present application provides a storage medium having stored therein computer-executable instructions which, when executed by a processor, are adapted to carry out any one of the possible medical knowledge-graph based data processing methods provided in the first aspect.
In a fifth aspect, the present application also provides a computer program product comprising a computer program which, when executed by a processor, implements any one of the possible medical knowledge-graph-based data processing methods provided in the first aspect.
The application provides a data processing method and system based on a medical knowledge graph. The method acquires audio-visual data, medical examination data and disease diagnosis results recorded during a patient's medical treatment, and extracts patient key information from the audio-visual data; determines a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model; determines a predicted evaluation value of the preset retrieval model according to the disease diagnosis results and the diagnosis result set to be selected; and, if the predicted evaluation value is lower than a preset threshold, adjusts parameters of the preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold. This reduces manual calibration, trains the retrieval model automatically, improves retrieval efficiency and retrieval accuracy, and addresses the technical problem of how to accurately search for a target entity in a complex medical knowledge graph.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flow chart of a data processing method based on a medical knowledge graph according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of a possible implementation of step S102 in Fig. 1 according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of a possible implementation of step S103 in Fig. 1 according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a data processing system based on a medical knowledge graph according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, including but not limited to combinations of embodiments, which are within the scope of the application, can be made by one of ordinary skill in the art without inventive effort based on the embodiments of the application.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
A knowledge graph is an advanced development of the semantic network. As a basic data service, it is widely used in many technical fields and provides underlying data support for various higher-level intelligent applications. As digitization of the medical field advances, the medical knowledge graph has become an important data support service for new medical systems, such as online Internet medical systems.
At present, most research focuses on techniques for constructing medical knowledge graphs, so medical knowledge graphs are becoming increasingly complete and complex. This poses new challenges for their application: how to search for a target entity quickly and accurately in a complex medical knowledge graph has become a technical problem to be solved.
To solve the above problem, the inventive concept of the application is as follows:
A retrieval model is built separately for the medical knowledge graph, and its accuracy in using the medical knowledge graph is improved through cyclic (iterative) training. To avoid needing a large amount of manually labeled training data, video or audio recording equipment can be installed in hospital consulting rooms to record the doctor's entire consultation. The doctor's retrieval reasoning during the consultation is then extracted automatically from the consultation video or audio. This reduces the manual labeling workload and brings the preset retrieval model closer to the reasoning of experienced doctors, particularly in fields such as traditional Chinese medicine or integrated traditional Chinese and Western medicine, and it provides more flexible support for applications of the medical knowledge graph.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a data processing method based on a medical knowledge graph according to an embodiment of the present application. As shown in fig. 1, the specific steps of the data processing method include:
S101, acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in the medical treatment process, and extracting patient key information in the audio-visual data.
In this step, the audio-visual data includes audio data or video data: the audio data contains the dialogue between the doctor and the patient during the inquiry, and the video data contains surveillance video, with audio, of the doctor's inquiry of the patient. Correspondingly, extracting the patient key information from the audio-visual data includes: extracting, through speech recognition and/or image recognition, the content of at least one round of inquiry dialogue between the patient and the doctor, such as the doctor's questions to the patient and the patient's corresponding description of the condition.
The medical examination data includes the results of the medical examinations the patient underwent at the hospital, such as routine blood tests, X-ray examination, ultrasound examination, nuclear magnetic resonance examination and urine examination. The disease diagnosis result includes the final diagnosis given by the doctor.
S102, determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model.
In this step, the set of candidate diagnosis results includes at least one candidate diagnosis result, and the candidate diagnosis results are ranked by their corresponding predicted similarity. The predicted similarity characterizes the probability that the corresponding candidate diagnosis result occurs.
Fig. 2 is a schematic flow chart of a possible implementation of S102 in fig. 1 according to an embodiment of the present application. As shown in fig. 2, the specific steps of S102 in this embodiment are as follows:
S1021, extracting texts corresponding to the key information of the patient by using a voice text converter, and extracting features of the texts by using a preset neural network model to obtain a plurality of word vectors.
In this embodiment, the patient key information includes: dialogue information between doctor and patient during inquiry. Specifically:
extracting a first text set and a second text set in the key information of the patient by using a voice text converter, wherein the first text set is used for representing a disease description text of the patient, and the second text set is used for representing a doctor's inquiry text;
judging whether a preset trigger word appears in the second text set;
if yes, segmenting the first text set according to the triggering time corresponding to the preset triggering word to obtain at least two first sub-text sets;
and extracting features of the first sub-text set after the triggering time by using a preset neural network model, and determining a plurality of word vectors.
For example, the voice text converter in the preset retrieval model may first divide the patient key information by timbre into at least two parts, namely the first text set and the second text set. Suppose the doctor asks the patient "Where do you feel uncomfortable?" or says "Please tell me about your current condition". The voice text converter extracts these inquiry sentences and puts them into the second text set. Assuming that "feel uncomfortable" and "please tell me" are preset trigger words, the voice text converter segments the first text set along the time axis of the dialogue at the trigger time corresponding to the preset trigger word, obtaining at least two first sub-text sets; the first sub-text set after the trigger time contains the patient's description of his or her condition. The Chinese text in that first sub-text set is then converted into a plurality of word vectors using a neural network model, such as a Word2vec model, a Bi-LSTM (Bidirectional Long Short-Term Memory) model, or a Bi-LSTM+CRF (conditional random field) joint learning model. It should be noted that such neural network models may include a plurality of convolution layers and may be trained by adjusting their intermediate parameters, so that the extracted word vectors become more accurate. Those skilled in the art may select an appropriate neural network model according to the actual application scenario, which is not limited herein.
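The trigger-word segmentation described above can be sketched as follows. This is a minimal illustration, assuming the voice text converter has already produced timestamped utterances labeled by speaker; the `Utterance` type, the `segment_patient_text` function and the English trigger phrases are hypothetical names chosen for this example and do not come from the application.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # "doctor" or "patient" (assumed speaker labels)
    start: float  # start time in seconds on the dialogue time axis
    text: str


# Hypothetical preset trigger words for this example.
TRIGGER_WORDS = ("feel uncomfortable", "please tell me")


def segment_patient_text(utterances, trigger_words=TRIGGER_WORDS):
    """Split the patient's text (first text set) at each trigger time.

    A trigger time is the start of a doctor utterance (second text set)
    that contains a preset trigger word. With k trigger times the result
    has k + 1 segments; the segments after index 0 follow a trigger.
    """
    trigger_times = sorted(
        u.start
        for u in utterances
        if u.speaker == "doctor"
        and any(w in u.text.lower() for w in trigger_words)
    )
    segments = [[] for _ in range(len(trigger_times) + 1)]
    for u in utterances:
        if u.speaker == "patient":
            # Number of trigger times at or before this utterance.
            idx = bisect.bisect_right(trigger_times, u.start)
            segments[idx].append(u.text)
    return segments


dialogue = [
    Utterance("patient", 0.0, "Hello doctor."),
    Utterance("doctor", 2.0, "Where do you feel uncomfortable?"),
    Utterance("patient", 5.0, "I have had a headache for three days."),
]
segments = segment_patient_text(dialogue)
```

In this sketch only the segments after a trigger time would be passed on to feature extraction; the word-vector model itself is left out because the application does not fix a particular one.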
S1022, combining the word vector and the medical examination data into a retrieval sub-graph, and matching the retrieval sub-graph with the medical knowledge graph to determine a plurality of matching entities and corresponding matching degrees.
In this step, the word vectors and the medical examination data are combined into a retrieval subgraph according to a preset format. A graph similarity model then compares the retrieval subgraph with the entities in the medical knowledge graph, and any entity whose similarity (matching degree) with the retrieval subgraph is greater than a preset similarity threshold is determined to match the retrieval subgraph.
S1023, combining the matching entities with the matching degree larger than or equal to a preset threshold value into a diagnosis result set to be selected.
In this step, the matching entities are further ranked by matching degree, and either the matching entities whose matching degree is greater than or equal to a preset threshold, or the top N ranked matching entities, are added to the diagnosis result set to be selected.
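The selection in this step can be sketched as follows. This is a minimal illustration, assuming the matching degrees are already available as a mapping from entity name to score; `select_candidates` and the sample scores are hypothetical. The text describes the threshold and top-N rules as alternatives, and the sketch lets the caller apply either one.

```python
def select_candidates(matches, threshold=None, top_n=None):
    """Rank matching entities by matching degree and keep the best ones.

    matches: dict mapping entity name -> matching degree.
    threshold: keep entities with matching degree >= threshold, if given.
    top_n: keep only the first top_n ranked entities, if given.
    """
    ranked = sorted(matches.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        ranked = [(entity, score) for entity, score in ranked if score >= threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


# Illustrative scores only.
matches = {"influenza": 0.91, "common cold": 0.78, "migraine": 0.40}
by_threshold = select_candidates(matches, threshold=0.75)
by_top_n = select_candidates(matches, top_n=1)
```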
S103, determining a predicted evaluation value of a preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set, and judging whether the predicted evaluation value is lower than a preset threshold value.
In this step, if the predicted evaluation value is lower than the preset threshold, step S104 is executed; otherwise, the preset retrieval model already meets the requirement, no adjustment is needed, and the training process ends.
In order to improve the retrieval accuracy and the anti-interference capability of the preset retrieval model, at least one false diagnosis result can be artificially set to be added into the disease diagnosis result, namely the disease diagnosis result comprises: at least one true diagnostic result and at least one false diagnostic result.
Fig. 3 is a schematic flow chart of a possible implementation of step S103 in fig. 1 provided in this embodiment. As shown in fig. 3, step S103 specifically includes:
S1031, calculating a first probability that the true diagnosis result appears in the diagnosis result set to be selected.
In this step, the first probability may be calculated according to formula (1) [formula not reproduced in this text],
wherein $P_1$ is the first probability, $N_{true}$ is the number of true diagnosis results contained in the diagnosis result set to be selected, and $N$ is the total number of results in the diagnosis result set to be selected.
S1032, determining a second probability according to the number of true and false diagnostic results present in the set of candidate diagnostic results.
In this step, the second probability may be calculated in at least two ways:
the first, as shown in formula (2) [formula not reproduced in this text];
the second, as shown in formula (3) [formula not reproduced in this text];
wherein $P_2$ is the second probability, $N_{true}$ is the number of true diagnosis results contained in the diagnosis result set to be selected, $N_{false}$ is the number of false diagnosis results contained in the diagnosis result set to be selected, and $N$ is the total number of results in the diagnosis result set to be selected.
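The three counts used in these steps can be obtained as sketched below. This is a minimal illustration with hypothetical names (`count_results`, `first_probability`). Because the formulas are not reproduced in this text, the ratio in `first_probability` is only an assumed reading of the first probability, and no form is assumed for the second probability or for the predicted evaluation value.

```python
def count_results(candidates, true_results, false_results):
    """Return (n_true, n_false, n) for a candidate diagnosis result set.

    n_true:  number of true diagnosis results present among the candidates.
    n_false: number of false diagnosis results present among the candidates.
    n:       total number of distinct candidates.
    """
    candidate_set = set(candidates)
    n_true = len(candidate_set & set(true_results))
    n_false = len(candidate_set & set(false_results))
    return n_true, n_false, len(candidate_set)


def first_probability(n_true, n):
    """ASSUMPTION: share of true results among the candidates.

    This ratio is an illustrative guess, not the application's formula (1).
    """
    return n_true / n if n else 0.0


# Illustrative data only.
candidates = ["influenza", "common cold", "migraine", "sinusitis"]
n_true, n_false, n = count_results(
    candidates,
    true_results=["influenza"],
    false_results=["migraine", "gout"],
)
p1 = first_probability(n_true, n)
```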
And S1033, determining a predictive evaluation value according to the first probability and the second probability.
In the present embodiment, the predicted evaluation value may be calculated according to formula (4) [formula not reproduced in this text],
wherein $S$ is the predicted evaluation value, $P_1$ is the first probability, $P_2$ is the second probability, and $C$ is a preset adjustment coefficient. Optionally, the value of $C$ ranges from 1 to 10, preferably $C = 2$.
In one possible design, when there are multiple rounds of dialogs between the doctor and the patient during the inquiry, the disease diagnosis result includes a plurality of intermediate diagnosis results, and the intermediate diagnosis results correspond to the inquiry content of the doctor in the next round of dialogs, where the inquiry content includes new preset trigger words.
At this time, step S103 includes: and determining a predicted evaluation value corresponding to each dialog according to the intermediate diagnosis result corresponding to each dialog and the candidate diagnosis result set corresponding to each dialog.
That is, the data processing system trains the preset retrieval model for the intermediate diagnosis results of each round, so that the retrieval mode of the preset retrieval model is similar to the thinking mode of doctors, and the retrieval accuracy of unusual diseases or sudden epidemic situations is improved.
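The per-round evaluation can be sketched as follows. This is a minimal illustration: `evaluate_rounds` and `hit_score` are hypothetical names, and `hit_score` is a stand-in that only checks whether the intermediate diagnosis result is among the candidates. It is not the application's evaluation formula.

```python
def hit_score(intermediate_result, candidates):
    """Stand-in evaluator (assumption): 1.0 on a hit, 0.0 otherwise."""
    return 1.0 if intermediate_result in candidates else 0.0


def evaluate_rounds(rounds, evaluate=hit_score):
    """Compute one predicted evaluation value per dialogue round.

    rounds: list of (intermediate_diagnosis_result, candidate_set) pairs,
            one pair per round of doctor-patient dialogue.
    """
    return [evaluate(intermediate, candidates) for intermediate, candidates in rounds]


# Illustrative data only.
rounds = [
    ("respiratory infection", {"respiratory infection", "allergy"}),
    ("influenza", {"common cold", "bronchitis"}),
]
scores = evaluate_rounds(rounds)
```

A round whose value falls below the preset threshold would then trigger the parameter adjustment of step S104 for that round.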
S104, adjusting parameters of a preset retrieval model according to the patient key information and the medical examination data.
In this step, adjusting the parameters of the preset retrieval model includes: adjusting the intermediate parameters in each layer of the preset neural network model in the preset retrieval model, and the parameters of the preset loss function.
It should be noted that the parameters adjusted in this step correspond to the type of the preset neural network model set in S102.
After the parameters are adjusted, the method returns to step S102 and recalculates the predicted evaluation value; this is repeated until the predicted evaluation value is greater than or equal to the preset threshold, which completes the training of the preset retrieval model.
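The loop over steps S102 to S104 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the application's implementation: `evaluate` and `adjust_parameters` are hypothetical callables standing in for S102 plus S103 and for S104 respectively, and `max_rounds` is a safety cap that the source does not mention.

```python
from typing import Callable, Optional


def train_until_threshold(evaluate: Callable[[], float],
                          adjust_parameters: Callable[[], None],
                          threshold: float,
                          max_rounds: int = 100) -> Optional[float]:
    """Repeat retrieval, evaluation and adjustment until the threshold is met.

    evaluate()          -- hypothetical: runs S102 and S103, returns the
                           predicted evaluation value
    adjust_parameters() -- hypothetical: runs S104
    max_rounds          -- assumed safety cap, not part of the source
    """
    for _ in range(max_rounds):
        score = evaluate()
        if score >= threshold:
            return score            # training is complete
        adjust_parameters()         # S104, then back to S102
    return None                     # threshold not reached within the cap


# Toy usage: each adjustment raises the evaluation value by 0.25.
state = {"value": 0.0}


def toy_evaluate() -> float:
    return state["value"]


def toy_adjust() -> None:
    state["value"] += 0.25


final_score = train_until_threshold(toy_evaluate, toy_adjust, threshold=0.5)
```

Returning `None` when the cap is hit is one possible design choice; it makes non-convergence visible to the caller instead of looping forever.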
The embodiment provides a data processing method based on a medical knowledge graph, which comprises the steps of acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in a medical treatment process, and extracting patient key information in the audio-visual data; determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model; determining a predicted evaluation value of a preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set; and if the predicted evaluation value is lower than the preset threshold value, adjusting parameters of a preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value. Therefore, the method and the device achieve the purposes of reducing manual calibration, automatically training a retrieval model, improving retrieval efficiency and retrieval accuracy, and solving the technical problem of how to accurately search for a target entity in a complex medical knowledge graph.
Fig. 4 is a schematic structural diagram of a data processing system based on a medical knowledge graph according to an embodiment of the present application. The medical knowledge-graph based data processing system 400 may be implemented by software, hardware, or a combination of both.
As shown in fig. 4, the medical knowledge-graph-based data processing system 400 includes:
an acquisition module 401, configured to acquire audio-visual data, medical examination data and disease diagnosis results recorded by a patient during a medical treatment process;
a processing module 402, configured to:
extracting patient key information in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of the preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than a preset threshold value, adjusting parameters of the preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
In one possible design, the processing module 402 is configured to:
extracting texts corresponding to the key information of the patient by using a voice text converter, and extracting features of the texts by using a preset neural network model to obtain a plurality of word vectors;
combining the word vector and the medical examination data into a retrieval subgraph, matching the retrieval subgraph with a medical knowledge graph, and determining a plurality of matching entities and corresponding matching degrees;
and combining the matching entities with the matching degree larger than or equal to a preset threshold value into a diagnosis result set to be selected.
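The last of these steps, keeping only matching entities whose matching degree reaches the preset threshold, can be sketched in Python as follows. The dictionary representation, the function name and the example values are assumptions made for illustration; sorting the result by matching degree is an added convenience that the source does not require.

```python
from typing import Dict, List


def select_candidates(matching_degrees: Dict[str, float],
                      threshold: float) -> List[str]:
    """Keep entities whose matching degree is >= threshold, best match first."""
    kept = [(entity, degree)
            for entity, degree in matching_degrees.items()
            if degree >= threshold]
    # Sorting is an illustrative choice, not something the source specifies.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [entity for entity, _ in kept]


# Made-up matching degrees for three knowledge-graph entities.
candidate_set = select_candidates(
    {"influenza": 0.91, "common cold": 0.60, "pneumonia": 0.75},
    threshold=0.75,
)
```

Note that the comparison is inclusive, matching the "larger than or equal to" wording above.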
In one possible design, the patient key information includes: dialogue information between doctor and patient in the inquiry process;
a processing module 402, configured to:
extracting a first text set and a second text set in the key information of the patient by using a voice text converter, wherein the first text set is used for representing a disease description text of the patient, and the second text set is used for representing a doctor's inquiry text;
judging whether a preset trigger word appears in the second text set;
if yes, segmenting the first text set according to the triggering time corresponding to the preset triggering word to obtain at least two first sub-text sets;
and extracting features of the first sub-text set after the triggering time by using a preset neural network model, and determining a plurality of word vectors.
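The trigger-word segmentation described above can be sketched as follows. This is a minimal Python illustration that assumes each transcribed utterance carries a start time in seconds and that a trigger word is detected by simple substring matching; the names `Utterance`, `find_trigger_time` and `split_patient_text` are hypothetical, and feature extraction by the neural network model is not shown.

```python
from typing import List, Optional, Sequence, Tuple

Utterance = Tuple[float, str]  # (start time in seconds, transcribed text)


def find_trigger_time(doctor_utterances: Sequence[Utterance],
                      trigger_words: Sequence[str]) -> Optional[float]:
    """Return the start time of the first doctor utterance with a trigger word."""
    for start, text in sorted(doctor_utterances):
        if any(word in text for word in trigger_words):
            return start
    return None  # no preset trigger word appeared


def split_patient_text(patient_utterances: Sequence[Utterance],
                       trigger_time: float) -> Tuple[List[str], List[str]]:
    """Split patient utterances into those before and after the trigger time."""
    before = [text for start, text in patient_utterances if start < trigger_time]
    after = [text for start, text in patient_utterances if start >= trigger_time]
    return before, after


# Made-up dialogue: the doctor's second question contains the trigger phrase.
doctor = [(0.0, "Hello, what brings you in?"),
          (20.0, "How long have you had the cough?")]
patient = [(5.0, "I feel unwell."),
           (25.0, "About two weeks, with fever at night.")]

trigger_time = find_trigger_time(doctor, ["How long"])
before, after = split_patient_text(patient, trigger_time)
```

Only the `after` sub-text set would then be passed on for feature extraction, in line with the step above.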
In one possible design, when there are multiple rounds of dialogues between the doctor and the patient in the inquiry process, the disease diagnosis result includes a plurality of intermediate diagnosis results, the intermediate diagnosis results correspond to the inquiry content of the doctor in the next round of dialogues, and the inquiry content contains new preset trigger words;
a processing module 402, configured to:
and determining a predicted evaluation value corresponding to each dialog according to the intermediate diagnosis result corresponding to each dialog and the candidate diagnosis result set corresponding to each dialog.
In one possible design, the disease diagnosis results include: at least one true diagnosis and at least one false diagnosis;
a processing module 402, configured to:
calculating a first probability that the true diagnostic result appears in the set of candidate diagnostic results:
where $P_1$ is the first probability, $N_{\text{true}}$ is the number of true diagnosis results contained in the set of candidate diagnostic results, and $N$ is the total number of results in the set of candidate diagnostic results;
determining a second probability according to the number of true and false diagnostic results present in the set of candidate diagnostic results;
and determining a predicted evaluation value according to the first probability and the second probability.
In one possible design, the processing module 402 is configured to determine the second probability based on the number of true and false diagnostic results that are present in the set of candidate diagnostic results, including:
or, alternatively,
where $P_2$ is the second probability, $N_{\text{true}}$ is the number of true diagnosis results contained in the set of candidate diagnostic results, $N_{\text{false}}$ is the number of false diagnosis results contained in the set of candidate diagnostic results, and $N$ is the total number of results in the set of candidate diagnostic results.
In one possible design, the processing module 402, configured to determine the predicted evaluation value according to the first probability and the second probability, includes:
where $S$ is the predicted evaluation value, $P_1$ is the first probability, $P_2$ is the second probability, and $C$ is a preset adjustment coefficient.
It should be noted that, the system provided in the embodiment shown in fig. 4 may perform the method provided in any of the above method embodiments, and the specific implementation principles, technical features, explanation of terms, and technical effects are similar, and are not repeated herein.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 5, the electronic device 500 may include: at least one processor 501 and a memory 502. Fig. 5 takes an electronic device with one processor as an example.
A memory 502 for storing a program. In particular, the program may include program code including computer-operating instructions.
The memory 502 may comprise high-speed RAM and may further comprise non-volatile memory, such as at least one disk memory.
The processor 501 is configured to execute computer-executable instructions stored in the memory 502 to implement the methods described in the method embodiments above.
The processor 501 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application.
Alternatively, the memory 502 may be separate or integrated with the processor 501. When the memory 502 is a device separate from the processor 501, the electronic device 500 may further include:
a bus 503 for connecting the processor 501 and the memory 502. The bus may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, and so on; the bus 503 is not limited to a single bus or a single type of bus.
Alternatively, in a specific implementation, if the memory 502 and the processor 501 are integrated on a chip, the memory 502 and the processor 501 may complete communication through an internal interface.
Embodiments of the present application also provide a computer-readable storage medium, which may include any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk. Specifically, the computer-readable storage medium stores program instructions for the methods in the above method embodiments.
The embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the method of the above-described method embodiments.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.
Claims (10)
1. The data processing method based on the medical knowledge graph is characterized by comprising the following steps of:
acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in a medical treatment process, and extracting key information of the patient in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of the preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than a preset threshold value, adjusting parameters of the preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
2. The medical knowledge-graph-based data processing method of claim 1, wherein determining a set of candidate diagnosis results from the medical knowledge graph, the patient key information, and the medical examination data using a preset search model, comprises:
extracting texts corresponding to the patient key information by using a voice text converter, and extracting features of the texts by using a preset neural network model to obtain a plurality of word vectors;
combining the word vector and the medical examination data into a retrieval subgraph, matching the retrieval subgraph with the medical knowledge graph, and determining a plurality of matching entities and corresponding matching degrees;
and combining the matching entities with the matching degree larger than or equal to a preset threshold value into the diagnosis result set to be selected.
3. The medical knowledge-graph-based data processing method according to claim 2, wherein the patient key information includes: dialogue information between doctor and patient in the inquiry process;
extracting text corresponding to the patient key information by using a voice text converter, and extracting features of the text by using a preset neural network model to obtain a plurality of word vectors, wherein the method comprises the following steps:
extracting a first text set and a second text set in the key information of the patient by using a voice text converter, wherein the first text set is used for representing a disease description text of the patient, and the second text set is used for representing a consultation text of the doctor;
judging whether a preset trigger word appears in the second text set;
if yes, the first text set is segmented according to the triggering time corresponding to the preset triggering word, and at least two first sub-text sets are obtained;
and extracting features of the first sub-text set after the triggering time by using a preset neural network model, and determining a plurality of word vectors.
4. A medical knowledge graph-based data processing method according to claim 3, wherein when there are multiple rounds of dialogs between the doctor and the patient in the inquiry process, the disease diagnosis result includes a plurality of intermediate diagnosis results, the intermediate diagnosis results correspond to inquiry contents of the doctor in the next round of dialogs, and the inquiry contents include new preset trigger words;
the determining the predicted evaluation value of the preset search model according to the disease diagnosis result and the to-be-selected diagnosis result set comprises the following steps:
and determining a predicted evaluation value corresponding to each round of dialogue according to the intermediate diagnosis result corresponding to each round of dialogue and the set of the to-be-selected diagnosis results corresponding to each round of dialogue.
5. The medical knowledge-graph-based data processing method according to any one of claims 1 to 4, wherein the disease diagnosis result includes: at least one true diagnosis and at least one false diagnosis;
the determining the predicted evaluation value of the preset search model according to the disease diagnosis result and the to-be-selected diagnosis result set comprises the following steps:
calculating a first probability that the true diagnostic result appears in the set of candidate diagnostic results:
where $P_1$ is the first probability, $N_{\text{true}}$ is the number of the true diagnosis results contained in the to-be-selected diagnosis result set, and $N$ is the total number of results in the to-be-selected diagnosis result set;
determining a second probability based on the number of the true diagnostic results and the false diagnostic results that appear in the set of candidate diagnostic results;
and determining the predictive evaluation value according to the first probability and the second probability.
6. The medical knowledge-graph-based data processing method according to claim 5, wherein said determining a second probability from the number of the true diagnosis results and the false diagnosis results appearing in the set of the candidate diagnosis results includes:
or, alternatively,
where $P_2$ is the second probability, $N_{\text{true}}$ is the number of the true diagnosis results contained in the candidate diagnosis result set, $N_{\text{false}}$ is the number of the false diagnosis results contained in the candidate diagnosis result set, and $N$ is the total number of results in the candidate diagnosis result set.
7. The medical knowledge-graph-based data processing method according to claim 5, wherein the determining the predictive evaluation value according to the first probability and the second probability includes:
where $S$ is the predicted evaluation value, $P_1$ is the first probability, $P_2$ is the second probability, and $C$ is a preset adjustment coefficient.
8. A medical knowledge-graph-based data processing system, comprising:
the acquisition module is used for acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in the medical treatment process;
a processing module for:
extracting patient key information in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of the preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than a preset threshold value, adjusting parameters of the preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
9. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory to implement the medical knowledge-graph-based data processing method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the medical knowledge-graph based data processing method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310470069.XA CN116779137A (en) | 2023-04-27 | 2023-04-27 | Data processing method and system based on medical knowledge graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116779137A (en) | 2023-09-19 |
Family
ID=87991967
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310470069.XA Pending CN116779137A (en) | 2023-04-27 | 2023-04-27 | Data processing method and system based on medical knowledge graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116779137A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105808931A (en) * | 2016-03-03 | 2016-07-27 | 北京大学深圳研究生院 | Knowledge graph based acupuncture and moxibustion decision support method and apparatus |
CN113436754A (en) * | 2021-07-06 | 2021-09-24 | 吴国军 | Medical software and method for intelligent terminal inquiry |
CN113724858A (en) * | 2021-08-31 | 2021-11-30 | 平安国际智慧城市科技股份有限公司 | Artificial intelligence-based disease examination item recommendation device, method and apparatus |
CN115840809A (en) * | 2021-09-18 | 2023-03-24 | 中山大学 | Information recommendation method, device, equipment, system and storage medium |
CN115910263A (en) * | 2022-10-28 | 2023-04-04 | 北京邮电大学 | PET/CT image report conclusion auxiliary generation method and device based on knowledge graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||