CN116779137A - Data processing method and system based on medical knowledge graph

Publication number
CN116779137A
Authority
CN
China
Prior art keywords
preset
patient
diagnosis result
medical
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310470069.XA
Other languages
Chinese (zh)
Inventor
夏有兵
朱鹏
徐天成
胡剑
苗淳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Nanjing University of Chinese Medicine
Xuzhou Medical University
Original Assignee
Nanjing University of Science and Technology
Nanjing University of Chinese Medicine
Xuzhou Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology, Nanjing University of Chinese Medicine, and Xuzhou Medical University
Priority to CN202310470069.XA
Publication of CN116779137A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The application provides a data processing method and system based on a medical knowledge graph. Audio-visual data, medical examination data and disease diagnosis results recorded for a patient during medical treatment are acquired, and patient key information is extracted from the audio-visual data; a diagnosis result set to be selected is determined according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model; a predicted evaluation value of the preset retrieval model is determined according to the disease diagnosis result and the to-be-selected diagnosis result set; and if the predicted evaluation value is lower than a preset threshold value, parameters of the preset retrieval model are adjusted according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value. This reduces manual calibration, trains the retrieval model automatically, improves retrieval efficiency and retrieval accuracy, and solves the technical problem of how to accurately search for a target entity in a complex medical knowledge graph.

Description

Data processing method and system based on medical knowledge graph
Technical Field
The application relates to the technical field of knowledge graphs, and in particular to a data processing method and system based on a medical knowledge graph.
Background
The knowledge graph is a further development of the semantic network. It is widely applied in many technical fields as a basic data service and provides basic data support for various upper-layer intelligent applications. As digitization advances in the medical field, the medical knowledge graph has become an important data support service for new medical systems, such as internet-based online medical systems.
At present, most research focuses on techniques for constructing medical knowledge graphs, so medical knowledge graphs are becoming increasingly complete and complex. This also presents new challenges for applications of medical knowledge graphs. How to quickly and accurately search for a target entity in a complex medical knowledge graph has become a technical problem to be solved.
Disclosure of Invention
The application provides a data processing method and a data processing system based on a medical knowledge graph, which aim to solve the technical problem of how to quickly and accurately search a target entity in a complex medical knowledge graph.
In a first aspect, the present application provides a data processing method based on a medical knowledge graph, including:
acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in the medical treatment process, and extracting key information of the patient in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of a preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than the preset threshold value, adjusting parameters of a preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
In one possible design, determining the set of candidate diagnostic results from the medical knowledge-graph, the patient key information, and the medical examination data using a preset retrieval model includes:
extracting texts corresponding to the key information of the patient by using a voice text converter, and extracting features of the texts by using a preset neural network model to obtain a plurality of word vectors;
combining the word vector and the medical examination data into a retrieval subgraph, matching the retrieval subgraph with a medical knowledge graph, and determining a plurality of matching entities and corresponding matching degrees;
and combining the matching entities with the matching degree larger than or equal to a preset threshold value into a diagnosis result set to be selected.
In one possible design, the patient key information includes: dialogue information between doctor and patient in the inquiry process;
extracting text corresponding to the key information of the patient by using a voice text converter, and extracting features of the text by using a preset neural network model to obtain a plurality of word vectors, wherein the method comprises the following steps:
extracting a first text set and a second text set in the key information of the patient by using a voice text converter, wherein the first text set is used for representing a disease description text of the patient, and the second text set is used for representing a doctor's inquiry text;
judging whether a preset trigger word appears in the second text set;
if yes, segmenting the first text set according to the triggering time corresponding to the preset triggering word to obtain at least two first sub-text sets;
and extracting features of the first sub-text set after the triggering time by using a preset neural network model, and determining a plurality of word vectors.
In one possible design, when there are multiple rounds of dialogues between the doctor and the patient in the inquiry process, the disease diagnosis result includes a plurality of intermediate diagnosis results, the intermediate diagnosis results correspond to the inquiry content of the doctor in the next round of dialogues, and the inquiry content contains new preset trigger words;
according to the disease diagnosis result and the to-be-selected diagnosis result set, determining a predicted evaluation value of a preset retrieval model, including:
and determining a predicted evaluation value corresponding to each dialog according to the intermediate diagnosis result corresponding to each dialog and the candidate diagnosis result set corresponding to each dialog.
In one possible design, the disease diagnosis results include: at least one true diagnosis and at least one false diagnosis;
according to the disease diagnosis result and the to-be-selected diagnosis result set, determining a predicted evaluation value of a preset retrieval model, including:
calculating a first probability that the true diagnostic result appears in the set of candidate diagnostic results:
where $P_1$ is the first probability, $N_{\text{true}}$ is the number of true diagnostic results contained in the set of candidate diagnostic results, and $N$ is the total number of results in the set of candidate diagnostic results;
determining a second probability according to the number of true and false diagnostic results present in the set of candidate diagnostic results;
and determining a predicted evaluation value according to the first probability and the second probability.
In one possible design, determining the second probability based on the number of true and false diagnostic results present in the set of candidate diagnostic results includes:
or, alternatively,
where $P_2$ is the second probability, $N_{\text{true}}$ is the number of true diagnostic results contained in the set of candidate diagnostic results, $N_{\text{false}}$ is the number of false diagnostic results contained in the set of candidate diagnostic results, and $N$ is the total number of results in the set of candidate diagnostic results.
In one possible design, determining the predictive evaluation value from the first probability and the second probability includes:
where $S$ is the predicted evaluation value, $P_1$ is the first probability, $P_2$ is the second probability, and $C$ is a preset adjustment coefficient.
In a second aspect, the present application provides a data processing system based on a medical knowledge graph, comprising:
the acquisition module is used for acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in the medical treatment process;
a processing module for:
extracting patient key information in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of the preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than a preset threshold value, adjusting parameters of the preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
In a third aspect, the present application provides a data processing apparatus based on a medical knowledge graph, including: a processor, a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory to implement any one of the possible medical knowledge-graph-based data processing methods provided in the first aspect.
In a fourth aspect, the present application provides a storage medium having stored therein computer-executable instructions which, when executed by a processor, are adapted to carry out any one of the possible medical knowledge-graph based data processing methods provided in the first aspect.
In a fifth aspect, the present application also provides a computer program product comprising a computer program which, when executed by a processor, implements any one of the possible medical knowledge-graph-based data processing methods provided in the first aspect.
The application provides a data processing method and system based on a medical knowledge graph. Audio-visual data, medical examination data and disease diagnosis results recorded for a patient during medical treatment are acquired, and patient key information is extracted from the audio-visual data; a diagnosis result set to be selected is determined according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model; a predicted evaluation value of the preset retrieval model is determined according to the disease diagnosis result and the to-be-selected diagnosis result set; and if the predicted evaluation value is lower than the preset threshold value, parameters of the preset retrieval model are adjusted according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value. This reduces manual calibration, trains the retrieval model automatically, improves retrieval efficiency and retrieval accuracy, and solves the technical problem of how to accurately search for a target entity in a complex medical knowledge graph.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flow chart of a data processing method based on a medical knowledge graph according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of a possible implementation of S102 in Fig. 1 according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of a possible implementation of step S103 in Fig. 1 provided in this embodiment;
Fig. 4 is a schematic structural diagram of a data processing system based on a medical knowledge graph according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, including but not limited to combinations of embodiments, which are within the scope of the application, can be made by one of ordinary skill in the art without inventive effort based on the embodiments of the application.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The knowledge graph is a further development of the semantic network. It is widely applied in many technical fields as a basic data service and provides basic data support for various upper-layer intelligent applications. As digitization advances in the medical field, the medical knowledge graph has become an important data support service for new medical systems, such as internet-based online medical systems.
At present, most research focuses on techniques for constructing medical knowledge graphs, so medical knowledge graphs are becoming increasingly complete and complex. This also presents new challenges for applications of medical knowledge graphs. How to quickly and accurately search for a target entity in a complex medical knowledge graph has become a technical problem to be solved.
To solve the above problems, the inventive concept of the application is as follows:
A retrieval model is constructed separately for the medical knowledge graph, and the accuracy with which the retrieval model uses the medical knowledge graph is improved through cyclic training. To avoid the need for a large amount of manually labeled training data, a video or audio recording device can be installed in a hospital consulting room to record the doctor's entire consultation process. The doctor's retrieval reasoning during the consultation is then extracted automatically from the consultation video or audio. This reduces the manual labeling workload and brings the preset retrieval model closer to an experienced doctor, particularly in fields such as traditional Chinese medicine or combined traditional Chinese and Western medicine, and provides more flexible support for applications of the medical knowledge graph.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a data processing method based on a medical knowledge graph according to an embodiment of the present application. As shown in fig. 1, the specific steps of the data processing method include:
S101, acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in the medical treatment process, and extracting patient key information in the audio-visual data.
In this step, the audio-visual data comprises audio data and/or video data: the audio data comprises a recording of the dialogue between the doctor and the patient during the inquiry, and the video data comprises surveillance video, together with its audio, recorded during the inquiry. Correspondingly, extracting the patient key information in the audio-visual data comprises: extracting, through voice recognition and/or image recognition technology, the content of at least one round of inquiry dialogue between the patient and the doctor, such as the doctor's questions to the patient and the patient's corresponding description of the condition.
The medical examination data includes the results of the various medical examinations performed on the patient in the hospital, such as routine blood tests, X-ray examination, ultrasound examination, nuclear magnetic resonance examination, urine examination, and the like. The disease diagnosis results include the final diagnosis result given by the doctor.
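As a rough illustration of the inputs gathered in S101, the three kinds of data could be grouped into one record per consultation. This is only a sketch: the class name, field names and sample values below are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConsultationRecord:
    """One sample gathered in S101 (all names here are hypothetical)."""
    audio_visual_path: str                 # recorded consultation audio or video
    examination_data: Dict[str, str]       # examination name -> result text
    diagnosis_results: List[str]           # final diagnosis given by the doctor
    patient_key_info: List[str] = field(default_factory=list)  # filled after extraction


# Made-up sample values, for illustration only.
record = ConsultationRecord(
    audio_visual_path="consultation_001.wav",
    examination_data={"blood routine": "result text"},
    diagnosis_results=["disease A"],
)
record.patient_key_info.append("patient description extracted from the recording")
```

Keeping the extracted key information next to the raw inputs makes it easy to pass a single object to the later retrieval and evaluation steps.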
S102, determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model.
In this step, the set of candidate diagnosis results includes at least one candidate diagnosis result, and the candidate diagnosis results are ranked according to their corresponding prediction similarity. The prediction similarity characterizes the probability that the corresponding candidate diagnosis result is correct.
Fig. 2 is a schematic flow chart of a possible implementation of S102 in fig. 1 according to an embodiment of the present application. As shown in fig. 2, the specific steps of S102 in this embodiment are as follows:
S1021, extracting texts corresponding to the key information of the patient by using a voice text converter, and extracting features of the texts by using a preset neural network model to obtain a plurality of word vectors.
In this embodiment, the patient key information includes: dialogue information between doctor and patient during inquiry. Specifically:
extracting a first text set and a second text set in the key information of the patient by using a voice text converter, wherein the first text set is used for representing a disease description text of the patient, and the second text set is used for representing a doctor's inquiry text;
judging whether a preset trigger word appears in the second text set;
if yes, segmenting the first text set according to the triggering time corresponding to the preset triggering word to obtain at least two first sub-text sets;
and extracting features of the first sub-text set after the triggering time by using a preset neural network model, and determining a plurality of word vectors.
For example, the voice text converter in the preset retrieval model may first divide the patient key information into at least two parts, namely the first text set and the second text set, according to the timbre of each speaker. Then, suppose the doctor asks the patient "Where do you feel uncomfortable?" or says "Please say your current illness"; these query sentences are extracted by the voice text converter in the preset retrieval model and put into the second text set. Assuming that "where uncomfortable" and "please say" are preset trigger words, the voice text converter segments the first text set along the time axis of the dialogue information at the trigger time corresponding to the preset trigger word, so as to obtain at least two first sub-text sets; the first sub-text set after the trigger time contains the patient's description of his or her condition. The Chinese text in that first sub-text set is then converted into a plurality of word vectors using a neural network model, such as a Word2vec model, a Bi-LSTM (Bidirectional Long Short-Term Memory) model, or a Bi-LSTM+CRF (conditional random field) joint learning model. It should be noted that the various neural network models may include a plurality of convolution layers, and may be trained by adjusting their intermediate parameters so that the extracted word vectors are more accurate. Those skilled in the art may select an appropriate neural network model according to the actual application scenario, which is not limited herein.
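The trigger-word segmentation described above can be sketched as follows. This is a minimal illustration under several assumptions: speech has already been transcribed and separated by speaker, each utterance carries a start time, trigger words are matched by simple substring search, and only the first trigger is handled. The `Utterance` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str   # "doctor" or "patient" (assumed labels from speaker separation)
    start: float   # start time in seconds on the dialogue time axis
    text: str


TRIGGER_WORDS = ("where uncomfortable", "please say")  # example trigger words from the text


def find_trigger_time(doctor_utterances, trigger_words=TRIGGER_WORDS):
    """Return the start time of the first doctor utterance containing a trigger word, else None."""
    for utt in sorted(doctor_utterances, key=lambda u: u.start):
        if any(word in utt.text.lower() for word in trigger_words):
            return utt.start
    return None


def split_patient_text(utterances, trigger_words=TRIGGER_WORDS):
    """Split the patient's text set into (before, after) sub-text sets around the trigger time."""
    doctor = [u for u in utterances if u.speaker == "doctor"]    # second text set
    patient = [u for u in utterances if u.speaker == "patient"]  # first text set
    trigger_time = find_trigger_time(doctor, trigger_words)
    if trigger_time is None:
        return None  # no trigger word found, so nothing is segmented
    before = [u.text for u in patient if u.start < trigger_time]
    after = [u.text for u in patient if u.start >= trigger_time]
    return before, after


dialogue = [
    Utterance("patient", 0.0, "Hello doctor"),
    Utterance("doctor", 2.0, "Please say your current illness"),
    Utterance("patient", 5.0, "I have had a headache for two days"),
]
before, after = split_patient_text(dialogue)
```

Only the `after` sub-text set would then be passed to the neural network model for feature extraction; handling several trigger words in one dialogue would require splitting at each trigger time in turn.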
S1022, combining the word vector and the medical examination data into a retrieval sub-graph, and matching the retrieval sub-graph with the medical knowledge graph to determine a plurality of matching entities and corresponding matching degrees.
In this step, the above word vectors and the medical examination data are combined into a retrieval sub-graph according to a preset format, and the retrieval sub-graph is then compared with each entity in the medical knowledge graph by using a graph similarity model. When the similarity, or matching degree, between the retrieval sub-graph and an entity in the medical knowledge graph is greater than a preset similarity threshold, that entity is determined to match the retrieval sub-graph; one or more entities may match.
S1023, combining the matching entities with the matching degree larger than or equal to a preset threshold value into a diagnosis result set to be selected.
In this step, the plurality of matching entities are further ranked according to the matching degree, and the matching entities whose matching degree is greater than or equal to a preset threshold value, or the top N matching entities, are added to the diagnosis result set to be selected.
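The ranking and selection in S1023 can be sketched as below. The sketch assumes the matching step has already produced `(entity, matching_degree)` pairs; the function name and the sample values are hypothetical, and the graph similarity model itself is not shown.

```python
def select_candidates(matches, threshold=None, top_n=None):
    """Rank (entity, matching_degree) pairs, then keep those at or above the threshold
    and/or the first top_n entries."""
    ranked = sorted(matches, key=lambda m: m[1], reverse=True)
    if threshold is not None:
        ranked = [m for m in ranked if m[1] >= threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


# Made-up matching degrees, for illustration only.
matches = [("disease A", 0.91), ("disease B", 0.42), ("disease C", 0.77)]
by_threshold = select_candidates(matches, threshold=0.7)
by_top_n = select_candidates(matches, top_n=1)
```

The text describes the threshold and the top-N cut as alternatives; this helper also allows both to be combined, which is a design choice of the sketch rather than something stated in the application.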
S103, determining a predicted evaluation value of a preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set, and judging whether the predicted evaluation value is lower than a preset threshold value.
In this step, if the predicted evaluation value is lower than the preset threshold, step S104 is executed; otherwise, the preset retrieval model already meets the requirement, no adjustment is needed, and the training process ends.
In order to improve the retrieval accuracy and the anti-interference capability of the preset retrieval model, at least one false diagnosis result can be manually added to the disease diagnosis result, that is, the disease diagnosis result comprises: at least one true diagnostic result and at least one false diagnostic result.
Fig. 3 is a schematic flow chart of a possible implementation of step S103 in fig. 1 provided in this embodiment. As shown in fig. 3, step S103 specifically includes:
S1031, calculating a first probability that the true diagnosis result appears in the diagnosis result set to be selected.
In this step, the first probability may be calculated according to formula (1):
where $P_1$ is the first probability, $N_{\text{true}}$ is the number of true diagnosis results contained in the diagnosis result set to be selected, and $N$ is the total number of results in the diagnosis result set to be selected.
S1032, determining a second probability according to the number of true and false diagnostic results present in the set of candidate diagnostic results.
In this step, the second probability may be calculated in at least two ways:
The first, as shown in formula (2):
The second, as shown in formula (3):
where $P_2$ is the second probability, $N_{\text{true}}$ is the number of true diagnostic results contained in the set of candidate diagnostic results, $N_{\text{false}}$ is the number of false diagnostic results contained in the set of candidate diagnostic results, and $N$ is the total number of results in the set of candidate diagnostic results.
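The first and second probabilities are both defined in terms of three counts: the true results in the candidate set, the false results in the candidate set, and the size of the candidate set. The sketch below shows only how those counts might be obtained; it deliberately does not implement formulas (1) to (3). The function name and sample values are hypothetical.

```python
def count_hits(candidate_set, true_results, false_results):
    """Return (N_true, N_false, N) for a candidate diagnosis result set."""
    candidates = list(candidate_set)
    n_true = sum(1 for c in candidates if c in true_results)    # true results found
    n_false = sum(1 for c in candidates if c in false_results)  # false results found
    return n_true, n_false, len(candidates)


# Made-up candidate set with one true and one manually added false diagnosis result.
candidates = ["disease A", "disease C", "disease X"]
n_true, n_false, n = count_hits(candidates, {"disease A"}, {"disease X"})
```

A candidate that is neither a true nor a false diagnosis result, such as "disease C" here, contributes only to the total count.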
And S1033, determining a predictive evaluation value according to the first probability and the second probability.
In the present embodiment, the predicted evaluation value may be calculated according to the formula (4):
where $S$ is the predicted evaluation value, $P_1$ is the first probability, $P_2$ is the second probability, and $C$ is a preset adjustment coefficient. Optionally, the value of $C$ ranges from 1 to 10, preferably $C = 2$.
In one possible design, when there are multiple rounds of dialogs between the doctor and the patient during the inquiry, the disease diagnosis result includes a plurality of intermediate diagnosis results, and the intermediate diagnosis results correspond to the inquiry content of the doctor in the next round of dialogs, where the inquiry content includes new preset trigger words.
At this time, step S103 includes: and determining a predicted evaluation value corresponding to each dialog according to the intermediate diagnosis result corresponding to each dialog and the candidate diagnosis result set corresponding to each dialog.
That is, the data processing system trains the preset retrieval model for the intermediate diagnosis results of each round, so that the retrieval mode of the preset retrieval model is similar to the thinking mode of doctors, and the retrieval accuracy of unusual diseases or sudden epidemic situations is improved.
S104, adjusting parameters of a preset retrieval model according to the patient key information and the medical examination data.
In this step, adjusting parameters of a preset search model includes: and adjusting intermediate parameters in each layer of the preset neural network model in the preset search model and parameters of the preset loss function.
It should be noted that, the parameters adjusted in this step correspond to the type of the preset neural network model set in S102.
After the parameters are adjusted, the method returns to step S102 and recalculates the predicted evaluation value. This is repeated until the predicted evaluation value is greater than or equal to the preset threshold value, which completes the training of the preset retrieval model.
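The loop formed by S102, S103 and S104 can be sketched as follows. The three callables stand in for the retrieval, evaluation and parameter adjustment steps and are hypothetical, as is the function name. The `max_rounds` limit is an added safeguard against non-convergence and is not part of the described method.

```python
def train_until_threshold(retrieve, evaluate, adjust, sample, threshold, max_rounds=100):
    """Repeat retrieve -> evaluate -> adjust until the evaluation value reaches the threshold.

    Returns (last_score, rounds_of_adjustment_performed).
    """
    score = None
    for round_index in range(max_rounds):
        candidates = retrieve(sample)         # S102: build the candidate result set
        score = evaluate(sample, candidates)  # S103: predicted evaluation value
        if score >= threshold:
            return score, round_index         # the model meets the requirement
        adjust(sample)                        # S104: adjust parameters, then back to S102
    return score, max_rounds


# Toy stand-ins: the "model" is a single integer level that each adjustment raises by one.
state = {"level": 1}
result = train_until_threshold(
    retrieve=lambda sample: state["level"],
    evaluate=lambda sample, candidates: candidates,
    adjust=lambda sample: state.update(level=state["level"] + 1),
    sample=None,
    threshold=3,
)
```

In the toy run the evaluation value rises from 1 to 3 over two adjustments, at which point the loop stops; a real implementation would replace the lambdas with the preset retrieval model and its parameter update.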
The embodiment provides a data processing method based on a medical knowledge graph, which comprises the steps of acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in a medical treatment process, and extracting patient key information in the audio-visual data; determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model; determining a predicted evaluation value of a preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set; and if the predicted evaluation value is lower than the preset threshold value, adjusting parameters of a preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value. Therefore, the method and the device achieve the purposes of reducing manual calibration, automatically training a retrieval model, improving retrieval efficiency and retrieval accuracy, and solving the technical problem of how to accurately search for a target entity in a complex medical knowledge graph.
Fig. 4 is a schematic structural diagram of a data processing system based on a medical knowledge graph according to an embodiment of the present application. The medical knowledge-graph based data processing system 400 may be implemented by software, hardware, or a combination of both.
As shown in fig. 4, the medical knowledge graph-based data processing system 400 includes:
an acquisition module 401, configured to acquire audio-visual data, medical examination data and disease diagnosis results recorded by a patient during a medical treatment process;
a processing module 402, configured to:
extracting patient key information in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of the preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than a preset threshold value, adjusting parameters of the preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
In one possible design, the processing module 402 is configured to:
extracting texts corresponding to the key information of the patient by using a voice text converter, and extracting features of the texts by using a preset neural network model to obtain a plurality of word vectors;
combining the word vector and the medical examination data into a retrieval subgraph, matching the retrieval subgraph with a medical knowledge graph, and determining a plurality of matching entities and corresponding matching degrees;
and combining the matching entities with the matching degree larger than or equal to a preset threshold value into a diagnosis result set to be selected.
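The final filtering step of this design can be sketched as follows. This is only an illustration under stated assumptions: the matching degrees are taken as already computed by matching the retrieval subgraph against the medical knowledge graph, and the function name `select_candidates` and the descending sort are hypothetical additions.

```python
from typing import Dict, List


def select_candidates(match_degrees: Dict[str, float], threshold: float) -> List[str]:
    """Keep the matching entities whose matching degree is greater than or
    equal to the preset threshold.

    Sorting by descending matching degree is an assumption made for
    readability; the source only requires the threshold filter.
    """
    kept = [(entity, degree) for entity, degree in match_degrees.items()
            if degree >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [entity for entity, _ in kept]


if __name__ == "__main__":
    degrees = {"entity_a": 0.9, "entity_b": 0.4, "entity_c": 0.6}
    print(select_candidates(degrees, threshold=0.6))
```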
In one possible design, the patient key information includes: dialogue information between doctor and patient in the inquiry process;
a processing module 402, configured to:
extracting a first text set and a second text set in the key information of the patient by using a voice text converter, wherein the first text set is used for representing a disease description text of the patient, and the second text set is used for representing a doctor's inquiry text;
judging whether a preset trigger word appears in the second text set;
if yes, segmenting the first text set according to the triggering time corresponding to the preset triggering word to obtain at least two first sub-text sets;
and extracting features of the first sub-text set after the triggering time by using a preset neural network model, and determining a plurality of word vectors.
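The trigger-word segmentation above can be sketched as follows, assuming each transcribed utterance carries a timestamp. The `Utterance` type, the function names, and the substring match used to detect a trigger word are hypothetical; the source does not specify how trigger words are detected or how timestamps are represented.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Utterance:
    time: float  # assumed: seconds from the start of the recording
    text: str


def first_trigger_time(doctor: Sequence[Utterance],
                       trigger_words: Sequence[str]) -> Optional[float]:
    """Return the time of the earliest doctor utterance containing a preset
    trigger word, or None if no trigger word appears."""
    for utterance in sorted(doctor, key=lambda u: u.time):
        if any(word in utterance.text for word in trigger_words):
            return utterance.time
    return None


def split_patient_text(patient: Sequence[Utterance],
                       trigger_time: float) -> Tuple[List[Utterance], List[Utterance]]:
    """Split the patient's text set into the parts before and after the
    trigger time; only the second part is passed on to feature extraction."""
    before = [u for u in patient if u.time < trigger_time]
    after = [u for u in patient if u.time >= trigger_time]
    return before, after


if __name__ == "__main__":
    doctor = [Utterance(1.0, "hello"), Utterance(5.0, "where does it hurt")]
    patient = [Utterance(2.0, "hi"), Utterance(6.0, "my knee")]
    t = first_trigger_time(doctor, ["hurt"])
    if t is not None:
        before, after = split_patient_text(patient, t)
        print([u.text for u in after])
```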
In one possible design, when there are multiple rounds of dialogues between the doctor and the patient in the inquiry process, the disease diagnosis result includes a plurality of intermediate diagnosis results, the intermediate diagnosis results correspond to the inquiry content of the doctor in the next round of dialogues, and the inquiry content contains new preset trigger words;
a processing module 402, configured to:
and determining a predicted evaluation value corresponding to each dialog according to the intermediate diagnosis result corresponding to each dialog and the candidate diagnosis result set corresponding to each dialog.
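A per-round evaluation can be sketched as below. The function `per_round_scores` simply pairs each round's intermediate diagnosis results with that round's candidate set and applies a scoring function. The scorer `hit_ratio` is a hypothetical stand-in used only to make the sketch runnable; it is not the evaluation formula of the application.

```python
from typing import Callable, List, Sequence, Set

Scorer = Callable[[Set[str], Set[str]], float]


def per_round_scores(intermediate_results: Sequence[Set[str]],
                     candidate_sets: Sequence[Set[str]],
                     scorer: Scorer) -> List[float]:
    """Compute one predicted evaluation value per dialogue round."""
    if len(intermediate_results) != len(candidate_sets):
        raise ValueError("each round needs one intermediate result and one candidate set")
    return [scorer(targets, candidates)
            for targets, candidates in zip(intermediate_results, candidate_sets)]


def hit_ratio(targets: Set[str], candidates: Set[str]) -> float:
    """Hypothetical scorer: share of candidates that are intermediate results."""
    return len(targets & candidates) / len(candidates) if candidates else 0.0


if __name__ == "__main__":
    rounds_truth = [{"diagnosis_a"}, {"diagnosis_b"}]
    rounds_candidates = [{"diagnosis_a", "diagnosis_c"}, {"diagnosis_d"}]
    print(per_round_scores(rounds_truth, rounds_candidates, hit_ratio))
```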
In one possible design, the disease diagnosis results include: at least one true diagnosis and at least one false diagnosis;
a processing module 402, configured to:
calculating a first probability that the true diagnostic result appears in the set of candidate diagnostic results:
wherein $P_1$ is the first probability, $N_{\text{true}}$ is the number of true diagnosis results contained in the set of candidate diagnostic results, and $N$ is the total number of results in the set of candidate diagnostic results;
determining a second probability according to the number of true and false diagnostic results present in the set of candidate diagnostic results;
and determining a predicted evaluation value according to the first probability and the second probability.
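The quantities used in this design can be counted with a short sketch. The counts follow the definitions given in the text. The ratio in `first_probability` is an assumption inferred from those definitions, because the formula itself is not reproduced in this text; the formulas for the second probability and the predicted evaluation value are likewise not reproduced and are deliberately not implemented here. All function names are hypothetical.

```python
from typing import Iterable, Tuple


def count_hits(candidates: Iterable[str],
               true_results: Iterable[str],
               false_results: Iterable[str]) -> Tuple[int, int, int]:
    """Return (N_true, N_false, N): the numbers of true and false diagnosis
    results found in the candidate set, and the size of the candidate set."""
    candidate_set = set(candidates)
    n_true = len(candidate_set & set(true_results))
    n_false = len(candidate_set & set(false_results))
    return n_true, n_false, len(candidate_set)


def first_probability(n_true: int, n: int) -> float:
    """ASSUMPTION: P1 = N_true / N. This ratio is inferred from the variable
    definitions only and is not confirmed by the source text."""
    return n_true / n if n else 0.0


if __name__ == "__main__":
    n_true, n_false, n = count_hits(
        candidates=["d1", "d2", "d3", "d4"],
        true_results=["d1"],
        false_results=["d2", "d9"],
    )
    print(n_true, n_false, n, first_probability(n_true, n))
```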
In one possible design, the processing module 402 is configured to determine the second probability based on the number of true and false diagnostic results that are present in the set of candidate diagnostic results, including:
or alternatively,
wherein $P_2$ is the second probability, $N_{\text{true}}$ is the number of true diagnostic results contained in the set of candidate diagnostic results, $N_{\text{false}}$ is the number of false diagnostic results contained in the set of candidate diagnostic results, and $N$ is the total number of results in the set of candidate diagnostic results.
In one possible design, the processing module 402, configured to determine the predicted evaluation value according to the first probability and the second probability, includes:
wherein $S$ is the predicted evaluation value, $P_1$ is the first probability, $P_2$ is the second probability, and $C$ is a preset adjustment coefficient.
It should be noted that, the system provided in the embodiment shown in fig. 4 may perform the method provided in any of the above method embodiments, and the specific implementation principles, technical features, explanation of terms, and technical effects are similar, and are not repeated herein.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 5, the electronic device 500 may include at least one processor 501 and a memory 502. Fig. 5 shows the electronic device with one processor as an example.
A memory 502 for storing a program. In particular, the program may include program code including computer-operating instructions.
The memory 502 may comprise high-speed RAM and may further comprise non-volatile memory, such as at least one disk memory.
The processor 501 is configured to execute computer-executable instructions stored in the memory 502 to implement the methods described in the method embodiments above.
The processor 501 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application.
Alternatively, the memory 502 may be separate or integrated with the processor 501. When the memory 502 is a device separate from the processor 501, the electronic device 500 may further include:
a bus 503 for connecting the processor 501 and the memory 502. The bus may be an industry standard architecture (ISA) bus, a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, and so on; this division does not mean that there is only one bus or only one type of bus.
Alternatively, in a specific implementation, if the memory 502 and the processor 501 are integrated on a chip, the memory 502 and the processor 501 may complete communication through an internal interface.
Embodiments of the present application also provide a computer-readable storage medium, which may include any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk. Specifically, the computer-readable storage medium stores program instructions for the methods in the above method embodiments.
The embodiments of the present application also provide a computer program product comprising a computer program which, when executed by a processor, implements the method of the above-described method embodiments.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (10)

1. The data processing method based on the medical knowledge graph is characterized by comprising the following steps of:
acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in a medical treatment process, and extracting key information of the patient in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of the preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than a preset threshold value, adjusting parameters of the preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
2. The medical knowledge-graph-based data processing method of claim 1, wherein determining a set of candidate diagnosis results from the medical knowledge graph, the patient key information, and the medical examination data using a preset search model, comprises:
extracting texts corresponding to the patient key information by using a voice text converter, and extracting features of the texts by using a preset neural network model to obtain a plurality of word vectors;
combining the word vector and the medical examination data into a retrieval subgraph, matching the retrieval subgraph with the medical knowledge graph, and determining a plurality of matching entities and corresponding matching degrees;
and combining the matching entities with the matching degree larger than or equal to a preset threshold value into the diagnosis result set to be selected.
3. The medical knowledge-graph-based data processing method according to claim 2, wherein the patient key information includes: dialogue information between doctor and patient in the inquiry process;
extracting text corresponding to the patient key information by using a voice text converter, and extracting features of the text by using a preset neural network model to obtain a plurality of word vectors, wherein the method comprises the following steps:
extracting a first text set and a second text set in the key information of the patient by using a voice text converter, wherein the first text set is used for representing a disease description text of the patient, and the second text set is used for representing a consultation text of the doctor;
judging whether a preset trigger word appears in the second text set;
if yes, the first text set is segmented according to the triggering time corresponding to the preset triggering word, and at least two first sub-text sets are obtained;
and extracting features of the first sub-text set after the triggering time by using a preset neural network model, and determining a plurality of word vectors.
4. A medical knowledge graph-based data processing method according to claim 3, wherein when there are multiple rounds of dialogs between the doctor and the patient in the inquiry process, the disease diagnosis result includes a plurality of intermediate diagnosis results, the intermediate diagnosis results correspond to inquiry contents of the doctor in the next round of dialogs, and the inquiry contents include new preset trigger words;
the determining the predicted evaluation value of the preset search model according to the disease diagnosis result and the to-be-selected diagnosis result set comprises the following steps:
and determining a predicted evaluation value corresponding to each round of dialogue according to the intermediate diagnosis result corresponding to each round of dialogue and the set of the to-be-selected diagnosis results corresponding to each round of dialogue.
5. The medical knowledge-graph-based data processing method according to any one of claims 1 to 4, wherein the disease diagnosis result includes: at least one true diagnosis and at least one false diagnosis;
the determining the predicted evaluation value of the preset search model according to the disease diagnosis result and the to-be-selected diagnosis result set comprises the following steps:
calculating a first probability that the true diagnostic result appears in the set of candidate diagnostic results:
wherein $P_1$ is the first probability, $N_{\text{true}}$ is the number of the true diagnosis results contained in the to-be-selected diagnosis result set, and $N$ is the total number of the results in the to-be-selected diagnosis result set;
determining a second probability based on the number of the true diagnostic results and the false diagnostic results that appear in the set of candidate diagnostic results;
and determining the predictive evaluation value according to the first probability and the second probability.
6. The medical knowledge-graph-based data processing method according to claim 5, wherein said determining a second probability from the number of the true diagnosis results and the false diagnosis results appearing in the set of the candidate diagnosis results includes:
or alternatively,
wherein $P_2$ is the second probability, $N_{\text{true}}$ is the number of the true diagnostic results contained in the set of candidate diagnostic results, $N_{\text{false}}$ is the number of the false diagnosis results contained in the candidate diagnosis result set, and $N$ is the total number of the results in the candidate diagnosis result set.
7. The medical knowledge-graph-based data processing method according to claim 5, wherein the determining the predictive evaluation value according to the first probability and the second probability includes:
wherein $S$ is the predicted evaluation value, $P_1$ is the first probability, $P_2$ is the second probability, and $C$ is a preset adjustment coefficient.
8. A medical knowledge-graph-based data processing system, comprising:
the acquisition module is used for acquiring audio-visual data, medical examination data and disease diagnosis results recorded by a patient in the medical treatment process;
a processing module for:
extracting patient key information in the audio-visual data;
determining a diagnosis result set to be selected according to the medical knowledge graph, the patient key information and the medical examination data by using a preset retrieval model;
determining a predicted evaluation value of the preset retrieval model according to the disease diagnosis result and the to-be-selected diagnosis result set;
and if the predicted evaluation value is lower than a preset threshold value, adjusting parameters of the preset retrieval model according to the patient key information and the medical examination data until the predicted evaluation value is greater than or equal to the preset threshold value.
9. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory to implement the medical knowledge-graph-based data processing method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the medical knowledge-graph based data processing method of any one of claims 1 to 7.
CN202310470069.XA 2023-04-27 2023-04-27 Data processing method and system based on medical knowledge graph Pending CN116779137A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310470069.XA CN116779137A (en) 2023-04-27 2023-04-27 Data processing method and system based on medical knowledge graph

Publications (1)

Publication Number Publication Date
CN116779137A true CN116779137A (en) 2023-09-19

Family

ID=87991967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310470069.XA Pending CN116779137A (en) 2023-04-27 2023-04-27 Data processing method and system based on medical knowledge graph

Country Status (1)

Country Link
CN (1) CN116779137A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105808931A (en) * 2016-03-03 2016-07-27 北京大学深圳研究生院 Knowledge graph based acupuncture and moxibustion decision support method and apparatus
CN113436754A (en) * 2021-07-06 2021-09-24 吴国军 Medical software and method for intelligent terminal inquiry
CN113724858A (en) * 2021-08-31 2021-11-30 平安国际智慧城市科技股份有限公司 Artificial intelligence-based disease examination item recommendation device, method and apparatus
CN115840809A (en) * 2021-09-18 2023-03-24 中山大学 Information recommendation method, device, equipment, system and storage medium
CN115910263A (en) * 2022-10-28 2023-04-04 北京邮电大学 PET/CT image report conclusion auxiliary generation method and device based on knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination