CN113505243A - Intelligent question-answering method and device based on medical knowledge graph - Google Patents
Intelligent question-answering method and device based on medical knowledge graph
- Publication number
- CN113505243A (application number CN202110863613.8A)
- Authority
- CN
- China
- Prior art keywords
- preset
- data
- medical knowledge
- model
- medical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H80/00—ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Public Health (AREA)
- Artificial Intelligence (AREA)
- Primary Health Care (AREA)
- Epidemiology (AREA)
- Biomedical Technology (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Pathology (AREA)
- Mathematical Physics (AREA)
- Human Computer Interaction (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to an intelligent question-answering method and device based on a medical knowledge graph, and belongs to the technical field of medical health. The method comprises the following steps: acquiring a patient consultation text; simultaneously performing graph search, text retrieval and semantic vector retrieval in a preset medical knowledge graph according to the patient consultation text and a preset consultation question-answer model, thereby obtaining three corresponding recall results; inputting all the recall results into a preset ranking scoring model to obtain scoring data for all the recall results; determining the target recall result with the highest score among all the recall results; and sending the target recall result to a preset terminal. The method and device retrieve answers to the user's question from the preset medical knowledge graph through three retrieval modes, namely graph search, text retrieval and semantic vector retrieval, and select from all retrieved answers the one that best matches the question, so that highly targeted answers can be provided.
Description
Technical Field
The invention relates to the technical field of medical treatment and health, in particular to an intelligent question-answering method and device based on a medical knowledge graph.
Background
A medical platform can provide users with an intelligent question-answering service for medical knowledge, so that users who cannot conveniently visit a hospital can obtain relevant medical answers through the platform. In the related art, natural language questions entered by users are interpreted by template matching: a graph data query statement is constructed from the question type of the matched template and the medical entities in the user's question, and answers are retrieved from the relevant medical knowledge graph.
However, medical knowledge is highly specialized and complex, so the answer retrieved by the related art through template matching, which is a single retrieval mode, may not be the answer the user wants. As a result, the related art has difficulty giving a highly targeted answer to the user's question, and answer accuracy is low.
Disclosure of Invention
In view of this, an intelligent question-answering method and device based on a medical knowledge graph are provided to solve the problem in the related art that it is difficult to provide a highly targeted answer to the user's question.
The invention adopts the following technical scheme:
In a first aspect, the present application provides an intelligent question-answering method based on a medical knowledge graph, including:
acquiring a patient consultation text;
simultaneously performing graph search, text retrieval and semantic vector retrieval in a preset medical knowledge graph according to the patient consultation text and a preset consultation question-answer model, thereby obtaining three corresponding recall results;
inputting all the recall results into a preset ranking scoring model to obtain scoring data of all the recall results;
determining a target recall result with the highest score from all the recall results;
and sending the target recall result to a preset terminal.
Preferably, the preset medical knowledge graph is constructed by the following method:
acquiring medical knowledge data; the medical knowledge data comprises structured data, semi-structured data, and unstructured data;
converting the structured data and the semi-structured data into first extraction result data based on a preset rule, and extracting second extraction result data from the unstructured data based on a preset medical knowledge automatic extraction model; the first extraction result data and the second extraction result data form a medical knowledge extraction result set;
and fusing the medical knowledge extraction result set with a preset open source knowledge base to obtain the preset medical knowledge graph.
Preferably, the data format of the first extraction result data is RDF triple or graph data;
the data format of the second extraction result data is RDF triple or graph data.
Preferably, the preset automatic medical knowledge extraction model is obtained by a model training method as follows:
acquiring an initial training corpus data set;
expanding the initial training corpus data set according to a preset data enhancement rule to obtain a final training corpus data set;
training the preset medical knowledge automatic extraction model on the final training corpus data set; the preset medical knowledge automatic extraction model is a BERT + BiLSTM + CRF model.
Preferably, the intelligent question-answering method based on the medical knowledge graph further comprises the following step: replacing the BERT base model in the BERT + BiLSTM + CRF model with a TinyALBERT Chinese model.
Preferably, fusing the medical knowledge extraction result set with a preset open source knowledge base to obtain the preset medical knowledge graph includes:
constructing a domain synonymous entity library based on the medical knowledge extraction result set and the preset open source knowledge base; the domain synonymous entity library comprises synonym pairs of medical entities;
establishing a medical entity mapping relation between the medical knowledge extraction result set and the preset open source knowledge base according to the domain synonymous entity library;
and fusing the medical knowledge extraction result set with the preset open source knowledge base according to the medical entity mapping relation to obtain the preset medical knowledge graph.
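The fusion steps above can be illustrated with a minimal sketch. It assumes that triples are plain `(head, relation, tail)` tuples and that each synonym pair is written as `(alias, canonical name)`; the function names `build_entity_mapping` and `fuse`, the relation names and the toy data are hypothetical and do not come from the application.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def build_entity_mapping(synonym_pairs: List[Tuple[str, str]]) -> Dict[str, str]:
    """Map each alias to a canonical entity name.

    Assumption: every pair is (alias, canonical); the application does not
    specify how the synonym library is laid out.
    """
    return {alias: canonical for alias, canonical in synonym_pairs}


def fuse(extracted: Iterable[Triple],
         open_source: Iterable[Triple],
         mapping: Dict[str, str]) -> Set[Triple]:
    """Merge two triple sets after rewriting entities to canonical names."""
    def norm(name: str) -> str:
        return mapping.get(name, name)

    fused: Set[Triple] = set()
    for head, relation, tail in list(extracted) + list(open_source):
        fused.add((norm(head), relation, norm(tail)))
    return fused


# Toy data for illustration only.
synonyms = [("flu", "influenza")]
extracted_set = [("flu", "has_symptom", "fever")]
open_kb = [("influenza", "has_symptom", "fever"),
           ("influenza", "treated_by", "oseltamivir")]
fused_graph = fuse(extracted_set, open_kb, build_entity_mapping(synonyms))
```

Because both sources are normalized to the same canonical name before merging, the duplicate fact about fever collapses into a single triple.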
Preferably, after the medical knowledge extraction result set is fused with the preset open source knowledge base according to the medical entity mapping relation to obtain the preset medical knowledge graph, the method further includes:
storing the entity-relationship data in the preset medical knowledge graph into a preset Neo4j graph database, and storing the entity-attribute data in the preset medical knowledge graph into a preset ElasticSearch.
Preferably, the preset consultation question-answer model is constructed by the following method:
acquiring a training data set;
constructing a Sentence Transformers twin (Siamese) BERT model;
calculating sentence semantic similarity in the training data set based on the Sentence Transformers twin BERT model;
subjecting the Sentence Transformers twin BERT model to distillation compression;
loading the Sentence Transformers twin BERT model after distillation compression, with TinyALBERT selected;
fine-tuning the Sentence Transformers twin BERT model according to the sentence semantic similarity;
sending the fine-tuned Sentence Transformers twin BERT model to a preset intelligent question-answering system;
converting the question samples in the preset intelligent question-answering system into sentence vectors of domain knowledge question-answer sentences through a prediction interface of the fine-tuned Sentence Transformers twin BERT model;
and storing the sentence vectors of the domain knowledge question-answer sentences into a preset vector storage engine, and creating semantic indexes to obtain the preset consulting question-answer model.
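The last two steps (encoding question samples into sentence vectors and indexing them for semantic search) can be sketched as follows. This is a simplified stand-in, not the described system: `toy_encode` is a hypothetical bag-of-words encoder used in place of the fine-tuned twin BERT model, and the in-memory `SemanticIndex` class with exhaustive cosine search stands in for the preset vector storage engine.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticIndex:
    """In-memory index of (question vector, question, answer) entries."""

    def __init__(self, encode: Callable[[str], Vector]) -> None:
        self.encode = encode
        self.items: List[Tuple[Vector, str, str]] = []

    def add(self, question: str, answer: str) -> None:
        self.items.append((self.encode(question), question, answer))

    def search(self, query: str, k: int = 3) -> List[Tuple[float, str, str]]:
        """Return the k most similar stored questions with their answers."""
        q = self.encode(query)
        scored = [(cosine(q, v), question, answer)
                  for v, question, answer in self.items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:k]


# Hypothetical encoder: word counts over a tiny fixed vocabulary.
VOCAB = ["fever", "cough", "headache", "insomnia"]


def toy_encode(text: str) -> Vector:
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


index = SemanticIndex(toy_encode)
index.add("fever and cough", "A1")
index.add("insomnia at night", "A2")
results = index.search("persistent cough", k=1)
```

A production vector engine would replace the exhaustive scan with an approximate nearest neighbor index, but the interface (add vectors, query for the top K) stays the same.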
Preferably, the preset ranking scoring model is a learning-to-rank (L2R) model.
In a second aspect, the present application provides an intelligent question-answering device based on medical knowledge-graph, comprising:
the consultation text acquisition module is used for acquiring patient consultation texts;
the retrieval module is used for simultaneously performing graph search, text retrieval and semantic vector retrieval in a preset medical knowledge graph according to the patient consultation text and a preset consultation question-answer model, thereby obtaining three corresponding recall results;
the scoring module is used for inputting all the recall results into a preset ranking scoring model to obtain scoring data of all the recall results;
the target recall result determining module is used for determining a target recall result with the highest score in all the recall results;
and the data sending module is used for sending the target recall result to a preset terminal.
By adopting the above technical scheme, the invention provides an intelligent question-answering method based on a medical knowledge graph, which comprises the following steps: acquiring a patient consultation text; simultaneously performing graph search, text retrieval and semantic vector retrieval in a preset medical knowledge graph according to the patient consultation text and a preset consultation question-answer model, thereby obtaining three corresponding recall results; inputting all the recall results into a preset ranking scoring model to obtain scoring data of all the recall results; determining the target recall result with the highest score among all the recall results; and sending the target recall result to a preset terminal. In this way, answers to the user's question are retrieved from the preset medical knowledge graph through three retrieval modes, namely graph search, text retrieval and semantic vector retrieval, and the answer that best matches the question is selected from all retrieved answers, so that the method can provide highly targeted answers with high accuracy.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of an intelligent question-answering method based on a medical knowledge graph according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of an automatic extraction model of preset medical knowledge according to an embodiment of the present application.
Fig. 3 is a schematic diagram of the Sentence Transformers training model provided in an embodiment of the present application.
Fig. 4 is a schematic diagram of calculating the similarity between two sentences by using sentence vectors according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of an intelligent question-answering device based on a medical knowledge graph according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
Fig. 1 is a schematic flow chart of an intelligent question-answering method based on a medical knowledge graph according to an embodiment of the present invention. As shown in fig. 1, the intelligent question-answering method based on medical knowledge graph of the present embodiment includes:
S101, acquiring a patient consultation text;
S102, simultaneously performing graph search, text retrieval and semantic vector retrieval in a preset medical knowledge graph according to the patient consultation text and a preset consultation question-answer model, thereby obtaining three corresponding recall results;
S103, inputting all the recall results into a preset ranking scoring model to obtain scoring data of all the recall results;
S104, determining the target recall result with the highest score among all the recall results;
S105, sending the target recall result to a preset terminal.
Specifically, the patient consultation text can be obtained in various ways. For example, when a patient visits the medical platform, the patient fills in and submits the consultation text following the platform's guidance, and the platform thereby obtains the patient consultation text.
After the platform acquires the patient consultation text, it performs query analysis on the text. Specifically, the consultation text undergoes automatic error correction, query rewriting, word segmentation, keyword extraction, term normalization and conversion into query commands, and text recall is performed based on the BM25 algorithm. The consultation text is also converted into a semantic vector representation by a preset sentence vector representation model and mapped into the same semantic vector space as the question sentence vectors of the relevant domain. A semantic-vector-based retrieval request is then submitted to the Milvus engine of the preset consultation question-answer model, and the K answers that best match the consultation text are determined through an approximate nearest neighbor (ANN) algorithm, which realizes semantic vector recall. In addition, the consultation text is converted by corresponding rules into a Cypher query statement for the Neo4j graph database, and answers are retrieved from the preset medical knowledge graph by graph search to obtain the corresponding recall result. Full-text retrieval of the consultation text is also performed through Elasticsearch (ES) to obtain the corresponding recall result.
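To make the BM25-based text recall concrete, the following is a minimal sketch of BM25 scoring over pre-tokenized documents. It uses a common smoothed IDF variant and the customary defaults `k1=1.5` and `b=0.75`; these parameter values, the function name `bm25_scores` and the toy documents are assumptions, since the application does not state them.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query_tokens: List[str],
                docs_tokens: List[List[str]],
                k1: float = 1.5,
                b: float = 0.75) -> List[float]:
    """Score every document against the query with BM25."""
    if not docs_tokens:
        return []
    n_docs = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n_docs

    # Document frequency: number of documents containing each term.
    df: Counter = Counter()
    for doc in docs_tokens:
        df.update(set(doc))

    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            length_norm = 1.0 - b + b * len(doc) / avgdl
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Toy, already-segmented documents for illustration only.
docs = [["fever", "cough", "treatment"],
        ["insomnia", "anxiety", "treatment"],
        ["diet", "advice"]]
scores = bm25_scores(["fever", "treatment"], docs)
```

The first document matches both query terms and scores highest, the second matches only the more common term, and the third matches nothing. In the described system this scoring would typically be delegated to the full-text search engine rather than computed by hand.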
After the three recall results are obtained, all of them are input into the preset ranking scoring model to obtain scoring data for all the recall results. The target recall result with the highest score is then determined among all the recall results. Finally, the target recall result is sent to the preset terminal so that the patient can read the answer from the content displayed on the preset terminal.
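The overall flow of steps S101 to S105 can be sketched as below. The three recall functions and the word-overlap scorer are hypothetical stand-ins with toy data; in the described system they would be the graph search, full-text retrieval and vector retrieval back ends and the preset ranking scoring model. A thread pool is used here only as one possible way to run the three retrievals at the same time.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Recall:
    source: str  # "graph", "text" or "vector"
    answer: str


Recaller = Callable[[str], List[Recall]]


def answer_question(text: str,
                    recallers: List[Recaller],
                    score: Callable[[str, Recall], float]) -> Optional[Recall]:
    """Run all recallers concurrently, score every candidate, return the best."""
    if not recallers:
        return None
    with ThreadPoolExecutor(max_workers=len(recallers)) as pool:
        batches = list(pool.map(lambda recall: recall(text), recallers))
    candidates = [c for batch in batches for c in batch]
    if not candidates:
        return None
    return max(candidates, key=lambda c: score(text, c))


# Hypothetical stand-ins for the three retrieval back ends (toy data).
def graph_recall(text: str) -> List[Recall]:
    return [Recall("graph", "cough is a symptom of influenza")]


def text_recall(text: str) -> List[Recall]:
    return [Recall("text", "rest and fluids help a cough")]


def vector_recall(text: str) -> List[Recall]:
    return [Recall("vector", "insomnia may improve with sleep hygiene")]


def overlap_score(text: str, candidate: Recall) -> float:
    """Toy scorer: number of distinct words shared by question and answer."""
    shared = set(text.lower().split()) & set(candidate.answer.lower().split())
    return float(len(shared))


best = answer_question("how to help a cough",
                       [graph_recall, text_recall, vector_recall],
                       overlap_score)
```

Keeping the scorer as a separate callable mirrors the description, where recall and ranking are distinct stages and the ranking model can be swapped without touching the retrieval back ends.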
By adopting the above technical scheme, the intelligent question-answering method based on the medical knowledge graph comprises the following steps: acquiring a patient consultation text; simultaneously performing graph search, text retrieval and semantic vector retrieval in a preset medical knowledge graph according to the patient consultation text and a preset consultation question-answer model, thereby obtaining three corresponding recall results; inputting all the recall results into a preset ranking scoring model to obtain scoring data of all the recall results; determining the target recall result with the highest score among all the recall results; and sending the target recall result to a preset terminal. In this way, answers to the user's question are retrieved from the preset medical knowledge graph through three retrieval modes, namely graph search, text retrieval and semantic vector retrieval, and the answer that best matches the question is selected from all retrieved answers, so that the method can provide highly targeted answers to the user's question.
Preferably, the preset medical knowledge graph is constructed by the following method:
acquiring medical knowledge data; the medical knowledge data comprises structured data, semi-structured data, and unstructured data;
converting the structured data and the semi-structured data into first extraction result data based on a preset rule, and extracting second extraction result data from the unstructured data based on a preset medical knowledge automatic extraction model; the first extraction result data and the second extraction result data form a medical knowledge extraction result set;
and fusing the medical knowledge extraction result set with a preset open source knowledge base to obtain the preset medical knowledge graph.
In detail, the structured data includes data in relational databases, Excel resources, professional classifications, and domain dictionaries. Semi-structured data includes network resources and encyclopedia data for the vertical medical domain. Unstructured data includes web resources in the vertical medical domain, medical professional literature, professional textbooks, and training courses.
Structured data and semi-structured data are converted into triple data through rules manually defined in advance, and batch processing tasks are used to acquire the initial domain knowledge representation data quickly and efficiently. In a specific application, for structured data such as desensitized electronic medical record data, a Chinese symptom knowledge base, mental health diagnosis tables, medical industry standards and specifications, professional classification systems and industry open source data, the two-dimensional table data is converted into attribute graph data. For semi-structured data, the content data of the relevant subdivided field is first selected according to the conversation scenario, and a wrapper is customized for that content data. The wrapper is then defined, generated, updated, and maintained. Finally, target data is extracted from the relevant database through the wrapper and is structured and normalized so that it can be represented in the attribute graph database.
In a specific application, a graph database schema is first defined according to medical domain terminology and business rules, covering the entity types, entity relationships and entity attribute ranges of the medical knowledge graph. The entity types include diseases, drugs and symptoms, among others. The entity relationships include disease-symptom, disease-drug, disease-diet (appropriate or contraindicated), and disease-disease (complication). Entity attributes include disease attributes, drug attributes, auxiliary examination attributes, and surgical attributes. Then, a medical data source is selected, and a wrapper built on a web crawler package is constructed according to the predefined entity types, entity relationships and entity attributes. The medical data source includes structured data and semi-structured data. Finally, the medical business rules, the website element rules and the customized web crawler wrapper are applied: target data and target business rules are extracted from the structured data and the semi-structured data through the wrapper, all the target data is expressed in triple form, and the triples are stored in a medical knowledge extraction result attribute database.
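As a rough illustration of rule-based conversion from two-dimensional table data to triples under a predefined schema, consider the sketch below. The `SCHEMA` dictionary, the relation names, the column names and the semicolon-separated cell format are all assumptions made for the example, and the row is toy data.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical schema: relation name -> (head entity type, tail entity type).
SCHEMA: Dict[str, Tuple[str, str]] = {
    "has_symptom": ("Disease", "Symptom"),
    "treated_by": ("Disease", "Drug"),
    "complication": ("Disease", "Disease"),
}


def rows_to_triples(rows: List[Dict[str, str]],
                    column_relations: Dict[str, str],
                    head_column: str = "disease") -> List[Triple]:
    """Turn table rows into (head, relation, tail) triples.

    Each mapped column may hold several values separated by ';'.
    Relations that are not declared in SCHEMA are rejected.
    """
    triples: List[Triple] = []
    for row in rows:
        head = row[head_column]
        for column, relation in column_relations.items():
            if relation not in SCHEMA:
                raise ValueError(f"relation {relation!r} is not in the schema")
            for value in row.get(column, "").split(";"):
                value = value.strip()
                if value:
                    triples.append((head, relation, value))
    return triples


table = [{"disease": "influenza",
          "symptoms": "fever; cough",
          "drugs": "oseltamivir"}]
triples = rows_to_triples(table, {"symptoms": "has_symptom",
                                  "drugs": "treated_by"})
```

Validating each relation against the schema before emitting a triple keeps the extracted data within the entity relationships that were defined up front.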
In business scenarios such as mental health consultation, a strategy for jointly annotating entities and relationships is defined, in which each label carries both the entity information and the relationship between entities. Based on this labeling strategy, the joint extraction of entities and relationships can be converted into a sequence labeling problem in natural language processing. The task is modeled end to end with neural networks, without complex feature engineering. In this embodiment, a corpus for the preset medical knowledge automatic extraction model is constructed from collected and screened professional mental health consultation content, and a BERT + BiLSTM + CRF model is trained on this corpus as a joint learning model for named entity recognition and relationship extraction.
The preset medical knowledge automatic extraction model is constructed by the following steps:
Step one: acquire and segment professional literature and website text content to obtain an initial text data set.
Step two: predefine entity relationships. The knowledge extraction task is to extract entities and the relationships between them from unstructured text data to form triples of the form (entity A, relationship between entity A and entity B, entity B), where the relationship is a predefined entity relationship. Through analysis and rule-based screening of the material data, entity relationships including alias, disease-symptom and disease-disease (complication) are predefined. Data samples are constructed for cases such as single-entity 1-1 relationships, single-entity 1-N relationships, and multiple entities with multiple relationships.
Step three: determine a data annotation strategy and annotate the initial text data set. Specifically, the BIOES labeling specification is adopted, and entities and relationships are labeled according to the predefined entity relationships. Data with single-entity 1-1 relationships, single-entity 1-N relationships, and multiple entities with multiple relationships are all labeled in the same way, so that the model can effectively complete entity recognition and relationship extraction in complex scenarios. The labeled content comprises the position information of entity words, the type information of entity relationships, the role information of entities, and the direction of entity relationships.
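The BIOES scheme mentioned in step three can be sketched as follows: B, I and E mark the beginning, inside and end of a multi-token entity, S marks a single-token entity, and O marks tokens outside any entity. The function name `bioes_tags`, the label names and the English tokens are illustrative assumptions; the richer labels described above (relationship type, role and direction) could be encoded in the label string but are omitted here.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, label)


def bioes_tags(tokens: List[str], spans: List[Span]) -> List[str]:
    """Produce one BIOES tag per token from non-overlapping entity spans."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"S-{label}"          # single-token entity
        else:
            tags[start] = f"B-{label}"          # first token
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"          # inner tokens
            tags[end - 1] = f"E-{label}"        # last token
    return tags


tokens = ["type", "2", "diabetes", "causes", "thirst"]
tags = bioes_tags(tokens, [(0, 3, "DISEASE"), (4, 5, "SYMPTOM")])
```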
Step four: expand the initial training corpus data set according to a preset data enhancement rule to obtain the final training corpus data set. Specifically, based on the natural-language semantic expressions and sentence patterns of the valid corpus, the labeled data set is expanded by replacing entity items with different entity items of the same type, by synonymous expression replacement, and by sentence pattern reconstruction. The labeled data set covers the alias, disease-symptom and disease-disease (complication) relationships.
Replacement with different entity items of the same type refers to replacing a text segment in a sentence with another entity item of the same entity type. Entity instance replacement of this kind increases data diversity and introduces noise data. Synonymous expression replacement refers to replacing some text segments in a sentence with different expressions that have the same semantics. Sentence pattern reconstruction refers to rewriting the sentence pattern based on the different expression rules of Chinese natural language sentence patterns, without changing the semantic information of the original sentence.
It should be noted that, whichever of these methods is applied, the labeling information of the text must not be lost; that is, the replaced text content also carries the corresponding entity relationship tags. The finally output samples are corpora carrying the labeling information.
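A minimal sketch of same-type entity replacement that keeps the labels aligned is shown below. It assumes BIOES tags as in step three; the function name `replace_entity` and the example sentence are hypothetical.

```python
from typing import List, Tuple


def replace_entity(tokens: List[str],
                   tags: List[str],
                   span: Tuple[int, int],
                   new_tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Swap the entity at span=(start, end_exclusive) for new_tokens.

    The replacement inherits the original entity label, and its BIOES tags
    are regenerated so that tokens and tags stay the same length.
    """
    start, end = span
    label = tags[start].split("-", 1)[1]
    if len(new_tokens) == 1:
        new_tags = [f"S-{label}"]
    else:
        inner = [f"I-{label}"] * (len(new_tokens) - 2)
        new_tags = [f"B-{label}"] + inner + [f"E-{label}"]
    return (tokens[:start] + new_tokens + tokens[end:],
            tags[:start] + new_tags + tags[end:])


sentence = ["influenza", "causes", "fever"]
labels = ["S-DISEASE", "O", "S-SYMPTOM"]
aug_tokens, aug_tags = replace_entity(sentence, labels, (0, 1),
                                      ["viral", "pneumonia"])
```

Regenerating the tags from the entity label, rather than copying the old tags, is what allows a one-token entity to be replaced by a multi-token one without losing the labeling information.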
Step five: the corpus carrying annotation information is passed through a BERT pre-trained language model to obtain the corresponding word vectors. The word vectors are then input into a BiLSTM module for further processing, and the processed word vectors are input into a CRF module to obtain a predicted tag sequence. Finally, each entity in the sequence is extracted and classified, completing the whole Chinese entity recognition process.
The preset medical knowledge automatic extraction model of this embodiment does not require the user to train character vectors and word vectors in advance; the sequence only needs to be input directly into BERT, which automatically extracts rich word-level features, grammatical structure features, and semantic features from the sequence. BERT learns the semantic features of the corpus, the BiLSTM learns longer-range context relations between words, and the CRF corrects sequence errors in the BiLSTM predictions. This embodiment can use the BERT model directly, which has the advantage of high accuracy but the disadvantage of low inference speed. To address this disadvantage, this embodiment may instead use a compressed version of the BERT base model in place of BERT, so as to improve the inference speed of the whole model without reducing its accuracy. In addition, because knowledge extraction is completed automatically by the preset medical knowledge automatic extraction model, the dependence on expert knowledge in the medical field, the workload of manual annotation, and the cost of data cleaning are all reduced.
In addition, to address the problem of training a model on a small sample, where manually annotated high-quality training data is limited, this embodiment applies a data enhancement strategy: the basic training data is expanded through template-rule transformation operations, creating more new training data. Data enhancement increases the amount of training data and produces more diverse data, which improves the generalization ability of the model; it can also add noise data, which improves the robustness of the model.
Fig. 2 is a schematic structural diagram of the preset medical knowledge automatic extraction model according to an embodiment of the present application. As shown in Fig. 2, in the preset medical knowledge automatic extraction model of this embodiment, B denotes the beginning of a semantic block and labels the first word in the block; I denotes the middle content of a semantic block; O denotes content that does not belong to any semantic block; and E denotes the end of a semantic block. (In the standard BIOES scheme, S additionally marks a semantic block consisting of a single word.)
Question topic labels to be recognized and extracted are designed for questions in a specific field, which may be the mental health field. These labels serve as the annotation standard for training data in the NLP sequence labeling task and enable the model to extract domain entities and entity relations from a question. The labels are shown in the following table:
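To make the tag semantics concrete, the sketch below shows one way a predicted BIOES tag sequence could be decoded into entity spans. It is an illustrative assumption, not the decoding procedure of this embodiment; the function name, the label names, and the choice to silently drop ill-formed spans are hypothetical.

```python
from typing import List, Tuple


def extract_entities(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Decode BIOES tags into (label, start, end_exclusive) spans.

    Ill-formed sequences (for example an E tag with no preceding B tag)
    are dropped rather than repaired.
    """
    spans: List[Tuple[str, int, int]] = []
    start, label = None, None
    for i, tag in enumerate(tags):
        prefix, _, lab = tag.partition("-")
        if prefix == "S":                      # single-token entity
            spans.append((lab, i, i + 1))
            start, label = None, None
        elif prefix == "B":                    # open a new entity
            start, label = i, lab
        elif prefix == "I":                    # must continue the open entity
            if start is None or lab != label:
                start, label = None, None
        elif prefix == "E":                    # close the open entity
            if start is not None and lab == label:
                spans.append((lab, start, i + 1))
            start, label = None, None
        else:                                  # "O": outside any entity
            start, label = None, None
    return spans


predicted = ["B-DIS", "I-DIS", "E-DIS", "O", "S-SYM"]
entities = extract_entities(predicted)
```

Each returned span can then be sliced out of the token sequence and classified by its label.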
Preferably, after the structured data and the semi-structured data are converted into first extraction result data based on a preset rule, and second extraction result data are extracted from the unstructured data based on a preset medical knowledge automatic extraction model, the preset medical knowledge graph construction method further includes: manually checking the second extraction result data, and adding the second extraction result data to the medical knowledge extraction result set after it passes the check.
Preferably, the data format of the first extraction result data is RDF triple or graph data; the data format of the second extraction result data is RDF triple or graph data.
Preferably, the step of fusing the medical knowledge extraction result set with a preset open source knowledge base to obtain the preset medical knowledge map comprises:
constructing a domain synonymous entity library based on the medical knowledge extraction result set and the preset open source knowledge library; the domain synonym entity library comprises synonym pairs of medical entities;
establishing a medical entity mapping relation between the medical knowledge extraction result set and the preset open source knowledge base according to the domain synonymous entity base;
and fusing the medical knowledge extraction result set with a preset open source knowledge base according to the medical entity mapping relation to obtain the preset medical knowledge map.
In detail, synonym pairs are first crawled from relevant websites by a web crawler, using XPath data extraction rules based on the DOM model of the page content. Specifically, web crawler wrappers are constructed for medical entity types such as diseases, symptoms, examinations, preventive measures, and medicines; aliases, English names, abbreviations, and the like of medical concepts are acquired from relevant websites through the wrappers; and the resulting word lists are output as the basis of the domain synonymous entity library.
Then, using the alias relations extracted by the medical knowledge extraction model during the extraction of domain entities and entity relations, domain entity pairs with a valid alias relation acquired from the unstructured data are added to the domain synonymous entity library.
Next, the domain synonymous entity library is constructed further using word-vector semantic similarity. Specifically, texts from medical textbooks and online articles are extracted and segmented into Chinese words, and are used as the corpus for training word2vec word vectors. Using the semantic relatedness of the word vectors, the similarity between an entity and other strings is calculated to find the top-n similar entities of that entity, and valid entity pairs are added to the domain synonymous entity library after screening.
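The top-n lookup described here can be sketched with plain cosine similarity over already-trained word vectors. This is a minimal illustration under stated assumptions: the vectors are supplied as a dictionary, the toy entity names and values are invented for the example, and `min_sim` is a hypothetical screening threshold that the source does not specify.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_n_similar(
    entity: str,
    vectors: Dict[str, Sequence[float]],
    n: int = 3,
    min_sim: float = 0.0,
) -> List[Tuple[str, float]]:
    """Return up to n other entries most similar to `entity`, best first."""
    target = vectors[entity]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != entity
    ]
    scored = [pair for pair in scored if pair[1] >= min_sim]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy two-dimensional vectors standing in for trained word2vec vectors.
toy_vectors = {
    "influenza": [1.0, 0.0],
    "flu": [1.0, 0.1],
    "fracture": [0.0, 1.0],
}
candidates = top_n_similar("influenza", toy_vectors, n=1, min_sim=0.5)
```

The candidates returned by such a lookup would still go through the screening step before being added to the domain synonymous entity library.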
Then, entities are aligned based on the similarity of the local relations and attributes of the domain entities. Specifically, taking disease entities as the target, important relations and attributes of diseases are selected as the factors for measuring entity similarity, a weight is set for each, and the overall similarity is calculated by weighted summation. Finally, similar disease entities across knowledge bases from different sources are found through threshold screening, and the valid synonymous entity pairs are added to the domain synonymous entity library.
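The weighted-sum alignment can be illustrated as follows. The source does not say how the similarity of each individual factor is measured, so this sketch assumes Jaccard overlap per factor; the factor names, weights, and the 0.7 threshold are likewise hypothetical.

```python
from typing import Dict, Iterable


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard overlap of two value sets (0.0 when both are empty)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def entity_similarity(
    e1: Dict[str, Iterable[str]],
    e2: Dict[str, Iterable[str]],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of per-factor similarities, normalised by total weight."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(
        w * jaccard(e1.get(factor, ()), e2.get(factor, ()))
        for factor, w in weights.items()
    )
    return weighted / total_weight


def is_same_entity(e1, e2, weights, threshold: float = 0.7) -> bool:
    """Threshold screening: treat the pair as synonymous above the threshold."""
    return entity_similarity(e1, e2, weights) >= threshold


disease_a = {"symptom": {"fever", "cough"}, "drug": {"drug_x"}}
disease_b = {"symptom": {"fever", "cough"}, "drug": {"drug_y"}}
factor_weights = {"symptom": 0.75, "drug": 0.25}
score = entity_similarity(disease_a, disease_b, factor_weights)
```

Here the symptom factor matches fully and the drug factor not at all, so the overall score equals the symptom weight.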
Then, a medical entity mapping relation between the medical knowledge extraction result set and the preset open source knowledge base is established according to the domain synonymous entity library, and the medical knowledge extraction result set is fused with the preset open source knowledge base according to the medical entity mapping relation to obtain the preset medical knowledge graph. Specifically, direct mappings between entities in different knowledge bases are established through the domain synonymous entity library. After a synonymous entity pair is found, the differing information between the two entities is processed and the relations and attributes of the medical entity are merged; this task includes redundancy handling and merging of differing values. When attribute knowledge is merged, the case where the same attribute has different attribute values must be considered. This embodiment sets a source confidence for each knowledge base according to its level, credibility, and authority, so that when several knowledge bases conflict, the attribute value from the knowledge base with the highest confidence is retained.
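The conflict rule for attribute values can be sketched as below. The record layout, source names, and confidence values are assumptions for illustration; the source only states that the value from the higher-confidence knowledge base is retained.

```python
from typing import Dict, List, Tuple

# One record is (source knowledge base, attribute name, attribute value).
Record = Tuple[str, str, str]


def merge_attributes(
    records: List[Record],
    source_confidence: Dict[str, float],
) -> Dict[str, str]:
    """Merge attribute values of one entity, keeping for each attribute the
    value that comes from the source with the highest confidence.

    On a confidence tie the first value seen is kept; unknown sources get 0.0.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for source, attribute, value in records:
        confidence = source_confidence.get(source, 0.0)
        if attribute not in best or confidence > best[attribute][1]:
            best[attribute] = (value, confidence)
    return {attribute: value for attribute, (value, _) in best.items()}


# Hypothetical sources and confidences.
confidence = {"textbook_kb": 0.9, "crawled_kb": 0.5}
records = [
    ("crawled_kb", "department", "general medicine"),
    ("textbook_kb", "department", "respiratory medicine"),
    ("crawled_kb", "alias", "flu"),
]
merged = merge_attributes(records, confidence)
```

Redundant records from the same source collapse naturally, because a later identical value never has strictly higher confidence than the one already stored.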
Preferably, the step of fusing the medical knowledge extraction result set with a preset open source knowledge base according to the medical entity mapping relationship to obtain the preset medical knowledge map further includes:
storing the entity-relationship data in the preset medical knowledge graph into a preset Neo4j graph database, and storing the entity-attribute data in the preset medical knowledge graph into a preset ElasticSearch.
Specifically, the entity-relationship data in the preset medical knowledge graph is stored in a preset Neo4j graph database to represent the entities and relations in the knowledge graph, so that a front-end application can present the relations among the various domain concepts as a visual association network. The entity-attribute data in the preset medical knowledge graph is stored in a preset ElasticSearch, where a mapping is defined and a full-text index is built. In this embodiment, the domain knowledge is stored by combining a graph database with an ElasticSearch database, and a multi-dimensional index is constructed, so that this embodiment supports intelligent search that fuses several retrieval algorithms in the question-answering stage.
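Before anything is written to either store, the fused triples have to be routed to the right one. The sketch below shows one possible routing rule and is only an assumption: a triple whose object is itself a known entity is treated as entity-relationship data (destined for the graph database), and any other triple is treated as entity-attribute data (destined for the full-text index). No database client calls are shown.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def split_triples(
    triples: List[Triple],
    entities: Set[str],
) -> Tuple[List[Triple], Dict[str, Dict[str, str]]]:
    """Split triples into graph edges and per-entity attribute documents."""
    edges: List[Triple] = []
    documents: Dict[str, Dict[str, str]] = {}
    for subject, predicate, obj in triples:
        if obj in entities:
            # entity -> entity: a relation, stored as a graph edge
            edges.append((subject, predicate, obj))
        else:
            # entity -> literal: an attribute, stored in the entity's document
            documents.setdefault(subject, {})[predicate] = obj
    return edges, documents


known_entities = {"influenza", "fever"}
triples = [
    ("influenza", "has_symptom", "fever"),
    ("influenza", "description", "an acute respiratory infection"),
]
edges, documents = split_triples(triples, known_entities)
```

The edge list and the document dictionary would then be handed to whatever loaders the deployment uses for the graph database and the search index.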
Preferably, the preset consultation question-answer model is constructed by the following method:
acquiring a training data set;
constructing a Sentence Transformers twin BERT model;
calculating sentence semantic similarity in the training data set based on the Sentence Transformers twin BERT model;
subjecting the Sentence Transformers twin BERT model to a distillation compression process;
loading the distillation-compressed Sentence Transformers twin BERT model, and selecting albert_chinese_tiny (TinyALBERT);
fine-tuning the Sentence Transformers twin BERT model according to the sentence semantic similarity;
sending the fine-tuned Sentence Transformers twin BERT model to a preset intelligent question-answering system;
converting the question samples in the preset intelligent question-answering system into sentence vectors of domain knowledge question-answer sentences through a prediction interface of the fine-tuned Sentence Transformers twin BERT model;
and storing the sentence vectors of the domain knowledge question-answer sentences into a preset vector storage engine, and creating semantic indexes to obtain the preset consulting question-answer model.
In detail, data are first collected from a public psychological counseling question-and-answer corpus using cold-start and data enhancement methods, and are then manually annotated to generate a training data set of similar-sentence pairs for psychological counseling questions with a balanced ratio of positive and negative samples.
Then, based on the pre-trained BERT model, sentence semantic similarity is calculated, completing the domain transfer learning of the sentence vector representation model. Fig. 3 shows the Sentence Transformers training model provided in an embodiment of the present application. Fig. 4 shows how the similarity between two sentences is calculated from their sentence vectors according to an embodiment of the present application. As shown in Figs. 3 and 4, u and v are the vector representations of the two input sentences, |u-v| is the element-wise absolute value of the difference between the two vectors, and (u, v, |u-v|) is the concatenation of the three vectors along dimension -1 (the last dimension), so that the resulting vector has dimension 3 × d, where d is the hidden layer dimension.
Next, the distillation-compressed Sentence Transformers twin BERT model is loaded, with TinyALBERT selected. The Sentence Transformers model is trained, and the cosine of two sentence vectors is used to measure the semantic similarity of the two texts. The pre-trained model is fine-tuned; the fine-tuned BERT model is saved, packaged, and released to the production environment of the mental health counseling intelligent question-answering system. The question samples in the psychological counseling domain question-answer data set are converted into sentence vector representations of domain knowledge question-answer sentences through the prediction interface of the fine-tuned BERT model. The generated sentence vectors are stored in the vector storage engine Milvus, and a semantic index is created to enable high-speed vector retrieval. In this way, the pre-trained BERT model is fine-tuned with professional domain corpora to achieve domain transfer learning, and the BERT model is used to produce semantic vector representations of questions, extracting richer semantic features and improving the performance of downstream NLP tasks.
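The concatenation (u, v, |u-v|) can be written out directly. The sketch below uses plain Python lists in place of tensors, so "along the last dimension" reduces to list concatenation; the function name is an assumption for the example.

```python
from typing import List, Sequence


def pair_features(u: Sequence[float], v: Sequence[float]) -> List[float]:
    """Build the (u, v, |u - v|) feature vector for two sentence vectors.

    For inputs of dimension d the result has dimension 3 * d.
    """
    if len(u) != len(v):
        raise ValueError("u and v must have the same dimension d")
    abs_diff = [abs(a - b) for a, b in zip(u, v)]  # element-wise |u - v|
    return list(u) + list(v) + abs_diff


# Toy sentence vectors with hidden dimension d = 2.
u = [1.0, 2.0]
v = [3.0, 0.0]
features = pair_features(u, v)
```

With d = 2 the result has 6 components, matching the 3 × d dimension stated above.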
Preferably, the preset ranking score model is an L2R model.
Specifically, the L2R model is constructed by the following method:
acquiring training data of an L2R model; the training data for the L2R model may be data similar to the recall results previously described;
determining a similarity score of the text lengths of the two questions; and determining the score of the target data based on the Skip-Gram Scorer, wherein the score is calculated according to the following formula:
wherein P_sb and Q_sb denote the Skip-Gram sets of the two questions;
Term Match Scorer: calculates the sum of the idf values of the matching terms and the sum of the idf values of all terms in the question. Because different words differ in importance, idf weighting is used to reflect this;
Text Alignment Scorer: uses the Smith-Waterman distance to calculate an alignment score; compared with the edit distance or the Needleman-Wunsch distance, this distance favors local alignment, that is, alignment of the optimal subsequence;
Embedding Scorer: obtains a question vector by averaging word vectors and calculates the similarity of the two question vectors, including both character-based and word-based similarity;
Entity Scorer: an entity overlap score;
after the basic features are obtained, the final L2R model is obtained by GBDT training.
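Two of the basic features listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the scorers of this embodiment: the Smith-Waterman scoring parameters (match 2, mismatch -1, gap -1) are arbitrary, and the Term Match feature is assumed to be the ratio of the matched idf sum to the total idf sum, which the source does not state explicitly.

```python
from typing import Dict, Sequence


def smith_waterman(a: Sequence[str], b: Sequence[str],
                   match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between a and b (Smith-Waterman)."""
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at 0, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def term_match_score(query_terms: Sequence[str],
                     candidate_terms: Sequence[str],
                     idf: Dict[str, float],
                     default_idf: float = 1.0) -> float:
    """idf mass of query terms found in the candidate / idf mass of all query terms."""
    candidate = set(candidate_terms)
    total = sum(idf.get(t, default_idf) for t in query_terms)
    if total == 0:
        return 0.0
    matched = sum(idf.get(t, default_idf) for t in query_terms if t in candidate)
    return matched / total


alignment = smith_waterman("abc", "xabcx")
toy_idf = {"fever": 3.0, "and": 1.0, "cough": 4.0}
tm = term_match_score(["fever", "and", "cough"], ["cough", "medicine"], toy_idf)
```

Features of this kind would be computed for each question pair and passed, together with the other scorers' outputs, to the GBDT training step.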
Fig. 5 is a schematic structural diagram of an intelligent question-answering device based on a medical knowledge graph according to an embodiment of the present application. As shown in fig. 5, the intelligent question-answering device based on medical knowledge graph of the present embodiment includes: a consultation text acquisition module 41, a retrieval module 42, a scoring module 43, a target recall result determination module 44 and a data transmission module 45.
Wherein, the consultation text acquiring module 41 is used for acquiring patient consultation texts; the retrieval module 42 is used for simultaneously performing graph search, text retrieval and semantic vector retrieval in a preset medical knowledge map according to the patient consultation text and a preset consultation question-answer model to correspondingly obtain three recall results; the scoring module 43 is configured to input all the recall results into a preset ranking scoring model to obtain scoring data of all the recall results; a target recall result determining module 44, configured to determine a highest-scoring target recall result from all the recall results; and the data sending module 45 is configured to send the target recall result to a preset terminal.
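The cooperation of modules 42, 43, and 44 can be sketched as a small pipeline. The retrievers and the scorer here are placeholder callables supplied by the caller; running the three searches in a thread pool is one possible reading of "simultaneously" and is an assumption of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence

Retriever = Callable[[str], List[str]]
Scorer = Callable[[str, str], float]


def recall_all(text: str, retrievers: Sequence[Retriever]) -> List[str]:
    """Run every retriever on the consultation text concurrently and
    return all recall results, in retriever order."""
    if not retrievers:
        return []
    with ThreadPoolExecutor(max_workers=len(retrievers)) as pool:
        per_retriever = list(pool.map(lambda r: r(text), retrievers))
    return [candidate for results in per_retriever for candidate in results]


def best_answer(text: str, retrievers: Sequence[Retriever],
                scorer: Scorer) -> Optional[str]:
    """Score all recall results and return the highest-scoring one."""
    candidates = recall_all(text, retrievers)
    if not candidates:
        return None
    return max(candidates, key=lambda c: scorer(text, c))


# Placeholder retrievers standing in for graph search, text retrieval,
# and semantic vector retrieval; the scorer is a stand-in ranking model.
graph_search = lambda q: ["g"]
text_search = lambda q: ["tt"]
vector_search = lambda q: ["vvv"]
toy_scorer = lambda q, c: float(len(c))

target = best_answer("question", [graph_search, text_search, vector_search], toy_scorer)
```

The returned target recall result is what the data sending module would forward to the preset terminal.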
Preferably, the retrieval module 42 is further configured to construct a preset medical knowledge graph, and the preset medical knowledge graph is constructed by the following method:
acquiring medical knowledge data; the medical knowledge data comprises structured data, semi-structured data, and unstructured data;
converting the structured data and the semi-structured data into first extraction result data based on a preset rule, and extracting second extraction result data from the unstructured data based on a preset medical knowledge automatic extraction model; the first extraction result data and the second extraction result data form a medical knowledge extraction result set;
and fusing the medical knowledge extraction result set with a preset open source knowledge base to obtain the preset medical knowledge map.
Preferably, the retrieval module 42 is further configured to construct an automatic extraction model of preset medical knowledge, which is constructed by the following method:
acquiring an initial training corpus data set;
expanding the initial training corpus data set according to a preset data enhancement rule to obtain a final training corpus data set;
training to obtain the preset medical knowledge automatic extraction model based on the final training corpus data set; the preset medical knowledge automatic extraction model is a BERT + BiLSTM + CRF model.
Preferably, the retrieval module 42 is also configured to replace the base model of the BERT with a TinyALBERT Chinese model.
The retrieval module 42 is specifically configured to implement the following method:
constructing a domain synonymous entity library based on the medical knowledge extraction result set and the preset open source knowledge library; the domain synonym entity library comprises synonym pairs of medical entities;
establishing a medical entity mapping relation between the medical knowledge extraction result set and the preset open source knowledge base according to the domain synonymous entity base;
and fusing the medical knowledge extraction result set with a preset open source knowledge base according to the medical entity mapping relation to obtain the preset medical knowledge map.
The retrieval module 42 is further configured to store the entity-relationship data in the preset medical knowledge graph into a preset Neo4j graph database, and store the entity-attribute class data in the preset medical knowledge graph into a preset ElasticSearch.
Preferably, the retrieval module 42 is further configured to construct a preset consulting question-answer model, where the preset consulting question-answer model is constructed by the following method:
acquiring a training data set;
constructing a Sentence Transformers twin BERT model;
calculating sentence semantic similarity in the training data set based on the Sentence Transformers twin BERT model;
subjecting the Sentence Transformers twin BERT model to a distillation compression process;
loading the distillation-compressed Sentence Transformers twin BERT model, and selecting TinyALBERT;
fine-tuning the Sentence Transformers twin BERT model according to the sentence semantic similarity;
sending the fine-tuned Sentence Transformers twin BERT model to a preset intelligent question-answering system;
converting the question samples in the preset intelligent question-answering system into sentence vectors of domain knowledge question-answer sentences through a prediction interface of the fine-tuned Sentence Transformers twin BERT model;
and storing the sentence vectors of the domain knowledge question-answer sentences into a preset vector storage engine, and creating semantic indexes to obtain the preset consulting question-answer model.
Preferably, the scoring module 43 is specifically configured to input all the recall results into a preset L2R model, so as to obtain scoring data of all the recall results.
The present embodiment and the above embodiments belong to a general inventive concept, and have the same or corresponding execution processes and beneficial effects, which are not described herein again.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.
It is to be noted that, in the description of the present invention, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow diagrams or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present invention includes additional implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present invention.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.
Claims (10)
1. An intelligent question-answering method based on a medical knowledge graph is characterized by comprising the following steps:
acquiring a patient consultation text;
simultaneously performing graph search, text retrieval, and semantic vector retrieval in a preset medical knowledge graph according to the patient consultation text and a preset consultation question-answer model, to correspondingly obtain three recall results;
inputting all the recall results into a preset sorting scoring model to obtain scoring data of all the recall results;
determining a target recall result with the highest score from all the recall results;
and sending the target recall result to a preset terminal.
2. The intelligent question-answering method based on the medical knowledge graph according to claim 1, wherein the preset medical knowledge graph is constructed by the following method:
acquiring medical knowledge data; the medical knowledge data comprises structured data, semi-structured data, and unstructured data;
converting the structured data and the semi-structured data into first extraction result data based on a preset rule, and extracting second extraction result data from the unstructured data based on a preset medical knowledge automatic extraction model; the first extraction result data and the second extraction result data form a medical knowledge extraction result set;
and fusing the medical knowledge extraction result set with a preset open source knowledge base to obtain the preset medical knowledge map.
3. The medical knowledge graph-based intelligent question answering method according to claim 2, wherein the data format of the first extraction result data is RDF triple or graph data;
the data format of the second extraction result data is RDF triple or graph data.
4. The medical knowledge graph-based intelligent question-answering method according to claim 2, wherein the preset automatic medical knowledge extraction model is obtained by a model training method comprising the following steps:
acquiring an initial training corpus data set;
expanding the initial training corpus data set according to a preset data enhancement rule to obtain a final training corpus data set;
training to obtain the preset medical knowledge automatic extraction model based on the final training corpus data set; the preset medical knowledge automatic extraction model is a BERT + BiLSTM + CRF model.
5. The medical knowledge graph-based intelligent question answering method according to claim 4, further comprising: replacing the base model of BERT in the BERT + BiLSTM + CRF model with a TinyALBERT Chinese model.
6. The intelligent question-answering method based on the medical knowledge graph according to claim 2, wherein the step of fusing the medical knowledge extraction result set with a preset open source knowledge base to obtain the preset medical knowledge graph comprises the following steps:
constructing a domain synonymous entity library based on the medical knowledge extraction result set and the preset open source knowledge library; the domain synonym entity library comprises synonym pairs of medical entities;
establishing a medical entity mapping relation between the medical knowledge extraction result set and the preset open source knowledge base according to the domain synonymous entity base;
and fusing the medical knowledge extraction result set with a preset open source knowledge base according to the medical entity mapping relation to obtain the preset medical knowledge map.
7. The intelligent question-answering method based on the medical knowledge graph according to claim 6, wherein after the medical knowledge extraction result set is fused with a preset open source knowledge base according to the medical entity mapping relationship, the method further comprises:
storing the entity-relationship data in the preset medical knowledge graph into a preset Neo4j graph database, and storing the entity-attribute data in the preset medical knowledge graph into a preset ElasticSearch.
8. The medical knowledge graph-based intelligent question-answering method according to claim 1, wherein the preset consulting question-answering model is constructed by the following method:
acquiring a training data set;
constructing a Sentence Transformers twin BERT model;
calculating sentence semantic similarity in the training data set based on the Sentence Transformers twin BERT model;
subjecting the Sentence Transformers twin BERT model to a distillation compression process;
loading the distillation-compressed Sentence Transformers twin BERT model, and selecting TinyALBERT;
fine-tuning the Sentence Transformers twin BERT model according to the sentence semantic similarity;
sending the fine-tuned Sentence Transformers twin BERT model to a preset intelligent question-answering system;
converting the question samples in the preset intelligent question-answering system into sentence vectors of domain knowledge question-answer sentences through a prediction interface of the fine-tuned Sentence Transformers twin BERT model;
and storing the sentence vectors of the domain knowledge question-answer sentences into a preset vector storage engine, and creating semantic indexes to obtain the preset consulting question-answer model.
9. The medical knowledge graph-based intelligent question-answering method according to claim 1, wherein the preset ranking score model is an L2R model.
10. An intelligent question-answering device based on a medical knowledge graph is characterized by comprising:
the consultation text acquisition module is used for acquiring patient consultation texts;
the retrieval module is used for simultaneously performing graph search, text retrieval, and semantic vector retrieval in a preset medical knowledge graph according to the patient consultation text and a preset consultation question-answer model, to correspondingly obtain three recall results;
the scoring module is used for inputting all the recall results into a preset ranking scoring model to obtain scoring data of all the recall results;
the target recall result determining module is used for determining a target recall result with the highest score in all the recall results;
and the data sending module is used for sending the target recall result to a preset terminal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110863613.8A CN113505243A (en) | 2021-07-29 | 2021-07-29 | Intelligent question-answering method and device based on medical knowledge graph |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110863613.8A CN113505243A (en) | 2021-07-29 | 2021-07-29 | Intelligent question-answering method and device based on medical knowledge graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113505243A true CN113505243A (en) | 2021-10-15 |
Family
ID=78015131
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110863613.8A Pending CN113505243A (en) | 2021-07-29 | 2021-07-29 | Intelligent question-answering method and device based on medical knowledge graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113505243A (en) |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114385830A (en) * | 2022-01-14 | 2022-04-22 | 中国建设银行股份有限公司 | Operation and maintenance knowledge online question and answer method and device, electronic equipment and storage medium |
CN114420232A (en) * | 2022-01-17 | 2022-04-29 | 深圳万海思数字医疗有限公司 | Method and system for generating health education data based on electronic medical record data |
CN114416927A (en) * | 2022-01-24 | 2022-04-29 | 招商银行股份有限公司 | Intelligent question and answer method, device, equipment and storage medium |
CN114757169A (en) * | 2022-03-22 | 2022-07-15 | 中国电子科技集团公司第十研究所 | Self-adaptive small sample learning intelligent error correction method based on ALBERT model |
CN115470338A (en) * | 2022-10-27 | 2022-12-13 | 之江实验室 | Multi-scene intelligent question and answer method and system based on multi-way recall |
CN115714022A (en) * | 2022-11-04 | 2023-02-24 | 杭州市临平区妇幼保健院 | Neonatal jaundice health management system based on artificial intelligence |
CN115858819A (en) * | 2023-01-29 | 2023-03-28 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Sample data augmentation method and device |
CN115878790A (en) * | 2022-04-08 | 2023-03-31 | 北京中关村科金技术有限公司 | Intelligent question and answer method and device, storage medium and electronic equipment |
CN116703337A (en) * | 2023-08-08 | 2023-09-05 | 金现代信息产业股份有限公司 | Project document examination system and method based on artificial intelligence technology |
CN117059229A (en) * | 2023-10-09 | 2023-11-14 | 北京健康有益科技有限公司 | Diabetes catering scheme generation method, device, electronic equipment and storage medium |
CN117454989A (en) * | 2023-11-14 | 2024-01-26 | 生命奇点(北京)科技有限公司 | System for updating electronic medical record question-answer model based on parameter adjustment |
CN117573843A (en) * | 2024-01-15 | 2024-02-20 | 图灵人工智能研究院(南京)有限公司 | Knowledge calibration and retrieval enhancement-based medical auxiliary question-answering method and system |
CN117874178A (en) * | 2023-10-30 | 2024-04-12 | 阿里健康科技(杭州)有限公司 | Method, device, equipment and medium for determining medical response text data |
CN118035394A (en) * | 2023-12-08 | 2024-05-14 | 重庆邮电大学 | Medical question-answering method and system based on multi-source data integration |
CN118070813A (en) * | 2023-12-19 | 2024-05-24 | 中联国际工程管理有限公司 | Investment decision consultation question-answering method and system based on NLP and large language model |
CN118194996A (en) * | 2024-05-14 | 2024-06-14 | 智慧眼科技股份有限公司 | Knowledge graph-based large-model reliable medical knowledge injection method and device |
CN118427307A (en) * | 2024-05-23 | 2024-08-02 | 中日友好医院(中日友好临床医学研究所) | Parathyroid medical knowledge intelligent query method and device based on knowledge graph |
CN118552205A (en) * | 2024-07-16 | 2024-08-27 | 深圳市荣信诚科技有限公司 | Intelligent service method and computer equipment applied to intelligent community |
CN117454989B (en) * | 2023-11-14 | 2024-10-22 | 生命奇点(北京)科技有限公司 | System for updating electronic medical record question-answer model based on parameter adjustment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109543047A (en) * | 2018-11-21 | 2019-03-29 | 焦点科技股份有限公司 | A kind of knowledge mapping construction method based on medical field website |
CN111475623A (en) * | 2020-04-09 | 2020-07-31 | 北京北大软件工程股份有限公司 | Case information semantic retrieval method and device based on knowledge graph |
CN112487154A (en) * | 2020-12-24 | 2021-03-12 | 武汉烽火众智数字技术有限责任公司 | Intelligent search method based on natural language |
CN112632261A (en) * | 2020-12-30 | 2021-04-09 | 中国平安财产保险股份有限公司 | Intelligent question and answer method, device, equipment and storage medium |
CN112905764A (en) * | 2021-02-07 | 2021-06-04 | 深圳万海思数字医疗有限公司 | Epidemic disease consultation prevention and training system construction method and system |
Non-Patent Citations (3)
Title |
---|
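ZILLIZ PLANET: "相似问答检索——汽车之家的 Milvus 实践" [Similar Question-Answer Retrieval: Milvus in Practice at Autohome], pages 1-4, Retrieved from the Internet <URL:https://blog.csdn.net/weixin_44839084/article/details/108373546> * |
唐子惠: 《医学人工智能导论》 [Introduction to Medical Artificial Intelligence], 30 April 2020, 上海科学技术出版社 (Shanghai Scientific & Technical Publishers), pages 392-396 * |
陈程; 翟洁; 秦锦玉; 江嘉; 武海霞; 蔡婷婷: "基于中医药知识图谱的智能问答技术研究" [Research on Intelligent Question-Answering Technology Based on a Traditional Chinese Medicine Knowledge Graph], 中国新通信 (China New Telecommunications), no. 02 * |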
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114385830A (en) * | 2022-01-14 | 2022-04-22 | 中国建设银行股份有限公司 | Operation and maintenance knowledge online question and answer method and device, electronic equipment and storage medium |
CN114420232A (en) * | 2022-01-17 | 2022-04-29 | 深圳万海思数字医疗有限公司 | Method and system for generating health education data based on electronic medical record data |
CN114416927A (en) * | 2022-01-24 | 2022-04-29 | 招商银行股份有限公司 | Intelligent question and answer method, device, equipment and storage medium |
CN114416927B (en) * | 2022-01-24 | 2024-04-02 | 招商银行股份有限公司 | Intelligent question-answering method, device, equipment and storage medium |
CN114757169A (en) * | 2022-03-22 | 2022-07-15 | 中国电子科技集团公司第十研究所 | Self-adaptive small sample learning intelligent error correction method based on ALBERT model |
CN115878790A (en) * | 2022-04-08 | 2023-03-31 | 北京中关村科金技术有限公司 | Intelligent question and answer method and device, storage medium and electronic equipment |
CN115878790B (en) * | 2022-04-08 | 2023-08-25 | 北京中关村科金技术有限公司 | Intelligent question-answering method and device, storage medium and electronic equipment |
CN115470338A (en) * | 2022-10-27 | 2022-12-13 | 之江实验室 | Multi-scene intelligent question and answer method and system based on multi-way recall |
CN115714022B (en) * | 2022-11-04 | 2024-02-23 | 杭州市临平区妇幼保健院 | Neonatal jaundice health management system based on artificial intelligence |
CN115714022A (en) * | 2022-11-04 | 2023-02-24 | 杭州市临平区妇幼保健院 | Neonatal jaundice health management system based on artificial intelligence |
CN115858819A (en) * | 2023-01-29 | 2023-03-28 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Sample data augmentation method and device |
CN116703337A (en) * | 2023-08-08 | 2023-09-05 | 金现代信息产业股份有限公司 | Project document examination system and method based on artificial intelligence technology |
CN117059229A (en) * | 2023-10-09 | 2023-11-14 | 北京健康有益科技有限公司 | Diabetes catering scheme generation method, device, electronic equipment and storage medium |
CN117874178A (en) * | 2023-10-30 | 2024-04-12 | 阿里健康科技(杭州)有限公司 | Method, device, equipment and medium for determining medical response text data |
CN117454989A (en) * | 2023-11-14 | 2024-01-26 | 生命奇点(北京)科技有限公司 | System for updating electronic medical record question-answer model based on parameter adjustment |
CN117454989B (en) * | 2023-11-14 | 2024-10-22 | 生命奇点(北京)科技有限公司 | System for updating electronic medical record question-answer model based on parameter adjustment |
CN118035394A (en) * | 2023-12-08 | 2024-05-14 | 重庆邮电大学 | Medical question-answering method and system based on multi-source data integration |
CN118070813A (en) * | 2023-12-19 | 2024-05-24 | 中联国际工程管理有限公司 | Investment decision consultation question-answering method and system based on NLP and large language model |
CN118070813B (en) * | 2023-12-19 | 2024-07-30 | 中联国际工程管理有限公司 | Investment decision consultation question-answering method and system based on NLP and large language model |
CN117573843B (en) * | 2024-01-15 | 2024-04-02 | 图灵人工智能研究院(南京)有限公司 | Knowledge calibration and retrieval enhancement-based medical auxiliary question-answering method and system |
CN117573843A (en) * | 2024-01-15 | 2024-02-20 | 图灵人工智能研究院(南京)有限公司 | Knowledge calibration and retrieval enhancement-based medical auxiliary question-answering method and system |
CN118194996A (en) * | 2024-05-14 | 2024-06-14 | 智慧眼科技股份有限公司 | Knowledge graph-based large-model reliable medical knowledge injection method and device |
CN118427307A (en) * | 2024-05-23 | 2024-08-02 | 中日友好医院(中日友好临床医学研究所) | Parathyroid medical knowledge intelligent query method and device based on knowledge graph |
CN118552205A (en) * | 2024-07-16 | 2024-08-27 | 深圳市荣信诚科技有限公司 | Intelligent service method and computer equipment applied to intelligent community |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113505243A (en) | | Intelligent question-answering method and device based on medical knowledge graph |
CN109684448B (en) | | Intelligent question and answer method |
CN111475623B (en) | | Case Information Semantic Retrieval Method and Device Based on Knowledge Graph |
CN112131393B (en) | | Medical knowledge graph question-answering system construction method based on BERT and similarity algorithm |
CN112214593B (en) | | Question-answering processing method and device, electronic equipment and storage medium |
CN106776711B (en) | | Chinese medical knowledge map construction method based on deep learning |
WO2023029506A1 (en) | | Illness state analysis method and apparatus, electronic device, and storage medium |
CN104216913B (en) | | Question answering method, system and computer-readable medium |
Zubrinic et al. | | The automatic creation of concept maps from documents written using morphologically rich languages |
CN112487202B (en) | | Chinese medical named entity recognition method and device fusing knowledge map and BERT |
US20220405484A1 (en) | | Methods for Reinforcement Document Transformer for Multimodal Conversations and Devices Thereof |
CN102955848B (en) | | A kind of three-dimensional model searching system based on semanteme and method |
US20140280314A1 (en) | | Dimensional Articulation and Cognium Organization for Information Retrieval Systems |
CN102663129A (en) | | Medical field deep question and answer method and medical retrieval system |
CN103229223A (en) | | Providing answers to questions using multiple models to score candidate answers |
CN103250129A (en) | | Providing question and answers with deferred type evaluation using text with limited structure |
CN103229162A (en) | | Providing answers to questions using logical synthesis of candidate answers |
CN113632092A (en) | | Entity recognition method and device, dictionary establishing method, equipment and medium |
CN113868387A (en) | | Word2vec medical similar problem retrieval method based on improved tf-idf weighting |
CN116992002A (en) | | Intelligent care scheme response method and system |
CN116975212A (en) | | Answer searching method and device for question text, computer equipment and storage medium |
Houssein et al. | | Semantic protocol and resource description framework query language: a comprehensive review |
Saint-Dizier et al. | | Knowledge and reasoning for question answering: Research perspectives |
CN113314236A (en) | | Intelligent question-answering system for hypertension |
Zhang et al. | | Construction of MeSH-like obstetric knowledge graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||