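WO2022041730A1 - Method, apparatus, device and storage medium for intent recognition in the medical field - Google Patents

Method, apparatus, device and storage medium for intent recognition in the medical field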

Info

Publication number
WO2022041730A1
Authority
WO
WIPO (PCT)
Prior art keywords
entity
medical
grained
preset
coarse
Prior art date
Application number
PCT/CN2021/084659
Other languages
English (en)
French (fr)
Inventor
原丽娜
Original Assignee
康键信息技术(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 康键信息技术(深圳)有限公司
Publication of WO2022041730A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • the present application relates to the field of medical data, and in particular, to a method, apparatus, device and storage medium for identifying intent in the medical field.
  • online consultation has gradually broken the limitations of traditional medical treatment, bringing users a convenient and efficient medical experience: users can meet their medical needs without leaving home, which also saves medical resources and improves the efficiency of consultation.
  • the online consultation system is gradually developing in the direction of intelligence. For example, introducing an intelligent question answering engine into the consultation system can replace the doctor in answering user questions during the consultation and provide the doctor with auxiliary decision support, making the consultation process more efficient.
  • the traditional medical question answering system uses a large number of manually proofread question-answer pairs as its knowledge base and, based on text similarity, matches the question-answer pair most similar to the user's question and feeds it back to the user.
  • the inventor realized that, owing to the diversity and particularity of patient groups, symptom descriptions, and the corresponding treatment methods, a fixed question-and-answer knowledge base cannot cover all cases and cannot support a reasoning mechanism, and maintaining the knowledge base incurs high labor costs. Template-based medical question answering systems, which identify intents by rule matching or sentence matching, therefore cannot fully cover the variety of question forms and have low accuracy in identifying intents in the medical field.
  • the present application provides a method, apparatus, device, and storage medium for recognizing intent in the medical field, which solves the problem of low accuracy in recognizing intent in the medical field.
  • a first aspect of the present application provides a method for recognizing intent in the medical field, including: acquiring an initial question statement from a terminal, where the initial question statement is a question statement input by a target user in a medical intelligent question answering system; calling a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, which includes multiple coarse-grained entity labels and multiple entity relationships; performing entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; performing intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; generating a knowledge graph query statement according to the candidate medical intent; and performing a knowledge graph query on a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generating a corresponding target phrase according to the knowledge graph query result, and sending the target phrase to the terminal.
  • a second aspect of the present application provides a medical field intent recognition device, comprising a memory, a processor, and computer-readable instructions stored on the memory and executable on the processor. When the processor executes the computer-readable instructions, the following steps are implemented: obtaining an initial question statement from the terminal, where the initial question statement is the question statement input by the target user in the medical intelligent question answering system; calling a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, which includes multiple coarse-grained entity labels and multiple entity relationships; performing entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; performing intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; generating a knowledge graph query statement based on the candidate medical intent; and performing a knowledge graph query on a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generating a corresponding target phrase according to the knowledge graph query result, and sending the target phrase to the terminal.
  • a third aspect of the present application provides a computer-readable storage medium storing computer instructions which, when executed on a computer, cause the computer to perform the following steps: obtaining an initial question statement from a terminal, where the initial question statement is the question statement input by the target user in the medical intelligent question answering system; calling a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, which includes multiple coarse-grained entity labels and multiple entity relationships; performing entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; performing intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; generating a knowledge graph query statement according to the candidate medical intent; and performing a knowledge graph query on a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generating a corresponding target phrase according to the knowledge graph query result, and sending the target phrase to the terminal.
  • a fourth aspect of the present application provides an apparatus for recognizing intent in the medical field, comprising: a statement acquisition module, configured to acquire an initial question statement from a terminal, where the initial question statement is a question statement input by a target user in a medical intelligent question answering system; an entity recognition module, configured to call a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, which includes multiple coarse-grained entity labels and multiple entity relationships; an entity linking module, configured to perform entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; an intent recognition module, configured to perform intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; a statement generation module, configured to generate a knowledge graph query statement according to the candidate medical intent; and a graph query module, configured to perform a knowledge graph query on a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generate a corresponding target phrase according to the knowledge graph query result, and send the target phrase to the terminal.
  • the initial question statement is obtained from the terminal, where the initial question statement is the question statement input by the target user in the medical intelligent question answering system; the preset recognition model is called to perform entity recognition on the initial question statement to obtain the entity recognition result, which includes multiple coarse-grained entity labels and multiple entity relationships; entity linking is performed on the multiple coarse-grained entity labels according to the preset medical entity synonym table to obtain the linked entity labels; intent recognition is performed on the initial question statement according to the preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; a knowledge graph query statement is generated according to the candidate medical intent; a knowledge graph query is performed on the preset medical knowledge graph based on the knowledge graph query statement to obtain the knowledge graph query result, and the corresponding target phrase is generated according to the knowledge graph query result and sent to the terminal.
  • a deep learning model integrating multi-dimensional features is used to separate entity recognition and relationship extraction, and at the same time, coarse-grained entity recognition is used to optimize the fine-grained entity recognition result, which reduces error transmission and redundant information in the process of entity extraction. This improves the accuracy of entity recognition results, which in turn improves the accuracy of intent recognition results in the medical field.
  • FIG. 1 is a schematic diagram of an embodiment of a method for identifying intentions in the medical field in an embodiment of the present application
  • FIG. 2 is a schematic diagram of another embodiment of the method for recognizing intent in the medical field according to the embodiment of the present application;
  • FIG. 3 is a schematic diagram of an embodiment of an intention identification device in the medical field according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of another embodiment of the device for recognizing intention in the medical field according to the embodiment of the present application.
  • FIG. 5 is a schematic diagram of an embodiment of an intention identification device in the medical field according to an embodiment of the present application.
  • the present application provides a method, apparatus, device and storage medium for intent recognition in the medical field, which are used to reduce error propagation and the interference of redundant information during entity extraction, improve the accuracy of entity recognition results, and thereby improve the accuracy of intent recognition results in the medical field.
  • referring to FIG. 1, a flowchart of a method for identifying intent in the medical field provided by an embodiment of the present application, the method specifically includes:
  • the server obtains the initial question statement from the terminal, and the initial question statement is the question statement input by the target user in the medical intelligent question answering system.
  • the initial question statement is the medical knowledge question that the user wants answered, for example, "Can I drink alcohol after taking cephalosporin?" or "Which department should I go to for a consultation for muscle soreness?" This embodiment does not limit the consultation field of the initial question statement, as long as it is medically relevant.
  • the execution subject of the present application may be an intention identification device in the medical field, or may be a server, which is not specifically limited here.
  • the embodiments of the present application take the server as an execution subject as an example for description.
  • the server invokes the first preset recognition model to perform entity recognition on the initial question statement to obtain multiple coarse-grained entity labels; invokes the second preset recognition model to perform relationship extraction on the initial question statement to obtain multiple entity relationships; and generates the entity recognition result from the multiple coarse-grained entity labels and the multiple entity relationships.
  • the server can call the BiLSTM layer of the second preset recognition model to extract the context relationship of the initial question statement and obtain multiple time-sequence vectors, which indicate the context relationship; the server then inputs the time-sequence vectors into the Attention layer of the second preset recognition model to generate multiple sentence feature vectors, which indicate entity relationships. The Attention layer first calculates the weight of each time-sequence vector, then uses the weighted sum of all time-sequence vectors as the feature vector, and finally performs softmax classification.
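The attention step described above (weight each time-sequence vector, take the weighted sum, then classify with softmax) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the dot-product scoring against a `query` vector, the function names, and the toy numbers standing in for BiLSTM outputs are all assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(time_vectors, query):
    """Score each time-step vector by its dot product with `query` (an assumed
    scoring function), normalise the scores, and return the weights together
    with the weighted sum of the vectors."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in time_vectors]
    weights = softmax(scores)
    dim = len(time_vectors[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, time_vectors)) for d in range(dim)]
    return weights, pooled

def classify(feature, class_weights):
    """Linear scoring per class followed by softmax."""
    logits = [sum(w_d * f_d for w_d, f_d in zip(w, feature)) for w in class_weights]
    return softmax(logits)

# Toy values standing in for BiLSTM outputs: 3 time steps, 2 dimensions.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, sentence_vec = attention_pool(H, query=[0.5, 0.5])
probs = classify(sentence_vec, class_weights=[[1.0, 0.0], [0.0, 1.0]])
```

In a trained model the query vector and class weights would be learned parameters; here they are fixed only so the arithmetic can be followed by hand.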
  • the server invokes the first preset recognition model to perform entity recognition on the initial question statement and obtains multiple coarse-grained entity labels, which specifically includes:
  • the server invokes the first preset recognition model to perform fine-grained entity recognition on the initial question statement and obtains multiple fine-grained entity labels; the server then invokes the first preset recognition model to perform coarse-grained entity recognition on the multiple fine-grained entity labels and obtains multiple coarse-grained entity labels.
  • a deep learning model integrating multi-dimensional features is used to perform entity recognition and relationship extraction separately, which reduces error propagation and the interference of redundant information and improves the accuracy of the entity recognition result.
  • the server performs entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table and obtains the linked entity labels. Specifically, the server searches the preset medical entity synonym table for the standard medical terms corresponding to the coarse-grained entity labels; each coarse-grained entity label corresponds to one standard medical term, and the two are synonyms. The server fuses the coarse-grained entity labels to obtain multiple fused coarse-grained entity labels, then performs entity linking between the fused coarse-grained entity labels and the standard medical terms to generate the linked entity labels.
  • some users describe medical matters colloquially, so entity linking is performed to map such expressions to standard medical terms. For example, if the user writes "after an abortion" and the corresponding medical term is "after a miscarriage", then "after an abortion" is linked to "after a miscarriage"; similarly, a colloquial description of lower abdominal pain is linked to the standard term "lower abdominal pain". Likewise, "pregnancy 34+" and "pregnancy 40+" both correspond to the standard medical term "third trimester", so both are linked to "third trimester".
  • coarse-grained entity labels can also be fused to obtain fused coarse-grained entity labels. For example, "pregnancy 34+" and "pregnancy 40+" can be fused into "34 to 40 weeks of pregnancy"; "34 to 40 weeks of pregnancy" is the fused coarse-grained entity label, which is then linked to the standard medical term.
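A synonym-table lookup of the kind described above might look like the sketch below. The table contents are taken from the examples in the text; the dictionary representation, the function name `link_entities`, and the choice to merge duplicates after linking (a simplification of the fusion step) are assumptions for illustration.

```python
# Hypothetical synonym table: colloquial expression -> standard medical term.
SYNONYM_TABLE = {
    "after an abortion": "after a miscarriage",
    "pregnancy 34+": "third trimester",
    "pregnancy 40+": "third trimester",
}

def link_entities(coarse_labels, synonym_table):
    """Map each coarse-grained entity label to its standard term.

    Labels without an entry are kept unchanged; labels that link to the
    same standard term are merged, preserving first-seen order.
    """
    linked = []
    for label in coarse_labels:
        standard = synonym_table.get(label, label)
        if standard not in linked:
            linked.append(standard)
    return linked

linked = link_entities(
    ["pregnancy 34+", "pregnancy 40+", "muscle soreness"], SYNONYM_TABLE
)
```

Exact-match lookup is the simplest possible linker; a production system would likely need normalisation or fuzzy matching, which the text does not specify.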
  • the server performs intent identification on the initial question sentence according to a preset intent identification model, entity identification results and linked entity labels, and obtains candidate medical intents.
  • the intent recognition model is a deep learning model consisting of an input layer, a BERT word vector layer, a BiLSTM layer, an Attention layer, and a Softmax classification layer. Since the intent of a question is closely related to its entities and entity labels, in this embodiment the entity recognition result and the linked entity labels are also used as input to the intent recognition model.
  • the initial question statement, the entity recognition result and the linked entity labels are combined as the sentence input of the input layer.
  • the BERT word vector layer generates word vectors from the input sentence, and the output of the BERT word vector layer is used as the input of the BiLSTM layer; the fully connected output of the BiLSTM layer is used as the input of the Attention layer; the output of the Attention layer is passed to the Softmax classifier for the final intent label classification, which yields the candidate medical intents. The intent types include: cause, explanation, complications, mode of transmission, treatment methods, related examinations, disease diagnosis, precautions, efficacy, side effects/harm, operation method, use/take method, usage and dosage, dietary advice, and whether or not.
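The two ends of this pipeline, assembling the combined input and reading off the intent from the classifier output, can be sketched without the neural layers in between. The `[SEP]` separator, the string layout of the combined input, and both function names are assumptions; the text only says the three pieces are combined and that Softmax selects among the listed intent types.

```python
INTENT_TYPES = [
    "cause", "explanation", "complications", "mode of transmission",
    "treatment methods", "related examinations", "disease diagnosis",
    "precautions", "efficacy", "side effects/harm", "operation method",
    "use/take method", "usage and dosage", "dietary advice", "whether or not",
]

def build_intent_input(question, entity_result, linked_labels, sep=" [SEP] "):
    """Join the question, the recognised entities and the linked labels
    into one input string (the layout is an assumption)."""
    entity_part = "; ".join(f"{text}: {label}" for text, label in entity_result)
    linked_part = "; ".join(linked_labels)
    return sep.join([question, entity_part, linked_part])

def pick_intent(probabilities):
    """Return the intent type with the highest softmax probability."""
    if len(probabilities) != len(INTENT_TYPES):
        raise ValueError("expected one probability per intent type")
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return INTENT_TYPES[best]

model_input = build_intent_input(
    "what should I pay attention to during pregnancy butt pain?",
    [("pregnancy", "special period"), ("butt pain", "symptoms")],
    ["pregnancy", "butt pain"],
)
```

Taking the single highest-probability label is one possible reading of "candidate medical intents"; returning the top few labels would be an equally plausible design.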
  • the deep learning model is used to identify the user's intention, which reduces the number of templates, improves the coverage and accuracy of the question-and-answer situation in the real dialogue, and reduces the maintenance cost.
  • the server generates a knowledge graph query sentence according to the candidate medical intent.
  • the query mapping of the knowledge graph is performed in combination with the entity recognition result and the intent recognition result of the initial question sentence to generate a knowledge graph query sentence, wherein the query object may be a relationship between entities or an attribute of an entity.
  • the server performs a query on a preset medical knowledge graph based on a knowledge graph query statement, and obtains a knowledge graph query result.
  • the knowledge graph query result includes the relationship of the target entity, the attributes of the target entity, and multiple entities; the server generates the corresponding target phrase according to the relationship of the target entity, the attributes of the target entity, or the multiple entities, and sends the target phrase to the terminal.
  • different knowledge graph query results correspond to querying different entity types; the entity types include entity relationships, entity attributes, and entities.
  • when the knowledge graph query result is a relationship of the target entity, the knowledge graph query statement queries the relationship of the target entity. For example, the entity recognition result is "liver cirrhosis: disease" and the intent recognition result is "complications"; the graph query statement matches the node whose n.name is "liver cirrhosis" and returns m.name, the name attribute of the nodes labelled symptom that are connected to liver cirrhosis by the complication relationship. These names are combined to generate the target phrase "complications of liver cirrhosis include liver function impairment, portal hypertension, gastrointestinal bleeding, hepatic encephalopathy, peritonitis, etc.", and the target phrase is sent to the terminal.
  • when the knowledge graph query result is an attribute of the target entity, the knowledge graph query statement queries the attribute of the target entity. For example, the entity extraction result is "fibrate lipid-lowering drugs: drugs" and the intent recognition result is "side effects/harm"; the server generates a target phrase stating that "the adverse drug reactions are gastrointestinal discomfort, rash, hair loss, headache, loss of libido, etc.", and sends the target phrase to the terminal;
  • when the knowledge graph query result concerns multiple entities, for example, the user's initial question statement is "what should I pay attention to during pregnancy butt pain?", the entity extraction result is "pregnancy: special period, butt pain: symptoms", and the intent recognition result is "precautions". The corresponding graph query statement is: match(n:SpecialPeriod{name:"pregnancy"})-[:MultiConditionRestriction]->(p:SpanNode),(m:Symptom{name:"butt pain"})-[:MultiConditionRestriction]->(p:SpanNode) return p.attention. The server determines the attention attribute value of the blank node related to both pregnancy and butt pain, and generates the target phrase according to that value: "If a pregnant woman has buttock pain, a hot towel or hot water bottle can be applied to the painful area for about half an hour, which can relieve the pain considerably."
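Generating graph query statements from the recognised entities and intent could be sketched as simple string templating. The multi-condition pattern follows the query quoted in the text (`SpecialPeriod`, `Symptom`, `MultiConditionRestriction`, `SpanNode`, `attention`); the node labels `Disease` and `Complication` in the relationship query, the variable names `c0`, `c1`, and both function names are hypothetical.

```python
import json

def cypher_string(value):
    """Quote a value as a double-quoted string literal with escaping."""
    return json.dumps(value, ensure_ascii=False)

def relation_query(entity_label, name, relation, target_label):
    """Query the names of nodes reached from one entity via one relationship."""
    return (
        f"match (n:{entity_label} {{name:{cypher_string(name)}}})"
        f"-[:{relation}]->(m:{target_label}) return m.name"
    )

def multi_condition_query(conditions, attribute):
    """Query an attribute of the blank node shared by several entities.

    conditions: list of (node_label, entity_name) pairs.
    """
    patterns = []
    for index, (label, name) in enumerate(conditions):
        patterns.append(
            f"(c{index}:{label} {{name:{cypher_string(name)}}})"
            "-[:MultiConditionRestriction]->(p:SpanNode)"
        )
    return "match " + ", ".join(patterns) + f" return p.{attribute}"

q1 = relation_query("Disease", "liver cirrhosis", "Complication", "Symptom")
q2 = multi_condition_query(
    [("SpecialPeriod", "pregnancy"), ("Symptom", "butt pain")], "attention"
)
```

Labels and relationship types are interpolated directly, so they should come from a fixed intent-to-schema mapping rather than from user text; for entity names, passing query parameters through the graph database driver is generally safer than building literals by hand.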
  • the server converts the knowledge graph query result according to the entity type, formulates personalized phrasing, and feeds the result back to the terminal used by the user, which can provide auxiliary decision support for doctors in online consultation applications and make the consultation process more convenient and efficient.
  • a deep learning model integrating multi-dimensional features is used to separate entity recognition and relationship extraction, and at the same time, coarse-grained entity recognition is used to optimize the fine-grained entity recognition result, which reduces error transmission and redundant information in the process of entity extraction.
  • This improves the accuracy of entity recognition results, which in turn improves the accuracy of intent recognition results in the medical field.
  • this solution can be applied in the field of smart medical care, thereby promoting the construction of smart cities.
  • referring to FIG. 2, another flowchart of the method for identifying intent in the medical field provided by the embodiment of the present application, the method specifically includes:
  • the server builds a preset medical knowledge graph, which specifically includes:
  • the server obtains multiple data sources, and the multiple data sources include structured medical data, semi-structured medical data, and online medical consultation dialogue data.
  • structured medical data mainly comes from disease, medicine and examination data already stored in the relational databases of the business; semi-structured medical data mainly comes from Wikipedia medical data and Baidu Baike medical data, which are cleaned and saved as semi-structured data.
  • the above structured and semi-structured data consists of long, highly specialized text that is not easy for users to understand. Therefore, in this embodiment, when constructing the medical knowledge graph, the question-and-answer knowledge generated in online consultation dialogues and proofread by doctors (that is, the online medical consultation dialogue data) is also used as one of the data sources. The solution of the present application is thus closer to the dialogue of real consultation scenarios, which improves the user's consultation experience.
  • the server performs entity extraction on multiple data sources, obtains multiple entities and multiple entity relationships, and sets entity attributes corresponding to multiple entities and relationship attributes corresponding to multiple entity relationships.
  • the graph is constructed in a top-down manner, that is, an entity recognition and relation extraction method based on a deep learning model is used to perform entity recognition and relation extraction on structured medical data and semi-structured medical data, and add them to the knowledge graph.
  • step (2) specifically includes:
  • the server uses a deep learning model to perform entity recognition and relationship extraction on structured medical data; the server uses a deep learning model to perform entity recognition and relationship extraction on semi-structured medical data; the server generates multiple entities and multiple entity relationships; the server sets corresponding attributes for each entity to obtain multiple entity attributes, and sets corresponding attributes for each entity relationship to obtain multiple entity relationship attributes.
  • the multiple entities include departments, diseases, symptoms, medicines, treatment methods, food and health care products, and the entity relationships include visiting departments, related symptoms, suitable medicines and complications.
  • Different types of entities or relationships can have different attributes.
  • for example, the entity "disease" corresponds to attributes such as "explanation", "cause", and "incidence"; the entity "drug" corresponds to attributes such as "specification" and "efficacy"; and the entity relationship "complication" corresponds to "shock", "infection" and so on.
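One way to hold entities, relationships and their attributes before loading them into a graph is a pair of small record types, sketched below. The class and field names are hypothetical; the attribute keys and the example relationship come from the text, and the attribute values are deliberately left empty because the source gives none.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    entity_type: str  # e.g. "disease", "drug", "symptom", "department"
    attributes: dict = field(default_factory=dict)

@dataclass
class Relation:
    head: str           # name of the source entity
    relation_type: str  # e.g. "complication", "visiting department"
    tail: str           # name of the target entity
    attributes: dict = field(default_factory=dict)

cirrhosis = Entity(
    name="liver cirrhosis",
    entity_type="disease",
    # Attribute keys named in the text; values are placeholders.
    attributes={"explanation": "", "cause": "", "incidence": ""},
)
complication = Relation("liver cirrhosis", "complication", "portal hypertension")
```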
  • the server uses a preset deep learning model to construct an initial knowledge graph according to multiple entities, entity attributes corresponding to multiple entities, multiple entity relationships, and relationship attributes corresponding to multiple entity relationships.
  • the server performs entity alignment and relationship fusion on the initial knowledge graph to generate a preset medical knowledge graph.
  • the purpose of entity alignment and relationship fusion is to discover and merge multi-source heterogeneous entities that have different names in different data sources but represent the same concept or thing, and to merge the attributes and relationships of those entities.
  • entity alignment adopts the commonly used entity alignment method based on attribute similarity score, which is the prior art, and details are not described here.
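Since the text leaves the attribute-similarity method to the prior art, the following is only one plausible sketch of it: Jaccard similarity over (attribute, value) pairs with a fixed threshold. The similarity measure, the threshold of 0.5, the function names, and the toy records are all assumptions rather than the method used in the application.

```python
def attribute_similarity(attrs_a, attrs_b):
    """Jaccard similarity over (attribute, value) pairs of two entities."""
    pairs_a, pairs_b = set(attrs_a.items()), set(attrs_b.items())
    if not pairs_a and not pairs_b:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)

def align(entities_a, entities_b, threshold=0.5):
    """Return (name_a, name_b, score) for cross-source pairs whose
    attribute similarity reaches the threshold."""
    matches = []
    for name_a, attrs_a in entities_a.items():
        for name_b, attrs_b in entities_b.items():
            score = attribute_similarity(attrs_a, attrs_b)
            if score >= threshold:
                matches.append((name_a, name_b, score))
    return matches

# Toy records from two hypothetical data sources.
source_a = {"cirrhosis": {"type": "disease", "department": "hepatology"}}
source_b = {
    "liver cirrhosis": {"type": "disease", "department": "hepatology"},
    "headache": {"type": "symptom"},
}
matches = align(source_a, source_b)
```

Matched pairs would then be merged into a single node whose attributes and relationships are the union of both sources.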
  • the server obtains the initial question statement from the terminal, and the initial question statement is the question statement input by the target user in the medical intelligent question answering system.
  • the initial question statement is the medical knowledge question that the user wants answered, for example, "Can I drink alcohol after taking cephalosporin?" or "Which department should I go to for a consultation for muscle soreness?" This embodiment does not limit the consultation field of the initial question statement, as long as it is medically relevant.
  • the execution subject of the present application may be an intention identification device in the medical field, or may be a server, which is not specifically limited here.
  • the embodiments of the present application take the server as an execution subject as an example for description.
  • the server invokes the first preset recognition model to perform fine-grained entity recognition on the initial question statement and obtains multiple fine-grained entity labels; the server then invokes the first preset recognition model to perform coarse-grained entity recognition on the multiple fine-grained entity labels and obtains multiple coarse-grained entity labels.
  • the process in which the server invokes the first preset recognition model to perform fine-grained entity recognition on the initial question statement and obtains multiple fine-grained entity labels specifically includes:
  • the server extracts multiple feature dimension vectors from the initial question statement at fine granularity; the feature dimension vectors include word vectors, word label vectors, word position vectors and part-of-speech feature vectors. The server inputs the feature dimension vectors into the BiLSTM layer of the first preset recognition model and obtains the multiple intermediate vectors output by the BiLSTM layer; the server then inputs the intermediate vectors into the CRF layer of the first preset recognition model to generate multiple fine-grained entity labels.
  • the word label vector is the word label encoded by BIOES
  • the word position feature vector is the position vector of the word segmented by the jieba word segmentation tool
  • the part-of-speech feature vector is the part-of-speech vector of the word after the part-of-speech tagging by the jieba word segmentation tool.
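The BIOES scheme mentioned above marks each token as Beginning, Inside, End, Single, or Outside of an entity. A minimal encoder is sketched below; the function name, the span format, and the English tokens (the application works on Chinese text segmented with jieba) are assumptions for illustration.

```python
def bioes_tags(tokens, spans):
    """Encode entity spans as BIOES tags.

    spans: list of (start, end_exclusive, label) over token indices.
    Single-token entities get S-, multi-token entities get B-/I-/E-,
    and all other tokens get O.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"S-{label}"
        else:
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"E-{label}"
    return tags

tokens = ["fibrate", "lipid-lowering", "drugs", "cause", "rash"]
tags = bioes_tags(tokens, [(0, 3, "drug"), (4, 5, "symptom")])
```

Each tag would then be mapped to an integer or one-hot vector to form the word label vector fed to the BiLSTM layer.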
  • the process in which the server invokes the first preset recognition model to perform coarse-grained entity recognition on the multiple fine-grained entity labels and obtains the multiple coarse-grained entity labels specifically includes:
  • the server invokes the first preset recognition model to identify the multiple fine-grained entity labels at coarse granularity, obtaining multiple narrow-sense entity features and multiple limited entity features; the narrow-sense entity features include symptoms, diseases, parts, medicine, examination and treatment, and the limited entity features include time, frequency, degree, negative word, description and value. The server combines the narrow-sense entity features and the limited entity features according to preset rules to generate multiple generalized entity features, which include generalized symptoms, generalized examinations, generalized treatments, and generalized drugs; the server determines the generalized entity features as the multiple coarse-grained entity labels.
  • for example, the user's question statement is "Hello doctor, my head has been hurting from morning to night lately; what is the cause?" Fine-grained entity recognition yields the following:
  • "head" is a body part;
  • "hurting" is a descriptive term;
  • "morning" is a time;
  • "night" is a time;
  • according to the coarse-grained entity recognition rules, "head hurting from morning to night" is then recognized as a generalized symptom.
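The combination step in this example can be sketched as a small rule. The source states only that narrow-sense and qualifier entities are combined "according to preset rules", so the specific rule below (a body part followed by nearby qualifiers, at least one of which is a description, becomes a generalized symptom), the `max_gap` parameter, and all label names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    text: str
    label: str
    start: int
    end: int  # exclusive character offset


# Qualifier labels named in the text (English label names are assumed).
QUALIFIERS = {"TIME", "FREQUENCY", "DEGREE", "NEGATION", "DESCRIPTION", "VALUE"}


def merge_generalized_symptom(sentence, entities, max_gap=1):
    """Hypothetical rule: BODY_PART + nearby qualifiers -> GENERALIZED_SYMPTOM."""
    entities = sorted(entities, key=lambda e: e.start)
    merged, i = [], 0
    while i < len(entities):
        head, j = entities[i], i + 1
        if head.label == "BODY_PART":
            # Absorb following qualifiers separated by at most max_gap characters.
            while (j < len(entities)
                   and entities[j].label in QUALIFIERS
                   and entities[j].start - entities[j - 1].end <= max_gap):
                j += 1
            group = entities[i:j]
            if any(e.label == "DESCRIPTION" for e in group):
                start, end = group[0].start, group[-1].end
                merged.append(
                    Entity(sentence[start:end], "GENERALIZED_SYMPTOM", start, end))
                i = j
                continue
        merged.append(head)  # rule did not fire; keep the fine-grained entity
        i += 1
    return merged


sentence = "头从早痛到晚"  # "head hurting from morning to night"
fine = [
    Entity("头", "BODY_PART", 0, 1),
    Entity("早", "TIME", 2, 3),
    Entity("痛", "DESCRIPTION", 3, 4),
    Entity("晚", "TIME", 5, 6),
]
coarse = merge_generalized_symptom(sentence, fine)
```

A production rule set would need one rule per generalized type (examination, treatment, drug); this sketch covers only the symptom case from the example.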
  • the server invokes the second preset recognition model to perform relationship extraction on the initial question statement, and obtains multiple entity relationships.
  • the server can call the BiLSTM layer of the second preset recognition model to extract the contextual relationships of the initial question statement, obtaining multiple time-series vectors that indicate those contextual relationships, and then input the multiple time-series vectors into the Attention layer of the second preset recognition model.
  • the Attention layer generates multiple sentence feature vectors, which indicate entity relationships; the Attention layer first computes a weight for each time-series vector, then takes the weighted sum of all time-series vectors as the feature vector, and finally performs softmax classification.
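The three operations attributed to the Attention layer (per-step weights, weighted sum, softmax classification) can be sketched in plain Python. The source does not say how the weights are scored, so the dot product against a query vector used here is an assumption, as are the names `attention_pool`, `classify`, `query` and `class_weights`; in a trained model the query and class weights would be learned parameters.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time-step vector and return (weights, weighted sum).

    Scoring by dot product with `query` is an assumed choice.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states))
              for d in range(dim)]
    return weights, pooled


def classify(pooled, class_weights):
    """Linear layer followed by softmax over relationship classes."""
    logits = [sum(w_i * p_i for w_i, p_i in zip(w, pooled)) for w in class_weights]
    return softmax(logits)


# Two identical time-step vectors receive equal weights.
weights, pooled = attention_pool([[1.0, 0.0], [1.0, 0.0]], query=[0.5, 0.5])
probs = classify(pooled, class_weights=[[1.0, 0.0], [0.0, 1.0]])
```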
  • the server generates entity recognition results based on multiple coarse-grained entity labels and multiple entity relationships.
  • the server uses a deep learning model that fuses multi-dimensional features to perform entity recognition and relationship extraction separately, which reduces error propagation and interference from redundant information; at the same time, coarse-grained entity recognition is used to refine the fine-grained entity recognition results, which can further improve recognition accuracy.
  • the server performs entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain the linked entity labels. Specifically, the server looks up in the preset medical entity synonym table the multiple standard medical terms corresponding to the multiple coarse-grained entity labels, where each coarse-grained entity label corresponds to one standard medical term and is a synonym of it; the server fuses the multiple coarse-grained entity labels to obtain multiple fused coarse-grained entity labels; the server then performs the entity linking operation on the multiple fused coarse-grained entity labels and the multiple standard medical terms to generate the linked entity labels.
  • this embodiment mainly targets medical words that users express colloquially, linking them to standard medical terms. For example, if the user writes the colloquial "打了胎后" ("after getting an abortion"), the corresponding medical term is "人流后" ("after induced abortion"), so the former is linked to the latter. As another example, the user's description "小腹部胀痛" ("bloating pain in the lower belly") is linked to the standard term "下腹胀痛" ("lower abdominal distending pain"). "Pregnant 34+" and "pregnant 40+" both correspond to the standard medical term "third trimester", so both are linked to "third trimester".
  • before entity linking, the coarse-grained entity labels can also be fused to obtain fused coarse-grained entity labels.
  • for example, "pregnant 34+" and "pregnant 40+" both belong to the third trimester, so they can be fused into "34 to 40 weeks pregnant"; "34 to 40 weeks pregnant" is the fused coarse-grained entity label, which is then linked to the standard medical term.
  • entity normalization and entity fusion are required both to maintain the medical entity synonym table and to build a medical knowledge graph free of redundancy and conflicts, ensuring that the question-answering system is supported by high-quality data.
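The lookup-and-fuse behaviour described above can be sketched with a dictionary. The table entries are the examples given in the original Chinese text; the function name `link_entities`, the pass-through of unknown labels, and the simplification of fusion to "labels that resolve to the same standard term collapse into one" are assumptions of this sketch rather than the described procedure, which fuses labels into an intermediate form before linking.

```python
# Synonym table: colloquial label -> standard medical term.
SYNONYM_TABLE = {
    "打了胎后": "人流后",      # colloquial -> "after induced abortion"
    "小腹部胀痛": "下腹胀痛",  # colloquial -> "lower abdominal distending pain"
    "怀孕34+": "孕晚期",       # "pregnant 34+" -> "third trimester"
    "怀孕40+": "孕晚期",       # "pregnant 40+" -> "third trimester"
}


def link_entities(labels, synonym_table):
    """Map coarse-grained labels to standard terms, collapsing duplicates.

    Labels missing from the table are passed through unchanged (assumed
    fallback; the source does not describe this case).
    """
    linked = []
    for label in labels:
        standard = synonym_table.get(label, label)
        if standard not in linked:  # labels sharing a standard term are fused
            linked.append(standard)
    return linked


linked = link_entities(["怀孕34+", "怀孕40+", "小腹部胀痛"], SYNONYM_TABLE)
```

Keeping the table as plain data makes it easy to maintain separately from the code, which matches the emphasis on maintaining the synonym table.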
  • the server performs intent identification on the initial question sentence according to a preset intent identification model, entity identification results and linked entity labels, and obtains candidate medical intents.
  • the intent recognition model is a deep learning model consisting of an input layer, a BERT word vector layer, a BiLSTM layer, an Attention layer and a Softmax classification layer.
  • because the intent of a question is closely related to its entities and entity labels, this embodiment also uses the entity recognition result and the linked entity labels as input to the intent recognition model.
  • the initial question statement, the entity recognition result and the linked entity labels are combined as the sentence input of the input layer.
  • the BERT word vector layer generates word vectors from the input sentence, and its output serves as the input of the BiLSTM layer; the fully connected output of the BiLSTM layer serves as the input of the Attention layer; a Softmax classifier performs the final intent label classification on the output of the Attention layer.
  • this classification yields the candidate medical intent, where the intent types include: cause, explanation, complications, mode of transmission, treatment method, related examinations, disease diagnosis, precautions, efficacy, side effects/harm, operation method, usage/administration method, usage and dosage, dietary advice, yes/no, and so on.
  • the deep learning model is used to identify the user's intention, which reduces the number of templates, improves the coverage and accuracy of the question-and-answer situation in the real dialogue, and reduces the maintenance cost.
  • the server generates a knowledge graph query sentence according to the candidate medical intent.
  • the query mapping of the knowledge graph is performed in combination with the entity recognition result and the intent recognition result of the initial question sentence to generate a knowledge graph query sentence, wherein the query object may be a relationship between entities or an attribute of an entity.
  • the server performs a query on a preset medical knowledge graph based on a knowledge graph query statement, and obtains a knowledge graph query result.
  • the knowledge graph query result includes the relationships of the target entity, the attributes of the target entity, and multiple entities; the server generates the corresponding target phrase according to the relationships and attributes of the target entity and sends the target phrase to the terminal.
  • the processing differs according to the knowledge graph query result, that is, according to which entity type is queried; the entity types include entity relationships, entity attributes and entities. The specific process is as follows:
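  • if the knowledge graph query result is a query for the relationship of the target entity, the knowledge graph query statement queries the relationship of the target entity.
  • for example, when the user's initial question statement is "What are the complications of liver cirrhosis?", the entity recognition result is "liver cirrhosis: disease";
  • the intent recognition result is "complications";
  • the corresponding knowledge graph query statement is "match(n:Disease)-[r:Complication]-(m:Symptom) where n.name="liver cirrhosis" return m.name";
  • the name attributes of the nodes labeled as symptoms that are connected to liver cirrhosis by the complication relationship are combined;
  • this generates the target phrase "Complications of liver cirrhosis include impaired liver function, portal hypertension, gastrointestinal bleeding, hepatic encephalopathy, peritonitis, etc.", and the target phrase is sent to the terminal.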
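  • if the knowledge graph query result is a query for an attribute of the target entity, the knowledge graph query statement queries the attribute of the target entity.
  • for example, when the user's initial question statement is "What are the side effects of fibrate lipid-lowering drugs?", the entity extraction result is "fibrate lipid-lowering drugs: drug";
  • the intent recognition result is "side effects/harm", and the corresponding graph query statement is "match(n:Drug) where n.name="fibrate lipid-lowering drugs" return n.harm";
  • the target phrase "Adverse reactions to fibrate lipid-lowering drugs include gastrointestinal discomfort, rash, hair loss, headache, decreased libido, etc." is then generated from the side-effect attribute of fibrate lipid-lowering drugs and sent to the terminal;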
  • if the knowledge graph query result is a query over multiple entities, for example, the user's initial question statement is "What should I watch out for with buttock pain during pregnancy?", the entity extraction result is "pregnancy: special period, buttock pain: symptom", the intent recognition result is "precautions", and the corresponding graph query statement is "match(n:SpecialPeriod{name:"pregnancy"})-[:MultiConditionRestriction]->(p:SpanNode),(m:Symptom{name:"buttock pain"})-[:MultiConditionRestriction]->(p:SpanNode) return p.attention". The server determines the attention attribute value of the blank node related to both pregnancy and buttock pain, generates from that value the target phrase "For buttock pain during pregnancy, apply a hot towel or hot water bottle to the painful area for about half an hour; the pain can be relieved considerably", and sends the target phrase to the terminal.
  • the server converts the knowledge graph query statement according to the entity type, formulates a tailored phrase and returns the result to the terminal used by the user, which can provide auxiliary decision support for doctors in online consultation applications and make the consultation process more efficient.
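The mapping from recognized entities and intent to the three query shapes above can be sketched as a template lookup. The node labels, relationship types and property names (`Disease`, `Complication`, `Symptom`, `Drug`, `harm`, `SpecialPeriod`, `MultiConditionRestriction`, `SpanNode`, `attention`) are taken from the example queries; the function `build_query`, the two lookup tables, and the use of Cypher `$` parameters instead of inlined names are assumptions of this sketch.

```python
# Intent -> (relationship type, target node label); assumed lookup table.
RELATION_INTENTS = {"complications": ("Complication", "Symptom")}
# Intent -> node property; assumed lookup table.
ATTRIBUTE_INTENTS = {"side effects/harm": "harm", "precautions": "attention"}


def build_query(entities, intent):
    """Return (cypher, params) for a list of (name, node_label) entities."""
    if len(entities) > 1:
        # Multi-entity case: every entity must point at the same SpanNode.
        prop = ATTRIBUTE_INTENTS[intent]
        patterns = [
            f"(n{i}:{label} {{name: $name{i}}})"
            "-[:MultiConditionRestriction]->(p:SpanNode)"
            for i, (_, label) in enumerate(entities)
        ]
        params = {f"name{i}": name for i, (name, _) in enumerate(entities)}
        return f"MATCH {', '.join(patterns)} RETURN p.{prop}", params

    name, label = entities[0]
    if intent in RELATION_INTENTS:
        # Relationship case: return names of the connected nodes.
        rel, target = RELATION_INTENTS[intent]
        cypher = (f"MATCH (n:{label})-[r:{rel}]-(m:{target}) "
                  "WHERE n.name = $name RETURN m.name")
        return cypher, {"name": name}

    # Attribute case: return a property of the entity node itself.
    prop = ATTRIBUTE_INTENTS[intent]
    return f"MATCH (n:{label}) WHERE n.name = $name RETURN n.{prop}", {"name": name}


relation_query = build_query([("liver cirrhosis", "Disease")], "complications")
attribute_query = build_query([("fibrate lipid-lowering drugs", "Drug")],
                              "side effects/harm")
multi_query = build_query([("pregnancy", "SpecialPeriod"),
                           ("buttock pain", "Symptom")], "precautions")
```

Passing entity names as parameters keeps user-supplied text out of the query string; the labels and property names still come from fixed tables, so they are safe to interpolate.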
  • a deep learning model integrating multi-dimensional features is used to separate entity recognition and relationship extraction, and at the same time, coarse-grained entity recognition is used to optimize the fine-grained entity recognition result, which reduces error transmission and redundant information in the process of entity extraction. This improves the accuracy of entity recognition results, which in turn improves the accuracy of intent recognition results in the medical field. And this solution can be applied in the field of smart medical care, so as to promote the construction of smart city.
  • An embodiment of the apparatus for recognizing intent in the medical field in the embodiment of the present application includes:
  • a statement acquisition module 301 configured to acquire an initial question statement from a terminal, where the initial question statement is a question statement input by the target user in the medical intelligent question answering system;
  • the entity recognition module 302 is configured to call a preset recognition model to perform entity recognition on the initial question statement, and obtain an entity recognition result, where the entity recognition result includes multiple coarse-grained entity labels and multiple entity relationships;
  • the entity linking module 303 is configured to perform entity linking on the plurality of coarse-grained entity tags according to a preset synonym table of medical entities, to obtain the linked entity tags;
  • an intent recognition module 304 configured to perform intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity label, to obtain candidate medical intents;
  • a statement generation module 305 configured to generate a knowledge graph query statement according to the candidate medical intent
  • the graph query module 306 is configured to perform a knowledge graph query in a preset medical knowledge graph based on the knowledge graph query statement, obtain a knowledge graph query result, generate a corresponding target phrase according to the knowledge graph query result, and send it to the terminal.
  • a deep learning model integrating multi-dimensional features is used to separate entity recognition and relationship extraction, and at the same time, coarse-grained entity recognition is used to optimize the fine-grained entity recognition result, which reduces error transmission and redundant information in the process of entity extraction.
  • This improves the accuracy of entity recognition results, which in turn improves the accuracy of intent recognition results in the medical field.
  • this solution can be applied in the field of smart medical care, thereby promoting the construction of smart cities.
  • another embodiment of the device for recognizing intent in the medical field in the embodiment of the present application includes:
  • a statement acquisition module 301 configured to acquire an initial question statement from a terminal, where the initial question statement is a question statement input by the target user in the medical intelligent question answering system;
  • the entity recognition module 302 is configured to call a preset recognition model to perform entity recognition on the initial question statement, and obtain an entity recognition result, where the entity recognition result includes multiple coarse-grained entity labels and multiple entity relationships;
  • the entity linking module 303 is configured to perform entity linking on the plurality of coarse-grained entity tags according to a preset synonym table of medical entities, to obtain the linked entity tags;
  • an intent recognition module 304 configured to perform intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity label, to obtain candidate medical intents;
  • a statement generation module 305 configured to generate a knowledge graph query statement according to the candidate medical intent
  • the graph query module 306 is configured to perform a knowledge graph query in a preset medical knowledge graph based on the knowledge graph query statement, obtain a knowledge graph query result, generate a corresponding target phrase according to the knowledge graph query result, and send it to the terminal.
  • the entity recognition module 302 includes:
  • the entity recognition unit 3021 is used to call the first preset recognition model to perform entity recognition on the initial question statement, and obtain a plurality of coarse-grained entity labels;
  • a relationship extraction unit 3022 configured to invoke the second preset recognition model to perform relationship extraction on the initial question statement to obtain a plurality of entity relationships
  • a generating unit 3023 configured to generate an entity recognition result according to the plurality of coarse-grained entity tags and the plurality of entity relationships.
  • the entity recognition unit 3021 includes:
  • the first recognition subunit 30211, configured to call the first preset recognition model to perform fine-grained entity recognition on the initial question statement and obtain multiple fine-grained entity labels;
  • the second recognition subunit 30212, configured to call the first preset recognition model to perform coarse-grained entity recognition on the multiple fine-grained entity labels and obtain multiple coarse-grained entity labels.
  • the first recognition subunit 30211 is specifically configured to:
  • extract multiple feature dimension vectors from the initial question statement at fine granularity, the multiple feature dimension vectors including word vectors, word label vectors, word position vectors and part-of-speech feature vectors; input the multiple feature dimension vectors into the BiLSTM layer of the first preset recognition model to obtain multiple intermediate vectors output by the BiLSTM layer; input the multiple intermediate vectors into the CRF layer of the first preset recognition model to generate multiple fine-grained entity labels.
  • the second recognition subunit 30212 is specifically configured to:
  • call the first preset recognition model to recognize the multiple fine-grained entity labels at coarse granularity and obtain multiple narrow-sense entity features and multiple qualifier entity features, where the narrow-sense entity features include symptom, disease, body part, medicine, examination and treatment, and the qualifier entity features include time, frequency, degree, negation word, description and value; combine the narrow-sense entity features and the qualifier entity features according to preset rules to generate multiple generalized entity features, which include generalized symptom, generalized examination, generalized treatment and generalized drug; determine the multiple generalized entity features as the multiple coarse-grained entity labels.
  • the entity linking module 303 is specifically configured to:
  • look up in the preset medical entity synonym table the multiple standard medical terms corresponding to the multiple coarse-grained entity labels, where each coarse-grained entity label corresponds to one standard medical term and is a synonym of it;
  • fuse the multiple coarse-grained entity labels to obtain multiple fused coarse-grained entity labels, and perform the entity linking operation on the multiple fused coarse-grained entity labels and the multiple standard medical terms to generate the linked entity labels.
  • the graph query module 306 is specifically configured to:
  • query the preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, where the knowledge graph query result includes the relationships of the target entity, the attributes of the target entity and multiple entities;
  • generate the corresponding target phrase according to the relationships and attributes of the target entity, and send the target phrase to the terminal.
  • a deep learning model integrating multi-dimensional features is used to separate entity recognition and relationship extraction, and at the same time, coarse-grained entity recognition is used to optimize the fine-grained entity recognition result, which reduces error transmission and redundant information in the process of entity extraction.
  • This improves the accuracy of entity recognition results, which in turn improves the accuracy of intent recognition results in the medical field.
  • this solution can be applied in the field of smart medical care, thereby promoting the construction of smart cities.
  • FIGS 3 to 4 above describe in detail the medical domain intent identification device in the embodiment of the present application from the perspective of modular functional entities, and the following describes the medical domain intent identification device in the embodiment of the present application in detail from the perspective of hardware processing.
  • FIG. 5 is a schematic structural diagram of a medical field intention identification device provided by an embodiment of the present application.
  • the medical field intention identification device 500 may vary greatly depending on configuration or performance, and may include one or more processors (central processing units, CPUs) 510, a memory 520, and one or more storage media 530 (e.g., one or more mass storage devices) that store application programs 533 or data 532.
  • the memory 520 and the storage medium 530 may be transient storage or persistent storage.
  • the program stored in the storage medium 530 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the medical field intention recognition device 500.
  • the processor 510 may be configured to communicate with the storage medium 530 and to execute, on the medical field intent recognition device 500, the series of instruction operations in the storage medium 530.
  • the medical domain intent identification device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input and output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and so on.
  • the present application also provides a device for recognizing intent in the medical field, comprising a memory and at least one processor, wherein instructions are stored in the memory and the memory and the at least one processor are interconnected by a line; the at least one processor invokes the instructions in the memory so that the medical field intention recognition device performs the steps of the above medical field intention recognition method.
  • the present application also provides a computer-readable storage medium, and the computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
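  • the computer-readable storage medium stores computer instructions, and when the computer instructions are run on a computer, they cause the computer to perform the following steps:
  • obtaining an initial question statement from a terminal, the initial question statement being a question statement entered by a target user in a medical intelligent question-answering system; calling a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, the entity recognition result including multiple coarse-grained entity labels and multiple entity relationships; performing entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; performing intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; generating a knowledge graph query statement according to the candidate medical intent; and performing a knowledge graph query in a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generating a corresponding target phrase according to the knowledge graph query result, and sending it to the terminal.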
  • the integrated unit if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • the technical solutions of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Human Computer Interaction (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

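A medical-field intent recognition method, apparatus, device and storage medium, applied in the smart healthcare field, for improving the accuracy of intent recognition results in the medical field. The method includes: obtaining an initial question statement from a terminal; calling a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result; performing entity linking on multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; performing intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; generating a knowledge graph query statement according to the candidate medical intent; and performing a knowledge graph query in a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generating a corresponding target phrase according to the knowledge graph query result, and sending it to the terminal.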

Description

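Medical-field intent recognition method, apparatus, device and storage medium
This application claims priority to Chinese patent application No. 202010884353.8, entitled "Medical-field intent recognition method, apparatus, device and storage medium", filed with the China Patent Office on August 28, 2020, the entire contents of which are incorporated herein by reference.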
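Technical Field
This application relates to the field of medical data, and in particular to a medical-field intent recognition method, apparatus, device and storage medium.
Background
With the application of computer technology in the medical field, online consultation has gradually broken through the limitations of traditional medical visits and brought users a convenient and efficient medical experience. Users can meet their medical needs without leaving home, avoiding long journeys and queuing for registration, while medical resources are saved and consultation efficiency is improved. With the development of natural language processing technology, online consultation systems are gradually becoming more intelligent. For example, introducing an intelligent question-answering engine into a consultation system allows it to answer user questions in place of a doctor during the consultation and to provide auxiliary decision support for doctors, making the consultation process more efficient.
A traditional medical question-answering system uses a large number of manually proofread question-answer pairs as its knowledge base and, based on text similarity, returns to the user the answer to the stored question most similar to the user's question. The inventor realized that, because patient groups, the ways disease symptoms are described, and the corresponding treatments are diverse and specific, fixed question-answer knowledge cannot cover them and cannot form a reasoning mechanism; in addition, maintaining the knowledge base requires substantial labor cost. Template-based medical question-answering systems therefore perform intent recognition by rule matching or sentence-pattern matching, cannot fully cover the diverse ways in which questions are phrased, and recognize medical-field intent with low accuracy.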
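Summary of the Invention
This application provides a medical-field intent recognition method, apparatus, device and storage medium, which solve the problem of low accuracy in recognizing medical-field intent.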
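To achieve the above objective, a first aspect of this application provides a medical-field intent recognition method, including: obtaining an initial question statement from a terminal, the initial question statement being a question statement entered by a target user in a medical intelligent question-answering system; calling a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, the entity recognition result including multiple coarse-grained entity labels and multiple entity relationships; performing entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; performing intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; generating a knowledge graph query statement according to the candidate medical intent; and performing a knowledge graph query in a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generating a corresponding target phrase according to the knowledge graph query result, and sending it to the terminal.
A second aspect of this application provides a medical-field intent recognition device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor, when executing the computer-readable instructions, implements the following steps: obtaining an initial question statement from a terminal, the initial question statement being a question statement entered by a target user in a medical intelligent question-answering system; calling a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, the entity recognition result including multiple coarse-grained entity labels and multiple entity relationships; performing entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; performing intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; generating a knowledge graph query statement according to the candidate medical intent; and performing a knowledge graph query in a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generating a corresponding target phrase according to the knowledge graph query result, and sending it to the terminal.
A third aspect of this application provides a computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the following steps: obtaining an initial question statement from a terminal, the initial question statement being a question statement entered by a target user in a medical intelligent question-answering system; calling a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, the entity recognition result including multiple coarse-grained entity labels and multiple entity relationships; performing entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; performing intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; generating a knowledge graph query statement according to the candidate medical intent; and performing a knowledge graph query in a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generating a corresponding target phrase according to the knowledge graph query result, and sending it to the terminal.
A fourth aspect of this application provides a medical-field intent recognition apparatus, including: a statement acquisition module, configured to obtain an initial question statement from a terminal, the initial question statement being a question statement entered by a target user in a medical intelligent question-answering system; an entity recognition module, configured to call a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, the entity recognition result including multiple coarse-grained entity labels and multiple entity relationships; an entity linking module, configured to perform entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; an intent recognition module, configured to perform intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; a statement generation module, configured to generate a knowledge graph query statement according to the candidate medical intent; and a graph query module, configured to perform a knowledge graph query in a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generate a corresponding target phrase according to the knowledge graph query result, and send it to the terminal.
In the technical solution provided by this application, an initial question statement is obtained from a terminal, the initial question statement being a question statement entered by a target user in a medical intelligent question-answering system; a preset recognition model is called to perform entity recognition on the initial question statement to obtain an entity recognition result that includes multiple coarse-grained entity labels and multiple entity relationships; entity linking is performed on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels; intent recognition is performed on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent; a knowledge graph query statement is generated according to the candidate medical intent; and a knowledge graph query is performed in a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, from which a corresponding target phrase is generated and sent to the terminal. In the embodiments of this application, a deep learning model that fuses multi-dimensional features performs entity recognition and relationship extraction separately, and coarse-grained entity recognition is used to refine the fine-grained entity recognition results, which reduces error propagation and interference from redundant information during entity extraction, improves the accuracy of the entity recognition results, and in turn improves the accuracy of medical-field intent recognition results.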
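Brief Description of the Drawings
FIG. 1 is a schematic diagram of one embodiment of the medical-field intent recognition method in the embodiments of this application;
FIG. 2 is a schematic diagram of another embodiment of the medical-field intent recognition method in the embodiments of this application;
FIG. 3 is a schematic diagram of one embodiment of the medical-field intent recognition apparatus in the embodiments of this application;
FIG. 4 is a schematic diagram of another embodiment of the medical-field intent recognition apparatus in the embodiments of this application;
FIG. 5 is a schematic diagram of one embodiment of the medical-field intent recognition device in the embodiments of this application.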
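Detailed Description
This application provides a medical-field intent recognition method, apparatus, device and storage medium for reducing error propagation and interference from redundant information during entity extraction, improving the accuracy of entity recognition results, and in turn improving the accuracy of medical-field intent recognition results.
To help those skilled in the art better understand the solution of this application, the embodiments of this application are described below with reference to the accompanying drawings.
The terms "first", "second", "third", "fourth" and the like (if present) in the specification, claims and the above drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein. In addition, the terms "include" or "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to such a process, method, product or device.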
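Referring to FIG. 1, a flowchart of the medical-field intent recognition method provided by an embodiment of this application specifically includes:
101. Obtain an initial question statement from a terminal, the initial question statement being a question statement entered by a target user in a medical intelligent question-answering system.
The server obtains the initial question statement from the terminal; the initial question statement is a question statement entered by the target user in the medical intelligent question-answering system. The initial question statement is a medical question that the user wants answered, for example, "Can I drink alcohol after a cephalosporin injection?" or "Which department should I visit for muscle soreness?" This embodiment does not limit the consultation field of the initial question statement, as long as it is medically relevant.
It can be understood that the execution subject of this application may be a medical-field intent recognition apparatus or a server, which is not specifically limited here. The embodiments of this application are described with the server as the execution subject.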
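102. Call a preset recognition model to perform entity recognition on the initial question statement to obtain an entity recognition result, the entity recognition result including multiple coarse-grained entity labels and multiple entity relationships.
Specifically, the server calls a first preset recognition model to perform entity recognition on the initial question statement to obtain multiple coarse-grained entity labels; calls a second preset recognition model to perform relationship extraction on the initial question statement to obtain multiple entity relationships; and generates the entity recognition result according to the multiple coarse-grained entity labels and the multiple entity relationships.
The server can call the BiLSTM layer of the second preset recognition model to extract the contextual relationships of the initial question statement, obtaining multiple time-series vectors that indicate those contextual relationships, and input the multiple time-series vectors into the Attention layer of the second preset recognition model to generate multiple sentence feature vectors that indicate entity relationships. The Attention layer first computes a weight for each time-series vector, then takes the weighted sum of all time-series vectors as the feature vector, and finally performs softmax classification.
Optionally, the server calling the first preset recognition model to perform entity recognition on the initial question statement to obtain multiple coarse-grained entity labels specifically includes:
The server calls the first preset recognition model to perform fine-grained entity recognition on the initial question statement to obtain multiple fine-grained entity labels; the server calls the first preset recognition model to perform coarse-grained entity recognition on the multiple fine-grained entity labels to obtain multiple coarse-grained entity labels.
In this embodiment, a deep learning model that fuses multi-dimensional features performs entity recognition and relationship extraction separately, which reduces error propagation and interference from redundant information; at the same time, coarse-grained entity recognition is used to refine the fine-grained entity recognition results, which can further improve recognition accuracy.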
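103. Perform entity linking on the multiple coarse-grained entity labels according to a preset medical entity synonym table to obtain linked entity labels.
The server performs entity linking on the multiple coarse-grained entity labels according to the preset medical entity synonym table to obtain the linked entity labels. Specifically, the server looks up in the preset medical entity synonym table the multiple standard medical terms corresponding to the multiple coarse-grained entity labels, where each coarse-grained entity label corresponds to one standard medical term and is a synonym of it; the server fuses the multiple coarse-grained entity labels to obtain multiple fused coarse-grained entity labels; the server then performs the entity linking operation on the multiple fused coarse-grained entity labels and the multiple standard medical terms to generate the linked entity labels.
This embodiment mainly targets medical words that users express colloquially, linking them to standard medical terms. For example, if the user writes the colloquial "打了胎后" ("after getting an abortion"), the corresponding medical term is "人流后" ("after induced abortion"), so the former is linked to the latter. As another example, the user's description "小腹部胀痛" ("bloating pain in the lower belly") is linked to the standard term "下腹胀痛" ("lower abdominal distending pain"). "怀孕34+" ("pregnant 34+") and "怀孕40+" ("pregnant 40+") both correspond to the standard medical term "孕晚期" ("third trimester"), so both are linked to it.
It should be noted that, before entity linking, the coarse-grained entity labels can also be fused to obtain fused coarse-grained entity labels. For example, "pregnant 34+" and "pregnant 40+" both belong to the third trimester, so they can be fused into "34 to 40 weeks pregnant"; "34 to 40 weeks pregnant" is the fused coarse-grained entity label, which is then linked to the standard medical term.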
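104. Perform intent recognition on the initial question statement according to a preset intent recognition model, the entity recognition result and the linked entity labels to obtain a candidate medical intent.
The server performs intent recognition on the initial question statement according to the preset intent recognition model, the entity recognition result and the linked entity labels to obtain the candidate medical intent.
The intent recognition model is a deep learning model consisting of an input layer, a BERT word vector layer, a BiLSTM layer, an Attention layer and a Softmax classification layer. Because the intent of a question is closely related to its entities and entity labels, this embodiment also uses the entity recognition result and the linked entity labels as input to the intent recognition model: the initial question statement, the entity recognition result and the linked entity labels are combined as the sentence input of the input layer.
The BERT word vector layer generates word vectors from the input sentence, and its output serves as the input of the BiLSTM layer; the fully connected output of the BiLSTM layer serves as the input of the Attention layer; a Softmax classifier performs the final intent label classification on the output of the Attention layer to obtain the candidate medical intent. The intent types include: cause, explanation, complications, mode of transmission, treatment method, related examinations, disease diagnosis, precautions, efficacy, side effects/harm, operation method, usage/administration method, usage and dosage, dietary advice, yes/no, and so on.
In this embodiment, a deep learning model is used to recognize user intent, which reduces the number of templates, improves the coverage and accuracy of question answering in real dialogues, and lowers maintenance cost.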
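105. Generate a knowledge graph query statement according to the candidate medical intent.
The server generates the knowledge graph query statement according to the candidate medical intent.
In this embodiment, the entity recognition result and the intent recognition result of the initial question statement are combined to perform the knowledge graph query mapping and generate the knowledge graph query statement, where the query object may be a relationship between entities or an attribute of an entity.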
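106. Perform a knowledge graph query in a preset medical knowledge graph based on the knowledge graph query statement to obtain a knowledge graph query result, generate a corresponding target phrase according to the knowledge graph query result, and send it to the terminal.
Specifically, the server queries the preset medical knowledge graph based on the knowledge graph query statement to obtain the knowledge graph query result, which includes the relationships of the target entity, the attributes of the target entity, and multiple entities; the server generates the corresponding target phrase according to the relationships and attributes of the target entity and sends the target phrase to the terminal.
The processing differs according to the knowledge graph query result, that is, according to which entity type is queried; the entity types include entity relationships, entity attributes and entities. The specific process is as follows: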
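If the knowledge graph query result is a query for the relationship of the target entity, the knowledge graph query statement queries the relationship of the target entity. For example, when the user's initial question statement is "What are the complications of liver cirrhosis?", the entity recognition result is "liver cirrhosis: disease", the intent recognition result is "complications", and the corresponding knowledge graph query statement is "match(n:Disease)-[r:Complication]-(m:Symptom) where n.name="肝硬化" return m.name" (肝硬化 is liver cirrhosis). The name attributes of the nodes labeled as symptoms that are connected to liver cirrhosis by the complication relationship are combined to generate the target phrase "Complications of liver cirrhosis include impaired liver function, portal hypertension, gastrointestinal bleeding, hepatic encephalopathy, peritonitis, etc.", and the target phrase is sent to the terminal.
If the knowledge graph query result is a query for an attribute of the target entity, the knowledge graph query statement queries the attribute of the target entity. For example, when the user's initial question statement is "What are the side effects of fibrate lipid-lowering drugs?", the entity extraction result is "fibrate lipid-lowering drugs: drug", the intent recognition result is "side effects/harm", and the corresponding graph query statement is "match(n:Drug) where n.name="贝特类降脂药" return n.harm" (贝特类降脂药 is fibrate lipid-lowering drugs). The target phrase "Adverse reactions to fibrate lipid-lowering drugs include gastrointestinal discomfort, rash, hair loss, headache, decreased libido, etc." is then generated from the side-effect attribute of fibrate lipid-lowering drugs and sent to the terminal.
If the knowledge graph query result is a query over multiple entities, for example, the user's initial question statement is "What should I watch out for with buttock pain during pregnancy?", the entity extraction result is "pregnancy: special period, buttock pain: symptom", the intent recognition result is "precautions", and the corresponding graph query statement is "match(n:SpecialPeriod{name:"孕期"})-[:MultiConditionRestriction]->(p:SpanNode),(m:Symptom{name:"屁股痛"})-[:MultiConditionRestriction]->(p:SpanNode) return p.attention" (孕期 is pregnancy; 屁股痛 is buttock pain). The server determines the precautions attribute value of the blank node related to both pregnancy and buttock pain, generates from that value the target phrase "For buttock pain during pregnancy, apply a hot towel or hot water bottle to the painful area for about half an hour; the pain can be relieved considerably", and sends the target phrase to the terminal.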
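It can be understood that the server converts the knowledge graph query statement according to the entity type, formulates a tailored phrase and returns the result to the terminal used by the user, which can provide auxiliary decision support for doctors in online consultation applications and make the consultation process more efficient.
In the embodiments of this application, a deep learning model that fuses multi-dimensional features performs entity recognition and relationship extraction separately, and coarse-grained entity recognition is used to refine the fine-grained entity recognition results, which reduces error propagation and interference from redundant information during entity extraction, improves the accuracy of the entity recognition results, and in turn improves the accuracy of medical-field intent recognition results. This solution can also be applied in the smart healthcare field, thereby promoting the construction of smart cities.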
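Referring to FIG. 2, another flowchart of the medical-field intent recognition method provided by an embodiment of this application specifically includes:
201. Build the preset medical knowledge graph.
The server builds the preset medical knowledge graph. This specifically includes:
(1) The server obtains multiple data sources, including structured medical data, semi-structured medical data and online medical consultation dialogue data.
Structured medical data mainly comes from disease, drug, and examination and test data already stored in the business's relational databases; semi-structured medical data mainly comes from medical data on Wikipedia and Baidu Baike, stored as semi-structured data after cleaning. The text of such structured and semi-structured data is long and highly specialized, and is not easy for users to understand. Therefore, when building the medical knowledge graph, this embodiment also uses as a data source the question-answer knowledge produced in online consultation dialogues and proofread by doctors (that is, the online medical consultation dialogue data). The solution of this application thus leans toward simulating real consultation dialogues, which improves the user's consultation experience.
(2) The server performs entity extraction on the multiple data sources to obtain multiple entities and multiple entity relationships, and sets the entity attributes corresponding to the multiple entities and the relationship attributes corresponding to the multiple entity relationships.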
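This embodiment builds the graph top-down, that is, it applies deep-learning-based entity recognition and relationship extraction to the structured medical data and the semi-structured medical data and adds the results to the knowledge graph.
Optionally, step (2) specifically includes:
The server uses a deep learning model to perform entity recognition and relationship extraction on the structured medical data; the server uses a deep learning model to perform entity recognition and relationship extraction on the semi-structured medical data; the server generates multiple entities and multiple entity relationships; the server sets corresponding attributes for each entity to obtain multiple entity attributes, and sets corresponding attributes for each entity relationship to obtain multiple entity relationship attributes.
The multiple entities include department, disease, symptom, drug, treatment, food and health product, and the entity relationships include consulting department, related symptom, suitable drug and complication. Different types of entities or relationships can be given their own corresponding attributes. For example, the entity "disease" has attributes such as "explanation", "cause" and "incidence"; the entity "drug" has attributes such as "specification", "efficacy" and "contraindication"; and the entity relationship "complication" has values such as "shock" and "infection".
(3) The server builds an initial knowledge graph with a preset deep learning model according to the multiple entities, the entity attributes corresponding to the multiple entities, the multiple entity relationships, and the relationship attributes corresponding to the multiple entity relationships.
(4) The server performs entity alignment and relationship fusion on the initial knowledge graph to generate the preset medical knowledge graph.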
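The purpose of entity alignment and relationship fusion is to find and merge multi-source heterogeneous entities that have different entity names in different data sources but represent the same concept or thing, merging the attributes and relationships of those entities. Entity alignment uses the commonly adopted method based on attribute similarity scoring, which is prior art and is not described in detail here.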
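Since the text leaves the attribute-similarity method unspecified, the following is only one plausible sketch of it. Jaccard similarity over (attribute, value) pairs, the 0.5 threshold, the greedy merge order, and the names `attribute_similarity` and `align_entities` are all assumptions, and the sample records are hypothetical.

```python
def attribute_similarity(a, b):
    """Jaccard similarity over (attribute, value) pairs (assumed scoring)."""
    pairs_a, pairs_b = set(a.items()), set(b.items())
    if not pairs_a and not pairs_b:
        return 0.0
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


def align_entities(entities, threshold=0.5):
    """Greedily merge entities whose attribute similarity reaches threshold.

    Each input entity is a dict with 'name' and 'attrs'; each output entity
    also carries the set of merged names under 'aliases'.
    """
    merged = []
    for ent in entities:
        for target in merged:
            if attribute_similarity(ent["attrs"], target["attrs"]) >= threshold:
                target["aliases"].add(ent["name"])
                target["attrs"].update(ent["attrs"])  # merge attributes
                break
        else:
            merged.append({"name": ent["name"],
                           "aliases": {ent["name"]},
                           "attrs": dict(ent["attrs"])})
    return merged


# Hypothetical records from two sources describing the same concept.
records = [
    {"name": "heart attack",
     "attrs": {"category": "disease", "department": "cardiology"}},
    {"name": "myocardial infarction",
     "attrs": {"category": "disease", "department": "cardiology",
               "source": "encyclopedia"}},
    {"name": "aspirin", "attrs": {"category": "drug", "form": "tablet"}},
]
aligned = align_entities(records)
```

A real system would likely weight attributes differently and compare names as well; the greedy single pass here is chosen only to keep the example short.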
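202. Obtain an initial question statement from a terminal, the initial question statement being a question statement entered by a target user in a medical intelligent question-answering system.
The server obtains the initial question statement from the terminal; the initial question statement is a question statement entered by the target user in the medical intelligent question-answering system. The initial question statement is a medical question that the user wants answered, for example, "Can I drink alcohol after a cephalosporin injection?" or "Which department should I visit for muscle soreness?" This embodiment does not limit the consultation field of the initial question statement, as long as it is medically relevant.
It can be understood that the execution subject of this application may be a medical-field intent recognition apparatus or a server, which is not specifically limited here. The embodiments of this application are described with the server as the execution subject.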
203、调用第一预置识别模型对初始问题语句进行实体识别,得到多个粗粒度实体标签。
具体的,服务器调用第一预置识别模型对初始问题语句按照细粒度进行实体识别,得到多个细粒度实体标签;服务器调用第一预置识别模型对多个细粒度实体标签按照粗粒度进行实体识别,得到多个粗粒度实体标签。
可选的,服务器调用第一预置识别模型对初始问题语句按照细粒度进行实体识别,得到多个细粒度实体标签的过程具体包括:
服务器按照细粒度对初始问题提取多个特征维度向量,多个特征维度向量包括词向量、词标签向量、词位置向量和词性特征向量;服务器将多个特征维度向量输入到第一预置识别模型的BiLSTM层中,得到BiLSTM层输出的多个中间向量;服务器将多个中间向量输入到第一预置识别模型的CRF层中,生成多个细粒度实体标签。
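Here, the word label vector is the word label after BIOES encoding, the word position vector is the position vector of each character after segmentation with the jieba word segmentation tool, and the part-of-speech feature vector is the part-of-speech vector of each character after part-of-speech tagging with jieba.
It should be noted that Chinese words have no explicit boundaries, and words formed from the same characters in a different order differ in meaning. For example, in "产妇肚子痛应立即到妇产科就医" ("a pregnant woman with abdominal pain should go to the obstetrics and gynecology department immediately"), "产妇" is labeled "population" while "妇产科" is labeled "department". Word position information can therefore serve as an effective feature. Part of speech is an important property of a word: it expresses more abstract word features and helps reveal the structural relations within a sentence. Entity labels such as "disease", "symptom", and "population" are all nouns, so part of speech is strongly correlated with named entities, and adding part-of-speech information to the model can further improve entity recognition performance. Comparative experiments found that adding the word position and part-of-speech features improves the recognition accuracy of the preset recognition model by 5 percentage points.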
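To make the BIOES label encoding and the character-position feature concrete, the following is a minimal sketch. It assumes the sentence has already been segmented into words (the application uses jieba for this; the segmentation is hard-coded here), and the B/M/E/S position markers and the label names `POPULATION` and `SYMPTOM` are illustrative assumptions rather than the encoding used in the application.

```python
def bioes_tags(labeled_words):
    """Return one BIOES tag per character.

    labeled_words: list of (word, label) pairs; label is None for non-entities.
    """
    tags = []
    for word, label in labeled_words:
        if label is None:
            tags.extend(["O"] * len(word))
        elif len(word) == 1:
            tags.append(f"S-{label}")                        # single-character entity
        else:
            tags.append(f"B-{label}")                        # begin
            tags.extend([f"I-{label}"] * (len(word) - 2))    # inside
            tags.append(f"E-{label}")                        # end
    return tags


def position_features(words):
    """Mark each character's position inside its word: B, M, E, or S (assumed scheme)."""
    marks = []
    for word in words:
        if len(word) == 1:
            marks.append("S")
        else:
            marks.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return marks


# Hard-coded segmentation of "产妇肚子痛"; a real pipeline would obtain it from a segmenter.
tags = bioes_tags([("产妇", "POPULATION"), ("肚子痛", "SYMPTOM")])
marks = position_features(["产妇", "肚子", "痛"])
```

Each character then carries several aligned features (its embedding, its BIOES tag during training, its in-word position, and its part of speech), which are concatenated before being fed to the BiLSTM layer.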
可选的,服务器调用第一预置识别模型对多个细粒度实体标签按照粗粒度进行实体识别,得到多个粗粒度实体标签过程具体包括:
服务器调用第一预置识别模型按照粗粒度对多个细粒度实体标签进行识别,得到多个狭义实体特征和多个限定实体特征,多个狭义实体特征包括症状、疾病、部位、医学、检查和治疗,多个限定实体特征包括时间、频率、程度、否定词、描述和数值;服务器将多个狭义实体特征和多个限定实体特征按照预置规则进行组合,生成多个广义实体特征,多个广义实体特征包括广义症状、广义检查、广义治疗和广义药物;服务器将多个广义实体特征确定为多个粗粒度实体标签。
例如,用户问题语句为“医生您好,我最近头从早痛到晚,请问是什么原因呢?”,按照细粒度实体识别得到“头”为身体部位,“痛”是描述性用语,“早”是时间,“晚”是时间,按照粗粒度实体识别规则将“头从早痛到晚”识别为广义症状。
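The combination of narrow entities and qualifiers into a generalized entity can be sketched as a simple span-merging rule. The preset rules themselves are not disclosed, so the rule below (merge neighbouring fine-grained spans separated by at most `max_gap` unlabeled characters, then require one narrow entity plus one qualifier) and the English label names are assumptions.

```python
# Assumed mapping from narrow entity types to generalized (coarse-grained) types.
NARROW_TO_GENERALIZED = {
    "body_part": "generalized_symptom",
    "symptom": "generalized_symptom",
    "examination": "generalized_examination",
    "treatment": "generalized_treatment",
}
QUALIFIERS = {"time", "frequency", "degree", "negation", "description", "number"}


def merge_to_generalized(text, spans, max_gap=1):
    """Merge nearby fine-grained spans into coarse-grained entities.

    spans: list of (start, end, label) sorted by start, with end exclusive.
    """
    groups, current = [], []
    for span in spans:
        # Start a new group when the gap to the previous span is too large.
        if current and span[0] - current[-1][1] > max_gap:
            groups.append(current)
            current = []
        current.append(span)
    if current:
        groups.append(current)

    results = []
    for group in groups:
        labels = [label for _, _, label in group]
        narrow = next((l for l in labels if l in NARROW_TO_GENERALIZED), None)
        if narrow is not None and QUALIFIERS & set(labels):
            surface = text[group[0][0]:group[-1][1]]
            results.append((surface, NARROW_TO_GENERALIZED[narrow]))
    return results


text = "我最近头从早痛到晚"
fine_spans = [(3, 4, "body_part"), (5, 6, "time"), (6, 7, "description"), (8, 9, "time")]
coarse = merge_to_generalized(text, fine_spans)
```

With the fine-grained labels from the example in the text, the sketch returns the whole phrase "头从早痛到晚" as a single generalized symptom.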
204、调用第二预置识别模型对初始问题语句进行关系抽取,得到多个实体关系。
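The server calls the second preset recognition model to perform relation extraction on the initial question statement and obtains multiple entity relations. Specifically, the server may call the BiLSTM layer of the second preset recognition model to extract the contextual relations of the initial question statement and obtain multiple time-step vectors, which indicate the contextual relations. The time-step vectors are then input to the attention layer of the second preset recognition model to generate multiple sentence feature vectors, which indicate the entity relations. The attention layer first computes a weight for each time-step vector, then takes the weighted sum of all time-step vectors as the feature vector, and finally performs softmax classification.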
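The attention step described above (one weight per time-step vector, then a weighted sum) can be written out in a few lines. This is a plain-Python sketch, not the model of the application: scoring each time-step vector by a dot product with a `query` vector is an assumption, since the scoring function is not specified.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(time_step_vectors, query):
    """Weight each time-step vector and return (weights, weighted-sum feature vector)."""
    # Assumed scoring: dot product between each time-step vector and the query.
    scores = [sum(h * q for h, q in zip(vec, query)) for vec in time_step_vectors]
    weights = softmax(scores)
    dim = len(time_step_vectors[0])
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, time_step_vectors))
        for d in range(dim)
    ]
    return weights, pooled


# Two toy BiLSTM outputs; a zero query gives both time steps equal weight.
weights, feature = attention_pool([[1.0, 0.0], [0.0, 1.0]], query=[0.0, 0.0])
```

The pooled feature vector would then go through a linear layer and a final `softmax` over the relation types to obtain the predicted relation.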
205、根据多个粗粒度实体标签和多个实体关系生成实体识别结果。
服务器根据多个粗粒度实体标签和多个实体关系生成实体识别结果。
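In this embodiment, the server uses deep learning models that fuse multi-dimensional features to perform entity recognition and relation extraction separately, which reduces error propagation and interference from redundant information. Coarse-grained entity recognition is also used to refine the fine-grained entity recognition results, which can further improve recognition accuracy.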
206、根据预置的医疗实体同义词表对多个粗粒度实体标签进行实体链接,得到链接后的实体标签。
服务器根据预置的医疗实体同义词表对多个粗粒度实体标签进行实体链接,得到链接后的实体标签。具体的,服务器在预置的医疗实体同义词表中查找多个粗粒度实体标签对应的多个标准的医疗术语,每一个粗粒度实体标签对应一个标准的医疗术语,粗粒度实体标签与标准的医疗术语为同义词;服务器对多个粗粒度实体标签进行融合,得到多个融合的粗粒度实体标签;服务器对多个融合的粗粒度实体标签和多个标准的医疗术语进行实体链接操作,生成链接后的实体标签。
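This embodiment mainly performs entity linking on medical words that users express colloquially, linking them to standard medical terms. For example, the user's description "打了胎后" (colloquial for "after an abortion") corresponds to the medical term "人流后" (post-abortion), so "打了胎后" is linked to "人流后". As another example, the user's description "小腹部胀痛" is linked to the standard term "下腹胀痛" (lower abdominal distending pain). "怀孕34+" and "怀孕40+" (34+ and 40+ weeks pregnant) both correspond to the standard medical term "孕晚期" (late pregnancy), so both are linked to "孕晚期".
It should be noted that, before entity linking, the coarse-grained entity labels may also be fused to obtain fused coarse-grained entity labels. For example, "怀孕34+" and "怀孕40+" both fall within late pregnancy and can be fused into "怀孕34至40周" (34 to 40 weeks pregnant), which is the fused coarse-grained entity label; the fused label is then linked to the standard medical term.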
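The synonym-table lookup behind entity linking can be sketched as a dictionary mapping. The table entries below are the examples given in the text; the function name `link_entities` is hypothetical, and treating fusion as de-duplication after the lookup is a simplifying assumption (the application fuses the labels before linking).

```python
# Synonym table built from the examples in the text: colloquial mention -> standard term.
SYNONYM_TABLE = {
    "打了胎后": "人流后",
    "小腹部胀痛": "下腹胀痛",
    "怀孕34+": "孕晚期",
    "怀孕40+": "孕晚期",
}


def link_entities(mentions, table=SYNONYM_TABLE):
    """Map each mention to its standard term and drop duplicates, keeping order.

    Mentions missing from the table are kept unchanged.
    """
    linked = []
    for mention in mentions:
        term = table.get(mention, mention)
        if term not in linked:
            linked.append(term)
    return linked


linked = link_entities(["怀孕34+", "怀孕40+", "小腹部胀痛"])
```

An exact-match table only covers mentions that were anticipated in advance; a real system would likely add fuzzy or embedding-based matching for unseen colloquial expressions.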
可以理解的是,对于不同数据源的相同实体,要进行实体归一和实体融合操作,维护医疗实体的同义词表,以构建一个去冗余去冲突的医疗知识图谱,保证问答系统有较高质量的数据支撑。
207、根据预置的意图识别模型、实体识别结果和链接后的实体标签对初始问题语句进行意图识别,得到候选医疗意图。
服务器根据预置的意图识别模型、实体识别结果和链接后的实体标签对初始问题语句进行意图识别,得到候选医疗意图。
其中,意图识别模型为深度学习模型,由输入层、BERT词向量层、BiLSTM层、Attention层和Softmax分类层组成;由于问题意图与实体、实体标签关联较大,因此本实施例中将实体识别结果和链接后的实体标签也作为意图识别模型的输入,本实施例将初始问题语句、识别结果和链接后的实体标签联合作为输入层的句子输入。
其中,BERT词向量层将输入的句子生成词向量,BERT词向量层的输出作为BiLSTM层的输入;将BiLSTM层的全连接输出作为Attention层的输入;对Attention层的输出采用Softmax分类器进行最终的意图标签分类,得到候选医疗意图,其中,意图类型包括:原因、解释、并发症、传播方式、治疗方法、相关检查、疾病诊断、注意事项、功效、副作用/危害、操作方法、使用/服用方法、用法用量、饮食建议、是否等。
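The joint input and the final classification step can be sketched as follows. The way the question, the entity recognition result, and the linked labels are concatenated is not specified in the application, so the `[SEP]` separator and the `name:label` format are assumptions, and the logits passed to `classify_intent` stand in for the output of the attention layer.

```python
import math


def build_intent_input(question, entity_result, linked_labels, sep="[SEP]"):
    """Join the question, recognized entities, and linked labels into one input sentence."""
    entity_text = ";".join(f"{name}:{label}" for name, label in entity_result)
    linked_text = ";".join(linked_labels)
    return f" {sep} ".join([question, entity_text, linked_text])


def classify_intent(logits, intents):
    """Apply softmax to the logits and return the most probable intent with its probability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(intents)), key=probs.__getitem__)
    return intents[best], probs[best]


model_input = build_intent_input("肝硬化有哪些并发症?", [("肝硬化", "疾病")], ["肝硬化"])
intent, prob = classify_intent([0.1, 2.0, -1.0], ["cause", "complication", "dosage"])
```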
本实施例,采用深度学习模型进行用户意图识别,减少了模板数量,提高了对真实对话中的问答情况的覆盖率和准确度,并降低了维护成本。
208、根据候选医疗意图生成知识图谱查询语句。
服务器根据候选医疗意图生成知识图谱查询语句。
本实施例中,结合初始问题语句的实体识别结果和意图识别结果,进行知识图谱的查询映射,生成知识图谱查询语句,其中,查询对象可以是实体间的关系,也可以是实体的属性。
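The mapping from an entity type and an intent to a graph query can be sketched with a template table. The node labels and property names (`Disease`, `Complication`, `Symptom`, `Drug`, `harm`) follow the query examples given in this embodiment; the template table, the function name `build_query`, and the use of a `$name` query parameter instead of an inlined string are assumptions.

```python
# (entity label, intent) -> Cypher template; $name is bound at execution time.
QUERY_TEMPLATES = {
    ("Disease", "并发症"): (
        "MATCH (n:Disease)-[r:Complication]-(m:Symptom) "
        "WHERE n.name = $name RETURN m.name"
    ),
    ("Drug", "副作用/危害"): "MATCH (n:Drug) WHERE n.name = $name RETURN n.harm",
}


def build_query(entity_label, entity_name, intent):
    """Return (cypher, parameters) for a known entity/intent pair, or None if unsupported."""
    template = QUERY_TEMPLATES.get((entity_label, intent))
    if template is None:
        return None
    return template, {"name": entity_name}


query = build_query("Drug", "贝特类降脂药", "副作用/危害")
```

Passing the entity name as a parameter rather than formatting it into the query text avoids quoting problems and injection through user input.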
209、基于知识图谱查询语句在预置的医疗知识图谱进行知识图谱查询,得到知识图谱查询结果,根据知识图谱查询结果生成对应的目标话术并发送至终端。
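Specifically, the server queries the preset medical knowledge graph based on the knowledge graph query statement and obtains a knowledge graph query result, which includes the relations of the target entity, the attributes of the target entity, and multiple entities. The server generates the corresponding target reply from the relations and attributes of the target entity and sends the target reply to the terminal.
Different knowledge graph query results correspond to queries on different entity types, where the entity types include an entity's relations, an entity's attributes, and entities themselves. The specific process is as follows: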
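If the knowledge graph query result concerns a relation of the target entity, that is, the knowledge graph query statement queries a relation of the target entity: for example, when the user's initial question statement is "What are the complications of liver cirrhosis?" (肝硬化有哪些并发症?), the entity recognition result is "肝硬化: disease", the intent recognition result is "complication", and the corresponding knowledge graph query statement is "match (n:Disease)-[r:Complication]-(m:Symptom) where n.name='肝硬化' return m.name". The name attributes of the nodes labeled Symptom that are connected to liver cirrhosis by the Complication relation are combined to generate the target reply "Complications of liver cirrhosis include impaired liver function, portal hypertension, gastrointestinal bleeding, hepatic encephalopathy, and peritonitis.", and the target reply is sent to the terminal.
If the knowledge graph query result concerns an attribute of the target entity, that is, the knowledge graph query statement queries an attribute of the target entity: for example, when the user's initial question statement is "What are the side effects of fibrate lipid-lowering drugs?" (贝特类降脂药有什么副作用?), the entity extraction result is "贝特类降脂药: drug", the intent recognition result is "side effect/harm", and the corresponding graph query statement is "match (n:Drug) where n.name='贝特类降脂药' return n.harm". The target reply "Adverse reactions of fibrate lipid-lowering drugs include gastrointestinal discomfort, rash, hair loss, headache, and reduced libido." is then generated from the side-effect attribute of the drug and sent to the terminal;
If the knowledge graph query result concerns multiple entities: for example, the user's initial question statement is "What should I watch out for with buttock pain during pregnancy?" (孕期屁股痛需要注意哪些?), the entity extraction result is "孕期: special period, 屁股痛: symptom", the intent recognition result is "precautions", and the corresponding graph query statement is "match (n:SpecialPeriod{name:'孕期'})-[:MultiConditionRestriction]->(p:SpanNode), (m:Symptom{name:'屁股痛'})-[:MultiConditionRestriction]->(p:SpanNode) return p.attention". The server determines the value of the precautions attribute of the blank node that is related to both pregnancy and buttock pain, generates from that value the target reply "For buttock pain during pregnancy, apply a hot towel or hot-water bottle to the painful area for about half an hour; this can noticeably relieve the pain", and sends the target reply to the terminal.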
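Turning a query result into the reply sent to the terminal can be sketched as template filling. The English templates, the intent keys, and the fallback sentence are illustrative assumptions; the application only gives example replies and does not describe how they are assembled.

```python
# Assumed reply templates keyed by intent.
REPLY_TEMPLATES = {
    "complication": "Complications of {entity} include {values}.",
    "side_effect": "Adverse reactions of {entity} include {values}.",
}
FALLBACK = "Sorry, no answer was found for this question."


def render_reply(entity, intent, values):
    """Fill the template for the intent with the values returned by the graph query."""
    template = REPLY_TEMPLATES.get(intent)
    if template is None or not values:
        return FALLBACK
    return template.format(entity=entity, values=", ".join(values))


reply = render_reply(
    "liver cirrhosis", "complication", ["portal hypertension", "hepatic encephalopathy"]
)
```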
可以理解的是,服务器结合实体类型进行知识图谱查询语句转换,制定个性化话术将结果反馈给用户使用的终端,能够在线上问诊应用中为医生提供辅助的决策支持,使得问诊过程更高效。
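In this embodiment of the present application, deep learning models that fuse multi-dimensional features perform entity recognition and relation extraction separately, and coarse-grained entity recognition is used to refine the fine-grained entity recognition results. This reduces error propagation and interference from redundant information during entity extraction, improves the accuracy of the entity recognition result, and in turn improves the accuracy of intent recognition in the medical domain. The solution can also be applied in the smart healthcare field, thereby promoting the construction of smart cities.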
上面对本申请实施例中医疗领域意图识别方法进行了描述,下面对本申请实施例中医疗领域意图识别装置进行描述,请参阅图3,本申请实施例中医疗领域意图识别装置的一个实施例包括:
语句获取模块301,用于从终端获取初始问题语句,所述初始问题语句为目标用户在医疗智能问答系统中输入的问题语句;
实体识别模块302,用于调用预置的识别模型对所述初始问题语句进行实体识别,得到实体识别结果,所述实体识别结果包括多个粗粒度实体标签和多个实体关系;
实体链接模块303,用于根据预置的医疗实体同义词表对所述多个粗粒度实体标签进行实体链接,得到链接后的实体标签;
意图识别模块304,用于根据预置的意图识别模型、所述实体识别结果和所述链接后的实体标签对所述初始问题语句进行意图识别,得到候选医疗意图;
语句生成模块305,用于根据所述候选医疗意图生成知识图谱查询语句;
图谱查询模块306,用于基于所述知识图谱查询语句在预置的医疗知识图谱进行知识图谱查询,得到知识图谱查询结果,根据所述知识图谱查询结果生成对应的目标话术并发送至所述终端。
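In this embodiment of the present application, deep learning models that fuse multi-dimensional features perform entity recognition and relation extraction separately, and coarse-grained entity recognition is used to refine the fine-grained entity recognition results. This reduces error propagation and interference from redundant information during entity extraction, improves the accuracy of the entity recognition result, and in turn improves the accuracy of intent recognition in the medical domain. The solution can also be applied in the smart healthcare field, thereby promoting the construction of smart cities.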
请参阅图4,本申请实施例中医疗领域意图识别装置的另一个实施例包括:
语句获取模块301,用于从终端获取初始问题语句,所述初始问题语句为目标用户在医疗智能问答系统中输入的问题语句;
实体识别模块302,用于调用预置的识别模型对所述初始问题语句进行实体识别,得到实体识别结果,所述实体识别结果包括多个粗粒度实体标签和多个实体关系;
实体链接模块303,用于根据预置的医疗实体同义词表对所述多个粗粒度实体标签进行实体链接,得到链接后的实体标签;
意图识别模块304,用于根据预置的意图识别模型、所述实体识别结果和所述链接后的实体标签对所述初始问题语句进行意图识别,得到候选医疗意图;
语句生成模块305,用于根据所述候选医疗意图生成知识图谱查询语句;
图谱查询模块306,用于基于所述知识图谱查询语句在预置的医疗知识图谱进行知识图谱查询,得到知识图谱查询结果,根据所述知识图谱查询结果生成对应的目标话术并发送至所述终端。
可选的,实体识别模块302包括:
实体识别单元3021,用于调用第一预置识别模型对所述初始问题语句进行实体识别,得到多个粗粒度实体标签;
关系抽取单元3022,用于调用第二预置识别模型对所述初始问题语句进行关系抽取,得到多个实体关系;
生成单元3023,用于根据所述多个粗粒度实体标签和所述多个实体关系生成实体识别结果。
可选的,实体识别单元3021包括:
第一识别子单元30211,用于调用第一预置识别模型对所述初始问题语句按照细粒度进行实体识别,得到多个细粒度实体标签;
第二识别子单元30212,用于调用第一预置识别模型对所述多个细粒度实体标签按照粗粒度进行实体识别,得到多个粗粒度实体标签。
可选的,第一识别子单元30211具体用于:
按照细粒度对所述初始问题提取多个特征维度向量,所述多个特征维度向量包括词向量、词标签向量、词位置向量和词性特征向量;将所述多个特征维度向量输入到第一预置识别模型的BiLSTM层中,得到BiLSTM层输出的多个中间向量;将所述多个中间向量输入到第一预置识别模型的CRF层中,生成多个细粒度实体标签。
可选的,第二识别子单元30212具体用于:
调用第一预置识别模型按照粗粒度对所述多个细粒度实体标签进行识别,得到多个狭义实体特征和多个限定实体特征,所述多个狭义实体特征包括症状、疾病、部位、医学、检查和治疗,所述多个限定实体特征包括时间、频率、程度、否定词、描述和数值;将所述多个狭义实体特征和所述多个限定实体特征按照预置规则进行组合,生成多个广义实体特征,所述多个广义实体特征包括广义症状、广义检查、广义治疗和广义药物;将多个广义实体特征确定为多个粗粒度实体标签。
可选的,实体链接模块303具体用于:
在预置的医疗实体同义词表中查找多个粗粒度实体标签对应的多个标准的医疗术语,每一个粗粒度实体标签对应一个标准的医疗术语,所述粗粒度实体标签与所述标准的医疗术语为同义词;对所述多个粗粒度实体标签进行融合,得到多个融合的粗粒度实体标签;对所述多个融合的粗粒度实体标签和所述多个标准的医疗术语进行实体链接操作,生成链接后的实体标签。
可选的,图谱查询模块306具体用于:
基于所述知识图谱查询语句在预置的医疗知识图谱进行查询,得到知识图谱查询结果,所述知识图谱查询结果包括目标实体的关系、目标实体的属性和多个实体;根据所述目标实体的关系和目标实体的属性生成对应的目标话术,并将所述目标话术发送至终端。
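In this embodiment of the present application, deep learning models that fuse multi-dimensional features perform entity recognition and relation extraction separately, and coarse-grained entity recognition is used to refine the fine-grained entity recognition results. This reduces error propagation and interference from redundant information during entity extraction, improves the accuracy of the entity recognition result, and in turn improves the accuracy of intent recognition in the medical domain. The solution can also be applied in the smart healthcare field, thereby promoting the construction of smart cities.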
上面图3至图4从模块化功能实体的角度对本申请实施例中的医疗领域意图识别装置进行详细描述,下面从硬件处理的角度对本申请实施例中医疗领域意图识别设备进行详细描述。
图5是本申请实施例提供的一种医疗领域意图识别设备的结构示意图,该医疗领域意图识别设备500可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上处理器(central processing units,CPU)510(例如,一个或一个以上处理器)和存储器520,一个或一个以上存储应用程序533或数据532的存储介质530(例如一个或一个以上海量存储设备)。其中,存储器520和存储介质530可以是短暂存储或持久存储。存储在存储介质530的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对医疗领域意图识别设备500中的一系列指令操作。更进一步地,处理器510可以设置为与存储介质530通信,在医疗领域意图识别设备500上执行存储介质530中的一系列指令操作。
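The medical-domain intent recognition device 500 may further include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, or FreeBSD. Those skilled in the art will understand that the device structure shown in FIG. 5 does not limit the medical-domain intent recognition device, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.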
本申请还提供一种医疗领域意图识别设备,包括:存储器和至少一个处理器,所述存储器中存储有指令,所述存储器和所述至少一个处理器通过线路互连;所述至少一个处理器调用所述存储器中的所述指令,以使得所述医疗领域意图识别设备执行上述医疗领域意图识别方法中的步骤。
本申请还提供一种计算机可读存储介质,该计算机可读存储介质可以为非易失性计算机可读存储介质,也可以为易失性计算机可读存储介质。计算机可读存储介质存储有计算机指令,当所述计算机指令在计算机上运行时,使得计算机执行如下步骤:
从终端获取初始问题语句,所述初始问题语句为目标用户在医疗智能问答系统中输入的问题语句;
调用预置的识别模型对所述初始问题语句进行实体识别,得到实体识别结果,所述实体识别结果包括多个粗粒度实体标签和多个实体关系;
根据预置的医疗实体同义词表对所述多个粗粒度实体标签进行实体链接,得到链接后的实体标签;
根据预置的意图识别模型、所述实体识别结果和所述链接后的实体标签对所述初始问题语句进行意图识别,得到候选医疗意图;
根据所述候选医疗意图生成知识图谱查询语句;
基于所述知识图谱查询语句在预置的医疗知识图谱进行知识图谱查询,得到知识图谱查询结果,根据所述知识图谱查询结果生成对应的目标话术并发送至所述终端。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。

Claims (20)

  1. 一种医疗领域意图识别方法,包括:
    从终端获取初始问题语句,所述初始问题语句为目标用户在医疗智能问答系统中输入的问题语句;
    调用预置的识别模型对所述初始问题语句进行实体识别,得到实体识别结果,所述实体识别结果包括多个粗粒度实体标签和多个实体关系;
    根据预置的医疗实体同义词表对所述多个粗粒度实体标签进行实体链接,得到链接后的实体标签;
    根据预置的意图识别模型、所述实体识别结果和所述链接后的实体标签对所述初始问题语句进行意图识别,得到候选医疗意图;
    根据所述候选医疗意图生成知识图谱查询语句;
    基于所述知识图谱查询语句在预置的医疗知识图谱进行知识图谱查询,得到知识图谱查询结果,根据所述知识图谱查询结果生成对应的目标话术并发送至所述终端。
  2. 根据权利要求1所述的医疗领域意图识别方法,其中,所述调用预置的识别模型对所述初始问题语句进行实体识别,得到实体识别结果,所述实体识别结果包括多个粗粒度实体标签和多个实体关系,包括:
    调用第一预置识别模型对所述初始问题语句进行实体识别,得到多个粗粒度实体标签;
    调用第二预置识别模型对所述初始问题语句进行关系抽取,得到多个实体关系;
    根据所述多个粗粒度实体标签和所述多个实体关系生成实体识别结果。
  3. 根据权利要求2所述的医疗领域意图识别方法,其中,所述调用第一预置识别模型对所述初始问题语句进行实体识别,得到多个粗粒度实体标签,包括:
    调用第一预置识别模型对所述初始问题语句按照细粒度进行实体识别,得到多个细粒度实体标签;
    调用第一预置识别模型对所述多个细粒度实体标签按照粗粒度进行实体识别,得到多个粗粒度实体标签。
  4. 根据权利要求3所述的医疗领域意图识别方法,其中,所述调用第一预置识别模型对所述初始问题语句按照细粒度进行实体识别,得到多个细粒度实体标签,包括:
    按照细粒度对所述初始问题提取多个特征维度向量,所述多个特征维度向量包括词向量、词标签向量、词位置向量和词性特征向量;
    将所述多个特征维度向量输入到第一预置识别模型的BiLSTM层中,得到BiLSTM层输出的多个中间向量;
    将所述多个中间向量输入到第一预置识别模型的CRF层中,生成多个细粒度实体标签。
  5. 根据权利要求3所述的医疗领域意图识别方法,其中,所述调用第一预置识别模型对所述多个细粒度实体标签按照粗粒度进行实体识别,得到多个粗粒度实体标签,包括:
    调用第一预置识别模型按照粗粒度对所述多个细粒度实体标签进行识别,得到多个狭义实体特征和多个限定实体特征,所述多个狭义实体特征包括症状、疾病、部位、医学、检查和治疗,所述多个限定实体特征包括时间、频率、程度、否定词、描述和数值;
    将所述多个狭义实体特征和所述多个限定实体特征按照预置规则进行组合,生成多个广义实体特征,所述多个广义实体特征包括广义症状、广义检查、广义治疗和广义药物;
    将多个广义实体特征确定为多个粗粒度实体标签。
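  6. The medical-domain intent recognition method according to claim 1, wherein performing entity linking on the multiple coarse-grained entity labels according to the preset medical entity synonym table to obtain the linked entity labels comprises:
    looking up, in the preset medical entity synonym table, multiple standard medical terms corresponding to the multiple coarse-grained entity labels, wherein each coarse-grained entity label corresponds to one standard medical term, and the coarse-grained entity label and the standard medical term are synonyms;
    fusing the multiple coarse-grained entity labels to obtain multiple fused coarse-grained entity labels;
    performing an entity linking operation on the multiple fused coarse-grained entity labels and the multiple standard medical terms to generate the linked entity labels.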
  7. 根据权利要求1-6中任一项所述的医疗领域意图识别方法,其中,所述基于所述知识图谱查询语句在预置的医疗知识图谱进行知识图谱查询,得到知识图谱查询结果,根据所述知识图谱查询结果生成对应的目标话术并发送至所述终端,包括:
    基于所述知识图谱查询语句在预置的医疗知识图谱进行查询,得到知识图谱查询结果,所述知识图谱查询结果包括目标实体的关系、目标实体的属性和多个实体;
    根据所述目标实体的关系和目标实体的属性生成对应的目标话术,并将所述目标话术发送至终端。
  8. 一种医疗领域意图识别设备,包括存储器、处理器及存储在所述存储器上并可在所述处理器上运行的计算机可读指令,所述处理器执行所述计算机可读指令时实现如下步骤:
    从终端获取初始问题语句,所述初始问题语句为目标用户在医疗智能问答系统中输入的问题语句;
    调用预置的识别模型对所述初始问题语句进行实体识别,得到实体识别结果,所述实体识别结果包括多个粗粒度实体标签和多个实体关系;
    根据预置的医疗实体同义词表对所述多个粗粒度实体标签进行实体链接,得到链接后的实体标签;
    根据预置的意图识别模型、所述实体识别结果和所述链接后的实体标签对所述初始问题语句进行意图识别,得到候选医疗意图;
    根据所述候选医疗意图生成知识图谱查询语句;
    基于所述知识图谱查询语句在预置的医疗知识图谱进行知识图谱查询,得到知识图谱查询结果,根据所述知识图谱查询结果生成对应的目标话术并发送至所述终端。
  9. 根据权利要求8所述的医疗领域意图识别设备,所述处理器执行所述计算机程序时还实现以下步骤:
    调用第一预置识别模型对所述初始问题语句进行实体识别,得到多个粗粒度实体标签;
    调用第二预置识别模型对所述初始问题语句进行关系抽取,得到多个实体关系;
    根据所述多个粗粒度实体标签和所述多个实体关系生成实体识别结果。
  10. 根据权利要求9所述的医疗领域意图识别设备,所述处理器执行所述计算机程序时还实现以下步骤:
    调用第一预置识别模型对所述初始问题语句按照细粒度进行实体识别,得到多个细粒度实体标签;
    调用第一预置识别模型对所述多个细粒度实体标签按照粗粒度进行实体识别,得到多个粗粒度实体标签。
  11. 根据权利要求10所述的医疗领域意图识别设备,所述处理器执行所述计算机程序时还实现以下步骤:
    按照细粒度对所述初始问题提取多个特征维度向量,所述多个特征维度向量包括词向量、词标签向量、词位置向量和词性特征向量;
    将所述多个特征维度向量输入到第一预置识别模型的BiLSTM层中,得到BiLSTM层输出的多个中间向量;
    将所述多个中间向量输入到第一预置识别模型的CRF层中,生成多个细粒度实体标签。
  12. 根据权利要求10所述的医疗领域意图识别设备,所述处理器执行所述计算机程序时还实现以下步骤:
    调用第一预置识别模型按照粗粒度对所述多个细粒度实体标签进行识别,得到多个狭义实体特征和多个限定实体特征,所述多个狭义实体特征包括症状、疾病、部位、医学、检查和治疗,所述多个限定实体特征包括时间、频率、程度、否定词、描述和数值;
    将所述多个狭义实体特征和所述多个限定实体特征按照预置规则进行组合,生成多个广义实体特征,所述多个广义实体特征包括广义症状、广义检查、广义治疗和广义药物;
    将多个广义实体特征确定为多个粗粒度实体标签。
  13. 根据权利要求8所述的医疗领域意图识别设备,所述处理器执行所述计算机程序时还实现以下步骤:
    在预置的医疗实体同义词表中查找多个粗粒度实体标签对应的多个标准的医疗术语,每一个粗粒度实体标签对应一个标准的医疗术语,所述粗粒度实体标签与所述标准的医疗术语为同义词;
    对所述多个粗粒度实体标签进行融合,得到多个融合的粗粒度实体标签;
    对所述多个融合的粗粒度实体标签和所述多个标准的医疗术语进行实体链接操作,生成链接后的实体标签。
  14. 根据权利要求8-13中任一项所述的医疗领域意图识别设备,所述处理器执行所述计算机程序时还实现以下步骤:
    基于所述知识图谱查询语句在预置的医疗知识图谱进行查询,得到知识图谱查询结果,所述知识图谱查询结果包括目标实体的关系、目标实体的属性和多个实体;
    根据所述目标实体的关系和目标实体的属性生成对应的目标话术,并将所述目标话术发送至终端。
  15. 一种计算机可读存储介质,所述计算机可读存储介质中存储计算机指令,当所述计算机指令在计算机上运行时,使得计算机执行如下步骤:
    从终端获取初始问题语句,所述初始问题语句为目标用户在医疗智能问答系统中输入的问题语句;
    调用预置的识别模型对所述初始问题语句进行实体识别,得到实体识别结果,所述实体识别结果包括多个粗粒度实体标签和多个实体关系;
    根据预置的医疗实体同义词表对所述多个粗粒度实体标签进行实体链接,得到链接后的实体标签;
    根据预置的意图识别模型、所述实体识别结果和所述链接后的实体标签对所述初始问题语句进行意图识别,得到候选医疗意图;
    根据所述候选医疗意图生成知识图谱查询语句;
    基于所述知识图谱查询语句在预置的医疗知识图谱进行知识图谱查询,得到知识图谱查询结果,根据所述知识图谱查询结果生成对应的目标话术并发送至所述终端。
  16. 根据权利要求15所述的计算机可读存储介质,当所述计算机指令在计算机上运行时,使得计算机还执行以下步骤:
    调用第一预置识别模型对所述初始问题语句进行实体识别,得到多个粗粒度实体标签;
    调用第二预置识别模型对所述初始问题语句进行关系抽取,得到多个实体关系;
    根据所述多个粗粒度实体标签和所述多个实体关系生成实体识别结果。
  17. 根据权利要求16所述的计算机可读存储介质,当所述计算机指令在计算机上运行时,使得计算机还执行以下步骤:
    调用第一预置识别模型对所述初始问题语句按照细粒度进行实体识别,得到多个细粒度实体标签;
    调用第一预置识别模型对所述多个细粒度实体标签按照粗粒度进行实体识别,得到多个粗粒度实体标签。
  18. 根据权利要求17所述的计算机可读存储介质,当所述计算机指令在计算机上运行时,使得计算机还执行以下步骤:
    按照细粒度对所述初始问题提取多个特征维度向量,所述多个特征维度向量包括词向量、词标签向量、词位置向量和词性特征向量;
    将所述多个特征维度向量输入到第一预置识别模型的BiLSTM层中,得到BiLSTM层输出的多个中间向量;
    将所述多个中间向量输入到第一预置识别模型的CRF层中,生成多个细粒度实体标签。
  19. 根据权利要求17所述的计算机可读存储介质,当所述计算机指令在计算机上运行时,使得计算机还执行以下步骤:
    调用第一预置识别模型按照粗粒度对所述多个细粒度实体标签进行识别,得到多个狭义实体特征和多个限定实体特征,所述多个狭义实体特征包括症状、疾病、部位、医学、检查和治疗,所述多个限定实体特征包括时间、频率、程度、否定词、描述和数值;
    将所述多个狭义实体特征和所述多个限定实体特征按照预置规则进行组合,生成多个广义实体特征,所述多个广义实体特征包括广义症状、广义检查、广义治疗和广义药物;
    将多个广义实体特征确定为多个粗粒度实体标签。
  20. 一种医疗领域意图识别装置,所述医疗领域意图识别装置包括:
    语句获取模块,用于从终端获取初始问题语句,所述初始问题语句为目标用户在医疗智能问答系统中输入的问题语句;
    实体识别模块,用于调用预置的识别模型对所述初始问题语句进行实体识别,得到实体识别结果,所述实体识别结果包括多个粗粒度实体标签和多个实体关系;
    实体链接模块,用于根据预置的医疗实体同义词表对所述多个粗粒度实体标签进行实体链接,得到链接后的实体标签;
    意图识别模块,用于根据预置的意图识别模型、所述实体识别结果和所述链接后的实体标签对所述初始问题语句进行意图识别,得到候选医疗意图;
    语句生成模块,用于根据所述候选医疗意图生成知识图谱查询语句;
    图谱查询模块,用于基于所述知识图谱查询语句在预置的医疗知识图谱进行知识图谱查询,得到知识图谱查询结果,根据所述知识图谱查询结果生成对应的目标话术并发送至所述终端。
PCT/CN2021/084659 2020-08-28 2021-03-31 医疗领域意图识别方法、装置、设备及存储介质 WO2022041730A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010884353.8 2020-08-28
CN202010884353.8A CN112035635A (zh) 2020-08-28 2020-08-28 医疗领域意图识别方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2022041730A1 true WO2022041730A1 (zh) 2022-03-03

Family

ID=73586135

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/084659 WO2022041730A1 (zh) 2020-08-28 2021-03-31 医疗领域意图识别方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN112035635A (zh)
WO (1) WO2022041730A1 (zh)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112182235A (zh) * 2020-08-29 2021-01-05 深圳呗佬智能有限公司 一种构建知识图谱的方法、装置、计算机设备及存储介质
CN114911915A (zh) * 2022-05-27 2022-08-16 重庆长安汽车股份有限公司 一种基于知识图谱的问答搜索方法、系统、设备和介质
CN115630174A (zh) * 2022-12-21 2023-01-20 上海金仕达软件科技有限公司 一种多源公告文档处理方法、装置、存储介质及电子设备
CN116092493A (zh) * 2023-04-07 2023-05-09 广州小鹏汽车科技有限公司 语音交互方法、服务器和计算机可读存储介质
CN116108146A (zh) * 2023-04-13 2023-05-12 天津数域智通科技有限公司 基于知识图谱构建的信息抽取方法
CN116150406A (zh) * 2023-04-23 2023-05-23 湖南星汉数智科技有限公司 上下文稀疏实体链接方法、装置、计算机设备和存储介质
CN116186359A (zh) * 2023-05-04 2023-05-30 安徽宝信信息科技有限公司 一种高校多源异构数据的集成管理方法、系统及存储介质
CN116364296A (zh) * 2023-02-17 2023-06-30 中国人民解放军总医院 标准检查项目名称确认方法、装置、设备、介质及产品
CN116992861A (zh) * 2023-09-25 2023-11-03 四川健康久远科技有限公司 基于数据处理的医疗服务智慧处理方法及系统
CN117056493A (zh) * 2023-09-07 2023-11-14 四川大学 基于病历知识图谱的大语言模型医疗问答系统

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035635A (zh) * 2020-08-28 2020-12-04 康键信息技术(深圳)有限公司 医疗领域意图识别方法、装置、设备及存储介质
CN112466463B (zh) * 2020-12-10 2023-08-18 求臻医学科技(浙江)有限公司 基于肿瘤精准诊疗知识图谱的智能解答系统
CN112232059B (zh) * 2020-12-14 2021-03-26 北京声智科技有限公司 文本纠错方法、装置、计算机设备及存储介质
CN112925918B (zh) * 2021-02-26 2023-03-24 华南理工大学 一种基于疾病领域知识图谱的问答匹配系统
CN112966122B (zh) * 2021-03-03 2024-05-10 平安科技(深圳)有限公司 语料意图识别方法、装置、存储介质及计算机设备
CN113157893B (zh) * 2021-05-25 2023-12-15 网易(杭州)网络有限公司 多轮对话中意图识别的方法、介质、装置和计算设备
CN113282761A (zh) * 2021-05-27 2021-08-20 平安科技(深圳)有限公司 科室信息的推送方法、装置、设备以及存储介质
CN113327691B (zh) * 2021-06-01 2022-08-12 平安科技(深圳)有限公司 基于语言模型的问询方法、装置、计算机设备及存储介质
CN113345430B (zh) * 2021-06-25 2024-05-10 上海适享文化传播有限公司 基于语音固定条件下多字段的查询方法
CN113468307B (zh) * 2021-06-30 2023-06-30 网易(杭州)网络有限公司 文本处理方法、装置、电子设备及存储介质
CN113408274B (zh) * 2021-07-13 2022-06-24 北京百度网讯科技有限公司 训练语言模型的方法和标签设置方法
CN113535919B (zh) * 2021-07-16 2022-11-08 北京元年科技股份有限公司 一种数据查询的方法、装置、计算机设备以及存储介质
CN113688233A (zh) * 2021-07-30 2021-11-23 达观数据(苏州)有限公司 一种用于知识图谱语义搜索的文本理解的方法
CN113657102B (zh) * 2021-08-17 2023-05-30 北京百度网讯科技有限公司 信息抽取方法、装置、设备及存储介质
CN113707303A (zh) * 2021-08-30 2021-11-26 康键信息技术(深圳)有限公司 基于知识图谱的医疗问题解答方法、装置、设备及介质
CN113793668A (zh) * 2021-09-17 2021-12-14 平安科技(深圳)有限公司 基于人工智能的症状标准化方法、装置、电子设备及介质
CN114300128B (zh) * 2021-12-31 2022-11-22 北京欧应信息技术有限公司 用于辅助疾病智能诊断的医学概念链接系统及存储介质
CN114464312B (zh) * 2022-01-04 2022-12-02 北京欧应信息技术有限公司 用于辅助疾病推理的系统及存储介质
CN114722163B (zh) * 2022-06-10 2023-04-07 科大讯飞股份有限公司 数据查询方法、装置、电子设备和存储介质
CN115062628A (zh) * 2022-06-15 2022-09-16 北京信息科技大学 一种基于知识图谱的医患交流对话自动模拟方法
CN114996412B (zh) * 2022-08-02 2022-11-15 医智生命科技(天津)有限公司 医疗问答方法、装置、电子设备及存储介质
CN117235241A (zh) * 2023-11-15 2023-12-15 安徽省立医院(中国科学技术大学附属第一医院) 一种面向高血压问诊随访场景人机交互方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170103069A1 (en) * 2015-10-13 2017-04-13 International Business Machines Corporation Supplementing candidate answers
CN108959627A (zh) * 2018-07-23 2018-12-07 北京光年无限科技有限公司 基于智能机器人的问答交互方法及系统
CN110659366A (zh) * 2019-09-24 2020-01-07 Oppo广东移动通信有限公司 语义解析方法、装置、电子设备以及存储介质
CN111522910A (zh) * 2020-04-14 2020-08-11 浙江大学 一种基于文物知识图谱的智能语义检索方法
CN112035635A (zh) * 2020-08-28 2020-12-04 康键信息技术(深圳)有限公司 医疗领域意图识别方法、装置、设备及存储介质

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110335B (zh) * 2019-05-09 2023-01-06 南京大学 一种基于层叠模型的命名实体识别方法
CN110597970B (zh) * 2019-08-19 2023-04-07 华东理工大学 一种多粒度医疗实体联合识别的方法及装置

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112182235A (zh) * 2020-08-29 2021-01-05 深圳呗佬智能有限公司 一种构建知识图谱的方法、装置、计算机设备及存储介质
CN114911915A (zh) * 2022-05-27 2022-08-16 重庆长安汽车股份有限公司 一种基于知识图谱的问答搜索方法、系统、设备和介质
CN115630174A (zh) * 2022-12-21 2023-01-20 上海金仕达软件科技有限公司 一种多源公告文档处理方法、装置、存储介质及电子设备
CN115630174B (zh) * 2022-12-21 2023-07-21 上海金仕达软件科技股份有限公司 一种多源公告文档处理方法、装置、存储介质及电子设备
CN116364296A (zh) * 2023-02-17 2023-06-30 中国人民解放军总医院 标准检查项目名称确认方法、装置、设备、介质及产品
CN116364296B (zh) * 2023-02-17 2023-12-26 中国人民解放军总医院 标准检查项目名称确认方法、装置、设备、介质及产品
CN116092493A (zh) * 2023-04-07 2023-05-09 广州小鹏汽车科技有限公司 语音交互方法、服务器和计算机可读存储介质
CN116092493B (zh) * 2023-04-07 2023-08-25 广州小鹏汽车科技有限公司 语音交互方法、服务器和计算机可读存储介质
CN116108146A (zh) * 2023-04-13 2023-05-12 天津数域智通科技有限公司 基于知识图谱构建的信息抽取方法
CN116108146B (zh) * 2023-04-13 2023-06-27 天津数域智通科技有限公司 基于知识图谱构建的信息抽取方法
CN116150406A (zh) * 2023-04-23 2023-05-23 湖南星汉数智科技有限公司 上下文稀疏实体链接方法、装置、计算机设备和存储介质
CN116186359A (zh) * 2023-05-04 2023-05-30 安徽宝信信息科技有限公司 一种高校多源异构数据的集成管理方法、系统及存储介质
CN116186359B (zh) * 2023-05-04 2023-09-01 安徽宝信信息科技有限公司 一种高校多源异构数据的集成管理方法、系统及存储介质
CN117056493A (zh) * 2023-09-07 2023-11-14 四川大学 基于病历知识图谱的大语言模型医疗问答系统
CN116992861A (zh) * 2023-09-25 2023-11-03 四川健康久远科技有限公司 基于数据处理的医疗服务智慧处理方法及系统
CN116992861B (zh) * 2023-09-25 2023-12-08 四川健康久远科技有限公司 基于数据处理的医疗服务智慧处理方法及系统

Also Published As

Publication number Publication date
CN112035635A (zh) 2020-12-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21859579

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 06.06.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21859579

Country of ref document: EP

Kind code of ref document: A1