CN114064931A - Multi-modal knowledge graph-based emergency knowledge question-answering method and system - Google Patents

Multi-modal knowledge graph-based emergency knowledge question-answering method and system

Info

Publication number
CN114064931A
Authority
CN
China
Prior art keywords
knowledge
emergency
aid
entity
question
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111434019.3A
Other languages
Chinese (zh)
Inventor
于清
余超
吾守尔·斯拉木
恩强
业林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinjiang University
Original Assignee
Xinjiang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xinjiang University
Priority to CN202111434019.3A
Publication of CN114064931A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an emergency knowledge question-answering method and system based on a multi-modal knowledge graph. The method comprises the following steps: acquiring first-aid-related knowledge from the Internet and constructing a multi-modal first-aid knowledge graph from that knowledge; acquiring a question input by a user and extracting the entities and relations in the question with an entity-relation joint extraction model; locating the entities from the question in the multi-modal first-aid knowledge graph to determine the matching entities; calculating, with a deep learning model, the similarity of all relations between the entities in the question and the matching entities; determining the answer to the question according to the similarity; and, according to the target language selected by the user, inputting the answer into a machine translation model and outputting the translated answer. With this method and system, first-aid knowledge can be learned online, which makes learning more convenient for the public and improves the effect of first aid; answers can also be translated into multiple languages, which improves first-aid efficiency.

Description

Multi-modal knowledge graph-based emergency knowledge question-answering method and system
Technical Field
The invention relates to the field of emergency knowledge question answering, and in particular to an emergency knowledge question-answering method and system based on a multi-modal knowledge graph.
Background
Out-of-hospital emergency care is a basic link in the three-link theory of the trauma treatment system (out-of-hospital emergency care, in-hospital emergency care, and critical care in the ICU), and it is also a very important one. At present, however, out-of-hospital emergency care is a weak link in China's medical system. Taking out-of-hospital cardiac arrest as an example, data show that about 500,000 people in China die of cardiac arrest every year and that the survival rate is below 3%. When effective cardiopulmonary resuscitation (CPR) is given within 1 minute of sudden cardiac arrest, 90% of patients can be saved; within 4 minutes, 50% can survive; within 6 minutes, only 10% can survive; and when CPR is not started until 10 minutes have passed, the patient can almost never be saved. About 70% to 80% of cardiac arrests occur in homes, on streets, and in public places, and it is difficult for a 120 ambulance to arrive within 4 minutes. For sudden death, drowning, airway obstruction by a foreign body, car accidents, poisoning, and other accidents and disasters, timely and effective community first aid, self-rescue, and mutual rescue by the public before professional responders arrive can reduce casualties to the greatest extent. The popularization of first-aid knowledge and skills among the public therefore bears directly on people's safety and on social development. When an emergency occurs, the first witness at the scene must be able to administer first aid quickly with simple first-aid equipment.
Because emergencies can occur at any time and in any place, the first witnesses at the scene are usually ordinary citizens rather than emergency professionals.
Chinese citizens, however, have weak first-aid awareness and insufficient first-aid knowledge and ability. According to survey data, the popularization rate of first-aid knowledge in developed countries exceeds 10%. In the United States, more than 33% of the population has received on-site rescue training; the rate is 80% in Germany and 40% in France and Australia. In China, the popularization rate of first-aid knowledge and skills is below 1% on average, so first-aid skills training needs to be expanded and the popularization of first-aid knowledge given greater weight. A survey of how well college students in Zhejiang province have mastered first-aid knowledge concluded that their level still needs to improve and that it can be raised in practice by strengthening first-aid training and publicity in colleges. A survey of medical college students' mastery of first-aid knowledge and skills, and of the problems they face, concluded that their level is generally low while their demand for such knowledge and skills is high. Emergency safety education needs to improve so that first-aid theory and practice are developed together and reinforced through repetition. College students are the future backbone of society, yet only 29.90% of the college students who have actually taken part in first-aid training know the correct position for chest compressions, 16.60% are severely lacking in on-site first-aid knowledge, and 95.40% say that they need first-aid skills training.
The popularization rate of first-aid knowledge in China is therefore seriously low, even among groups such as college students who are receiving higher education.
The level of out-of-hospital emergency care has become one of the specific social indicators of the degree of modern civilization, the level of economic development, and the overall strength of a country or region, and it bears on the safety of every citizen. At present, however, pre-hospital emergency care in China is underdeveloped: citizens are reluctant to give first aid, their first-aid awareness is weak, and their first-aid knowledge and ability are seriously insufficient. Given this situation, the popularization of first-aid knowledge must be strengthened and the first-aid ability of citizens improved, so that people can apply first-aid measures appropriately before emergency personnel arrive and gain more time for the subsequent pre-hospital emergency work.
How to popularize pre-hospital first-aid knowledge and raise citizens' first-aid safety awareness is studied in every country. Outside China, the popularization of first-aid knowledge is supported by relatively mature systems and related laws. The United States has certification and examination standards for first witnesses (first responders): a certified first witness has completed 40 to 60 class hours of first-aid training and can handle pre-hospital medical emergencies. Besides dedicated CPR training, professional trainers at schools, colleges, middle schools, and community service institutions provide out-of-hospital first-aid training. United States law also requires people in certain posts, such as firefighters and drivers, to hold a certified level of first-aid ability. As a result of these measures, the American public has strong first-aid awareness, the popularization rate of basic first-aid techniques reaches 89.95%, and the lifesaving rate reaches 99.87%. The French national medical research institute has proposed several measures to address the public's lack of first-aid knowledge, including offering more first-aid training opportunities during driving-school training, primary and secondary school examinations, the fulfillment of civic obligations, and the practice of high-risk work in order to train members of the public as first-aiders, and funding institutions to guarantee routine first-aid training in schools, the armed forces, and workplaces. In China, pre-hospital first-aid knowledge is popularized mainly by Chinese emergency hospitals, the Red Cross Society of China, and the American Heart Association (AHA).
First-aid knowledge is traditionally popularized through offline training, in which professionals are invited to teach skills on site. Although this approach works well and leaves a deep impression on participants, its shortcomings have become more prominent as society develops. Learning is tied to a fixed time and place, which is very inconvenient for working people and students. For example, employees of many enterprises can hardly take time away from work to study first aid, so very few people around them know it; training sessions are also discontinuous, so even those who attend do not retain much. In addition, each training session requires considerable manpower and material resources, and the number of trainees is limited. The specialized theoretical content is also hard to understand, and it is especially hard to master for ethnic-minority learners who are not fluent in the language of instruction. For college students, first-aid knowledge is popularized through occasional lectures, which build first-aid competence only intermittently. China has long attached importance to CPR training for the public: the Red Cross and emergency centers train police officers, security guards, firefighters, and school teachers and students; medical colleges and hospitals train medical workers; and other institutions promote first aid. In practice, however, only a few first responders can still perform effective CPR after their training has ended.
To solve these problems in first-aid education, the invention combines first-aid knowledge with knowledge graphs, natural language processing, and mobile Internet technology.
The knowledge graph is a foundation and a bridge for intelligent semantic retrieval, and it lays a solid foundation for interconnecting knowledge on the World Wide Web. Compared with the traditional network of Web pages, the nodes of a knowledge graph are entities of different types rather than Web pages, and the edges are rich semantic relations between entities rather than hyperlinks between pages. By coverage, knowledge graphs can be divided into general-domain and vertical-domain knowledge graphs. Medicine is one of the vertical domains in which knowledge graphs are most widely applied. It is also a current hotspot of artificial intelligence research in many countries and has great application value in intelligent healthcare, including disease risk assessment, intelligent assisted diagnosis and treatment, medical quality control, and medical question answering. Knowledge graph applications in the medical field, including the medical knowledge graph of IBM Watson Health, Alibaba's "Yizhilu" ("medical deer") medical knowledge library, and Sogou's APGC AI medical knowledge graph, have come to public attention over the past two years. Typical medical knowledge graphs include SNOMED-CT, IBM Watson Health, and traditional Chinese medicine knowledge graphs such as that of Shanghai Shuguang Hospital. These traditional knowledge graphs contain only textual knowledge and do not cover other modalities, such as medical images or the audio-visual resources used in first-aid promotion.
The traditional knowledge graph focuses mainly on entities and relations drawn from text and databases. The multi-modal knowledge graph builds on it by constructing entities in multiple modalities (such as the visual modality) and multi-modal semantic relations between those entities. For example, the recent multi-modal encyclopedia Richpedia first constructs a multi-modal semantic relation (rpo:imageOf) between an image-modality entity, an image of the London Eye, and a text-modality knowledge graph entity (the DBpedia entity London Eye), and then constructs a multi-modal semantic relation (rpo:nextTo) between the London Eye image and other image-modality entities. The latest research on multi-modal knowledge graphs in the medical field is medical question answering based on a multi-modal knowledge graph. That work builds its graph by adding visual information to an existing Chinese symptom library: it collects several images of each entity in the library from Google Images, obtains a visual representation of the entity on the basis of the different noise values of the images, and finally integrates the visual information of the entity into the Chinese symptom library. First-aid education resources are rich: the promotional material for a single first-aid skill includes text, audio, video, and pictures.
Knowledge graphs for the first-aid field are rarely reported at present, whereas traditional knowledge graphs in the medical field have been studied more extensively. The construction of a traditional medical knowledge graph can be summarized in three steps: medical knowledge extraction, medical knowledge fusion, and medical knowledge computation. Medical knowledge extraction pulls the constituent elements of the knowledge graph, such as entities, relations, and attributes, out of large amounts of structured, semi-structured, or unstructured medical data and stores them in the knowledge base in a reasonable and efficient way. Medical knowledge fusion integrates, disambiguates, and processes the contents of the medical knowledge base, strengthens its internal logic and expressive power, and updates old knowledge or adds new knowledge to the medical knowledge graph. Medical knowledge computation uses knowledge reasoning to infer missing facts and to complete disease diagnosis and treatment automatically.
Recognizing first-aid information entities differs considerably from the traditional named entity recognition task, specifically in the following respects: 1) the targets are mainly information such as sudden illnesses in first aid, descriptions of a patient's sudden symptoms, first-aid equipment, first-aid drugs, first-aid skills, first-aid operation steps, and the patient's symptoms during first aid; 2) descriptions of sudden illnesses and sudden symptoms differ from the text handled by traditional named entity recognition, for example in their heavy use of technical terms and abbreviations, so the medical entity recognition task depends more on prior knowledge; 4) nesting is common among medical named entities. Most current named entity recognition systems, however, can handle only a single, non-nested entity; they ignore the entities nested inside it and cannot capture fine-grained semantic information in the text. To address this, Meizhi Ju et al. proposed a neural network model that dynamically stacks flat NER layers and uses inner entities to help detect outer entities. Juntao Yu et al. proposed a multi-head labeling strategy in which a biaffine mechanism is built on the encoding layer through two feed-forward networks and softmax classification is applied, recasting entity recognition as a structured prediction task and improving the accuracy of the model on the data set by 2.2%. However, multi-head labeling suffers from a sparse representation matrix.
Therefore, the invention introduces a pre-trained language model based on BERT (Bidirectional Encoder Representations from Transformers), searches all spans in a sentence by enumerating and labeling segments, handles the complex extraction problems of the recognition task flexibly, decouples extraction from the sequence length, and realizes joint extraction of entities and relations.
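The span search described above can be illustrated with a minimal sketch. It only enumerates candidate spans; the labeling step with a BERT encoder is omitted. The function name `enumerate_spans`, the `max_width` limit, and the example tokens are assumptions made for illustration and are not taken from the patent.

```python
from typing import List, Tuple


def enumerate_spans(tokens: List[str], max_width: int) -> List[Tuple[int, int]]:
    """Return every (start, end) token span, end inclusive, of at most max_width tokens."""
    spans = []
    for start in range(len(tokens)):
        # The span may not run past the end of the sentence.
        for end in range(start, min(start + max_width, len(tokens))):
            spans.append((start, end))
    return spans


# Hypothetical example sentence fragment.
tokens = ["acute", "myocardial", "infarction", "patient"]
spans = enumerate_spans(tokens, max_width=3)
# Text of every candidate span; a classifier would assign each one an entity type or "none".
span_texts = [" ".join(tokens[s:e + 1]) for s, e in spans]
```

Because every span is scored independently, an inner span such as "myocardial infarction" and the outer span "acute myocardial infarction" can both be labeled as entities, which is how a span-based approach accommodates nesting.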
Relation extraction is not only a main task of information extraction but also a key step in constructing and completing a knowledge graph. In recent years, joint learning models have made great progress in entity-relation extraction. Compared with the traditional pipeline method, joint learning makes full use of the interaction between the two subtasks, entity recognition and relation extraction, and alleviates the error propagation problem. Existing joint models fall into two categories: structured prediction and multi-task learning. Structured prediction methods cast the two tasks into one unified framework and use a decoding module to output the extracted information. Meishan Zhang and Jue Wang adopted the table-filling approach proposed by Makoto Miwa; Arzoo Katiyar and Suncong Zheng used sequence labeling; Changzhi Sun and Tsu-Jui Fu proposed graph-based methods that jointly predict entity and relation types; and Xiaoya Li converted the task into multi-turn question answering. All of these methods must solve a global optimization problem and perform joint decoding at inference time with beam search or reinforcement learning. Multi-task learning methods essentially build two separate models, one for entity recognition and one for relation extraction, and optimize them together through parameter sharing. Makoto Miwa proposed a sequence tagging model for entity prediction and a tree-based LSTM model for relation extraction; the two models share an LSTM layer for contextualized word representations, and sharing parameters was found to improve the performance of both. The method of Giannis Bekoulis is similar, except that relation classification is modeled as a multi-label head selection problem. These methods, however, still decode in a pipeline: entities are extracted first, and the relation model is then applied to the predicted entities.
Knowledge representation learning is based on the idea of distributed representation: the semantic information of an entity (or relation) is mapped into a low-dimensional, dense, real-valued vector space so that two semantically similar objects lie close together. This makes effective representation of and computation over the knowledge graph possible, and it is applied to relation reasoning, link prediction, entity clustering, and similar tasks. Compared with symbolic representation, distributed representation improves computational efficiency, alleviates data sparsity, and supports the fusion of heterogeneous information, so designing a reasonable knowledge representation learning model lays the foundation for knowledge computation and, further, for natural language understanding. Representation learning originates from the word2vec language model proposed by Mikolov et al. in 2013, who found that word vectors trained by the language model exhibit translation invariance. Inspired by this, Bordes et al. designed the TransE model, which adopts the modeling assumption h + r ≈ t and uses vector addition as the semantic composition operation to infer relations between entities. However, relation types in a knowledge graph are often complex, and TransE cannot accurately represent relation types such as "one-to-many" and "many-to-many".
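The modeling assumption h + r ≈ t can be made concrete with a minimal sketch of the TransE distance. The 2-dimensional embeddings below are hand-picked for illustration rather than trained, and the entity and relation names are hypothetical.

```python
import math
from typing import Sequence


def transe_distance(h: Sequence[float], r: Sequence[float], t: Sequence[float]) -> float:
    """Euclidean distance between h + r and t; a smaller value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Hand-picked toy embeddings (illustrative assumptions, not learned values).
heatstroke = [1.0, 0.0]
has_symptom = [0.0, 1.0]
high_fever = [1.0, 1.0]
fracture = [3.0, -1.0]

true_triple = transe_distance(heatstroke, has_symptom, high_fever)
false_triple = transe_distance(heatstroke, has_symptom, fracture)
```

The sketch also shows why one-to-many relations are a problem: every correct tail of (heatstroke, has_symptom) is pushed toward the single point h + r, so distinct symptoms would be forced to share almost the same vector.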
A series of results followed to address this problem. Wang et al. proposed the TransH model, which maps entities onto the hyperplane of the corresponding relation r and then performs semantic composition by addition. This effectively handles one-to-many and many-to-many relations, but it presupposes that entities and relations lie in the same space, so it cannot represent well the multiple semantics of relations in clinical medical data. Lin et al., building on the design ideas of TransE and TransH, proposed the TransR model, which changes the vector representations of entities and relations while leaving the semantic composition operation unchanged. The model maps entities and relations into different vector spaces and then performs semantic composition, which is better suited to representing the multiple semantics of relations in clinical medical data.
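The two projection schemes can be sketched as follows. This is a minimal illustration with hand-picked vectors, assuming a unit normal vector `w` for the TransH hyperplane and a projection matrix `m_r` for TransR; none of the values are trained and all names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


def project_to_hyperplane(v, w):
    """TransH projection: remove the component of v along the unit normal w."""
    scale = dot(w, v)
    return [vi - scale * wi for vi, wi in zip(v, w)]


def transh_distance(h, d_r, t, w):
    """Distance between (projected h + d_r) and projected t on the relation hyperplane."""
    h_p, t_p = project_to_hyperplane(h, w), project_to_hyperplane(t, w)
    return norm([a + b - c for a, b, c in zip(h_p, d_r, t_p)])


def matvec(m, v):
    return [dot(row, v) for row in m]


def transr_distance(h, r, t, m_r):
    """TransR: map entities into the relation space with m_r, then apply h_r + r ~ t_r."""
    h_r, t_r = matvec(m_r, h), matvec(m_r, t)
    return norm([a + b - c for a, b, c in zip(h_r, r, t_r)])


# Toy 3-d entity vectors; the two tails differ only along the hyperplane normal.
head = [0.0, 0.0, 0.0]
tail_a = [1.0, 0.0, 2.0]
tail_b = [1.0, 0.0, -5.0]
w = [0.0, 0.0, 1.0]      # unit normal of the relation hyperplane (assumed)
d_r = [1.0, 0.0, 0.0]    # translation vector lying on the hyperplane (assumed)

transh_a = transh_distance(head, d_r, tail_a, w)
transh_b = transh_distance(head, d_r, tail_b, w)

# TransR with a 2x3 matrix mapping the 3-d entity space to a 2-d relation space.
m_r = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
transr_a = transr_distance(head, [1.0, 0.0], tail_a, m_r)
```

Here two different tail vectors both fit the same head and relation after projection, which is the property that lets these models represent one-to-many relations without collapsing the tails into one point.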
Knowledge reasoning further mines implicit knowledge from the existing medical knowledge base, thereby enriching and expanding it. Conventional knowledge inference methods include inference based on description logic, rule-based inference, and case-based inference. Inference methods based on machine learning, by contrast, have a natural advantage in mining useful information from massive medical data and can improve the efficiency and accuracy of knowledge inference; common models include artificial neural networks, genetic algorithms, and back-propagation networks.
Entity linking maps entity mentions in natural language text to the corresponding entities in the knowledge graph. For example, in the sentence "High fever is one of the manifestations of heatstroke", the two mentions "high fever" and "heatstroke" are each mapped to the corresponding entity in the knowledge graph. With the development of the Internet, data is growing exponentially and it is a challenge to obtain useful information from text quickly; entity linking helps users obtain that information accurately and in a short time. In the medical field, entity linking correctly links entity mentions in electronic medical records to the corresponding entities in a medical knowledge graph, and it can also resolve the diversity and ambiguity of medical entities. Current knowledge graph research focuses mainly on static knowledge graphs that hardly change over time; in addition, entity linking depends on how complete the knowledge graph is.
The entity linking task has two key steps: mention recognition and entity disambiguation. Entity disambiguation in turn comprises candidate entity generation and candidate entity ranking. The mention recognition method used here is the same as the medical entity-relation recognition method.
Candidate entity generation takes a recognized entity mention as the target, queries the knowledge graph, and retrieves the corresponding entities as candidates. There is usually more than one candidate, so the candidates must be ranked. Candidate generation needs a high recall rate, which in turn improves the accuracy of entity linking. Building a name dictionary is the most common way to generate candidates, but its candidate recall is low. The invention adopts the name dictionary method and additionally constructs an empirical-probability entity map from the knowledge graph; that is, it uses the knowledge graph to calculate the popularity of each entity for a given mention, and uses this to assist candidate generation.
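A name dictionary combined with an empirical probability can be sketched as below. This is an illustrative assumption about how such a dictionary might be built, using counts of observed (mention, entity) pairs; the function names, the pair data, and the entity identifiers are hypothetical and do not come from the patent.

```python
from collections import Counter, defaultdict


def build_name_dictionary(pairs):
    """Count how often each lower-cased mention refers to each entity."""
    counts = defaultdict(Counter)
    for mention, entity in pairs:
        counts[mention.lower()][entity] += 1
    return counts


def generate_candidates(mention, name_dict, top_k=3):
    """Return up to top_k (entity, empirical probability) pairs for a mention."""
    counter = name_dict.get(mention.lower())
    if not counter:
        return []  # mention not in the dictionary: no candidates
    total = sum(counter.values())
    return [(entity, count / total) for entity, count in counter.most_common(top_k)]


# Hypothetical observed (mention, entity) pairs.
observed_pairs = [
    ("CPR", "cardiopulmonary_resuscitation"),
    ("CPR", "cardiopulmonary_resuscitation"),
    ("CPR", "cardiopulmonary_resuscitation"),
    ("CPR", "cpr_training_course"),
    ("heatstroke", "heatstroke"),
]
name_dict = build_name_dictionary(observed_pairs)
cpr_candidates = generate_candidates("cpr", name_dict)
unknown_candidates = generate_candidates("tourniquet", name_dict)
```

The empty result for an unseen mention illustrates the low-recall weakness of a pure dictionary lookup noted above; the probabilities attached to each candidate are what a later ranking step can consume.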
Candidate entity ranking methods mainly include traditional feature-based methods, binary classification methods, graph-based methods, and deep-learning-based methods. Lin Zefei et al. proposed a disambiguation method that fuses multiple features: entity popularity, question similarity, and similar entity mentions are combined with weights, and the results are sorted to obtain the entity for each mention. Traditional feature-based methods, however, have difficulty capturing fine-grained structural and semantic information. Pilz et al. constructed vector representations of mentions and entities from topic information, calculated the topic distance between the context of a mention and the context of a candidate entity, and fed that distance into an SVM classifier. Zhoujin et al. proposed a graph-based joint-feature entity linking method that fuses multiple features into the initial edge weights and uses a random walk with restart. Graph-based methods are well interpretable for global disambiguation, but they are hard to combine with local methods to optimize disambiguation, and they have difficulty distinguishing semantically similar entities in texts with little context. Deep-learning-based methods need no manual feature engineering; Wu dao Chong et al. used a C-DSSM model to link entities in short texts. The invention ranks candidate entities by feeding the mention-entity similarity and the logarithm of the empirical probability into a feed-forward neural network (FFNN) to compute a local score and then adding a global voting score.
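The ranking scheme in the last sentence can be sketched as follows, under stated assumptions. The network has one small hidden layer with fixed, hand-picked weights instead of trained ones, and the global vote is assumed to count how many other entities in the question are connected to the candidate in the graph; the patent text does not define the vote, so this definition, the `vote_weight` parameter, and all names and values are hypothetical.

```python
import math


def relu(x):
    return max(0.0, x)


def ffnn_local_score(similarity, prior, params):
    """One-hidden-layer feed-forward score over [similarity, log(prior)]."""
    features = [similarity, math.log(prior)]
    hidden = [
        relu(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(params["w_hidden"], params["b_hidden"])
    ]
    return sum(w * h for w, h in zip(params["w_out"], hidden)) + params["b_out"]


def global_vote(candidate, context_entities, edges):
    """Assumed vote: number of context entities linked to the candidate in the graph."""
    return sum(
        1 for e in context_entities
        if (candidate, e) in edges or (e, candidate) in edges
    )


def rank_candidates(candidates, context_entities, edges, params, vote_weight=1.0):
    """Sort candidates by local FFNN score plus weighted global vote, best first."""
    scored = []
    for entity, similarity, prior in candidates:
        score = ffnn_local_score(similarity, prior, params)
        score += vote_weight * global_vote(entity, context_entities, edges)
        scored.append((entity, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hand-picked weights for illustration only; a real model would learn them.
PARAMS = {
    "w_hidden": [[1.0, 0.0], [0.0, 1.0]],
    "b_hidden": [0.0, 3.0],
    "w_out": [1.0, 0.5],
    "b_out": 0.0,
}
# (entity, mention-entity similarity, empirical probability), all hypothetical.
candidates = [
    ("cpr_training_course", 0.4, 0.25),
    ("cardiopulmonary_resuscitation", 0.9, 0.75),
]
edges = {("cardiac_arrest", "cardiopulmonary_resuscitation")}
ranked = rank_candidates(candidates, ["cardiac_arrest"], edges, PARAMS)
```

Taking the logarithm of the empirical probability keeps rare but valid entities from being scored at nearly zero, while the vote term lets the other entities in the question influence the decision.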
At present there are two ways to construct a multi-modal knowledge graph. In the first, traditional way, each modality is extracted separately and the final multi-modal graph is formed by graph fusion. Manling Li et al. proposed the first comprehensive open-source multimedia knowledge extraction system, which takes as input a large amount of unstructured, heterogeneous multimedia data from various sources and languages and follows a rich, fine-grained ontology to create a coherent, structured knowledge base that indexes entities, relations, and events. The traditional way, however, does not consider the dependencies and correspondences between the features of different modalities at the source, so the final fusion result cannot properly capture the various associations contained in multi-modal data. It is therefore preferable for the graph to be multi-modal from the outset; a graph constructed in this way helps in understanding data from multiple modalities and in tasks such as visual relation recognition and cross-modal entity linking. The second way accordingly starts from a traditional knowledge graph and adds information from other modalities, for example through Web links or image search, to construct the multi-modal knowledge graph. Based on the Chinese symptom library, Zhanying et al. constructed a multi-modal knowledge graph for the medical field by integrating image information for the entities in the knowledge base. The multi-modal encyclopedia Richpedia first constructs a multi-modal semantic relation (rpo:imageOf) between an image-modality entity, an image of the London Eye, and a text-modality knowledge graph entity (the DBpedia entity London Eye), and then constructs a multi-modal semantic relation (rpo:nextTo) between the London Eye image and other image-modality entities.
How to extract information from other modalities and how to structure unstructured visual information are the key issues in constructing a multi-modal knowledge graph.
The common way to structure speech information is to define a global identifier for each piece of speech, convert the speech to text with a speech processing tool such as iFLYTEK, recognize the entities in the text and link them to entities in the knowledge graph, and finally connect the speech to the corresponding knowledge graph entities through a relation. In the construction of a multi-modal course knowledge graph, one existing work first transcribes the teacher's lecture speech into text with speech recognition, then links the speech to knowledge-point entities through text matching, and defines the relation between the speech and a knowledge-point entity as "association", which completes the multi-modal entity linking.
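The linking step that follows transcription can be sketched as below. The sketch assumes the transcript has already been produced by a speech recognition tool and uses plain substring matching in place of a real entity recognizer; the identifier `audio_001`, the relation name `associatedWith`, and the entity names are hypothetical.

```python
from typing import List, Tuple


def link_audio_to_entities(
    audio_id: str, transcript: str, entity_names: List[str]
) -> List[Tuple[str, str, str]]:
    """Return (audio_id, relation, entity) triples for entities named in the transcript."""
    text = transcript.lower()
    return [
        (audio_id, "associatedWith", name)
        for name in entity_names
        if name.lower() in text  # simple text matching stands in for entity recognition
    ]


# Hypothetical transcript and knowledge graph entity names.
transcript = (
    "When performing cardiopulmonary resuscitation, "
    "start chest compressions immediately."
)
graph_entities = ["cardiopulmonary resuscitation", "chest compression", "tourniquet"]
triples = link_audio_to_entities("audio_001", transcript, graph_entities)
```

Each resulting triple attaches the audio resource, referenced by its global identifier, to a text-modality entity, which is the form in which the speech can be stored in the graph.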
To integrate image information into a traditional knowledge graph, the attributes of the objects in an image and the relationships between different objects must first be acquired. The main existing approaches are manual annotation of image information, image recognition and image description. In manual annotation, the attributes of each object in an image and the relationships between objects are written out by hand as structured text data, which is then used to expand the entities in the knowledge graph and the relationships between them. Image recognition can extract the objects in an image and their attributes, but it cannot recognize the relationships between different objects in the picture. Image description builds on image recognition and can identify not only the objects in an image but also the relationships between them. In research on image description, chechomen et al. proposed a finer-grained control signal called the Abstract Scene Graph (ASG), through which the objects, attributes and relationships that a user wants to express can be conveniently controlled. However, both image recognition and image description require a large number of manually labeled images, and most images in existing first-aid publicity materials are unlabeled.
Integrating video information into a traditional knowledge graph makes it possible to mine more connections between the entities in the knowledge base. Existing first-aid publicity resources contain many first-aid training videos, such as videos teaching the use of first-aid equipment, videos teaching first-aid skills, and videos on treating common wounds. Almost all of the objects that appear in these videos are already covered by the text-form first-aid training resources, so adding them to the knowledge graph as new entities has little value. Relationships between objects, however, are ubiquitous in the videos, and mining them can further enrich the relationships in the knowledge graph. The recent multi-modal knowledge graph Richpedia contains relationships between image entities. These are discovered from image descriptions that mention two image entities, and the descriptions are derived from the file names of the images in Wikipedia. For example, the file name of an airplane picture in Wikipedia, "an airplane stopped on a runway to fly", includes the two entities "runway" and "airplane" and the relationship "stayon" between them.
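As a rough illustration of mining a relation from a description string, the sketch below looks for two known entities in a description and checks the text between them against a small phrase lexicon. The lexicon `RELATION_PHRASES` and the function `mine_relation` are hypothetical and are not Richpedia's actual procedure.

```python
# Hypothetical phrase-to-relation lexicon, for illustration only.
RELATION_PHRASES = {"stopped on": "stayon", "next to": "nextTo"}


def mine_relation(description, entities):
    """Return (head, relation, tail) if two known entities are joined by a known phrase."""
    text = description.lower()
    # Order the mentioned entities by their position in the description.
    found = sorted((text.find(e), e) for e in entities if e in text)
    if len(found) < 2:
        return None
    (start_a, a), (start_b, b) = found[0], found[1]
    between = text[start_a + len(a):start_b]
    for phrase, relation in RELATION_PHRASES.items():
        if phrase in between:
            return (a, relation, b)
    return None


triple = mine_relation("An airplane stopped on a runway to fly", ["runway", "airplane"])
```

The same idea could be applied to titles or captions of first-aid training videos, provided the entity list comes from the existing text-form knowledge graph.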
The knowledge graph provides a more effective way to express, organize, manage and use massive, heterogeneous and dynamic data, making systems more intelligent and closer to human cognitive thinking.
Intelligent question answering is the most common application of the knowledge graph and is also a current research hotspot. Knowledge graph-based intelligent question answering mainly comprises the following methods:
The earliest knowledge graph-based intelligent question answering is the template-based method, in which a query expression is formed by constructing a set of parameterized templates that are matched against the question text. The process involves no question analysis; the mapping to entities and relations is supplied by the preset query templates. Question templates must be written by hand by domain experts, which is time-consuming, labor-intensive and hard to maintain. To address this, research has focused on the automatic generation of question templates. Cui et al. proposed an optimization scheme for simple factoid question answering based on large-scale automatic template generation. Abujabal et al. proposed the QUINT model, which automatically learns templates from corpora and uses the generated templates to convert natural language questions into knowledge base queries. Cocco et al. proposed an object-oriented question-answering system that, using the RDF-form LinkedSpending data set, learns SPARQL templates by machine learning on an existing training set of question-answer pairs.
The advantages of this method are relatively accurate answers and fast response. The disadvantage is that a large amount of manual effort is needed to proofread templates and maintain the template library. Nevertheless, recent template methods can also handle multi-hop complex questions, and current research concentrates on automatic template generation to reduce the time and labor involved.
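A minimal sketch of template-based question answering is given below, assuming hand-written regular-expression templates and a toy lookup table in place of a real knowledge graph. The templates, relation names and the contents of `KG` are illustrative assumptions only.

```python
import re

# Each template pairs a question pattern with the relation it queries (hypothetical).
TEMPLATES = [
    (re.compile(r"^how (?:do i|to) treat (?P<entity>.+?)\??$", re.I), "first_aid_method"),
    (re.compile(r"^what (?:medicine|drug) is used for (?P<entity>.+?)\??$", re.I), "first_aid_medicine"),
]

# Toy data standing in for the knowledge graph: (entity, relation) -> answer.
KG = {
    ("burn", "first_aid_method"): "cool the area under running water",
    ("angina", "first_aid_medicine"): "nitroglycerin",
}


def answer(question):
    """Match the question against each template and look up the resulting query."""
    for pattern, relation in TEMPLATES:
        match = pattern.match(question.strip())
        if match:
            entity = match.group("entity").strip().lower()
            return KG.get((entity, relation))
    return None  # no template matched


result = answer("How do I treat burn?")
```

The sketch makes the trade-off visible: a matched template answers immediately, while any question outside the template set returns nothing until an expert adds a new pattern.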
The key idea of question answering based on semantic parsing is to analyze the components of the natural language question, convert the question into a logical expression, convert the logical expression into a knowledge graph query using the semantic information of the knowledge graph, and finally obtain the corresponding result. The logical expression serves as a structured query against the knowledge graph, used to look up entities in the knowledge base and the knowledge related to them. Implementing a knowledge graph-based semantic parsing question-answering system requires two key steps: 1) using a semantic parser to convert the question into a semantic representation that a machine can understand and operate on; 2) using that representation to generate a structured query, querying the knowledge graph, and finding the answer in the returned set of entities. There are three types of semantic parsing methods: methods based on lexicons and grammars, methods based on semantic graph construction, and methods based on neural networks. In general, semantic representations based on symbolic logic lack flexibility and are easily affected by the semantic gap between symbols during question parsing. In addition, obtaining a structured semantic representation from a natural language question takes many steps, and errors propagated between the steps reduce question-answering accuracy. Neural methods, for their part, need a large training corpus. For these reasons, the question-answering method based on semantic parsing performs relatively poorly.
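The two key steps can be illustrated with a toy lexicon-based parser that produces a small logical form, followed by a function that renders the logical form as a SPARQL-style query string. The lexicon, entity list, `LogicalForm` class and the prefix-less `:name` identifiers are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalForm:
    entity: str
    relation: str


# Hypothetical lexicon mapping trigger words to graph relations.
LEXICON = {"treat": "first_aid_method", "medicine": "first_aid_medicine"}
ENTITIES = {"burn", "fracture"}


def parse(question):
    """Step 1: turn the question into a machine-operable semantic representation."""
    tokens = question.lower().rstrip("?").split()
    relation = next((LEXICON[t] for t in tokens if t in LEXICON), None)
    entity = next((t for t in tokens if t in ENTITIES), None)
    if relation is None or entity is None:
        return None
    return LogicalForm(entity, relation)


def to_sparql(form):
    """Step 2: generate a structured query from the semantic representation."""
    return f"SELECT ?answer WHERE {{ :{form.entity} :{form.relation} ?answer . }}"


form = parse("How to treat a fracture?")
query = to_sparql(form)
```

An error in step 1, such as a missed entity, leaves step 2 with nothing to work on, which is the error propagation the paragraph refers to.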
The answer ranking method based on deep learning projects the rich semantic information contained in the question and the knowledge graph (characters, words, context, and the entities, relations and attributes in the knowledge graph) into a high-dimensional vector space to obtain character or word vectors, computes the similarity between the vectors with a deep learning model, ranks the candidates with a corresponding scoring mechanism, and returns the final answer. In a recent study, Zhou et al. combined rules with neural networks and took first place in the CCKS 2019 evaluation. The core task in this method is to compute the correlation between the input question and each candidate answer entity. At present, a question-answer model trained directly on questions and answers gives better results.
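The ranking idea can be sketched with vectors and cosine similarity. Here a bag of character bigrams stands in for the learned encoder, which is an assumption made only to keep the example self-contained; a real system would use vectors produced by a trained deep learning model.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned encoder: a bag of character bigrams."""
    s = text.lower()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(question, candidates):
    """Score every candidate against the question and sort by descending similarity."""
    q = embed(question)
    return sorted(candidates, key=lambda c: cosine(q, embed(c)), reverse=True)


ranked = rank(
    "which medicine for angina",
    ["first aid medicine", "first aid equipment", "usage method"],
)
```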
In machine translation, the task is to translate text from a source language into a target language. The main current research methods are statistical machine translation and neural machine translation. Statistical machine translation models translation mathematically and can be trained on large amounts of data. Because the method is language-independent, its cost is very low: once the model is built, it is applicable to all languages. Common statistical machine translation models include word-based models, models based on distortion and fertility, phrase-based models, and syntax-based models. Because statistical machine translation is corpus-based, it suffers from data sparseness when little data is available. A further problem is that the translation knowledge comes from automatic training on large data, and ways of incorporating expert knowledge are not yet mature. Neural machine translation has grown rapidly in recent years. Compared with statistical machine translation, its model is relatively simple and consists mainly of two parts, an encoder and a decoder. The encoder represents the source sentence as a high-dimensional vector after a series of neural network transformations, and the decoder decodes (translates) this vector into the target language. With the development of deep learning, neural machine translation systems have surpassed statistical approaches for most languages. The two approaches differ in how they represent word strings. Statistical machine translation uses a discrete-space representation in which every word string is essentially composed of smaller word strings (phrases, rules).
Neural machine translation uses a continuous-space representation, which can capture more latent information: each word string corresponds to a point in a continuous space (for example, a point in a multi-dimensional real space). The model can therefore be optimized more effectively and generalizes better to unseen samples. Existing neural machine translation models mainly include models based on recurrent neural networks, convolutional neural networks, and self-attention.
In summary, existing emergency education is generally conducted offline, which is inconvenient for members of the public and gives poor learning results. Consequently, emergency knowledge is not widely disseminated and the public's emergency response capability is low.
Disclosure of Invention
The invention aims to provide a multi-modal knowledge graph-based emergency knowledge question-answering method and system, so as to solve the problems of inconvenience and poor effect of offline emergency education and training in the prior art.
In order to achieve the purpose, the invention provides the following scheme:
an emergency knowledge question-answering method based on a multi-modal knowledge map comprises the following steps:
acquiring first-aid related knowledge based on the Internet, and constructing a multi-mode first-aid knowledge map according to the first-aid related knowledge; the first aid related knowledge comprises emergency diseases, first aid medicines, first aid methods, first aid equipment and use methods of the first aid equipment; the form of the first aid related knowledge comprises text, voice, pictures and video;
acquiring a question input by a user, and extracting an entity and a relation in the question by using an entity-relation joint extraction model;
locating the entity in the multi-modal first-aid knowledge map according to the entity in the question, and determining a matching entity; the matching entity is an entity in the multi-modal first-aid knowledge map which matches the entity in the question;
calculating the similarity of the relationship in the question and the relationship of all the matching entities by using a deep learning model;
determining answers of the question sentences according to the similarity;
and inputting the answer into a machine translation model according to the target language selected by the user, and outputting the translated answer.
Optionally, the obtaining of the emergency-related knowledge based on the internet, and constructing a multi-modal emergency knowledge graph according to the emergency-related knowledge specifically include:
acquiring first-aid knowledge in a text form based on the Internet, and constructing a traditional first-aid knowledge map;
acquiring voice form first-aid knowledge, image form first-aid knowledge and video form first-aid knowledge based on the Internet;
and merging the voice form emergency knowledge, the image form emergency knowledge and the video form emergency knowledge into the traditional emergency knowledge map to obtain a multi-mode emergency knowledge map.
Optionally, the merging the voice-form emergency knowledge, the image-form emergency knowledge, and the video-form emergency knowledge into the conventional emergency knowledge map to obtain a multi-modal emergency knowledge map specifically includes:
converting the voice information in the voice form emergency knowledge into text information;
performing combined extraction on the text information to obtain an entity and a text of the voice form emergency knowledge;
the entity and the text of the voice form emergency knowledge are merged into the traditional emergency knowledge map to obtain an emergency knowledge map containing voice emergency knowledge;
labeling the image information in the image form emergency knowledge to obtain an image labeling result; the image annotation result comprises object attributes in the image and the relationship between the objects;
the image labeling result is fused into the first-aid knowledge map containing the voice first-aid knowledge to obtain a first-aid knowledge map containing the voice first-aid knowledge and the image first-aid knowledge;
labeling image information in the first-aid knowledge in the video form to obtain a video labeling result; the video annotation result comprises an emergency skill name;
and merging the video annotation result into the emergency knowledge map containing the voice emergency knowledge and the image emergency knowledge to obtain the multi-mode emergency knowledge map.
Optionally, the determining the answer to the question sentence according to the similarity specifically includes:
selecting the relation with the highest similarity in the similarities;
and taking the matching entity corresponding to the relation with the highest similarity as the answer of the question.
An emergency knowledge question-answering system based on a multi-modal knowledge map, comprising:
the multi-mode first-aid knowledge map building module is used for obtaining first-aid related knowledge based on the Internet and building a multi-mode first-aid knowledge map according to the first-aid related knowledge; the first aid related knowledge comprises emergency diseases, first aid medicines, first aid methods, first aid equipment and use methods of the first aid equipment; the form of the first aid related knowledge comprises text, voice, pictures and video;
the entity relationship extraction module is used for acquiring a question input by a user and extracting an entity and a relationship in the question by using an entity relationship joint extraction model;
the matching module is used for locating the entity in the multi-modal first-aid knowledge map according to the entity in the question and determining a matching entity; the matching entity is an entity in the multi-modal first-aid knowledge map which matches the entity in the question;
the similarity calculation module is used for calculating the similarity between the relation in the question and the relations of all the matched entities by utilizing a deep learning model;
the answer determining module is used for determining the answer of the question according to the similarity;
and the translation module is used for inputting the answer into a machine translation model according to the target language selected by the user and outputting the translated answer.
Optionally, the multi-modal first-aid knowledge map building module specifically includes:
the traditional first-aid knowledge map construction unit is used for acquiring first-aid knowledge in a text form based on the Internet and constructing a traditional first-aid knowledge map;
the multi-mode first-aid knowledge acquisition unit is used for acquiring voice form first-aid knowledge, image form first-aid knowledge and video form first-aid knowledge based on the Internet;
and the merging unit is used for merging the voice form emergency treatment knowledge, the image form emergency treatment knowledge and the video form emergency treatment knowledge into the traditional emergency treatment knowledge map to obtain a multi-mode emergency treatment knowledge map.
Optionally, the merging unit specifically includes:
the conversion subunit is used for converting the voice information in the voice form emergency knowledge into text information;
the combined extraction subunit is used for performing combined extraction on the text information to obtain an entity and a text of the voice form emergency knowledge;
the voice integration subunit is used for integrating the entity and the text of the voice form emergency knowledge into the traditional emergency knowledge map to obtain an emergency knowledge map containing the voice emergency knowledge;
the image labeling subunit is used for labeling the image information in the image form emergency knowledge to obtain an image labeling result; the image annotation result comprises object attributes in the image and the relationship between the objects;
the image merging subunit is used for merging the image labeling result into the first-aid knowledge map containing the voice first-aid knowledge to obtain a first-aid knowledge map containing the voice first-aid knowledge and the image first-aid knowledge;
the video labeling subunit is used for labeling the image information in the first aid knowledge in the video form to obtain a video labeling result; the video annotation result comprises an emergency skill name;
and the video blending subunit is used for blending the video labeling result into the first-aid knowledge map containing the voice first-aid knowledge and the image first-aid knowledge to obtain the multi-mode first-aid knowledge map.
Optionally, the answer determining module specifically includes:
the selecting unit is used for selecting the relation with the highest similarity in the similarities;
and the answer determining unit is used for taking the matching entity corresponding to the relation with the highest similarity as the answer of the question.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
According to the emergency knowledge question-answering method and system based on the multi-modal knowledge map, a multi-modal emergency knowledge map is constructed; the entities and relations in the question are extracted; the entities are located in the multi-modal emergency knowledge map according to the entities in the question; the similarity between the relation in the question and all relations of the matching entities in the multi-modal emergency knowledge map is calculated using a deep learning model; the answer to the question is determined according to the similarity; and the answer is translated into the target language selected by the user and output. The invention combines multi-modal knowledge in the first-aid field and constructs the multi-modal emergency knowledge map on the basis of a traditional emergency knowledge map. A neural machine translation model translates the answer text retrieved from the multi-modal emergency knowledge map into the target language selected by the user. With this method and system, emergency knowledge can be learned online, which makes learning more convenient for the public and improves emergency response; multilingual translation is also supported, which improves emergency efficiency.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a first aid knowledge question-answering method based on a multi-modal knowledge map according to the present invention;
FIG. 2 is a block diagram of an emergency knowledge question-answering system based on a multi-modal knowledge map according to the present invention;
FIG. 3 is a schematic diagram of an entity labeling method according to the present invention;
FIG. 4 is a schematic diagram of a structure of a joint extraction model of entities and relationships provided by the present invention;
FIG. 5 is a block diagram of an intelligent question answering system according to an embodiment of the present invention;
FIG. 6 is a flow chart of an intelligent question answering method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a multi-modal knowledge graph-based emergency knowledge question-answering method and system, so as to solve the problems of inconvenience and poor effect of offline emergency education and training in the prior art.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
FIG. 1 is a flowchart of the emergency knowledge question-answering method based on a multi-modal knowledge map provided by the present invention. As shown in FIG. 1, the method includes:
Step 101: acquiring first-aid related knowledge based on the Internet, and constructing a multi-modal first-aid knowledge map according to the first-aid related knowledge. The first-aid related knowledge comprises emergency diseases, first-aid medicines, first-aid methods, first-aid equipment and methods of using the first-aid equipment; the forms of the first-aid related knowledge include text, voice, pictures and video.
In a specific embodiment, the step 101 specifically includes:
First-aid knowledge in text form is acquired based on the Internet, and a traditional first-aid knowledge map is constructed.
First-aid knowledge in voice form, image form and video form is acquired based on the Internet.
The voice-form, image-form and video-form first-aid knowledge is merged into the traditional first-aid knowledge map to obtain the multi-modal first-aid knowledge map.
Merging the voice-form, image-form and video-form first-aid knowledge into the traditional first-aid knowledge map to obtain the multi-modal first-aid knowledge map specifically includes:
Converting the voice information in the voice-form first-aid knowledge into text information.
Performing joint extraction on the text information to obtain the entity and the text of the voice-form first-aid knowledge.
Merging the entity and the text of the voice-form first-aid knowledge into the traditional first-aid knowledge map to obtain a first-aid knowledge map containing voice first-aid knowledge.
Labeling the image information in the image-form first-aid knowledge to obtain an image labeling result; the image labeling result comprises the attributes of the objects in the image and the relationships between the objects.
Merging the image labeling result into the first-aid knowledge map containing voice first-aid knowledge to obtain a first-aid knowledge map containing voice first-aid knowledge and image first-aid knowledge.
Labeling the image information in the video-form first-aid knowledge to obtain a video labeling result; the video labeling result comprises a first-aid skill name.
Merging the video labeling result into the first-aid knowledge map containing voice first-aid knowledge and image first-aid knowledge to obtain the multi-modal first-aid knowledge map.
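The staged merging described in step 101 can be sketched as below. The `MultiModalKG` container, the `build_graph` function, the resource identifiers and the sample triples are hypothetical; speech and image inputs are assumed to have already been reduced to triples by transcription, joint extraction or labeling.

```python
from dataclasses import dataclass, field


@dataclass
class MultiModalKG:
    triples: set = field(default_factory=set)
    media: dict = field(default_factory=dict)  # entity -> [(modality, resource id)]

    def attach(self, entity, modality, resource):
        self.media.setdefault(entity, []).append((modality, resource))


def build_graph(text_triples, speech_items, image_items, video_items):
    kg = MultiModalKG()
    kg.triples.update(text_triples)            # traditional text-form graph
    # Speech first, then images: each item is (resource id, extracted triples).
    for modality, items in (("speech", speech_items), ("image", image_items)):
        for resource, triples in items:
            kg.triples.update(triples)
            for head, _, _ in triples:
                kg.attach(head, modality, resource)
    # Videos are labeled with a first-aid skill name only.
    for resource, skill in video_items:
        kg.attach(skill, "video", resource)
    return kg


kg = build_graph(
    text_triples={("AED", "used_for", "cardiac arrest")},
    speech_items=[("audio:1", {("AED", "usage", "attach pads to bare chest")})],
    image_items=[("img:1", {("AED", "has_part", "electrode pad")})],
    video_items=[("video:1", "CPR")],
)
```

Keeping the media index separate from the triples is a design choice of this sketch; it lets a question-answering front end return the text answer first and then offer the linked voice, image and video resources.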
Step 102: acquiring a question input by a user, and extracting the entity and the relation in the question using an entity-relation joint extraction model.
Step 103: locating entities in the multi-modal emergency knowledge map according to the entity in the question, and determining matching entities. A matching entity is an entity in the multi-modal emergency knowledge map that matches the entity in the question.
Step 104: calculating, using a deep learning model, the similarity between the relation in the question and each relation of the matching entities.
Step 105: determining the answer to the question according to the similarity.
In a specific embodiment, step 105 specifically includes:
Selecting the relation with the highest similarity.
Taking the matching entity corresponding to the relation with the highest similarity as the answer to the question.
In practical application, after the answer to the question is determined, the corresponding voice, image and video first-aid knowledge can be viewed in addition to the translated text.
Step 106: inputting the answer into a machine translation model according to the target language selected by the user, and outputting the translated answer.
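Steps 102 to 106 can be tied together in a short sketch. Every component here is a stand-in chosen so the example runs on its own: `extract` replaces the joint extraction model with substring matching, `similarity` replaces the deep learning model with `difflib.SequenceMatcher`, `translate` only tags the text with the target language, and `KG` is toy data.

```python
from difflib import SequenceMatcher

# Toy graph: entity -> {relation: answer}. Illustrative data only.
KG = {
    "burn": {"first aid method": "cool with running water", "first aid medicine": "burn ointment"},
    "fracture": {"first aid method": "immobilize the limb"},
}


def extract(question):
    """Stand-in for the joint extraction model (step 102)."""
    q = question.lower().rstrip("?")
    entity = next((e for e in KG if e in q), None)
    relation_text = q.replace(entity, "").strip() if entity else q
    return entity, relation_text


def similarity(a, b):
    """Stand-in for the deep learning similarity model (step 104)."""
    return SequenceMatcher(None, a, b).ratio()


def translate(text, target_language):
    """Stand-in for the machine translation model (step 106)."""
    return f"[{target_language}] {text}"


def answer(question, target_language):
    entity, relation_text = extract(question)
    if entity is None:                      # step 103: no matching entity
        return None
    relations = KG[entity]
    # Steps 104-105: pick the relation most similar to the question's relation.
    best = max(relations, key=lambda r: similarity(relation_text, r))
    return translate(relations[best], target_language)


reply = answer("What is the first aid method for burn?", "English")
```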
FIG. 2 is a structural diagram of the emergency knowledge question-answering system based on a multi-modal knowledge map provided by the present invention. As shown in FIG. 2, the system includes:
The multi-modal emergency knowledge map construction module 201 is configured to acquire first-aid related knowledge based on the Internet and construct a multi-modal emergency knowledge map according to the first-aid related knowledge. The first-aid related knowledge comprises emergency diseases, first-aid medicines, first-aid methods, first-aid equipment and methods of using the first-aid equipment; the forms of the first-aid related knowledge include text, voice, pictures and video.
The entity relationship extraction module 202 is configured to acquire a question input by a user and extract the entity and the relation in the question using an entity-relation joint extraction model.
The matching module 203 is configured to locate entities in the multi-modal emergency knowledge map according to the entity in the question and determine matching entities; a matching entity is an entity in the multi-modal emergency knowledge map that matches the entity in the question.
The similarity calculation module 204 is configured to calculate, using a deep learning model, the similarity between the relation in the question and each relation of the matching entities.
The answer determining module 205 is configured to determine the answer to the question according to the similarity.
The translation module 206 is configured to input the answer into a machine translation model according to the target language selected by the user and output the translated answer.
In a specific embodiment, the multi-modal emergency knowledge map construction module 201 specifically includes:
The traditional first-aid knowledge map construction unit is configured to acquire first-aid knowledge in text form based on the Internet and construct a traditional first-aid knowledge map.
The multi-modal first-aid knowledge acquisition unit is configured to acquire first-aid knowledge in voice form, image form and video form based on the Internet.
The merging unit is configured to merge the voice-form, image-form and video-form first-aid knowledge into the traditional first-aid knowledge map to obtain the multi-modal first-aid knowledge map.
In a specific embodiment, the merging unit specifically includes:
The conversion subunit is configured to convert the voice information in the voice-form first-aid knowledge into text information.
The joint extraction subunit is configured to perform joint extraction on the text information to obtain the entity and the text of the voice-form first-aid knowledge.
The voice merging subunit is configured to merge the entity and the text of the voice-form first-aid knowledge into the traditional first-aid knowledge map to obtain a first-aid knowledge map containing voice first-aid knowledge.
The image labeling subunit is configured to label the image information in the image-form first-aid knowledge to obtain an image labeling result. The image labeling result comprises the attributes of the objects in the image and the relationships between the objects.
The image merging subunit is configured to merge the image labeling result into the first-aid knowledge map containing voice first-aid knowledge to obtain a first-aid knowledge map containing voice first-aid knowledge and image first-aid knowledge.
The video labeling subunit is configured to label the image information in the video-form first-aid knowledge to obtain a video labeling result; the video labeling result comprises a first-aid skill name.
The video merging subunit is configured to merge the video labeling result into the first-aid knowledge map containing voice first-aid knowledge and image first-aid knowledge to obtain the multi-modal first-aid knowledge map.
In a specific embodiment, the answer determining module specifically includes:
The selecting unit is configured to select the relation with the highest similarity.
The answer determining unit is configured to take the matching entity corresponding to the relation with the highest similarity as the answer to the question.
In practical applications, most knowledge in the field of pre-hospital care is unstructured data, so the invention mainly studies information extraction from unstructured information. Both constructing the multi-modal first-aid knowledge map and extracting the entities and relations in the question use the entity-relation joint extraction model. Information extraction comprises two tasks, entity recognition and relation extraction, and the main approaches are pipeline extraction and joint extraction. Pipeline extraction performs entity extraction first and relation extraction second. It requires two models to be trained, errors in entity extraction degrade relation extraction, redundant entities are produced, and the inherent connection and dependency between the two tasks are ignored. Joint extraction alleviates the propagation of entity extraction errors to relation extraction. The invention therefore defines a unified entity-relation label space (the union of the entity types and the relation types). The input of the entity-relation joint extraction model is a two-dimensional n x n table (n is the length of the input text), and the model assigns each cell a label from the unified label space. As shown in FIG. 3, entities are squares on the diagonal and relations are rectangles on either side of the diagonal.
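The table representation can be made concrete with a small sketch that fills an n x n table from gold annotations. The sentence, the label names `PER`, `GPE`, `PHYS` and the empty label `NONE` are illustrative assumptions and are not taken from FIG. 3.

```python
def build_table(tokens, entities, relations, null="NONE"):
    """Fill an n x n label table from a unified entity-relation label space.

    entities:  list of (start, end, type), end exclusive
    relations: list of (head entity index, tail entity index, type)
    """
    n = len(tokens)
    table = [[null] * n for _ in range(n)]
    # Entities become squares on the diagonal.
    for start, end, etype in entities:
        for i in range(start, end):
            for j in range(start, end):
                table[i][j] = etype
    # A directed relation becomes a rectangle at (head rows, tail columns).
    for h, t, rtype in relations:
        hs, he, _ = entities[h]
        ts, te, _ = entities[t]
        for i in range(hs, he):
            for j in range(ts, te):
                table[i][j] = rtype
    return table


tokens = ["David", "Perkins", "lives", "in", "California"]
entities = [(0, 2, "PER"), (4, 5, "GPE")]
relations = [(0, 1, "PHYS")]
table = build_table(tokens, entities, relations)
```

For an undirected relation the mirrored rectangle at (tail rows, head columns) would be filled as well, which is what makes such relations symmetric about the diagonal.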
The joint extraction method preserves the full range of entity relations found in real extraction scenarios (overlapping, directed and undirected relations). Based on the table form, the joint extraction model performs two operations: table filling and decoding. First, table filling predicts the label of each word pair, learns the interaction between word pairs with a biaffine attention mechanism, and imposes structural constraints on the table as regularization. Next, an approximate joint decoding algorithm outputs the finally extracted entities and relations. Experiments show that the method can effectively identify nested entities and overlapping relations. To handle nested entities, after a span is identified, the model iteratively checks whether nested entities exist inside it. Taking FIG. 3 as an example, after David Perkins is identified, the next step is to determine whether David and Perkins are themselves entities. The structure of the joint extraction model is shown in FIG. 4 and is divided into three parts: an encoding layer, a constraint layer and a decoding layer.
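The table layout described above can be illustrated with a minimal Python sketch. The sentence, the label names (PER, GPE, LIVE_IN) and the function `build_table` are hypothetical and chosen only for illustration; FIG. 3 itself is not reproduced here, and the sketch shows the gold table layout, not the model that predicts it.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)
NONE = "O"              # label of an empty cell (assumed name)

def build_table(n: int,
                entities: Dict[Span, str],
                relations: List[Tuple[Span, Span, str]]) -> List[List[str]]:
    """Fill an n x n table: entities become squares on the diagonal,
    directed relations become rectangles at (head rows, tail columns)."""
    table = [[NONE] * n for _ in range(n)]
    for (start, end), etype in entities.items():
        for i in range(start, end + 1):
            for j in range(start, end + 1):
                table[i][j] = etype
    for (hs, he), (ts, te), rtype in relations:
        for i in range(hs, he + 1):
            for j in range(ts, te + 1):
                table[i][j] = rtype
    return table

# Hypothetical example sentence and annotations.
tokens = ["David", "Perkins", "lives", "in", "California"]
entities = {(0, 1): "PER", (4, 4): "GPE"}
relations = [((0, 1), (4, 4), "LIVE_IN")]
table = build_table(len(tokens), entities, relations)

for token, row in zip(tokens, table):
    print(f"{token:<11}", row)
```

An undirected relation would be written to both rectangles, (head rows, tail columns) and (tail rows, head columns), which is what makes it symmetric about the diagonal.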
① Encoding
First, a vector representation (h_1, h_2, ..., h_n) of the text is obtained with a pre-trained language model. To address long-distance dependencies across sentences, the context of the sentence is concatenated, i.e. the sentence is extended to a fixed length (200 by default). A deep biaffine attention mechanism is then used to better encode the directional information of the words in the table: 1) two dimension-reducing multilayer perceptrons extract direction-specific information from the sentence representation; 2) a biaffine model computes a score vector for each word pair; 3) a Softmax function outputs the predicted label.
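The three encoding steps can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the patented model: the token vectors are random stand-ins for the output of a pre-trained language model, each perceptron is reduced to a single ReLU layer, and all names and dimensions (`biaffine_table`, `d_small`, and so on) are hypothetical.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu_layer(W, x):
    """One dimension-reducing layer with ReLU (stand-in for an MLP)."""
    return [max(0.0, v) for v in matvec(W, x)]

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]

def biaffine_table(H, W_head, W_tail, U, b):
    """Return probs[i][j][y]: label distribution for word pair (i, j)."""
    heads = [relu_layer(W_head, h) for h in H]  # step 1: head-role view
    tails = [relu_layer(W_tail, h) for h in H]  # step 1: tail-role view
    table = []
    for hi in heads:
        row = []
        for tj in tails:
            # step 2: biaffine score hi^T U_y tj + b_y for every label y
            scores = [
                sum(a * c for a, c in zip(hi, matvec(U[y], tj))) + b[y]
                for y in range(len(U))
            ]
            row.append(softmax(scores))         # step 3: Softmax
        table.append(row)
    return table

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

n, d_model, d_small, num_labels = 4, 8, 3, 5    # assumed toy sizes
H = rand_matrix(n, d_model)                     # stand-in for (h_1, ..., h_n)
W_head = rand_matrix(d_small, d_model)
W_tail = rand_matrix(d_small, d_model)
U = [rand_matrix(d_small, d_small) for _ in range(num_labels)]
b = [0.0] * num_labels
probs = biaffine_table(H, W_head, W_tail, U, b)
```

Using separate head and tail projections is what lets the score for pair (i, j) differ from the score for pair (j, i), so directed relations can be represented.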
② Adding constraints
The labels predicted in the previous step are independent of one another, so the results of the joint extraction model are relatively poor. Intuitively, entities and relations correspond to squares and rectangles in the two-dimensional table, but the previous step does not explicitly impose this structure. Two constraints are therefore added: 1) entities and undirected relations are symmetric about the diagonal; 2) if a relation exists, the corresponding entity pair must exist, i.e. the probability of the relation label is not higher than the probabilities of the two entities.
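One possible way to turn the two constraints into penalty terms is sketched below. The exact loss used by the invention is not given in the text, so the formulas, the function names and the label indices are assumptions; `P[i][j][y]` is taken to be the predicted probability of label y for cell (i, j).

```python
def symmetry_penalty(P, symmetric_labels):
    """Constraint 1: for entity and undirected-relation labels,
    P[i][j][y] should equal P[j][i][y]."""
    n = len(P)
    total = 0.0
    for i in range(n):
        for j in range(n):
            for y in symmetric_labels:
                total += abs(P[i][j][y] - P[j][i][y])
    return total / (n * n)

def implication_penalty(P, entity_labels, relation_labels):
    """Constraint 2: for each word i, the strongest relation probability in
    its row or column should not exceed its strongest entity probability
    on the diagonal."""
    n = len(P)
    total = 0.0
    for i in range(n):
        ent = max(P[i][i][y] for y in entity_labels)
        rel = max(max(P[i][j][y], P[j][i][y])
                  for j in range(n) for y in relation_labels)
        total += max(0.0, rel - ent)
    return total / n

# Toy 2 x 2 table; label 0 = none, 1 = an entity type, 2 = a directed relation.
P = [
    [[0.10, 0.80, 0.10], [0.20, 0.10, 0.70]],
    [[0.90, 0.05, 0.05], [0.30, 0.60, 0.10]],
]
sym = symmetry_penalty(P, symmetric_labels=[1])
imp = implication_penalty(P, entity_labels=[1], relation_labels=[2])
```

In this toy table the second word has entity probability 0.6 but takes part in a relation predicted with probability 0.7, so the implication penalty is positive; in training, such terms would be added to the classification loss as regularizers.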
③ Decoding
The decoding algorithm has three steps: span decoding (entity spans and the spans between entities), entity type decoding, and relation decoding. The two-dimensional table is predicted in the unified label space, cells whose scores exceed a threshold are marked as spans, and finally the entity and relation labels with the highest scores are output.
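The three decoding steps can be illustrated with a deliberately simplified sketch. It assumes the table already holds one hard label per cell, whereas the text describes thresholding probability scores, and it does not handle nested entities. The function name `decode` and the labels are hypothetical.

```python
from collections import Counter

def decode(table, entity_labels, none_label="O"):
    """Simplified decoding on a hard-labelled n x n table.
    Returns (spans, relations)."""
    n = len(table)
    # Steps 1 and 2: span and entity type decoding along the diagonal.
    spans = []
    i = 0
    while i < n:
        label = table[i][i]
        if label in entity_labels:
            j = i
            # Extend the span while the next diagonal cell and the cell
            # joining it to the current one carry the same entity label.
            while (j + 1 < n and table[j + 1][j + 1] == label
                   and table[j][j + 1] == label):
                j += 1
            spans.append((i, j, label))
            i = j + 1
        else:
            i += 1
    # Step 3: relation decoding by majority vote over each rectangle.
    relations = []
    for hs, he, _ in spans:
        for ts, te, _ in spans:
            if (hs, he) == (ts, te):
                continue
            votes = Counter(table[a][b]
                            for a in range(hs, he + 1)
                            for b in range(ts, te + 1))
            label, _count = votes.most_common(1)[0]
            if label != none_label and label not in entity_labels:
                relations.append(((hs, he), label, (ts, te)))
    return spans, relations

# Hypothetical 5-token table: a PER entity on tokens 0-1, a GPE entity on
# token 4, and a directed LIVE_IN relation from the first to the second.
table = [["O"] * 5 for _ in range(5)]
for a in (0, 1):
    for b in (0, 1):
        table[a][b] = "PER"
    table[a][4] = "LIVE_IN"
table[4][4] = "GPE"

spans, relations = decode(table, entity_labels={"PER", "GPE"})
```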
The main role of entity linking is to disambiguate the entity mentions obtained from text against the entities in the knowledge base, mapping each entity mention to its corresponding entity in the medical knowledge base. An entity mention here is the textual expression of an entity. An entity may have many different expressions, such as a full name, an alias or an abbreviation; for example, artificial respiration is also referred to as cardiopulmonary resuscitation in some texts, and defibrillation is referred to as AED in others. In first-aid knowledge, the different names of the same object share the same attributes. The invention therefore uses an attribute-based entity linking method, which judges whether two entities are the same by computing the similarity of the strings in their name attributes. The similarity of entity names and attributes is mainly computed with measures such as the cosine distance and the Jaccard coefficient:
Figure BDA0003381230190000201
Figure BDA0003381230190000202
where e_1 and e_2 are the given medical entities, and a(e) denotes the attribute string of medical entity e.
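As an illustration of the two measures named above, the following sketch applies the textbook definitions, cosine similarity (u · v) / (‖u‖ ‖v‖) and the Jaccard coefficient |A ∩ B| / |A ∪ B|, to two attribute strings. Representing a string by its character bigrams is an assumption made here; the patent's own formulas are given only as images and may differ in detail.

```python
import math
from collections import Counter

def char_ngrams(s, n=2):
    """Character n-grams of a string (the whole string if it is too short)."""
    s = s.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)] or [s]

def cosine_similarity(a, b, n=2):
    """Cosine of the angle between the n-gram count vectors of a and b."""
    va, vb = Counter(char_ngrams(a, n)), Counter(char_ngrams(b, n))
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def jaccard_similarity(a, b, n=2):
    """Size of the n-gram intersection divided by the size of the union."""
    sa, sb = set(char_ngrams(a, n)), set(char_ngrams(b, n))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

# Hypothetical attribute strings a(e_1) and a(e_2).
print(cosine_similarity("defibrillator", "defibrillation"))
print(jaccard_similarity("defibrillator", "defibrillation"))
```

Two mentions would then be linked to the same entity when the similarity exceeds a chosen threshold; the threshold value is not stated in the text.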
In practical application, multi-modal first-aid knowledge is merged into the traditional first-aid knowledge graph as follows:
In the emergency training resources, voice information is mainly a spoken explanation of a particular emergency skill. Such voice information is mostly contributed by emergency specialists, so its quality is high. The voice information is first identified manually to determine which emergency skill it explains, and the corresponding emergency skill name is used to define a resource identifier for the voice file. The voice information is then converted into text with a speech recognition tool, joint extraction is performed on the text, and the resulting entities and relations are added to the knowledge graph. Finally, the corresponding entity in the knowledge graph is located according to the name of the voice file and connected through the "audioaf" relation.
In the emergency training resources, images mainly show people and emergency equipment; the objects they contain are relatively simple and the relations among them are relatively clear. The invention annotates image information manually, labeling the attributes of the objects in the image and the relations between the objects. The manual annotation result is used as the resource identifier of the image, and the information in it is extracted to complete the knowledge graph. For an image containing only a single object, the entity in the knowledge graph is located according to the annotation result and connected through the "imageof" relation.
In the emergency training resources, the vast majority of video content shows emergency skills. The invention annotates video information manually: the name of the emergency skill shown in the video is labeled, the entity in the knowledge graph is located according to the annotation result, and the video is connected through the "vedioaf" relation.
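The way media resources are attached to existing entities can be sketched as follows. The class, the function, the file names and the relation names `audio_of`, `image_of` and `video_of` are hypothetical placeholders for the three linking relations mentioned in the text, and the direction of the edge (resource to entity) is an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    """Minimal triple store: a set of (head, relation, tail) tuples."""
    triples: set = field(default_factory=set)

    def add(self, head, relation, tail):
        self.triples.add((head, relation, tail))

    def has_entity(self, name):
        return any(name in (h, t) for h, _, t in self.triples)

def attach_media(kg, resource_id, entity_name, relation):
    """Link a media resource to the entity named by its manual annotation.
    Returns False when the entity cannot be located in the graph."""
    if not kg.has_entity(entity_name):
        return False
    kg.add(resource_id, relation, entity_name)
    return True

# Hypothetical text-derived triple and media files.
kg = KnowledgeGraph()
kg.add("cardiopulmonary resuscitation", "used_for", "cardiac arrest")

attach_media(kg, "cpr_explained.wav", "cardiopulmonary resuscitation", "audio_of")
attach_media(kg, "cpr_demo.mp4", "cardiopulmonary resuscitation", "video_of")
missing = attach_media(kg, "splint.png", "splint", "image_of")  # entity not in graph
```

Resources whose annotated entity is not yet in the graph are reported rather than silently added, so they can be reviewed manually.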
In one embodiment, the invention adopts a deep-learning-based answer ranking method to construct a question-answering system based on the multi-modal first-aid knowledge graph; the system framework is shown in FIG. 5.
The deep-learning-based answer ranking method projects the question and the rich semantic information contained in the knowledge graph (characters, words, context, and the entities, relations and attributes of the knowledge graph) into a high-dimensional vector space to obtain character or word vectors, computes the similarity of the vectors with a deep learning model, and ranks the candidates with a scoring mechanism to obtain the final answer. The system uses a BERT model to compare relation similarity and find the best answer in the knowledge graph, as shown in FIG. 6. After the entity and relation of the user's question are extracted, two elements of a triple are known, so the triple structure of the multi-modal first-aid knowledge graph is used to look up the remaining element as the answer.
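The final lookup step, in which two elements of a triple are known and the third is retrieved, can be sketched as below. The function name and the toy triples are hypothetical; a real system would query a graph database rather than scan a list.

```python
def complete_triple(triples, head=None, relation=None, tail=None):
    """Given exactly two of (head, relation, tail), return every value
    of the missing element found in the triple list."""
    known = sum(x is not None for x in (head, relation, tail))
    if known != 2:
        raise ValueError("exactly two of head, relation, tail must be given")
    results = []
    for h, r, t in triples:
        if head is not None and h != head:
            continue
        if relation is not None and r != relation:
            continue
        if tail is not None and t != tail:
            continue
        results.append(h if head is None else r if relation is None else t)
    return results

# Hypothetical toy triples.
TRIPLES = [
    ("burn", "first_aid_method", "cool the burn under running water"),
    ("burn", "symptom", "blistering"),
    ("fracture", "first_aid_method", "immobilize the limb"),
]

print(complete_triple(TRIPLES, head="burn", relation="first_aid_method"))
print(complete_triple(TRIPLES, relation="first_aid_method", tail="immobilize the limb"))
```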
In recent years, with significant progress in representation learning techniques within artificial intelligence, machine learning and deep learning, the semantic information of medical entities can be represented as dense, low-dimensional, real-valued vectors, so that complex semantic associations among entities and relations can be computed in a low-dimensional space. A common approach to knowledge representation is the translational distance model, which judges the plausibility of a fact with a distance-based scoring function. Representative models include the translation model TransE and the models extended from it to handle complex relations (TransH, TransR, TransD, TransG, KG2E, etc.). The relation vector r in a triple (h, r, t) can be viewed as a translation from the head entity vector h to the tail entity vector t, satisfying h + r ≈ t. The translation model has few parameters and low computational complexity, is suitable for large-scale sparse medical knowledge bases, and has good performance and scalability, so the invention uses the TransE model for knowledge representation.
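The distance-based scoring function of TransE can be written in a few lines. The sketch below uses the standard formulation from the literature, score = ‖h + r − t‖ with a margin ranking loss; the embedding values are made up for illustration, and the training loop is omitted.

```python
import math

def transe_score(h, r, t, norm=2):
    """Distance ||h + r - t||; a smaller score means a more plausible triple."""
    diff = [hi + ri - ti for hi, ri, ti in zip(h, r, t)]
    if norm == 1:
        return sum(abs(d) for d in diff)
    return math.sqrt(sum(d * d for d in diff))

def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin ranking loss: a true triple should score at least `margin`
    lower than a corrupted one."""
    return max(0.0, margin + pos_score - neg_score)

# Made-up 2-dimensional embeddings.
head = [1.0, 0.0]
relation = [0.0, 1.0]
true_tail = [1.0, 1.0]    # exactly head + relation
wrong_tail = [4.0, 5.0]

pos = transe_score(head, relation, true_tail)
neg = transe_score(head, relation, wrong_tail)
loss = margin_loss(pos, neg)
```

Each entity and each relation needs only one vector, which is the source of the small parameter count mentioned above.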
The emergency health knowledge question-answering method and system based on the multi-modal knowledge graph have the following advantages:
The invention is innovative and practical. With the development of information technology, the media used to spread first-aid knowledge have gradually shifted from traditional printed matter, such as first-aid manuals, books and bulletin boards, to information technology products such as first-aid websites and medical software. Existing ways of disseminating first-aid knowledge include traditional offline volunteer activities and the use of the Internet to spread knowledge on first-aid and health topics. Online dissemination reaches a wider audience at low cost, while offline volunteer activities better ensure the quality of the instruction. However, in real life a user with a specific question about an emergency skill wants a direct answer to that question rather than a complete explanation of the skill. In an emergency scenario, the invention can immediately give answers on the relevant first-aid knowledge according to the symptom information provided by the user. Compared with existing ways of popularizing first-aid knowledge, the invention covers a wider range of usage scenarios and understands the user's question more precisely, so it can give accurate answers. In addition, considering that multiple languages are spoken in some regions of China, the invention provides a machine translation model for the emergency field so that first-aid knowledge can be presented in different languages. The invention can address problems in the current emergency education field such as the single mode of emergency training, the limited effect of offline training, and the weak emergency safety awareness of the general public.
The data comes from the Urumqi first aid center, so the first-aid information is real and reliable.
The invention emphasizes the user's learning outcome. It combines multi-modal knowledge in the emergency field and constructs a multi-modal first-aid knowledge graph on the basis of the traditional knowledge graph. Voice, image and video information related to first aid is integrated into the traditional knowledge graph, and the answer returned to the user also includes information in these other modalities, making the answer more comprehensive and easier to understand.
The invention is multilingual. It uses neural machine translation: a Chinese-Uygur parallel corpus in the emergency field is used to build a neural machine translation model, which translates the text of the answer retrieved from the multi-modal first-aid knowledge graph into the target language selected by the user. The text in the first-aid knowledge graph can therefore be kept in Chinese, which makes the knowledge graph easier to update in the future, while the user's need to understand the answer is still met.
In practical application, the emergency health knowledge question-answering system based on the multi-modal knowledge graph comprises:
Multi-modal first-aid knowledge graph: emergency-related knowledge is crawled from the Internet, covering emergency diseases, first-aid medicines, first-aid methods, first-aid equipment and its usage, in the form of text, voice, pictures and video, and the multi-modal first-aid knowledge graph is constructed from this knowledge.
Joint extraction model of entities and relations: a joint extraction model is built for the entities and relations in natural-language questions in the emergency field, and the extracted entities are used to locate entities in the multi-modal first-aid knowledge graph.
Relation similarity calculation model based on deep learning: the jointly extracted entity is linked to an entity in the multi-modal first-aid knowledge graph to achieve entity positioning. A deep learning model then computes the similarity between the relation extracted from the question and all relations of the linked entity in the knowledge graph, and the answer to the question is obtained according to the similarity scores.
Machine translation model based on neural network: the neural machine translation model supports at least Chinese-Uygur mutual translation.
When a question input by a user is received, the entity-relation joint extraction model extracts the entity and relation in the question, and the entity is located in the knowledge graph. A deep learning model such as BERT then computes the similarity between the relation in the question and all relations of the matching entity in the multi-modal first-aid knowledge graph; the relation with the highest similarity is selected, and the corresponding entity is taken as the answer to the question. The answer is then fed to the translation model according to the target language selected by the user, and the output of the translation model is returned to the user as the final answer.
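The flow just described can be summarized in a short orchestration sketch. Every component here is a stand-in and an assumption: `toy_extract` replaces the joint extraction model, a character-level `difflib` ratio replaces the BERT relation similarity, `toy_translate` replaces the neural machine translation model, and the triples and the language code are invented for illustration.

```python
from difflib import SequenceMatcher

def string_similarity(a, b):
    """Stand-in for the BERT-based relation similarity (assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def answer_question(question, triples, extract, translate, target_lang,
                    similarity=string_similarity):
    """Extract (entity, relation), locate the entity, pick the most similar
    relation, take its tail entity as the answer, then translate it."""
    entity, relation = extract(question)
    candidates = [(r, t) for h, r, t in triples if h == entity]
    if not candidates:
        return None  # entity could not be located in the graph
    _best_relation, best_tail = max(
        candidates, key=lambda rt: similarity(relation, rt[0]))
    return translate(best_tail, target_lang)

# Hypothetical toy components and data.
TRIPLES = [
    ("burn", "first aid method", "cool the burn under running water"),
    ("burn", "symptom", "blistering"),
]

def toy_extract(question):
    """Pretend extractor: returns a fixed relation for known entities."""
    entity = "burn" if "burn" in question.lower() else "unknown"
    return entity, "first aid"

def toy_translate(text, lang):
    """Pretend translator: tags the text with the target language code."""
    return f"[{lang}] {text}"

result = answer_question("What is the first aid for a burn?", TRIPLES,
                         toy_extract, toy_translate, target_lang="ug")
```

Passing the extractor, similarity function and translator as arguments keeps the orchestration independent of the particular models, so each stand-in can be replaced by a trained model without changing the flow.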
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (8)

1. A multi-modal knowledge map-based emergency knowledge question-answering method is characterized by comprising the following steps:
acquiring first-aid related knowledge based on the Internet, and constructing a multi-mode first-aid knowledge map according to the first-aid related knowledge; the first aid related knowledge comprises emergency diseases, first aid medicines, first aid methods, first aid equipment and use methods of the first aid equipment; the form of the first aid related knowledge comprises text, voice, pictures and video;
acquiring a question input by a user, and extracting an entity and a relation in the question by using an entity-relation joint extraction model;
positioning the entity in the multi-mode first-aid knowledge graph according to the entity in the question, and determining a matching entity; the matching entity is an entity in the multi-modal emergency knowledge base which is matched with the entity in the question;
calculating the similarity of the relationship in the question and the relationship of all the matching entities by using a deep learning model;
determining answers of the question sentences according to the similarity;
and inputting the answer into a machine translation model according to the target language selected by the user, and outputting the translated answer.
2. The multi-modal knowledge-graph-based emergency knowledge question answering method according to claim 1, wherein the internet-based acquisition of emergency-related knowledge and the construction of a multi-modal emergency knowledge-graph based on the emergency-related knowledge comprise:
acquiring first-aid knowledge in a text form based on the Internet, and constructing a traditional first-aid knowledge map;
acquiring voice form first-aid knowledge, image form first-aid knowledge and video form first-aid knowledge based on the Internet;
and merging the voice form emergency knowledge, the image form emergency knowledge and the video form emergency knowledge into the traditional emergency knowledge map to obtain a multi-mode emergency knowledge map.
3. The multi-modal knowledge-graph-based emergency knowledge question-answering method according to claim 2, wherein the merging the voice-form emergency knowledge, the image-form emergency knowledge and the video-form emergency knowledge into the traditional emergency knowledge-graph to obtain a multi-modal emergency knowledge-graph, specifically comprises:
converting the voice information in the voice form emergency knowledge into text information;
performing combined extraction on the text information to obtain an entity and a text of the voice form emergency knowledge;
the entity and the text of the voice form emergency knowledge are merged into the traditional emergency knowledge map to obtain an emergency knowledge map containing voice emergency knowledge;
labeling the image information in the image form emergency knowledge to obtain an image labeling result; the image annotation result comprises object attributes in the image and the relationship between the objects;
the image labeling result is fused into the first-aid knowledge map containing the voice first-aid knowledge to obtain a first-aid knowledge map containing the voice first-aid knowledge and the image first-aid knowledge;
labeling image information in the first-aid knowledge in the video form to obtain a video labeling result; the video annotation result comprises an emergency skill name;
and merging the video annotation result into the emergency knowledge map containing the voice emergency knowledge and the image emergency knowledge to obtain the multi-mode emergency knowledge map.
4. The multi-modal knowledge-graph-based emergency knowledge question-answering method according to claim 1, wherein the determining answers to the question sentences according to the similarity specifically comprises:
selecting the relation with the highest similarity in the similarities;
and taking the matching entity corresponding to the relation with the highest similarity as the answer of the question.
5. An emergency knowledge question-answering system based on a multi-modal knowledge map, comprising:
the multi-mode first-aid knowledge map building module is used for obtaining first-aid related knowledge based on the Internet and building a multi-mode first-aid knowledge map according to the first-aid related knowledge; the first aid related knowledge comprises emergency diseases, first aid medicines, first aid methods, first aid equipment and use methods of the first aid equipment; the form of the first aid related knowledge comprises text, voice, pictures and video;
the entity relationship extraction module is used for acquiring a question input by a user and extracting an entity and a relationship in the question by using an entity relationship joint extraction model;
the matching module is used for positioning the entity in the multi-mode first-aid knowledge graph according to the entity in the question and determining a matching entity; the matching entity is an entity in the multi-modal emergency knowledge base which is matched with the entity in the question;
the similarity calculation module is used for calculating the similarity between the relation in the question and the relations of all the matched entities by utilizing a deep learning model;
the answer determining module is used for determining the answer of the question according to the similarity;
and the translation module is used for inputting the answer into a machine translation model according to the target language selected by the user and outputting the translated answer.
6. The multi-modal knowledge-graph-based emergency knowledge question-answering system according to claim 5, wherein the multi-modal emergency knowledge-graph construction module specifically comprises:
the traditional first-aid knowledge map construction unit is used for acquiring first-aid knowledge in a text form based on the Internet and constructing a traditional first-aid knowledge map;
the multi-mode first-aid knowledge acquisition unit is used for acquiring voice form first-aid knowledge, image form first-aid knowledge and video form first-aid knowledge based on the Internet;
and the merging unit is used for merging the voice form emergency treatment knowledge, the image form emergency treatment knowledge and the video form emergency treatment knowledge into the traditional emergency treatment knowledge map to obtain a multi-mode emergency treatment knowledge map.
7. The multi-modal knowledge-graph-based emergency knowledge question-answering system according to claim 6, wherein the merging unit specifically comprises:
the conversion unit is used for converting the voice information in the voice form emergency knowledge into text information;
the combined extraction subunit is used for performing combined extraction on the text information to obtain an entity and a text of the voice form emergency knowledge;
the voice integration subunit is used for integrating the entity and the text of the voice form emergency knowledge into the traditional emergency knowledge map to obtain an emergency knowledge map containing the voice emergency knowledge;
the image labeling subunit is used for labeling the image information in the image form emergency knowledge to obtain an image labeling result; the image annotation result comprises object attributes in the image and the relationship between the objects;
the image merging subunit is used for merging the image labeling result into the first-aid knowledge map containing the voice first-aid knowledge to obtain a first-aid knowledge map containing the voice first-aid knowledge and the image first-aid knowledge;
the video labeling subunit is used for labeling the image information in the first aid knowledge in the video form to obtain a video labeling result; the video annotation result comprises an emergency skill name;
and the video blending subunit is used for blending the video labeling result into the first-aid knowledge map containing the voice first-aid knowledge and the image first-aid knowledge to obtain the multi-mode first-aid knowledge map.
8. The multi-modal knowledge-graph-based emergency knowledge question and answer system of claim 5, wherein the answer determination module specifically comprises:
the selecting unit is used for selecting the relation with the highest similarity in the similarities;
and the answer determining unit is used for taking the matching entity corresponding to the relation with the highest similarity as the answer of the question.
CN202111434019.3A 2021-11-29 2021-11-29 Multi-modal knowledge graph-based emergency knowledge question-answering method and system Pending CN114064931A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111434019.3A CN114064931A (en) 2021-11-29 2021-11-29 Multi-modal knowledge graph-based emergency knowledge question-answering method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111434019.3A CN114064931A (en) 2021-11-29 2021-11-29 Multi-modal knowledge graph-based emergency knowledge question-answering method and system

Publications (1)

Publication Number Publication Date
CN114064931A (en) 2022-02-18

Family

ID=80277317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111434019.3A Pending CN114064931A (en) 2021-11-29 2021-11-29 Multi-modal knowledge graph-based emergency knowledge question-answering method and system

Country Status (1)

Country Link
CN (1) CN114064931A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination