WO2021012519A1 - Artificial intelligence-based question and answer method and apparatus, computer device, and storage medium - Google Patents

Artificial intelligence-based question and answer method and apparatus, computer device, and storage medium

Info

Publication number
WO2021012519A1
WO2021012519A1 PCT/CN2019/117954 CN2019117954W WO2021012519A1
Authority
WO
WIPO (PCT)
Prior art keywords
training
question
model
ner
language model
Prior art date
Application number
PCT/CN2019/117954
Other languages
French (fr)
Chinese (zh)
Inventor
朱威
李恬静
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021012519A1 publication Critical patent/WO2021012519A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology

Definitions

  • This application relates to the technical field of knowledge graphs, and in particular to a question and answer method, device, computer equipment and storage medium based on artificial intelligence.
  • a knowledge graph, also called a scientific knowledge map (known in library and information science as knowledge-domain visualization or knowledge-domain mapping), is a family of graphs that depict the development and structural relationships of knowledge. Because it provides high-quality structured data, knowledge graphs and knowledge-graph-based question answering systems are used in a growing number of fields, such as automatic question answering, search engines, and information extraction.
  • a typical knowledge graph is expressed as triples of head entity, relation, and tail entity (for example, Yao Ming, nationality, China); this instance expresses the fact that Yao Ming's nationality is Chinese.
  • the purpose of this application is to provide a question and answer method, device, computer equipment and storage medium based on artificial intelligence to solve the problems existing in the prior art.
  • this application provides a question and answer method based on artificial intelligence, including:
  • the second training data is manually labeled NER data; each NER data item includes a question and the manually labeled NER mark corresponding to that question;
  • the answer corresponding to the sentence to be processed is determined and output.
  • this application also provides an artificial intelligence-based question and answer device, including:
  • the language model training module is configured to perform language model training based on first training data, where the first training data is a large number of automatically labeled question corpus in a specified field;
  • the NER model training module is used to perform NER model training based on the second training data and the trained language model.
  • the second training data is manually labeled NER data; each NER data item includes a question and the manually labeled NER mark corresponding to that question;
  • the relation matching model training module is configured to perform relation matching model training based on the second training data and the trained language model
  • the entity recognition module is used to recognize the entities in the sentence to be processed based on the trained NER model
  • a relationship obtaining module configured to obtain the relationship corresponding to the sentence to be processed based on the trained relationship matching model
  • an answer output module configured to determine and output the answer corresponding to the sentence to be processed according to the relationship corresponding to the sentence to be processed and the entity in the sentence to be processed.
  • this application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and running on the processor.
  • the second training data is manually labeled NER data; each NER data item includes a question and the manually labeled NER mark corresponding to that question;
  • the answer corresponding to the sentence to be processed is determined and output.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps of the artificial intelligence-based question and answer method are realized:
  • the second training data is manually labeled NER data; each NER data item includes a question and the manually labeled NER mark corresponding to that question;
  • the answer corresponding to the sentence to be processed is determined and output.
  • the artificial intelligence-based question answering method, device, computer equipment, and storage medium provided in this application are based on language model transfer learning and graph transfer learning. They improve common language model training methods, making them better suited to knowledge graph question answering systems, and can achieve higher accuracy with a smaller amount of manually labeled data.
  • the graph structure is transferred into the relation matching model through alternate training of the relation matching model and the knowledge representation model, which effectively improves the accuracy of relation matching (i.e., of recognizing the corresponding relation) and yields a stronger relation extraction capability, reducing the cost of manual participation and improving the efficiency of constructing knowledge graphs.
  • FIG. 1 is a flowchart of an embodiment of a question answering method based on artificial intelligence in this application
  • FIG. 2 is a schematic diagram of program modules of an embodiment of an artificial intelligence-based question and answer device according to this application;
  • FIG. 3 is a schematic diagram of the hardware structure of an embodiment of an artificial intelligence-based question and answer device of this application.
  • This application discloses an artificial intelligence-based question answering method, including the following steps:
  • S1: perform language model training based on the first training data; a large number of unlabeled question corpora in a designated field are collected, and the location of the entity in each question corpus is automatically marked to obtain the first training data.
  • in step S1, a crawler can collect a large number of unlabeled question corpora in designated vertical fields (such as medicine, marine science, and other industries that emphasize depth of knowledge and require high professionalism).
  • the question corpus mainly consists of question-and-answer interaction data, and the larger the corpus the better (generally no fewer than 500,000 items); the task of the language model training is to predict a word in a sentence when that word is occluded.
  • dictionary matching can be used to automatically mark which part of the sentence is an entity (although this will be somewhat inaccurate).
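As an illustration of this dictionary-matching step, here is a minimal sketch; the entity dictionary and the sample question are invented for the example, not taken from the application:

```python
# Sketch of dictionary-based automatic entity marking: scan the question
# and record spans whose text appears in the entity dictionary, trying
# longer substrings first so multi-word entities win over their parts.
def auto_mark_entities(question, entity_dict):
    """Return (start, end, entity_text) spans found by longest-match lookup."""
    spans = []
    i = 0
    while i < len(question):
        match = None
        for j in range(len(question), i, -1):  # longest candidate first
            if question[i:j] in entity_dict:
                match = (i, j, question[i:j])
                break
        if match:
            spans.append(match)
            i = match[1]  # resume after the matched entity
        else:
            i += 1
    return spans

entity_dict = {"atorvastatin", "coronary heart disease"}
spans = auto_mark_entities(
    "how should atorvastatin be taken for coronary heart disease", entity_dict
)
```

As the bullet above notes, this kind of auto-labeling is somewhat inaccurate (it cannot disambiguate context), but it is cheap enough to run over hundreds of thousands of questions.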
  • the language model predicts a word based on its context, that is, it realizes the function of predicting a word in the sentence when that word is occluded.
  • the selection of the language model is not restricted; for example, the commonly used Google Transformer language model or LSTM (Long Short-Term Memory) language models can be used, and the language model training procedure can follow the conventional scheme in this field.
  • the language model training may include the following steps: input the first training data into the embedding layer of the Google Transformer language model for vectorization, then input it to the coding layer to obtain the self-attention calculation matrix; input the calculation matrix into the loss function and minimize the loss function of the Google Transformer language model based on the Adam optimization algorithm; finally, save the Google Transformer language model parameter settings.
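The occluded-word prediction task that this training optimizes can be illustrated by how (input, target) pairs are built from raw questions; the whitespace tokenization and the `[MASK]` token name are assumptions of this sketch, not details from the application:

```python
# Build masked-word training pairs: for each token position, occlude that
# token and keep the original word as the prediction target.
def make_masked_pairs(sentence, mask_token="[MASK]"):
    """Yield one (masked_sentence, target_word) pair per token position."""
    tokens = sentence.split()
    pairs = []
    for i, word in enumerate(tokens):
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        pairs.append((" ".join(masked), word))
    return pairs

pairs = make_masked_pairs("how to take atorvastatin")
```

Each pair is then fed through the embedding and coding layers, and the loss (e.g. cross-entropy over the vocabulary) is minimized with Adam, as described above.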
  • step S2 performs NER model training based on the second training data and the trained language model.
  • the network of the NER model is added on top of the trained language model, and the NER model is trained based on the manually labeled NER data; the parameters of the language model are treated as parameters of the upper-layer NER model and trained together.
  • through gradient descent, parameters are updated and the training process is repeated, so the loss function is continuously reduced and better and better predictions are obtained.
  • the second training data is manually labeled NER data, and each NER data item includes a question and the manually labeled NER mark corresponding to that question; a quantity of second training data on the order of thousands is sufficient.
  • step S2 NER model training includes:
  • S21: express the question in the second training data as a vector sequence or matrix through the trained language model and input it into the NER model, so as to output a predicted NER mark;
  • S22: compare the manually annotated NER mark with the corresponding predicted NER mark to calculate a loss function; in this embodiment, categorical cross-entropy is selected as the loss function.
  • S23 updates and optimizes the language model and NER model parameters based on the gradient optimization algorithm.
  • NER models such as the LSTM+CRF model or other entity recognition models may be used.
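To make the "manually labeled NER mark" concrete, the sketch below converts a question with a known entity into the per-token tags such a model is trained to predict; the B/I/O tag scheme is a common convention assumed here, not specified by the application:

```python
# Convert a tokenized question plus a labeled entity into BIO tags:
# B marks the first token of an entity, I its continuation, O everything else.
def bio_tags(tokens, entity_tokens):
    tags = ["O"] * len(tokens)
    n = len(entity_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == entity_tokens:
            tags[i] = "B"
            tags[i + 1:i + n] = ["I"] * (n - 1)
    return tags

tags = bio_tags("how to take atorvastatin".split(), ["atorvastatin"])
```

During training (step S22), predicted tag sequences are compared against these gold tags with categorical cross-entropy.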
  • S3 performs relationship matching model training based on the second training data and the trained language model.
  • in each epoch, the question vector and the relation vector are used to train the relation matching model, after which the relation vector undergoes a separate epoch of knowledge representation model training; the relation matching model training and the knowledge representation model training alternate in this way until all epochs are completed.
  • the question in the second training data is encoded by the trained language model to output the question vector;
  • the relation in the question in the second training data is randomly initialized by the embedding layer and expressed as a relation vector;
  • one epoch as used above means that all the data is sent through the network to complete one forward computation and one back-propagation pass.
  • Entities have multiple relations and attributes in the knowledge graph, but which one best matches the question must be confirmed by the relation matching model; generally, the single correct relation must be selected from among 100-200 relations. For example, for "How should atorvastatin be taken to prevent coronary heart disease?", the relation is "How to take <drug> to prevent and treat <disease>?", where <drug> and <disease> both denote medical entities.
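A hedged sketch of this relation-selection step: the question vector is scored against each candidate relation vector and the best match is kept. Cosine similarity is used here purely as a stand-in for whatever matching network is actually trained, and the vectors and relation names are invented:

```python
import numpy as np

# Pick the relation whose vector is most similar to the question vector.
def best_relation(q_vec, relation_vecs):
    names = list(relation_vecs)
    mat = np.stack([relation_vecs[n] for n in names])
    sims = mat @ q_vec / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q_vec))
    return names[int(np.argmax(sims))]

q = np.array([1.0, 0.0])  # toy question vector
rels = {
    "how_to_take": np.array([0.9, 0.1]),
    "side_effects": np.array([0.1, 0.9]),
}
```

In the real system the candidate set holds 100-200 learned relation vectors rather than two toy ones.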
  • through the pre-training of the language model and the pre-training of the knowledge representation, semantics and the knowledge graph structure are combined, and the relation matching model is trained for one epoch.
  • the language model is included in the relation matching model, i.e., all parameters of the language model are treated as parameters of the relation matching model and updated during training; the relation vector is also updated in the relation matching model, and at the end of an epoch the relation vector is additionally trained for one epoch in a knowledge representation model such as HolE (Holographic Embeddings of knowledge graphs).
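For reference, the HolE model mentioned here scores a (head, relation, tail) triple via circular correlation of the head and tail embeddings, which can be computed with FFTs; the toy embeddings below are illustrative, not learned:

```python
import numpy as np

# HolE triple score: r . (h * t), where * is circular correlation,
# computed as ifft(conj(fft(h)) * fft(t)).
def hole_score(h, r, t):
    corr = np.fft.ifft(np.conj(np.fft.fft(h)) * np.fft.fft(t)).real
    return float(np.dot(r, corr))

# With h a unit impulse, the correlation reduces to t itself,
# so this toy triple scores exactly 1.0.
score = hole_score(np.array([1.0, 0.0, 0.0]),
                   np.array([0.0, 1.0, 0.0]),
                   np.array([0.0, 1.0, 0.0]))
```

Training the knowledge representation model means adjusting the embeddings so that observed triples score higher than corrupted ones.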
  • Relational matching model training includes the following steps:
  • S31: encode the question in the second training data through the trained language model and output the question vector; in this step, a copy of the trained language model is made, and the question vector is output from it; the quantity of second training data required is likewise on the order of thousands;
  • S32: the relation in the question in the second training data is randomly initialized as a relation vector through the embedding layer;
  • in step S33, through the pre-training of the language model and the pre-training of the knowledge representation, semantics and the knowledge graph structure are combined, and the relation matching model is trained for one epoch.
  • the language model is included in the relation matching model, i.e., all its parameters are treated as parameters of the relation matching model and updated during training (fine-tuning the language model); the relation vector is also updated in the relation matching model, and at the end of an epoch it is additionally trained for one epoch in a knowledge representation model such as HolE (Holographic Embeddings of knowledge graphs).
  • the above relation matching model training and knowledge representation model training are performed alternately, which improves relation matching.
  • S34: save the parameters of the Google Transformer language model and the relation matching model.
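The alternating schedule of steps S31-S34 can be sketched as a loop; the `train_matching` and `train_kg` callables below are trivial stand-ins for the real one-epoch training procedures, used only to show the hand-off of the relation vectors:

```python
# Alternate one epoch of relation-matching training with one epoch of
# knowledge-representation training, passing the relation vectors between them.
def alternate_training(relation_vecs, epochs, train_matching, train_kg):
    log = []
    for epoch in range(epochs):
        relation_vecs = train_matching(relation_vecs)  # one relation-matching epoch
        relation_vecs = train_kg(relation_vecs)        # one knowledge-representation epoch
        log.append(epoch)
    return relation_vecs, log

vecs, log = alternate_training(
    [0.0], 3,
    train_matching=lambda v: [x + 1 for x in v],
    train_kg=lambda v: [x * 2 for x in v],
)
```

Because both phases update the same relation vectors, the graph structure learned by the knowledge representation model is carried into the relation matching model, as the application claims.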
  • the relation matching model itself is not limited; general semantic matching models are available, for example one based on ESIM (Enhanced Sequential Inference Model), or other semantic matching models.
  • S4 identifies entities in the sentence to be processed based on the trained NER model, and obtains the relationship corresponding to the sentence to be processed based on the trained relationship matching model.
  • S5 determines and outputs the answer corresponding to the sentence to be processed according to the relationship corresponding to the sentence to be processed and the entity in the sentence to be processed.
  • each entity has a corresponding node on the knowledge graph.
  • specifically, the node corresponding to the entity in the sentence to be processed is located on the knowledge graph, and the content corresponding to the relation of the sentence to be processed is retrieved there; the found content is determined to be the answer corresponding to the sentence to be processed, and the answer is output.
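The lookup in this final step amounts to reading an (entity, relation) key off the graph; the triples below are invented for illustration, not drawn from any real knowledge graph:

```python
# A knowledge graph reduced to a dict of (head_entity, relation) -> content.
kg = {
    ("atorvastatin", "how_to_take"): "once daily, orally",
    ("atorvastatin", "drug_class"): "statin",
}

# Step S5: given the recognized entity and the matched relation,
# the answer is the content stored at that node.
def answer(entity, relation, graph):
    return graph.get((entity, relation))

ans = answer("atorvastatin", "how_to_take", kg)
```

A production system would query a graph database rather than a dict, but the access pattern (entity node, then relation edge, then content) is the same.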
  • the artificial intelligence-based question answering method shown in this application improves common language model training methods, making them better suited to knowledge graph question answering systems, which can achieve higher accuracy with less manually labeled data.
  • the stronger relation extraction capability reduces the cost of manual participation and improves the efficiency of constructing knowledge graphs.
  • This application shows a question and answer device 10 based on artificial intelligence.
  • the artificial intelligence-based question answering device 10 may include, or be divided into, one or more program modules, which are stored in a storage medium and executed by one or more processors to complete this application and realize the above-mentioned artificial intelligence-based question answering method.
  • the program module referred to in this application refers to a series of computer program instruction segments capable of completing specific functions, and is more suitable than the program itself to describe the execution process of the artificial intelligence-based question and answer device 10 in the storage medium.
  • the artificial intelligence-based question answering device 10 shown in this application includes:
  • the language model training module 11 is configured to perform language model training based on first training data, where the first training data is a large number of automatically labeled question corpora in a specified field;
  • the NER model training module 12 is configured to perform NER model training based on the second training data and the trained language model.
  • the second training data is manually labeled NER data, and each NER data item includes a question and the manually labeled NER mark corresponding to that question;
  • the relation matching model training module 13 is configured to perform relation matching model training based on the second training data and the trained language model
  • the entity recognition module 14 is used to recognize entities in the sentence to be processed based on the trained NER model
  • the relationship obtaining module 15 is configured to obtain the relationship corresponding to the sentence to be processed based on the trained relationship matching model
  • the answer output module 16 is used to determine and output the answer corresponding to the sentence to be processed according to the relationship corresponding to the sentence to be processed and the entity in the sentence to be processed.
  • the language model training module 11 includes a first training data acquisition sub-module, which is used to collect a large number of unlabeled question corpora in a specified field and automatically mark the location of the entity in each question corpus to obtain the first training data.
  • the location of the entity in each question corpus is automatically marked through dictionary matching; if no entity is matched in a question corpus, one is randomly selected.
  • the language model is a google transformer language model
  • the language model training module further includes:
  • the vectorization sub-module is used to input the first training data into the embedding layer of the google transformer language model for vectorization;
  • the matrix acquisition sub-module is used to input the vector to the coding layer to obtain the self-attention calculation matrix
  • the first optimization sub-module is configured to input the calculation matrix into the loss function and update and optimize the parameters of the google transformer language model based on a gradient optimization algorithm;
  • the first saving sub-module is used to save the google transformer language model parameter settings.
  • the NER model training module 12 includes:
  • a predicted NER tag acquisition sub-module, configured to express the question in the second training data as a vector sequence or matrix through the trained language model and input it into the NER model to output a predicted NER tag;
  • the comparison sub-module is used to compare the artificially annotated NER mark with the corresponding predicted NER mark, and calculate a loss function
  • the second optimization sub-module is used to update and optimize the language model and the NER model parameters based on the gradient optimization algorithm.
  • the relationship matching model training module 13 includes:
  • the question vector obtaining submodule is used to obtain the question vector of the second training data
  • the relation vector obtaining sub-module is used to obtain the relation vector of the second training data
  • the training sub-module is used to perform attention-mechanism interaction and training on the question vector and the relation vector.
  • the question vector and the relation vector participate in one epoch of relation matching model training, and the parameters of the language model and the relation matching model are updated and optimized based on a gradient optimization algorithm; at the end of each epoch, the relation vector is put into the knowledge representation model for one epoch of training, alternating this process until all epochs are completed;
  • the third saving submodule is used to save the language model and the parameters of the relationship matching model.
  • the question sentence in the second training data outputs the question sentence vector through the trained language model.
  • the relationship in the question sentence in the second training data is randomly initialized by an embedding layer and expressed as a relationship vector.
  • the artificial intelligence-based question answering device 10 shown in this application improves common language model training methods based on language model transfer learning and graph transfer learning, making it better suited to knowledge graph question answering systems and achieving higher accuracy with a smaller amount of manually labeled data.
  • the graph structure is transferred into the relation matching model through alternate training of the relation matching model and the knowledge representation model, which effectively improves the accuracy of relation matching (i.e., of recognizing the corresponding relation) and yields a stronger relation extraction capability, reducing the cost of manual participation and improving the efficiency of constructing knowledge graphs.
  • This application also provides a computer device, such as a smart phone, tablet computer, notebook computer, desktop computer, rack server, blade server, tower server, or cabinet server (including an independent server, or a server cluster composed of multiple servers), etc.
  • the computer device 20 in this embodiment at least includes but is not limited to: a memory 21 and a processor 22 that can be communicatively connected to each other through a system bus, as shown in FIG. 3. It should be pointed out that FIG. 3 only shows the computer device 20 with components 21-22, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead.
  • the memory 21 (i.e., a readable storage medium) includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc.
  • the memory 21 may be an internal storage unit of the computer device 20, such as a hard disk or memory of the computer device 20.
  • the memory 21 may also be an external storage device of the computer device 20, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, etc.
  • the memory 21 may also include both the internal storage unit of the computer device 20 and its external storage device.
  • the memory 21 is generally used to store an operating system and various application software installed in the computer device 20, such as the program code of the artificial intelligence-based question answering device 10 in the first embodiment.
  • the memory 21 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 22 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 22 is generally used to control the overall operation of the computer device 20.
  • the processor 22 is used to run the program code or process data stored in the memory 21, for example, to run the question and answer device 10 based on artificial intelligence, so as to implement the question and answer method based on artificial intelligence in the first embodiment.
  • This application also provides a computer-readable storage medium, such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, server, app store, etc., on which a computer program is stored.
  • the computer-readable storage medium of this embodiment is used to store the question and answer device 10 based on artificial intelligence, and when executed by a processor, it implements the question and answer method based on artificial intelligence in the first embodiment.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Machine Translation (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

An artificial intelligence-based question and answer method and apparatus, a computer device, and a storage medium: on the basis of first training data, performing language model training and, on the basis of second training data and the trained language model, performing NER model training and relationship matching model training; on the basis of the trained NER model, identifying an entity in a sentence to be processed and, on the basis of the trained relationship matching model, acquiring the relationship corresponding to the sentence to be processed; on the basis of the relationship corresponding to the sentence to be processed and the entity in the sentence to be processed, determining an answer corresponding to the sentence to be processed, and outputting same. Common language model training methods are thus improved on the basis of language model transfer learning and graph transfer learning technology, achieving a higher level of accuracy by means of a smaller amount of manually labelled data, and being more suitable for constructing a knowledge graph question and answer system.

Description

Question answering method, device, computer equipment and storage medium based on artificial intelligence
This application claims priority to the Chinese patent application filed on July 19, 2019, with application number CN201910655550X and titled "Artificial intelligence-based question answering method, device, computer equipment and storage medium"; the entire content of that Chinese patent application is incorporated in this application by reference.
Technical field
This application relates to the technical field of knowledge graphs, and in particular to a question answering method, device, computer equipment, and storage medium based on artificial intelligence.
Background technique
A knowledge graph, also called a scientific knowledge map (known in library and information science as knowledge-domain visualization or knowledge-domain mapping), is a family of graphs that depict the development and structural relationships of knowledge. Because it provides high-quality structured data, knowledge graphs and knowledge-graph-based question answering systems are used in a growing number of fields, such as automatic question answering, search engines, and information extraction. A typical knowledge graph is expressed as triples of head entity, relation, and tail entity (for example, Yao Ming, nationality, China); this instance expresses the fact that Yao Ming's nationality is Chinese.
However, knowledge graph question answering technology is still at the exploration and research-and-development stage, and most results and progress appear in academic papers. The typical scheme is: according to the question raised by the user, the corresponding papers or website literature are retrieved from a database by keyword search, and the user clicks into the specific paper content to find what they need. This makes the processing of user questions inefficient and cannot meet users' requirements.
For knowledge graph question answering systems, whether open-domain or vertical-domain, accuracy is the main factor limiting wide application, and the main cause of insufficient accuracy is that the amount of labeled data is too small. Since annotation for a knowledge graph question answering system includes both entity recognition annotation and relation annotation, the cost of annotating data is huge; to build such a system quickly, it is important to reduce the amount of annotation data required.
Summary of the invention
The purpose of this application is to provide a question answering method, device, computer equipment, and storage medium based on artificial intelligence, to solve problems existing in the prior art.
To achieve the above objectives, this application provides a question answering method based on artificial intelligence, including:
performing language model training based on first training data, where the first training data is a large number of automatically labeled question corpora in a specified field;
performing NER model training based on second training data and the trained language model, where the second training data is manually labeled NER data, and each NER data item includes a question and the manually labeled NER mark corresponding to that question;
performing relation matching model training based on the second training data and the trained language model;
identifying entities in the sentence to be processed based on the trained NER model, and obtaining the relation corresponding to the sentence to be processed based on the trained relation matching model;
determining and outputting the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
To achieve the above objective, this application further provides an artificial intelligence-based question answering apparatus, including:
a language model training module, configured to perform language model training based on first training data, where the first training data is a large number of automatically labeled question corpora in a specified domain;
an NER model training module, configured to perform NER model training based on second training data and the trained language model, where the second training data is manually labeled NER data, and each piece of NER data includes a question and the manually labeled NER tags corresponding to that question;
a relation matching model training module, configured to perform relation matching model training based on the second training data and the trained language model;
an entity recognition module, configured to identify entities in a sentence to be processed based on the trained NER model;
a relation obtaining module, configured to obtain the relation corresponding to the sentence to be processed based on the trained relation matching model;
and an answer output module, configured to determine and output the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
To achieve the above objective, this application further provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the following steps of the artificial intelligence-based question answering method:
performing language model training based on first training data, where the first training data is a large number of automatically labeled question corpora in a specified domain;
performing NER model training based on second training data and the trained language model, where the second training data is manually labeled NER data, and each piece of NER data includes a question and the manually labeled NER tags corresponding to that question;
performing relation matching model training based on the second training data and the trained language model;
identifying entities in a sentence to be processed based on the trained NER model, and obtaining the relation corresponding to the sentence to be processed based on the trained relation matching model;
and determining and outputting the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
To achieve the above objective, this application further provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the following steps of the artificial intelligence-based question answering method:
performing language model training based on first training data, where the first training data is a large number of automatically labeled question corpora in a specified domain;
performing NER model training based on second training data and the trained language model, where the second training data is manually labeled NER data, and each piece of NER data includes a question and the manually labeled NER tags corresponding to that question;
performing relation matching model training based on the second training data and the trained language model;
identifying entities in a sentence to be processed based on the trained NER model, and obtaining the relation corresponding to the sentence to be processed based on the trained relation matching model;
and determining and outputting the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
The artificial intelligence-based question answering method, apparatus, computer device, and storage medium provided by this application are based on language model transfer learning and graph transfer learning techniques, which improve the training methods commonly used for language models and make them better suited to a knowledge graph question answering system, achieving higher accuracy with a smaller amount of manually labeled data. Specifically, the language model is pre-trained, and the NER model and the relation matching model are then trained separately based on the trained language model and a small amount of manually labeled data, so as to identify the entities in the sentence to be processed and the corresponding relation. During the training of the relation matching model, the graph structure is transferred into the relation matching model by training the relation matching model and the knowledge representation model alternately, which effectively improves the accuracy of the relation matching model, that is, the accuracy of relation identification. This achieves a stronger relation extraction capability, reduces the cost of manual participation, and improves the efficiency of constructing the knowledge graph.
Description of the Drawings
FIG. 1 is a flowchart of an embodiment of the artificial intelligence-based question answering method of this application;
FIG. 2 is a schematic diagram of the program modules of an embodiment of the artificial intelligence-based question answering apparatus of this application;
FIG. 3 is a schematic diagram of the hardware structure of an embodiment of the artificial intelligence-based question answering apparatus of this application.
Detailed Description of the Embodiments
To make the purpose, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of this application.
Embodiment One
Referring to FIG. 1, this application discloses an artificial intelligence-based question answering method, including the following steps:
S1: perform language model training based on first training data, where a large number of unlabeled question corpora in a specified domain are collected and the position of the entity in each question corpus is automatically labeled to obtain the first training data.
In step S1, a crawler may be used to collect a large number of unlabeled question corpora in a specified vertical domain (such as the medical domain, the marine domain, and other industries that emphasize depth of knowledge and high professional requirements). The question corpora are mainly question-and-answer interaction data, and the larger the corpus, the better (generally no fewer than 500,000 items). The task of the language model training is to predict a character in a sentence when that character is masked. As a preferred solution, in step S1, dictionary matching can be used to automatically mark which parts of a sentence constitute an entity (although this may be somewhat inaccurate). If an entity exists in a sentence, the training of the language model is focused on that part, that is, the language model predicts a character of the entity, so that the trained language model captures entity information well. If no entity is matched through the dictionary, a character is selected at random.
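The entity-focused automatic labeling and masking described above can be sketched as follows. This is a minimal illustration assuming a simple set-based entity dictionary and a `[MASK]` placeholder token; the function name, dictionary contents, and mask token are assumptions for illustration, not prescribed by this application:

```python
import random

def build_masked_sample(question, entity_dict, mask_token="[MASK]"):
    """Build one masked-LM sample, preferring to mask a character inside
    a dictionary-matched entity so the model learns entity information."""
    # Dictionary matching: find the first dictionary entity in the question
    # (inexact, as noted above).
    span = None
    for entity in entity_dict:
        start = question.find(entity)
        if start != -1:
            span = (start, start + len(entity))
            break
    if span is not None:
        # An entity was matched: mask one character inside it.
        pos = random.randrange(span[0], span[1])
    else:
        # No entity matched: fall back to a random character.
        pos = random.randrange(len(question))
    chars = list(question)
    target = chars[pos]          # the character the LM must predict
    chars[pos] = mask_token
    return "".join(chars), target, pos

# Example: the entity 阿托伐他丁 occupies positions 0-4, so the mask
# lands inside that span.
masked, target, pos = build_masked_sample(
    "阿托伐他丁防治冠心病应该怎么服用", {"阿托伐他丁"})
```

In a real pipeline the masked question and target character would be fed to the language model as an input/label pair; only the sample construction is shown here.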
A language model predicts the next word from the context; here it implements the function of predicting a character in a sentence when that character is masked. The choice of language model is not specifically restricted: the commonly used google transformer language model or an LSTM (Long Short-Term Memory) language model may be used, and the language model may be trained with a conventional training scheme in this field.
The training step is further explained below with the google transformer language model as an example. If google transformer is selected as the language model, the language model training may include the following steps: inputting the first training data into the embedding layer of the google transformer language model for vectorization; inputting the result into the encoding layer to obtain the self-attention calculation matrix; inputting the calculation matrix into the loss function and minimizing the loss function of the google transformer language model based on the Adam optimization algorithm; and finally saving the parameter settings of the google transformer language model.
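The self-attention calculation performed in the encoding layer can be illustrated with the following minimal single-head sketch in pure Python. A real transformer layer applies learned query/key/value projection matrices and multiple heads; those are simplified away here (identity projections), so only the core attention-matrix computation is shown:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(X):
    """Single-head self-attention with identity Q/K/V projections.
    X is a list of token embedding vectors; returns the attention
    weight matrix and the attended output vectors."""
    d = len(X[0])
    n = len(X)
    # scores[i][j] = (x_i . x_j) / sqrt(d), the scaled dot product
    scores = [[sum(a * b for a, b in zip(X[i], X[j])) / math.sqrt(d)
               for j in range(n)] for i in range(n)]
    # Each row of the weight matrix is a distribution over positions.
    weights = [softmax(row) for row in scores]
    # Output vectors are attention-weighted sums of the inputs.
    outputs = [[sum(weights[i][j] * X[j][k] for j in range(n))
                for k in range(d)] for i in range(n)]
    return weights, outputs
```

Each row of the returned weight matrix is a softmax distribution over the input positions, which is what lets a masked entity character be predicted from its surrounding context.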
S2: perform NER model training based on second training data and the trained language model, where during the training of the NER model, the parameters of the language model are also regarded as parameters of the upper NER model and are trained together with it. In step S2, the network of the NER model is added on top of the trained language model, and the NER model is trained on manually labeled NER data; during training, the parameters of the language model are treated as parameters of the upper NER model and trained together. By updating the parameters through gradient descent and repeating the training process, the loss function is continuously reduced and increasingly accurate predictions are obtained. The second training data is manually labeled NER data, and each piece of NER data includes a question and the manually labeled NER tags corresponding to that question; the amount of second training data required is on the order of thousands.
In this embodiment, in step S2, the NER model training includes:
S21: representing the questions in the second training data as vector sequences or matrices via the trained language model and inputting them into the NER model to output predicted NER tags;
S22: comparing the manually labeled NER tags with the corresponding predicted NER tags and computing a loss function; in this embodiment, categorical cross entropy is selected as the loss function;
S23: updating and optimizing the parameters of the language model and the NER model based on a gradient optimization algorithm.
The above training steps are generic to NER models, such as the LSTM+CRF model or other entity recognition models.
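The loss computation of step S22 can be sketched as follows for per-token tag distributions. The BIO tag set and the probability values are illustrative examples, not data from this application:

```python
import math

def categorical_cross_entropy(pred_probs, gold_tags, tag_index):
    """Average categorical cross entropy between predicted per-token tag
    distributions and gold NER tags (step S22)."""
    total = 0.0
    for probs, tag in zip(pred_probs, gold_tags):
        # Negative log-probability assigned to the correct tag.
        total += -math.log(probs[tag_index[tag]])
    return total / len(gold_tags)

# Toy example: three tokens tagged with a BIO scheme.
tag_index = {"B": 0, "I": 1, "O": 2}
pred = [[0.8, 0.1, 0.1],   # token 1: mostly "B"
        [0.2, 0.7, 0.1],   # token 2: mostly "I"
        [0.1, 0.1, 0.8]]   # token 3: mostly "O"
gold = ["B", "I", "O"]
loss = categorical_cross_entropy(pred, gold, tag_index)
```

In step S23 the gradient of this loss would then be used to update both the NER head and the underlying language model parameters.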
S3: perform relation matching model training based on the second training data and the trained language model. During training, each time the question vectors and relation vectors complete one epoch of relation matching model training, the relation vectors alone go through one further epoch of knowledge representation model training; the relation matching model training and the knowledge representation model training alternate in this way until all epochs are completed. Here, the question vectors are output by the trained language model from the questions in the second training data, the relations in those questions are represented as relation vectors randomly initialized via an embedding layer, and one epoch refers to one complete pass of all the data through the network, i.e., one forward computation and one backpropagation.
An entity has multiple relations and attributes in the knowledge graph, but which one best matches the question must be confirmed by the relation matching model; in general, the single correct relation has to be selected from among 100 to 200 candidate relations. For example, for the sentence to be processed "How should atorvastatin be taken to prevent and treat coronary heart disease?" (阿托伐他丁防治冠心病应该怎么服用?), the relation is: "How should <drug> be taken to prevent and treat <disease>?", where <drug> and <disease> both denote medical entities. In this embodiment, the semantics and the knowledge graph structure are combined through the pre-training of the language model and the pre-training of the knowledge representation. The relation matching model is trained for one epoch at a time; during training, the language model is included in the relation matching model, that is, all parameters of the language model are regarded as parameters of the relation matching model and updated during training. The relation vectors are also updated in the relation matching model, and at the end of each epoch the relation vectors are additionally put into a knowledge representation model such as the HolE model (holographic embeddings of knowledge graphs) for one epoch of training. Alternating the relation matching model training with the knowledge representation model training in this way enables better relation matching.
The relation matching model training includes the following steps:
S31: encoding the questions in the second training data via the trained language model and outputting question vectors. In this step, a copy of the trained language model is made first, and the question vector representations are then output by that language model; the amount of second training data required is again on the order of thousands;
S32: representing the relations in the questions of the second training data as relation vectors randomly initialized via an embedding layer; that is, in step S32, the relation vectors for the second training data are randomly initialized;
S33: relation matching model training. During training, the question vectors and relation vectors participate in one epoch of relation matching model training, and the parameters of the language model and the relation matching model are updated and optimized based on a gradient optimization algorithm; at the end of each epoch, the relation vectors are then put into the knowledge representation model for one epoch of training, and the two training processes alternate until all epochs are completed.
In step S33, the semantics and the knowledge graph structure are combined through the pre-training of the language model and the pre-training of the knowledge representation. The relation matching model is trained for one epoch at a time; during training, the language model is included in the relation matching model, that is, all parameters of the language model are regarded as parameters of the relation matching model and updated during training, i.e., the language model is fine-tuned. The relation vectors are also updated in the relation matching model, and at the end of each epoch the relation vectors are put into a knowledge representation model such as the HolE model (holographic embeddings of knowledge graphs) for one epoch of training. Alternating the relation matching model training with the knowledge representation model training enables better relation matching.
S34: saving the parameters of the google transformer language model and of the relation matching model.
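The alternating schedule of step S33 can be sketched as follows. The two callbacks stand in for one epoch of relation matching training (which updates the language model, the matcher, and the relation vectors) and one epoch of knowledge representation training (e.g. with a HolE-style model); their internals are omitted here and are assumptions for illustration:

```python
def alternating_training(num_epochs, train_relation_matching, train_knowledge_repr):
    """Alternating schedule of step S33: each epoch of relation matching
    training is followed by one epoch of knowledge representation training
    on the shared relation vectors."""
    log = []
    for epoch in range(num_epochs):
        # One epoch of relation matching: updates LM + matcher + relation vectors.
        train_relation_matching(epoch)
        log.append(("match", epoch))
        # One epoch of knowledge representation: updates relation vectors
        # against the knowledge graph structure.
        train_knowledge_repr(epoch)
        log.append(("kg", epoch))
    return log

schedule = alternating_training(2, lambda e: None, lambda e: None)
```

Because the relation vectors are shared between the two phases, each knowledge representation epoch injects graph structure into the vectors that the relation matcher uses in the next epoch.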
The relation matching model itself is not limited: any general semantic matching model may be used, for example one based on ESIM (Enhanced Sequential Inference Model) or another semantic matching model.
S4: identify entities in a sentence to be processed based on the trained NER model, and obtain the relation corresponding to the sentence to be processed based on the trained relation matching model.
S5: determine and output the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
It can be understood that each entity has corresponding node content in the knowledge graph. In a specific implementation, the content corresponding to the relation of the sentence to be processed is looked up in the knowledge graph node corresponding to the entity in the sentence to be processed; the content found is determined as the answer corresponding to the sentence to be processed, and the answer is output.
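With the knowledge graph represented as a nested dictionary, the lookup described above can be sketched as follows; the entity, relation string, and node content are hypothetical example data, not contents of any real graph:

```python
def answer_question(kg, entity, relation):
    """Look up the node of the recognized entity and return the content
    stored under the matched relation (steps S4-S5)."""
    node = kg.get(entity)
    if node is None:
        # Entity not present in the knowledge graph: no answer.
        return None
    return node.get(relation)

# Hypothetical graph fragment: one entity node with one relation.
kg = {
    "atorvastatin": {
        "how should <drug> be taken to prevent and treat <disease>":
            "content stored at the node for this relation",
    },
}
```

Given the entity identified by the NER model and the relation selected by the relation matching model, `answer_question` returns the stored node content, and returns `None` when the entity is not in the graph.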
In summary, the artificial intelligence-based question answering method shown in this application is based on language model transfer learning and graph transfer learning techniques, improving the training methods commonly used for language models and making them better suited to a knowledge graph question answering system, so that higher accuracy can be achieved with a smaller amount of manually labeled data. Specifically, the language model is pre-trained, and the NER model and the relation matching model are then trained separately based on the trained language model and a small amount of manually labeled data, so as to identify the entities in the sentence to be processed and the corresponding relation. During the training of the relation matching model, the graph structure is transferred into the relation matching model by training the relation matching model and the knowledge representation model alternately, which effectively improves the accuracy of the relation matching model, that is, the accuracy of relation identification. This achieves a stronger relation extraction capability, reduces the cost of manual participation, and improves the efficiency of constructing the knowledge graph.
Embodiment Two
Referring now to FIG. 2, this application shows an artificial intelligence-based question answering apparatus 10. In this embodiment, the artificial intelligence-based question answering apparatus 10 may include or be divided into one or more program modules, and the one or more program modules are stored in a storage medium and executed by one or more processors to complete this application and implement the above artificial intelligence-based question answering method. A program module in this application refers to a series of computer program instruction segments capable of completing specific functions, and is better suited than the program itself to describing the execution process of the artificial intelligence-based question answering apparatus 10 in the storage medium.
The following description specifically introduces the functions of the program modules of this embodiment:
The artificial intelligence-based question answering apparatus 10 shown in this application includes:
a language model training module 11, configured to perform language model training based on first training data, where the first training data is a large number of automatically labeled question corpora in a specified domain;
an NER model training module 12, configured to perform NER model training based on second training data and the trained language model, where the second training data is manually labeled NER data, and each piece of NER data includes a question and the manually labeled NER tags corresponding to that question;
a relation matching model training module 13, configured to perform relation matching model training based on the second training data and the trained language model;
an entity recognition module 14, configured to identify entities in a sentence to be processed based on the trained NER model;
a relation obtaining module 15, configured to obtain the relation corresponding to the sentence to be processed based on the trained relation matching model;
and an answer output module 16, configured to determine and output the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
As a preferred solution, the language model training module 11 includes a first training data acquisition sub-module, configured to collect a large number of unlabeled question corpora in a specified domain and automatically label the position of the entity in each question corpus to obtain the first training data.
Further, in the first training data acquisition sub-module, the position of the entity in each question corpus is automatically labeled via dictionary matching; if no entity is matched in a question corpus, a position is selected at random.
As a preferred solution, the language model is the google transformer language model, and the language model training module further includes:
a vectorization sub-module, configured to input the first training data into the embedding layer of the google transformer language model for vectorization;
a matrix acquisition sub-module, configured to input the vectors into the encoding layer to obtain the self-attention calculation matrix;
a first optimization sub-module, configured to input the calculation matrix into the loss function and to update and optimize the parameters of the google transformer language model based on a gradient optimization algorithm;
a first saving sub-module, configured to save the parameter settings of the google transformer language model.
As a preferred solution, the NER model training module 12 includes:
a predicted NER tag acquisition sub-module, configured to represent the questions in the second training data as vector sequences or matrices via the trained language model and input them into the NER model to output predicted NER tags;
a comparison sub-module, configured to compare the manually labeled NER tags with the corresponding predicted NER tags and compute a loss function;
a second optimization sub-module, configured to update and optimize the parameters of the language model and the NER model based on a gradient optimization algorithm.
As a preferred solution, the relation matching model training module 13 includes:
a question vector acquisition sub-module, configured to obtain the question vectors of the second training data;
a relation vector acquisition sub-module, configured to obtain the relation vectors of the second training data;
a training sub-module, configured to perform attention-mechanism interaction and training on the question vector and relation vector outputs. During training, the question vectors and relation vectors participate in one epoch of relation matching model training, and the parameters of the language model and the relation matching model are updated and optimized based on a gradient optimization algorithm; at the end of each epoch, the relation vectors are then put into the knowledge representation model for one epoch of training, and the two training processes alternate until all epochs are completed;
a third saving sub-module, configured to save the parameters of the language model and the relation matching model.
Further, in the question vector acquisition sub-module, the questions in the second training data yield the question vectors output by the trained language model.
Further, in the relation vector acquisition sub-module, the relations in the questions of the second training data are represented as relation vectors randomly initialized via an embedding layer.
In summary, the artificial intelligence-based question answering apparatus 10 shown in this application is based on language model transfer learning and graph transfer learning techniques, improving the training methods commonly used for language models and making them better suited to a knowledge graph question answering system, so that higher accuracy can be achieved with a smaller amount of manually labeled data. Specifically, the language model is pre-trained, and the NER model and the relation matching model are then trained separately based on the trained language model and a small amount of manually labeled data, so as to identify the entities in the sentence to be processed and the corresponding relation. During the training of the relation matching model, the graph structure is transferred into the relation matching model by training the relation matching model and the knowledge representation model alternately, which effectively improves the accuracy of the relation matching model, that is, the accuracy of relation identification. This achieves a stronger relation extraction capability, reduces the cost of manual participation, and improves the efficiency of constructing the knowledge graph.
Embodiment Three
This application further provides a computer device capable of executing a program, such as a smartphone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple servers). The computer device 20 of this embodiment at least includes, but is not limited to, a memory 21 and a processor 22 that can be communicatively connected to each other through a system bus, as shown in FIG. 3. It should be noted that FIG. 3 only shows the computer device 20 with the components 21-22, but it should be understood that not all of the illustrated components are required to be implemented; more or fewer components may be implemented instead.
In this embodiment, the memory 21 (i.e., a readable storage medium) includes flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 20, such as its hard disk or main memory. In other embodiments, the memory 21 may also be an external storage device of the computer device 20, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the computer device 20. Of course, the memory 21 may also include both the internal storage unit of the computer device 20 and its external storage device. In this embodiment, the memory 21 is generally used to store the operating system and various application software installed on the computer device 20, such as the program code of the artificial intelligence-based question answering apparatus 10 of Embodiment 1. In addition, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 22 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip. The processor 22 is generally used to control the overall operation of the computer device 20. In this embodiment, the processor 22 is configured to run the program code stored in the memory 21 or to process data, for example to run the artificial intelligence-based question answering apparatus 10, so as to implement the artificial intelligence-based question answering method of Embodiment 1.
Embodiment 4
This application further provides a computer-readable storage medium, such as flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, server, app store, and the like, on which a computer program is stored that implements the corresponding functions when executed by a processor. The computer-readable storage medium of this embodiment is used to store the artificial intelligence-based question answering apparatus 10, which, when executed by a processor, implements the artificial intelligence-based question answering method of Embodiment 1.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware; in many cases, however, the former is the preferred implementation.
The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structural or process transformation made using the contents of the description and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims (20)

  1. An artificial intelligence-based question answering method, characterized by comprising the following steps:
    performing language model training based on first training data, the first training data being a large quantity of automatically annotated question corpora in a specified domain;
    performing NER model training based on second training data and the trained language model, the second training data being manually annotated NER data, each item of the NER data comprising a question and the manually annotated NER tag corresponding to the question;
    performing relation matching model training based on the second training data and the trained language model;
    recognizing entities in a sentence to be processed based on the trained NER model, and obtaining the relation corresponding to the sentence to be processed based on the trained relation matching model;
    determining and outputting the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
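The steps of claim 1 can be sketched end to end. This is a minimal, hypothetical illustration: run NER on the question, match a relation, then use the (entity, relation) pair to look up the answer in the knowledge graph. All class names and the toy data are illustrative assumptions, not the patent's actual trained models.

```python
class KeywordNER:
    """Stand-in for the trained NER model: finds a known entity by substring."""
    def __init__(self, entities):
        self.entities = entities

    def extract_entity(self, question):
        return next((e for e in self.entities if e in question), None)


class KeywordRelationMatcher:
    """Stand-in for the trained relation matching model."""
    def __init__(self, relations):
        self.relations = relations

    def match_relation(self, question):
        return next((r for r in self.relations if r in question), None)


def answer_question(question, ner_model, rel_model, knowledge_graph):
    entity = ner_model.extract_entity(question)      # entity recognition step
    relation = rel_model.match_relation(question)    # relation matching step
    # answer determination step: (entity, relation) indexes the knowledge graph
    return knowledge_graph.get((entity, relation))


# Toy knowledge graph: (entity, relation) -> answer
kg = {("aspirin", "indication"): "pain and fever relief"}
ner = KeywordNER(["aspirin"])
matcher = KeywordRelationMatcher(["indication"])
print(answer_question("What is the indication of aspirin?", ner, matcher, kg))
```

In the claimed method the keyword stand-ins would be replaced by the trained NER and relation matching models of claims 5 and 6; only the lookup structure is the point here.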
  2. The artificial intelligence-based question answering method according to claim 1, wherein a large quantity of unlabeled question corpora in a specified vertical domain is collected via a crawler tool.
  3. The artificial intelligence-based question answering method according to claim 1 or 2, wherein the position of the entity in each of the question corpora is automatically annotated via dictionary matching, and if no entity is matched in a question corpus, a position is selected at random.
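The auto-annotation of claim 3 can be sketched as follows. This is an assumption-laden illustration: the dictionary lookup is a plain substring search, and the random-span fallback policy is one possible reading of "selected at random"; the patent does not fix either detail.

```python
import random

def auto_label(question, entity_dict, rng=None):
    """Annotate the character span of a dictionary entity in the question.

    If no dictionary entry matches, fall back to a random span as a noisy
    label, as claim 3 describes (fallback policy here is illustrative).
    """
    for entity in entity_dict:
        start = question.find(entity)
        if start != -1:
            return question, (start, start + len(entity))  # matched span
    # no dictionary match: pick a random span instead
    rng = rng or random.Random(0)
    start = rng.randrange(len(question))
    end = rng.randrange(start + 1, len(question) + 1)
    return question, (start, end)

q, span = auto_label("what does aspirin treat", ["aspirin", "ibuprofen"])
print(span)  # (10, 17): "aspirin" starts at character index 10
```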
  4. The artificial intelligence-based question answering method according to claim 1, wherein the language model is a google transformer language model, and training the google transformer language model comprises:
    inputting the first training data into the embedding layer of the google transformer language model for vectorization;
    inputting the vectorized first training data into the encoding layer to obtain a self-attention computation matrix;
    inputting the self-attention computation matrix into the loss function of the google transformer language model, and updating and optimizing the parameters of the google transformer language model based on a gradient optimization algorithm;
    saving the parameter settings of the google transformer language model.
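The embedding and self-attention steps of claim 4 can be illustrated with a small numpy sketch: vectorize token ids with an embedding table, then run one scaled dot-product self-attention computation. The shapes, the single layer, and the random weights are assumptions for illustration; the actual google transformer model stacks many such layers and trains them against a language-modeling loss with a gradient optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, d_model = 50, 8
embedding = rng.normal(size=(vocab, d_model))            # embedding layer
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))

def self_attention(X, Wq, Wk, Wv):
    """One scaled dot-product self-attention computation over token vectors X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d_model)                  # scaled dot product
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                   # softmax over positions
    return w @ V                                         # attention output

token_ids = np.array([3, 17, 42])                        # a toy "question"
X = embedding[token_ids]                                 # vectorization step
out = self_attention(X, Wq, Wk, Wv)                      # encoding-layer step
print(out.shape)                                         # (3, 8)
```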
  5. The artificial intelligence-based question answering method according to claim 1, wherein the NER model training comprises:
    processing the questions in the second training data through the trained language model to obtain a vector sequence or matrix, and inputting it into the NER model to output predicted NER tags;
    comparing the manually annotated NER tags with the corresponding predicted NER tags, and calculating the loss function of the NER model;
    updating and optimizing the parameters of the language model and the NER model based on a gradient optimization algorithm.
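The comparison-and-loss step of claim 5 can be sketched as a token-level cross-entropy between predicted tag scores and the manually annotated tags. The tag set, logit shapes, and choice of cross-entropy are illustrative assumptions; the claim itself only requires comparing predictions with annotations to compute a loss that a gradient optimizer then minimizes.

```python
import numpy as np

def ner_loss(pred_logits, gold_tags):
    """Mean cross-entropy between predicted tag scores and annotated tags.

    pred_logits: (num_tokens, num_tags) scores from the NER model.
    gold_tags:   (num_tokens,) integer ids of the manually annotated tags.
    """
    shifted = pred_logits - pred_logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    # negative log-likelihood of the gold tag at each token position
    nll = -np.log(probs[np.arange(len(gold_tags)), gold_tags])
    return nll.mean()

# With uniform logits the loss equals log(num_tags); confident correct
# predictions drive it toward zero, which is what the gradient step on the
# language model and NER model parameters would pursue.
uniform = ner_loss(np.zeros((3, 4)), np.array([0, 1, 2]))
print(round(uniform, 4))  # 1.3863, i.e. log(4)
```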
  6. The artificial intelligence-based question answering method according to claim 1, wherein the relation matching model training comprises:
    obtaining question vectors and relation vectors of the second training data;
    performing attention-mechanism interaction and training on the question vectors and the relation vectors, wherein the question vectors and the relation vectors participate in one epoch of relation matching model training, and the parameters of the language model and the relation matching model are updated and optimized based on a gradient optimization algorithm; meanwhile, at the end of each epoch, the relation vectors are further placed into the knowledge representation model for one epoch of training, the two training processes alternating until all epochs have been processed;
    saving the parameters of the language model and the relation matching model.
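The alternating schedule of claim 6 can be sketched in a few lines: each epoch, the relation matching model trains on the shared relation vectors, then the knowledge representation model trains on the same vectors, so graph structure flows into the matcher through that shared state. The stub models and their method names are illustrative; the actual models are the attention-based matcher and knowledge representation model described above.

```python
def alternate_training(rel_matcher, kg_model, relation_vectors, num_epochs):
    """Alternate one relation-matching epoch with one knowledge-representation
    epoch on the shared relation vectors, as claim 6 describes."""
    for _ in range(num_epochs):
        rel_matcher.train_epoch(relation_vectors)  # one epoch of relation matching
        kg_model.train_epoch(relation_vectors)     # one epoch of knowledge representation
    return relation_vectors


class StubModel:
    """Records the order in which the two models are trained."""
    def __init__(self, name, log):
        self.name, self.log = name, log

    def train_epoch(self, vectors):
        self.log.append(self.name)


log = []
alternate_training(StubModel("matcher", log), StubModel("kg", log), {}, num_epochs=2)
print(log)  # ['matcher', 'kg', 'matcher', 'kg']
```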
  7. The artificial intelligence-based question answering method according to claim 6, wherein the questions in the second training data yield the question vectors through the trained language model;
    the relations in the questions in the second training data are randomly initialized via an embedding layer and expressed as relation vectors.
  8. An artificial intelligence-based question answering apparatus, characterized by comprising:
    a language model training module, configured to perform language model training based on first training data, the first training data being a large quantity of automatically annotated question corpora in a specified domain;
    an NER model training module, configured to perform NER model training based on second training data and the trained language model, the second training data being manually annotated NER data, each item of the NER data comprising a question and the manually annotated NER tag corresponding to the question;
    a relation matching model training module, configured to perform relation matching model training based on the second training data and the trained language model;
    an entity recognition module, configured to recognize entities in a sentence to be processed based on the trained NER model;
    a relation acquisition module, configured to obtain the relation corresponding to the sentence to be processed based on the trained relation matching model;
    and an answer output module, configured to determine and output the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
  9. The artificial intelligence-based question answering apparatus according to claim 8, wherein the language model training module comprises a first training data acquisition submodule configured to collect, via a crawler tool, a large quantity of unlabeled question corpora in the specified domain and to automatically annotate the position of the entity in each corpus, so as to obtain the first training data.
  10. The artificial intelligence-based question answering apparatus according to claim 8, wherein in the first training data acquisition submodule, the position of the entity in each of the question corpora is automatically annotated via dictionary matching, and if no entity is matched in a question corpus, a position is selected at random.
  11. The artificial intelligence-based question answering apparatus according to claim 8, wherein the language model is a google transformer language model, and the language model training module comprises:
    a vectorization submodule, configured to input the first training data into the embedding layer of the google transformer language model for vectorization;
    a matrix acquisition submodule, configured to input the vectorized first training data into the encoding layer to obtain a self-attention computation matrix;
    a first optimization submodule, configured to input the computation matrix into the loss function and to update and optimize the parameters of the google transformer language model based on a gradient optimization algorithm;
    a first saving submodule, configured to save the parameter settings of the google transformer language model.
  12. The artificial intelligence-based question answering apparatus according to claim 8, wherein the NER model training module comprises:
    a predicted NER tag acquisition submodule, configured to express the questions in the second training data as a vector sequence or matrix through the trained language model and input it into the NER model to output predicted NER tags;
    a comparison submodule, configured to compare the manually annotated NER tags with the corresponding predicted NER tags and calculate a loss function;
    a second optimization submodule, configured to update and optimize the parameters of the language model and the NER model based on a gradient optimization algorithm.
  13. The artificial intelligence-based question answering apparatus according to claim 8, wherein the relation matching model training module comprises:
    a question vector acquisition submodule, configured to obtain the question vectors of the second training data;
    a relation vector acquisition submodule, configured to obtain the relation vectors of the second training data;
    a training submodule, configured to perform attention-mechanism interaction and training on the question vectors and the relation vectors, wherein during training the question vectors and the relation vectors participate in one epoch of relation matching model training, and the parameters of the language model and the relation matching model are updated and optimized based on a gradient optimization algorithm; meanwhile, at the end of each epoch, the relation vectors are further placed into the knowledge representation model for one epoch of training, the two training processes alternating until all epochs have been processed;
    a third saving submodule, configured to save the parameters of the language model and the relation matching model.
  14. The artificial intelligence-based question answering apparatus according to claim 13, wherein in the question vector acquisition submodule, the questions in the second training data yield the question vectors through the trained language model.
  15. The artificial intelligence-based question answering apparatus according to claim 13, wherein in the relation vector acquisition submodule, the relations in the questions in the second training data are randomly initialized via an embedding layer and expressed as relation vectors.
  16. A computer device, characterized by comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, the following steps of an artificial intelligence-based question answering method are implemented:
    performing language model training based on first training data, the first training data being a large quantity of automatically annotated question corpora in a specified domain;
    performing NER model training based on second training data and the trained language model, the second training data being manually annotated NER data, each item of the NER data comprising a question and the manually annotated NER tag corresponding to the question;
    performing relation matching model training based on the second training data and the trained language model;
    recognizing entities in a sentence to be processed based on the trained NER model, and obtaining the relation corresponding to the sentence to be processed based on the trained relation matching model;
    determining and outputting the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
  17. The computer device according to claim 16, wherein the language model is a google transformer language model, and training the google transformer language model comprises:
    inputting the first training data into the embedding layer of the google transformer language model for vectorization;
    inputting the vectorized first training data into the encoding layer to obtain a self-attention computation matrix;
    inputting the self-attention computation matrix into the loss function of the google transformer language model, and updating and optimizing the parameters of the google transformer language model based on a gradient optimization algorithm;
    saving the parameter settings of the google transformer language model.
  18. The computer device according to claim 16, wherein the NER model training comprises:
    processing the questions in the second training data through the trained language model to obtain a vector sequence or matrix, and inputting it into the NER model to output predicted NER tags;
    comparing the manually annotated NER tags with the corresponding predicted NER tags, and calculating the loss function of the NER model;
    updating and optimizing the parameters of the language model and the NER model based on a gradient optimization algorithm.
  19. A computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by a processor, the following steps of an artificial intelligence-based question answering method are implemented:
    performing language model training based on first training data, the first training data being a large quantity of automatically annotated question corpora in a specified domain;
    performing NER model training based on second training data and the trained language model, the second training data being manually annotated NER data, each item of the NER data comprising a question and the manually annotated NER tag corresponding to the question;
    performing relation matching model training based on the second training data and the trained language model;
    recognizing entities in a sentence to be processed based on the trained NER model, and obtaining the relation corresponding to the sentence to be processed based on the trained relation matching model;
    determining and outputting the answer corresponding to the sentence to be processed according to the relation corresponding to the sentence to be processed and the entities in the sentence to be processed.
  20. The computer-readable storage medium according to claim 19, wherein the language model is a google transformer language model, and training the google transformer language model comprises:
    inputting the first training data into the embedding layer of the google transformer language model for vectorization;
    inputting the vectorized first training data into the encoding layer to obtain a self-attention computation matrix;
    inputting the self-attention computation matrix into the loss function of the google transformer language model, and updating and optimizing the parameters of the google transformer language model based on a gradient optimization algorithm;
    saving the parameter settings of the google transformer language model.
PCT/CN2019/117954 2019-07-19 2019-11-13 Artificial intelligence-based question and answer method and apparatus, computer device, and storage medium WO2021012519A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910655550.XA CN110532397B (en) 2019-07-19 2019-07-19 Question-answering method and device based on artificial intelligence, computer equipment and storage medium
CN201910655550.X 2019-07-19

Publications (1)

Publication Number Publication Date
WO2021012519A1

Family

ID=68661863

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117954 WO2021012519A1 (en) 2019-07-19 2019-11-13 Artificial intelligence-based question and answer method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN110532397B (en)
WO (1) WO2021012519A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113641830A (en) * 2021-07-19 2021-11-12 北京百度网讯科技有限公司 Model pre-training method and device, electronic equipment and storage medium
CN113743095A (en) * 2021-07-19 2021-12-03 西安理工大学 Chinese problem generation unified pre-training method based on word lattice and relative position embedding
CN116483982A (en) * 2023-06-25 2023-07-25 北京中关村科金技术有限公司 Knowledge question-answering method, knowledge question-answering device, electronic equipment and readable storage medium
CN117708306A (en) * 2024-02-06 2024-03-15 神州医疗科技股份有限公司 Medical question-answering architecture generation method and system based on layered question-answering structure
CN118052291A (en) * 2024-04-16 2024-05-17 北京海纳数聚科技有限公司 Vertical domain large language model training method based on expansion causal graph embedding
CN118093841A (en) * 2024-04-25 2024-05-28 浙江大学 Model training method and question-answering method for question-answering system

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209384B (en) * 2020-01-08 2023-08-15 腾讯科技(深圳)有限公司 Question-answer data processing method and device based on artificial intelligence and electronic equipment
CN111259127B (en) * 2020-01-15 2022-05-31 浙江大学 Long text answer selection method based on transfer learning sentence vector
CN111368058B (en) * 2020-03-09 2023-05-02 昆明理工大学 Question-answer matching method based on transfer learning
CN111538843B (en) * 2020-03-18 2023-06-16 广州多益网络股份有限公司 Knowledge-graph relationship matching method and model building method and device in game field
CN111950297A (en) * 2020-08-26 2020-11-17 桂林电子科技大学 Abnormal event oriented relation extraction method
CN113254612A (en) * 2021-05-24 2021-08-13 中国平安人寿保险股份有限公司 Knowledge question-answering processing method, device, equipment and storage medium
CN113779360A (en) * 2021-08-18 2021-12-10 深圳技术大学 Multi-head question-answering model-based question solving method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3179384A1 (en) * 2014-09-29 2017-06-14 Huawei Technologies Co., Ltd. Method and device for parsing interrogative sentence in knowledge base
CN106934012A (en) * 2017-03-10 2017-07-07 上海数眼科技发展有限公司 A kind of question answering in natural language method and system of knowledge based collection of illustrative plates
CN107748757A (en) * 2017-09-21 2018-03-02 北京航空航天大学 A kind of answering method of knowledge based collection of illustrative plates
CN109684354A (en) * 2017-10-18 2019-04-26 北京国双科技有限公司 Data query method and apparatus

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446286B (en) * 2017-02-16 2023-04-25 阿里巴巴集团控股有限公司 Method, device and server for generating natural language question answers
CN108182262B (en) * 2018-01-04 2022-03-04 华侨大学 Intelligent question-answering system construction method and system based on deep learning and knowledge graph
CN108256065B (en) * 2018-01-16 2021-11-09 智言科技(深圳)有限公司 Knowledge graph reasoning method based on relation detection and reinforcement learning
CN108280062A (en) * 2018-01-19 2018-07-13 北京邮电大学 Entity based on deep learning and entity-relationship recognition method and device
CN109902171B (en) * 2019-01-30 2020-12-25 中国地质大学(武汉) Text relation extraction method and system based on hierarchical knowledge graph attention model



Also Published As

Publication number Publication date
CN110532397A (en) 2019-12-03
CN110532397B (en) 2023-06-09


Legal Events

- 121 — EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19938934; Country of ref document: EP; Kind code of ref document: A1)
- NENP — Non-entry into the national phase (Ref country code: DE)
- 122 — EP: PCT application non-entry in European phase (Ref document number: 19938934; Country of ref document: EP; Kind code of ref document: A1)