
Event extraction model training method, event extraction method and related equipment

Info

Publication number
CN115525776A
CN115525776A
Authority
CN
China
Prior art keywords
extraction model
event
knowledge
event extraction
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211351675.1A
Other languages
Chinese (zh)
Inventor
刘珮
钱兵
谢汉垒
薛艳茹
马冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2022-10-31
Filing date: 2022-10-31
Publication date: 2022-12-27
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN202211351675.1A
Publication of CN115525776A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an event extraction model training method, an event extraction method, and related equipment. The event extraction model training method comprises the following steps: generating a knowledge graph based on guidance information, expert cases, and a communication dictionary; encoding the knowledge graph to obtain a knowledge graph code; encoding an event case to obtain a text code; fusing the knowledge graph code and the text code to obtain a fusion code; inputting the fusion code into a first extraction model to obtain a pseudo data label of the event case; and training a second extraction model based on the event case and the pseudo data label. The method and the device improve the generalization ability and extraction accuracy of the event extraction model when only a small amount of sample data is available.

Description

Event extraction model training method, event extraction method and related equipment
Technical Field
The invention relates to the field of natural language processing, in particular to an event extraction model training method, an event extraction method and related equipment.
Background
In a specialized field such as communications, domain experts or strategic planning experts must make many decisions in response to many kinds of problems. Before a decision can be made, research is required, that is, information about the problem must be gathered. Event extraction can obtain event information automatically, and in an era of information explosion this automation greatly reduces the time spent screening information.
However, a conventional event extraction method generally comprises several subtasks: detecting trigger words, identifying the event category, detecting arguments, and identifying argument roles. The first step requires labeled trigger words; without them, the subsequent steps cannot proceed. Supervised event extraction therefore depends on manually labeled data, and separate models must be trained to complete the subtasks one by one; chaining the models propagates errors, so the performance of each later model depends heavily on the preceding one. In addition, the models lack prior information and depend markedly on the data, which increases the learning difficulty and degrades performance when sample data is scarce.
How to improve the generalization ability and extraction accuracy of an event extraction model when sample data is scarce is therefore an urgent technical problem for those skilled in the art.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the invention, and therefore may contain information that does not constitute prior art already known to a person of ordinary skill in the art.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide an event extraction model training method, apparatus, device, and storage medium that overcome the difficulties of the prior art and improve the generalization ability and extraction accuracy of an event extraction model when only a small amount of sample data is available.
An embodiment of the invention provides an event extraction model training method, comprising the following steps:
generating a knowledge graph based on guidance information, expert cases, and a communication dictionary;
encoding the knowledge graph to obtain a knowledge graph code;
encoding an event case to obtain a text code;
fusing the knowledge graph code and the text code to obtain a fusion code;
inputting the fusion code into a first extraction model to obtain a pseudo data label of the event case;
and training a second extraction model based on the event case and the pseudo data label.
In some embodiments of the present application, generating the knowledge graph based on the guidance information, the expert cases, and the communication dictionary comprises:
converting the format of the guidance information to obtain knowledge blocks;
and generating a knowledge block tree structure based on the resulting text data, wherein the root node of the knowledge block tree structure is a file name, the leaf nodes are knowledge blocks, and the remaining nodes are multi-level titles.
In some embodiments of the present application, encoding the knowledge graph to obtain the knowledge graph code comprises:
extracting a first entity from the expert case;
searching the knowledge graph for associated entities and entity relationships based on the first entity;
and encoding the entities and entity relationships.
In some embodiments of the present application, encoding the entities and entity relationships comprises:
encoding the entities and entity relationships using a graph neural network or the TransE algorithm.
In some embodiments of the present application, fusing the knowledge graph code and the text code to obtain the fusion code comprises:
concatenating, multiplying, adding, or computing a weighted sum of the knowledge graph code and the text code to obtain the fusion code.
In some embodiments of the present application, the first extraction model is a DMCNN-based event extraction model or a composite event extraction model combining ALBERT, BiLSTM, and CRF.
In some embodiments of the present application, the second extraction model is an event extraction model based on a full sub-graph search.
According to another aspect of the present application, there is also provided an event extraction method, including:
inputting an event to be extracted into a second extraction model, wherein the second extraction model is trained by the event extraction model training method described above;
and obtaining the entities output by the second extraction model.
According to another aspect of the present application, there is also provided an event extraction model training apparatus, including:
a knowledge graph generation module configured to generate a knowledge graph based on the guidance information, the expert cases, and the communication dictionary;
the knowledge graph coding module is configured to code the knowledge graph to obtain knowledge graph codes;
the event coding module is configured to code the event case to obtain a text code;
a fusion module configured to fuse the knowledge-graph code and the text code to obtain a fusion code;
a pseudo data label obtaining module configured to input the fusion code into a first extraction model to obtain a pseudo data label of the event case;
a training module configured to train a second extraction model based on the event cases and the pseudo data labels.
According to another aspect of the present application, there is also provided an event extraction device, including:
an input module configured to input an event to be extracted to a second extraction model, the second extraction model being trained via an event extraction model training method as described above;
an extraction module configured to obtain an entity output by the second extraction model.
According to still another aspect of the present invention, there is also provided a processing apparatus including:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform the steps of the event extraction model training method as described above via execution of the executable instructions.
Embodiments of the present invention also provide a computer-readable storage medium for storing a program, which when executed, implements the steps of the above-described event extraction model training method.
Compared with the prior art, the invention aims to achieve the following:
Background knowledge is added to training on a small-scale data set: a knowledge graph constructed from prior knowledge is added to the model as background knowledge to assist its reasoning, improving generalization and reducing overfitting; and because the knowledge graph contains multi-hop knowledge that the text alone does not provide, it can greatly improve the reasoning ability of the model. In addition, to address the problem that current event extraction models are effective but inefficient, model distillation is used: the training result of the effective model is distilled into an efficient model, so the final model balances accuracy and efficiency, greatly improving both the effect and the efficiency of event extraction.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, with reference to the accompanying drawings.
FIG. 1 is a flow diagram of one embodiment of an event extraction model training method of the present invention.
FIG. 2 is a flow chart of an embodiment of the event extraction model training method of the present invention.
FIG. 3 is a schematic diagram of an embodiment of an event extraction model training method of the present invention.
FIG. 4 is a flow diagram of one embodiment of an event extraction method of the present invention.
FIG. 5 is a block diagram of an embodiment of the event extraction model training apparatus of the present invention.
FIG. 6 is a block diagram of an embodiment of the event extraction model training apparatus according to the present invention.
FIG. 7 is a block diagram of one embodiment of an event extraction device of the present invention.
FIG. 8 is a schematic diagram of the structure of the event extraction model training device of the present invention.
Fig. 9 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar structures, and thus their repetitive description will be omitted.
Referring now to FIG. 1, FIG. 1 is a flow diagram of one embodiment of an event extraction model training method of the present invention. The embodiment of the invention provides an event extraction model training method, which comprises the following steps:
step S110: generating a knowledge graph based on the guidance information, the expert cases, and the communication dictionary;
step S120: encoding the knowledge graph to obtain a knowledge graph code;
step S130: encoding the event case to obtain a text code;
step S140: fusing the knowledge graph code and the text code to obtain a fusion code;
step S150: inputting the fusion code into a first extraction model to obtain a pseudo data label of the event case;
step S160: training a second extraction model based on the event case and the pseudo data label.
Therefore, background knowledge is added to training on the small-scale data set: a knowledge graph constructed from prior knowledge is added to the model as background knowledge to assist its reasoning, improving generalization and reducing overfitting; and because the knowledge graph contains multi-hop knowledge that the text alone does not provide, it can greatly improve the reasoning ability of the model. In addition, to address the problem that current event extraction models are effective but inefficient, model distillation is used: the training result of the effective model is distilled into an efficient model, so the final model balances accuracy and efficiency, greatly improving both the effect and the efficiency of event extraction.
Referring now to FIG. 2, FIG. 2 is a flow chart of an embodiment of the event extraction model training method of the present invention.
Step S201: and converting the format of the guide information to obtain a knowledge block.
Step S202: and generating a knowledge block tree structure based on the text data, wherein a root node of the knowledge block tree structure is a file name, leaf nodes of the knowledge block tree structure are knowledge blocks, and nodes of the knowledge block tree structure except the root node and the leaf nodes are multi-level titles.
Step S203: and extracting the expert cases, and constructing a knowledge graph by combining the knowledge block tree structure.
Step S204: the communication dictionary is stored in the knowledge graph.
Step S205: a first entity of the expert case is extracted.
Step S206: and searching related entities and entity relations from the knowledge graph based on the first entity.
Step S207: and encoding the entity and the entity relation. And encoding the entities and entity relations by using a graph neural network or a TransE algorithm.
Step S208: and splicing, multiplying, adding or weighting and summing the knowledge graph codes and the text codes to obtain fusion codes.
Step S209: inputting the fusion code into a first extraction model to obtain a pseudo data label of the event case;
the first extraction model can be an event extraction model based on DMCNN, or a composite event extraction model of ALBERT, bilSTM and CRF.
Step S210: a second extraction model is trained based on the event cases and the pseudo data labels.
The second extraction model may be an event extraction model based on a full sub-graph search.
Referring now to FIG. 3, FIG. 3 is a schematic diagram of an embodiment of an event extraction model training method of the present invention.
First, a knowledge graph is constructed from heterogeneous background knowledge.
Multi-source heterogeneous data such as a guidance manual and an index system are uniformly converted into text data. The conversion may include, but is not limited to, PDF conversion, replacing pictures with paths (for example, the address where a picture is stored on a local server, or a unique picture code), and converting picture tables into text tables. Based on the guidance information such as the guidance manual and the index system, the file name, first-level titles, second-level titles, third-level titles, and knowledge blocks (text blocks, picture blocks, and text table blocks) are obtained to construct a knowledge block tree structure. The data is then imported into the knowledge graph: the root node is the file name, the first-, second-, and third-level titles follow downwards, and the leaf nodes hold the content of each knowledge block. The path of the current node in the tree is stored in the node's path attribute; for example, the path of a knowledge block takes the form file name/first-level title/second-level title/third-level title.
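As an illustration of this tree construction, the sketch below builds knowledge block nodes whose path attribute records file name/first-level title/second-level title/third-level title. It is a minimal sketch only: the Node class, its field names, and the sample manual content are assumptions for illustration, not the patent's implementation.

# Hedged sketch of the knowledge block tree; Node and its fields are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    parent: "Node | None" = None
    children: list = field(default_factory=list)

    @property
    def path(self) -> str:
        # Each node stores its path: file name/first-level/second-level/third-level title.
        return self.name if self.parent is None else f"{self.parent.path}/{self.name}"

def add_child(parent: Node, name: str) -> Node:
    child = Node(name, parent)
    parent.children.append(child)
    return child

root = Node("guidance_manual")                    # root node: the file name
lvl1 = add_child(root, "1 Network faults")        # first-level title
lvl2 = add_child(lvl1, "1.1 Wireless access")     # second-level title
lvl3 = add_child(lvl2, "1.1.1 Coverage holes")    # third-level title
block = add_child(lvl3, "[text block] Check antenna azimuth and tilt")  # leaf: knowledge block
print(block.path)  # guidance_manual/1 Network faults/1.1 Wireless access/1.1.1 Coverage holes/...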
Fault descriptions, causes, solutions, and effects are extracted from the expert cases and combined with the multi-source heterogeneous data to construct the knowledge graph.
The communication dictionary is stored in the knowledge graph. Specifically, a triple of the form (communication term, communication term interpretation, communication term detailed interpretation) may be constructed for each dictionary entry, and the resulting triples are stored in the knowledge graph.
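Such triples could be held in any property graph store. The snippet below is a purely illustrative sketch that uses a NetworkX multigraph as an example backend with an invented sample term; both the library choice and the data are assumptions, not the patent's specification.

# Hedged sketch: one dictionary triple becomes a term node plus two labeled edges.
import networkx as nx

kg = nx.MultiDiGraph()

def add_dictionary_entry(graph, term, interpretation, detail):
    graph.add_node(term, kind="communication_term")
    graph.add_edge(term, interpretation, relation="interpretation")
    graph.add_edge(term, detail, relation="detailed_interpretation")

add_dictionary_entry(
    kg,
    "RSRP",
    "Reference Signal Received Power",
    "Linear average of the power of the resource elements that carry cell-specific reference signals",
)
print(list(kg.edges("RSRP", data=True)))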
The knowledge graph is then encoded using a graph neural network or TransE.
Using the entities extracted from the input expert case document, the relevant primary entities and primary relations are searched for in the knowledge graph (the primary entity is an entity to be extracted, the primary relation is a relation associated with that entity, and the other entity connected by that relation is a secondary entity), and the entities and relations are encoded using a graph neural network or the TransE algorithm to obtain a descriptive knowledge representation and a relevance knowledge representation of each entity.
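The patent names TransE but fixes no implementation details. The following minimal PyTorch sketch shows one way the entity and relation encodings could be trained; the embedding dimension, initialization, and margin loss are standard choices assumed here, not the patent's.

# Hedged TransE sketch: a relation is modeled as a translation h + r ≈ t.
import torch
import torch.nn as nn

class TransE(nn.Module):
    def __init__(self, n_entities, n_relations, dim=100):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
        nn.init.xavier_uniform_(self.ent.weight)
        nn.init.xavier_uniform_(self.rel.weight)

    def score(self, h, r, t):
        # Lower L2 distance between h + r and t means a more plausible triple.
        return (self.ent(h) + self.rel(r) - self.ent(t)).norm(p=2, dim=-1)

def margin_loss(model, pos, neg, margin=1.0):
    # pos and neg are (head, relation, tail) index tensors; negatives are
    # usually built by corrupting the head or tail of a true triple.
    return torch.relu(margin + model.score(*pos) - model.score(*neg)).mean()

model = TransE(n_entities=1000, n_relations=50)
pos = (torch.tensor([3]), torch.tensor([1]), torch.tensor([7]))
neg = (torch.tensor([3]), torch.tensor([1]), torch.tensor([42]))
print(margin_loss(model, pos, neg))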
Then, encoding fusion is performed.
Wireless network case data (sample cases) can be encoded with BERT to obtain a text code, which is fused with the knowledge graph code of the background knowledge to form the model input; the fusion may be concatenation, multiplication, addition, or an attention-based weighted sum.
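A minimal sketch of the four fusion options follows. Assuming both codes share a dimension and using a single linear layer to produce the attention weights are simplifications of this illustration, not details fixed by the patent.

# Hedged sketch of concatenation, multiplication, addition, and attention fusion.
import torch
import torch.nn as nn

class Fusion(nn.Module):
    def __init__(self, dim, mode="concat"):
        super().__init__()
        self.mode = mode
        self.attn = nn.Linear(2 * dim, 2)   # scores for (text, graph) in attention mode

    def forward(self, text_code, kg_code):
        if self.mode == "concat":
            return torch.cat([text_code, kg_code], dim=-1)
        if self.mode == "multiply":
            return text_code * kg_code
        if self.mode == "add":
            return text_code + kg_code
        # attention-based weighted sum of the two codes
        w = torch.softmax(self.attn(torch.cat([text_code, kg_code], dim=-1)), dim=-1)
        return w[..., :1] * text_code + w[..., 1:] * kg_code

text_code, kg_code = torch.randn(4, 768), torch.randn(4, 768)
print(Fusion(768, mode="concat")(text_code, kg_code).shape)  # torch.Size([4, 1536])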
Then, an event extraction model based on DMCNN or ALBERT + BiLSTM + CRF is constructed.
The fusion code is used as input to train a trigger word recognition model and an argument role recognition model, which together form the event extraction model. The model may be the Dynamic Multi-Pooling Convolutional Neural Network (DMCNN) event extraction model or a composite ALBERT + BiLSTM + CRF event extraction model; this first event extraction model is characterized by high accuracy and good results, but poor efficiency and long training time.
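For the ALBERT + BiLSTM + CRF branch, a hedged sketch of the trigger word tagger is given below. The Hugging Face checkpoint name, the hidden size, and leaving the CRF layer as a comment (e.g. the pytorch-crf package) are all assumptions of this sketch; an argument role recognition model would follow the same pattern with a different tag set.

# Hedged sketch: ALBERT token embeddings -> BiLSTM -> per-token emission scores.
import torch.nn as nn
from transformers import AlbertModel

class TriggerTagger(nn.Module):
    def __init__(self, n_tags, hidden=256):
        super().__init__()
        self.encoder = AlbertModel.from_pretrained("albert-base-v2")  # example checkpoint
        self.lstm = nn.LSTM(self.encoder.config.hidden_size, hidden,
                            bidirectional=True, batch_first=True)
        self.emit = nn.Linear(2 * hidden, n_tags)  # per-token emission scores
        # A CRF layer (e.g. torchcrf.CRF(n_tags, batch_first=True)) would decode
        # the best tag sequence from these emissions.

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        h, _ = self.lstm(h)
        return self.emit(h)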
Then, model distillation is performed.
The trained first event extraction model is used to predict labels for the unlabeled sample cases, producing pseudo-label data; the pseudo-label data and the sample cases are then used to train the second event extraction model. The second event extraction model may be a relatively complete joint event extraction model combined with a full sub-graph search, for example a document-level event extraction model based on heterogeneous graph interaction and an event tracker. The trained second model serves as the final event extraction model and predicts on new event extraction data.
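Reduced to its skeleton, this distillation step is a pseudo-labeling loop. The sketch below assumes teacher and student objects exposing predict and fit methods; these names are illustrative, not an API defined by the patent.

# Hedged pseudo-labeling sketch; teacher/student interfaces are assumptions.
def distill(teacher, student, labeled_cases, unlabeled_cases):
    # 1. The trained first extraction model (teacher) labels unlabeled sample cases.
    pseudo_labeled = [(case, teacher.predict(case)) for case in unlabeled_cases]
    # 2. The second extraction model (student) trains on real plus pseudo labels,
    #    trading the teacher's accuracy into a more efficient final model.
    student.fit(labeled_cases + pseudo_labeled)
    return student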
Background knowledge is thus added to the small-scale sample data input. A small-scale sample data set contains little data, little labeled data, and little usable information, and cannot meet the needs of model reasoning. A knowledge graph of the relevant field, constructed from prior knowledge, is therefore added to the model as background knowledge: the background knowledge is encoded with a graph neural network or another neural network and added to the inference model to assist its reasoning and improve generalization.
Model distillation balances accuracy and efficiency. Because existing models are accurate but inefficient, and not complete enough for practical event extraction, a batch of pseudo-label data is first produced and then distilled into an efficient model; the trained efficient model performs the actual extraction, so the final model balances effect and efficiency.
The method suits small-sample data. It implements several improvements for event extraction on small-sample data: building prior knowledge increases the information the data provides, model distillation addresses the shortage of training data, and the pseudo-label data is verified against global rules to obtain an expanded corpus.
Referring now to fig. 4, fig. 4 is a flow chart of one embodiment of an event extraction method of the present invention. The event extraction method comprises the following steps:
step S410: inputting the event to be extracted into a second extraction model, wherein the second extraction model is trained by the event extraction model training method described above;
step S420: obtaining the entities output by the second extraction model.
Background knowledge is added to training on the small-scale data set: a knowledge graph constructed from prior knowledge is added to the model as background knowledge to assist its reasoning, improving generalization and reducing overfitting; and because the knowledge graph contains multi-hop knowledge that the text alone does not provide, it can greatly improve the reasoning ability of the model. In addition, to address the problem that current event extraction models are effective but inefficient, model distillation is used: the training result of the effective model is distilled into an efficient model, so the final model balances accuracy and efficiency, greatly improving both the effect and the efficiency of event extraction.
The above description only illustrates specific implementations of the invention, and the invention is not limited thereto; splitting steps, merging steps, changing their execution order, and changing information transmission all fall within the protection scope of the invention.
FIG. 5 is a block diagram of an embodiment of the event extraction model training apparatus of the present invention. The event extraction model training apparatus 500 of the present invention, as shown in fig. 5, includes, but is not limited to, a knowledge graph generation module 510, a knowledge graph coding module 520, an event coding module 530, a fusion module 540, a pseudo data label acquisition module 550, and a training module 560.
The knowledge-graph generation module 510 is configured to generate a knowledge graph based on the guidance information, the expert cases, and the communication dictionary;
the knowledge-graph encoding module 520 is configured to encode the knowledge graph to obtain knowledge-graph encoding;
the event encoding module 530 is configured to encode the event case, obtaining a text code;
the fusion module 540 is configured to fuse the knowledge-graph encoding and the text encoding to obtain a fused encoding;
the pseudo data label acquiring module 550 is configured to input the fusion code into the first extraction model, and acquire a pseudo data label of the event case;
the training module 560 is configured to train a second extraction model based on the event cases and the pseudo data labels.
FIG. 6 is a block diagram of an embodiment of an event extraction model training apparatus according to the present invention. The event extraction model training apparatus 600 includes:
a format conversion module 601, configured to perform format conversion on the guidance information to obtain a knowledge block.
A tree structure building module 602, configured to generate a knowledge block tree structure based on the text data, where a root node of the knowledge block tree structure is a filename, leaf nodes of the knowledge block tree structure are knowledge blocks, and nodes of the knowledge block tree structure other than the root node and the leaf nodes are multilevel titles.
And the knowledge graph constructing module 603 is used for extracting the expert cases and constructing the knowledge graph by combining the knowledge block tree structure.
A communication dictionary module 604 for storing a communication dictionary in the knowledge graph.
A first extraction module 605 for extracting the first entity of the expert case.
A searching module 606 for searching the knowledge-graph for associated entities and entity relationships based on the first entity.
The entity and relationship encoding module 607 is used for encoding the entities and entity relationships using a graph neural network or the TransE algorithm.
A fusion coding module 608, configured to concatenate, multiply, add, or compute a weighted sum of the knowledge graph code and the text code to obtain a fusion code.
A pseudo data label obtaining module 609, configured to input the fusion code into a first extraction model, so as to obtain a pseudo data label of the event case;
a training module 610, configured to train a second extraction model based on the event cases and the pseudo data labels.
The implementation principles of the above modules are described in the event extraction model training method and are not repeated here.
According to the event extraction model training apparatus, background knowledge is added to training on the small-scale data set: a knowledge graph constructed from prior knowledge is added to the model as background knowledge to assist its reasoning, improving generalization and reducing overfitting; and because the knowledge graph contains multi-hop knowledge that the text alone does not provide, it can greatly improve the reasoning ability of the model. In addition, to address the problem that current event extraction models are effective but inefficient, model distillation is used: the training result of the effective model is distilled into an efficient model, so the final model balances accuracy and efficiency, greatly improving both the effect and the efficiency of event extraction.
Fig. 5 and fig. 6 are only schematic diagrams of the event extraction model training apparatuses 500 and 600 provided by the invention; splitting, merging, and adding modules without departing from the concept of the invention all fall within its protection scope. The event extraction model training apparatuses 500 and 600 can be implemented by software, hardware, firmware, plug-ins, and any combination thereof, which the invention does not limit.
Referring now to fig. 7, fig. 7 is a block diagram of one embodiment of an event extraction device of the present invention. The event extraction device 700 includes:
an input module 710 configured to input events to be extracted into a second extraction model, the second extraction model being trained via the event extraction model training method as described above;
an extraction module 720 configured to obtain the entities output by the second extraction model.
Background knowledge is added to training on the small-scale data set: a knowledge graph constructed from prior knowledge is added to the model as background knowledge to assist its reasoning, improving generalization and reducing overfitting; and because the knowledge graph contains multi-hop knowledge that the text alone does not provide, it can greatly improve the reasoning ability of the model. In addition, to address the problem that current event extraction models are effective but inefficient, model distillation is used: the training result of the effective model is distilled into an efficient model, so the final model balances accuracy and efficiency, greatly improving both the effect and the efficiency of event extraction.
Fig. 7 is only a schematic diagram of the event extraction apparatus 700 provided by the invention; splitting, merging, and adding modules without departing from the concept of the invention all fall within its protection scope. The event extraction apparatus 700 can be implemented by software, hardware, firmware, plug-ins, and any combination thereof, which the invention does not limit.
An embodiment of the invention also provides a processing device comprising a processor and a memory in which executable instructions of the processor are stored, wherein the processor is configured to perform the steps of the event extraction model training method via execution of the executable instructions.
As shown above, the processing device of this embodiment adds background knowledge to training on the small-scale data set: a knowledge graph constructed from prior knowledge is added to the model as background knowledge to assist its reasoning, improving generalization and reducing overfitting; and because the knowledge graph contains multi-hop knowledge that the text alone does not provide, it can greatly improve the reasoning ability of the model. In addition, to address the problem that current event extraction models are effective but inefficient, model distillation is used: the training result of the effective model is distilled into an efficient model, so the final model balances accuracy and efficiency, greatly improving both the effect and the efficiency of event extraction.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or program product. Accordingly, various aspects of the present invention may be embodied in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit," "module," or "platform."
FIG. 8 is a schematic view of the structure of the processing apparatus of the present invention. An electronic device 800 according to this embodiment of the invention is described below with reference to fig. 8. The electronic device 800 shown in fig. 8 is only an example and should not bring any limitations to the function and scope of use of the embodiments of the present invention.
As shown in fig. 8, electronic device 800 is in the form of a general purpose computing device. The components of the electronic device 800 may include, but are not limited to: at least one processing unit 810, at least one memory unit 820, a bus 830 connecting different platform components (including memory unit 820 and processing unit 810), a display unit 840, etc.
Wherein the storage unit stores program code, which can be executed by the processing unit 810, to cause the processing unit 810 to execute the steps according to various exemplary embodiments of the present invention described in the event extraction model training method and/or the event extraction method section described above in this specification. For example, processing unit 810 may perform the steps as shown in fig. 2.
The storage unit 820 may include readable media in the form of volatile memory units such as a random access memory unit (RAM) 8201 and/or a cache memory unit 8202, and may further include a read only memory unit (ROM) 8203.
The storage unit 820 may also include a program/utility 8204 having a set (at least one) of program modules 8205, such program modules 8205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 830 may be any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 800 may also communicate with one or more external devices 8001 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 800, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 800 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 850. Also, the electronic device 800 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 860. A network adapter 860 may communicate with the other modules of the electronic device 800 via the bus 830. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 800, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms, to name a few.
Embodiments of the present invention further provide a computer-readable storage medium for storing a program which, when executed, implements the steps of the event extraction model training method and/or the event extraction method. In some possible embodiments, aspects of the present invention may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to perform the steps according to the various exemplary embodiments of the present invention described in the event extraction model training method and/or event extraction method sections above in this specification.
As described above, the computer-readable storage medium for event extraction model training of this embodiment adds background knowledge to training on the small-scale data set: a knowledge graph constructed from prior knowledge is added to the model as background knowledge to assist its reasoning, improving generalization and reducing overfitting; and because the knowledge graph contains multi-hop knowledge that the text alone does not provide, it can greatly improve the reasoning ability of the model. In addition, to address the problem that current event extraction models are effective but inefficient, model distillation is used: the training result of the effective model is distilled into an efficient model, so the final model balances accuracy and efficiency, greatly improving both the effect and the efficiency of event extraction.
Fig. 9 is a schematic structural diagram of a computer-readable storage medium of the present invention. Referring to fig. 9, a program product 900 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java or C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the internet using an internet service provider).
In conclusion, background knowledge is added to training on the small-scale data set: a knowledge graph constructed from prior knowledge is added to the model as background knowledge to assist its reasoning, improving generalization and reducing overfitting; and because the knowledge graph contains multi-hop knowledge that the text alone does not provide, it can greatly improve the reasoning ability of the model. In addition, to address the problem that current event extraction models are effective but inefficient, model distillation is used: the training result of the effective model is distilled into an efficient model, so the final model balances accuracy and efficiency, greatly improving both the effect and the efficiency of event extraction.
The foregoing describes the invention in further detail with reference to specific preferred embodiments, but the specific implementation of the invention is not limited to these details. Those skilled in the art can make several simple deductions or substitutions without departing from the spirit of the invention, and all of these shall be regarded as falling within the protection scope of the invention.

Claims (12)

1. An event extraction model training method is characterized by comprising the following steps:
generating a knowledge graph based on the guidance information, the expert cases and the communication dictionary;
encoding the knowledge graph to obtain knowledge graph encoding;
encoding an event case to obtain a text code;
fusing the knowledge graph codes and the text codes to obtain fused codes;
inputting the fusion code into a first extraction model to obtain a pseudo data label of the event case;
and training a second extraction model based on the event case and the pseudo data label.
2. The event extraction model training method of claim 1, wherein the generating a knowledge graph based on the guidance information, the expert cases, and the communication dictionary comprises:
carrying out format conversion on the guidance information to obtain a knowledge block;
and generating a knowledge block tree structure based on the text data, wherein a root node of the knowledge block tree structure is a file name, leaf nodes of the knowledge block tree structure are knowledge blocks, and nodes of the knowledge block tree structure except the root node and the leaf nodes are multi-level titles.
3. The event extraction model training method according to claim 1, wherein the encoding the knowledge-graph to obtain knowledge-graph encoding comprises:
extracting a first entity of the expert case;
searching the knowledge graph for associated entities and entity relationships based on the first entity;
and encoding the entity and the entity relation.
4. The event extraction model training method according to claim 3, wherein the encoding the entities and entity relationships comprises:
and encoding the entities and entity relations by using a graph neural network or a TransE algorithm.
5. The event extraction model training method according to claim 1, wherein the fusing the knowledge-graph code and the text code to obtain a fused code comprises:
concatenating, multiplying, adding, or weighted-summing the knowledge graph codes and the text codes to obtain fusion codes.
6. The method of claim 1, wherein the first extraction model is an event extraction model based on DMCNN, or a composite event extraction model of ALBERT, BiLSTM, and CRF.
7. The event extraction model training method according to claim 1, wherein the second extraction model is an event extraction model based on a full sub-graph search.
8. An event extraction method, comprising:
inputting events to be extracted into a second extraction model, the second extraction model being trained via an event extraction model training method according to any one of claims 1 to 7;
and obtaining an entity output by the second extraction model.
9. An event extraction model training device, comprising:
a knowledge graph generation module configured to generate a knowledge graph based on the guidance information, the expert cases, and the communication dictionary;
the knowledge graph coding module is configured to code the knowledge graph to obtain knowledge graph codes;
the event coding module is configured to code the event case to obtain a text code;
a fusion module configured to fuse the knowledge-graph code and the text code to obtain a fusion code;
a pseudo data label obtaining module configured to input the fusion code into a first extraction model to obtain a pseudo data label of the event case;
a training module configured to train a second extraction model based on the event cases and the pseudo data labels.
10. An event extraction device, comprising:
an input module configured to input an event to be extracted to a second extraction model, the second extraction model being trained via the event extraction model training method of any one of claims 1 to 7;
an extraction module configured to obtain an entity output by the second extraction model.
11. A processing device, comprising:
a processor;
a memory having stored therein executable instructions of the processor;
wherein the processor is configured to perform, via execution of the executable instructions:
the event extraction model training method of any one of claims 1 to 7; and/or
The event extraction method of claim 8.
12. A computer-readable storage medium storing a program that when executed implements:
the event extraction model training method of any one of claims 1 to 7; and/or
The event extraction method of claim 8.
Application CN202211351675.1A, priority date 2022-10-31, filed 2022-10-31, published as CN115525776A (en); legal status: Withdrawn.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202211351675.1A | 2022-10-31 | 2022-10-31 | Event extraction model training method, event extraction method and related equipment


Publications (1)

Publication Number | Publication Date
CN115525776A (en) | 2022-12-27

Family

ID=84703509

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202211351675.1A | Event extraction model training method, event extraction method and related equipment | 2022-10-31 | 2022-10-31

Country Status (1)

Country | Link
CN (1) | CN115525776A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN109885698A * | 2019-02-13 | 2019-06-14 | 北京航空航天大学 | A kind of knowledge mapping construction method and device, electronic equipment
CN111339311A * | 2019-12-30 | 2020-06-26 | 智慧神州(北京)科技有限公司 | Method, device and processor for extracting structured events based on generative network
CN111428053A * | 2020-03-30 | 2020-07-17 | 西安交通大学 | Tax field knowledge graph construction method
CN114117070A * | 2021-11-19 | 2022-03-01 | 重庆电子工程职业学院 | Method, system and storage medium for constructing knowledge graph
CN114172793A * | 2020-08-21 | 2022-03-11 | 华为技术有限公司 | Network configuration knowledge graph construction method and device
CN114490953A * | 2022-04-18 | 2022-05-13 | 北京北大软件工程股份有限公司 | Training event extraction model, event extraction method and target event extraction model


Legal Events

Code | Title | Description
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20221227