CN112487109A - Entity relationship extraction method, terminal and computer readable storage medium - Google Patents
Entity relationship extraction method, terminal and computer readable storage medium
- Publication number: CN112487109A
- Application number: CN202011386678.XA
- Authority: CN (China)
- Prior art keywords: entity, data, term memory, memory network, quantum
- Prior art date: 2020-12-01
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F16/288 — Information retrieval of structured data; relational databases; entity relationship models
- G06F40/289 — Natural language analysis; phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295 — Natural language analysis; named entity recognition
- G06N3/044 — Neural network architectures; recurrent networks, e.g. Hopfield networks
- G06N3/045 — Neural network architectures; combinations of networks
- G06N3/08 — Neural networks; learning methods
Abstract
The application discloses an entity relationship extraction method, a terminal and a computer readable storage medium, wherein the entity relationship extraction method comprises the following steps: acquiring quantum-related text data and constructing a self-attention mechanism; constructing a bidirectional long short-term memory network based on the self-attention mechanism, and calculating entity-aware attention based on the bidirectional long short-term memory network; performing joint training on first data in the bidirectional long short-term memory network and second data in the entity-aware attention to obtain a training model; and inputting the text data into the training model to obtain an extraction result. This solves the technical problems that, owing to the rapid development of the industry, the names of many quantum-related entities cannot be identified in time and the relationships among quantum-related entities remain to be sorted out; it avoids complicated feature engineering and improves the efficiency of joint entity-relation extraction.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to an entity relationship extraction method, a terminal, and a computer-readable storage medium.
Background
With the development of scientific research, industrialization and marketization in the quantum field in China and other leading countries, and in particular with the beginning of commercial applications of quantum computers and quantum encryption technology, the influence of quantum technology on various industries in China is increasing day by day. Identifying quantum-related entities from publicly published network news texts and extracting the relationships among them, so as to construct a relationship graph among quantum enterprises, has become important research content. At present, the relationships between quantum-related entities are mainly screened manually, and methods for information extraction and entity relationship extraction oriented to quantum-related entities are insufficient. In summary, the existing joint extraction of entity relationships from quantum-related network news and creditworthiness texts mainly has the following problems: as the industry develops rapidly, the names of many quantum-related entities are not identified in time, and the relationships among quantum-related entities remain to be sorted out.
Disclosure of Invention
The embodiments of the present application aim to solve the problems that, owing to the rapid development of the industry, the names of many quantum-related entities cannot be identified in time and the relationships among quantum-related entities remain to be sorted out.
In order to achieve the above object, an aspect of the present application provides an entity relationship extraction method, where the entity relationship extraction method includes the following steps:
acquiring quantum-related text data, and constructing a self-attention mechanism;
constructing a bidirectional long short-term memory network based on the self-attention mechanism, and calculating entity-aware attention based on the bidirectional long short-term memory network;
performing joint training on the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention to obtain a training model;
and inputting the text data into the training model to obtain an extraction result.
Optionally, the step of constructing a self-attention mechanism comprises:
performing dot-product calculation on the query and the key to obtain a weight factor;
and scaling the weight factor to obtain scaled data, normalizing the scaled data to obtain a weight coefficient, and performing weighted summation of the values with the weight coefficient.
Optionally, the step of constructing a bidirectional long short-term memory network based on the self-attention mechanism includes:
acquiring data of the data layer of the self-attention mechanism;
and constructing a forward long short-term memory network and a backward long short-term memory network based on the data of the data layer, and forming the bidirectional long short-term memory network from the forward long short-term memory network and the backward long short-term memory network.
Optionally, the step of calculating entity-aware attention based on the bidirectional long short-term memory network comprises:
calculating relative position features and latent entity type features based on the bidirectional long short-term memory network;
and obtaining the entity-aware attention according to the relative position features and the latent entity type features.
Optionally, the step of jointly training the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention comprises:
acquiring first data of a parameter sharing layer in the bidirectional long short-term memory network and second data of a joint decoding layer in the entity-aware attention;
and performing joint training on the first data of the parameter sharing layer and the second data of the joint decoding layer.
Optionally, the step of jointly training the first data of the parameter sharing layer and the second data of the joint decoding layer includes:
constructing a gradient descent optimizer of the loss function;
and performing joint training on the first data of the parameter sharing layer and the second data of the joint decoding layer based on the gradient descent optimizer of the loss function.
Optionally, after the step of acquiring the quantum-related text data, the method includes:
performing structured preprocessing on the text data, and performing word segmentation and quantum entity recognition to obtain sentences containing a plurality of quantum entities;
and composing a corpus based on the sentences.
Optionally, after the step of acquiring the quantum-related text data, the method further includes:
performing quantum entity naming based on the text data, and determining the relationships between the quantum entities.
In addition, in order to achieve the above object, another aspect of the present application further provides a terminal, where the terminal includes a memory, a processor, and an entity relationship extraction program stored in the memory and running on the processor, and the processor implements the steps of the entity relationship extraction method when executing the entity relationship extraction program.
In addition, to achieve the above object, another aspect of the present application further provides a computer-readable storage medium, on which an entity relationship extraction program is stored, and the entity relationship extraction program, when executed by a processor, implements the steps of the entity relationship extraction method as described above.
In this embodiment, a self-attention mechanism is constructed from the acquired quantum-related text data; a bidirectional long short-term memory network is constructed based on the self-attention mechanism, and entity-aware attention is calculated based on the bidirectional long short-term memory network; the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention are jointly trained to obtain a training model; and the text data is input into the training model to obtain an extraction result. This solves the problems that, owing to the rapid development of the industry, the names of many quantum-related entities cannot be identified in time and the relationships among quantum-related entities remain to be sorted out; it avoids complicated feature engineering and improves the efficiency of joint entity-relation extraction.
Drawings
Fig. 1 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a first embodiment of an entity relationship extraction method according to the present application;
FIG. 3 is a flowchart illustrating a second embodiment of an entity relationship extraction method according to the present application;
FIG. 4 is a schematic flow chart illustrating a method for constructing a self-attention mechanism in the entity relationship extraction method according to the present application;
FIG. 5 is a schematic flow chart illustrating the construction of the bidirectional long short-term memory network based on the self-attention mechanism in the entity relationship extraction method of the present application;
FIG. 6 is a schematic diagram illustrating the process of calculating entity-aware attention based on the bidirectional long short-term memory network in the entity relationship extraction method of the present application;
Fig. 7 is a schematic flow chart illustrating the joint training of the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention in the entity relationship extraction method according to the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The main solution of the embodiments of the present application is as follows: acquiring quantum-related text data, and constructing a self-attention mechanism; constructing a bidirectional long short-term memory network based on the self-attention mechanism, and calculating entity-aware attention based on the bidirectional long short-term memory network; performing joint training on the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention to obtain a training model; and inputting the text data into the training model to obtain an extraction result.
The existing joint extraction of entity relationships from quantum-related network news and creditworthiness texts has the following problems: as the industry develops rapidly, the names of many quantum-related entities are not identified in time, and the relationships among quantum-related entities remain to be sorted out. In the present method, a self-attention mechanism is constructed from the acquired quantum-related text data; a bidirectional long short-term memory network is constructed based on the self-attention mechanism, and entity-aware attention is calculated based on the bidirectional long short-term memory network; the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention are jointly trained to obtain a training model; and the text data is input into the training model to obtain an extraction result. This solves the problems described above, avoids complicated feature engineering, and improves the efficiency of joint entity-relation extraction.
As shown in fig. 1, fig. 1 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present application.
As shown in fig. 1, the terminal may include: a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002, where the communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Optionally, the terminal may further include a camera, a Radio Frequency (RF) circuit, a sensor, a remote controller, an audio circuit, a WiFi module, a detector, and the like. Of course, the terminal may also be configured with other sensors such as a gyroscope, a barometer, a hygrometer and a temperature sensor, which are not described herein again.
Those skilled in the art will appreciate that the terminal structure shown in fig. 1 does not constitute a limitation of the terminal device and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer-readable storage medium, may include therein an operating system, a network communication module, a user interface module, and an entity relationship extraction program.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with it; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with it; and the processor 1001 may be configured to invoke the entity relationship extraction program in the memory 1005 and perform the following operations:
acquiring quantum-related text data, and constructing a self-attention mechanism;
constructing a bidirectional long short-term memory network based on the self-attention mechanism, and calculating entity-aware attention based on the bidirectional long short-term memory network;
performing joint training on the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention to obtain a training model;
and inputting the text data into the training model to obtain an extraction result.
Referring to fig. 2, fig. 2 is a schematic flowchart of a first embodiment of the entity relationship extraction method according to the present application.
The embodiments of the present application provide an entity relationship extraction method. It should be noted that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from that given here.
The entity relationship extraction method comprises the following steps:
Step S10, acquiring quantum-related text data and constructing a self-attention mechanism;
With the development of quantum technology, a large amount of quantum-related text data exists on the Internet. Such data usually contains a large amount of knowledge, and constructing a relationship graph among quantum enterprises based on this knowledge has become important research content. In order to construct a relationship graph automatically from massive text data, entity relationship extraction is becoming a popular research task. The entity relationship extraction task aims to identify the (entity, relation type, entity) triples existing in the text, such as (School A, cooperation, Company B), where School A and Company B are quantum-related entities and the entity relationship between them is a cooperative relationship. The triples existing in a text can be classified into three categories: normal triples, single-entity overlapping triples, and entity-pair overlapping triples. A single-entity overlapping triple means that two relation triples share the same entity; an entity-pair overlapping triple means that multiple relationships exist between the same two entities, as illustrated in the sketch below.
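As a minimal illustration (not part of the patent disclosure), the three triple categories might be represented as follows; all entity names are hypothetical:

```python
# Illustrative sketch of the three triple categories; entity names are hypothetical.

# Normal triple: one relation between two distinct entities.
normal = [("School A", "cooperation", "Company B")]

# Single-entity overlapping triples: two relation triples share the entity "Company B".
single_entity_overlap = [
    ("School A", "cooperation", "Company B"),
    ("Company C", "stock rights", "Company B"),
]

# Entity-pair overlapping triples: multiple relations between the same two entities.
entity_pair_overlap = [
    ("Company A", "subsidiary", "Company B"),
    ("Company A", "long-term cooperative agreement", "Company B"),
]
```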
The terminal comprises a public data acquisition module and a private data acquisition module. The public data acquisition module is used for collecting raw public network data on enterprise information and commercial activities related to quantum technology from public networks and news media. The private data acquisition module is used for acquiring quantum-related private data within the quantum industry and the scientific research community. The fields of data acquisition include: superconducting, ion-trap, optical-quantum, ultra-cold-atom, semiconductor and topological quantum computers and other types of quantum computers, both general-purpose and special-purpose; and research and development concerning quantum computer hardware and accessories, quantum coding, quantum software, quantum communication, quantum informatics, quantum encryption, quantum key distribution, quantum simulators, quantum materials, quantum sensing and metrology, quantum teleportation, quantum measurement, quantum networks and the quantum internet, quantum decoherence and scalability research, fidelity, and the like.
Further, the collected data content includes: trading behavior in quantum products and services; the economic effect of quantum products after deployment; branch entities established by quantum-related entities; the holding or control of other entities by quantum-related entities (wholly funded, subordinate, controlling, shareholding and acquisition); the creation of new quantum-related entities; marketing-channel companies for quantum-related scientific and technological applications and products; companies with potential for quantum-related scientific and technological applications; quantum-related laws, regulations and policies issued by national and local governments and regions; quantum-related patent applications, approvals and transfers by entities and individuals; the publication of, and breakthroughs reported in, quantum-related scientific research papers; quantum-related popular-science work carried out by science communicators on well-known media (such as Bilibili, Jianshu, Douyin and similar platforms); and public information issued in well-known WeChat public accounts and other influential public platforms.
Further, network public data and private data sources include, but are not limited to, the following: business activities, including transaction contract amounts, equity investment proportions and the like; quantum-themed seminars and workshops; news reports; patent approvals; publication of papers in core journals; and the growth of quantum-related researchers and practitioners.
After acquiring the quantum-related text data, the terminal constructs a self-attention mechanism, in which each input interacts with every other input ("self") and the inputs that deserve more focus are then identified ("attention"). Referring to fig. 4, the step of constructing the self-attention mechanism includes:
Step S11, performing dot-product calculation on the query and the key to obtain a weight factor;
Step S12, scaling the weight factor to obtain scaled data, normalizing the scaled data to obtain a weight coefficient, and performing weighted summation of the values with the weight coefficient.
In the process of constructing the self-attention mechanism, the terminal first performs the scaled dot-product attention calculation: the dot product of Q (query) and K (key) is computed to obtain the preliminary weight factors, and the result of the dot product of Q and K is divided by the scaling factor √d_k (to prevent gradient vanishing); the obtained weight values are then normalized by softmax so that all weight factors sum to 1; and the values are weighted and summed according to the normalized weight coefficients. The formula of this step is

Attention(Q, K, V) = softmax(QK^T / √d_k) V

where d_k is the size of the word embedding vector.
Multi-head attention is then calculated. Multi-head attention repeats the scaled dot-product attention basic unit h times, so that related information can be learned from different dimensions and representation subspaces; that is, multi-head attention is in fact several self-attention heads concatenated together. The corresponding formulas are:

MultiHead(Q, K, V) = Concat(head_1, …, head_h) W^O

head_i = Attention(QW_i^Q, KW_i^K, VW_i^V)
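A minimal sketch of the scaled dot-product and multi-head self-attention computations above, assuming PyTorch, follows; the dimensions, class names and default values are illustrative assumptions rather than the patent's implementation:

```python
# Sketch of scaled dot-product and multi-head self-attention; sizes are assumed.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    d_k = K.size(-1)
    # Dot-product weight factors, scaled by sqrt(d_k) to prevent gradient vanishing.
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5
    weights = F.softmax(scores, dim=-1)   # normalize so the weights sum to 1
    return weights @ V                    # weighted sum over the values

class MultiHeadSelfAttention(torch.nn.Module):
    def __init__(self, d_model=128, h=8):
        super().__init__()
        assert d_model % h == 0
        self.h, self.d_head = h, d_model // h
        self.W_q = torch.nn.Linear(d_model, d_model)
        self.W_k = torch.nn.Linear(d_model, d_model)
        self.W_v = torch.nn.Linear(d_model, d_model)
        self.W_o = torch.nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq_len, d_model); self-attention on x
        b, n, _ = x.shape
        split = lambda t: t.view(b, n, self.h, self.d_head).transpose(1, 2)
        heads = scaled_dot_product_attention(split(self.W_q(x)),
                                             split(self.W_k(x)),
                                             split(self.W_v(x)))
        concat = heads.transpose(1, 2).reshape(b, n, -1)  # Concat(head_1..head_h)
        return self.W_o(concat)                           # multiply by W^O
```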
Step S20, constructing a bidirectional long short-term memory network based on the self-attention mechanism, and calculating entity-aware attention based on the bidirectional long short-term memory network;
The bidirectional long short-term memory network evolves from the long short-term memory network (LSTM). The LSTM is a temporal recurrent neural network suitable for processing and predicting important events with relatively long intervals and delays in a time series, and was proposed to solve the gradient vanishing problem of the recurrent neural network (RNN) structure. For example, when predicting the last word of "the clouds are in the sky", the separation between the relevant information and the position of the predicted word is small, so an RNN can use the preceding information to predict that the word is "sky". However, to predict the last word of "I grew up in France… I speak fluent French", the language model can guess that the next word is probably the name of a language, but determining which language requires the distant earlier context "France"; in this case the RNN cannot exploit such long-interval information because of the gradient vanishing problem. A bidirectional network is then needed: a bidirectional RNN is composed of two ordinary RNNs, a forward RNN that uses past information and a backward RNN that uses future information, so that at time t the information at both time t-1 and time t+1 can be used, and the word "French" can be predicted.
Further, referring to fig. 5, the step of constructing the bidirectional long short-term memory network based on the self-attention mechanism includes:
Step S21, acquiring data of the data layer of the self-attention mechanism;
Step S22, constructing a forward long short-term memory network and a backward long short-term memory network based on the data of the data layer, and forming the bidirectional long short-term memory network from the forward long short-term memory network and the backward long short-term memory network.
In the process of constructing the bidirectional long short-term memory network, the data layer of the self-attention mechanism is first acquired, a forward long short-term memory network and a backward long short-term memory network are constructed on the basis of the data layer, and the bidirectional long short-term memory network is formed from the two. Specifically, the forward long short-term memory network computes a hidden state from the current input and the previous forward state:

h_t^fw = LSTM(x_t, h_{t-1}^fw)

the backward long short-term memory network computes a hidden state from the current input and the following backward state:

h_t^bw = LSTM(x_t, h_{t+1}^bw)

and the bidirectional long short-term memory network consists of the forward and backward networks, concatenating the two hidden states at each step:

h_t = [h_t^fw ; h_t^bw]
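A minimal sketch of this bidirectional layer, assuming PyTorch (whose nn.LSTM builds the forward and backward passes jointly when bidirectional=True); the input and hidden sizes are illustrative:

```python
# Minimal BiLSTM encoder sketch; input/hidden sizes are illustrative assumptions.
import torch

class BiLSTMEncoder(torch.nn.Module):
    def __init__(self, input_dim=128, hidden_dim=64):
        super().__init__()
        # bidirectional=True constructs the forward and backward LSTMs together.
        self.lstm = torch.nn.LSTM(input_dim, hidden_dim,
                                  batch_first=True, bidirectional=True)

    def forward(self, x):  # x: (batch, seq_len, input_dim)
        h, _ = self.lstm(x)
        # h[:, t, :] concatenates the forward and backward hidden states at step t,
        # giving a vector of size 2 * hidden_dim.
        return h
```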
Further, referring to fig. 6, the step of calculating entity-aware attention based on the bidirectional long short-term memory network comprises:
Step S23, calculating relative position features and latent entity type features based on the bidirectional long short-term memory network;
Step S24, obtaining the entity-aware attention according to the relative position features and the latent entity type features.
After the terminal constructs the bidirectional long short-term memory network, the relative position features and the latent entity type features are calculated based on the network, and the entity-aware attention is obtained from them. Specifically, the entity-aware attention mechanism takes into account the relative position features, which reflect each word's position relative to the entities, together with the latent entity type features, and combines them with the hidden states of the network to produce the attention weights.
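The concrete formulas are not reproduced in this text, so the following is only a hedged sketch of one common entity-aware attention formulation (relative-position embeddings for the two entities plus latent entity-type vectors feeding an additive attention score), assuming PyTorch; all names, dimensions and the scoring form are assumptions, not the patent's exact computation:

```python
# Hedged sketch of an entity-aware attention layer; an assumed, representative
# variant, not the patent's exact formulation.
import torch
import torch.nn.functional as F

class EntityAwareAttention(torch.nn.Module):
    def __init__(self, hidden_dim=128, pos_dim=16, type_dim=16, max_len=256):
        super().__init__()
        self.max_len = max_len
        # Relative-position features: embeddings of each token's offset to the entities.
        self.pos_emb = torch.nn.Embedding(2 * max_len + 1, pos_dim)
        self.score = torch.nn.Linear(hidden_dim + 2 * pos_dim + 2 * type_dim, 1)

    def forward(self, h, pos1, pos2, type1, type2):
        # h: (batch, seq, hidden) BiLSTM states; pos1/pos2: (batch, seq) signed
        # offsets to entity 1/2; type1/type2: (batch, type_dim) latent type features.
        b, n, _ = h.shape
        p1 = self.pos_emb(pos1 + self.max_len)  # shift offsets to valid indices
        p2 = self.pos_emb(pos2 + self.max_len)
        feats = torch.cat([h, p1, p2,
                           type1.unsqueeze(1).expand(b, n, -1),
                           type2.unsqueeze(1).expand(b, n, -1)], dim=-1)
        alpha = F.softmax(self.score(feats).squeeze(-1), dim=-1)  # (batch, seq)
        z = (alpha.unsqueeze(-1) * h).sum(dim=1)  # attention-pooled sentence vector
        return z
```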
Step S30, performing joint training on the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention to obtain a training model;
After the terminal completes the construction of the bidirectional long short-term memory network and the calculation of the entity-aware attention, the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention are jointly trained to obtain a training model. Referring to fig. 7, the step of jointly training the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention includes:
Step S31, acquiring first data of the parameter sharing layer in the bidirectional long short-term memory network and second data of the joint decoding layer in the entity-aware attention;
Step S32, performing joint training on the first data of the parameter sharing layer and the second data of the joint decoding layer.
The terminal trains the joint entity-relation extraction model, which includes constructing a loss function. The loss function comprises two parts, entity extraction loss and relation extraction loss; the smaller the loss function, the higher the accuracy of the model and the better the model can extract the triples in sentences. In an embodiment, classification is performed by adding a fully connected softmax layer on top of the entity-aware attention mechanism, yielding the conditional probability p(y|s, θ) of the relation type, computed as:

p(y|s, θ) = softmax(W_o z + b_o)

where y is the target relation class, s is the input sentence, and θ denotes the parameters learned across the entire network.
The loss function L, minimized with an AdaDelta gradient descent optimizer, is constructed as the negative log-likelihood over the training set with an L2 penalty:

L(θ) = −Σ_{i=1..|D|} log p(y_i | s_i, θ) + λ‖θ‖²

where |D| is the size of the training data set, and the parameter λ is introduced to prevent overfitting of the parameters. L is minimized by the AdaDelta gradient descent optimizer to obtain the parameters θ.
The first data of the parameter sharing layer and the second data of the joint decoding layer are then jointly trained based on the AdaDelta gradient descent optimizer of the loss function L, to obtain the training model.
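A minimal training-loop sketch of this objective, assuming PyTorch and its Adadelta optimizer; the model and dataset names are placeholders, and the explicit L2 term stands in for the λ penalty described above:

```python
# Sketch of the loss construction and AdaDelta-based joint training; model and
# dataset are placeholders for the parameter-sharing and joint-decoding layers.
import torch
import torch.nn.functional as F

def train(model, dataset, lam=1e-4, epochs=10):
    opt = torch.optim.Adadelta(model.parameters())
    for _ in range(epochs):
        for sentence, target in dataset:            # |D| training pairs
            logits = model(sentence)                # W_o z + b_o
            nll = F.cross_entropy(logits, target)   # -log p(y|s, theta)
            # L2 regularization lambda * ||theta||^2 to prevent overfitting.
            l2 = sum((p ** 2).sum() for p in model.parameters())
            loss = nll + lam * l2
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```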
Step S40, inputting the text data into the training model to obtain an extraction result.
After the terminal obtains the training model for entity relationship extraction, the text data from which entity relationships are to be extracted is input into the training model, and the corresponding entity relationship extraction result is output. For example, if the text data to be processed is "Quantum Company A is a subsidiary of Quantum Company B, and the two companies are closely connected", the result produced by the model is (Quantum Company A, subsidiary, Quantum Company B); that is, Quantum Company A and Quantum Company B are quantum entities, and the relationship between them is a subsidiary relationship.
In this embodiment, a self-attention mechanism is constructed from the acquired quantum-related text data; a bidirectional long short-term memory network is constructed based on the self-attention mechanism, and entity-aware attention is calculated based on the bidirectional long short-term memory network; the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention are jointly trained to obtain a training model, where the parameter sharing layer is obtained from the bidirectional long short-term memory network and the joint decoding layer is obtained from the entity-aware attention; and the text data to be processed is input into the training model, which outputs the corresponding entity relationship extraction result. This solves the technical problems that, owing to the rapid development of the industry, the names of many quantum-related entities cannot be identified in time and the relationships among quantum-related entities remain to be sorted out; it avoids complicated feature engineering and improves the efficiency of joint entity-relation extraction.
Further, referring to fig. 3, a second embodiment of the entity relationship extraction method of the present application is provided.
The second embodiment of the entity relationship extraction method differs from the first embodiment in that the step of acquiring quantum-related text data is followed by:
Step S13, performing structured preprocessing on the text data, and performing word segmentation and quantum entity recognition to obtain sentences containing a plurality of quantum entities;
Step S14, composing a corpus based on the sentences.
After the terminal acquires the data related to the quantum entities, each word in the network news and creditworthiness text sentences is mapped to a low-dimensional vector, and the whole sentence is represented by concatenating these vectors:

X = (x_1, x_2, x_3, …, x_n)

where n is the number of words, X is the vectorized representation of the network news and creditworthiness text sentence, and x_i is the vector representation of word w_i, obtained by concatenating the word vector w_i with the character-based vector representation c_i of the word.
The data is then subjected to structured preprocessing, including data cleaning, text word segmentation, entity recognition, relation labeling, dependency analysis and the like; word segmentation and quantum entity recognition yield an experimental corpus composed of sentence-level sentences containing two or more quantum entities. For example, given the text data "Tsinghua University and Huazhong University of Science and Technology have both opened quantum-related majors.", word segmentation yields the word sequence X = {"Tsinghua University", "and", "Huazhong University of Science and Technology", "both", "opened", "quantum-related majors"}, and the identified quantum entities are "Tsinghua University" and "Huazhong University of Science and Technology". The corpus is used to predefine the target types for entity relationship extraction, to label the original quantum-related texts, and to construct the training and test data sets for the entity relationship extraction model. The sentences of the entity relation data set are represented through word embedding as vectors

S = (w_1, w_2, w_3, …, w_n)
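As a preprocessing illustration, assuming the open-source jieba segmenter for Chinese word segmentation and a naive dictionary lookup standing in for real quantum entity recognition (the entity list is hypothetical):

```python
# Word segmentation plus a toy entity lookup; jieba is assumed as the segmenter,
# and the entity dictionary is a hypothetical stand-in for real recognition.
import jieba

text = "清华大学和华中科技大学均开设了量子相关专业。"
words = jieba.lcut(text)  # word sequence X, e.g. ["清华大学", "和", "华中科技大学", ...]

known_entities = {"清华大学", "华中科技大学"}
entities = [w for w in words if w in known_entities]
print(words, entities)
```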
Further, after the step of acquiring the quantum-related text data, the method further includes:
performing quantum entity naming based on the text data, and determining the relationships between the quantum entities.
After acquiring the text data, the terminal needs to name the quantum entities based on the text data and determine the relationships between them. Quantum entity naming can identify quantum-related enterprises, associations between companies, unincorporated operating entities, non-profit organizations and social groups, universities and scientific research institutions, and the like. The quantum-related entities determined by the recognition operation include all companies involved in or applying quantum technology (unlimited companies, limited companies, joint companies, stock limited companies, stock joint companies, and the like), associations between companies, unincorporated operating entities, non-profit organizations and social groups, universities, scientific research institutions and other research bodies, and the like. The general relationships between entities include, but are not limited to, the following:
1. Cooperate (funding, cooperation, joint funding), etc.;
2. Affiliate (affiliated company);
3. Affiliate assignees;
4. Subsidiary;
5. Stock rights (entity A is the controlling shareholder or holding company of entity B);
6. Buy (purchase transactions between companies);
7. Merge (combination of two entities);
8. Establish (entity A establishes entity B in a subordinate and controlled relation);
9. Risk association (risk correlation);
10. Long-term cooperative agreement.
In this embodiment, the collected text data is processed to construct a corpus, entities are named and the relationships between them are determined, providing a data basis for the entity relationship extraction model.
The present application further provides an entity relationship extracting apparatus, in an embodiment, the entity relationship extracting apparatus includes a memory, a processor, and an entity relationship extracting program stored on the memory and capable of running on the processor, and when executed by the processor, the entity relationship extracting program implements the following steps:
acquiring quantum-related text data, and constructing a self-attention mechanism;
constructing a bidirectional long short-term memory network based on the self-attention mechanism, and calculating entity-aware attention based on the bidirectional long short-term memory network;
performing joint training on the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention to obtain a training model;
and inputting the text data into the training model to obtain an extraction result.
In one embodiment, the entity relationship extraction device comprises an acquisition module, a construction module, a training module and an extraction module.
The acquisition module is used for acquiring quantum-related text data and constructing a self-attention mechanism;
the construction module is used for constructing a bidirectional long short-term memory network based on the self-attention mechanism and calculating entity-aware attention based on the bidirectional long short-term memory network;
the training module is used for performing joint training on the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention to obtain a training model;
and the extraction module is used for inputting the text data into the training model to obtain an extraction result.
Further, the acquisition module comprises a calculation unit;
The computing unit is used for performing dot-product calculation on the query and the key to obtain a weight factor;
the computing unit is further configured to scale the weight factor to obtain scaled data, normalize the scaled data to obtain a weight coefficient, and perform weighted summation of the values with the weight coefficient.
Further, the construction module comprises an acquisition unit and a construction unit;
the acquisition unit is used for acquiring data of the data layer of the self-attention mechanism;
the construction unit is used for constructing a forward long short-term memory network and a backward long short-term memory network based on the data of the data layer, and forming the bidirectional long short-term memory network based on the forward long short-term memory network and the backward long short-term memory network.
Further, the building module further comprises a computing unit;
The computing unit is used for computing relative position features and latent entity type features based on the bidirectional long short-term memory network;
the computing unit is further configured to obtain the entity-aware attention according to the relative position features and the latent entity type features.
Further, the training module comprises a construction unit and a training unit;
The construction unit is used for acquiring first data of the parameter sharing layer in the bidirectional long short-term memory network and second data of the joint decoding layer in the entity-aware attention;
the training unit is configured to perform joint training on the first data of the parameter sharing layer and the second data of the joint decoding layer.
Further, the training unit comprises a calculation subunit;
the calculating subunit is used for constructing a gradient descent optimizer of the loss function;
the calculation subunit is further configured to perform joint training on the first data of the parameter sharing layer and the second data of the joint decoding layer based on a gradient descent optimizer of the loss function.
Further, the acquisition module further comprises a processing unit;
the processing unit is used for carrying out structured preprocessing on the text data, and carrying out word segmentation and quantum entity recognition operation to obtain a sentence containing a plurality of quantum entities;
the processing unit is further configured to compose a corpus based on the sentences.
Further, the processing unit is further configured to perform quantum entity naming based on the text data and determine the relationships between the quantum entities.
The implementation of the functions of each module of the entity relationship extraction device is similar to the process in the embodiment of the method, and is not repeated here.
In addition, the present application further provides a terminal, which comprises a memory, a processor, and an entity relationship extraction program stored in the memory and running on the processor. The terminal constructs a self-attention mechanism from acquired quantum-related text data; constructs a bidirectional long short-term memory network based on the self-attention mechanism, and calculates entity-aware attention based on the bidirectional long short-term memory network; jointly trains the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention to obtain a training model; and inputs the text data into the training model to obtain an extraction result. This solves the problems that, owing to the rapid development of the industry, the names of many quantum-related entities cannot be identified in time and the relationships among quantum-related entities remain to be sorted out; it avoids complicated feature engineering and improves the efficiency of joint entity-relation extraction.
In addition, the present application also provides a computer readable storage medium, on which an entity relationship extraction program is stored, and when being executed by a processor, the entity relationship extraction program implements the steps of the entity relationship extraction method as described above.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.
While alternative embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following appended claims be interpreted as including alternative embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (10)
1. An entity relationship extraction method, the method comprising:
acquiring quantum-related text data, and constructing a self-attention mechanism;
constructing a bidirectional long short-term memory network based on the self-attention mechanism, and calculating entity-aware attention based on the bidirectional long short-term memory network;
performing joint training on the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention to obtain a training model;
and inputting the text data into the training model to obtain an extraction result.
2. The entity relationship extraction method according to claim 1, wherein the step of constructing a self-attention mechanism comprises:
performing dot-product calculation on the query and the key to obtain a weight factor;
and scaling the weight factor to obtain scaled data, normalizing the scaled data to obtain a weight coefficient, and performing weighted summation of the values with the weight coefficient.
3. The entity relationship extraction method according to claim 1, wherein the step of constructing a bidirectional long short-term memory network based on the self-attention mechanism comprises:
acquiring data of the data layer of the self-attention mechanism;
and constructing a forward long short-term memory network and a backward long short-term memory network based on the data of the data layer, and forming the bidirectional long short-term memory network from the forward long short-term memory network and the backward long short-term memory network.
4. The entity relationship extraction method according to any one of claims 1 to 3, wherein the step of calculating entity-aware attention based on the bidirectional long short-term memory network comprises:
calculating relative position features and latent entity type features based on the bidirectional long short-term memory network;
and obtaining the entity-aware attention according to the relative position features and the latent entity type features.
5. The entity relationship extraction method according to any one of claims 1 to 3, wherein the step of jointly training the first data in the bidirectional long short-term memory network and the second data in the entity-aware attention comprises:
acquiring first data of a parameter sharing layer in the bidirectional long short-term memory network and second data of a joint decoding layer in the entity-aware attention;
and performing joint training on the first data of the parameter sharing layer and the second data of the joint decoding layer.
6. The entity relationship extraction method as claimed in claim 5, wherein the step of jointly training the first data of the parameter sharing layer and the second data of the joint decoding layer comprises:
constructing a gradient descent optimizer of the loss function;
and performing joint training on the first data of the parameter sharing layer and the second data of the joint decoding layer based on the gradient descent optimizer of the loss function.
7. The entity relationship extraction method according to claim 1, wherein after the step of acquiring the quantum-related text data, the method comprises:
performing structured preprocessing on the text data, and performing word segmentation and quantum entity recognition to obtain sentences containing a plurality of quantum entities;
and composing a corpus based on the sentences.
8. The entity relationship extraction method according to claim 1, wherein after the step of acquiring the quantum-related text data, the method further comprises:
performing quantum entity naming based on the text data, and determining the relationships between the quantum entities.
9. A terminal, characterized in that the terminal comprises a memory, a processor, and an entity relationship extraction program stored in the memory and runnable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 8 when executing the entity relationship extraction program.
10. A computer-readable storage medium, having stored thereon an entity relationship extraction program which, when executed by a processor, implements the steps of the method of any one of claims 1 to 8.
Priority Applications (1)
- CN202011386678.XA (priority date 2020-12-01; filing date 2020-12-01): Entity relationship extraction method, terminal and computer readable storage medium

Publications (1)
- CN112487109A, published 2021-03-12

Family
- ID: 74938649
- Family application: CN202011386678.XA, filed 2020-12-01; status: Pending

Country Status (1)
- CN: CN112487109A
Patent Citations (4)
- CN110334339A (priority 2019-04-30, published 2019-10-15): Sequence labelling model and labelling method based on a position-aware self-attention mechanism
- CN111368528A (priority 2020-03-09, published 2020-07-03): Entity relation joint extraction method for medical texts
- CN111859912A (priority 2020-07-28, published 2020-10-30): PCNN model-based remote supervision relationship extraction method with entity perception
- CN111950297A (priority 2020-08-26, published 2020-11-17): Abnormal event oriented relation extraction method

Application events
- 2020-12-01: application CN202011386678.XA filed; publication CN112487109A; status: Pending
Non-Patent Citations (2)
- NEW Y: reading notes on "Position-aware Attention and Supervised Data Improve Slot Filling", pages 1-5, retrieved from the Internet: https://zhuanlan.zhihu.com/p/30828466
- 张计龙: 慧源共享 数据悦读 首届上海高校开放数据创新研究大赛 数据论文集, Fudan University Press (复旦大学出版社), 31 October 2020, pages 64-67
Cited By (5)
- CN113158667A (priority 2021-04-09, published 2021-07-23): Event detection method based on entity relationship level attention mechanism
- CN113609244A (priority 2021-06-08, published 2021-11-05): Structured record extraction method and device based on controllable generation
- CN113609244B (published 2023-09-05): Structured record extraction method and device based on controllable generation
- CN113792881A (priority 2021-09-17, published 2021-12-14): Model training method and device, electronic device and medium
- WO2023056808A1 (priority 2021-10-08, published 2023-04-13): Encrypted malicious traffic detection method and apparatus, storage medium and electronic apparatus
Similar Documents
- Bharadiya: A comparative study of business intelligence and artificial intelligence with big data analytics
- Pan et al.: Study on convolutional neural network and its application in data mining and sales forecasting for E-commerce
- CN110751286B: Training method and training system for neural network model
- CN112487109A: Entity relationship extraction method, terminal and computer readable storage medium (this document)
- Chakraborty et al.: Comparative sentiment analysis on a set of movie reviews using deep learning approach
- Tyagi et al.: A Step-To-Step Guide to Write a Quality Research Article
- CN112231569A: News recommendation method and device, computer equipment and storage medium
- Kumar et al.: Image sentiment analysis using convolutional neural network
- Zhao et al.: What is market talking about? Market-oriented prospect analysis for entrepreneur fundraising
- Feng: Data Analysis and Prediction Modeling Based on Deep Learning in E-Commerce
- Khan et al.: A customized deep learning-based framework for classification and analysis of social media posts to enhance the Hajj and Umrah services
- Singhal et al.: An E-commerce prediction system for product allocation to bridge the gap between cultural analytics and data science
- CN117235257A: Emotion prediction method, device, equipment and storage medium based on artificial intelligence
- Modrušan et al.: Intelligent Public Procurement Monitoring System Powered by Text Mining and Balanced Indicators
- Zhang et al.: SLIND: identifying stable links in online social networks
- Foote et al.: A computational analysis of social media scholarship
- Sun et al.: Navigating the Digital Transformation of Commercial Banks: Embracing Innovation in Customer Emotion Analysis
- Cai et al.: Joint attention LSTM network for aspect-level sentiment analysis
- Chen et al.: Towards accurate search for e-commerce in steel industry: a knowledge-graph-based approach
- US11941076B1: Intelligent product sequencing for category trees
- Rastogi et al.: Computational Analysis of Online Pooja Portal for Pandit Booking System: An AI and ML Based Approach for Smart Cities
- Jantan et al.: Convolutional Neural Networks (CNN) Model for Mobile Brand Sentiment Analysis
- Ramya et al.: Real time emotion support system in text mining [rtestm]
- Le et al.: Geographic Information Systems (GIS) Based Visual Analytics Framework for Highway Project Performance Evaluation
- He et al.: User Profiles
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination