CN111368542A - Text language association extraction method and system based on recurrent neural network - Google Patents

Text language association extraction method and system based on recurrent neural network Download PDF

Info

Publication number
CN111368542A
Authority
CN
China
Prior art keywords
entity
expression
sequence
vector
context
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811600745.6A
Other languages
Chinese (zh)
Inventor
韩英
陈薇
王腾蛟
李强
刘迪
黄晓光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
State Grid Zhejiang Electric Power Co Ltd
Original Assignee
Peking University
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University, State Grid Corp of China SGCC, State Grid Information and Telecommunication Co Ltd, State Grid Zhejiang Electric Power Co Ltd filed Critical Peking University
Priority to CN201811600745.6A priority Critical patent/CN111368542A/en
Publication of CN111368542A publication Critical patent/CN111368542A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention discloses a text language association extraction method and system based on a recurrent neural network. The method automatically extracts complex context features with a recurrent neural network (a bidirectional long short-term memory network) and encodes the semantic information of the context; discovers definition patterns in a document through a rule-based entity expression pair extractor, identifies definitions of non-standard expressions in the document, and extracts the defined standard expression and non-standard expression that belong to the same entity concept; encodes the features of the extracted entity expression pairs, embedding information about entity normalization into a low-dimensional entity expression vector; concatenates the entity expression vector with the context feature encoding vector and applies a dimension transformation to obtain the final encoding; and, with a decoder based on a conditional random field, combines the features learned by the encoder with the transition probabilities between states to decode a globally optimal state sequence as the final output sequence. The invention can effectively improve the performance of entity recognition.

Description

Text language association extraction method and system based on recurrent neural network
Technical Field
The invention belongs to the field of artificial intelligence and relates to a method for extracting information from massive unstructured data using natural language processing technology, in particular to a method for identifying entities in text and extracting entity association relations, which is a key technology for information extraction.
Background
Text entity extraction identifies entities with specific meanings, such as person names, place names, and organization names, from text. It is a key technology for extracting information from massive unstructured data and the cornerstone of many complex natural language processing applications, such as intelligent question answering, knowledge graphs, automatic summarization, and machine translation.
Due to the rich expressive forms of natural language, the same entity may have many different expressions, such as its full name, abbreviations, and alternative names. The phenomenon of multiple expressions referring to the same entity is widespread in both Chinese and English, e.g. "Industrial and Commercial Bank of China" versus its short form "ICBC" in Chinese, and "United States" versus "U.S." in English. The variable expression forms of entities pose a huge challenge to entity recognition. The results of Khalid et al. [Khalid M A, Jijkoun V, De Rijke M. The impact of named entity normalization on information retrieval for question answering [C]// European Conference on Information Retrieval. Springer, Berlin, Heidelberg, 2008: 705-710] show that normalizing named entities noticeably improves retrieval performance for question answering.
In the field of natural language processing, entity recognition and entity association normalization have traditionally been treated as independent tasks and processed separately: entity recognition is performed first, and its result is then used as the input of entity normalization. In this pipeline mode the result of entity normalization cannot be fed back to entity recognition, so entity recognition cannot exploit the useful information produced by normalization. Existing research on jointly processing entity recognition and entity normalization is also very limited. Liu X et al. [Liu X, Zhou M, Wei F, et al. Joint inference of named entity recognition and normalization for tweets [C]// Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1. Association for Computational Linguistics, 2012: 526-535] studied the joint processing of entity recognition and normalization for tweets and proposed a probabilistic-graph-based model. The model introduces a binary random variable to describe whether two entity expressions in tweets with similar content refer to the same entity concept. Similarly, Luo G et al. [Luo G, Huang X, Lin C Y, et al. Joint entity recognition and disambiguation [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015: 879-888] also proposed a probabilistic-graph-based model to combine entity recognition and entity normalization. These methods all focus on jointly normalizing entity expressions and recognizing entities in short texts (tweets), and their probabilistic graphical models, based on statistical machine learning, rely on a large number of manually constructed features. Such feature engineering is costly, hard to scale to large datasets, performs poorly on massive data, and is not data-driven; moreover, manually built features cannot cover the high-order interactions of many hidden contextual features. In addition, the entity normalization modules of these methods all depend on an existing dictionary, under the unreasonable assumption that the standard entity expression is present in the dictionary. Existing dictionaries have limited coverage, and many corpora lack dictionaries for the corresponding domains. Especially today, with information technology highly developed, new entities frequently appear in news media text, such as reports on newly established institutions, newly issued bonds, and newly occurring events; these do not exist in existing dictionaries and knowledge bases, and dictionary-dependent methods cannot normalize the names of such new entities.
A technical solution is therefore needed that can automatically learn the complex features of the text context without relying on manual feature engineering, effectively obtain entity normalization information from the definitions of non-standard entity expressions given in the document, and integrate the learned context features with the information of the entity expression pairs defined in the document to achieve better entity recognition.
Disclosure of Invention
In view of these problems, the invention aims to design and implement a model combining rules and deep learning for extracting text entities and entity association relations. It uses deep learning to extract context features automatically, avoiding complex feature engineering; it uses rules to incorporate human knowledge and experience by discovering definitions of entity expressions in the document; and it achieves better entity recognition by letting entity association normalization within the text assist entity recognition.
In order to achieve the purpose, the invention adopts the following technical scheme:
a text entity and entity incidence relation extraction method based on a recurrent neural network comprises the following steps:
(1) automatically extracting complex context characteristics through a time recursive neural network (a bidirectional long and short term memory network), and coding information of the context characteristics;
(2) discovering a definition mode in the document through a rule, identifying the definition related to the non-standard expression in the document, and extracting the defined standard expression and the defined non-standard expression which belong to the same entity concept as an entity expression characteristic;
(3) coding the extracted entity expression characteristics, and embedding information related to entity normalization into a low-dimensional entity expression vector;
(4) connecting the context characteristics and the codes of the entity expression characteristics in a vector space to obtain a final code fusing entity identification and entity expression normalization information;
(5) and sending the final code into a conditional random field model, calculating a global optimal state sequence by combining transition probabilities among states, decoding and outputting a final result sequence of the text entity and the entity incidence relation.
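For illustration only, the five steps can be wired together as in the following Python sketch; every component name here is a hypothetical stand-in for the modules detailed in the embodiments below.

```python
def extract_entities(embedded_text, raw_text, encoder, pair_extractor,
                     expr_encoder, fusion, crf_decoder):
    """High-level flow of steps (1)-(5). All components are hypothetical
    stand-ins for the modules described in the detailed embodiments."""
    h_r = encoder(embedded_text)          # (1) BiLSTM context features
    pairs = pair_extractor(raw_text)      # (2) rule-based <F, A> expression pairs
    h_n = expr_encoder(raw_text, pairs)   # (3) low-dimensional expression vectors
    h = fusion(h_r, h_n)                  # (4) concatenate and project
    return crf_decoder(h)                 # (5) globally optimal tag sequence
```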
A system for extracting text entities and entity association relations based on a recurrent neural network comprises:
a character/word embedding module, which maps each character/word of the original text sequence to a vector of a fixed dimension;
a context feature encoder, which represents the embedded text sequence in vector form, automatically extracts complex context features, and encodes the semantic information of the context;
a word segmentation module, which segments the original text sequence;
an entity expression pair extractor, which discovers definitions of non-standard expressions in the document based on the segmentation results of the word segmentation module, and extracts the defined standard and non-standard expressions belonging to the same entity concept as entity expression features;
an entity normalization information encoder, which encodes the entity expression features extracted by the entity expression pair extractor, embedding the information about entity normalization into a low-dimensional entity expression vector;
an entity recognition and normalization encoding combination module, which concatenates the context features produced by the context feature encoder with the entity expression feature encodings produced by the entity normalization information encoder in vector space, obtaining a final encoding that fuses entity recognition and entity expression normalization information;
and a conditional-random-field-based decoder, which combines the output of the entity recognition and normalization encoding combination module with the transition probabilities between states to compute a globally optimal state sequence as the final output sequence of text entities and entity association relations.
Compared with the prior art, the invention has the following positive effects:
the invention provides a text entity and entity incidence relation extraction method based on a recurrent neural network by adopting a mode of combining rules and deep learning, which utilizes a bidirectional long-short term memory network to automatically extract text context semantic features, simultaneously combines human experience and knowledge into rules for extracting entity nonstandard expressions defined in documents, and improves the performance of an entity recognition system through entity incidence normalization. The method utilizes the advantage of deep learning for automatically extracting the features, avoids manual feature engineering which has high time cost and high labor cost and is difficult to expand to a large data set, and realizes real data driving; meanwhile, the method gives full play to the knowledge and experience of people, quickly discovers the definition of the entity nonstandard expression in the document based on the rule, and extracts the entity expression pair by fully utilizing the information transmitted by the document content; the relevance of the entity identification and the entity normalization task is fully utilized, compared with the traditional separate processing mode, the simultaneous processing of the entity identification and the entity normalization can be supported, the information sharing of the entity identification and the entity normalization is realized, and the performance of the entity identification is improved by utilizing the information of the entity normalization. The invention has the advantages of low overhead, high expression and multiple applications.
Drawings
Fig. 1 is a schematic diagram illustrating a module composition of a recurrent neural network-based text entity and entity association extraction system according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of the data flow and network structure of a recurrent neural network-based text entity and entity association extraction system according to an embodiment of the present invention, where B-ORG denotes the beginning of an organization entity, I-ORG its middle, E-ORG its end, and O a token that is not part of an organization entity.
Fig. 3 is a flowchart illustrating steps of a recurrent neural network-based text entity and entity association extraction system according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments and accompanying drawings.
Fig. 1 is a schematic diagram of constituent modules of a recurrent neural network-based text entity and entity association relation extraction system according to an embodiment of the present invention, and fig. 2 is a schematic diagram of data flows and a network structure of a recurrent neural network-based text entity and entity association relation extraction system according to an embodiment of the present invention. With reference to fig. 1 and fig. 2, the functions and implementations of the modules shown in fig. 1 are described as follows:
(1) The context feature encoder, based on a temporal recurrent neural network (a bidirectional long short-term memory network), consists of a forward long short-term memory (LSTM) network and a backward LSTM network, and is responsible for automatically extracting complex context features and encoding the semantic information of the context.
At time t, when the LSTM receives the information from the previous time step, the cell (the LSTM neuron) first decides which part of that information to forget; the forget gate controls this. Its inputs are the input x_t at the current time and the output h_{t-1} of the previous time step. The formula of the forget gate is:

f_t = σ(W_f · [h_{t-1}, x_t] + b_f)

where f_t is the activation of the forget gate, σ is the activation function (the sigmoid function), W_f is the input weight of the forget gate, and b_f is its bias.
After discarding useless information, the cell decides which newly arrived information to absorb. The formula of the input gate is:

i_t = σ(W_i · [h_{t-1}, x_t] + b_i)

where i_t is the activation of the input gate, σ is the activation function (the sigmoid function), W_i is the input weight of the input gate, and b_i is its bias.
The cell candidate at the current time is:

C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)

where C̃_t is the cell candidate, W_C is the input weight of the cell candidate, x_t is the input at the current time, h_{t-1} is the output of the previous time step, and b_C is the bias of the cell candidate.
The cell state is then updated: the new cell state is computed from the selectively forgotten old cell state and the candidate cell state:

C_t = f_t * C_{t-1} + i_t * C̃_t

where C_t is the new value of the cell state, f_t is the forget gate activation, C_{t-1} is the cell state at the previous time step, i_t is the input gate activation, and C̃_t is the cell candidate at the current time.
Finally, the output gate determines the output vector h_t of the hidden layer at the current time. The output gate is defined as:

o_t = σ(W_o · [h_{t-1}, x_t] + b_o)

where o_t is the activation of the output gate, σ is the activation function (the sigmoid function), W_o is the connection weight of the output gate, b_o is its bias, x_t is the input at the current time, and h_{t-1} is the output of the previous time step.
The output of the hidden layer at the current time is the activated cell state passed outward through the output gate:

h_t = o_t * tanh(C_t)

where o_t is the output gate activation, C_t is the updated cell state at the current time, and h_t is the output at the current time.
For a given text sequence of n characters/words (words for English, characters for Chinese), denote it S = [w_1, w_2, w_3, …, w_n], where w_i is the vector representing the i-th character/word of the sequence after embedding. At time n, the hidden-layer output of the forward LSTM network is denoted h_n^→, and the hidden-layer output of the backward LSTM network is denoted h_n^←. The two are joined by a merging layer to obtain h_n = [h_n^→, h_n^←]. The output of the context feature encoder is denoted H_R.
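For illustration, a minimal sketch of such a context feature encoder, assuming PyTorch; the 300-dimensional input follows the character embedding of step 3 below, while the hidden size of 256 and the class name are hypothetical choices.

```python
import torch.nn as nn

class ContextFeatureEncoder(nn.Module):
    """BiLSTM context encoder: at every position, the forward and backward
    hidden states are concatenated, yielding H_R as described above."""
    def __init__(self, embed_dim=300, hidden_dim=256):
        super().__init__()
        # bidirectional=True runs a forward and a backward LSTM and
        # concatenates their hidden outputs at each time step
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)

    def forward(self, embedded):        # (batch, seq_len, embed_dim)
        h_r, _ = self.bilstm(embedded)  # (batch, seq_len, 2 * hidden_dim)
        return h_r                      # H_R in the notation above
```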
(2) The rule-based entity expression pair extractor makes full use of human knowledge and experience: through rules based on syntactic and lexical structure, it discovers the definitions of non-standard entity expressions in the document and extracts the expression pairs, given by those definitions, that refer to the same entity concept, such as the name pairs <full name, abbreviation> and <full name, alternative name>.
Table 1 gives the rules used by the extractor, where F denotes a standard expression and A a non-standard expression such as an abbreviation or alternative name. The string length of F is required to be greater than that of A. The expression pair extractor extracts entity expression pairs from data satisfying these syntactic and lexical conditions.
Table 1. Expression rules used by the extractor
(Table 1 is rendered as an image in the published document; it lists the syntactic and lexical definition patterns matched by the extractor, such as a standard expression F immediately followed by a parenthesized non-standard expression A, as in "F (A)".)
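As an illustration of the extractor's operation, the following sketch matches one hypothetical definition pattern of the kind Table 1 describes — a standard expression immediately followed by a parenthesized non-standard expression — and enforces the string-length condition stated above; the regular expression is an assumed stand-in for the full rule set.

```python
import re

# One illustrative definition pattern: a name immediately followed by a
# parenthesized short form, "F (A)" or "F（A）". The full rule set
# corresponds to Table 1; this single pattern is a hypothetical stand-in.
DEFINITION_PATTERN = re.compile(
    r'(?P<full>[^\s（(）)]{2,})\s*[（(]\s*(?P<abbr>[^（(）)]{1,10})\s*[）)]')

def extract_expression_pairs(sentence):
    """Return (F, A) pairs where F is the standard expression and A the
    non-standard one; F must be strictly longer than A (lexical rule)."""
    pairs = []
    for m in DEFINITION_PATTERN.finditer(sentence):
        full, abbr = m.group('full'), m.group('abbr')
        if len(full) > len(abbr):  # string-length condition from the text
            pairs.append((full, abbr))
    return pairs
```

For instance, a sentence containing "中国工商银行（工行）" would yield the pair ("中国工商银行", "工行"), matching the full-name/abbreviation example given in the background section.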
(3) The entity normalization information encoder is responsible for encoding the features of the extracted entity expression pairs and embedding the information about entity normalization into a low-dimensional entity expression vector. Each expression pair produced by the entity expression pair extractor is first converted into a vector of fixed length and then further learned through a linear layer.
From left to right, the elements of the expression vector correspond to the beginning, middle, end, and single-character form of a non-standard name, and to the beginning, middle, and end of a standard name. Since the standard name is the longest of an entity's multiple names, the single-character case does not arise for it. For a given text sequence of n characters/words (words for English, characters for Chinese), denoted S = [w_1, w_2, w_3, …, w_n], suppose the set of expression pairs produced by the expression pair extractor is {<F_1, A_1>, <F_2, A_2>, …, <F_k, A_k>}. For each character/word w_i, define an expression function

g(w_i) = m, if w_i belongs to the expression pair <F_m, A_m>; g(w_i) = 0 otherwise,

where g(w_i) denotes the value of the expression function for the i-th character/word.

For each w_i satisfying g(w_i) ≠ 0, the meaning M_i of the expression-vector element corresponding to its named-entity normalization is defined by the name w_i belongs to (the standard expression F or the non-standard expression A) together with its position Pos within that name, where Pos takes the values B (beginning), I (middle), E (end), or S (the name consists of a single character/word).

With the initial expression vector denoted V, for each character/word w_i (characters for Chinese, words for English):

V_{ij} = 1 if M_i is the label N_j, and V_{ij} = 0 otherwise,

where 1 ≤ i ≤ n, 1 ≤ j ≤ 7, and N_j denotes the label represented by the j-th element of the expression vector.
With the initialized expression vector V, after the linear-layer processing the output of the entity normalization information encoder is the final expression vector:

H_N = φ(V) = w_l · V + b_l

where H_N denotes the final expression vector, φ denotes the function acting on the initialized expression vector, w_l denotes the input weight of the linear layer, and b_l denotes its bias.
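For illustration, a sketch of initializing the seven-element expression vector V for a character-level (Chinese) sequence; the label ordering follows the description above, and the helper names are hypothetical.

```python
# Label order assumed from the description: indices 0-3 mark the beginning /
# middle / end / single-character non-standard name (A); indices 4-6 mark
# the beginning / middle / end of the standard name (F).
A_B, A_I, A_E, A_S, F_B, F_I, F_E = range(7)

def position_labels(length):
    """BIES position labels for a name of the given character length."""
    if length == 1:
        return ['S']
    return ['B'] + ['I'] * (length - 2) + ['E']

def expression_vector(chars, pairs):
    """Initialize V: one 7-dimensional indicator vector per character.
    `pairs` is the set of (F, A) pairs from the rule-based extractor.
    For simplicity, only the first occurrence of each name is marked."""
    V = [[0.0] * 7 for _ in chars]
    text = ''.join(chars)  # character-level joining (Chinese)
    index = {'A': {'B': A_B, 'I': A_I, 'E': A_E, 'S': A_S},
             'F': {'B': F_B, 'I': F_I, 'E': F_E}}
    for full, abbr in pairs:
        for name, kind in ((full, 'F'), (abbr, 'A')):
            start = text.find(name)
            if start < 0:
                continue
            for offset, pos in enumerate(position_labels(len(name))):
                if kind == 'F' and pos == 'S':
                    continue  # the standard name is never a single character
                V[start + offset][index[kind][pos]] = 1.0
    return V
```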
(4) The context feature and entity expression feature combination module is responsible for concatenating the encodings of the context features and the entity expression features in vector space, obtaining a final encoding that fuses entity recognition and entity expression normalization information.
The hidden-layer vector H_R obtained by the context feature encoder and the expression vector H_N obtained by the entity normalization information encoder are spliced into a vector H_A containing both high-order and low-order feature interactions:

H_A = [H_R, H_N]

H_A is then transformed through a fully connected layer to form the output vector H of the final encoder:

H = w_f · H_A + b_f

H is a tensor of shape (n, L), where n is the length of each sample sequence and L is the number of output label classes.
(5) The conditional-random-field-based decoder is responsible for combining the features learned by the encoder with the transition probabilities between states to decode a globally optimal state sequence as the final output sequence.

H_{i,y_i} denotes the score of the state feature when the predicted tag of the i-th word of the sequence is y_i, and A_{y_i,y_{i+1}} denotes the score of the state-transition feature for moving from tag y_i to y_{i+1}; y_0 denotes the start of the tag sequence and y_n its end. The total score of a tag sequence is the sum of the state-feature scores and the transition-feature scores, defined as:

S(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} H_{i, y_i}

Applying Softmax to the scores S(X, y) of all possible tag sequences y yields the probability of a sequence y:

p(y|X) = exp(S(X, y)) / Σ_{ỹ ∈ Y_X} exp(S(X, ỹ))

where Y_X denotes all possible tag sequences for the input sequence X. In the prediction phase of the decoder, the output tag sequence is the one achieving the maximum score:

y* = argmax_{ỹ ∈ Y_X} S(X, ỹ)
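For illustration, a sketch of the prediction-phase decoding by dynamic programming (Viterbi) over the emission scores H and the transition matrix A defined above; the special start and end transitions (y_0, y_n) are omitted for brevity.

```python
import torch

def viterbi_decode(emissions, transitions):
    """emissions: (n, L) state-feature scores H; transitions: (L, L)
    matrix A with A[i, j] = score of moving from tag i to tag j.
    Returns the tag sequence y* maximizing S(X, y) (start/end omitted)."""
    n, L = emissions.shape
    score = emissions[0].clone()  # best score ending in each tag so far
    backptr = []
    for t in range(1, n):
        # score[i] + A[i, j] + H[t, j] for every previous/current tag pair
        total = score.unsqueeze(1) + transitions + emissions[t]
        score, best_prev = total.max(dim=0)  # best previous tag per current tag
        backptr.append(best_prev)
    # follow back-pointers from the best final tag
    best_tag = int(score.argmax())
    path = [best_tag]
    for best_prev in reversed(backptr):
        best_tag = int(best_prev[best_tag])
        path.append(best_tag)
    return list(reversed(path))
```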
Fig. 3 is a flowchart illustrating steps of a method for extracting a text entity and an entity association relationship based on a recurrent neural network according to an embodiment of the present invention. The steps are specifically described as follows:
step 1.1 prepare data and segment the dataset.
Preparing marked data, segmenting the marked data into a training data set, developing the data set and a testing data set, wherein the training data set and the developing data set are used in a training stage, and the testing data set is used in a testing stage. The data set is a text data set, and each sample is an article.
Step 1.2 Establish a character index table.
Build a character index over all the obtained corpora, numbering each character from 1 and adding a number for the unknown character, for use by the subsequent character/word embedding module (characters for Chinese, words for English).
Step 1.3 Batch sample input
Samples from the training dataset are fed into the system in batches, following the mini-batch principle with a set batch size.
Step 2.1 Word segmentation
Segment the text sentence by sentence.
Step 2.2 Definition pattern matching
Search each sentence for the syntactic conditions of a definition pattern, e.g. whether a definition of the form "full name (abbreviation)" is present. If it is, the sentence may define an entity expression pair; if not, the sentence contains no entity expression pair.
Step 2.3 Extract entity expression pairs by forward and backward search
Taking the definition marker, such as "(", as a separator, search the words before and after it and check whether word combinations satisfying the lexical conditions of an entity expression pair exist on both sides; if so, extract the entity expression pair.
Step 2.4 Expression information embedding
Embed the extracted entity expression information, converting it into a low-dimensional entity expression vector in which each dimension indicates whether a character belongs to the abbreviation or the full name and its position within the entity. If no entity expression pair exists, the vector is initialized to all zeros.
Step 3 Word embedding
Embed each character of each input sample at the character/word level (words for English, characters for Chinese), converting it into a 300-dimensional vector via the character index table and a linear layer.
Step 4 Bidirectional LSTM network
The input sample sequence, represented by the word vectors, is fed into the bidirectional LSTM network to extract context feature information.
Step 5 Connect and transform
The hidden-layer vectors output by the bidirectional LSTM network are spliced with the entity expression vectors, connecting them in vector space; the dimensionality of the tensor is then transformed through the fully connected layer. This yields the emission score of each character (the probability of the state sequence generating the observation sequence), i.e. the state features of the CRF model.
Step 6 CRF modeling of state transition probabilities
The CRF models the dependencies between states (tags) as well as the emission probabilities from the observation sequence to the state sequence.
Step 7 Decode the globally optimal sequence
Compute the score of each sequence and, combining the tag transition probabilities, use a dynamic programming algorithm to find the sequence with the highest global score as the final output sequence. In the prediction phase the procedure ends at step 7; in the training phase, steps 8 and 9 follow.
Step 8 Compute the cost function
During training, the objective function is to maximize the log-likelihood of the correct tag sequences of the training set:

L = Σ_{(X, y)} log p(y|X)

The cost function is the negative of the objective function.
Step 9 Adaptive gradient descent algorithm
Train the model with the Adam algorithm, adjusting the learning rate adaptively as training proceeds. If the model's performance on the held-out test set decreases, overfitting is indicated and training stops immediately; otherwise training continues.
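For illustration, a sketch of the training loop of steps 8 and 9, assuming a model object exposing a hypothetical neg_log_likelihood method that returns the negative of the objective above, and a hypothetical evaluate helper that scores the model on the held-out data.

```python
import torch

def train(model, train_loader, heldout_loader, evaluate,
          max_epochs=50, lr=1e-3):
    """Adam training with the early-stopping rule of step 9: stop as soon
    as performance on the held-out data decreases. `neg_log_likelihood`
    and `evaluate` are hypothetical interfaces, not part of the disclosure."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_score = float('-inf')
    for epoch in range(max_epochs):
        model.train()
        for tokens, tags in train_loader:  # mini-batches, as in step 1.3
            optimizer.zero_grad()
            loss = model.neg_log_likelihood(tokens, tags)  # -log p(y|X), step 8
            loss.backward()
            optimizer.step()
        score = evaluate(model, heldout_loader)  # e.g. entity-level F1
        if score < best_score:
            break  # performance dropped: assume overfitting and stop
        best_score = score
    return model
```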
While the foregoing disclosure shows illustrative embodiments of the invention, it should be noted that various changes and modifications could be made herein without departing from the scope of the invention as defined by the appended claims. In accordance with the structures of the embodiments of the invention described herein, the constituent elements of the claims can be replaced with any functionally equivalent elements. Therefore, the scope of the present invention should be determined by the contents of the appended claims.

Claims (10)

1. A text language association extraction method based on a recurrent neural network, characterized by comprising the following steps:
(1) automatically extracting complex context features through a temporal recurrent neural network, and encoding the context feature information;
(2) discovering definition patterns in the document through rules, identifying definitions of non-standard expressions in the document, and extracting the defined standard expression and non-standard expression that belong to the same entity concept as entity expression features;
(3) encoding the extracted entity expression features, and embedding information about entity normalization into a low-dimensional entity expression vector;
(4) concatenating the encodings of the context features and the entity expression features in vector space to obtain a final encoding that fuses entity recognition and entity expression normalization information;
(5) feeding the final encoding into a conditional random field model, computing the globally optimal state sequence by combining the transition probabilities between states, and decoding and outputting the final result sequence of text entities and entity association relations.
2. The method of claim 1, wherein the temporal recurrent neural network of step (1) is a bidirectional long short-term memory network.
3. The method of claim 1, wherein step (2) extracts the expression pairs referring to the same entity concept through rules based on syntactic and lexical structure, wherein the non-standard expressions include abbreviations and alternative names, and the string length of the standard expression is required to be greater than that of the non-standard expression.
4. The method of claim 1, wherein step (3) converts the expression pairs extracted by the entity expression pair extractor into vectors of a fixed length, which are then further learned through a linear layer to obtain the final entity expression vector.
5. The method of claim 4, wherein the elements of the entity expression vector correspond, from left to right, to the beginning, middle, end, and single-character form of the non-standard name, and to the beginning, middle, and end of the standard name.
6. The method of claim 1, wherein step (4) splices the hidden-layer vector H_R obtained by the context feature encoder and the expression vector H_N obtained by the entity normalization information encoder into a vector H_A containing high-order and low-order feature interactions, H_A = [H_R, H_N], and transforms H_A through a fully connected layer into the output vector H of the final encoder, where H is a tensor of shape (n, L), n is the length of each sample sequence, and L is the number of output label classes.
7. The method of claim 1, wherein step (5) comprises:
(5.1) computing the total score of a tag sequence as the sum of the state-feature scores and the transition-feature scores, defined as:

S(X, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} H_{i, y_i}

where H_{i,y_i} denotes the score of the state feature when the predicted tag of the i-th word of the sequence is y_i, A_{y_i,y_{i+1}} denotes the score of the state-transition feature for moving from tag y_i to y_{i+1}, y_0 denotes the start of the tag sequence, and y_n denotes its end;
(5.2) applying Softmax to the scores S(X, y) of all possible tag sequences y to obtain the probability of a sequence y:

p(y|X) = exp(S(X, y)) / Σ_{ỹ ∈ Y_X} exp(S(X, ỹ))

where Y_X denotes all possible tag sequences for the input sequence X;
(5.3) in the prediction phase of the decoder, outputting the tag sequence that achieves the maximum score:

y* = argmax_{ỹ ∈ Y_X} S(X, ỹ)
8. The method of claim 1 or 7, wherein, for the conditional-random-field-based decoder, the objective function during training is to maximize the log-likelihood of the correct tag sequences of the training set, and the cost function is the negative of the objective function.
9. The method of claim 8, wherein the conditional-random-field-based decoder is trained using an adaptive gradient descent algorithm, the learning rate being adjusted adaptively according to the training progress; if the performance of the model on the test set decreases, training is stopped, and otherwise training continues.
10. A system for extracting text language associations based on a recurrent neural network, comprising:
a character/word embedding module, which maps each character/word of the original text sequence to a vector of a fixed dimension;
a context feature encoder, which represents the embedded text sequence in vector form, automatically extracts complex context features, and encodes the semantic information of the context;
a word segmentation module, which segments the original text sequence;
an entity expression pair extractor, which discovers definitions of non-standard expressions in the document based on the segmentation results of the word segmentation module, and extracts the defined standard and non-standard expressions belonging to the same entity concept as entity expression features;
an entity normalization information encoder, which encodes the entity expression features extracted by the entity expression pair extractor, embedding the information about entity normalization into a low-dimensional entity expression vector;
an entity recognition and normalization encoding combination module, which concatenates the context features produced by the context feature encoder with the entity expression feature encodings produced by the entity normalization information encoder in vector space, obtaining a final encoding that fuses entity recognition and entity expression normalization information;
and a conditional-random-field-based decoder, which combines the output of the entity recognition and normalization encoding combination module with the transition probabilities between states to compute a globally optimal state sequence as the final output sequence of text entities and entity association relations.
CN201811600745.6A 2018-12-26 2018-12-26 Text language association extraction method and system based on recurrent neural network Pending CN111368542A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811600745.6A CN111368542A (en) 2018-12-26 2018-12-26 Text language association extraction method and system based on recurrent neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811600745.6A CN111368542A (en) 2018-12-26 2018-12-26 Text language association extraction method and system based on recurrent neural network

Publications (1)

Publication Number Publication Date
CN111368542A true CN111368542A (en) 2020-07-03

Family

ID=71206031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811600745.6A Pending CN111368542A (en) 2018-12-26 2018-12-26 Text language association extraction method and system based on recurrent neural network

Country Status (1)

Country Link
CN (1) CN111368542A (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853292A (en) * 2010-05-18 2010-10-06 深圳市北科瑞讯信息技术有限公司 Method and system for constructing business social network
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN107122416A (en) * 2017-03-31 2017-09-01 北京大学 A kind of Chinese event abstracting method
CN107526798A (en) * 2017-08-18 2017-12-29 武汉红茶数据技术有限公司 A kind of Entity recognition based on neutral net and standardization integrated processes and model
CN108460013A (en) * 2018-01-30 2018-08-28 大连理工大学 A kind of sequence labelling model based on fine granularity vocabulary representation model
CN108446355A (en) * 2018-03-12 2018-08-24 深圳证券信息有限公司 Investment and financing event argument abstracting method, device and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
朱佳晖 (Zhu Jiahui): "Military Named Entity Recognition and Linking Based on Bidirectional LSTM and CRF," in Proceedings of the 6th China Command and Control Conference (Volume I) *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112184178A (en) * 2020-10-14 2021-01-05 深圳壹账通智能科技有限公司 Mail content extraction method and device, electronic equipment and storage medium
CN113065346A (en) * 2021-04-02 2021-07-02 国网浙江省电力有限公司信息通信分公司 Text entity identification method and related device
CN113268595A (en) * 2021-05-24 2021-08-17 中国电子科技集团公司第二十八研究所 Structured airport alarm processing method based on entity relationship extraction
CN113268595B (en) * 2021-05-24 2022-09-06 中国电子科技集团公司第二十八研究所 Structured airport alarm processing method based on entity relationship extraction
CN114625340A (en) * 2022-05-11 2022-06-14 深圳市商用管理软件有限公司 Commercial software research and development method, device, equipment and medium based on demand analysis
CN114663896A (en) * 2022-05-17 2022-06-24 深圳前海环融联易信息科技服务有限公司 Document information extraction method, device, equipment and medium based on image processing
CN116090458A (en) * 2022-12-20 2023-05-09 北京邮电大学 Medical information extraction method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111783462B (en) Chinese named entity recognition model and method based on double neural network fusion
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN109657239B (en) Chinese named entity recognition method based on attention mechanism and language model learning
CN110162749B (en) Information extraction method, information extraction device, computer equipment and computer readable storage medium
CN108416058B (en) Bi-LSTM input information enhancement-based relation extraction method
CN111368542A (en) Text language association extraction method and system based on recurrent neural network
CN111737496A (en) Power equipment fault knowledge map construction method
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
CN110851604B (en) Text classification method and device, electronic equipment and storage medium
CN111666758B (en) Chinese word segmentation method, training device and computer readable storage medium
CN110263325B (en) Chinese word segmentation system
CN112380863A (en) Sequence labeling method based on multi-head self-attention mechanism
CN113704416B (en) Word sense disambiguation method and device, electronic equipment and computer-readable storage medium
CN112163089B (en) High-technology text classification method and system integrating named entity recognition
CN111309918A (en) Multi-label text classification method based on label relevance
Ren et al. Detecting the scope of negation and speculation in biomedical texts by using recursive neural network
CN111222318A (en) Trigger word recognition method based on two-channel bidirectional LSTM-CRF network
CN113190656A (en) Chinese named entity extraction method based on multi-label framework and fusion features
CN111737497B (en) Weak supervision relation extraction method based on multi-source semantic representation fusion
CN114428850A (en) Text retrieval matching method and system
CN111428518B (en) Low-frequency word translation method and device
CN116522165B (en) Public opinion text matching system and method based on twin structure
CN116680575B (en) Model processing method, device, equipment and storage medium
CN115062123A (en) Knowledge base question-answer pair generation method of conversation generation system
Liu et al. Exploring segment representations for neural semi-Markov conditional random fields

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200703