CN110826334B - Chinese named entity recognition model based on reinforcement learning and training method thereof - Google Patents
- Publication number: CN110826334B (application number CN201911089295.3A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to a Chinese named entity recognition model based on reinforcement learning and a training method thereof. The model comprises a strategy network (policy network) module, a word segmentation and recombination network, and a named entity recognition network module. First, the strategy network produces an action sequence; the word segmentation and recombination network then executes the actions in the sequence one by one, obtaining a phrase at each "terminate" action. The phrases serve as auxiliary input information for lattice-LSTM modeling, which yields a sequence of hidden states; the hidden states are fed into the named entity recognition network to obtain the label sequence of the sentence, and the recognition result serves as a delayed reward guiding the update of the strategy network module. By using reinforcement learning to divide sentences effectively, the invention avoids modeling the redundant interfering words matched in a sentence, effectively removes the dependence on an external dictionary and the adverse effect of long texts, and makes better use of correct word information, thereby better helping the Chinese named entity recognition model improve its recognition performance.
Description
Technical Field
The invention relates to the field of machine learning, in particular to a Chinese named entity recognition model based on reinforcement learning and a training method thereof.
Background
Named entity recognition (NER) is a fundamental task in natural language processing: it identifies named expressions in text, underpins downstream tasks such as relation extraction, question answering, syntactic analysis and machine translation, and plays an important role in bringing natural language processing technology into practical use. In general, the NER task is to identify, in the text to be processed, named entities of three major classes (entities, times and numerals) and seven minor classes (person names, organization names, place names, times, dates, currencies and percentages).
An existing Chinese named entity recognition model is the lattice-LSTM. Besides each character of the sentence, the model takes as input the cell vectors of all potential words that end at that character, the choice of potential words depending on an external dictionary. An additional gate is introduced to arbitrate between character-granularity and word-granularity information, so the input at each position changes from (character information, previous hidden state, previous cell state) to (character information, previous hidden state, and the information of all words ending at that character). The advantage of this model is that explicit word information can be exploited in a character-sequence labeling model without suffering from word segmentation errors.
However, precisely because the lattice-LSTM model uses the information of every word matched in the sentence, a word formed by adjacent characters is fed into the model as registered-word-granularity information whenever it appears in the external dictionary (a registered word being a noun recorded in the dictionary), even though it is not necessarily a correct division of the sentence. For example, for the sentence 南京市长江大桥 ("Nanjing City Yangtze River Bridge"), the model takes the registered words matched in sequence as input, so 南京 ("Nanjing"), 南京市 ("Nanjing City"), 市长 ("mayor"), 长江 ("Yangtze River") and 长江大桥 ("Yangtze River Bridge") all become input words; but clearly 市长 ("mayor") is an interfering word in this sentence, and using its word information has a negative influence on entity recognition. In addition, the model usually requires an external dictionary to be built from the experimental dataset, on which it depends heavily. Meanwhile, as the text grows longer, the number of potential words in the sentence increases and the complexity of the model rises sharply.
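To make the interference concrete, the exhaustive dictionary matching that produces such words can be sketched as follows. The mini-lexicon here is a hypothetical stand-in for the external dictionary, not data from the patent:

```python
# Enumerate every substring of 南京市长江大桥 that appears in an external
# lexicon, as a lattice-LSTM would when collecting potential words.
lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}

def matched_words(sentence, lexicon):
    """Return all (start_index, word) pairs whose characters form a lexicon word."""
    hits = []
    n = len(sentence)
    for i in range(n):
        for j in range(i + 2, n + 1):   # registered words have >= 2 characters
            w = sentence[i:j]
            if w in lexicon:
                hits.append((i, w))
    return hits

hits = matched_words("南京市长江大桥", lexicon)
# 市长 ("mayor") is matched even though no correct segmentation contains it.
```

Filtering such spurious matches out before modeling is exactly what the strategy-network segmentation is meant to achieve.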
Disclosure of Invention
The invention provides a Chinese named entity recognition model based on reinforcement learning and a training method thereof, aiming to solve the prior-art problems of modeling redundant interfering words matched in a sentence, depending on an external dictionary, and being affected by long texts. By constructing a reinforcement learning model, the internal structure of the sentence is learned, so that a sentence division method relevant to the named entity recognition task is learned effectively; the sentence is thereby cut and effectively divided. The method thus avoids inputting interfering words and using an external dictionary, reduces the number of words in a sentence as the text length grows, and makes better use of correct word information to help the Chinese named entity recognition model improve recognition accuracy.
In order to solve the above technical problems, the invention adopts the following technical scheme: a Chinese named entity recognition model based on reinforcement learning, comprising a strategy network module, a word segmentation and recombination network, and a named entity recognition network module;
the strategy network module samples, with a random strategy, one action for each character of the sentence in each state space, thereby obtaining an action sequence for the whole sentence, and receives a delayed reward derived from the recognition result of the Chinese named entity recognition network to guide its own update;
the word segmentation and recombination network divides the sentence according to the action sequence output by the strategy network module, cutting it into phrases, and combines the encoding of each phrase with the encoding vector of the phrase's last character, thereby obtaining the lattice-LSTM representation of the sentence;
the named entity recognition network module feeds the hidden states of the sentence's lattice-LSTM representation into a CRF (conditional random field) layer to finally obtain the named entity recognition result, computes a loss value from the recognition result to train the named entity recognition model, and simultaneously uses the loss value as a delayed reward guiding the update of the strategy network module.
Preferably, the action is either "inside" or "terminate".
Preferably, the random strategy is:

π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the strategy network; s_t denotes the state of the strategy network at time t; and σ denotes the sigmoid function.
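A minimal sketch of this random strategy, with illustrative weights, state dimension, and a 0/1 coding of the "inside"/"terminate" actions (none of these values come from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                          # state dimension (character encoding + context), assumed
W = rng.normal(size=d)         # illustrative strategy-network parameters θ = {W, b}
b = 0.0
theta = (W, b)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_action(s_t, theta, rng):
    """Sample a ∈ {0: inside, 1: terminate} from π(a_t | s_t; θ) = σ(W·s_t + b)."""
    W, b = theta
    p_terminate = sigmoid(W @ s_t + b)
    return int(rng.random() < p_terminate), p_terminate

states = rng.normal(size=(6, d))   # one state per character of a 6-character sentence
actions = [sample_action(s, theta, rng)[0] for s in states]
```

Sampling one action per character in this way produces the action sequence for the whole sentence.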
Preferably, the word segmentation and recombination network cuts the sentence into phrases according to the action sequence output by the strategy network module and encodes each phrase; each encoding serves as an input to the cell state at the last character of the corresponding phrase, yielding the lattice-LSTM representation of the sentence.
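The phrase-cutting step can be sketched as follows; the "I"/"T" labels are assumed shorthand for the "inside" and "terminate" actions:

```python
# Cut a sentence into phrases at every "terminate" action; "inside" keeps the
# current phrase growing.
def segment(chars, actions):
    """actions[i] == 'T' ends the current phrase at character i."""
    phrases, current = [], []
    for ch, a in zip(chars, actions):
        current.append(ch)
        if a == "T":
            phrases.append("".join(current))
            current = []
    if current:                 # flush a trailing phrase with no final terminate
        phrases.append("".join(current))
    return phrases

# "terminate" after 市 and after 桥 yields the two phrases of the example sentence.
out = segment(list("南京市长江大桥"), ["I", "I", "T", "I", "I", "I", "T"])
```

Each resulting phrase is then encoded and injected at the cell state of its last character.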
Preferably, the named entity recognition network module feeds the lattice-LSTM output obtained from the word segmentation and recombination network into a CRF layer, scores every labeling sequence of the sentence with the CRF layer's feature-function set, exponentiates and normalizes the scores, searches over all possible labeling sequences with the first-order Viterbi algorithm, takes the highest-scoring sequence as the final output, trains the parameters by back-propagating the value of the loss function, and uses the loss value as the delayed reward updating the strategy network module. The loss function is defined as the sentence-level log-likelihood with an L2 regularization term, as follows:

L(θ) = −Σ_i log P(y_i | s_i) + (λ/2)·‖θ‖²

where λ is the L2 regularization coefficient; θ denotes the parameter set; and s and y denote a sentence and the labeling sequence corresponding to that sentence, respectively.
The invention further provides a training method for training the above Chinese named entity recognition model, comprising the following steps:
step one: inputting sentence data for training into a strategy network module, wherein the strategy network module samples each word in a sentence with one action under each state space, and outputs an action sequence of the whole sentence;
step two: dividing sentences by the word segmentation and recombination network according to the action sequence output by the strategy network module, cutting the sentences into phrases, and combining the codes of the phrases with the code vector of the last word of the phrases so as to obtain the law-LSTM representation of the word;
step three: the hidden state obtained by the named entity recognition network from the word segmentation and recombination network is input into a CRF layer, a named entity recognition result is finally obtained, a loss value is obtained through calculation according to the recognition result and used for training a named entity recognition model, and meanwhile the loss value is used as a delay reward to guide the updating of the strategy network module;
the sentence is characterized by a lattice-LSTM model, so that the hidden state vector h of each word in the sentence can be obtained i The state vector sequence h= { H is then 1 ,h 2 ,…,h n Input CRF layer; let y=l 1 ,l 2 ,…,l n Representing the output tag of the CRF layer, the output tag sequence probability is calculated by:
wherein s represents a sentence;is directed to l i Model parameters of (2); />Is directed to l i-1 and li Is set to be a bias parameter of (a); y' represents all possible output tag sets.
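For a toy-sized label set, this sequence probability can be checked by brute-force enumeration; a real tagger would use forward-algorithm normalization and first-order Viterbi decoding instead. All sizes and parameters here are illustrative:

```python
import itertools
import numpy as np

# score(y) = Σ_i (W[l_i]·h_i + b[l_{i-1}, l_i]), normalized over every sequence y'.
rng = np.random.default_rng(1)
n_labels, n_steps, d = 3, 4, 5
H = rng.normal(size=(n_steps, d))              # hidden states h_1..h_n from the lattice-LSTM
W = rng.normal(size=(n_labels, d))             # per-label emission parameters W^CRF
b = rng.normal(size=(n_labels + 1, n_labels))  # transition biases; row n_labels = start state

def score(y):
    prev, total = n_labels, 0.0
    for t, l in enumerate(y):
        total += W[l] @ H[t] + b[prev, l]
        prev = l
    return total

seqs = list(itertools.product(range(n_labels), repeat=n_steps))
scores = np.array([score(y) for y in seqs])
probs = np.exp(scores - scores.max())
probs /= probs.sum()                           # P(y|s) for every label sequence
best = seqs[int(np.argmax(probs))]             # same argmax Viterbi would return
```

Enumeration is exponential in sentence length, which is why the CRF layer relies on Viterbi in practice.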
The loss value is computed as:

L(θ) = −Σ_i log P(y_i | s_i) + (λ/2)·‖θ‖²

where λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and the correct labeling sequence corresponding to that sentence, respectively; and P(y | s) denotes the probability that sentence s is labeled with sequence y, i.e. the probability that the labeling is correct.
Preferably, in step one, the action is either "inside" or "terminate", and the formula of the random strategy is:

π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the strategy network; s_t denotes the state of the strategy network at time t; and σ denotes the sigmoid function.
Preferably, in step two, the characters are represented at the character level by an LSTM, with the update formula:

c_t, h_t = f(c_{t−1}, h_{t−1}, x_t)

where f denotes the transfer function of the LSTM; x_t denotes the encoding vector of the character input at time t of the sentence; and c_t and h_t denote the cell state and hidden state at time t, respectively.
After the division of the sentence is completed, the phrase information is integrated into a character-granularity LSTM model, whose basic recurrent LSTM function is:

[i_j; o_j; f_j; c̃_j] = [σ; σ; σ; tanh](W^{cT}·[x_j; h_{j−1}] + b^c)
c_j = f_j ⊙ c_{j−1} + i_j ⊙ c̃_j
h_j = o_j ⊙ tanh(c_j)

where x_j denotes the encoding vector of the j-th character of the sentence; h_{j−1} denotes the hidden state at the (j−1)-th character; W^{cT} and b^c are model parameters; i_j, f_j and o_j denote the input, forget and output gates, respectively; c̃_j denotes the new candidate state; c_{j−1} denotes the cell state of the (j−1)-th character of the sentence; c_j denotes the updated cell state; h_j denotes the hidden state at the j-th character, determined by the output gate o_j and the current cell state c_j; σ(·) denotes the sigmoid function; and tanh(·) denotes the hyperbolic tangent activation function.
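A sketch of one character-level LSTM step in NumPy, with illustrative dimensions and randomly initialized parameters (the stacked-gate layout is a common implementation convenience, not something the patent specifies):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 8                                      # hidden / embedding size, assumed

Wc = rng.normal(scale=0.1, size=(4 * d, 2 * d))  # stacked gate weights W^c
bc = np.zeros(4 * d)                             # stacked gate biases b^c

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_char_step(x_j, h_prev, c_prev):
    """One step: gates i, f, o and candidate c~ from [x_j; h_{j-1}]."""
    z = Wc @ np.concatenate([x_j, h_prev]) + bc
    i, f, o = sigmoid(z[:d]), sigmoid(z[d:2*d]), sigmoid(z[2*d:3*d])
    c_tilde = np.tanh(z[3*d:])             # new candidate state
    c_j = f * c_prev + i * c_tilde         # updated cell state
    h_j = o * np.tanh(c_j)                 # hidden state from the output gate
    return h_j, c_j

h, c = np.zeros(d), np.zeros(d)
for x in rng.normal(size=(5, d)):          # five characters of a sentence
    h, c = lstm_char_step(x, h, c)
```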
The phrase information is represented by an LSTM model without an output gate; the specific formula is:

[i^w_{b,e}; f^w_{b,e}; c̃^w_{b,e}] = [σ; σ; tanh](W^{wT}·[x^w_{b,e}; h_b] + b^w)
c^w_{b,e} = f^w_{b,e} ⊙ c_b + i^w_{b,e} ⊙ c̃^w_{b,e}

where x^w_{b,e} denotes the encoding vector of the phrase of the sentence that starts at the b-th character and ends at the e-th character; h_b denotes the hidden state at the b-th character, i.e. at the first character of the phrase; W^{wT} and b^w are model parameters; i^w_{b,e} and f^w_{b,e} denote the input and forget gates, respectively; c̃^w_{b,e} denotes the new candidate state; c_b denotes the cell state at the first character of the phrase; c^w_{b,e} denotes the updated cell state; σ(·) denotes the sigmoid function; and tanh(·) denotes the hyperbolic tangent activation function.
In addition, an extra gate is added to arbitrate between character-granularity and word-granularity information; its inputs are the encoding vector of the character and the cell states of the phrases ending at that character, and it is defined as:

i^l_{b,e} = σ(W^{lT}·[x_e; c^w_{b,e}] + b^l)

where x_e denotes the encoding vector of the e-th character of the sentence; c^w_{b,e} denotes the cell state of the phrase starting at the b-th character and ending at the e-th character, i.e. of a phrase of the sentence ending at the e-th character; W^{lT} and b^l are model parameters; i^l_{b,e} denotes the extra gate; and σ(·) denotes the sigmoid function.
This changes the way the cell state is updated, while the update of the hidden state is unchanged; the final representation of the lattice-LSTM model is:

α_{b,j} = exp(i^l_{b,j}) / (exp(i_j) + Σ_{b′} exp(i^l_{b′,j}))
α_j = exp(i_j) / (exp(i_j) + Σ_{b′} exp(i^l_{b′,j}))
c_j = Σ_b α_{b,j} ⊙ c^w_{b,j} + α_j ⊙ c̃_j

where i_j is the input gate vector of the j-th character; i^l_{b,j} is the input gate vector of a phrase starting at b and ending at j; c^w_{b,j} is the phrase cell state; c̃_j is the character's new candidate cell state; α_{b,j} is the normalized phrase-information vector; and α_j is the normalized character-information vector.
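The normalized cell combination can be sketched as follows; the number of phrases ending at position j and all tensor values are illustrative assumptions:

```python
import numpy as np

# Mix the character's candidate cell state with the cell states of every
# phrase ending at position j, using per-dimension softmax-normalized gates.
rng = np.random.default_rng(3)
d = 6
i_char = rng.normal(size=d)                   # pre-sigmoid gate logit of character j
i_phrases = rng.normal(size=(2, d))           # extra-gate logits of 2 phrases ending at j
c_tilde = np.tanh(rng.normal(size=d))         # character's new candidate cell state
c_phrases = np.tanh(rng.normal(size=(2, d)))  # phrase cell states c^w_{b,j}

logits = np.vstack([i_phrases, i_char])       # last row = character source
alpha = np.exp(logits) / np.exp(logits).sum(axis=0)   # α's sum to 1 per dimension

c_j = (alpha[:-1, :] * c_phrases).sum(axis=0) + alpha[-1] * c_tilde
```

Because the α's form a convex combination, the mixed cell state stays in the range of its tanh-bounded inputs.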
Preferably, before step one, the named entity recognition network and its network parameters are pre-trained, the words used by the named entity recognition network being obtained by dividing the original sentence with a simple heuristic algorithm;
the pre-trained parameters of the entity recognition network are then temporarily fixed as the parameters of the named entity recognition network, the strategy network is pre-trained, and finally all network parameters are trained jointly.
Compared with the prior art, the beneficial effects of the invention are: the reinforcement-learning-based Chinese named entity recognition model and its training method use reinforcement learning to divide sentences effectively, avoid modeling the redundant interfering words matched in a sentence, effectively remove the dependence on an external dictionary and the adverse effect of long texts, and make better use of correct word information, thereby better helping the Chinese named entity recognition model improve its recognition performance.
Drawings
FIG. 1 is a schematic diagram of a Chinese named entity recognition model based on reinforcement learning;
FIG. 2 is a schematic diagram of a strategy network module of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 3 is a schematic diagram of the named entity recognition network module of the Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 4 is a flow chart of a training method of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 5 is a diagram of an example sentence segmentation for a training method of a Chinese named entity recognition model based on reinforcement learning according to the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent; for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions; it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationship depicted in the drawings is for illustrative purposes only and is not to be construed as limiting the present patent.
The same or similar reference numbers in the drawings of embodiments of the invention correspond to the same or similar components; in the description of the present invention, it should be understood that, if there are orientations or positional relationships indicated by terms "upper", "lower", "left", "right", "long", "short", etc., based on the orientations or positional relationships shown in the drawings, this is merely for convenience in describing the present invention and simplifying the description, and is not an indication or suggestion that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, so that the terms describing the positional relationships in the drawings are merely for exemplary illustration and are not to be construed as limitations of the present patent, and that it is possible for those of ordinary skill in the art to understand the specific meaning of the terms described above according to specific circumstances.
The technical scheme of the invention is further specifically described by the following specific embodiments with reference to the accompanying drawings:
example 1
FIGS. 1-3 show an embodiment of the Chinese named entity recognition model based on reinforcement learning, which comprises a strategy network module, a word segmentation and recombination network, and a named entity recognition network module;
the strategy network module is used for sampling an action (the action comprises the internal part or the termination) for each word in the sentence under each state space by adopting a random strategy, so that an action sequence is obtained for the whole sentence, and delay rewards are obtained according to the recognition result of the Chinese named entity recognition network so as to guide the strategy network module to update; the random strategy is:
π(a t |s t ;θ)=σ(W*s t +b)
wherein ,π(at |s t The method comprises the steps of carrying out a first treatment on the surface of the θ) represents the selection action a t Probability of (2); θ= { W, b }, representing parameters of the policy network; s is(s) t And the state of the policy network at the time t.
The word segmentation and recombination network divides the sentence according to the action sequence output by the strategy network module, cutting it into phrases, and combines the encoding of each phrase with the encoding vector of the phrase's last character, thereby obtaining the lattice-LSTM representation of the sentence;
specifically, the word segmentation and recombination network cuts the sentence into phrases according to the action sequence output by the strategy network module and encodes each phrase; each encoding serves as an input to the cell state at the last character of the corresponding phrase, yielding the lattice-LSTM representation of the sentence.
The named entity recognition network module feeds the hidden states of the sentence's lattice-LSTM representation into the conditional random field, finally obtains the named entity recognition result, computes a loss value from the recognition result to train the named entity recognition model, and simultaneously uses the loss value as a delayed reward guiding the update of the strategy network module. The loss value is computed as:

L(θ) = −Σ_i log P(y_i | s_i) + (λ/2)·‖θ‖²

where λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and the correct labeling sequence corresponding to that sentence, respectively; and P(y | s) denotes the probability that sentence s is labeled with sequence y, i.e. the probability that the labeling is correct.
The working principle of this embodiment is as follows: first, the strategy network produces an action sequence; the word segmentation and recombination network then executes the actions in the sequence one by one, obtaining a phrase at each "terminate" action; each phrase serves as input information at the last character of the phrase for lattice-LSTM modeling, which yields a sequence of hidden states; the hidden states are fed into the named entity recognition network to obtain the label sequence of the sentence, and the recognition result serves as a delayed reward guiding the update of the strategy network module.
The beneficial effect of this embodiment is: it enhances the neural-network-based LSTM-CRF model, learns the internal structure of sentences within a reinforcement learning framework, divides the sentence structure efficiently, integrates the resulting phrase information into a character-granularity lattice-LSTM model, and fully learns character-granularity information together with the related word-granularity information, thereby achieving a better recognition effect.
Example 2
FIG. 4 shows an embodiment of a training method of the Chinese named entity recognition model based on reinforcement learning, for training the model of Embodiment 1; the method comprises the following steps:
pretreatment: pre-training a named entity recognition network and network parameters thereof, wherein words used by the named entity recognition network are words obtained by dividing an original sentence through a simple heuristic algorithm;
and (3) temporarily fixing the pre-trained partial network parameters of the entity identification network as the network parameters of the named entity identification network, then pre-training the strategy network, and finally jointly training the whole network parameters.
Step one: sentence data for training are input into the strategy network module; in each state space the strategy network module samples one action for each character of the sentence, and outputs the action sequence of the whole sentence;
in step one, the states, actions, policies are defined as follows:
1. status: the encoding vector of the currently input word and the context vector preceding the word;
2. the actions are as follows: defining two distinct operations, including internal and termination;
3. strategy: the random strategy is defined as follows:
π(a t |s t ;θ)=σ(W*s t +b)
wherein ,π(at |s t The method comprises the steps of carrying out a first treatment on the surface of the θ) represents the selection action a t Probability of (2); θ= { W, b }, representing parameters of the policy network; s is(s) t And the state of the policy network at the time t.
Step two: the word segmentation and recombination network divides the sentence according to the action sequence output by the strategy network module, cutting it into phrases, and combines the encoding of each phrase with the encoding vector of the phrase's last character, thereby obtaining the lattice-LSTM representation of the sentence;
as shown in fig. 5, "washington in the united states" is divided into "washington" in the united states ". The character level characterization of the word by LSTM is performed with the updated formula as follows:
wherein ,a transfer function representing LSTM; x is x t A code vector representing a word input at time t of the sentence; /> and />The cell state and the hidden state at time t are shown, respectively.
After the division of the sentence is completed, the phrase information is integrated into a character-granularity LSTM model, whose basic recurrent LSTM function is:

[i_j; o_j; f_j; c̃_j] = [σ; σ; σ; tanh](W^{cT}·[x_j; h_{j−1}] + b^c)
c_j = f_j ⊙ c_{j−1} + i_j ⊙ c̃_j
h_j = o_j ⊙ tanh(c_j)

where x_j denotes the encoding vector of the j-th character of the sentence; h_{j−1} denotes the hidden state at the (j−1)-th character; W^{cT} and b^c are model parameters; i_j, f_j and o_j denote the input, forget and output gates, respectively; c̃_j denotes the new candidate state; c_{j−1} denotes the cell state of the (j−1)-th character of the sentence; c_j denotes the updated cell state; h_j denotes the hidden state at the j-th character, determined by the output gate o_j and the current cell state c_j; σ(·) denotes the sigmoid function, and tanh(·) denotes the hyperbolic tangent activation function.
Phrase information is characterized by an LSTM model without an output gate, and a specific formula is as follows:
a code vector representing a phrase in the sentence starting from the b-th word and ending with the e-th word; />The hidden state of the b word moment of the sentence is represented, namely the hidden state of the first word of the phrase; w (W) wT and bw Is a model parameter; />Respectively represent transfusionDoor entry and forget; />Representing a new candidate state; />A cell state representing the phrase first word; />Representing the updated cell state; σ () represents a sigmoid function, and tanh () represents a hyperbolic tangent activation function.
In addition, an extra gate is added to arbitrate between character-granularity and word-granularity information; its inputs are the encoding vector of the character and the cell states of the phrases ending at that character, and it is defined as:

i^l_{b,e} = σ(W^{lT}·[x_e; c^w_{b,e}] + b^l)

where x_e denotes the encoding vector of the e-th character of the sentence; c^w_{b,e} denotes the cell state of the phrase starting at the b-th character and ending at the e-th character, i.e. of a phrase of the sentence ending at the e-th character; W^{lT} and b^l are model parameters; i^l_{b,e} denotes the extra gate; and σ(·) denotes the sigmoid function.
This changes the way the cell state is updated, while the update of the hidden state is unchanged; the final representation of the lattice-LSTM model is:

α_{b,j} = exp(i^l_{b,j}) / (exp(i_j) + Σ_{b′} exp(i^l_{b′,j}))
α_j = exp(i_j) / (exp(i_j) + Σ_{b′} exp(i^l_{b′,j}))
c_j = Σ_b α_{b,j} ⊙ c^w_{b,j} + α_j ⊙ c̃_j

where i_j is the input gate vector of the j-th character; i^l_{b,j} is the input gate vector of a phrase starting at b and ending at j; c^w_{b,j} is the phrase cell state; c̃_j is the character's new candidate cell state; α_{b,j} is the normalized phrase-information vector; and α_j is the normalized character-information vector.
Step three: the named entity recognition network feeds the hidden states obtained from the word segmentation and recombination network into a CRF layer and finally obtains the named entity recognition result; a loss value is computed from the recognition result and used to train the named entity recognition model, and the loss value simultaneously serves as a delayed reward guiding the update of the strategy network module;
the sentence is characterized by a lattice-LSTM model, so that the hidden state vector h of each word in the sentence can be obtained i The state vector sequence h= { H is then 1 ,h 2 ,…,h n Input CRF layer; let y=l 1 ,l 2 ,…,l n Representing the output tag of the CRF layer, the output tag sequence probability is calculated by:
wherein s represents a sentence;is directed to l i Model parameters of (2); />Is directed to l i-1 and li Is set to be a bias parameter of (a); y' represents all possible output tag sets.
The loss value is computed as:

L(θ) = −Σ_i log P(y_i | s_i) + (λ/2)·‖θ‖²

where λ is the L2 regularization coefficient, θ denotes the parameter set, and s and y denote a sentence and the correct labeling sequence corresponding to that sentence, respectively.
The reward is defined as follows: after the action sequence is sampled from the strategy network, a division of the sentence is obtained; the phrases resulting from the division are added, as word-granularity information, to the character-granularity LSTM model, giving the lattice-LSTM-based representation; this representation is fed into the named entity recognition network module, the entity label of each character is obtained through the CRF layer, the entity labels are decoded, and a reward value is computed from the recognition result. Because the final recognition result must be obtained before the reward value can be computed, this is a delayed reward, and it is used to guide the update of the strategy network module.
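Since the reward arrives only after the whole action sequence has been executed, the strategy network can be updated with a REINFORCE-style policy gradient; this sketch assumes a Bernoulli policy, an illustrative reward value (e.g. the negative tagger loss), and an assumed learning rate:

```python
import numpy as np

rng = np.random.default_rng(4)
d = 4
W, b = rng.normal(size=d), 0.0   # strategy-network parameters θ = {W, b}
lr = 0.01                        # assumed learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

states = rng.normal(size=(5, d))
actions, grads_W, grads_b = [], [], []
for s in states:
    p = sigmoid(W @ s + b)       # π("terminate" | s; θ)
    a = int(rng.random() < p)
    actions.append(a)
    # ∇_θ log π for a Bernoulli policy: (a - p) scaled by the logit's gradient
    grads_W.append((a - p) * s)
    grads_b.append(a - p)

R = 1.25                         # single delayed reward for the whole sequence
W += lr * R * np.sum(grads_W, axis=0)   # gradient ascent on expected reward
b += lr * R * np.sum(grads_b)
```

Every action in the sequence is credited with the same delayed reward, which is exactly the structure the patent describes.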
It should be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications will be apparent to those of ordinary skill in the art in light of the above teachings; it is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent replacement or improvement that comes within the spirit and principles of the invention is intended to be protected by the following claims.
Claims (8)
1. A training method of a Chinese named entity recognition model based on reinforcement learning is characterized by comprising the following steps:
step one: inputting sentence data for training into a strategy network module, wherein the strategy network module samples each word in a sentence with one action under each state space, and outputs an action sequence of the whole sentence;
step two: the word segmentation and recombination network divides the sentence according to the action sequence output by the strategy network module, breaking the sentence into phrases, and combines the encoding of each phrase with the encoding vector of the last word of that phrase, thereby obtaining the lattice-LSTM representation of the word; each word is characterized at the word level through an LSTM, each phrase is delimited according to the termination actions, and the update formula is as follows:
$$c_t,\;h_t=f(c_{t-1},\;h_{t-1},\;x_t)$$
wherein $f$ represents the transfer function of the LSTM; $x_t$ represents the encoding vector of the word input at time t of the sentence; $c_t$ and $h_t$ respectively represent the cell state and the hidden state at time t;
after the division of the sentence is completed, the phrase information is integrated into the word-granularity-based LSTM model, which is the basic recurrent LSTM function, as follows:
$$\begin{bmatrix}i_j^c\\ f_j^c\\ o_j^c\\ \tilde{c}_j^c\end{bmatrix}=\begin{bmatrix}\sigma\\ \sigma\\ \sigma\\ \tanh\end{bmatrix}\left(W^{cT}\begin{bmatrix}x_j^c\\ h_{j-1}^c\end{bmatrix}+b^c\right)$$
$$c_j^c=f_j^c\odot c_{j-1}^c+i_j^c\odot\tilde{c}_j^c$$
$$h_j^c=o_j^c\odot\tanh(c_j^c)$$
wherein $x_j^c$ represents the encoding vector of the j-th word in the sentence; $h_{j-1}^c$ represents the hidden state at the (j-1)-th word of the sentence; $W^{cT}$ and $b^c$ are model parameters; $i_j^c$, $f_j^c$ and $o_j^c$ represent the input, forget and output gates respectively; $\tilde{c}_j^c$ represents the new candidate state; $c_{j-1}^c$ represents the cell state of the (j-1)-th word of the sentence; $c_j^c$ represents the updated cell state; $h_j^c$ represents the hidden state at the j-th word of the sentence, determined by the output gate $o_j^c$ and the cell state $c_j^c$ at the current moment; $\sigma(\cdot)$ represents the sigmoid function, and $\tanh(\cdot)$ represents the hyperbolic tangent activation function;
phrase information is characterized by an LSTM model without an output gate; the specific formulas are as follows:
$$\begin{bmatrix}i_{b,w}^w\\ f_{b,w}^w\\ \tilde{c}_{b,w}^w\end{bmatrix}=\begin{bmatrix}\sigma\\ \sigma\\ \tanh\end{bmatrix}\left(W^{wT}\begin{bmatrix}x_{b,w}^w\\ h_b^c\end{bmatrix}+b^w\right)$$
$$c_{b,w}^w=f_{b,w}^w\odot c_b^c+i_{b,w}^w\odot\tilde{c}_{b,w}^w$$
wherein $x_{b,w}^w$ represents the encoding vector of the phrase in the sentence starting from the b-th word and ending with the w-th word; $h_b^c$ represents the hidden state at the b-th word of the sentence, i.e. the hidden state at the first word of the phrase; $W^{wT}$ and $b^w$ are model parameters; $i_{b,w}^w$ and $f_{b,w}^w$ represent the input and forget gates respectively; $\tilde{c}_{b,w}^w$ represents the new candidate state; $c_b^c$ represents the cell state at the first word of the phrase; $c_{b,w}^w$ represents the updated cell state; $\sigma(\cdot)$ represents the sigmoid function; $\tanh(\cdot)$ represents the hyperbolic tangent activation function;
additionally, an additional gate is added to select between word-granularity and phrase-granularity information; its inputs are the encoding vector of the word and the cell state of the phrase ending with that word, and its formula is defined as follows:
$$i_{b,e}^l=\sigma\left(W^{lT}\begin{bmatrix}x_e^c\\ c_{b,e}^w\end{bmatrix}+b^l\right)$$
wherein $x_e^c$ represents the encoding vector of the e-th word in the sentence; $c_{b,e}^w$ represents the cell state of the phrase starting from the b-th word and ending with the e-th word, i.e. the cell state of a phrase ending with the e-th word of the sentence; $W^{lT}$ and $b^l$ are model parameters; $i_{b,e}^l$ represents the additional gate; $\sigma(\cdot)$ represents the sigmoid function;
the update of the cell state is thus changed while the update of the hidden state remains unchanged; the final representation formulas of the lattice-LSTM model are as follows:
$$c_j^c=\sum_{b}\alpha_{b,j}^c\odot c_{b,j}^w+\alpha_j^c\odot\tilde{c}_j^c,\qquad \alpha_{b,j}^c=\frac{\exp(i_{b,j}^l)}{\exp(i_j^c)+\sum_{b'}\exp(i_{b',j}^l)},\qquad \alpha_j^c=\frac{\exp(i_j^c)}{\exp(i_j^c)+\sum_{b'}\exp(i_{b',j}^l)}$$
wherein the sum runs over all phrases ending with the j-th word; $i_j^c$ is the input gate vector of the j-th word; $i_{b,j}^l$ is the input gate vector of the phrase starting with b and ending with j; $c_{b,j}^w$ is the phrase cell state; $\tilde{c}_j^c$ is the new candidate cell state of the word; $\alpha_{b,j}^c$ is the phrase information vector; $\alpha_j^c$ is the word information vector;
step three: the hidden state obtained by the named entity recognition network from the word segmentation and recombination network is input into a conditional random field layer, a named entity recognition result is finally obtained, a loss value is obtained through calculation according to the recognition result and used for training a named entity recognition model, and meanwhile the loss value is used as a delay reward to guide the update of the strategy network module;
the sentence is characterized by the lattice-LSTM model, so the hidden state vector $h_i$ of each word in the sentence is obtained; the state vector sequence $H=\{h_1,h_2,\ldots,h_n\}$ is then input into the conditional random field layer; let $y=l_1,l_2,\ldots,l_n$ denote the output tags of the conditional random field layer; the output tag sequence probability is calculated by:
$$P(y\mid s)=\frac{\exp\left(\sum_i\left(W_{\mathrm{CRF}}^{l_i}h_i+b_{\mathrm{CRF}}^{(l_{i-1},l_i)}\right)\right)}{\sum_{y'}\exp\left(\sum_i\left(W_{\mathrm{CRF}}^{l'_i}h_i+b_{\mathrm{CRF}}^{(l'_{i-1},l'_i)}\right)\right)}$$
wherein $s$ represents a sentence; $W_{\mathrm{CRF}}^{l_i}$ is a model parameter specific to $l_i$; $b_{\mathrm{CRF}}^{(l_{i-1},l_i)}$ is a bias parameter specific to the pair $l_{i-1}$ and $l_i$; $y'$ ranges over all possible output tag sequences;
the loss value function is calculated as:
$$L=-\sum_{(s,y)}\log P(y\mid s)+\frac{\lambda}{2}\lVert\theta\rVert^2$$
wherein $\lambda$ is the $L_2$ regularization term coefficient; $\theta$ represents the parameter set; $s$ and $y$ respectively represent a sentence and the correct labeling sequence corresponding to that sentence; $P$ denotes the probability that the sentence $s$ is labeled as the sequence $y$, i.e. the probability of the correct labeling.
2. The training method of a reinforcement-learning-based Chinese named entity recognition model of claim 1, wherein in said step one, said actions comprise internal or termination, and the formula of the random strategy is as follows:
$$\pi(a_t\mid s_t;\theta)=\sigma(W\ast s_t+b)$$
wherein $\pi(a_t\mid s_t;\theta)$ represents the probability of selecting action $a_t$; $\theta=\{W,b\}$ represents the parameters of the strategy network; $s_t$ is the state of the strategy network at time t; $\sigma(\cdot)$ represents the sigmoid function; $W$ and $b$ denote network parameters.
3. The training method of a Chinese named entity recognition model based on reinforcement learning according to claim 1, wherein before the first step, the named entity recognition network and network parameters thereof are pre-trained, and words used by the named entity recognition network are words obtained by dividing an original sentence through a simple heuristic algorithm;
the pre-trained network parameters of the entity recognition network are temporarily fixed as the network parameters of the named entity recognition network; the strategy network is then pre-trained, and finally the parameters of the whole network are jointly trained.
4. A Chinese named entity recognition model based on reinforcement learning, characterized by comprising a strategy network module, a word segmentation and recombination network and a named entity recognition network module, and by being trained by the training method of any one of claims 1 to 3;
the strategy network module is used for sampling an action for each word in the sentence under each state space by adopting a random strategy, so as to obtain an action sequence for the whole sentence;
the word segmentation and recombination network is used for dividing sentences according to the action sequence output by the strategy network module, breaking the sentences into phrases, and combining the codes of the phrases with the code vector of the last word of the phrases so as to obtain the lattice-LSTM expression of the sentences;
and the named entity recognition network module is used for inputting the hidden states of the lattice-LSTM representation of the sentence into the conditional random field, finally obtaining a named entity recognition result, calculating a loss value according to the recognition result to train the named entity recognition model, and simultaneously guiding the updating of the strategy network module by taking the loss value as a delayed reward.
5. The reinforcement-learning-based chinese named entity recognition model of claim 4, wherein said actions comprise internal or termination.
6. The reinforcement-learning-based Chinese named entity recognition model of claim 4, wherein said random strategy is:
$$\pi(a_t\mid s_t;\theta)=\sigma(W\ast s_t+b)$$
wherein $\pi(a_t\mid s_t;\theta)$ represents the probability of selecting action $a_t$; $\theta=\{W,b\}$ represents the parameters of the strategy network; $s_t$ is the state of the strategy network at time t; $\sigma(\cdot)$ represents the sigmoid function; $W$ and $b$ denote network parameters.
7. The reinforcement-learning-based Chinese named entity recognition model of claim 6, wherein the word segmentation and recombination network cuts the sentence according to the action sequence output by the strategy network module to obtain phrases, and the encoding of each phrase is input, as a cell state, at the last word of the corresponding phrase to obtain the lattice-LSTM representation of the sentence.
8. The reinforcement-learning-based Chinese named entity recognition model of claim 7, wherein the named entity recognition network module inputs the lattice-LSTM output obtained by the word segmentation and recombination network into a conditional random field layer, scores each candidate labeling sequence of the sentence using the feature function set of the conditional random field layer, exponentiates and normalizes the scores, evaluates all possible labeling sequences using the first-order Viterbi algorithm, and takes the highest-scoring labeling sequence as the final output; meanwhile, a loss function is defined, the loss value is back-propagated for parameter training, and the loss value is used as a delayed reward to update the strategy network module; the loss function is defined as the sentence-level log-likelihood with an $L_2$ regularization term, as follows:
$$L=-\sum_{(s,y)}\log P(y\mid s)+\frac{\lambda}{2}\lVert\theta\rVert^2$$
wherein $\lambda$ is the $L_2$ regularization term coefficient; $\theta$ represents the parameter set; $s$ and $y$ respectively represent a sentence and the correct labeling sequence corresponding to that sentence; $P$ denotes the probability that the sentence $s$ is labeled as the sequence $y$, i.e. the probability of the correct labeling.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911089295.3A CN110826334B (en) | 2019-11-08 | 2019-11-08 | Chinese named entity recognition model based on reinforcement learning and training method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110826334A CN110826334A (en) | 2020-02-21 |
CN110826334B true CN110826334B (en) | 2023-04-21 |
Family
ID=69553722
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911089295.3A Active CN110826334B (en) | 2019-11-08 | 2019-11-08 | Chinese named entity recognition model based on reinforcement learning and training method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110826334B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476031A (en) * | 2020-03-11 | 2020-07-31 | 重庆邮电大学 | Improved Chinese named entity recognition method based on Lattice-LSTM |
CN111666734B (en) * | 2020-04-24 | 2021-08-10 | 北京大学 | Sequence labeling method and device |
CN111951959A (en) * | 2020-08-23 | 2020-11-17 | 云知声智能科技股份有限公司 | Dialogue type diagnosis guiding method and device based on reinforcement learning and storage medium |
CN112163089B (en) * | 2020-09-24 | 2023-06-23 | 中国电子科技集团公司第十五研究所 | High-technology text classification method and system integrating named entity recognition |
CN112699682B (en) * | 2020-12-11 | 2022-05-17 | 山东大学 | Named entity identification method and device based on combinable weak authenticator |
CN113051921B (en) * | 2021-03-17 | 2024-02-20 | 北京智慧星光信息技术有限公司 | Internet text entity identification method, system, electronic equipment and storage medium |
CN112966517B (en) * | 2021-04-30 | 2022-02-18 | 平安科技(深圳)有限公司 | Training method, device, equipment and medium for named entity recognition model |
CN114004233B (en) * | 2021-12-30 | 2022-05-06 | 之江实验室 | Remote supervision named entity recognition method based on semi-training and sentence selection |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109117472A (en) * | 2018-11-12 | 2019-01-01 | 新疆大学 | A kind of Uighur name entity recognition method based on deep learning |
CN109255119A (en) * | 2018-07-18 | 2019-01-22 | 五邑大学 | A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition |
CN109597876A (en) * | 2018-11-07 | 2019-04-09 | 中山大学 | A kind of more wheels dialogue answer preference pattern and its method based on intensified learning |
CN109657239A (en) * | 2018-12-12 | 2019-04-19 | 电子科技大学 | The Chinese name entity recognition method learnt based on attention mechanism and language model |
Also Published As
Publication number | Publication date |
---|---|
CN110826334A (en) | 2020-02-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||