CN110826334B - Chinese named entity recognition model based on reinforcement learning and training method thereof - Google Patents


Info

Publication number: CN110826334B (application number CN201911089295.3A; other version: CN110826334A)
Other languages: Chinese (zh)
Inventors: 叶梅, 卓汉逵
Assignee (original and current): Sun Yat Sen University
Application filed by Sun Yat Sen University; granted and active. (The legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis.)
Prior art keywords: word, sentence, network, named entity

Classifications

    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a Chinese named entity recognition model based on reinforcement learning and a training method thereof. The model comprises a strategy network module, a word segmentation and recombination network, and a named entity recognition network module. First, the strategy network produces an action sequence; the word segmentation and recombination network then executes the actions in the sequence one by one, obtaining a phrase at each "terminate" action. Each phrase is used as auxiliary input information for lattice-LSTM modeling, which yields a sequence of hidden states; the hidden states are input into the named entity recognition network to obtain the label sequence of the sentence, and the recognition result serves as a delayed reward that guides updates of the strategy network module. By dividing sentences effectively with reinforcement learning, the invention avoids modeling the redundant interfering words matched in sentences, avoids dependence on an external dictionary and the adverse effect of long texts, and makes better use of correct word information, thereby helping the Chinese named entity recognition model improve its recognition effect.

Description

Chinese named entity recognition model based on reinforcement learning and training method thereof
Technical Field
The invention relates to the field of machine learning, in particular to a Chinese named entity recognition model based on reinforcement learning and a training method thereof.
Background
Named entity recognition (NER) is a basic task in the field of natural language processing: it identifies named expressions in text, underpins tasks such as relation extraction, question answering, syntactic analysis and machine translation, and plays an important role in putting natural language processing technology to practical use. In general, the NER task is to identify named entities of three major classes (entity, time and number) and seven minor classes (person name, organization name, place name, time, date, currency and percentage) in the text to be processed.
An existing Chinese named entity recognition model is the lattice-LSTM. Besides each word (character) in a sentence, this model also takes as input the cell vectors of all potential words that end at that character, where the choice of potential words depends on an external dictionary. An additional gate is added to control the mixture of character-granularity and word-granularity information, so the input at each step changes from (character information, previous hidden state, previous cell state) to (character information, previous hidden state, and the information of all words ending at that character). The advantage of this model is that explicit word information can be exploited in a character-sequence labeling model without suffering from word segmentation errors.
However, precisely because the lattice-LSTM model uses the information of all matched words in a sentence, any word formed by adjacent characters that happens to appear in the external dictionary is fed into the model as registered-word granularity information (a registered word being a noun recorded in the external dictionary), even though it is not necessarily a correct division of the sentence. For example, for the sentence 南京市长江大桥 ("Nanjing City Yangtze River Bridge"), the model takes as input every dictionary word matched in sequence: 南京 (Nanjing), 南京市 (Nanjing City), 市长 (mayor), 长江 (Yangtze River) and 长江大桥 (Yangtze River Bridge). Clearly, 市长 (mayor) is an interfering word in this sentence, and using its word information has a negative influence on entity recognition. In addition, the model typically requires the external dictionary to be constructed autonomously from the experimental dataset, on which it depends heavily. Meanwhile, as the text length increases, the number of potential words in a sentence grows and the complexity of the model rises sharply.
Disclosure of Invention
The invention provides a Chinese named entity recognition model based on reinforcement learning and a training method thereof, aiming to solve the prior-art problems of modeling redundant interfering words matched in sentences, depending on an external dictionary, and being affected by long texts. By constructing a reinforcement learning model, the internal structure of sentences is learned, so that a sentence division strategy relevant to the named entity recognition task is acquired and sentences are cut into effective segments. The method thereby avoids feeding in interfering words and using an external dictionary, reduces the number of candidate words as text length grows, and makes better use of correct word information to help the Chinese named entity recognition model improve recognition accuracy.
In order to solve the technical problems, the invention adopts the following technical scheme: the Chinese named entity recognition model based on reinforcement learning comprises a strategy network module, a word segmentation recombination network and a named entity recognition network module;
the strategy network module is used for sampling an action for each word in the sentence under each state by adopting a random policy, so as to obtain an action sequence for the whole sentence, and for obtaining a delayed reward from the recognition result of the Chinese named entity recognition network so as to guide its own updates;
the word segmentation and recombination network is used for dividing sentences according to the action sequence output by the strategy network module, cutting the sentences into phrases, and combining the codes of the phrases with the code vector of the last word of the phrases so as to obtain the lattice-LSTM representation of the sentences;
and the named entity recognition network module is used for inputting the hidden states of the lattice-LSTM representation of the sentence into a CRF (conditional random field) layer to obtain the named entity recognition result, computing a loss value from the recognition result to train the named entity recognition model, and simultaneously using the loss value as a delayed reward to guide updates of the strategy network module.
Preferably, the actions comprise "internal" and "terminate".
Preferably, the random strategy is:
π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) represents the probability of selecting action a_t; θ = {W, b} represents the parameters of the policy network; s_t is the state of the policy network at time t; σ is the sigmoid function.
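As a minimal illustrative sketch of the stochastic policy above (not the patented implementation; the state vector, the weights W and b, and the action encoding are all hypothetical), the sigmoid of a linear function of the state gives the probability of one of the two actions, from which an action is sampled:

```python
import math
import random

def policy_prob(state, W, b):
    """sigma(W . s_t + b): probability of choosing 'terminate' in state s_t."""
    z = sum(w * s for w, s in zip(W, state)) + b
    return 1.0 / (1.0 + math.exp(-z))

def sample_action(state, W, b, rng):
    """Sample one action from the stochastic policy pi(a_t | s_t; theta)."""
    p = policy_prob(state, W, b)
    return "terminate" if rng.random() < p else "internal"

# Hypothetical 3-dimensional state and parameters.
W, b = [0.5, -0.2, 0.1], 0.0
rng = random.Random(0)
actions = [sample_action([0.1, 0.3, -0.2], W, b, rng) for _ in range(5)]
```

Sampling once per word then yields the action sequence for the whole sentence.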
Preferably, the word segmentation and recombination network cuts the sentence into phrases according to the action sequence output by the strategy network module, and encodes each phrase as an input to the cell state at the last word of the corresponding phrase, so as to obtain the lattice-LSTM representation of the sentence.
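The cutting step can be sketched as follows (a didactic sketch: the romanized words and the action labels are illustrative, and the real network operates on learned state vectors rather than raw strings). Each "terminate" action closes the current phrase at that word:

```python
def segment(words, actions):
    """Cut a word sequence into phrases: 'terminate' ends the current phrase."""
    phrases, current = [], []
    for word, action in zip(words, actions):
        current.append(word)
        if action == "terminate":
            phrases.append(current)
            current = []
    if current:                      # close a trailing, unterminated phrase
        phrases.append(current)
    return phrases

# "Nanjing City / Yangtze River Bridge": the last word of each phrase
# is where the phrase encoding enters the lattice-LSTM cell state.
words = ["nan", "jing", "shi", "chang", "jiang", "da", "qiao"]
actions = ["internal", "internal", "terminate",
           "internal", "internal", "internal", "terminate"]
phrases = segment(words, actions)
# -> [["nan", "jing", "shi"], ["chang", "jiang", "da", "qiao"]]
```

Note that only phrases actually produced by the policy enter the model, which is how the design excludes interfering dictionary matches such as 市长 (mayor).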
Preferably, the named entity recognition network module inputs the lattice-LSTM output obtained from the word segmentation and recombination network into a CRF layer, scores each candidate label sequence of the sentence using the feature function set of the CRF layer, exponentiates and normalizes the scores, searches over all possible label sequences with a first-order Viterbi algorithm, and takes the highest-scoring sequence as the final output. The value of the loss function is back-propagated for parameter training, and the loss value also serves as the delayed reward that updates the strategy network module. The loss function is defined as the sentence-level log-likelihood with an L2 regularization term, as follows:
L(θ) = − Σ_{(s,y)} log P(y | s; θ) + (λ/2)·‖θ‖²

where λ is the L2 regularization term coefficient; θ represents the parameter set; s and y represent a sentence and the label sequence corresponding to the sentence, respectively.
The training method is used for training the Chinese named entity recognition model and comprises the following steps:
step one: inputting sentence data for training into a strategy network module, wherein the strategy network module samples each word in a sentence with one action under each state space, and outputs an action sequence of the whole sentence;
step two: dividing sentences by the word segmentation and recombination network according to the action sequence output by the strategy network module, cutting the sentences into phrases, and combining the codes of the phrases with the code vector of the last word of the phrases so as to obtain the law-LSTM representation of the word;
step three: the hidden state obtained by the named entity recognition network from the word segmentation and recombination network is input into a CRF layer, a named entity recognition result is finally obtained, a loss value is obtained through calculation according to the recognition result and used for training a named entity recognition model, and meanwhile the loss value is used as a delay reward to guide the updating of the strategy network module;
the sentence is characterized by a lattice-LSTM model, so that the hidden state vector h of each word in the sentence can be obtained i The state vector sequence h= { H is then 1 ,h 2 ,…,h n Input CRF layer; let y=l 1 ,l 2 ,…,l n Representing the output tag of the CRF layer, the output tag sequence probability is calculated by:
Figure BDA0002266384690000041
wherein s represents a sentence;
Figure BDA0002266384690000042
is directed to l i Model parameters of (2); />
Figure BDA0002266384690000043
Is directed to l i-1 and li Is set to be a bias parameter of (a); y' represents all possible output tag sets.
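For intuition, the label-sequence probability can be checked by brute-force enumeration on a toy example (the emission and transition scores below are made up; a real model learns W_CRF and b_CRF and decodes with the Viterbi algorithm rather than enumerating):

```python
import itertools
import math

def seq_score(labels, emissions, transitions):
    """Sum of per-position emission scores plus label-pair transition scores."""
    s = sum(emissions[i][l] for i, l in enumerate(labels))
    s += sum(transitions[(a, b)] for a, b in zip(labels, labels[1:]))
    return s

def seq_prob(labels, emissions, transitions, tagset):
    """P(y|s): exponentiated score normalized over all label sequences y'."""
    n = len(emissions)
    z = sum(math.exp(seq_score(y, emissions, transitions))
            for y in itertools.product(tagset, repeat=n))
    return math.exp(seq_score(tuple(labels), emissions, transitions)) / z

# Toy 2-word sentence with 2 tags: per-position emission scores and
# per-pair transition scores (all values hypothetical).
tagset = ("O", "B")
emissions = [{"O": 0.5, "B": 1.0}, {"O": 0.2, "B": 0.1}]
transitions = {(a, b): 0.0 for a in tagset for b in tagset}
transitions[("B", "O")] = 0.3
p = seq_prob(("B", "O"), emissions, transitions, tagset)
```

Because the denominator sums over every candidate sequence, the probabilities of all sequences sum to one, matching the normalization in the formula above.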
The loss function is computed as:

L(θ) = − Σ_{(s,y)} log P(y | s; θ) + (λ/2)·‖θ‖²

where λ is the L2 regularization term coefficient; θ represents the parameter set; s and y represent a sentence and the correct label sequence corresponding to the sentence, respectively; P(y | s) denotes the probability that sentence s is labeled with sequence y, i.e. the probability of the correct labeling.
Preferably, in step one the actions comprise "internal" and "terminate", and the formula of the random policy is as follows:

π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) represents the probability of selecting action a_t; θ = {W, b} represents the parameters of the policy network; s_t is the state of the policy network at time t.
Preferably, in step two the character-level representation of each word is obtained through an LSTM, with the update formula:

c_t, h_t = f_LSTM(x_t, c_{t-1}, h_{t-1})

where f_LSTM represents the transfer function of the LSTM; x_t represents the encoding vector of the word input at time t of the sentence; c_t and h_t represent the cell state and the hidden state at time t, respectively.
After the division of the sentence is completed, the phrase information is integrated into the word-granularity LSTM model. The basic recurrent LSTM functions are as follows:
[ i_j^c ; f_j^c ; o_j^c ; c̃_j^c ] = [ σ ; σ ; σ ; tanh ]( W^cT [ x_j^c ; h_{j-1}^c ] + b^c )

c_j^c = f_j^c ⊙ c_{j-1}^c + i_j^c ⊙ c̃_j^c

h_j^c = o_j^c ⊙ tanh( c_j^c )

where x_j^c represents the encoding vector of the j-th word in the sentence; h_{j-1}^c represents the hidden state at the (j-1)-th word of the sentence; W^cT and b^c are model parameters; i_j^c, f_j^c and o_j^c represent the input, forget and output gates, respectively; c̃_j^c represents the new candidate state; c_{j-1}^c represents the cell state at the (j-1)-th word of the sentence; c_j^c represents the updated cell state; h_j^c represents the hidden state at the j-th word of the sentence, determined by the output gate o_j^c and the cell state c_j^c at the present moment; σ() represents the sigmoid function; tanh() represents the hyperbolic tangent activation function.
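The three equations above can be exercised with a plain scalar (one-dimensional) LSTM step — a didactic sketch with tiny hand-set weights, not the patent's trained parameters:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One word-level LSTM step on 1-d states.

    W maps each gate name to its weights on (x, h_prev); b maps it to a bias.
    Implements: gates from sigma/tanh of W[x; h] + b; c = f*c_prev + i*c~;
    h = o * tanh(c), matching the formulas in the text.
    """
    def pre(g):
        wx, wh = W[g]
        return wx * x + wh * h_prev + b[g]
    i = sigmoid(pre("i"))            # input gate
    f = sigmoid(pre("f"))            # forget gate
    o = sigmoid(pre("o"))            # output gate
    c_tilde = math.tanh(pre("c"))    # new candidate state
    c = f * c_prev + i * c_tilde     # updated cell state
    h = o * math.tanh(c)             # hidden state
    return h, c

# Hypothetical weights shared across gates, zero biases.
W = {g: (0.5, 0.25) for g in ("i", "f", "o", "c")}
b = {g: 0.0 for g in ("i", "f", "o", "c")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, W=W, b=b)
```

In the real model each quantity is a vector and ⊙ is an element-wise product; the scalar case shows the update structure only.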
Phrase information is characterized by an LSTM model without an output gate; the specific formulas are as follows:

[ i_{b,e}^w ; f_{b,e}^w ; c̃_{b,e}^w ] = [ σ ; σ ; tanh ]( W^wT [ x_{b,e}^w ; h_b^c ] + b^w )

c_{b,e}^w = f_{b,e}^w ⊙ c_b^c + i_{b,e}^w ⊙ c̃_{b,e}^w

where x_{b,e}^w represents the encoding vector of the phrase in the sentence starting at the b-th word and ending at the e-th word; h_b^c represents the hidden state at the b-th word of the sentence, i.e. at the first word of the phrase; W^wT and b^w are model parameters; i_{b,e}^w and f_{b,e}^w represent the input and forget gates, respectively; c̃_{b,e}^w represents the new candidate state; c_b^c represents the cell state at the first word of the phrase; c_{b,e}^w represents the updated cell state; σ() represents the sigmoid function; tanh() represents the hyperbolic tangent activation function.
In addition, an additional gate is added to select between word-granularity and phrase-granularity information; it takes as input the encoding vector of the word and the cell state of the phrase ending at that word, and is defined as follows:

i_{b,e}^l = σ( W^lT [ x_e^c ; c_{b,e}^w ] + b^l )

where x_e^c represents the encoding vector of the e-th word in the sentence; c_{b,e}^w represents the cell state of the phrase starting at the b-th word and ending at the e-th word, i.e. the cell state of a phrase ending at the e-th word of the sentence; W^lT and b^l are model parameters; i_{b,e}^l represents the additional gate; σ() represents the sigmoid function.

The update of the cell state is thereby changed, while the update of the hidden state remains unchanged. The final cell representation of the lattice-LSTM model is as follows:
c_j^c = Σ_{b ∈ {b' | w_{b',j} ∈ D}} α_{b,j}^c ⊙ c_{b,j}^w + α_j^c ⊙ c̃_j^c

where i_j^c is the input gate vector of the j-th word; i_{b,j}^l is the additional gate vector of the phrase starting at b and ending at j; c_{b,j}^w is the phrase cell state; c̃_j^c is the new candidate cell state of the word; α_{b,j}^c is the phrase information weight vector and α_j^c is the word information weight vector, obtained by normalizing the gate values i_{b,j}^l and i_j^c with a softmax.
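A minimal numeric sketch of this final cell update (one-dimensional states; all gate and cell values below are made up): the word's input gate and the additional gates of every phrase ending at the word are softmax-normalized, and the resulting weights mix the phrase cell states with the word's candidate state:

```python
import math

def lattice_cell(i_word, c_tilde, phrase_gates, phrase_cells):
    """c_j = sum_b alpha_{b,j} * c^w_{b,j} + alpha_j * c~_j with softmax weights.

    i_word: the word's input gate value i^c_j (pre-normalization).
    phrase_gates: additional-gate values i^l_{b,j} of phrases ending at j.
    phrase_cells: the corresponding phrase cell states c^w_{b,j}.
    """
    exps = [math.exp(g) for g in phrase_gates] + [math.exp(i_word)]
    z = sum(exps)
    alphas = [e / z for e in exps]        # phrase weights, then the word weight
    c = sum(a * cw for a, cw in zip(alphas, phrase_cells))
    c += alphas[-1] * c_tilde             # word-granularity contribution
    return c, alphas

# One phrase ends at this word; hypothetical gate and cell values.
c, alphas = lattice_cell(i_word=0.2, c_tilde=0.5,
                         phrase_gates=[1.0], phrase_cells=[0.8])
```

When no phrase ends at the word, the weight on the candidate state is 1 and the update reduces to the plain word-level LSTM cell.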
Preferably, before step one, the named entity recognition network and its network parameters are pre-trained; the words used by the named entity recognition network at this stage are obtained by dividing the original sentences with a simple heuristic algorithm.

The pre-trained parameters of the entity recognition network are then temporarily fixed as the network parameters of the named entity recognition network, the policy network is pre-trained next, and finally all network parameters are trained jointly.
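The three-stage schedule (pre-train the recognizer on heuristic segmentations, pre-train the policy with the recognizer frozen, then train jointly) might be organized as in the following skeleton; the train functions are placeholders standing in for the actual optimization loops, not the patented code:

```python
def train_schedule():
    """Run the three training phases in order and record what was trained."""
    log = []

    def pretrain_ner():
        # Segment sentences heuristically, then fit the NER network alone.
        log.append("pretrain_ner")

    def pretrain_policy():
        # NER parameters temporarily fixed; only the policy network updates,
        # guided by the delayed reward from the frozen recognizer.
        log.append("pretrain_policy")

    def joint_train():
        # All network parameters are updated together.
        log.append("joint_train")

    for phase in (pretrain_ner, pretrain_policy, joint_train):
        phase()
    return log

order = train_schedule()
```

The ordering matters: the policy's delayed reward is meaningless until the recognizer produces sensible losses, which is why the recognizer is pre-trained first.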
Compared with the prior art, the invention has the following beneficial effects: the Chinese named entity recognition model based on reinforcement learning and its training method use reinforcement learning to divide sentences effectively, avoid modeling the redundant interfering words matched in sentences, avoid dependence on an external dictionary and the influence of long texts, and make better use of correct word information, thereby helping the Chinese named entity recognition model improve its recognition effect.
Drawings
FIG. 1 is a schematic diagram of a Chinese named entity recognition model based on reinforcement learning;
FIG. 2 is a schematic diagram of a strategy network module of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 3 is a schematic diagram of a named entity recognition network module based on a reinforcement learning Chinese named entity recognition model according to the present invention;
FIG. 4 is a flow chart of a training method of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 5 is a diagram of an example sentence segmentation for a training method of a Chinese named entity recognition model based on reinforcement learning according to the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent; for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions; it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationship depicted in the drawings is for illustrative purposes only and is not to be construed as limiting the present patent.
The same or similar reference numbers in the drawings of embodiments of the invention correspond to the same or similar components. In the description of the present invention, it should be understood that orientation terms such as "upper", "lower", "left", "right", "long" and "short", where used, are based on the orientations or positional relationships shown in the drawings; they are used merely for convenience in describing the invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation. The terms describing positional relationships in the drawings are therefore for exemplary illustration only and are not to be construed as limiting the present patent; those of ordinary skill in the art can understand the specific meaning of the above terms according to the specific circumstances.
The technical scheme of the invention is further specifically described by the following specific embodiments with reference to the accompanying drawings:
example 1
FIGS. 1-3 illustrate an embodiment of a Chinese named entity recognition model based on reinforcement learning, which comprises a strategy network module, a word segmentation and recombination network, and a named entity recognition network module;
the strategy network module is used for sampling an action (the action comprises the internal part or the termination) for each word in the sentence under each state space by adopting a random strategy, so that an action sequence is obtained for the whole sentence, and delay rewards are obtained according to the recognition result of the Chinese named entity recognition network so as to guide the strategy network module to update; the random strategy is:
π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) represents the probability of selecting action a_t; θ = {W, b} represents the parameters of the policy network; s_t is the state of the policy network at time t.
The word segmentation and recombination network is used for dividing the sentence according to the action sequence output by the strategy network module, cutting the sentence into phrases, and combining the encoding of each phrase with the encoding vector of the last word of that phrase, so as to obtain the lattice-LSTM representation of the sentence;
specifically, the word segmentation and recombination network cuts sentences according to the action sequences output by the strategy network module to obtain phrases, and encodes each phrase to be respectively used as the input of the cell state at the last word of the corresponding phrase to obtain the lattice-LSTM representation of the sentences.
The named entity recognition network module is used for inputting the hidden states of the lattice-LSTM representation of the sentence into the conditional random field layer to obtain the named entity recognition result, computing a loss value from the recognition result to train the named entity recognition model, and simultaneously using the loss value as a delayed reward to guide updates of the strategy network module. The loss value is computed as follows:
L(θ) = − Σ_{(s,y)} log P(y | s; θ) + (λ/2)·‖θ‖²

where λ is the L2 regularization term coefficient; θ represents the parameter set; s and y represent a sentence and the correct label sequence corresponding to the sentence, respectively; P(y | s) denotes the probability that sentence s is labeled with sequence y, i.e. the probability of the correct labeling.
The working principle of this embodiment is as follows: first, the strategy network produces an action sequence; the word segmentation and recombination network then executes the actions in the sequence one by one, obtaining a phrase at each "terminate" action. Each phrase is used as auxiliary input information at the last word of the phrase, and lattice-LSTM modeling yields a sequence of hidden states; the hidden states are input into the named entity recognition network to obtain the label sequence of the sentence, and the recognition result serves as a delayed reward that guides updates of the strategy network module.
The beneficial effects of this embodiment are as follows: it enhances the neural-network-based LSTM-CRF model with a reinforcement learning framework that learns the internal structure of sentences and divides them efficiently; the resulting phrase information is integrated into a word-granularity lattice-LSTM model, which fully exploits both word-granularity and phrase-granularity information so as to achieve a better recognition effect.
Example 2
Fig. 4 shows an embodiment of a training method, based on reinforcement learning, for the Chinese named entity recognition model of embodiment 1; the method comprises the following steps:
pretreatment: pre-training a named entity recognition network and network parameters thereof, wherein words used by the named entity recognition network are words obtained by dividing an original sentence through a simple heuristic algorithm;
and (3) temporarily fixing the pre-trained partial network parameters of the entity identification network as the network parameters of the named entity identification network, then pre-training the strategy network, and finally jointly training the whole network parameters.
Step one: inputting sentence data for training into the strategy network module, wherein the strategy network module samples one action for each word in the sentence under each state, and outputs the action sequence of the whole sentence;
in step one, the states, actions, policies are defined as follows:
1. status: the encoding vector of the currently input word and the context vector preceding the word;
2. Actions: two distinct operations are defined, "internal" and "terminate";
3. strategy: the random strategy is defined as follows:
π(a_t | s_t; θ) = σ(W·s_t + b)

where π(a_t | s_t; θ) represents the probability of selecting action a_t; θ = {W, b} represents the parameters of the policy network; s_t is the state of the policy network at time t.
Step two: dividing the sentence by the word segmentation and recombination network according to the action sequence output by the strategy network module, cutting the sentence into phrases, and combining the encoding of each phrase with the encoding vector of the last word of that phrase, so as to obtain the lattice-LSTM representation of the sentence;
as shown in fig. 5, "washington in the united states" is divided into "washington" in the united states ". The character level characterization of the word by LSTM is performed with the updated formula as follows:
c_t, h_t = f_LSTM(x_t, c_{t-1}, h_{t-1})

where f_LSTM represents the transfer function of the LSTM; x_t represents the encoding vector of the word input at time t of the sentence; c_t and h_t represent the cell state and the hidden state at time t, respectively.
After the division of the sentence is completed, the phrase information is integrated into the word-granularity LSTM model. The basic recurrent LSTM functions are as follows:
[ i_j^c ; f_j^c ; o_j^c ; c̃_j^c ] = [ σ ; σ ; σ ; tanh ]( W^cT [ x_j^c ; h_{j-1}^c ] + b^c )

c_j^c = f_j^c ⊙ c_{j-1}^c + i_j^c ⊙ c̃_j^c

h_j^c = o_j^c ⊙ tanh( c_j^c )

where x_j^c represents the encoding vector of the j-th word in the sentence; h_{j-1}^c represents the hidden state at the (j-1)-th word of the sentence; W^cT and b^c are model parameters; i_j^c, f_j^c and o_j^c represent the input, forget and output gates, respectively; c̃_j^c represents the new candidate state; c_{j-1}^c represents the cell state at the (j-1)-th word of the sentence; c_j^c represents the updated cell state; h_j^c represents the hidden state at the j-th word of the sentence, determined by the output gate o_j^c and the cell state c_j^c at the present moment; σ() represents the sigmoid function, and tanh() represents the hyperbolic tangent activation function.
Phrase information is characterized by an LSTM model without an output gate; the specific formulas are as follows:

[ i_{b,e}^w ; f_{b,e}^w ; c̃_{b,e}^w ] = [ σ ; σ ; tanh ]( W^wT [ x_{b,e}^w ; h_b^c ] + b^w )

c_{b,e}^w = f_{b,e}^w ⊙ c_b^c + i_{b,e}^w ⊙ c̃_{b,e}^w

where x_{b,e}^w represents the encoding vector of the phrase in the sentence starting at the b-th word and ending at the e-th word; h_b^c represents the hidden state at the b-th word of the sentence, i.e. at the first word of the phrase; W^wT and b^w are model parameters; i_{b,e}^w and f_{b,e}^w represent the input and forget gates, respectively; c̃_{b,e}^w represents the new candidate state; c_b^c represents the cell state at the first word of the phrase; c_{b,e}^w represents the updated cell state; σ() represents the sigmoid function, and tanh() represents the hyperbolic tangent activation function.
In addition, an additional gate is added to select between word-granularity and phrase-granularity information; it takes as input the encoding vector of the word and the cell state of the phrase ending at that word, and is defined as follows:

i_{b,e}^l = σ( W^lT [ x_e^c ; c_{b,e}^w ] + b^l )

where x_e^c represents the encoding vector of the e-th word in the sentence; c_{b,e}^w represents the cell state of the phrase starting at the b-th word and ending at the e-th word, i.e. the cell state of a phrase ending at the e-th word of the sentence; W^lT and b^l are model parameters; i_{b,e}^l represents the additional gate; σ() represents the sigmoid function.

The update of the cell state is thereby changed, while the update of the hidden state remains unchanged. The final cell representation of the lattice-LSTM model is as follows:
c_j^c = Σ_{b ∈ {b' | w_{b',j} ∈ D}} α_{b,j}^c ⊙ c_{b,j}^w + α_j^c ⊙ c̃_j^c

where i_j^c is the input gate vector of the j-th word; i_{b,j}^l is the additional gate vector of the phrase starting at b and ending at j; c_{b,j}^w is the phrase cell state; c̃_j^c is the new candidate cell state of the word; α_{b,j}^c is the phrase information weight vector and α_j^c is the word information weight vector, obtained by normalizing the gate values i_{b,j}^l and i_j^c with a softmax.
Step three: the named entity recognition network inputs the hidden states obtained from the word segmentation and recombination network into a CRF layer to obtain the named entity recognition result; a loss value is computed from the recognition result and used to train the named entity recognition model, and the loss value simultaneously serves as a delayed reward that guides updates of the strategy network module;
the sentence is characterized by the lattice-LSTM model, yielding the hidden state vector $h_{i}$ of each word in the sentence; the state vector sequence $H=\{h_{1},h_{2},\ldots,h_{n}\}$ is then input into the CRF layer. Let $y=l_{1},l_{2},\ldots,l_{n}$ denote the output tag sequence of the CRF layer; the probability of an output tag sequence is calculated by:

$$P(y\mid s)=\frac{\exp\left(\sum_{i}\left(W_{CRF}^{l_{i}}h_{i}+b_{CRF}^{(l_{i-1},l_{i})}\right)\right)}{\sum_{y'\in Y'}\exp\left(\sum_{i}\left(W_{CRF}^{l'_{i}}h_{i}+b_{CRF}^{(l'_{i-1},l'_{i})}\right)\right)}$$

wherein s represents a sentence; $W_{CRF}^{l_{i}}$ is the model parameter for tag $l_{i}$; $b_{CRF}^{(l_{i-1},l_{i})}$ is the bias parameter for the tag pair $l_{i-1}$ and $l_{i}$; $Y'$ represents the set of all possible output tag sequences.
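The normalization over all possible tag sequences Y' can be made concrete with a brute-force sketch (hypothetical toy dimensions; per-tag emission weights W and transition biases b stand in for the CRF parameters, and a real CRF layer would compute the denominator with the forward algorithm rather than by enumeration):

```python
import itertools
import numpy as np

def crf_sequence_prob(H, W, b, y):
    """P(y | s) for hidden states H (n, d), emission weights W (T, d),
    transition biases b (T, T); y is a tag-index sequence of length n."""
    n, T = H.shape[0], W.shape[0]

    def score(tags):
        s = sum(W[tags[i]] @ H[i] for i in range(n))            # emission terms
        s += sum(b[tags[i - 1], tags[i]] for i in range(1, n))  # transition terms
        return s

    # Partition function: sum over every possible tag sequence y' in Y'.
    Z = sum(np.exp(score(t)) for t in itertools.product(range(T), repeat=n))
    return np.exp(score(tuple(y))) / Z
```

Enumeration costs O(T^n); the forward algorithm gives the same Z in O(n·T²), which is what makes CRF training practical.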
The loss value function is calculated as:

$$L=\sum_{i=1}^{N}\log\left(P\left(y_{i}\mid s_{i}\right)\right)+\frac{\lambda}{2}\|\theta\|^{2}$$

wherein $\lambda$ is the $L_{2}$ regularization coefficient, $\theta$ represents the parameter set, and $s_{i}$ and $y_{i}$ respectively represent a sentence and the correct labeling sequence corresponding to that sentence.
The reward is defined as follows: after an action sequence is sampled by the strategy network, the division of the sentence is obtained; the phrases produced by this division are added to the word-granularity LSTM model as phrase-granularity information, yielding the lattice-LSTM representation. This representation is input into the named entity recognition network module, the entity label of each word is obtained through the CRF layer, the labels are decoded, and the reward value is calculated from the recognition result. Because the final recognition result must be obtained before the reward value can be computed, this is a delayed reward, and it is used to guide the update of the strategy network module.
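The delayed-reward update of the strategy network corresponds to the standard REINFORCE rule; the following is a minimal sketch assuming the Bernoulli policy π(a_t|s_t;θ)=σ(W·s_t+b) over the two actions (inside / termination) and treating the scalar reward as, e.g., the negative recognition loss (all names and shapes hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reinforce_update(W, b, states, actions, reward, lr=0.01):
    """One REINFORCE step: theta <- theta + lr * R * grad log pi(a_t | s_t).

    states  : (n, d) policy states, one per word of the sentence
    actions : (n,) sampled actions, 1 = termination, 0 = inside
    reward  : scalar delayed reward, known only after the whole sentence
              has been segmented and recognized
    """
    for s_t, a_t in zip(states, actions):
        p = sigmoid(W @ s_t + b)      # probability of the termination action
        grad_logit = a_t - p          # d log pi / d(W s_t + b) for a Bernoulli policy
        W += lr * reward * grad_logit * s_t
        b += lr * reward * grad_logit
    return W, b
```

A positive reward pushes the policy toward the sampled segmentation actions; a negative reward pushes it away, which is exactly how the recognition loss steers the word segmentation.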
It is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications of the above teachings will be apparent to those of ordinary skill in the art; it is neither necessary nor possible to exhaustively enumerate all embodiments here. Any modification, equivalent replacement, improvement, etc. which comes within the spirit and principles of the invention is desired to be protected by the following claims.

Claims (8)

1. A training method of a Chinese named entity recognition model based on reinforcement learning is characterized by comprising the following steps:
step one: sentence data for training is input into a strategy network module; the strategy network module samples one action for each word in the sentence under each state space, and outputs the action sequence of the whole sentence;
step two: the word segmentation and recombination network divides the sentence according to the action sequence output by the strategy network module, breaking the sentence into phrases, and combines the encoding of each phrase with the encoding vector of the last word of the phrase, so as to obtain the lattice-LSTM representation of the sentence; each word is characterized at character level through an LSTM, each phrase being delimited by the termination actions, with the update formula as follows:

$$\left(c_{t},h_{t}\right)=\phi\left(c_{t-1},h_{t-1},x_{t}\right)$$

wherein $\phi()$ represents the transfer function of the LSTM; $x_{t}$ represents the encoding vector of the word input at time t of the sentence; $c_{t}$ and $h_{t}$ respectively represent the cell state and the hidden state at time t;
after the division of the sentence is completed, the phrase information is integrated into a word-granularity LSTM model, which follows the basic recurrent LSTM function:

$$\begin{bmatrix}i_{j}^{c}\\f_{j}^{c}\\o_{j}^{c}\\\tilde{c}_{j}^{c}\end{bmatrix}=\begin{bmatrix}\sigma\\\sigma\\\sigma\\\tanh\end{bmatrix}\left(W^{cT}\begin{bmatrix}x_{j}^{c}\\h_{j-1}^{c}\end{bmatrix}+b^{c}\right)$$

$$c_{j}^{c}=f_{j}^{c}\odot c_{j-1}^{c}+i_{j}^{c}\odot\tilde{c}_{j}^{c}$$

$$h_{j}^{c}=o_{j}^{c}\odot\tanh\left(c_{j}^{c}\right)$$

wherein $x_{j}^{c}$ represents the encoding vector of the j-th word in the sentence; $h_{j-1}^{c}$ represents the hidden state at the (j-1)-th word of the sentence; $W^{cT}$ and $b^{c}$ are model parameters; $i_{j}^{c}$, $f_{j}^{c}$ and $o_{j}^{c}$ respectively represent the input, forget and output gates; $\tilde{c}_{j}^{c}$ represents the new candidate state; $c_{j-1}^{c}$ represents the cell state of the (j-1)-th word of the sentence; $c_{j}^{c}$ represents the updated cell state; $h_{j}^{c}$ represents the hidden state at the j-th word of the sentence, determined by the output gate $o_{j}^{c}$ and the cell state $c_{j}^{c}$ at the present moment; $\sigma()$ represents the sigmoid function, and $\tanh()$ represents the hyperbolic tangent activation function;
the phrase information is characterized by an LSTM model without an output gate; the specific formula is as follows:

$$\begin{bmatrix}i_{b,e}^{w}\\f_{b,e}^{w}\\\tilde{c}_{b,e}^{w}\end{bmatrix}=\begin{bmatrix}\sigma\\\sigma\\\tanh\end{bmatrix}\left(W^{wT}\begin{bmatrix}x_{b,e}^{w}\\h_{b}^{c}\end{bmatrix}+b^{w}\right)$$

$$c_{b,e}^{w}=f_{b,e}^{w}\odot c_{b}^{c}+i_{b,e}^{w}\odot\tilde{c}_{b,e}^{w}$$

wherein $x_{b,e}^{w}$ represents the encoding vector of a phrase in the sentence starting from the b-th word and ending with the e-th word; $h_{b}^{c}$ represents the hidden state at the b-th word of the sentence, i.e. the hidden state at the first word of the phrase; $W^{wT}$ and $b^{w}$ are model parameters; $i_{b,e}^{w}$ and $f_{b,e}^{w}$ respectively represent the input and forget gates; $\tilde{c}_{b,e}^{w}$ represents the new candidate state; $c_{b}^{c}$ represents the cell state at the first word of the phrase; $c_{b,e}^{w}$ represents the updated cell state; $\sigma()$ represents the sigmoid function; $\tanh()$ represents the hyperbolic tangent activation function;
additionally, an extra gate is introduced to select between word-granularity and phrase-granularity information; its inputs are the encoding vector of the word and the cell state of the phrase ending with that word, and the formula is defined as follows:

$$i_{b,e}^{l}=\sigma\left(W^{lT}\left[x_{e}^{c};c_{b,e}^{w}\right]+b^{l}\right)$$

wherein $x_{e}^{c}$ represents the encoding vector of the e-th word in the sentence; $c_{b,e}^{w}$ represents the cell state of the phrase starting from the b-th word and ending with the e-th word, i.e. the cell state of a phrase ending with the e-th word in the sentence; $W^{lT}$ and $b^{l}$ are model parameters; $i_{b,e}^{l}$ represents the additional gate; $\sigma()$ represents the sigmoid function;
the update of the cell state is thereby changed, while the update of the hidden state remains unchanged; the final cell-state representation of the lattice-LSTM model is:

$$c_{j}^{c}=\sum_{b}\alpha_{b,j}^{c}\odot c_{b,j}^{w}+\alpha_{j}^{c}\odot\tilde{c}_{j}^{c}$$

where the gate values are normalized so that they sum to one:

$$\alpha_{b,j}^{c}=\frac{\exp\left(i_{b,j}^{l}\right)}{\exp\left(i_{j}^{c}\right)+\sum_{b'}\exp\left(i_{b',j}^{l}\right)},\qquad\alpha_{j}^{c}=\frac{\exp\left(i_{j}^{c}\right)}{\exp\left(i_{j}^{c}\right)+\sum_{b'}\exp\left(i_{b',j}^{l}\right)}$$

wherein $i_{j}^{c}$ is the input gate vector of the j-th word; $i_{b,j}^{l}$ is the input gate vector of the phrase starting with b and ending with j; $c_{b,j}^{w}$ is the phrase cell state; $\tilde{c}_{j}^{c}$ is the new candidate cell state of the word; $\alpha_{b,j}^{c}$ is the phrase information vector; $\alpha_{j}^{c}$ is the word information vector;
step three: the named entity recognition network inputs the hidden states obtained from the word segmentation and recombination network into a conditional random field layer to finally obtain the named entity recognition result; a loss value is calculated according to the recognition result and used for training the named entity recognition model, and at the same time the loss value serves as a delayed reward to guide the update of the strategy network module;
the sentence is characterized by the lattice-LSTM model, yielding the hidden state vector $h_{i}$ of each word in the sentence; the state vector sequence $H=\{h_{1},h_{2},\ldots,h_{n}\}$ is then input into the conditional random field layer; let $y=l_{1},l_{2},\ldots,l_{n}$ denote the output tag sequence of the conditional random field layer; the probability of an output tag sequence is calculated by:

$$P(y\mid s)=\frac{\exp\left(\sum_{i}\left(W_{CRF}^{l_{i}}h_{i}+b_{CRF}^{(l_{i-1},l_{i})}\right)\right)}{\sum_{y'\in Y'}\exp\left(\sum_{i}\left(W_{CRF}^{l'_{i}}h_{i}+b_{CRF}^{(l'_{i-1},l'_{i})}\right)\right)}$$

wherein s represents a sentence; $W_{CRF}^{l_{i}}$ is the model parameter for tag $l_{i}$; $b_{CRF}^{(l_{i-1},l_{i})}$ is the bias parameter for the tag pair $l_{i-1}$ and $l_{i}$; $Y'$ represents the set of all possible output tag sequences;
the loss value function is calculated as:

$$L=\sum_{i=1}^{N}\log\left(P\left(y_{i}\mid s_{i}\right)\right)+\frac{\lambda}{2}\|\theta\|^{2}$$

wherein $\lambda$ is the $L_{2}$ regularization coefficient; $\theta$ represents the parameter set; $s_{i}$ and $y_{i}$ respectively represent a sentence and the correct labeling sequence corresponding to that sentence; $P$ denotes the probability that the sentence $s$ is labeled as sequence $y$, i.e. the probability that the labeling is correct.
2. The training method of a reinforcement learning-based Chinese named entity recognition model of claim 1, wherein in said step one, said actions comprise inside or termination, and the formula of the random strategy is as follows:

$$\pi\left(a_{t}\mid s_{t};\theta\right)=\sigma\left(W\cdot s_{t}+b\right)$$

wherein $\pi(a_{t}\mid s_{t};\theta)$ represents the probability of selecting action $a_{t}$; $\theta=\{W,b\}$ represents the parameters of the policy network; $s_{t}$ is the state of the strategy network at time t; $\sigma()$ represents the sigmoid function; W and b denote network parameters.
3. The training method of a Chinese named entity recognition model based on reinforcement learning according to claim 1, wherein before said step one, the named entity recognition network and its network parameters are pre-trained, the words used by the named entity recognition network being obtained by dividing the original sentence through a simple heuristic algorithm;
the pre-trained parameters of the entity recognition network are temporarily fixed as the network parameters of the named entity recognition network, the strategy network is then pre-trained, and finally the parameters of the whole network are trained jointly.
4. A Chinese named entity recognition model based on reinforcement learning, characterized by comprising a strategy network module, a word segmentation and recombination network, and a named entity recognition network module, trained by the training method of any one of claims 1 to 3;
the strategy network module is used for sampling an action for each word in the sentence under each state space by adopting a random strategy, so as to obtain an action sequence for the whole sentence;
the word segmentation and recombination network is used for dividing sentences according to the action sequence output by the strategy network module, breaking the sentences into phrases, and combining the codes of the phrases with the code vector of the last word of the phrases so as to obtain the lattice-LSTM expression of the sentences;
and the named entity recognition network module is used for inputting the hidden states of the lattice-LSTM representation of the sentence into the conditional random field, finally obtaining the named entity recognition result, calculating a loss value according to the recognition result to train the named entity recognition model, and simultaneously guiding the update of the strategy network module by taking the loss value as a delayed reward.
5. The reinforcement-learning-based Chinese named entity recognition model of claim 4, wherein said actions comprise inside or termination.
6. The reinforcement-learning-based Chinese named entity recognition model of claim 4, wherein said random strategy is:

$$\pi\left(a_{t}\mid s_{t};\theta\right)=\sigma\left(W\cdot s_{t}+b\right)$$

wherein $\pi(a_{t}\mid s_{t};\theta)$ represents the probability of selecting action $a_{t}$; $\theta=\{W,b\}$ represents the parameters of the policy network; $s_{t}$ is the state of the strategy network at time t; $\sigma()$ represents the sigmoid function; W and b denote network parameters.
7. The reinforcement learning-based Chinese named entity recognition model of claim 6, wherein the word segmentation and recombination network cuts the sentence according to the action sequence output by the strategy network module to obtain phrases, and inputs the encoding of each phrase as the cell state at the last word of the corresponding phrase, so as to obtain the lattice-LSTM representation of the sentence.
8. The reinforcement learning-based Chinese named entity recognition model of claim 7, wherein the named entity recognition network module inputs the output of the lattice-LSTM obtained by the word segmentation and recombination network into a conditional random field layer, scores each labeling sequence of the sentence using the feature function set of the conditional random field layer, exponentiates and normalizes the scores, evaluates all possible labeling sequences using a first-order Viterbi algorithm, and takes the labeling sequence with the highest score as the final output; meanwhile, a loss function is defined, the loss value is back-propagated for parameter training, and the loss value is also used as a delayed reward to update the strategy network module; the loss function is defined as the sentence-level log-likelihood with an $L_{2}$ regularization term, as follows:

$$L=\sum_{i=1}^{N}\log\left(P\left(y_{i}\mid s_{i}\right)\right)+\frac{\lambda}{2}\|\theta\|^{2}$$

wherein $\lambda$ is the $L_{2}$ regularization coefficient; $\theta$ represents the parameter set; $s_{i}$ and $y_{i}$ respectively represent a sentence and the correct labeling sequence corresponding to that sentence; $P$ denotes the probability that the sentence $s$ is labeled as sequence $y$, i.e. the probability that the labeling is correct.
CN201911089295.3A 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof Active CN110826334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911089295.3A CN110826334B (en) 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof


Publications (2)

Publication Number Publication Date
CN110826334A CN110826334A (en) 2020-02-21
CN110826334B true CN110826334B (en) 2023-04-21

Family

ID=69553722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911089295.3A Active CN110826334B (en) 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof

Country Status (1)

Country Link
CN (1) CN110826334B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476031A (en) * 2020-03-11 2020-07-31 重庆邮电大学 Improved Chinese named entity recognition method based on Lattice-LSTM
CN111666734B (en) * 2020-04-24 2021-08-10 北京大学 Sequence labeling method and device
CN111951959A (en) * 2020-08-23 2020-11-17 云知声智能科技股份有限公司 Dialogue type diagnosis guiding method and device based on reinforcement learning and storage medium
CN112163089B (en) * 2020-09-24 2023-06-23 中国电子科技集团公司第十五研究所 High-technology text classification method and system integrating named entity recognition
CN112699682B (en) * 2020-12-11 2022-05-17 山东大学 Named entity identification method and device based on combinable weak authenticator
CN113051921B (en) * 2021-03-17 2024-02-20 北京智慧星光信息技术有限公司 Internet text entity identification method, system, electronic equipment and storage medium
CN112966517B (en) * 2021-04-30 2022-02-18 平安科技(深圳)有限公司 Training method, device, equipment and medium for named entity recognition model
CN114004233B (en) * 2021-12-30 2022-05-06 之江实验室 Remote supervision named entity recognition method based on semi-training and sentence selection

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117472A (en) * 2018-11-12 2019-01-01 新疆大学 A kind of Uighur name entity recognition method based on deep learning
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition
CN109597876A (en) * 2018-11-07 2019-04-09 中山大学 A kind of more wheels dialogue answer preference pattern and its method based on intensified learning
CN109657239A (en) * 2018-12-12 2019-04-19 电子科技大学 The Chinese name entity recognition method learnt based on attention mechanism and language model




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant