CN110826334A - Chinese named entity recognition model based on reinforcement learning and training method thereof - Google Patents
- Publication number
- CN110826334A (application CN201911089295.3A)
- Authority
- CN
- China
- Prior art keywords
- word
- sentence
- network
- named entity
- entity recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention relates to a Chinese named entity recognition model based on reinforcement learning and a training method thereof. The model comprises a policy network module, a word segmentation and recombination network, and a named entity recognition network module. First, the policy network specifies an action sequence; the word segmentation and recombination network then executes the actions one by one, and a phrase is obtained at each 'termination' action. The phrase is used as auxiliary input information for lattice-LSTM modeling to obtain a hidden state sequence, the hidden states are input into the named entity recognition network to obtain the label sequence of the sentence, and the recognition result is used as a delayed reward to guide the update of the policy network module. The method uses reinforcement learning to divide sentences effectively, avoids modeling the redundant interfering words matched in a sentence, avoids both dependence on an external dictionary and the influence of long texts, and makes better use of correct word information, thereby helping the Chinese named entity recognition model improve its recognition performance.
Description
Technical Field
The invention relates to the field of machine learning, and in particular to a Chinese named entity recognition model based on reinforcement learning and a training method thereof.
Background
Named Entity Recognition (NER) is a fundamental task in natural language processing: it identifies named referents in text, lays the foundation for tasks such as relation extraction, question answering, syntactic analysis and machine translation, and plays an important role in putting natural language processing technology into practical use. In general, the named entity recognition task identifies three major classes of named entities (entity, time and numeric) and seven minor classes (person name, organization name, place name, time, date, currency and percentage) in the text to be processed.
An existing Chinese named entity recognition model is lattice-LSTM. Besides each character of a sentence, it also takes as input the cell vectors of all potential words that end at that character, the selection of the potential words depending on an external dictionary. An additional gate is added to control the mixing of word-granularity and character-granularity information, so that the input changes from {character encoding, previous hidden state, previous cell state} to {character encoding, previous hidden state, the information of all words ending at that character}. The advantage of this model is that explicit word information can be exploited in a character-sequence labeling model without suffering from word segmentation errors.
However, precisely because the lattice-LSTM model uses the information of every word matched in a sentence, a word composed of adjacent characters is input into the model as dictionary-word information whenever it appears in the external dictionary, even though it is not necessarily a correct division of the sentence. For example, for 'Nanjing Yangtze River Bridge', the model takes as input, in order, every dictionary word formed by its characters, where a dictionary word is a noun recorded in the external dictionary; 'Nanjing', 'Nanjing City', 'Shangjiang', 'Changjiang Bridge' and 'Changjiang River Bridge' are all treated as dictionary words, but 'Shangjiang' is obviously an interfering word in this sentence, and its word information has a negative influence on entity recognition. In addition, the model usually requires the external dictionary to be constructed autonomously from the experimental data set, giving it a serious dependence on that dictionary. Meanwhile, as the text length increases, the number of potential words in a sentence grows and the complexity of the model rises greatly.
Disclosure of Invention
The invention aims to solve the prior-art problems of modeling redundant interfering words matched in a sentence, depending on an external dictionary and being affected by long texts, and provides a Chinese named entity recognition model based on reinforcement learning and a training method thereof. The input of interfering words and the use of an external dictionary are thereby avoided, the number of candidate words in a sentence is reduced as the text length increases, and correct word information can be exploited to improve the recognition accuracy of the Chinese named entity recognition model.
In order to solve the technical problems, the invention adopts the following technical scheme: a Chinese named entity recognition model based on reinforcement learning is provided, comprising a policy network module, a word segmentation and recombination network and a named entity recognition network module;
the policy network module is used for sampling an action for each character in the sentence under each state space by adopting a random policy, so as to obtain an action sequence for the whole sentence, and for obtaining a delayed reward according to the recognition result of the Chinese named entity recognition network to guide its own update;
the word segmentation and recombination network is used for dividing the sentence according to the action sequence output by the policy network module, cutting it into phrases, and combining the encoding of each phrase with the encoding vector of the phrase's last character, so as to obtain the lattice-LSTM representation of the sentence;
and the named entity recognition network module is used for inputting the hidden states of the sentence's lattice-LSTM representation into a CRF (conditional random field) layer to obtain the named entity recognition result, calculating a loss value from the recognition result to train the named entity recognition model, and simultaneously using the loss value as the delayed reward that guides the update of the policy network module.
Preferably, the actions comprise 'internal' and 'termination'.
Preferably, the random policy is:
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) represents the probability of selecting action a_t; θ = {W, b} represents the parameters of the policy network; s_t is the state of the policy network at time t.
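The sampling step above can be sketched in a few lines of numpy. This is a minimal illustrative sketch, not the patent's implementation; the function name `sample_actions` and the toy dimensions are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_actions(states, W, b, rng):
    """Sample one action per character with pi(a_t | s_t; theta) = sigmoid(W.s_t + b).

    states: (n, d) array, one state vector per character in the sentence.
    Returns the action sequence and the probability of each sampled action.
    """
    actions, probs = [], []
    for s_t in states:
        p_term = sigmoid(W @ s_t + b)   # probability of sampling 'termination'
        a_t = "termination" if rng.random() < p_term else "internal"
        actions.append(a_t)
        probs.append(p_term if a_t == "termination" else 1.0 - p_term)
    return actions, np.array(probs)

# Toy example: a 4-character sentence with 3-dimensional states.
states = rng.normal(size=(4, 3))
W, b = rng.normal(size=3), 0.0
actions, probs = sample_actions(states, W, b, rng)
assert len(actions) == 4 and np.all((probs > 0) & (probs < 1))
```

The returned per-action probabilities are what a policy-gradient update later needs for computing log π(a_t | s_t).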
Preferably, the word segmentation and recombination network cuts the sentence into phrases according to the action sequence output by the policy network module, encodes each phrase, and inputs each encoding as the cell state at the last character of the corresponding phrase, so as to obtain the lattice-LSTM representation of the sentence.
Preferably, the named entity recognition network module inputs the lattice-LSTM output obtained by the word segmentation and recombination network into a CRF layer, scores every candidate label sequence of the sentence with the feature-function set of the CRF layer, exponentiates and normalizes the scores, evaluates all possible label sequences with the first-order Viterbi algorithm and takes the highest-scoring sequence as the final output, performs parameter training by back-propagating the value of the loss function, and simultaneously uses the loss value as the delayed reward that updates the policy network module. The loss function is defined as the sentence-level log-likelihood with an L2 regularization term:
L(θ) = −Σ_i log P(y_i | s_i; θ) + (λ/2)·‖θ‖²
where λ is the L2 regularization coefficient; θ represents the parameter set; s_i and y_i represent a sentence and its correct label sequence, respectively.
The training method is used for training the above Chinese named entity recognition model and comprises the following steps:
Step one: input sentence data for training into the policy network module; the policy network module samples one action for each character in the sentence under each state space and outputs the action sequence of the whole sentence.
Step two: the word segmentation and recombination network divides the sentence according to the action sequence output by the policy network module, cuts it into phrases, and combines the encoding of each phrase with the encoding vector of the phrase's last character to obtain the lattice-LSTM representation of the sentence.
Step three: the named entity recognition network inputs the hidden states obtained from the word segmentation and recombination network into the CRF layer to obtain the named entity recognition result, calculates a loss value from the recognition result to train the named entity recognition model, and simultaneously uses the loss value as the delayed reward that guides the update of the policy network module.
The sentence is characterized by the lattice-LSTM model to obtain a hidden state vector h_i for each character, and the state vector sequence H = {h_1, h_2, …, h_n} is input into the CRF layer. Let y = {l_1, l_2, …, l_n} denote the output labels of the CRF layer; the probability of an output label sequence is calculated by
P(y | s) = exp(Σ_i (W_CRF^{l_i}·h_i + b_CRF^{(l_{i−1}, l_i)})) / Σ_{y'} exp(Σ_i (W_CRF^{l'_i}·h_i + b_CRF^{(l'_{i−1}, l'_i)}))
where s represents the sentence; W_CRF^{l_i} is the model parameter for label l_i; b_CRF^{(l_{i−1}, l_i)} is the bias parameter for the label pair (l_{i−1}, l_i); y' ranges over all possible output label sequences.
The formula for the loss value is
L(θ) = −Σ_i log P(y_i | s_i; θ) + (λ/2)·‖θ‖²
where λ is the L2 regularization coefficient; θ represents the parameter set; s and y respectively represent a sentence and its correct label sequence; P is the probability that sentence s is labeled with sequence y, i.e., the probability of the correct labeling.
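The loss above is a negative log-likelihood plus an L2 penalty; a minimal numpy sketch follows. The function name `ner_loss` and the toy numbers are illustrative assumptions, not part of the patent.

```python
import numpy as np

def ner_loss(log_probs, theta, lam):
    """Sentence-level negative log-likelihood plus an L2 penalty.

    log_probs: log P(y_i | s_i) for each training sentence i (the CRF
    probability of the correct label sequence); theta: a flat vector of
    all trainable parameters; lam: the L2 coefficient lambda.
    """
    nll = -np.sum(log_probs)              # -sum_i log P(y_i | s_i)
    l2 = 0.5 * lam * np.sum(theta ** 2)   # (lambda / 2) * ||theta||^2
    return nll + l2

log_probs = np.log(np.array([0.9, 0.8]))  # two well-classified sentences
theta = np.array([1.0, -2.0])
loss = ner_loss(log_probs, theta, lam=0.1)
# nll = -ln(0.9 * 0.8) = -ln 0.72, l2 = 0.05 * (1 + 4) = 0.25
assert np.isclose(loss, -np.log(0.72) + 0.25)
```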
Preferably, in step one, the actions comprise 'internal' and 'termination', and the random policy is
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) represents the probability of selecting action a_t; θ = {W, b} represents the parameters of the policy network; s_t is the state of the policy network at time t.
Preferably, in step two, the characters are characterized at character level by an LSTM, with the update formula
(c_t, h_t) = f_LSTM(x_t, c_{t−1}, h_{t−1})
where f_LSTM represents the transfer function of the LSTM; x_t is the encoding vector of the character input at time t of the sentence; c_t and h_t are respectively the cell state and the hidden state at time t.
After the division of the sentence is completed, the phrase information is integrated into the character-granularity LSTM model; the basic recurrent LSTM function is
[i_j^c; f_j^c; o_j^c; č_j^c] = [σ; σ; σ; tanh](W^{cT}·[x_j^c; h_{j−1}^c] + b^c)
c_j^c = f_j^c ⊙ c_{j−1}^c + i_j^c ⊙ č_j^c
h_j^c = o_j^c ⊙ tanh(c_j^c)
where x_j^c represents the encoding vector of the j-th character in the sentence; h_{j−1}^c represents the hidden state at the (j−1)-th character of the sentence; W^{cT} and b^c are model parameters; i_j^c, f_j^c and o_j^c respectively represent the input, forget and output gates; č_j^c represents the new candidate state; c_{j−1}^c represents the cell state at the (j−1)-th character of the sentence; c_j^c denotes the updated cell state; h_j^c, the hidden state at the j-th character, is determined by the output gate o_j^c and the current cell state c_j^c; σ(·) represents the sigmoid function; tanh(·) represents the hyperbolic tangent activation function.
The phrase information is characterized by an LSTM model without an output gate, and the specific formula is as follows:
[i_{b,e}^w; f_{b,e}^w; č_{b,e}^w] = [σ; σ; tanh](W^{wT}·[x_{b,e}^w; h_b^c] + b^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b^c + i_{b,e}^w ⊙ č_{b,e}^w
where x_{b,e}^w represents the encoding vector of the phrase starting at the b-th character and ending at the e-th character of the sentence; h_b^c represents the hidden state at the b-th character, i.e., at the first character of the phrase; W^{wT} and b^w are model parameters; i_{b,e}^w and f_{b,e}^w respectively represent the input and forget gates; č_{b,e}^w represents the new candidate state; c_b^c represents the cell state at the first character of the phrase; c_{b,e}^w denotes the updated cell state; σ(·) represents the sigmoid function; tanh(·) represents the hyperbolic tangent activation function.
In addition, an extra gate is added to select between word-granularity and character-granularity information; its input is the encoding vector of a character and the cell state of a phrase ending with that character:
i_{b,e}^l = σ(W^{lT}·[x_e^c; c_{b,e}^w] + b^l)
where x_e^c represents the encoding vector of the e-th character in the sentence; c_{b,e}^w represents the cell state of the phrase starting at the b-th character and ending at the e-th character, i.e., of a phrase in the sentence ending with the e-th character; W^{lT} and b^l are model parameters; i_{b,e}^l denotes the extra gate; σ(·) represents the sigmoid function.
The cell-state update is thus changed, while the hidden state is updated as before; the final characterization of the lattice-LSTM model is
c_j^c = Σ_b α_{b,j}^l ⊙ c_{b,j}^w + α_j^c ⊙ č_j^c
h_j^c = o_j^c ⊙ tanh(c_j^c)
where i_j^c is the input gate vector of the j-th character; i_{b,j}^l is the extra gate vector of a phrase starting at the b-th character and ending at the j-th; c_{b,j}^w is the phrase cell state; č_j^c is the new candidate cell state of the character; α_{b,j}^l is the phrase information vector and α_j^c is the character information vector, obtained by normalizing the gates element-wise:
α_{b,j}^l = exp(i_{b,j}^l) / (exp(i_j^c) + Σ_{b'} exp(i_{b',j}^l))
α_j^c = exp(i_j^c) / (exp(i_j^c) + Σ_{b'} exp(i_{b',j}^l))
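The gated merge of phrase and character cell states can be sketched as follows. This is an illustrative numpy sketch of the normalization just described; the function name `lattice_cell_state` and the toy data are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lattice_cell_state(c_tilde_j, i_j, phrase_cells, phrase_gates):
    """Merge character- and phrase-granularity cell states for character j.

    c_tilde_j: candidate cell state of the j-th character; i_j: its input
    gate.  phrase_cells / phrase_gates: the cell states c^w_{b,j} and extra
    gates i^l_{b,j} of every phrase ending at character j.  The gates are
    normalized element-wise (a softmax per hidden dimension), so the
    weights in each dimension sum to one.
    """
    gates = np.stack(phrase_gates + [i_j])          # (k+1, d)
    alphas = np.exp(gates) / np.exp(gates).sum(axis=0)
    cells = np.stack(phrase_cells + [c_tilde_j])    # (k+1, d)
    return (alphas * cells).sum(axis=0)             # merged cell state c_j

rng = np.random.default_rng(2)
d = 4
c_tilde = rng.normal(size=d)
i_j = sigmoid(rng.normal(size=d))
phrase_cells = [rng.normal(size=d)]                 # one phrase ends at j
phrase_gates = [sigmoid(rng.normal(size=d))]
c_j = lattice_cell_state(c_tilde, i_j, phrase_cells, phrase_gates)
assert c_j.shape == (d,)
```

Each dimension of c_j is thus a convex combination of the phrase cell states and the character's candidate state.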
Preferably, before step one is carried out, the named entity recognition network and its network parameters are pre-trained; at this stage, the words used by the named entity recognition network are obtained by dividing the original sentences with a simple heuristic algorithm.
The pre-trained parameters of the entity recognition network are then temporarily fixed as the parameters of the named entity recognition network, the policy network is pre-trained, and finally all network parameters are trained jointly.
Compared with the prior art, the invention has the following beneficial effects: the Chinese named entity recognition model based on reinforcement learning and its training method divide sentences effectively by means of reinforcement learning, avoid modeling the redundant interfering words matched in a sentence, and avoid both dependence on an external dictionary and the influence of long texts.
Drawings
FIG. 1 is a schematic diagram of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 2 is a schematic diagram of a strategy network module of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 3 is a schematic diagram of a named entity recognition network module based on a reinforcement learning Chinese named entity recognition model according to the present invention;
FIG. 4 is a flow chart of a training method of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 5 is an exemplary diagram of sentence segmentation in the training method of the Chinese named entity recognition model based on reinforcement learning according to the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent; for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there are terms such as "upper", "lower", "left", "right", "long", "short", etc., indicating orientations or positional relationships based on the orientations or positional relationships shown in the drawings, it is only for convenience of description and simplicity of description, but does not indicate or imply that the device or element referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationships in the drawings are only used for illustrative purposes and are not to be construed as limitations of the present patent, and specific meanings of the terms may be understood by those skilled in the art according to specific situations.
The technical scheme of the invention is further described in detail by the following specific embodiments in combination with the attached drawings:
example 1
As shown in FIGS. 1-3, an embodiment of the Chinese named entity recognition model based on reinforcement learning comprises a policy network module, a word segmentation and recombination network and a named entity recognition network module.
The policy network module is used for sampling an action (either 'internal' or 'termination') for each character in the sentence under each state space by adopting a random policy, so as to obtain an action sequence for the whole sentence, and for obtaining a delayed reward according to the recognition result of the Chinese named entity recognition network to guide its own update. The random policy is
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) represents the probability of selecting action a_t; θ = {W, b} represents the parameters of the policy network; s_t is the state of the policy network at time t.
The word segmentation and recombination network is used for dividing sentences according to the action sequence output by the strategy network module, cutting the sentences into phrases, and combining the encoding of the phrases with the encoding vector of the last character of the phrases so as to obtain lattice-LSTM expression of the sentences;
specifically, the word segmentation and recombination network cuts the sentence into phrases according to the action sequence output by the policy network module, encodes each phrase, and inputs each encoding as the cell state at the last character of the corresponding phrase to obtain the lattice-LSTM representation of the sentence.
And the named entity recognition network module is used for inputting the hidden states of the sentence's lattice-LSTM representation into the conditional random field to obtain the named entity recognition result, calculating a loss value from the recognition result to train the named entity recognition model, and simultaneously using the loss value as the delayed reward that guides the update of the policy network module. The loss value is calculated as
L(θ) = −Σ_i log P(y_i | s_i; θ) + (λ/2)·‖θ‖²
where λ is the L2 regularization coefficient; θ represents the parameter set; s and y respectively represent a sentence and its correct label sequence; P is the probability that sentence s is labeled with sequence y, i.e., the probability of the correct labeling.
The working principle of this embodiment is as follows: first, the policy network specifies an action sequence; the word segmentation and recombination network then executes the actions one by one, and a phrase is obtained at each 'termination' action. Each phrase is used as input information at its last character, and lattice-LSTM modeling yields a hidden state sequence; the hidden states are input into the named entity recognition network to obtain the label sequence of the sentence, and the recognition result is used as a delayed reward to guide the update of the policy network module.
The beneficial effects of this embodiment: it extends the neural LSTM-CRF model with a reinforcement learning framework that learns the internal structure of sentences and divides them efficiently; the resulting phrase information is integrated into a character-granularity lattice-LSTM model, so that word-granularity information and the related character-granularity information are fully learned, achieving a better recognition effect.
Example 2
Fig. 4 shows an embodiment of the training method of the Chinese named entity recognition model based on reinforcement learning, which is used for training the model described in Embodiment 1 and comprises the following steps:
Pre-processing: pre-train the named entity recognition network and its network parameters; at this stage, the words used by the named entity recognition network are obtained by dividing the original sentences with a simple heuristic algorithm.
The pre-trained parameters of the entity recognition network are then temporarily fixed as the parameters of the named entity recognition network, the policy network is pre-trained, and finally all network parameters are trained jointly.
Step one: input sentence data for training into the policy network module; the policy network module samples one action for each character in the sentence under each state space and outputs the action sequence of the whole sentence.
in step one, the states, actions, and policies are defined as follows:
1. the state: the encoding vector of the currently input character and the context vector preceding it;
2. the actions: two different operations are defined, namely 'internal' and 'termination';
3. the policy: the random policy is defined as follows:
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) represents the probability of selecting action a_t; θ = {W, b} represents the parameters of the policy network; s_t is the state of the policy network at time t.
Step two: the word segmentation and recombination network divides sentences according to the action sequence output by the strategy network module, cuts the sentences into phrases, and combines the encoding of the phrases with the encoding vector of the last word of the phrase to obtain the lattice-LSTM representation of the word;
as shown in FIG. 5, "Washington in the United states" is classified as "United states", "Washington". The character level of the character is characterized by LSTM, and the updating formula is as follows:
wherein ,a transfer function representing LSTM; x is the number oftA code vector representing a word input at time t of the sentence;andrespectively, the cell state and the hidden state at time t.
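The segmentation induced by an action sequence can be sketched directly: each 'termination' closes the current phrase at that character. This is an illustrative sketch; the function name `segment` is an assumption.

```python
def segment(chars, actions):
    """Cut a character sequence into phrases from the sampled actions.

    'internal' keeps the current phrase open; 'termination' closes it at
    the current character.
    """
    phrases, current = [], ""
    for ch, act in zip(chars, actions):
        current += ch
        if act == "termination":
            phrases.append(current)
            current = ""
    if current:                # flush a trailing unterminated phrase
        phrases.append(current)
    return phrases

# "美国华盛顿" (Washington, United States): terminate after the
# 2nd character ("United States") and the 5th ("Washington").
chars = list("美国华盛顿")
actions = ["internal", "termination", "internal", "internal", "termination"]
assert segment(chars, actions) == ["美国", "华盛顿"]
```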
After the division of the sentence is completed, the phrase information is integrated into the character-granularity LSTM model; the basic recurrent LSTM function is
[i_j^c; f_j^c; o_j^c; č_j^c] = [σ; σ; σ; tanh](W^{cT}·[x_j^c; h_{j−1}^c] + b^c)
c_j^c = f_j^c ⊙ c_{j−1}^c + i_j^c ⊙ č_j^c
h_j^c = o_j^c ⊙ tanh(c_j^c)
where x_j^c represents the encoding vector of the j-th character in the sentence; h_{j−1}^c represents the hidden state at the (j−1)-th character of the sentence; W^{cT} and b^c are model parameters; i_j^c, f_j^c and o_j^c respectively represent the input, forget and output gates; č_j^c represents the new candidate state; c_{j−1}^c represents the cell state at the (j−1)-th character of the sentence; c_j^c denotes the updated cell state; h_j^c, the hidden state at the j-th character, is determined by the output gate o_j^c and the current cell state c_j^c; σ(·) represents the sigmoid function; tanh(·) represents the hyperbolic tangent activation function.
The phrase information is characterized by an LSTM model without an output gate, and the specific formula is as follows:
[i_{b,e}^w; f_{b,e}^w; č_{b,e}^w] = [σ; σ; tanh](W^{wT}·[x_{b,e}^w; h_b^c] + b^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b^c + i_{b,e}^w ⊙ č_{b,e}^w
where x_{b,e}^w represents the encoding vector of the phrase starting at the b-th character and ending at the e-th character of the sentence; h_b^c represents the hidden state at the b-th character, i.e., at the first character of the phrase; W^{wT} and b^w are model parameters; i_{b,e}^w and f_{b,e}^w respectively represent the input and forget gates; č_{b,e}^w represents the new candidate state; c_b^c represents the cell state at the first character of the phrase; c_{b,e}^w denotes the updated cell state; σ(·) represents the sigmoid function; tanh(·) represents the hyperbolic tangent activation function.
In addition, an extra gate is added to select between word-granularity and character-granularity information; its input is the encoding vector of a character and the cell state of a phrase ending with that character:
i_{b,e}^l = σ(W^{lT}·[x_e^c; c_{b,e}^w] + b^l)
where x_e^c represents the encoding vector of the e-th character in the sentence; c_{b,e}^w represents the cell state of the phrase starting at the b-th character and ending at the e-th character, i.e., of a phrase in the sentence ending with the e-th character; W^{lT} and b^l are model parameters; i_{b,e}^l denotes the extra gate; σ(·) represents the sigmoid function.
The cell-state update is thus changed, while the hidden state is updated as before; the final characterization of the lattice-LSTM model is
c_j^c = Σ_b α_{b,j}^l ⊙ c_{b,j}^w + α_j^c ⊙ č_j^c
h_j^c = o_j^c ⊙ tanh(c_j^c)
where i_j^c is the input gate vector of the j-th character; i_{b,j}^l is the extra gate vector of a phrase starting at the b-th character and ending at the j-th; c_{b,j}^w is the phrase cell state; č_j^c is the new candidate cell state of the character; α_{b,j}^l is the phrase information vector and α_j^c is the character information vector, obtained by normalizing the gates element-wise:
α_{b,j}^l = exp(i_{b,j}^l) / (exp(i_j^c) + Σ_{b'} exp(i_{b',j}^l))
α_j^c = exp(i_j^c) / (exp(i_j^c) + Σ_{b'} exp(i_{b',j}^l))
Step three: inputting the hidden state obtained by the named entity recognition network from the word segmentation and recombination network into a CRF layer, finally obtaining a named entity recognition result, calculating a loss value according to the recognition result to train a named entity recognition model, and simultaneously using the loss value as a delay reward to guide the updating of the strategy network module;
the sentence is characterized by a lattice-LSTM model, and a hidden state vector h of each word in the sentence is obtainediThen, the state vector sequence H is set to { H ═ H1,h2,…,hnInputting into CRF layer; let y equal to l1,l2,…,lnRepresenting the output label of the CRF layer, and calculating the probability of the output label sequence by the following formula:
wherein s represents a sentence;is directed toiThe model parameters of (1);is directed toi-1 and liThe bias parameter of (2); y' represents all possible output tag sets.
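The CRF probability above can be sketched for a tiny label set. This is an illustrative numpy sketch: the partition function is enumerated by brute force (a real implementation uses the forward algorithm, and decoding uses first-order Viterbi as described); the name `crf_log_prob` and the toy sizes are assumptions.

```python
import itertools
import numpy as np

def crf_log_prob(H, y, W_crf, b_crf):
    """log P(y | s) for a linear-chain CRF over hidden states H.

    score(y) = sum_i W_crf[l_i] . h_i + sum_i b_crf[l_{i-1}, l_i]; the
    partition function is computed here by enumerating every label
    sequence, which is only feasible for toy sizes.
    """
    n, L = len(H), W_crf.shape[0]

    def score(labels):
        s = sum(W_crf[l] @ h for l, h in zip(labels, H))
        s += sum(b_crf[a, b] for a, b in zip(labels, labels[1:]))
        return s

    log_z = np.logaddexp.reduce(
        [score(list(seq)) for seq in itertools.product(range(L), repeat=n)]
    )
    return score(y) - log_z

rng = np.random.default_rng(3)
n, L, d = 3, 2, 4
H = rng.normal(size=(n, d))
W_crf = rng.normal(size=(L, d))   # emission weights, one row per label
b_crf = rng.normal(size=(L, L))   # transition biases b[l_prev, l_cur]
lp = crf_log_prob(H, [0, 1, 0], W_crf, b_crf)
assert lp < 0  # a proper log-probability
```

Summing exp(crf_log_prob) over all 2³ label sequences gives exactly 1, confirming the normalization.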
The formula for the loss value is
L(θ) = −Σ_i log P(y_i | s_i; θ) + (λ/2)·‖θ‖²
where λ is the L2 regularization coefficient, θ represents the parameter set, and s and y respectively represent a sentence and its correct label sequence.
The reward is defined as follows: after the policy network samples the action sequence, the division of the sentence is obtained; the phrases obtained from the division are added as word-granularity information into the character-granularity LSTM model to obtain the lattice-LSTM representation, which is input into the named entity recognition network module; the entity label of each character is obtained through the CRF layer and decoded, and the reward value is calculated from the recognition result. Since the reward value can only be computed once the final recognition result is obtained, this is a delayed reward, and it is used to guide the update of the policy network module.
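A policy-gradient (REINFORCE-style) update driven by such a delayed reward can be sketched as follows. This is a simplified illustration, not the patent's exact update rule: the function name `reinforce_update` and the per-step parameter update are assumptions made for the example.

```python
import numpy as np

def reinforce_update(W, b, states, actions, reward, lr):
    """One REINFORCE step: theta <- theta + lr * R * grad log pi(a_t | s_t).

    For the Bernoulli policy p = sigmoid(W.s + b), d(log pi)/dz equals
    (1 - p) when the sampled action is 'termination' and -p otherwise.
    The same delayed reward R is shared by every action of the sentence.
    """
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    for s_t, a_t in zip(states, actions):
        p = sigmoid(W @ s_t + b)
        dz = (1.0 - p) if a_t == "termination" else -p
        W = W + lr * reward * dz * s_t   # gradient of z w.r.t. W is s_t
        b = b + lr * reward * dz         # gradient of z w.r.t. b is 1
    return W, b

rng = np.random.default_rng(4)
states = rng.normal(size=(3, 2))
W0, b0 = np.zeros(2), 0.0
W1, b1 = reinforce_update(W0, b0, states, ["termination"] * 3,
                          reward=1.0, lr=0.1)
assert W1.shape == (2,) and not np.allclose(W1, W0)
```

With a positive reward, the update raises the probability of the actions that were actually sampled; a negative reward lowers it.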
It should be understood that the above-described embodiments are merely examples given to illustrate the present invention clearly, and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; it is neither necessary nor possible to enumerate all embodiments exhaustively here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (9)
1. A Chinese named entity recognition model based on reinforcement learning, characterized by comprising a policy network module, a word segmentation and recombination network, and a named entity recognition network module;
the policy network module is used for sampling, with a stochastic policy, one action for each word of the sentence in each state, thereby obtaining an action sequence for the whole sentence;
the word segmentation and recombination network is used for dividing the sentence according to the action sequence output by the policy network module, breaking the sentence into phrases, and combining the encoding of each phrase with the encoding vector of the last character of that phrase, thereby obtaining the lattice-LSTM representation of the sentence;
and the named entity recognition network module is used for inputting the hidden states of the sentence's lattice-LSTM representation into a conditional random field to obtain the final named entity recognition result, calculating a loss value from the recognition result to train the named entity recognition model, and simultaneously using the loss value as a delayed reward to guide the update of the policy network module.
2. The reinforcement learning-based Chinese named entity recognition model of claim 1, wherein the action is either internal (the current word continues a phrase) or termination (the current word ends a phrase).
3. The reinforcement learning-based Chinese named entity recognition model of claim 1, wherein the stochastic strategy is:
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} are the parameters of the policy network; s_t is the state of the policy network at time t; σ(·) denotes the sigmoid function.
4. The model of claim 3, wherein the word segmentation and recombination network divides the sentence according to the action sequence output by the policy network module to obtain phrases, encodes each phrase, and feeds each phrase encoding in as a cell state at the last word of the corresponding phrase, thereby obtaining the lattice-LSTM representation of the sentence.
5. The reinforcement learning-based Chinese named entity recognition model of claim 4, wherein the named entity recognition network module inputs the lattice-LSTM output obtained from the word segmentation and recombination network into the conditional random field layer, scores each candidate labeling sequence of the sentence with the feature function set of the conditional random field layer, exponentiates and normalizes the scores, evaluates all possible labeling sequences with a first-order Viterbi algorithm, and takes the highest-scoring labeling sequence as the final output; a loss function is defined, the parameters are trained by back-propagating the loss value, and the loss value is also used as the delayed reward to update the policy network module; the loss function is defined as the sentence-level log-likelihood with an L2 regularization term, as follows:

L = − Σ_{(s,y)} log P(y|s) + (λ/2) ||θ||²

where λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and the correct labeling sequence corresponding to that sentence, respectively; and P(y|s) denotes the probability that sentence s is labeled with sequence y, i.e., the probability of the correct labeling.
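A minimal first-order Viterbi decoder, as referenced in this claim, might look as follows; the emission and transition score shapes are assumptions of this sketch:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """First-order Viterbi: return the highest-scoring label sequence.

    emissions:   (n, k) per-position label scores.
    transitions: (k, k) score of moving from label l' to label l.
    """
    n, k = emissions.shape
    score = emissions[0].copy()          # best score ending in each label
    back = np.zeros((n, k), dtype=int)   # backpointers
    for i in range(1, n):
        # cand[l', l] = best score ending in l' at i-1, extended by l at i
        cand = score[:, None] + transitions + emissions[i][None, :]
        back[i] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    best = [int(score.argmax())]
    for i in range(n - 1, 0, -1):        # follow backpointers
        best.append(int(back[i, best[-1]]))
    return best[::-1]
```

With zero transition scores the decoder reduces to a per-position argmax; a strong penalty on label switches makes it prefer staying in one label.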
6. A training method for the reinforcement learning-based Chinese named entity recognition model of any one of claims 1 to 5, comprising the following steps:
Step one: inputting sentence data for training into the policy network module, wherein the policy network module samples one action for each word of the sentence in each state and outputs the action sequence of the whole sentence;
Step two: the word segmentation and recombination network divides the sentence according to the action sequence output by the policy network module, breaks the sentence into phrases, and combines the encoding of each phrase with the encoding vector of the last character of that phrase, thereby obtaining the lattice-LSTM representation of the sentence;
Step three: the named entity recognition network inputs the hidden states obtained from the word segmentation and recombination network into a conditional random field layer to obtain the final named entity recognition result, calculates a loss value from the recognition result to train the named entity recognition model, and simultaneously uses the loss value as a delayed reward to guide the update of the policy network module;
The sentence is represented by the lattice-LSTM model, yielding a hidden state vector h_i for each word in the sentence. The state vector sequence H = {h_1, h_2, …, h_n} is then input into the conditional random field layer. Let y = {l_1, l_2, …, l_n} denote the output labels of the conditional random field layer; the probability of an output label sequence is computed as:

P(y|s) = exp( Σ_i ( W^{l_i} h_i + b^{(l_{i-1}, l_i)} ) ) / Σ_{y'} exp( Σ_i ( W^{l'_i} h_i + b^{(l'_{i-1}, l'_i)} ) )

where s denotes the sentence; W^{l_i} are the model parameters for label l_i; b^{(l_{i-1}, l_i)} is the bias parameter for the label pair (l_{i-1}, l_i); and y' ranges over all possible output label sequences.
The loss function is:

L = − Σ_{(s,y)} log P(y|s) + (λ/2) ||θ||²

where λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and the correct labeling sequence corresponding to that sentence, respectively; and P(y|s) denotes the probability that sentence s is labeled with sequence y, i.e., the probability of the correct labeling.
7. The training method of the reinforcement learning-based Chinese named entity recognition model of claim 6, wherein in step one the action is either internal (continue the current phrase) or termination (end the current phrase), and the stochastic policy is given by:
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} are the parameters of the policy network; s_t is the state of the policy network at time t; σ(·) denotes the sigmoid function.
8. The training method of the reinforcement learning-based Chinese named entity recognition model of claim 6, wherein in step two the words are encoded at the character level by an LSTM, phrases are delimited at the termination actions, and the update formula is:

(c^c_t, h^c_t) = LSTM(x^c_t, c^c_{t-1}, h^c_{t-1})

where LSTM(·) denotes the LSTM transfer function; x^c_t is the encoding vector of the word input at time t of the sentence; c^c_t and h^c_t are the cell state and hidden state at time t, respectively.
After the division of the sentence is completed, the phrase information is integrated into the word-granularity LSTM model; the basic recurrent LSTM equations are:

[ i^c_j ; f^c_j ; o^c_j ; c̃^c_j ] = [ σ ; σ ; σ ; tanh ] ( W^{cT} [ x^c_j ; h^c_{j-1} ] + b^c )
c^c_j = f^c_j ⊙ c^c_{j-1} + i^c_j ⊙ c̃^c_j
h^c_j = o^c_j ⊙ tanh(c^c_j)

where x^c_j is the encoding vector of the j-th word in the sentence; h^c_{j-1} is the hidden state at the (j-1)-th word of the sentence; W^{cT} and b^c are model parameters; i^c_j, f^c_j, and o^c_j are the input, forget, and output gates, respectively; c̃^c_j is the new candidate state; c^c_{j-1} is the cell state at the (j-1)-th word of the sentence; c^c_j is the updated cell state; h^c_j is the hidden state at the j-th word of the sentence, determined by the output gate o^c_j and the current cell state c^c_j; σ(·) denotes the sigmoid function, and tanh(·) denotes the hyperbolic tangent activation function.
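The gate, cell, and hidden updates above can be sketched as a single LSTM step in NumPy; the stacked weight layout below is an assumption of this sketch, not the patent's parameterization:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One word-level LSTM step: gates i/f/o and candidate state,
    then the cell and hidden updates from the text.

    W: (4h, d+h) stacked gate weights (order i, f, o, candidate).
    b: (4h,) stacked biases.
    """
    hsz = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:hsz])                 # input gate
    f = sigmoid(z[hsz:2 * hsz])           # forget gate
    o = sigmoid(z[2 * hsz:3 * hsz])       # output gate
    c_tilde = np.tanh(z[3 * hsz:])        # candidate cell state
    c = f * c_prev + i * c_tilde          # cell update
    h = o * np.tanh(c)                    # hidden update
    return h, c
```

With all-zero weights every gate evaluates to 0.5 and the candidate to 0, so the cell state simply halves at each step.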
The phrase information is encoded by an LSTM model without an output gate; the specific formulas are:

[ i^w_{b,e} ; f^w_{b,e} ; c̃^w_{b,e} ] = [ σ ; σ ; tanh ] ( W^{wT} [ x^w_{b,e} ; h^c_b ] + b^w )
c^w_{b,e} = f^w_{b,e} ⊙ c^c_b + i^w_{b,e} ⊙ c̃^w_{b,e}

where x^w_{b,e} is the encoding vector of the phrase starting at the b-th word and ending at the e-th word of the sentence; h^c_b is the hidden state at the b-th word of the sentence, i.e., at the first word of the phrase; W^{wT} and b^w are model parameters; i^w_{b,e} and f^w_{b,e} are the input and forget gates, respectively; c̃^w_{b,e} is the new candidate state; c^c_b is the cell state at the first word of the phrase; c^w_{b,e} is the updated cell state; σ(·) denotes the sigmoid function; tanh(·) denotes the hyperbolic tangent activation function.
In addition, an extra gate is added to select between word-granularity and phrase-granularity information; its inputs are the encoding vector of a word and the cell state of a phrase ending at that word, and it is defined as:

i^l_{b,e} = σ( W^{lT} [ x^c_e ; c^w_{b,e} ] + b^l )

where x^c_e is the encoding vector of the e-th word in the sentence; c^w_{b,e} is the cell state of the phrase starting at the b-th word and ending at the e-th word, i.e., of a phrase in the sentence whose last word is the e-th word; W^{lT} and b^l are model parameters; i^l_{b,e} is the extra gate; σ(·) denotes the sigmoid function.
The updating of the cell state is changed, while the hidden state is updated as before. The final lattice-LSTM characterization is:

c^c_e = Σ_b α^w_{b,e} ⊙ c^w_{b,e} + α^c_e ⊙ c̃^c_e
h^c_e = o^c_e ⊙ tanh(c^c_e)

where the sum runs over all phrases ending at the e-th word, and the phrase information vector α^w_{b,e} and word information vector α^c_e are obtained by normalizing the extra gates i^l_{b,e} and the input gate i^c_e so that they sum to one:

α^w_{b,e} = exp(i^l_{b,e}) / ( exp(i^c_e) + Σ_{b'} exp(i^l_{b',e}) )
α^c_e = exp(i^c_e) / ( exp(i^c_e) + Σ_{b'} exp(i^l_{b',e}) )
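A sketch of the fused cell update, assuming the phrase gates and the word input gate are normalized with a softmax so that the information vectors sum to one (the helper names are hypothetical):

```python
import numpy as np

def lattice_cell_update(c_words, i_words, i_char, c_tilde_char):
    """Fuse phrase cells with the word candidate cell state.

    c_words:      list of phrase cell states c^w_{b,e} ending at this word.
    i_words:      list of raw extra-gate values i^l_{b,e}, one per phrase.
    i_char:       raw word input gate i^c_e.
    c_tilde_char: word candidate cell state.
    Gates are normalized per dimension so they sum to one, then used to
    mix the phrase cells with the word candidate.
    """
    raw = np.stack(i_words + [i_char])             # (m+1, h) raw gates
    alpha = np.exp(raw) / np.exp(raw).sum(axis=0)  # softmax per dimension
    cells = np.stack(c_words + [c_tilde_char])
    return (alpha * cells).sum(axis=0)             # fused cell state c^c_e
```

With one matched phrase and equal raw gates, the fused cell is the plain average of the phrase cell and the word candidate.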
9. The training method of the reinforcement learning-based Chinese named entity recognition model of claim 6, wherein, before step one, the named entity recognition network and its network parameters are pre-trained; at this stage, the words used by the named entity recognition network are obtained by dividing the original sentences with a simple heuristic algorithm;
the relevant part of the pre-trained entity recognition network's parameters is then temporarily fixed as the parameters of the named entity recognition network, the policy network is pre-trained next, and finally all network parameters are trained jointly.
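The three-stage schedule in this claim (pre-train the recognizer on heuristic segmentations, then pre-train the policy with the recognizer fixed, then train jointly) could be organized as follows; `train_step` and the epoch counts are hypothetical placeholders:

```python
def staged_training(recognizer, policy, data, train_step, epochs=(2, 2, 2)):
    """Run the three training stages in order and return a log of stages.

    train_step(recognizer=..., policy=..., data=...) is an assumed
    callback that updates whichever components are passed as non-None.
    """
    log = []
    for _ in range(epochs[0]):                    # stage 1: recognizer only
        train_step(recognizer=recognizer, policy=None, data=data)
        log.append("recognizer")
    for _ in range(epochs[1]):                    # stage 2: policy only
        train_step(recognizer=None, policy=policy, data=data)
        log.append("policy")
    for _ in range(epochs[2]):                    # stage 3: joint training
        train_step(recognizer=recognizer, policy=policy, data=data)
        log.append("joint")
    return log
```

The ordering matters: the recognizer must already produce a meaningful loss before that loss can serve as a useful delayed reward for the policy.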
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911089295.3A CN110826334B (en) | 2019-11-08 | 2019-11-08 | Chinese named entity recognition model based on reinforcement learning and training method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110826334A true CN110826334A (en) | 2020-02-21 |
CN110826334B CN110826334B (en) | 2023-04-21 |
Family
ID=69553722
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911089295.3A Active CN110826334B (en) | 2019-11-08 | 2019-11-08 | Chinese named entity recognition model based on reinforcement learning and training method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110826334B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109255119A (en) * | 2018-07-18 | 2019-01-22 | 五邑大学 | A kind of sentence trunk analysis method and system based on the multitask deep neural network for segmenting and naming Entity recognition |
CN109597876A (en) * | 2018-11-07 | 2019-04-09 | 中山大学 | A kind of more wheels dialogue answer preference pattern and its method based on intensified learning |
CN109117472A (en) * | 2018-11-12 | 2019-01-01 | 新疆大学 | A kind of Uighur name entity recognition method based on deep learning |
CN109657239A (en) * | 2018-12-12 | 2019-04-19 | 电子科技大学 | The Chinese name entity recognition method learnt based on attention mechanism and language model |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111476031A (en) * | 2020-03-11 | 2020-07-31 | 重庆邮电大学 | Improved Chinese named entity recognition method based on Lattice-LSTM |
CN111666734A (en) * | 2020-04-24 | 2020-09-15 | 北京大学 | Sequence labeling method and device |
CN111951959A (en) * | 2020-08-23 | 2020-11-17 | 云知声智能科技股份有限公司 | Dialogue type diagnosis guiding method and device based on reinforcement learning and storage medium |
CN112163089A (en) * | 2020-09-24 | 2021-01-01 | 中国电子科技集团公司第十五研究所 | Military high-technology text classification method and system fusing named entity recognition |
CN112163089B (en) * | 2020-09-24 | 2023-06-23 | 中国电子科技集团公司第十五研究所 | High-technology text classification method and system integrating named entity recognition |
CN112699682A (en) * | 2020-12-11 | 2021-04-23 | 山东大学 | Named entity identification method and device based on combinable weak authenticator |
CN112699682B (en) * | 2020-12-11 | 2022-05-17 | 山东大学 | Named entity identification method and device based on combinable weak authenticator |
CN113051921A (en) * | 2021-03-17 | 2021-06-29 | 北京智慧星光信息技术有限公司 | Internet text entity identification method, system, electronic equipment and storage medium |
CN113051921B (en) * | 2021-03-17 | 2024-02-20 | 北京智慧星光信息技术有限公司 | Internet text entity identification method, system, electronic equipment and storage medium |
CN112966517A (en) * | 2021-04-30 | 2021-06-15 | 平安科技(深圳)有限公司 | Training method, device, equipment and medium for named entity recognition model |
CN112966517B (en) * | 2021-04-30 | 2022-02-18 | 平安科技(深圳)有限公司 | Training method, device, equipment and medium for named entity recognition model |
CN114004233A (en) * | 2021-12-30 | 2022-02-01 | 之江实验室 | Remote supervision named entity recognition method based on semi-training and sentence selection |
Also Published As
Publication number | Publication date |
---|---|
CN110826334B (en) | 2023-04-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110826334A (en) | Chinese named entity recognition model based on reinforcement learning and training method thereof | |
CN108628823B (en) | Named entity recognition method combining attention mechanism and multi-task collaborative training | |
CN110135457B (en) | Event trigger word extraction method and system based on self-encoder fusion document information | |
Yao et al. | An improved LSTM structure for natural language processing | |
CN108416058B (en) | Bi-LSTM input information enhancement-based relation extraction method | |
CN110083831A (en) | A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF | |
CN110866401A (en) | Chinese electronic medical record named entity identification method and system based on attention mechanism | |
CN111767718B (en) | Chinese grammar error correction method based on weakened grammar error feature representation | |
CN112541356B (en) | Method and system for recognizing biomedical named entities | |
CN113177412A (en) | Named entity identification method and system based on bert, electronic equipment and storage medium | |
Wu et al. | An effective approach of named entity recognition for cyber threat intelligence | |
CN116432645A (en) | Traffic accident named entity recognition method based on pre-training model | |
CN109766523A (en) | Part-of-speech tagging method and labeling system | |
Han et al. | MAF‐CNER: A Chinese Named Entity Recognition Model Based on Multifeature Adaptive Fusion | |
CN113360667A (en) | Biomedical trigger word detection and named entity identification method based on multitask learning | |
CN112349294A (en) | Voice processing method and device, computer readable medium and electronic equipment | |
CN115017890A (en) | Text error correction method and device based on character pronunciation and character font similarity | |
CN111291550B (en) | Chinese entity extraction method and device | |
Alkhatlan et al. | Attention-based sequence learning model for Arabic diacritic restoration | |
CN116187304A (en) | Automatic text error correction algorithm and system based on improved BERT | |
CN109960782A (en) | A kind of Tibetan language segmenting method and device based on deep neural network | |
Shahid et al. | Next word prediction for Urdu language using deep learning models | |
CN115240712A (en) | Multi-mode-based emotion classification method, device, equipment and storage medium | |
Brill | Pattern-based disambiguation for natural language processing | |
CN112634878B (en) | Speech recognition post-processing method and system and related equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||