CN110826334A - Chinese named entity recognition model based on reinforcement learning and training method thereof - Google Patents

Chinese named entity recognition model based on reinforcement learning and training method thereof

Info

Publication number
CN110826334A
Authority
CN
China
Prior art keywords
word
sentence
network
named entity
entity recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911089295.3A
Other languages
Chinese (zh)
Other versions
CN110826334B (en)
Inventor
叶梅
卓汉逵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN201911089295.3A priority Critical patent/CN110826334B/en
Publication of CN110826334A publication Critical patent/CN110826334A/en
Application granted granted Critical
Publication of CN110826334B publication Critical patent/CN110826334B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a Chinese named entity recognition model based on reinforcement learning and a training method thereof. The model comprises a strategy (policy) network module, a word segmentation and recombination network, and a named entity recognition network module. First, the strategy network specifies an action sequence; the word segmentation and recombination network then executes the actions one by one, and each "terminate" action yields a phrase. The phrases are used as auxiliary input information for lattice-LSTM modeling to obtain a hidden state sequence, the hidden states are input into the named entity recognition network to obtain the label sequence of the sentence, and the recognition result is used as a delayed reward to guide the update of the strategy network module. The method uses reinforcement learning to segment sentences effectively, avoids modeling redundant interfering words matched in the sentence, avoids dependence on an external dictionary and the influence of long texts, makes better use of correct word information, and thus helps the Chinese named entity recognition model improve its recognition effect.

Description

Chinese named entity recognition model based on reinforcement learning and training method thereof
Technical Field
The invention relates to the field of machine learning, in particular to a Chinese named entity recognition model based on reinforcement learning and a training method thereof.
Background
Named Entity Recognition (NER) is a basic task in natural language processing: identifying named referents in text. It lays the foundation for tasks such as relation extraction, question answering, syntactic analysis and machine translation, and plays an important role in putting natural language processing technology into practical use. In general, the named entity recognition task identifies three major classes (entities, times and numbers) and seven minor classes (person names, organization names, place names, times, dates, currencies and percentages) in the text to be processed.
An existing Chinese named entity recognition model is lattice-LSTM. In addition to each character of the sentence, it also takes as input the cell vectors of all potential words that end at that character; the selection of these potential words depends on an external dictionary. A supplementary gate is added to choose between character-granularity and word-granularity information, so that the input at each step changes from (character information, previous hidden state, previous cell state) to (character information, previous hidden state, and the information of all dictionary words ending at that character). The advantage of this model is that explicit word information can be exploited in a character-sequence labeling model without suffering from word segmentation errors.
However, precisely because the lattice-LSTM model uses the information of all matched words in a sentence, any word formed by adjacent characters that happens to appear in the external dictionary is fed into the model as word-granularity information, even though it is not necessarily a correct division of the sentence. For example, for the sentence "Nanjing Yangtze River Bridge" (南京市长江大桥), the model takes every dictionary entry formed by its characters as input, so "Nanjing" (南京), "Nanjing City" (南京市), "mayor" (市长), "Changjiang Bridge" and "Changjiang River Bridge" are all treated as lexicon words; but "mayor" is clearly an interfering word in this sentence, and its word information has a negative influence on entity recognition. In addition, the external dictionary usually has to be constructed from the data set used in the experiments, so the model depends heavily on it. Meanwhile, as the text length grows, the number of potential words in a sentence increases and the complexity of the model rises sharply.
Disclosure of Invention
The invention aims to solve the problems of the prior art, namely that redundant interfering words matched in a sentence are modeled, that an external dictionary is required, and that long texts degrade performance, and provides a Chinese named entity recognition model based on reinforcement learning and a training method thereof. In this way the input of interfering words and the use of an external dictionary are avoided, the number of candidate words in a sentence does not explode as the text length grows, and correct word information can be exploited to help the Chinese named entity recognition model improve its recognition accuracy.
In order to solve the above technical problems, the invention adopts the following technical scheme: a Chinese named entity recognition model based on reinforcement learning is provided, comprising a strategy (policy) network module, a word segmentation and recombination network, and a named entity recognition network module;
the strategy network module is used for sampling an action for each character of the sentence in each state with a stochastic policy, so as to obtain an action sequence for the whole sentence, and for receiving a delayed reward computed from the recognition result of the named entity recognition network to guide its own update;
the word segmentation and recombination network is used for dividing the sentence according to the action sequence output by the strategy network module, cutting the sentence into phrases, and combining the encoding of each phrase with the encoding vector of its last character, so as to obtain the lattice-LSTM representation of the sentence;
and the named entity recognition network module is used for inputting the hidden states of the lattice-LSTM representation of the sentence into a CRF (conditional random field) layer to obtain the named entity recognition result, computing a loss value from the recognition result to train the named entity recognition model, and simultaneously using the loss value as a delayed reward to guide the update of the strategy network module.
Preferably, the actions comprise 'inside' and 'terminate'.
Preferably, the stochastic policy is:
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the strategy network; s_t denotes the state of the network at time t; σ(·) denotes the sigmoid function.
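As a minimal sketch of this sampling step (assuming a two-action space with 0 = 'inside' and 1 = 'terminate', and a state vector already built from the character encoding and its left context; the names and shapes are illustrative, not part of the invention):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_action_sequence(states, W, b):
    """Sample one action per character with pi(a_t | s_t; theta) = sigmoid(W . s_t + b).
    states: array (n_chars, state_dim); returns the sampled actions (0 = inside,
    1 = terminate) and the log-probability of the sequence, kept for the later
    delayed-reward update of the strategy network."""
    actions, log_prob = [], 0.0
    for s_t in states:
        p_term = 1.0 / (1.0 + np.exp(-(W @ s_t + b)))   # probability of "terminate"
        a_t = int(rng.random() < p_term)
        actions.append(a_t)
        log_prob += np.log(p_term if a_t == 1 else 1.0 - p_term)
    return actions, log_prob
```

For a sentence of n characters this yields an action sequence of length n, which the word segmentation and recombination network then executes.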
Preferably, the word segmentation and recombination network cuts the sentence into phrases according to the action sequence output by the strategy network module, encodes each phrase, and feeds each encoding in as an additional cell state at the last character of the corresponding phrase, so as to obtain the lattice-LSTM representation of the sentence.
Preferably, the named entity recognition network module inputs the output of the lattice-LSTM obtained by the word segmentation and recombination network into a CRF layer, scores every possible labeled sequence of the sentence with the feature function set of the CRF layer, exponentiates and normalizes the scores, computes the best labeled sequence with the first-order Viterbi algorithm and takes the highest-scoring sequence as the final output, performs parameter training by back-propagating the value of the loss function, and simultaneously uses the loss value as the delayed reward that updates the strategy network module; the loss function is defined as the sentence-level log-likelihood with an L2 regularization term, as follows:
L(θ) = -Σ_i log P(y_i | s_i; θ) + (λ/2)·||θ||²
where λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and its corresponding labeled sequence, respectively.
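The first-order Viterbi decoding mentioned above can be sketched as follows; the emission matrix (the per-character label scores) and the transition matrix are assumed to be given, and the tag indices are placeholders for whatever label scheme the CRF layer uses:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """First-order Viterbi search.
    emissions: (n, num_tags) per-character label scores.
    transitions: (num_tags, num_tags) score of moving from tag i to tag j.
    Returns the highest-scoring label sequence as a list of tag indices."""
    n, num_tags = emissions.shape
    score = emissions[0].copy()
    backptr = np.zeros((n, num_tags), dtype=int)
    for t in range(1, n):
        total = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = total.argmax(axis=0)      # best previous tag for each current tag
        score = total.max(axis=0)
    best = [int(score.argmax())]
    for t in range(n - 1, 0, -1):              # follow the back-pointers
        best.append(int(backptr[t, best[-1]]))
    return best[::-1]
```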
The training method is used for training the Chinese named entity recognition model and comprises the following steps:
the method comprises the following steps: inputting sentence data for training into a strategy network module, wherein the strategy network module samples one action for each word in a sentence under each state space and outputs an action sequence of the whole sentence;
step two: the word segmentation and recombination network divides sentences according to the action sequence output by the strategy network module, cuts the sentences into phrases, and combines the encoding of the phrases with the encoding vector of the last word of the phrase to obtain the lattice-LSTM representation of the word;
step three: inputting the hidden state obtained by the named entity recognition network from the word segmentation and recombination network into a CRF layer, finally obtaining a named entity recognition result, calculating a loss value according to the recognition result to train a named entity recognition model, and simultaneously using the loss value as a delay reward to guide the updating of the strategy network module;
the sentence is characterized by a lattice-LSTM model, and a hidden state vector h of each word in the sentence is obtainediThen, the state vector sequence H is set to { H ═ H1,h2,…,hnInputting into CRF layer; let y equal to l1,l2,…,lnRepresenting the output label of the CRF layer, and calculating the probability of the output label sequence by the following formula:
Figure BDA0002266384690000041
wherein s represents a sentence;is directed toiThe model parameters of (1);
Figure BDA0002266384690000043
is directed toi-1 and liThe bias parameter of (2); y' represents all possible output tag sets.
The formula for the function of the loss value is:
Figure BDA0002266384690000044
wherein λ is L2A regularization term coefficient; theta meterShowing parameter sets; s and y respectively represent a sentence and a correct labeling sequence corresponding to the sentence; p represents the probability that the sentence s is labeled as the sequence y, i.e., the probability of labeling the correct.
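To make the two formulas above concrete, the following brute-force sketch computes log P(y|s) by enumerating all label sequences and then adds the L2 term; it is meant only as an illustration for short sentences (a real implementation would use the forward algorithm), and the params argument, the list of parameter arrays to regularize, is an assumption about how θ is passed:

```python
import itertools
import numpy as np

def sentence_log_prob(emissions, transitions, labels):
    """log P(y|s), where score(y) = sum_i (emission[i, y_i] + transition[y_{i-1}, y_i])
    and the normalizer sums exp(score) over every possible label sequence."""
    n, num_tags = emissions.shape

    def score(seq):
        s = emissions[0, seq[0]]
        for i in range(1, n):
            s += transitions[seq[i - 1], seq[i]] + emissions[i, seq[i]]
        return s

    log_z = np.log(sum(np.exp(score(seq))
                       for seq in itertools.product(range(num_tags), repeat=n)))
    return score(labels) - log_z

def crf_loss(emissions, transitions, labels, params, lam=1e-4):
    """Negative sentence-level log-likelihood plus (lambda / 2) * ||theta||^2."""
    l2 = 0.5 * lam * sum(float((p ** 2).sum()) for p in params)
    return -sentence_log_prob(emissions, transitions, labels) + l2
```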
Preferably, in step one, the actions comprise 'inside' and 'terminate', and the stochastic policy is:
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the strategy network; s_t denotes the state of the network at time t; σ(·) denotes the sigmoid function.
Preferably, in step two, each character is first represented at the character level by an LSTM, with the update:
(c_t, h_t) = f_LSTM(x_t, c_{t-1}, h_{t-1})
where f_LSTM denotes the LSTM transfer function; x_t denotes the encoding vector of the character input at time t of the sentence; c_t and h_t denote the cell state and the hidden state at time t, respectively.
After the division of the sentence is completed, the phrase information is integrated into a character-granularity LSTM model; the basic recurrent LSTM functions are as follows:
[i_j^c; f_j^c; o_j^c; c̃_j^c] = [σ; σ; σ; tanh](W^c·[x_j^c; h_{j-1}^c] + b^c)
c_j^c = f_j^c ⊙ c_{j-1}^c + i_j^c ⊙ c̃_j^c
h_j^c = o_j^c ⊙ tanh(c_j^c)
where x_j^c denotes the encoding vector of the j-th character of the sentence; h_{j-1}^c denotes the hidden state at character j-1; W^c and b^c are model parameters; i_j^c, f_j^c and o_j^c denote the input, forget and output gates, respectively; c̃_j^c denotes the new candidate state; c_{j-1}^c denotes the cell state at character j-1; c_j^c denotes the updated cell state; h_j^c denotes the hidden state at character j, determined by the output gate o_j^c and the current cell state c_j^c; σ(·) denotes the sigmoid function; tanh(·) denotes the hyperbolic tangent activation function.
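A minimal numpy sketch of one character-level LSTM step defined by the equations above; the weight matrix W_c stacks the three gates and the candidate state so that a single matrix product produces all of them (the shapes are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def char_lstm_step(x_j, h_prev, c_prev, W_c, b_c):
    """One character-level step:
    [i; f; o; c_tilde] = [sigmoid; sigmoid; sigmoid; tanh](W_c . [x_j; h_prev] + b_c)
    c_j = f * c_prev + i * c_tilde,   h_j = o * tanh(c_j).
    W_c has shape (4d, x_dim + d) and b_c has shape (4d,)."""
    d = h_prev.shape[0]
    z = W_c @ np.concatenate([x_j, h_prev]) + b_c
    i, f, o = sigmoid(z[:d]), sigmoid(z[d:2 * d]), sigmoid(z[2 * d:3 * d])
    c_tilde = np.tanh(z[3 * d:])
    c_j = f * c_prev + i * c_tilde
    h_j = o * np.tanh(c_j)
    return h_j, c_j
```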
The phrase information is characterized by an LSTM model without an output gate, as follows:
[i_{b,e}^w; f_{b,e}^w; c̃_{b,e}^w] = [σ; σ; tanh](W^w·[x_{b,e}^w; h_b^c] + b^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b^c + i_{b,e}^w ⊙ c̃_{b,e}^w
where x_{b,e}^w denotes the encoding vector of the phrase starting at the b-th character and ending at the e-th character of the sentence; h_b^c denotes the hidden state at the b-th character, i.e., at the first character of the phrase; W^w and b^w are model parameters; i_{b,e}^w and f_{b,e}^w denote the input and forget gates, respectively; c̃_{b,e}^w denotes the new candidate state; c_b^c denotes the cell state at the first character of the phrase; c_{b,e}^w denotes the updated cell state; σ(·) denotes the sigmoid function; tanh(·) denotes the hyperbolic tangent activation function.
In addition, an additional gate is added to select between character-granularity and word-granularity information; its input is the encoding vector of a character and the cell state of a phrase ending at that character:
i_{b,e}^l = σ(W^l·[x_e^c; c_{b,e}^w] + b^l)
where x_e^c denotes the encoding vector of the e-th character of the sentence; c_{b,e}^w denotes the cell state of the phrase starting at the b-th character and ending at the e-th character, i.e., of a phrase whose last character is the e-th character; W^l and b^l are model parameters; i_{b,e}^l denotes the additional gate; σ(·) denotes the sigmoid function.
The cell state update is thus changed, while the hidden state update remains unchanged; the final lattice-LSTM characterization is:
c_j^c = Σ_b α_{b,j}^c ⊙ c_{b,j}^w + α_j^c ⊙ c̃_j^c
where i_j^c is the input gate vector of the j-th character; i_{b,j}^l is the additional gate vector of a phrase starting at b and ending at j; c_{b,j}^w is the phrase cell state; c̃_j^c is the new candidate cell state of the character; α_{b,j}^c and α_j^c are the weights obtained by normalizing i_{b,j}^l and i_j^c, so that α_{b,j}^c ⊙ c_{b,j}^w is the phrase information vector and α_j^c ⊙ c̃_j^c is the character information vector.
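The phrase cell and the extra gate can be sketched as below, following the formulation above; treating the weighted sum as a softmax normalization of the character input gate and the phrase gates is an interpretation of α_{b,j}^c and α_j^c, and the helper names are illustrative. When no phrase ends at a character, the plain character-level step above is used instead.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def phrase_cell(x_w, h_b, c_b, W_w, b_w):
    """Phrase (word) cell without an output gate:
    [i; f; c_tilde] = [sigmoid; sigmoid; tanh](W_w . [x_w; h_b] + b_w),
    c^w_{b,e} = f * c_b + i * c_tilde."""
    d = h_b.shape[0]
    z = W_w @ np.concatenate([x_w, h_b]) + b_w
    i, f, c_tilde = sigmoid(z[:d]), sigmoid(z[d:2 * d]), np.tanh(z[2 * d:])
    return f * c_b + i * c_tilde

def lattice_cell(i_char, c_tilde, x_e, phrase_cells, W_l, b_l):
    """Combine the character candidate state with every phrase cell ending at this
    character: an extra gate i^l_{b,e} = sigmoid(W_l . [x_e; c^w_{b,e}] + b_l) is computed
    per phrase, the gates are normalized into weights alpha, and the cell state becomes
    the weighted sum  c_j = sum_b alpha_b * c^w_{b,j} + alpha * c_tilde."""
    gates = [sigmoid(W_l @ np.concatenate([x_e, c_w]) + b_l) for c_w in phrase_cells]
    stacked = np.exp(np.stack([i_char] + gates))          # (1 + n_phrases, d)
    alpha = stacked / stacked.sum(axis=0, keepdims=True)  # element-wise normalization
    c_j = alpha[0] * c_tilde
    for a, c_w in zip(alpha[1:], phrase_cells):
        c_j += a * c_w
    return c_j
```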
Preferably, before step one is carried out, the named entity recognition network and its parameters are pre-trained; at this stage, the words used by the named entity recognition network are obtained by dividing the original sentences with a simple heuristic algorithm;
the pre-trained parameters are then kept, temporarily, as the parameters of the named entity recognition network while the strategy network is pre-trained, and finally all network parameters are trained jointly.
Compared with the prior art, the invention has the beneficial effects that: the Chinese named entity recognition model based on reinforcement learning and the method thereof effectively divide sentences by utilizing reinforcement learning, avoid modeling redundant interference words matched in the sentences and effectively avoid dependence on an external dictionary and influence of long texts.
Drawings
FIG. 1 is a schematic diagram of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 2 is a schematic diagram of a strategy network module of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 3 is a schematic diagram of the named entity recognition network module of the Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 4 is a flow chart of a training method of a Chinese named entity recognition model based on reinforcement learning according to the present invention;
FIG. 5 is an exemplary diagram of sentence segmentation in the training method of the Chinese named entity recognition model based on reinforcement learning according to the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent; for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted. The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there are terms such as "upper", "lower", "left", "right", "long", "short", etc., indicating orientations or positional relationships based on the orientations or positional relationships shown in the drawings, it is only for convenience of description and simplicity of description, but does not indicate or imply that the device or element referred to must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationships in the drawings are only used for illustrative purposes and are not to be construed as limitations of the present patent, and specific meanings of the terms may be understood by those skilled in the art according to specific situations.
The technical scheme of the invention is further described in detail by the following specific embodiments in combination with the attached drawings:
example 1
As shown in FIGS. 1-3, an embodiment of a Chinese named entity recognition model based on reinforcement learning comprises a strategy network module, a word segmentation and recombination network and a named entity recognition network module;
the strategy network module is used for sampling an action (action comprises internal or termination) for each word in the sentence under each state space by adopting a random strategy so as to obtain an action sequence for the whole sentence, and obtaining a delay reward according to the recognition result of the Chinese named entity recognition network so as to guide the strategy network module to update; the random strategy is:
π(at|st;θ)=σ(W*st+b)
wherein ,π(at|st(ii) a θ) represents the selection action atThe probability of (d); θ ═ W, b, representing parameters of the policy network; stAnd the state of the network is optimized at the moment t.
The word segmentation and recombination network is used for dividing the sentence according to the action sequence output by the strategy network module, cutting it into phrases, and combining the encoding of each phrase with the encoding vector of its last character, so as to obtain the lattice-LSTM representation of the sentence;
specifically, the word segmentation and recombination network cuts the sentence into phrases according to the action sequence, encodes each phrase, and feeds each encoding in as an additional cell state at the last character of the corresponding phrase to obtain the lattice-LSTM representation of the sentence.
The named entity recognition network module is used for inputting the hidden states of the lattice-LSTM representation of the sentence into the conditional random field layer to obtain the named entity recognition result, computing a loss value from the recognition result to train the named entity recognition model, and simultaneously using the loss value as a delayed reward to guide the update of the strategy network module. The loss value is computed as:
L(θ) = -Σ_i log P(y_i | s_i; θ) + (λ/2)·||θ||²
where λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and its correct labeled sequence, respectively; P denotes the probability that sentence s is labeled as sequence y, i.e., the probability of the correct labeling.
The working principle of this embodiment is as follows: first, the strategy network specifies an action sequence; the word segmentation and recombination network then executes the actions one by one, and each 'terminate' action yields a phrase; each phrase is used as auxiliary input information at its last character for lattice-LSTM modeling to obtain a hidden state sequence; the hidden states are input into the named entity recognition network to obtain the label sequence of the sentence; and the recognition result is used as a delayed reward to guide the update of the strategy network module.
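The data flow just described can be condensed into the following sketch; the four callables (policy, segmenter, lattice_lstm, crf_decode) are hypothetical interfaces used only to show how the modules are chained, not names defined by the invention:

```python
def recognize(sentence, policy, segmenter, lattice_lstm, crf_decode):
    """Forward pass of the model: sample actions, segment, encode, decode.
    Returns the predicted labels and the log-probability of the sampled actions,
    which is needed later for the delayed-reward update of the strategy network."""
    actions, log_prob = policy(sentence)             # one action per character
    phrases = segmenter(sentence, actions)           # "terminate" closes a phrase
    hidden_states = lattice_lstm(sentence, phrases)  # one hidden vector per character
    labels = crf_decode(hidden_states)               # entity label per character
    return labels, log_prob
```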
The beneficial effects of this embodiment: the embodiment builds on the neural LSTM-CRF model and combines it with a reinforcement learning framework to learn the internal structure of sentences and segment them efficiently; the resulting phrase information is integrated into a character-granularity lattice-LSTM model, so that character-granularity information and the related word-granularity information are fully learned, achieving a better recognition effect.
Example 2
FIG. 4 shows an embodiment of a training method of a Chinese named entity recognition model based on reinforcement learning, which is used for training the model described in embodiment 1 and includes the following steps:
pretreatment: pre-training a named entity recognition network and network parameters thereof, wherein the words used by the named entity recognition network are words obtained by dividing original sentences through a simple heuristic algorithm;
and temporarily setting part of the network parameters of the entity recognition network which are pre-trained as the network parameters of the named entity recognition network, then pre-training the strategy network, and finally jointly training the whole network parameters.
Step one: sentence data for training are input into the strategy network module, which samples one action for each character of the sentence in each state and outputs the action sequence of the whole sentence;
in step one, the states, actions, and policies are defined as follows:
1. State: the encoding vector of the currently input character together with the context vector of the preceding characters;
2. Actions: two different operations are defined, 'inside' and 'terminate';
3. Policy: the stochastic policy is defined as:
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the strategy network; s_t denotes the state of the network at time t; σ(·) denotes the sigmoid function.
Step two: the word segmentation and recombination network divides the sentence according to the action sequence output by the strategy network module, cuts it into phrases, and combines the encoding of each phrase with the encoding vector of its last character, so as to obtain the lattice-LSTM representation of the sentence;
as shown in FIG. 5, "Washington in the United states" is classified as "United states", "Washington". The character level of the character is characterized by LSTM, and the updating formula is as follows:
Each character is represented at the character level by an LSTM, with the update:
(c_t, h_t) = f_LSTM(x_t, c_{t-1}, h_{t-1})
where f_LSTM denotes the LSTM transfer function; x_t denotes the encoding vector of the character input at time t of the sentence; c_t and h_t denote the cell state and the hidden state at time t, respectively.
After the division of the sentence is completed, the phrase information is integrated into a character-granularity LSTM model; the basic recurrent LSTM functions are as follows:
[i_j^c; f_j^c; o_j^c; c̃_j^c] = [σ; σ; σ; tanh](W^c·[x_j^c; h_{j-1}^c] + b^c)
c_j^c = f_j^c ⊙ c_{j-1}^c + i_j^c ⊙ c̃_j^c
h_j^c = o_j^c ⊙ tanh(c_j^c)
where x_j^c denotes the encoding vector of the j-th character of the sentence; h_{j-1}^c denotes the hidden state at character j-1; W^c and b^c are model parameters; i_j^c, f_j^c and o_j^c denote the input, forget and output gates, respectively; c̃_j^c denotes the new candidate state; c_{j-1}^c denotes the cell state at character j-1; c_j^c denotes the updated cell state; h_j^c denotes the hidden state at character j, determined by the output gate o_j^c and the current cell state c_j^c; σ(·) denotes the sigmoid function; tanh(·) denotes the hyperbolic tangent activation function.
The phrase information is characterized by an LSTM model without an output gate, as follows:
[i_{b,e}^w; f_{b,e}^w; c̃_{b,e}^w] = [σ; σ; tanh](W^w·[x_{b,e}^w; h_b^c] + b^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b^c + i_{b,e}^w ⊙ c̃_{b,e}^w
where x_{b,e}^w denotes the encoding vector of the phrase starting at the b-th character and ending at the e-th character of the sentence; h_b^c denotes the hidden state at the b-th character, i.e., at the first character of the phrase; W^w and b^w are model parameters; i_{b,e}^w and f_{b,e}^w denote the input and forget gates, respectively; c̃_{b,e}^w denotes the new candidate state; c_b^c denotes the cell state at the first character of the phrase; c_{b,e}^w denotes the updated cell state; σ(·) denotes the sigmoid function; tanh(·) denotes the hyperbolic tangent activation function.
In addition, an additional gate is added to select between character-granularity and word-granularity information; its input is the encoding vector of a character and the cell state of a phrase ending at that character:
i_{b,e}^l = σ(W^l·[x_e^c; c_{b,e}^w] + b^l)
where x_e^c denotes the encoding vector of the e-th character of the sentence; c_{b,e}^w denotes the cell state of the phrase starting at the b-th character and ending at the e-th character, i.e., of a phrase whose last character is the e-th character; W^l and b^l are model parameters; i_{b,e}^l denotes the additional gate; σ(·) denotes the sigmoid function.
The cell state update is thus changed, while the hidden state update remains unchanged; the final lattice-LSTM characterization is:
c_j^c = Σ_b α_{b,j}^c ⊙ c_{b,j}^w + α_j^c ⊙ c̃_j^c
where i_j^c is the input gate vector of the j-th character; i_{b,j}^l is the additional gate vector of a phrase starting at b and ending at j; c_{b,j}^w is the phrase cell state; c̃_j^c is the new candidate cell state of the character; α_{b,j}^c and α_j^c are the weights obtained by normalizing i_{b,j}^l and i_j^c, so that α_{b,j}^c ⊙ c_{b,j}^w is the phrase information vector and α_j^c ⊙ c̃_j^c is the character information vector.
Step three: inputting the hidden state obtained by the named entity recognition network from the word segmentation and recombination network into a CRF layer, finally obtaining a named entity recognition result, calculating a loss value according to the recognition result to train a named entity recognition model, and simultaneously using the loss value as a delay reward to guide the updating of the strategy network module;
the sentence is characterized by a lattice-LSTM model, and a hidden state vector h of each word in the sentence is obtainediThen, the state vector sequence H is set to { H ═ H1,h2,…,hnInputting into CRF layer; let y equal to l1,l2,…,lnRepresenting the output label of the CRF layer, and calculating the probability of the output label sequence by the following formula:
Figure BDA00022663846900001013
wherein s represents a sentence;is directed toiThe model parameters of (1);
Figure BDA00022663846900001015
is directed toi-1 and liThe bias parameter of (2); y' represents all possible output tag sets.
The formula for the function of the loss value is:
Figure BDA00022663846900001016
wherein λ is L2Regular term coefficients, θ represents a parameter set, and s and y represent a sentence and the correct sequence of labels corresponding to the sentence, respectively.
The reward is defined as follows: after the strategy network samples an action sequence, a segmentation of the sentence is obtained; the resulting phrases are added as word-granularity information into the character-granularity LSTM model to obtain the lattice-LSTM representation, which is input into the named entity recognition network module; the entity label of each character is obtained through the CRF layer and decoded, and the reward value is computed from the recognition result. Because the reward can only be computed once the final recognition result is obtained, it is a delayed reward, and it is used to guide the update of the strategy network module.
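The description does not spell out the exact update rule beyond using the loss value as a delayed reward, so the following REINFORCE-style sketch is an assumption: a single delayed reward (e.g. the negative loss) scales the log-probability gradient of the linear-sigmoid policy for every sampled action:

```python
import numpy as np

def reinforce_update(W, b, states, actions, reward, lr=0.01, baseline=0.0):
    """One policy-gradient step for pi(terminate | s; theta) = sigmoid(W . s + b).
    For a Bernoulli policy, d log pi / dz = (a - p), so the gradient w.r.t. W is
    (a - p) * s and w.r.t. b is (a - p); the whole trajectory shares one delayed reward."""
    advantage = reward - baseline
    for s_t, a_t in zip(states, actions):
        p = 1.0 / (1.0 + np.exp(-(W @ s_t + b)))
        W = W + lr * advantage * (a_t - p) * s_t
        b = b + lr * advantage * (a_t - p)
    return W, b
```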
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the invention and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description; the embodiments described here are neither required nor exhaustive. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (9)

1. A Chinese named entity recognition model based on reinforcement learning is characterized by comprising a strategy network module, a word segmentation and recombination network and a named entity recognition network module;
the strategy network module is used for sampling an action for each word in the sentence under each state space by adopting a random strategy, so that an action sequence is obtained for the whole sentence;
the word segmentation and recombination network is used for dividing sentences according to the action sequence output by the strategy network module, breaking the sentences into phrases, and combining the encoding of the phrases with the encoding vector of the last character of the phrases so as to obtain lattice-LSTM expression of the sentences;
and the named entity recognition network module is used for inputting the hidden state expressed by the lattice-LSTM of the sentence into the conditional random field, finally obtaining a named entity recognition result, calculating a loss value according to the recognition result to train a named entity recognition model, and simultaneously using the loss value as a delay reward to guide the updating of the strategy network module.
2. The reinforcement learning-based Chinese named entity recognition model of claim 1, wherein the actions comprise 'inside' and 'terminate'.
3. The reinforcement learning-based Chinese named entity recognition model of claim 1, wherein the stochastic strategy is:
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the strategy network; s_t denotes the state of the network at time t; σ(·) denotes the sigmoid function; W and b denote the network parameters.
4. The model of claim 3, wherein the word segmentation and recombination network cuts the sentence into phrases according to the action sequence output by the strategy network module, encodes each phrase, and feeds each encoding in as an additional cell state at the last character of the corresponding phrase to obtain the lattice-LSTM representation of the sentence.
5. The reinforcement learning-based Chinese named entity recognition model of claim 4, wherein the named entity recognition network module inputs the output of the lattice-LSTM obtained from the word segmentation and recombination network into the conditional random field layer, scores every labeled sequence of the sentence with the feature function set of the conditional random field layer, exponentiates and normalizes the scores, computes the best labeled sequence with a first-order Viterbi algorithm, and takes the highest-scoring labeled sequence as the final output; a loss function is defined, parameter training is performed by back-propagating the loss value, and the loss value is simultaneously used as the delayed reward that updates the strategy network module; the loss function is defined as the sentence-level log-likelihood with an L2 regularization term, as follows:
L(θ) = -Σ_i log P(y_i | s_i; θ) + (λ/2)·||θ||²
where λ is the L2 regularization coefficient; θ denotes the parameter set; s and y denote a sentence and its correct labeled sequence, respectively; P denotes the probability that sentence s is labeled as sequence y, i.e., the probability of the correct labeling.
6. A training method of Chinese named entity recognition model based on reinforcement learning, which is used for training the Chinese named entity recognition model based on reinforcement learning of any one of claims 1 to 5, and comprises the following steps:
step one: inputting sentence data for training into the strategy network module, wherein the strategy network module samples one action for each character of the sentence in each state and outputs the action sequence of the whole sentence;
step two: the word segmentation and recombination network divides the sentence according to the action sequence output by the strategy network module, cuts it into phrases, and combines the encoding of each phrase with the encoding vector of its last character, thereby obtaining the lattice-LSTM representation of the sentence;
step three: the named entity recognition network inputs the hidden states obtained from the word segmentation and recombination network into the conditional random field layer to obtain the named entity recognition result, computes a loss value from the recognition result to train the named entity recognition model, and simultaneously uses the loss value as a delayed reward to guide the update of the strategy network module;
the sentence is characterized by a lattice-LSTM model, and a hidden state vector h of each word in the sentence is obtainediThen, the state vector sequence H is set to { H ═ H1,h2,…,hnInputting a conditional random field layer; let y equal to l1,l2,…,lnAn output tag representing the conditional random field layer, the output tag sequence probability being calculated by:
Figure FDA0002266384680000022
wherein s represents a sentence;is directed toiThe model parameters of (1);
Figure FDA0002266384680000024
is directed toi-1 and liThe bias parameter of (2); y' represents all possible output tag sets.
The formula for the function of the loss value is:
Figure FDA0002266384680000031
wherein λ is L2A regularization term coefficient; θ represents a parameter set; s and y respectively represent a sentence and a correct labeling sequence corresponding to the sentence; p represents the probability that the sentence s is labeled as the sequence y, i.e., the probability of labeling the correct.
7. The method for training the Chinese named entity recognition model based on reinforcement learning of claim 6, wherein in step one, the actions comprise 'inside' and 'terminate', and the stochastic policy is:
π(a_t | s_t; θ) = σ(W·s_t + b)
where π(a_t | s_t; θ) denotes the probability of selecting action a_t; θ = {W, b} denotes the parameters of the strategy network; s_t denotes the state of the network at time t; σ(·) denotes the sigmoid function; W and b denote the network parameters.
8. The method for training the Chinese named entity recognition model based on reinforcement learning of claim 6, wherein in step two, each character is represented at the character level by an LSTM and the phrases are obtained from the 'terminate' actions, with the update:
(c_t, h_t) = f_LSTM(x_t, c_{t-1}, h_{t-1})
where f_LSTM denotes the LSTM transfer function; x_t denotes the encoding vector of the character input at time t of the sentence; c_t and h_t denote the cell state and the hidden state at time t, respectively.
After the division of the sentence is completed, the phrase information is integrated into a character-granularity LSTM model; the basic recurrent LSTM functions are as follows:
[i_j^c; f_j^c; o_j^c; c̃_j^c] = [σ; σ; σ; tanh](W^c·[x_j^c; h_{j-1}^c] + b^c)
c_j^c = f_j^c ⊙ c_{j-1}^c + i_j^c ⊙ c̃_j^c
h_j^c = o_j^c ⊙ tanh(c_j^c)
where x_j^c denotes the encoding vector of the j-th character of the sentence; h_{j-1}^c denotes the hidden state at character j-1; W^c and b^c are model parameters; i_j^c, f_j^c and o_j^c denote the input, forget and output gates, respectively; c̃_j^c denotes the new candidate state; c_{j-1}^c denotes the cell state at character j-1; c_j^c denotes the updated cell state; h_j^c denotes the hidden state at character j, determined by the output gate o_j^c and the current cell state c_j^c; σ(·) denotes the sigmoid function; tanh(·) denotes the hyperbolic tangent activation function.
The phrase information is characterized by an LSTM model without an output gate, as follows:
[i_{b,e}^w; f_{b,e}^w; c̃_{b,e}^w] = [σ; σ; tanh](W^w·[x_{b,e}^w; h_b^c] + b^w)
c_{b,e}^w = f_{b,e}^w ⊙ c_b^c + i_{b,e}^w ⊙ c̃_{b,e}^w
where x_{b,e}^w denotes the encoding vector of the phrase starting at the b-th character and ending at the e-th character of the sentence; h_b^c denotes the hidden state at the b-th character, i.e., at the first character of the phrase; W^w and b^w are model parameters; i_{b,e}^w and f_{b,e}^w denote the input and forget gates, respectively; c̃_{b,e}^w denotes the new candidate state; c_b^c denotes the cell state at the first character of the phrase; c_{b,e}^w denotes the updated cell state; σ(·) denotes the sigmoid function; tanh(·) denotes the hyperbolic tangent activation function.
In addition, an additional gate is added to select between character-granularity and word-granularity information; its input is the encoding vector of a character and the cell state of a phrase ending at that character:
i_{b,e}^l = σ(W^l·[x_e^c; c_{b,e}^w] + b^l)
where x_e^c denotes the encoding vector of the e-th character of the sentence; c_{b,e}^w denotes the cell state of the phrase starting at the b-th character and ending at the e-th character, i.e., of a phrase whose last character is the e-th character; W^l and b^l are model parameters; i_{b,e}^l denotes the additional gate; σ(·) denotes the sigmoid function.
The cell state update is thus changed, while the hidden state update remains unchanged; the final lattice-LSTM characterization is:
c_j^c = Σ_b α_{b,j}^c ⊙ c_{b,j}^w + α_j^c ⊙ c̃_j^c
where i_j^c is the input gate vector of the j-th character; i_{b,j}^l is the additional gate vector of a phrase starting at b and ending at j; c_{b,j}^w is the phrase cell state; c̃_j^c is the new candidate cell state of the character; α_{b,j}^c and α_j^c are the weights obtained by normalizing i_{b,j}^l and i_j^c, so that α_{b,j}^c ⊙ c_{b,j}^w is the phrase information vector and α_j^c ⊙ c̃_j^c is the character information vector.
9. The training method of the Chinese named entity recognition model based on reinforcement learning of claim 6, wherein before step one, the named entity recognition network and its parameters are pre-trained, and at this stage the words used by the named entity recognition network are obtained by dividing the original sentences with a simple heuristic algorithm;
the pre-trained parameters are then kept, temporarily, as the parameters of the named entity recognition network while the strategy network is pre-trained, and finally all network parameters are trained jointly.
CN201911089295.3A 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof Active CN110826334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911089295.3A CN110826334B (en) 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911089295.3A CN110826334B (en) 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof

Publications (2)

Publication Number Publication Date
CN110826334A true CN110826334A (en) 2020-02-21
CN110826334B CN110826334B (en) 2023-04-21

Family

ID=69553722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911089295.3A Active CN110826334B (en) 2019-11-08 2019-11-08 Chinese named entity recognition model based on reinforcement learning and training method thereof

Country Status (1)

Country Link
CN (1) CN110826334B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476031A (en) * 2020-03-11 2020-07-31 重庆邮电大学 Improved Chinese named entity recognition method based on Lattice-LSTM
CN111666734A (en) * 2020-04-24 2020-09-15 北京大学 Sequence labeling method and device
CN111951959A (en) * 2020-08-23 2020-11-17 云知声智能科技股份有限公司 Dialogue type diagnosis guiding method and device based on reinforcement learning and storage medium
CN112163089A (en) * 2020-09-24 2021-01-01 中国电子科技集团公司第十五研究所 Military high-technology text classification method and system fusing named entity recognition
CN112699682A (en) * 2020-12-11 2021-04-23 山东大学 Named entity identification method and device based on combinable weak authenticator
CN112966517A (en) * 2021-04-30 2021-06-15 平安科技(深圳)有限公司 Training method, device, equipment and medium for named entity recognition model
CN113051921A (en) * 2021-03-17 2021-06-29 北京智慧星光信息技术有限公司 Internet text entity identification method, system, electronic equipment and storage medium
CN114004233A (en) * 2021-12-30 2022-02-01 之江实验室 Remote supervision named entity recognition method based on semi-training and sentence selection

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117472A (en) * 2018-11-12 2019-01-01 新疆大学 Uighur named entity recognition method based on deep learning
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 Sentence trunk analysis method and system based on a multi-task deep neural network for word segmentation and named entity recognition
CN109597876A (en) * 2018-11-07 2019-04-09 中山大学 Multi-turn dialogue answer selection model based on reinforcement learning and method thereof
CN109657239A (en) * 2018-12-12 2019-04-19 电子科技大学 Chinese named entity recognition method based on attention mechanism and language model learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 Sentence trunk analysis method and system based on a multi-task deep neural network for word segmentation and named entity recognition
CN109597876A (en) * 2018-11-07 2019-04-09 中山大学 Multi-turn dialogue answer selection model based on reinforcement learning and method thereof
CN109117472A (en) * 2018-11-12 2019-01-01 新疆大学 Uighur named entity recognition method based on deep learning
CN109657239A (en) * 2018-12-12 2019-04-19 电子科技大学 Chinese named entity recognition method based on attention mechanism and language model learning

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111476031A (en) * 2020-03-11 2020-07-31 重庆邮电大学 Improved Chinese named entity recognition method based on Lattice-LSTM
CN111666734A (en) * 2020-04-24 2020-09-15 北京大学 Sequence labeling method and device
CN111951959A (en) * 2020-08-23 2020-11-17 云知声智能科技股份有限公司 Dialogue type diagnosis guiding method and device based on reinforcement learning and storage medium
CN112163089A (en) * 2020-09-24 2021-01-01 中国电子科技集团公司第十五研究所 Military high-technology text classification method and system fusing named entity recognition
CN112163089B (en) * 2020-09-24 2023-06-23 中国电子科技集团公司第十五研究所 High-technology text classification method and system integrating named entity recognition
CN112699682A (en) * 2020-12-11 2021-04-23 山东大学 Named entity identification method and device based on combinable weak authenticator
CN112699682B (en) * 2020-12-11 2022-05-17 山东大学 Named entity identification method and device based on combinable weak authenticator
CN113051921A (en) * 2021-03-17 2021-06-29 北京智慧星光信息技术有限公司 Internet text entity identification method, system, electronic equipment and storage medium
CN113051921B (en) * 2021-03-17 2024-02-20 北京智慧星光信息技术有限公司 Internet text entity identification method, system, electronic equipment and storage medium
CN112966517A (en) * 2021-04-30 2021-06-15 平安科技(深圳)有限公司 Training method, device, equipment and medium for named entity recognition model
CN112966517B (en) * 2021-04-30 2022-02-18 平安科技(深圳)有限公司 Training method, device, equipment and medium for named entity recognition model
CN114004233A (en) * 2021-12-30 2022-02-01 之江实验室 Remote supervision named entity recognition method based on semi-training and sentence selection

Also Published As

Publication number Publication date
CN110826334B (en) 2023-04-21

Similar Documents

Publication Publication Date Title
CN110826334A (en) Chinese named entity recognition model based on reinforcement learning and training method thereof
CN108628823B (en) Named entity recognition method combining attention mechanism and multi-task collaborative training
CN110135457B (en) Event trigger word extraction method and system based on self-encoder fusion document information
Yao et al. An improved LSTM structure for natural language processing
CN108416058B (en) Bi-LSTM input information enhancement-based relation extraction method
CN110083831A (en) A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF
CN110866401A (en) Chinese electronic medical record named entity identification method and system based on attention mechanism
CN111767718B (en) Chinese grammar error correction method based on weakened grammar error feature representation
CN112541356B (en) Method and system for recognizing biomedical named entities
CN113177412A (en) Named entity identification method and system based on bert, electronic equipment and storage medium
Wu et al. An effective approach of named entity recognition for cyber threat intelligence
CN116432645A (en) Traffic accident named entity recognition method based on pre-training model
CN109766523A (en) Part-of-speech tagging method and labeling system
Han et al. MAF‐CNER: A Chinese Named Entity Recognition Model Based on Multifeature Adaptive Fusion
CN113360667A (en) Biomedical trigger word detection and named entity identification method based on multitask learning
CN112349294A (en) Voice processing method and device, computer readable medium and electronic equipment
CN115017890A (en) Text error correction method and device based on character pronunciation and character font similarity
CN111291550B (en) Chinese entity extraction method and device
Alkhatlan et al. Attention-based sequence learning model for Arabic diacritic restoration
CN116187304A (en) Automatic text error correction algorithm and system based on improved BERT
CN109960782A (en) A kind of Tibetan language segmenting method and device based on deep neural network
Shahid et al. Next word prediction for Urdu language using deep learning models
CN115240712A (en) Multi-mode-based emotion classification method, device, equipment and storage medium
Brill Pattern-based disambiguation for natural language processing
CN112634878B (en) Speech recognition post-processing method and system and related equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant