CN109558569A - Lao part-of-speech tagging method based on a BiLSTM+CRF model - Google Patents

Lao part-of-speech tagging method based on a BiLSTM+CRF model

Info

Publication number
CN109558569A
Authority
CN
China
Prior art keywords
lstm
speech
bilstm
word
crf
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811531266.3A
Other languages
Chinese (zh)
Inventor
周兰江
王兴金
张建安
周枫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology
Priority to CN201811531266.3A
Publication of CN109558569A
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/10: Text processing
    • G06F 40/103: Formatting, i.e. changing of presentation of documents
    • G06F 40/117: Tagging; Marking up; Designating a block; Setting of attributes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/205: Parsing
    • G06F 40/221: Parsing markup language streams
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Abstract

The present invention relates to a Lao part-of-speech tagging method based on a BiLSTM+CRF model, and belongs to the field of natural language processing and machine learning. BiLSTM is built on the LSTM structure and can use contextual information for part-of-speech tagging. When a sentence to be tagged is fed into the BiLSTM, the network computes a part-of-speech probability distribution for each word in the sentence; the traditional approach then selects the highest-probability part of speech from each distribution as the tagging result. However, this ignores the interaction between parts of speech, for example a verb cannot directly follow a classifier (measure word). A CRF model is therefore introduced to solve this problem, and the CRF model is connected to the output layer of the BiLSTM. Using the Lao part-of-speech tagging model based on BiLSTM and CRF, Lao text can be tagged effectively, so the present invention has definite research significance.

Description

Lao part-of-speech tagging method based on a BiLSTM+CRF model
Technical field
The present invention relates to a Lao part-of-speech tagging method based on a BiLSTM+CRF model, and belongs to the field of natural language processing and machine learning.
Background technique
Part-of-speech tagging is the process of determining the best part of speech for each word in a sentence. It is one of the preprocessing steps of many natural language processing tasks and prepares the ground for subsequent work such as syntactic analysis and information extraction. Early research relied on hand-written rules, but rule construction is very cumbersome. Statistical methods therefore developed; the models used in early statistical research include the hidden Markov model, the conditional random field (CRF) model and the maximum entropy model. With the rise of deep learning, research has shifted toward deep-learning-based part-of-speech tagging and has achieved good results. However, this technical approach had not previously been studied for Lao part-of-speech tagging, and the model used here was built by the inventors themselves.
Summary of the invention
The object of the present invention is to provide a Lao part-of-speech tagging method based on a BiLSTM+CRF model, which combines the bidirectional long short-term memory recurrent neural network (BiLSTM) technique from deep learning with the traditional statistical conditional random field (CRF) method, applies them to Lao part-of-speech tagging, and achieves good results in experiments.
The technical solution adopted by the present invention is a Lao part-of-speech tagging method based on a BiLSTM+CRF model, comprising the following steps:
Step 1: Building the BiLSTM+CRF model
The Lao part-of-speech tagging model based on BiLSTM and CRF comprises five layers: an input layer, a forward LSTM layer, a backward LSTM layer, a fully connected layer and a CRF layer;
(1) Input layer:
The input layer receives a Lao sentence with n words, W1…Wt…Wn. Before entering the BiLSTM, the words must be converted into numerical form so they can be processed, so the input layer builds a word-vector matrix; each Lao word finds its corresponding word vector in this matrix. The values of the word vector represent the features of the word, and the word vector then represents the word as it is fed into the corresponding LSTM units in the forward LSTM layer and the backward LSTM layer, where the word information is computed;
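For illustration, the word-vector lookup performed by the input layer can be sketched as follows; PyTorch is assumed here, and the vocabulary size, embedding size and word ids are illustrative placeholders rather than values specified by the invention.

```python
import torch
import torch.nn as nn

# Illustrative word-vector matrix: vocabulary size and embedding size are
# placeholder assumptions, not values from the patent.
vocab_size, embedding_dim = 5000, 100
embedding = nn.Embedding(num_embeddings=vocab_size,
                         embedding_dim=embedding_dim,
                         padding_idx=0)

# A Lao sentence W1 ... Wn arrives as integer word ids before entering the BiLSTM.
sentence_ids = torch.tensor([[12, 57, 3]])   # shape (batch=1, n=3); ids are stand-ins
word_vectors = embedding(sentence_ids)       # shape (1, 3, embedding_dim)
```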
(2) Forward LSTM layer:
The forward LSTM layer consists of LSTM units, each of which decides which information to keep, output and discard. The word vector of each word in the Lao sentence from the input layer is fed in order into its corresponding LSTM unit, and the LSTM units are connected in the forward order of the input. Each LSTM unit outputs two pieces of word information, both represented as matrices: forward state information FS and forward output information FH. The forward state information is passed along within the layer and takes part in the computation of the next LSTM unit, while the forward output information is sent to the fully connected layer to compute the part-of-speech probability distribution;
(3) Backward LSTM layer:
The backward LSTM layer also consists of LSTM units. The word vector of each word in the Lao sentence from the input layer is fed in order into its corresponding LSTM unit, but the LSTM units are connected in reverse input order. Each LSTM unit outputs two pieces of word information: backward state information BS and backward output information BH. The backward state information is passed along within the layer and takes part in the computation of the next LSTM unit, while the backward output information is sent to the fully connected layer to compute the part-of-speech probability distribution;
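The forward and backward LSTM layers described above can be pictured as two chains of LSTM units walked in opposite directions over the sentence. The minimal sketch below uses PyTorch LSTM cells; the hidden size and the stand-in word vectors are illustrative assumptions only.

```python
import torch
import torch.nn as nn

embedding_dim, hidden_dim = 100, 128               # illustrative sizes
fwd_cell = nn.LSTMCell(embedding_dim, hidden_dim)  # forward LSTM units
bwd_cell = nn.LSTMCell(embedding_dim, hidden_dim)  # backward LSTM units

word_vectors = torch.randn(3, embedding_dim)       # stand-in word vectors of a 3-word sentence

# Forward layer: units connected in reading order; the cell state plays the role
# of FS (handed on to the next unit), the hidden output plays the role of FH.
fh = []
h = c = torch.zeros(1, hidden_dim)
for x in word_vectors:
    h, c = fwd_cell(x.unsqueeze(0), (h, c))
    fh.append(h)                                   # FH_t goes on to the fully connected layer

# Backward layer: the same computation, with the units connected in reverse order.
bh = []
h = c = torch.zeros(1, hidden_dim)
for x in word_vectors.flip(0):
    h, c = bwd_cell(x.unsqueeze(0), (h, c))
    bh.append(h)                                   # BH_t, collected back to front
bh.reverse()                                       # re-align BH_t with word position t
```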
(4) Fully connected layer:
The fully connected layer consists of simple neural network units. Each unit receives the forward output information FH and the backward output information BH produced by the forward and backward LSTM layers; after FH and BH pass through the unit's computation, the part-of-speech probability distribution is obtained;
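A minimal sketch of one such fully connected unit, applied at every word position, is given below; the hidden size and tag-set size are illustrative assumptions.

```python
import torch
import torch.nn as nn

hidden_dim, num_tags = 128, 30                 # illustrative sizes (tag set assumed)
fc = nn.Linear(2 * hidden_dim, num_tags)       # one shared unit applied at every word position

fh = torch.randn(3, hidden_dim)                # forward output information FH for a 3-word sentence
bh = torch.randn(3, hidden_dim)                # backward output information BH
scores = fc(torch.cat([fh, bh], dim=-1))       # per-word part-of-speech scores, shape (3, num_tags)
probs = scores.softmax(dim=-1)                 # part-of-speech probability distribution of each word
```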
(5) CRF layer:
After the fully connected layer produces the probability distribution of each word, the CRF model uses these distributions to compute the best part-of-speech tag sequence for the sentence. While ensuring that a higher-probability part of speech is selected from each distribution, the CRF layer also takes into account the mutual influence between parts of speech;
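The CRF layer's search for the best tag sequence can be illustrated with a hand-written Viterbi decoder over per-word emission scores and tag-to-tag transition scores; this is a simplified sketch under assumed tensor shapes, not the exact implementation of the invention.

```python
import torch

def viterbi_decode(emissions, transitions):
    """Best tag sequence under per-word emission scores plus tag-to-tag
    transition scores; the transitions are how the CRF accounts for the
    interaction between adjacent parts of speech."""
    n, num_tags = emissions.shape
    score = emissions[0].clone()                 # best path score ending in each tag
    backptr = []
    for t in range(1, n):
        total = score.unsqueeze(1) + transitions + emissions[t].unsqueeze(0)
        score, idx = total.max(dim=0)            # best previous tag for each current tag
        backptr.append(idx)
    best_last = int(score.argmax())
    path = [best_last]
    for idx in reversed(backptr):
        path.append(int(idx[path[-1]]))
    return list(reversed(path))

# Illustrative use: 3 words, 4 candidate tags (values are stand-ins).
emissions = torch.randn(3, 4)                    # output of the fully connected layer
transitions = torch.randn(4, 4)                  # learned tag-transition scores
best_tags = viterbi_decode(emissions, transitions)
```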
Step 2: Training the BiLSTM+CRF model
The BiLSTM+CRF model is trained on a Lao document-level part-of-speech tagging corpus, that is, a set of Lao articles whose words have been annotated with parts of speech. Training first uses a sentence-level log-likelihood function to measure the gap between the part-of-speech probability distributions produced by the fully connected layer and the true part-of-speech distribution in the corpus, and then uses the Adam algorithm to reduce this gap, thereby training the parameters of the BiLSTM+CRF model until the model becomes stable, i.e. until the gap approaches 0. Once the model is stable, the Lao part-of-speech tagging model based on BiLSTM and CRF is obtained: a sentence to be tagged is fed into the input layer of the model, and the CRF layer outputs the part of speech of each word in the sentence.
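The sentence-level log-likelihood loss and the Adam update can be sketched as follows. The hand-rolled CRF negative log-likelihood below is a simplified stand-in: in the full model the emission scores would come from the BiLSTM and fully connected layers, and the optimizer would hold all model parameters, not free tensors.

```python
import torch

def crf_nll(emissions, transitions, tags):
    """Sentence-level negative log-likelihood: log partition function over
    all tag sequences minus the score of the annotated (gold) sequence."""
    n, num_tags = emissions.shape
    # Score of the gold path.
    gold = emissions[0, tags[0]]
    for t in range(1, n):
        gold = gold + transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    # Log partition function via the forward algorithm.
    alpha = emissions[0]
    for t in range(1, n):
        alpha = torch.logsumexp(alpha.unsqueeze(1) + transitions, dim=0) + emissions[t]
    return torch.logsumexp(alpha, dim=0) - gold

# Illustrative training step: the "gap" (loss) is reduced with Adam.
num_tags = 4
transitions = torch.randn(num_tags, num_tags, requires_grad=True)
emissions = torch.randn(3, num_tags, requires_grad=True)   # stand-in for BiLSTM + fc output
gold_tags = torch.tensor([1, 0, 2])                        # annotated parts of speech (stand-in)

optimizer = torch.optim.Adam([transitions, emissions], lr=1e-3)
loss = crf_nll(emissions, transitions, gold_tags)
loss.backward()
optimizer.step()
```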
The beneficial effects of the present invention are:
1. The present invention employs the BiLSTM structure from deep learning, which is effective at learning the information before and after each position in a sentence.
2. The present invention uses a CRF model, which takes into account the mutual influence between parts of speech; connected after the last layer of the BiLSTM structure, it is highly effective for selecting parts of speech.
3. Experimental results show that the part-of-speech tagging accuracy of the proposed Lao part-of-speech tagging model is higher than that of all the traditional statistical models.
Description of the drawings
Fig. 1 is the overall flow chart of the present invention;
Fig. 2 shows the BiLSTM+CRF model on an example.
Specific embodiments
To describe the present invention in more detail and to make it easier for those skilled in the art to understand, the invention is further described below with reference to the accompanying drawings and an embodiment. The embodiment in this part is intended only to illustrate the invention and is not to be taken as limiting it.
Embodiment 1: as shown in Figs. 1-2, the specific steps of the Lao part-of-speech tagging method based on the BiLSTM+CRF model are as follows:
Step 1: The BiLSTM+CRF model
As shown in Fig. 2, a 3-word Lao sentence, glossed "the Ministry of Finance says", is used to explain the BiLSTM+CRF model and its workflow.
(1) Input layer:
The input layer receives the 3 words of the sentence; each word looks up its own word vector in the word-vector matrix. The word vectors of the 3 words then enter the corresponding LSTM units in the forward and backward LSTM layers, where the word information is computed;
(2) Forward LSTM layer:
The forward LSTM layer consists of 3 LSTM units (L). The word vector of each word of the example sentence (glossed "the Ministry of Finance says") enters its corresponding L unit, which computes the word information: forward state information (FS) and forward output information (FH). Taking L1 as an example, the word vector of the first word of the sentence (glossed "national treasury") enters L1 and yields forward state information FS1 and forward output information FH1. FS1 is passed along within the layer and takes part in the computation of the next LSTM unit L2, while FH1 is sent to the fully connected layer to compute the part-of-speech probability distribution;
(3) Backward LSTM layer:
The backward LSTM layer consists of 3 LSTM units (R). It works in the same way as the forward LSTM layer, except that the LSTM units are connected in reverse input order;
(4) Fully connected layer:
The fully connected layer consists of 3 simple neural network units (Cells), and each Cell receives the forward and backward output information (FH and BH) as input. Taking Cell2 as an example of this layer's computation and output: Cell2 receives BH2 from the backward LSTM layer and FH2 from the forward LSTM layer as input values, and from them computes the part-of-speech probability distribution of the second word (glossed "declares"), which is then passed to the CRF layer;
(5) CRF layer:
The fully connected layer thus yields the part-of-speech probability distribution of every word in the sentence (glossed "the Ministry of Finance says"); this set of distributions is fed into the CRF layer. While ensuring that a higher-probability part of speech is selected from each distribution, the CRF layer also takes into account the mutual influence between parts of speech;
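Putting the example together, a compact sketch of the input layer, the two LSTM layers and the fully connected layer for a 3-word sentence might look as follows. The class name, the word ids and all sizes are illustrative assumptions; the resulting per-word scores would then be handed to the CRF/Viterbi step sketched earlier.

```python
import torch
import torch.nn as nn

class BiLstmEncoder(nn.Module):
    """Input layer + forward/backward LSTM layers + fully connected layer.
    All sizes and the example word ids are illustrative placeholders."""
    def __init__(self, vocab_size=5000, embedding_dim=100, hidden_dim=128, num_tags=30):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.bilstm = nn.LSTM(embedding_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, word_ids):
        vectors = self.embedding(word_ids)       # input layer: word-vector lookup
        outputs, _ = self.bilstm(vectors)        # FH and BH, concatenated per word
        return self.fc(outputs)                  # per-word part-of-speech scores

model = BiLstmEncoder()
sentence_ids = torch.tensor([[101, 102, 103]])   # the 3-word example sentence (ids are stand-ins)
emissions = model(sentence_ids)                  # shape (1, 3, num_tags), handed to the CRF layer
```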
Step 2: Training the BiLSTM+CRF model
The BiLSTM+CRF model is trained on a Lao document-level part-of-speech tagging corpus, that is, a set of Lao articles whose words have been annotated with parts of speech. Training first uses a sentence-level log-likelihood function to measure the gap between the part-of-speech probability distributions produced by the fully connected layer and the true part-of-speech distribution in the corpus, and then uses the Adam algorithm to reduce this gap, thereby training the parameters of the BiLSTM+CRF model until the model becomes stable, i.e. until the gap approaches 0. Once the model is stable, the Lao part-of-speech tagging model based on BiLSTM and CRF is obtained: a sentence to be tagged is fed into the input layer of the model, and the CRF layer outputs the part of speech of each word in the sentence.
In this embodiment, the Lao document-level part-of-speech tagging corpus consists of Lao articles annotated with parts of speech; as an example, one sentence taken from an article is glossed "the United States does not announce its advertising expenditure".
BiLSTM is built on the LSTM structure. LSTM is a recurrent neural network over time and is well suited to tasks in which relevant events in a time series are separated by long intervals, such as machine translation, image recognition and part-of-speech tagging. Because the LSTM structure recurses along one direction in time, a unidirectional LSTM used for part-of-speech tagging of a sentence can only exploit information from one direction and cannot make use of the full context. BiLSTM was introduced precisely to solve this problem: it can use contextual information on both sides for part-of-speech tagging. When a sentence to be tagged is fed into the BiLSTM, it computes a part-of-speech probability distribution for each word in the sentence; the traditional approach then selects the highest-probability part of speech from each distribution as the tagging result. However, this ignores the interaction between parts of speech, for example a verb cannot directly follow a classifier. A CRF model is therefore introduced to solve this problem and is connected to the output layer of the BiLSTM. Using the Lao part-of-speech tagging model based on BiLSTM and CRF, Lao text can be tagged effectively, so the present invention has definite research significance.
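The kind of part-of-speech interaction mentioned above, for example that a verb should not directly follow a classifier, can be expressed through the CRF's transition scores. The tag set and the hard penalty below are purely illustrative assumptions, not values from the invention.

```python
import torch

tags = ["noun", "verb", "classifier", "adjective"]   # illustrative tag set
num_tags = len(tags)
transitions = torch.zeros(num_tags, num_tags)

# One constraint the CRF can learn or be given directly:
# a verb should not immediately follow a classifier.
transitions[tags.index("classifier"), tags.index("verb")] = -1e4

# Any Viterbi or forward computation over these transition scores will now
# avoid tag sequences that place a verb right after a classifier.
```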
The embodiment of the present invention has been explained in detail above with reference to the accompanying drawings, but the present invention is not limited to the above embodiment; various changes may be made within the knowledge of a person skilled in the art without departing from the concept of the present invention.

Claims (1)

1. A Lao part-of-speech tagging method based on a BiLSTM+CRF model, characterized by comprising the following steps:
Step 1: Building the BiLSTM+CRF model
The Lao part-of-speech tagging model based on BiLSTM and CRF comprises five layers: an input layer, a forward LSTM layer, a backward LSTM layer, a fully connected layer and a CRF layer;
(1) Input layer:
The input layer receives a Lao sentence with n words, W1…Wt…Wn. Before entering the BiLSTM, the words must be converted into numerical form so they can be processed, so the input layer builds a word-vector matrix; each Lao word finds its corresponding word vector in this matrix. The values of the word vector represent the features of the word, and the word vector then represents the word as it is fed into the corresponding LSTM units in the forward LSTM layer and the backward LSTM layer, where the word information is computed;
(2) Forward LSTM layer:
The forward LSTM layer consists of LSTM units, each of which decides which information to keep, output and discard. The word vector of each word in the Lao sentence from the input layer is fed in order into its corresponding LSTM unit, and the LSTM units are connected in the forward order of the input. Each LSTM unit outputs two pieces of word information, both represented as matrices: forward state information FS and forward output information FH. The forward state information is passed along within the layer and takes part in the computation of the next LSTM unit, while the forward output information is sent to the fully connected layer to compute the part-of-speech probability distribution;
(3) Backward LSTM layer:
The backward LSTM layer also consists of LSTM units. The word vector of each word in the Lao sentence from the input layer is fed in order into its corresponding LSTM unit, but the LSTM units are connected in reverse input order. Each LSTM unit outputs two pieces of word information: backward state information BS and backward output information BH. The backward state information is passed along within the layer and takes part in the computation of the next LSTM unit, while the backward output information is sent to the fully connected layer to compute the part-of-speech probability distribution;
(4) Fully connected layer:
The fully connected layer consists of simple neural network units. Each unit receives the forward output information FH and the backward output information BH produced by the forward and backward LSTM layers; after FH and BH pass through the unit's computation, the part-of-speech probability distribution is obtained;
(5) CRF layer:
After the fully connected layer produces the probability distribution of each word, the CRF model uses these distributions to compute the best part-of-speech tag sequence for the sentence. While ensuring that a higher-probability part of speech is selected from each distribution, the CRF layer also takes into account the mutual influence between parts of speech;
Step 2: Training the BiLSTM+CRF model
The BiLSTM+CRF model is trained on a Lao document-level part-of-speech tagging corpus, that is, a set of Lao articles whose words have been annotated with parts of speech. Training first uses a sentence-level log-likelihood function to measure the gap between the part-of-speech probability distributions produced by the fully connected layer and the true part-of-speech distribution in the corpus, and then uses the Adam algorithm to reduce this gap, thereby training the parameters of the BiLSTM+CRF model until the model becomes stable, i.e. until the gap approaches 0. Once the model is stable, the Lao part-of-speech tagging model based on BiLSTM and CRF is obtained: a sentence to be tagged is fed into the input layer of the Lao part-of-speech tagging model, and the CRF layer outputs the part of speech of each word in the sentence.
CN201811531266.3A 2018-12-14 2018-12-14 A kind of Laotian part-of-speech tagging method based on BiLSTM+CRF model Pending CN109558569A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811531266.3A CN109558569A (en) 2018-12-14 2018-12-14 A kind of Laotian part-of-speech tagging method based on BiLSTM+CRF model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811531266.3A CN109558569A (en) 2018-12-14 2018-12-14 A kind of Laotian part-of-speech tagging method based on BiLSTM+CRF model

Publications (1)

Publication Number Publication Date
CN109558569A true CN109558569A (en) 2019-04-02

Family

ID=65870089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811531266.3A Pending CN109558569A (en) 2018-12-14 2018-12-14 A kind of Laotian part-of-speech tagging method based on BiLSTM+CRF model

Country Status (1)

Country Link
CN (1) CN109558569A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF
US20180225281A1 (en) * 2017-02-06 2018-08-09 Thomson Reuters Global Resources Unlimited Company Systems and Methods for Automatic Semantic Token Tagging
CN107622050A (en) * 2017-09-14 2018-01-23 武汉烽火普天信息技术有限公司 Text sequence labeling system and method based on Bi LSTM and CRF
CN107644014A (en) * 2017-09-25 2018-01-30 南京安链数据科技有限公司 A kind of name entity recognition method based on two-way LSTM and CRF

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhiheng Huang et al., "Bidirectional LSTM-CRF Models for Sequence Tagging", arXiv:1508.01991v1 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110489750A (en) * 2019-08-12 2019-11-22 昆明理工大学 Burmese participle and part-of-speech tagging method and device based on two-way LSTM-CRF
CN110705293A (en) * 2019-08-23 2020-01-17 中国科学院苏州生物医学工程技术研究所 Electronic medical record text named entity recognition method based on pre-training language model
CN113468890A (en) * 2021-07-20 2021-10-01 南京信息工程大学 Sedimentology literature mining method based on NLP information extraction and part-of-speech rules
CN113468890B (en) * 2021-07-20 2023-05-26 南京信息工程大学 Sedimentology literature mining method based on NLP information extraction and part-of-speech rules

Similar Documents

Publication Publication Date Title
WO2021155699A1 (en) Global encoding method for automatic abstract of chinese long text
CN108763284A (en) A kind of question answering system implementation method based on deep learning and topic model
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN109558569A (en) A kind of Laotian part-of-speech tagging method based on BiLSTM+CRF model
CN112183058B (en) Poetry generation method and device based on BERT sentence vector input
CN108491386A (en) natural language understanding method and system
CN110457661B (en) Natural language generation method, device, equipment and storage medium
CN110188175A (en) A kind of question and answer based on BiLSTM-CRF model are to abstracting method, system and storage medium
CN107679225A (en) A kind of reply generation method based on keyword
CN110188348A (en) A kind of Chinese language processing model and method based on deep neural network
Dethlefs Domain transfer for deep natural language generation from abstract meaning representations
CN109508457A (en) A kind of transfer learning method reading series model based on machine
CN112287106A (en) Online comment emotion classification method based on dual-channel hybrid neural network
CN110334196A (en) Neural network Chinese charater problem based on stroke and from attention mechanism generates system
CN114444481B (en) Sentiment analysis and generation method of news comment
CN109933773A (en) A kind of multiple semantic sentence analysis system and method
Zhang et al. Performance comparisons of Bi-LSTM and Bi-GRU networks in Chinese word segmentation
Zhang et al. Keyword-driven image captioning via Context-dependent Bilateral LSTM
Wang The application of intelligent speech recognition technology in the tone correction of college piano teaching
Wen et al. Visual prompt tuning for few-shot text classification
Wang Research on the art value and application of art creation based on the emotion analysis of art
Yulin et al. High school math text similarity studies based on CNN and BiLSTM
Wang et al. Multimodal Feature Fusion and Emotion Recognition Based on Variational Autoencoder
Wu et al. A text emotion analysis method using the dual-channel convolution neural network in social networks
Xia et al. Generating Questions Based on Semi-Automated and End-to-End Neural Network.

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20190402