CN109558569A - Laotian part-of-speech tagging method based on a BiLSTM+CRF model - Google Patents
Laotian part-of-speech tagging method based on a BiLSTM+CRF model
- Publication number
- CN109558569A (application CN201811531266.3A)
- Authority
- CN
- China
- Prior art keywords
- lstm
- speech
- bilstm
- word
- crf
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/117—Tagging; Marking up; Designating a block; Setting of attributes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/221—Parsing markup language streams
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The present invention relates to a Laotian part-of-speech tagging method based on a BiLSTM+CRF model, belonging to the fields of natural language processing and machine learning. BiLSTM is built on the LSTM structure and can exploit contextual information for part-of-speech tagging. When a sentence to be tagged is fed into the BiLSTM, the network computes a part-of-speech probability distribution for each word in the sentence; the conventional approach then selects the highest-probability part of speech from each distribution as the tagging result. However, this approach ignores the interactions between parts of speech, for example, a verb cannot follow a classifier. A CRF model is therefore introduced to solve this problem; it is connected to the output layer of the BiLSTM. The resulting Laotian part-of-speech tagging model based on BiLSTM and CRF can tag Laotian text effectively, so the present invention has clear research significance.
Description
Technical field
The present invention relates to a Laotian part-of-speech tagging method based on a BiLSTM+CRF model, and belongs to the fields of natural language processing and machine learning.
Background art
Part-of-speech tagging is the process of determining the best part of speech for each word in a sentence. It is one of the preprocessing steps of many natural language processing tasks and lays the groundwork for downstream work such as syntactic analysis and information extraction. Early research relied on hand-written rules, but rule construction is very laborious, so statistics-based methods were developed; early statistical studies used models such as the Hidden Markov Model (HMM), the conditional random field (CRF) model, and the maximum entropy model. With the rise of deep learning, research has shifted toward deep-learning approaches to part-of-speech tagging and has achieved good results. However, this line of work had not previously been applied to Laotian part-of-speech tagging, and the model here is built from scratch.
Summary of the invention
The object of the present invention is to provide a Laotian part-of-speech tagging method based on a BiLSTM+CRF model. It combines the bidirectional long short-term memory recurrent neural network (BiLSTM) from deep learning with the conditional random field (CRF) from traditional statistical methods, applies the combination to Laotian part-of-speech tagging, and achieves good results in experiments.
The technical solution adopted by the present invention is a Laotian part-of-speech tagging method based on a BiLSTM+CRF model, comprising the following steps:
Step1: construction of the BiLSTM+CRF model
The Laotian part-of-speech tagging model based on BiLSTM and CRF comprises five layers: an input layer, a forward LSTM layer, a backward LSTM layer, a fully connected layer, and a CRF layer;
(1) Input layer:
The input layer receives a Laotian sentence W1…Wt…Wn with n words. Before entering the BiLSTM, each word must be converted into numeric form, so the input layer builds a word-vector matrix; every Laotian word can look up its corresponding word vector in this matrix, and the values of the vector represent the features of the word. The word vector then represents the word as input to the corresponding LSTM units in the forward and backward LSTM layers, which compute the word's information;
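As an illustrative sketch of the input layer's word-vector lookup: the vocabulary, the placeholder tokens `w1`-`w3` (standing in for Lao words, which are not reproduced here), and the dimensions are all hypothetical, not taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder vocabulary: "w1"-"w3" stand in for Lao words (hypothetical).
vocab = {"<unk>": 0, "w1": 1, "w2": 2, "w3": 3}
emb_dim = 4
# The word-vector matrix: one row of features per vocabulary entry.
embedding = rng.normal(size=(len(vocab), emb_dim))

def lookup(sentence):
    # Each word finds its corresponding word vector in the matrix;
    # out-of-vocabulary words fall back to <unk>.
    ids = [vocab.get(w, vocab["<unk>"]) for w in sentence]
    return embedding[ids]            # shape: (n_words, emb_dim)

vectors = lookup(["w1", "w2", "w3"])
print(vectors.shape)                 # (3, 4)
```

The resulting rows are what the forward and backward LSTM layers consume.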
(2) Forward LSTM layer:
The forward LSTM layer is composed of LSTM units. Each LSTM unit decides which information to retain, output, and delete. The word vector of each word in the input sentence is fed in order into its corresponding LSTM unit, and the units are connected in the forward direction. Each unit outputs two pieces of word information, both in matrix form: the forward state information FS and the forward output information FH. The forward state information is passed along within the layer and participates in the computation of the next LSTM unit, while the forward output information is sent to the fully connected layer to compute the part-of-speech probability distribution;
(3) Backward LSTM layer:
The backward LSTM layer is likewise composed of LSTM units, and the word vector of each word in the sentence is fed in order into its corresponding unit, but the units are connected in the reverse direction. Each unit outputs two pieces of word information: the backward state information BS and the backward output information BH. The backward state information is passed along within the layer and participates in the computation of the next LSTM unit, while the backward output information is sent to the fully connected layer to compute the part-of-speech probability distribution;
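The forward and backward passes can be sketched with a minimal LSTM cell. The gating equations below are the standard LSTM formulation; the weight matrices, sizes, and inputs are illustrative stand-ins, not the patent's parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W):
    # One LSTM step: the input, forget, and output gates decide what
    # information to retain, delete, and output.
    z = W @ np.concatenate([x, h])
    d = h.shape[0]
    i, f, o = sigmoid(z[:d]), sigmoid(z[d:2*d]), sigmoid(z[2*d:3*d])
    g = np.tanh(z[3*d:])
    c_new = f * c + i * g            # state information (FS / BS)
    h_new = o * np.tanh(c_new)       # output information (FH / BH)
    return h_new, c_new

def bilstm(xs, Wf, Wb, d):
    # Forward pass: units connected in input order.
    h, c, FH = np.zeros(d), np.zeros(d), []
    for x in xs:
        h, c = lstm_step(x, h, c, Wf)
        FH.append(h)
    # Backward pass: units connected in reverse order.
    h, c, BH = np.zeros(d), np.zeros(d), []
    for x in reversed(xs):
        h, c = lstm_step(x, h, c, Wb)
        BH.append(h)
    BH.reverse()
    return FH, BH

rng = np.random.default_rng(1)
d, e = 5, 4                          # hidden and embedding sizes (illustrative)
Wf = rng.normal(size=(4 * d, e + d))
Wb = rng.normal(size=(4 * d, e + d))
xs = [rng.normal(size=e) for _ in range(3)]   # a 3-word sentence
FH, BH = bilstm(xs, Wf, Wb, d)
print(len(FH), len(BH))              # one FH and one BH per word
```

Each word thus ends up with both a forward output FH and a backward output BH, which the fully connected layer combines.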
(4) Fully connected layer:
The fully connected layer is composed of simple neural network units. Each unit receives the forward output information FH and the backward output information BH produced by the forward and backward LSTM layers; combining FH and BH in the unit's computation yields the part-of-speech probability distribution;
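One fully connected unit might combine FH and BH as in the following sketch; the softmax projection and all sizes are assumptions for illustration, since the patent does not specify the unit's exact form.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def tag_distribution(fh, bh, W, b):
    # One unit of the fully connected layer: FH and BH are concatenated
    # and projected to a probability distribution over parts of speech.
    return softmax(W @ np.concatenate([fh, bh]) + b)

rng = np.random.default_rng(2)
n_tags, d = 4, 5                     # tagset and hidden sizes (illustrative)
W = rng.normal(size=(n_tags, 2 * d))
b = np.zeros(n_tags)
dist = tag_distribution(rng.normal(size=d), rng.normal(size=d), W, b)
print(dist.shape, float(dist.sum()))
```

Each word's distribution sums to 1 and is handed to the CRF layer.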
(5) CRF layer:
After the fully connected layer has produced a probability distribution for each word, the CRF model uses these distributions to compute the best part-of-speech tag sequence for the sentence. While the CRF layer favors higher-probability parts of speech from each distribution, it also takes the interactions between parts of speech into account;
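The CRF layer's joint decoding can be illustrated with Viterbi decoding over a tag-transition matrix. The three tags, the scores, and the banned classifier-to-verb transition below are invented for illustration, but the sketch shows how such a constraint overrides a per-word greedy choice.

```python
import numpy as np

def viterbi(emissions, transitions):
    # Best tag sequence given per-word tag scores and tag-to-tag
    # transition scores; a forbidden transition gets a large negative
    # score and is effectively never chosen.
    n, k = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        total = score[:, None] + transitions + emissions[t]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Illustrative tags: 0 = noun, 1 = classifier, 2 = verb.
NEG = -1e9
trans = np.zeros((3, 3))
trans[1, 2] = NEG                    # a verb may not follow a classifier
emis = np.array([[2.0, 0.1, 0.1],
                 [0.1, 2.0, 1.9],    # classifier slightly beats verb
                 [0.1, 0.2, 2.0]])   # verb is very likely last
print(viterbi(emis, trans))          # [0, 2, 2]
```

Greedy per-word selection would pick classifier for the middle word, but because classifier-then-verb is forbidden, the CRF-style decoder switches the middle word to verb to keep the whole sequence consistent.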
Step2: training of the BiLSTM+CRF model
The BiLSTM+CRF model is trained on a Laotian part-of-speech-tagged corpus, i.e., a collection of Laotian articles whose words have been tagged with parts of speech. Training first uses a sentence-level log-likelihood function to measure the gap between the part-of-speech probability distributions produced by the fully connected layer and the true part-of-speech distribution in the corpus, and then uses the Adam algorithm to reduce this gap, thereby training the parameters of the BiLSTM+CRF model until it stabilizes, i.e., until the gap value approaches 0. Once the model is stable, the Laotian part-of-speech tagging model based on BiLSTM and CRF is obtained: a sentence to be tagged is fed into the input layer of the model, and the CRF layer outputs the part of speech of each word in the sentence.
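The training step, reducing the "gap" with the Adam algorithm until it approaches 0, can be sketched on a toy quadratic loss standing in for the sentence-level negative log-likelihood; the loss, target, learning rate, and step count are illustrative assumptions.

```python
import numpy as np

def adam_minimize(grad_fn, w, steps=2000, lr=0.05,
                  b1=0.9, b2=0.999, eps=1e-8):
    # Plain Adam update rule, used here to shrink the "gap" (loss)
    # toward 0 as the training step describes.
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        mhat = m / (1 - b1 ** t)
        vhat = v / (1 - b2 ** t)
        w = w - lr * mhat / (np.sqrt(vhat) + eps)
    return w

# Toy stand-in for the sentence-level negative log-likelihood:
# a quadratic "gap" with its minimum at `target`.
target = np.array([1.0, -2.0, 0.5])
w = adam_minimize(lambda w: 2 * (w - target), np.zeros(3))
gap = float(((w - target) ** 2).sum())
print(gap)                           # close to 0 after training
```

In the real model the gradient would come from backpropagating the CRF's sentence-level loss through the fully connected and LSTM layers rather than from this toy quadratic.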
The beneficial effects of the present invention are:
1. The present invention adopts the BiLSTM structure from deep learning, which is good at learning information from both before and after each position in a sentence.
2. The present invention uses a CRF model, which can take the interactions between parts of speech into account; connected as the last layer of the BiLSTM structure, it is highly effective for part-of-speech selection.
3. Experiments show that the part-of-speech tagging accuracy of the proposed Laotian part-of-speech tagging model is higher than that of all the traditional statistical models tested.
Brief description of the drawings
Fig. 1 is the overall flow chart of the present invention;
Fig. 2 is the BiLSTM+CRF model of the worked example.
Specific embodiments
To describe the present invention in more detail and facilitate understanding by those skilled in the art, the invention is further described below with reference to the accompanying drawings and an embodiment. This embodiment serves to illustrate the invention for ease of understanding and does not limit it.
Embodiment 1: as shown in Figs. 1-2, a Laotian part-of-speech tagging method based on a BiLSTM+CRF model proceeds as follows:
Step1: the BiLSTM+CRF model
As shown in Fig. 2, a 3-word Laotian sentence (glossed "the Ministry of Finance says") is used to explain the BiLSTM+CRF model and its workflow.
(1) Input layer:
The input layer receives the 3 words of the sentence; each word looks up its own word vector in the word-vector matrix. The word vectors of the 3 words then enter the corresponding LSTM units in the forward and backward LSTM layers to compute the word information;
(2) Forward LSTM layer:
The forward LSTM layer is composed of 3 LSTM units (L). The word vector of each word in the sentence ("the Ministry of Finance says") enters its corresponding L unit to compute the word information: the forward state information (FS) and the forward output information (FH). Taking L1 as an example, the word vector of the first word of the sentence ("national treasury") enters L1, which computes the forward state information (FS1) and the forward output information (FH1). FS1 is passed along within the layer and participates in the computation of the next unit (L2), while FH1 is sent to the fully connected layer to compute the part-of-speech probability distribution;
(3) Backward LSTM layer:
The backward LSTM layer is composed of 3 LSTM units (R). It works the same way as the forward LSTM layer, except that the units are connected in the reverse order of input;
(4) Fully connected layer:
The fully connected layer is composed of 3 simple neural network units (Cell), each of which receives the forward and backward output information (FH and BH) as input. Taking Cell2 as an example of this layer's computation and output: Cell2 receives BH2 from the backward LSTM layer and FH2 from the forward LSTM layer as input values, and computes the part-of-speech probability distribution of the second word ("declares"), which is then passed to the CRF layer;
(5) CRF layer:
The fully connected layer yields the probability distribution of each word in the sentence ("the Ministry of Finance says"); this set of distributions is fed into the CRF layer. While the CRF layer favors the higher-probability parts of speech in each distribution, it also takes the interactions between parts of speech into account;
Step2: training of the BiLSTM+CRF model
The BiLSTM+CRF model is trained on a Laotian part-of-speech-tagged corpus, i.e., a collection of Laotian articles whose words have been tagged with parts of speech. Training first uses a sentence-level log-likelihood function to measure the gap between the part-of-speech probability distributions produced by the fully connected layer and the true part-of-speech distribution in the corpus, and then uses the Adam algorithm to reduce this gap, thereby training the parameters of the BiLSTM+CRF model until it stabilizes, i.e., until the gap value approaches 0. Once the model is stable, the Laotian part-of-speech tagging model based on BiLSTM and CRF is obtained: a sentence to be tagged is fed into the input layer of the model, and the CRF layer outputs the part of speech of each word in the sentence.
In this embodiment, the Laotian part-of-speech-tagged corpus consists of multiple Laotian articles whose words have been tagged with parts of speech; an example sentence taken from one article is glossed "the United States did not announce advertising expenditure".
BiLSTM is built on the LSTM structure. LSTM is a recurrent neural network suited to tasks with long gaps in time series, such as machine translation, image recognition, and part-of-speech tagging. Because the LSTM structure is recurrent in time, a unidirectional LSTM can only use information flowing in one direction when tagging a sentence and cannot exploit the full context. BiLSTM was introduced precisely to solve this problem: it can use contextual information for part-of-speech tagging. When a sentence to be tagged is fed into the BiLSTM, the network computes a part-of-speech probability distribution for each word, and the conventional approach selects the highest-probability part of speech from each distribution as the tagging result. However, this ignores the interactions between parts of speech, for example, a verb cannot follow a classifier. The CRF model is therefore introduced to solve this problem and is connected to the output layer of the BiLSTM. The Laotian part-of-speech tagging model based on BiLSTM and CRF can tag Laotian text effectively, so the present invention has clear research significance.
The embodiment of the present invention has been described in detail above with reference to the accompanying drawings, but the present invention is not limited to the above embodiment; various changes can be made within the knowledge of those skilled in the art without departing from the inventive concept.
Claims (1)
1. A Laotian part-of-speech tagging method based on a BiLSTM+CRF model, characterized by comprising the following steps:
Step1: construction of the BiLSTM+CRF model
The Laotian part-of-speech tagging model based on BiLSTM and CRF comprises five layers: an input layer, a forward LSTM layer, a backward LSTM layer, a fully connected layer, and a CRF layer;
(1) Input layer: the input layer receives a Laotian sentence W1…Wt…Wn with n words; before entering the BiLSTM, each word must be converted into numeric form, so the input layer builds a word-vector matrix in which every Laotian word can look up its corresponding word vector; the values of the vector represent the features of the word, and the word vector then represents the word as input to the corresponding LSTM units in the forward and backward LSTM layers, which compute the word's information;
(2) Forward LSTM layer: the forward LSTM layer is composed of LSTM units that decide which information to retain, output, and delete; the word vector of each word in the input sentence is fed in order into its corresponding unit, the units being connected in the forward direction; each unit outputs two pieces of word information in matrix form, the forward state information FS and the forward output information FH; FS is passed along within the layer and participates in the computation of the next unit, while FH is sent to the fully connected layer to compute the part-of-speech probability distribution;
(3) Backward LSTM layer: the backward LSTM layer is likewise composed of LSTM units, and the word vector of each word is fed in order into its corresponding unit, but the units are connected in the reverse direction; each unit outputs two pieces of word information, the backward state information BS and the backward output information BH; BS is passed along within the layer and participates in the computation of the next unit, while BH is sent to the fully connected layer to compute the part-of-speech probability distribution;
(4) Fully connected layer: the fully connected layer is composed of simple neural network units; each unit receives the forward output information FH and the backward output information BH produced by the forward and backward LSTM layers, and combining FH and BH in the unit's computation yields the part-of-speech probability distribution;
(5) CRF layer: after the fully connected layer has produced a probability distribution for each word, the CRF model uses these distributions to compute the best part-of-speech tag sequence for the sentence; while the CRF layer favors higher-probability parts of speech from each distribution, it also takes the interactions between parts of speech into account;
Step2: training of the BiLSTM+CRF model
The BiLSTM+CRF model is trained on a Laotian part-of-speech-tagged corpus, i.e., a collection of Laotian articles whose words have been tagged with parts of speech; training first uses a sentence-level log-likelihood function to measure the gap between the part-of-speech probability distributions produced by the fully connected layer and the true part-of-speech distribution in the corpus, and then uses the Adam algorithm to reduce this gap, thereby training the parameters of the BiLSTM+CRF model until it stabilizes, i.e., until the gap value approaches 0; once the model is stable, the Laotian part-of-speech tagging model based on BiLSTM and CRF is obtained: a sentence to be tagged is fed into the input layer of the model, and the CRF layer outputs the part of speech of each word in the sentence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811531266.3A CN109558569A (en) | 2018-12-14 | 2018-12-14 | A kind of Laotian part-of-speech tagging method based on BiLSTM+CRF model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109558569A (en) | 2019-04-02 |
Family
ID=65870089
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811531266.3A Pending CN109558569A (en) | 2018-12-14 | 2018-12-14 | A kind of Laotian part-of-speech tagging method based on BiLSTM+CRF model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109558569A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106569998A (en) * | 2016-10-27 | 2017-04-19 | 浙江大学 | Text named entity recognition method based on Bi-LSTM, CNN and CRF |
US20180225281A1 (en) * | 2017-02-06 | 2018-08-09 | Thomson Reuters Global Resources Unlimited Company | Systems and Methods for Automatic Semantic Token Tagging |
CN107622050A (en) * | 2017-09-14 | 2018-01-23 | 武汉烽火普天信息技术有限公司 | Text sequence labeling system and method based on Bi LSTM and CRF |
CN107644014A (en) * | 2017-09-25 | 2018-01-30 | 南京安链数据科技有限公司 | A kind of name entity recognition method based on two-way LSTM and CRF |
Non-Patent Citations (1)
Title |
---|
Zhiheng Huang et al.: "Bidirectional LSTM-CRF Models for Sequence Tagging", arXiv:1508.01991v1 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110489750A (en) * | 2019-08-12 | 2019-11-22 | 昆明理工大学 | Burmese participle and part-of-speech tagging method and device based on two-way LSTM-CRF |
CN110705293A (en) * | 2019-08-23 | 2020-01-17 | 中国科学院苏州生物医学工程技术研究所 | Electronic medical record text named entity recognition method based on pre-training language model |
CN113468890A (en) * | 2021-07-20 | 2021-10-01 | 南京信息工程大学 | Sedimentology literature mining method based on NLP information extraction and part-of-speech rules |
CN113468890B (en) * | 2021-07-20 | 2023-05-26 | 南京信息工程大学 | Sedimentology literature mining method based on NLP information extraction and part-of-speech rules |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190402 |