CN110008467A - Burmese dependency parsing method based on transfer learning - Google Patents
Burmese dependency parsing method based on transfer learning
- Publication number
- CN110008467A (application number CN201910158572.5A)
- Authority
- CN
- China
- Prior art keywords
- burmese
- corpus
- syntactic analysis
- english
- dependency syntax
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F40/211 — Natural language analysis; Parsing; Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/242 — Natural language analysis; Lexical tools; Dictionaries
- G06N3/04 — Neural networks; Architecture, e.g. interconnection topology
- G06N3/08 — Neural networks; Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Machine Translation (AREA)
Abstract
The present invention relates to a transfer-learning-based dependency parsing method for Burmese and belongs to the field of natural language processing. First, an English dependency parsing model is trained on the dependency-parsing corpus of English, a resource-rich language. Next, on the basis of the trained English model, the network parameters are transferred to the low-resource setting following the idea of transfer learning. Finally, a low-quality Burmese dependency-parsing corpus is added to fine-tune the model, yielding a Burmese dependency parsing model. The method effectively improves the performance of dependency parsing for low-resource languages.
Description
Technical field
The present invention relates to a transfer-learning-based dependency parsing method for Burmese and belongs to the field of natural language processing.
Background art
Syntactic analysis aims to convert a sentence from a sequence of words into a graph structure (usually a tree, according to some grammatical formalism) that captures the syntactic relations inside the sentence (subject, predicate, object, etc.). It is one of the key problems of natural language processing and effectively supports tasks such as information extraction, sentiment analysis and machine translation.
Current mainstream dependency parsing methods fall into two classes: graph-based and transition-based. Graph-based methods treat dependency parsing as finding a maximum spanning tree in a complete directed graph, where each edge scores the likelihood of a particular syntactic relation between two words. Transition-based methods build a dependency tree through a sequence of transition actions such as SHIFT and REDUCE, and the learning objective is to find the optimal action sequence. Compared with graph-based methods, transition-based parsing has lower algorithmic complexity and therefore higher parsing efficiency; because it can exploit richer features, its parsing accuracy is comparable to that of graph-based methods. A minimal sketch of such a transition system is given below.
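As a concrete illustration of the transition-based scheme just described (not part of the original patent text; all function and variable names are hypothetical), the following Python sketch applies SHIFT, LEFT-ARC and RIGHT-ARC actions, driven by a simple oracle over gold heads, to build a dependency tree:

```python
# Illustrative sketch: a minimal arc-standard transition system.
# Token 0 is ROOT; gold_heads maps token index -> head index.
def parse_with_oracle(words, gold_heads):
    stack, buffer, arcs = [0], list(range(1, len(words) + 1)), {}

    def attached_all_children(tok):
        return all(arcs.get(dep) == tok
                   for dep in range(1, len(words) + 1)
                   if gold_heads.get(dep) == tok)

    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            s1, s2 = stack[-1], stack[-2]
            if gold_heads.get(s2) == s1 and s2 != 0:                    # LEFT-ARC
                arcs[s2] = s1
                stack.pop(-2)
                continue
            if gold_heads.get(s1) == s2 and attached_all_children(s1):  # RIGHT-ARC
                arcs[s1] = s2
                stack.pop()
                continue
        if not buffer:
            break                                                       # done / non-projective
        stack.append(buffer.pop(0))                                     # SHIFT
    return arcs

# "She eats fish": gold heads {1: 2, 2: 0, 3: 2}
print(parse_with_oracle(["She", "eats", "fish"], {1: 2, 2: 0, 3: 2}))
```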
Traditional transition-based methods represent the parser state, i.e. the basis for classification, by extracting a series of manually defined features, such as the word and part of speech at the top of the stack, the first word and its part of speech in the buffer, and the leftmost or rightmost words of the partially built dependency tree; these are known as core features. To improve classification accuracy, various combination features must also be defined by hand. For example, Zhang and Nivre (2011) gave a carefully optimized set of feature templates comprising 20 core features and 72 combination features.
Chen and Manning (2014) at Stanford University were the first to successfully apply deep learning to dependency parsing. The nonlinear activation function of a deep network implicitly achieves the feature-combination effect of traditional methods, avoiding the tedious manual design of combination features, and ultimately reaches accuracy comparable to traditional methods. At the same time, because the approach does not need to combine features explicitly, which is extremely time-consuming, and with computational techniques such as precomputation, it also greatly accelerates dependency parsing.
These methods are all supervised and require high-quality annotated corpora, but Burmese is a low-resource language with no available annotated data, and manual annotation is too costly. Some existing approaches for low-resource languages construct a dependency treebank by bilingual mapping, but the resulting trees contain considerable noise (grammatical differences between the languages, words that cannot be aligned). Transfer learning has achieved good results in low-resource natural language processing. The invention therefore proposes a transfer-learning-based dependency parsing method for Burmese: a baseline model is trained on abundant English resources, and on the basis of this model some low-quality Burmese dependency-parsing corpus is added to train a Burmese dependency parsing model.
Summary of the invention
The present invention provides a transfer-learning-based dependency parsing method for Burmese, which improves the training of a Burmese dependency parsing model when only low-resource, low-quality data is available.
The technical scheme is as follows: first, a Burmese dependency-parsing corpus is obtained by English-Burmese mapping, and the Burmese and English corpora are mapped into the same semantic space; then an English dependency parsing model is trained with the bilingual word vectors, and, following the idea of transfer learning, the Burmese dependency-parsing corpus is added and the model parameters are adjusted to obtain the Burmese dependency parsing model.
The specific steps are as follows:
Step1, construct a Burmese dependency-parsing corpus from an English-Burmese bilingual parallel corpus;
Step1.1, segment the English and Burmese sides of the bilingual parallel corpus into words;
Step1.2, establish English-Burmese word correspondences;
Step1.3, parse the English corpus with the Stanford neural-network parser to obtain an English dependency-parsing corpus;
Step1.4, using the word correspondences from Step1.2, map the English dependency analyses onto the Burmese corpus to obtain the Burmese dependency-parsing corpus.
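The projection in Step1.4 can be pictured with the following minimal sketch (hypothetical data structures, not the patent's exact procedure): English heads are carried over to Burmese tokens through a word alignment, and unaligned tokens are simply skipped, which is one source of the noise discussed in the background section.

```python
# Illustrative sketch: project English dependency heads onto a Burmese
# sentence through a one-to-one word alignment.
def project_dependencies(en_heads, en2my):
    """en_heads: English token index -> head index (0 = ROOT).
    en2my: alignment, English token index -> Burmese token index."""
    my_heads = {}
    for en_dep, en_head in en_heads.items():
        if en_dep not in en2my:
            continue                          # no Burmese counterpart: arc is dropped
        my_dep = en2my[en_dep]
        if en_head == 0:
            my_heads[my_dep] = 0              # ROOT stays ROOT
        elif en_head in en2my:
            my_heads[my_dep] = en2my[en_head]
    return my_heads

# "She eats fish" aligned to a three-token Burmese sentence (SOV order):
print(project_dependencies({1: 2, 2: 0, 3: 2}, {1: 1, 2: 3, 3: 2}))
# -> {1: 3, 3: 0, 2: 3}
```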
Step2, map the Burmese corpus and the English corpus into the same semantic space: find a linear transformation matrix W that realizes $\arg\min_W \sum_i \|X_i W - Z_i\|^2$, where $X_i$ and $Z_i$ denote the embeddings of the Burmese and English words of the i-th dictionary entry. The linear transformation W maps the Burmese and English corpora into the same semantic space, and the word vectors of this shared space are then used as the input of the dependency parsing model (a minimal sketch follows).
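A minimal sketch of the Step2 objective, assuming X and Z are row-wise embedding matrices of the aligned Burmese and English dictionary entries (toy random data is used here in place of real embeddings):

```python
# W = argmin_W sum_i ||X_i W - Z_i||^2, solved as unconstrained least squares
# (before the orthogonality constraint of Step2.1 is imposed).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 100))      # Burmese embeddings of dictionary entries
Z = rng.normal(size=(1000, 100))      # corresponding English embeddings

W, *_ = np.linalg.lstsq(X, Z, rcond=None)
mapped = X @ W                        # Burmese vectors in the shared (English) space
```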
Step2.1, constrain the mapping matrix to be orthogonal.
Mikolov et al. fit the data by minimizing the distance in the embedding space; in doing so the matrix W can overfit part of the data and lose other regularities. So that W learns global information rather than only a portion of it, monolingual invariance is required: dot products should be preserved after the mapping, which prevents the quality of the mapped Burmese word vectors from degrading. A regularization constraint is therefore added requiring W to be an orthogonal matrix, i.e. $W^{\top}W = I$. Orthogonality strengthens the global character of W, so the bilingual mapping is learned better: constraining W to be orthogonal effectively constrains its learning, making W capture the information as a whole rather than only a part of it (see the closed-form sketch below).
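Under the orthogonality constraint of Step2.1 the same least-squares problem has a closed-form solution, the orthogonal Procrustes solution $W = UV^{\top}$ from the SVD of $X^{\top}Z$; a minimal sketch, assuming the same row-wise embedding matrices as above:

```python
# Orthogonal Procrustes solution: minimize sum_i ||X_i W - Z_i||^2
# subject to W^T W = I.
import numpy as np

def orthogonal_mapping(X, Z):
    U, _, Vt = np.linalg.svd(X.T @ Z)
    W = U @ Vt                                        # orthogonal by construction
    assert np.allclose(W.T @ W, np.eye(W.shape[1]), atol=1e-6)
    return W
```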
Step2.2, length normalization and maximizing cosine similarity.
The bilingual word-vector mapping above cannot guarantee that every dimension of a vector has a comparable magnitude; when one dimension is particularly large or small, anomalies arise during computation, because the dimensions of a vector are not weighted equally, so the dimensions need to be brought into the same range. Normalizing the word embeddings of both languages to unit vectors guarantees that all training examples contribute equally to the optimization objective. As long as W is orthogonal, this is equivalent to maximizing the sum of the cosine similarities of the dictionary entries, the measure commonly used for similarity: $\arg\max_W \sum_i \cos(X_i W, Z_i)$.
This normalization ensures that, when aligning the bilingual dictionary, each update of the transfer matrix takes the global picture into account instead of being dominated by the correspondence of a few words, so that the full information in Burmese and English is learned.
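A small sketch of the Step2.2 normalization, assuming the same row-wise embedding matrices as above: every word vector is scaled to unit length, and with an orthogonal W the alignment quality can then be read off as the summed cosine similarity of the dictionary entries.

```python
# Length normalization: unit-length rows, then the objective is the
# summed cosine similarity of the dictionary entries.
import numpy as np

def unit_normalize(E, eps=1e-12):
    return E / (np.linalg.norm(E, axis=1, keepdims=True) + eps)

def summed_cosine(X, Z, W):
    Xn, Zn = unit_normalize(X @ W), unit_normalize(Z)
    return float(np.sum(Xn * Zn))     # sum_i cos(X_i W, Z_i)
```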
Step2.3, update the bilingual dictionary iteratively.
The dictionary is expanded by iteration: each round solves for the correspondences, uses the correspondence matrix W to find words in the dictionary that are not yet matched, expands the dictionary, and recomputes W in the next round to obtain the updated dictionary. During matching, if the similarity of a bilingual word-vector pair is below 80%, the pair is not matched; only pairs satisfying this condition are considered correspondences and are added to the bilingual dictionary for the next round of training. Training continues until the bilingual dictionary no longer grows after a certain number of rounds, at which point the model is considered converged and training ends, or it stops after a fixed number of iterations is exceeded. During this self-learning process, unmatched words in the dictionary are marked 0 and matched words are marked 1, introducing a new matrix D in which $D_{ij}=1$ means the bilingual word pair is matched successfully and 0 means it is not. The goal is then to find the optimal mapping matrix W that minimizes the sum of squared Euclidean distances between the mapped source embeddings $X_i W$ and the target embeddings $Z_j$ over the dictionary entries $D_{ij}$:
$\arg\min_W \sum_i \sum_j D_{ij}\,\|X_i W - Z_j\|^2$
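The self-learning loop of Step2.3 can be sketched as follows (the 80% cutoff and the convergence test follow the description above; matrix shapes and function names are assumptions, not the patent's exact implementation):

```python
# D is the 0/1 match matrix: D[i, j] = 1 when Burmese entry i is paired
# with English entry j.
import numpy as np

def _unit(E):
    return E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-12)

def self_learning(X, Z, D, n_iter=10, threshold=0.8):
    Xn, Zn = _unit(X), _unit(Z)
    W = np.eye(X.shape[1])
    for _ in range(n_iter):
        # 1) best orthogonal W for the current dictionary D
        #    (minimizes sum_ij D_ij ||X_i W - Z_j||^2)
        U, _, Vt = np.linalg.svd(Xn.T @ D @ Zn)
        W = U @ Vt
        # 2) re-induce the dictionary: keep only pairs whose cosine
        #    similarity reaches the threshold (80% in the description)
        sims = (Xn @ W) @ Zn.T
        best = sims.argmax(axis=1)
        keep = sims[np.arange(len(best)), best] >= threshold
        new_D = np.zeros_like(D)
        new_D[np.arange(len(best))[keep], best[keep]] = 1
        if np.array_equal(new_D, D):
            break                     # dictionary stopped changing: converged
        D = new_D
    return W, D
```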
Step3, train the dependency parsing model on the Stanford neural network using the English dependency-parsing corpus from Step1.
First, the input vectors required for training the Stanford neural-network parser and the English dependency-parsing corpus with annotated dependencies are provided. The input vectors are the word vector, the POS-tag vector and the feature-tag vector; the word vector is the shared-space vector obtained in Step2, while the POS-tag and feature-tag vectors are randomly initialized. The hidden layer uses $h = (W_1^{w}x_w + W_1^{t}x_t + W_1^{l}x_l + b_1)^3$ to train $[W_1^{w}, W_1^{t}, W_1^{l}]$, where $x_w$, $x_t$, $x_l$ denote the word vector, POS-tag vector and feature-tag vector fed to the hidden layer, $W_1^{w}$, $W_1^{t}$, $W_1^{l}$ are the corresponding weight matrices, and $b_1$ is the bias. The output layer uses the activation $p = \mathrm{softmax}(W_2 h)$, where $W_2$ is the output weight matrix.
Step3.1, $[x_w, x_t, x_l]$ is the input of the model, where $x_w \in \mathbb{R}^{d}$ is a d-dimensional word vector, and $x_t$, $x_l$ are the POS-tag vector and the feature-tag vector respectively;
Step3.2, the activation function of the hidden layer is $h = (W_1^{w}x_w + W_1^{t}x_t + W_1^{l}x_l + b_1)^3$, where $b_1 \in \mathbb{R}^{d_h}$ is the bias and $d_h$ is the number of hidden-layer nodes;
Step3.3, the output layer computes $p = \mathrm{softmax}(W_2 h)$, where $W_2$ is the output-layer weight matrix (a numpy sketch of this scoring network is given below).
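A minimal numpy sketch of the Step3 scoring network, with the cube activation and softmax output given above; the dimensions are illustrative, and the feature extraction that builds $x_w$, $x_t$ and $x_l$ from parser configurations is omitted:

```python
# Toy forward pass: cube activation, softmax over transitions.
import numpy as np

def forward(x_w, x_t, x_l, params):
    W1w, W1t, W1l, b1, W2 = params
    h = (W1w @ x_w + W1t @ x_t + W1l @ x_l + b1) ** 3   # cube activation
    logits = W2 @ h
    logits = logits - logits.max()                      # numerical stability
    return np.exp(logits) / np.exp(logits).sum()        # p = softmax(W2 h)

d, d_h, n_transitions = 50, 200, 3                      # illustrative sizes
rng = np.random.default_rng(0)
params = (rng.normal(scale=0.1, size=(d_h, d)),         # W1^w
          rng.normal(scale=0.1, size=(d_h, d)),         # W1^t
          rng.normal(scale=0.1, size=(d_h, d)),         # W1^l
          np.zeros(d_h),                                 # b1
          rng.normal(scale=0.1, size=(n_transitions, d_h)))   # W2
p = forward(rng.normal(size=d), rng.normal(size=d), rng.normal(size=d), params)
print(p, p.sum())                                       # probabilities summing to 1
```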
Step4, obtain the Burmese dependency parsing model by transfer learning with shared parameters: use the Burmese dependency-parsing corpus obtained in Step1 and the shared-space Burmese word vectors obtained in Step2 to adjust the parameters of the English dependency parsing model from Step3, and thus train the Burmese dependency parsing model.
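A sketch of the Step4 parameter transfer under stated assumptions: the Burmese parser starts from the English parser's weights (laid out as in the sketch above) and is fine-tuned on the projected Burmese corpus. For brevity only the output layer is updated here, whereas the method adjusts all shared parameters; `burmese_batch` is a hypothetical list of (x_w, x_t, x_l, gold_transition) examples.

```python
# Parameter transfer and fine-tuning on the Burmese corpus.
import copy
import numpy as np

def finetune(english_params, burmese_batch, lr=0.01, epochs=5):
    W1w, W1t, W1l, b1, W2 = copy.deepcopy(english_params)   # share/copy English weights
    for _ in range(epochs):
        for x_w, x_t, x_l, gold in burmese_batch:
            h = (W1w @ x_w + W1t @ x_t + W1l @ x_l + b1) ** 3
            logits = W2 @ h
            p = np.exp(logits - logits.max())
            p /= p.sum()
            grad = p.copy()
            grad[gold] -= 1.0                    # d(cross-entropy)/d(logits)
            W2 -= lr * np.outer(grad, h)         # adjust only W2 here, for brevity
    return W1w, W1t, W1l, b1, W2
```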
Beneficial effects of the present invention:
The present invention obtains the Burmese dependency-parsing corpus by mapping and uses the idea of transfer learning to transfer and adjust parameters, optimizing the Burmese dependency parsing model and effectively improving the performance of Burmese dependency parsing.
Description of the drawings
Fig. 1 is the flow chart of the present invention;
Fig. 2 is the model-training diagram of the present invention;
Fig. 3 is the training diagram of the dependency parsing model with shared network parameters in the present invention.
Specific embodiment
The present invention will be further explained below with reference to the attached drawings and specific examples.
Embodiment 1: a transfer-learning-based dependency parsing method for Burmese, whose flow is shown in Fig. 1. First, a Burmese dependency-parsing corpus is built from an English-Burmese parallel treebank (20,028 aligned English-Burmese sentence pairs): pairwise English-Burmese word alignments are used to map the already-built English dependency analyses onto the Burmese sentences, yielding a mapped low-quality Burmese dependency-parsing corpus of 1,799 entries together with 17,688 low-quality entries, 20,028 in total.
Burmese and English are mapped into the same semantic space by the linear transformation matrix W, and the word vectors of this shared space are used as the word-vector input of the dependency parsing model. The English dependency-parsing corpus described above is used to train the dependency parsing model on the Stanford neural network; the training process is shown in Fig. 2. The low-quality Burmese dependency-parsing corpus and the shared-space Burmese word vectors are then used to adjust the parameters of the dependency parsing model, yielding the Burmese dependency parsing model.
Fig. 3 shows the neural-network model with shared parameters; Source Data and Target Data are the English dependency-parsing corpus (source corpus) and the Burmese dependency-parsing corpus (target corpus), respectively.
The dimensionality of the word vectors affects model training differently; the results are shown in Table 1.
Table 1. Effect of word-vector dimensionality on model training
As Table 1 shows, the experimental results indicate that increasing the word-vector dimensionality improves the model: UAS rises by about two percentage points and LAS by about one percentage point.
In this embodiment the proposed method is compared with two other methods, specifically: the effect of a Burmese dependency parsing model trained on the Stanford neural-network model with embedded word2vec vectors, the effect of a model trained with the shared-semantic-space word vectors, and the effect of the model trained with both the idea of transfer learning and the shared-semantic-space word vectors. The results are shown in Table 2: the proposed method outperforms ordinary Stanford neural-network training, with UAS improving by about half a percentage point and LAS by about one percentage point.
Table 2. Comparison of results of different methods
The embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to the above embodiments; various changes can be made within the knowledge of a person skilled in the art without departing from the concept of the invention.
Claims (2)
1. A transfer-learning-based dependency parsing method for Burmese, characterized in that: first, a Burmese dependency-parsing corpus is obtained by English-Burmese mapping, and the Burmese and English corpora are mapped into the same semantic space; then an English dependency parsing model is trained with the bilingual word vectors, and, following the idea of transfer learning, the Burmese dependency-parsing corpus is added and the model parameters are adjusted to obtain the Burmese dependency parsing model.
2. The transfer-learning-based dependency parsing method for Burmese according to claim 1, characterized in that the specific steps of the method are as follows:
Step1, construct a Burmese dependency-parsing corpus from an English-Burmese bilingual parallel corpus;
Step1.1, segment the English and Burmese sides of the bilingual parallel corpus into words;
Step1.2, establish English-Burmese word correspondences;
Step1.3, parse the English corpus with the Stanford neural-network parser to obtain an English dependency-parsing corpus;
Step1.4, using the word correspondences from Step1.2, map the English dependency analyses onto the Burmese corpus to obtain the Burmese dependency-parsing corpus;
Step2, map the Burmese corpus and the English corpus into the same semantic space: find a linear transformation matrix W that realizes $\arg\min_W \sum_i \|X_i W - Z_i\|^2$, where $X_i$ and $Z_i$ denote the embeddings of the Burmese and English words of the i-th dictionary entry; the linear transformation W maps the Burmese and English corpora into the same semantic space, and the word vectors of this shared space are used as the input of the dependency parsing model;
Step3, train the dependency parsing model on the Stanford neural network using the English dependency-parsing corpus from Step1;
first, the input vectors required for training the Stanford neural-network parser and the English dependency-parsing corpus with annotated dependencies are provided; the input vectors are the word vector, the POS-tag vector and the feature-tag vector; the word vector is the shared-space vector obtained in Step2, while the POS-tag and feature-tag vectors are randomly initialized; the hidden layer uses $h = (W_1^{w}x_w + W_1^{t}x_t + W_1^{l}x_l + b_1)^3$ to train $[W_1^{w}, W_1^{t}, W_1^{l}]$, where $x_w$, $x_t$, $x_l$ denote the word vector, POS-tag vector and feature-tag vector fed to the hidden layer, $W_1^{w}$, $W_1^{t}$, $W_1^{l}$ are the corresponding weight matrices, and $b_1$ is the bias; the output layer uses the activation $p = \mathrm{softmax}(W_2 h)$, where $W_2$ is the output weight matrix;
Step4, obtain the Burmese dependency parsing model by transfer learning with shared parameters: use the Burmese dependency-parsing corpus obtained in Step1 and the shared-space Burmese word vectors obtained in Step2 to adjust the parameters of the English dependency parsing model from Step3, and thus train the Burmese dependency parsing model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910158572.5A CN110008467A (en) | 2019-03-04 | 2019-03-04 | A kind of interdependent syntactic analysis method of Burmese based on transfer learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110008467A true CN110008467A (en) | 2019-07-12 |
Family
ID=67166340
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910158572.5A Pending CN110008467A (en) | 2019-03-04 | 2019-03-04 | A kind of interdependent syntactic analysis method of Burmese based on transfer learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110008467A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102760121A (en) * | 2012-06-28 | 2012-10-31 | 中国科学院计算技术研究所 | Dependence mapping method and system |
CN104991890A (en) * | 2015-07-15 | 2015-10-21 | 昆明理工大学 | Method for constructing Vietnamese dependency tree bank on basis of Chinese-Vietnamese vocabulary alignment corpora |
CN106250367A (en) * | 2016-07-27 | 2016-12-21 | 昆明理工大学 | The method building the interdependent treebank of Vietnamese based on the Nivre algorithm improved |
CN107894982A (en) * | 2017-10-25 | 2018-04-10 | 昆明理工大学 | A kind of method based on the card Chinese word alignment language material structure interdependent treebank of Kampuchean |
Non-Patent Citations (10)
Title |
---|
CHAO XING ET AL: "Normalized word embedding and orthogonal transform for bilingual word translation", 《HUMAN LANGUAGE TECHNOLOGIES: THE 2015 ANNUAL CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ACL》 * |
GUO J ET AL: "A representation learning framework for multi-source transfer parsing", 《THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE》 * |
JIANG GUO ET AL: "Cross-lingual Dependency Parsing Based on Distributed Representations", 《PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING》 * |
MIKEL ARTETXE ET AL: "Learning principled bilingual mappings of word embeddings while preserving monolingual invariance", 《PROCEEDINGS OF THE 2016 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING》 * |
SAMUEL L. SMITH ET AL: "Offline bilingual word vectors, orthogonal transformations and the inverted softmax", 《HTTPS://ARXIV.ORG/ABS/1702.03859》 * |
TAL SCHUSTER ET AL: "Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing", 《HTTPS://ARXIV.53YU.COM/ABS/1902.09492V1》 * |
Li Fajie et al.: "Constructing a Vietnamese dependency treebank with the help of Chinese-Vietnamese word-aligned corpora", Journal of Chinese Information Processing *
Li Ying et al.: "Research on converting Vietnamese phrase-structure trees to dependency trees", Journal of Frontiers of Computer Science and Technology *
Guo Jiang: "Cross-lingual and cross-task natural language analysis based on distributed representations", China Doctoral Dissertations Full-text Database, Information Science and Technology *
Gao Guoji: "Cross-lingual text classification based on cross-lingual distributed representations", China Master's Theses Full-text Database, Information Science and Technology *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377918A (en) * | 2019-07-15 | 2019-10-25 | 昆明理工大学 | Merge the more neural machine translation method of the Chinese-of syntax analytic tree |
CN110377918B (en) * | 2019-07-15 | 2020-08-28 | 昆明理工大学 | Chinese-transcendental neural machine translation method fused with syntactic parse tree |
CN110489753A (en) * | 2019-08-15 | 2019-11-22 | 昆明理工大学 | Improve the corresponding cross-cutting sensibility classification method of study of neuromechanism of feature selecting |
CN110489753B (en) * | 2019-08-15 | 2022-06-14 | 昆明理工大学 | Neural structure corresponding learning cross-domain emotion classification method for improving feature selection |
CN110705253A (en) * | 2019-08-29 | 2020-01-17 | 昆明理工大学 | Burma language dependency syntax analysis method and device based on transfer learning |
CN110738057A (en) * | 2019-09-05 | 2020-01-31 | 中山大学 | text style migration method based on grammatical constraint and language model |
CN110738057B (en) * | 2019-09-05 | 2023-10-24 | 中山大学 | Text style migration method based on grammar constraint and language model |
CN111046946A (en) * | 2019-12-10 | 2020-04-21 | 昆明理工大学 | Burma language image text recognition method based on CRNN |
US11741318B2 (en) | 2021-03-25 | 2023-08-29 | Nec Corporation | Open information extraction from low resource languages |
CN114757167A (en) * | 2022-05-11 | 2022-07-15 | 昆明理工大学 | Vietnamese dependency syntactic analysis method based on parameter migration |
CN114925708A (en) * | 2022-05-24 | 2022-08-19 | 昆明理工大学 | Thai Chinese neural machine translation method fusing unsupervised dependency syntax |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110008467A (en) | A kind of interdependent syntactic analysis method of Burmese based on transfer learning | |
CN110334219B (en) | Knowledge graph representation learning method based on attention mechanism integrated with text semantic features | |
CN109635280A (en) | A kind of event extraction method based on mark | |
CN106383816B (en) | The recognition methods of Chinese minority area place name based on deep learning | |
CN111325029B (en) | Text similarity calculation method based on deep learning integrated model | |
CN109753660B (en) | LSTM-based winning bid web page named entity extraction method | |
CN107562792A (en) | A kind of question and answer matching process based on deep learning | |
CN109800437A (en) | A kind of name entity recognition method based on Fusion Features | |
CN108920445A (en) | A kind of name entity recognition method and device based on Bi-LSTM-CRF model | |
CN111274790B (en) | Chapter-level event embedding method and device based on syntactic dependency graph | |
CN111222318B (en) | Trigger word recognition method based on double-channel bidirectional LSTM-CRF network | |
Yuan-jie et al. | Web service classification based on automatic semantic annotation and ensemble learning | |
CN107832458A (en) | A kind of file classification method based on depth of nesting network of character level | |
CN113672718B (en) | Dialogue intention recognition method and system based on feature matching and field self-adaption | |
CN108021557A (en) | Irregular entity recognition method based on deep learning | |
CN114676255A (en) | Text processing method, device, equipment, storage medium and computer program product | |
CN115329088B (en) | Robustness analysis method of graph neural network event detection model | |
CN113488196A (en) | Drug specification text named entity recognition modeling method | |
Zhao et al. | Synchronously improving multi-user English translation ability by using AI | |
CN112699685A (en) | Named entity recognition method based on label-guided word fusion | |
CN115062109A (en) | Entity-to-attention mechanism-based entity relationship joint extraction method | |
CN110852089A (en) | Operation and maintenance project management method based on intelligent word segmentation and deep learning | |
US10339223B2 (en) | Text processing system, text processing method and storage medium storing computer program | |
CN113204975A (en) | Sensitive character wind identification method based on remote supervision | |
CN115687610A (en) | Text intention classification model training method, recognition device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190712 |
RJ01 | Rejection of invention patent application after publication |