CN103020148A - System and method for converting Chinese phrase structure tree banks into interdependent structure tree banks

Publication number
CN103020148A
Authority
CN
China
Prior art keywords
dependence
word
core
treebank
rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012104798011A
Other languages
Chinese (zh)
Inventor
邱锡鹏
赵建双
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University

Abstract

The invention belongs to the technical field of natural language processing and specifically relates to a system and a method for converting a Chinese phrase-structure treebank into a dependency-structure treebank. The method comprises: splitting complex tree structures; building a more accurate core (head) mapping table; handling special Chinese constructions with rules; establishing an annotation standard for dependency relation types; and determining dependency relation types with rules. The system comprises a splitter for splitting long sentences in the treebank into short sentences, a core mapping table for obtaining the initial dependency head node of each word, a dependency rule device for determining the final dependency head node of each word, and a dependency relation normalizer for determining the final dependency relations between words and forming the final dependency treebank. The invention converts the Penn Chinese Treebank into a dependency treebank that is more accurate, more standardized, and more reasonable.

Description

A system and method for converting a Chinese phrase-structure treebank into a dependency-structure treebank
Technical field
The invention belongs to the technical field of natural language processing and specifically relates to a system and method for converting a Chinese phrase-structure treebank into a dependency-structure treebank.
Background
As natural language processing has developed, rule-based research methods have gradually shown their limitations, and researchers increasingly use statistical methods to learn the regularities of natural language from real corpora. Syntactic parsing occupies a central position in natural language processing, and its quality strongly affects other technologies. Statistical methods are also the mainstream approach to parsing, so corpus data plays an important role. The accuracy and size of the corpus determine parsing performance at the most basic level: without a large, accurate corpus, even a good algorithm loses its effect. As corpora in which sentences carry deep syntactic annotation, treebanks are attracting growing interest.
Researchers have carried out a great deal of research and development on treebanks and have achieved considerable results. The annotation schemes these treebanks adopt differ greatly, but by descriptive method they fall into two kinds: phrase-structure trees and dependency trees. Worldwide, most large-scale treebanks are based on phrase structure. Among Chinese treebanks, those annotated with phrase structure also dominate, the best known being the Penn Chinese Treebank of the University of Pennsylvania.
Among grammar formalisms, dependency grammar has gradually gained researchers' attention because it is concise in form, easy to annotate, and convenient to apply. The scarcity of Chinese treebanks based on dependency syntax has undoubtedly limited the development of Chinese parsing. Annotating a treebank requires a complete annotation scheme and a standardized annotation process to guarantee quality, which is time-consuming and laborious. Although phrase structure and dependency structure differ in form, both describe the syntactic structure of a sentence and are therefore structurally consistent. Since phrase-structure treebanks are now plentiful, phrase structure can be converted into dependency structure according to the correspondence between the two, yielding the desired dependency treebank and avoiding a large amount of manual annotation.
Many researchers in China and abroad have attempted to convert phrase-structure treebanks into dependency treebanks. The mainstream method uses a core node (head) mapping table to find the core node at each level, makes the other nodes at the same level depend on this core node, and traverses the whole structure tree recursively. The treebank conversion tool PENN2MALT is the mainstream tool based on this idea. It provides core node mapping tables for the Penn Treebank and the Penn Chinese Treebank, and both the tables and its executable are freely available.
PENN2MALT works well on the English corpus of the Penn Treebank. However, because Chinese is complex and PENN2MALT's own rules are simple, the results of converting the Penn Chinese Treebank with PENN2MALT are not very good, and training a dependency parser on the converted corpus would hurt the parser's final performance. We therefore defined a large number of rules according to the characteristics of Chinese and developed our own rule-based conversion tool. The corpus converted with this tool is more accurate and more standardized than the corpus converted with PENN2MALT.
Summary of the invention
The object of the invention is to propose a rule-based Chinese treebank conversion system and method that converts the Penn Chinese Treebank phrase-structure treebank into a more reasonable and more standardized dependency treebank.
The method proposed by the invention for converting a Chinese phrase-structure treebank into a dependency-structure treebank comprises the following steps:
1) Read in the Penn Chinese Treebank, and use the splitter to split the long sentences in the treebank into short sentences.
2) Determine the final core mapping table, and use it to obtain the initial dependency head node of each word.
3) Determine the final dependency head node of each word with the dependency rule device.
4) Establish the annotation standard for dependency relation types, and use the dependency relation normalizer to determine the final dependency relation between words, forming the final dependency treebank.
The invention mainly comprises: splitting complex tree structures; building a more accurate core mapping table, and excluding punctuation, modal particles, and interjections from serving as core words; handling the special grammatical constructions of Chinese with rules; establishing the annotation standard for dependency relation types; and determining dependency relation types with rules. The main content of the invention is introduced below, item by item.
One, splitting complex tree structures
The Penn Chinese Treebank contains many long sentences. When such a sentence is annotated as a single structure tree, the tree is very complex: it may contain several root nodes that have no dependency relation with one another, so converting the long sentence directly into a dependency tree would greatly reduce the accuracy of the dependency treebank. The invention uses a splitter to cut these long sentences into several short sentences, each of which forms an independent structure tree, which reduces the complexity of the structure trees. The regenerated structure trees are then converted into dependency trees, yielding a dependency treebank that is more accurate and more standardized. The specific rule is: according to the characteristics of the tree structure, each child node of the root node that is a comma or a semicolon is taken as a split point; the long sentence is split into short sentences at these points, and each tree produced by the split takes the original root node as its root node.
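The splitting rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Node` class, the function name `split_long_sentence`, the set of split marks, and the choice to keep each split mark at the end of the segment it closes are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: a phrase label or POS tag, plus a word on leaves."""
    label: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


# Assumption: ASCII and full-width commas/semicolons all count as split points.
SPLIT_MARKS = {",", ";", "，", "；"}


def split_long_sentence(root: Node) -> List[Node]:
    """Cut the root's direct children at comma/semicolon leaves.

    Each short sentence becomes a new tree that reuses the original root
    label. The split mark stays with the segment it ends (an assumption;
    the text does not say where the mark goes).
    """
    segments: List[List[Node]] = []
    current: List[Node] = []
    for child in root.children:
        current.append(child)
        if child.word in SPLIT_MARKS:
            segments.append(current)
            current = []
    if current:  # trailing material after the last split point
        segments.append(current)
    return [Node(root.label, children=segment) for segment in segments]
```

A tree whose root has no comma or semicolon child comes back as a single segment, which matches the case in which a sentence does not need splitting.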
Two, building a more accurate core mapping table
Although the source code of the PENN2MALT conversion tool is not open, its core mapping table has been published. Through extensive experiments, the invention found that the corpus converted with the published core mapping table is not very satisfactory. A new core mapping table was therefore built by studying the Penn Chinese Treebank, as shown in Table 1.
Table 1
The core mapping table is used to determine which child node in the structure tree is the core node (head) of its parent node. Each parent node label in the table has a rule set. A rule set has two parts: a scan direction and a set of core phrase types. In the table, l means that the child node sequence is scanned from left to right, and r means that it is scanned from right to left.
Using the core mapping table, the core node of each node can be obtained with the following algorithm.
1. Check whether the node's label is in the table. If it is not, do nothing; otherwise go to 2.
2. Scan the child nodes in the direction given by the first rule in the rule set. If the set of core phrase types is empty, take the first node scanned as the core node and go to 3. Otherwise, look for each label in the set of core phrase types in turn, rescanning the children for each label; if one is found, go to 3; otherwise apply the second rule, the third rule, and so on in the same way.
3. Check whether the node is a leaf node. If it is, stop; otherwise apply steps 1, 2, and 3 recursively to each of its children.
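The three steps above can be sketched in Python. The table here is a hypothetical stand-in, since Table 1 is not available in this text: only the NP entry (direction r; labels NP, NN, NT, NR, QP, IP, PN) is taken from the worked example in the embodiment, and the trailing rule with an empty label set is an assumed fallback. `Node`, `find_core`, and `mark_cores` are illustrative names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    label: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    core: Optional["Node"] = None  # filled in by mark_cores


Rule = Tuple[str, List[str]]  # (scan direction "l" or "r", candidate labels)

# Hypothetical excerpt; the real Table 1 is not reproduced in the source.
CORE_TABLE: Dict[str, List[Rule]] = {
    "NP": [("r", ["NP", "NN", "NT", "NR", "QP", "IP", "PN"]), ("r", [])],
}


def base(label: str) -> str:
    """Strip function tags, e.g. NP-SBJ -> NP."""
    return label.split("-")[0]


def find_core(node: Node, table: Dict[str, List[Rule]] = CORE_TABLE) -> Optional[Node]:
    rules = table.get(base(node.label))
    if rules is None or not node.children:
        return None  # step 1: label not in the table, leave the node alone
    for direction, wanted in rules:  # step 2: try each rule in order
        scan = node.children if direction == "l" else list(reversed(node.children))
        if not wanted:
            return scan[0]  # empty label set: first node scanned
        for label in wanted:  # one full scan per candidate label
            for child in scan:
                if base(child.label) == label:
                    return child
    return None


def mark_cores(node: Node, table: Dict[str, List[Rule]] = CORE_TABLE) -> None:
    """Step 3: record the core child, then recurse into every child."""
    node.core = find_core(node, table)
    for child in node.children:
        mark_cores(child, table)
```

Scanning once per candidate label, rather than once per child, is what gives earlier labels in a rule precedence over child position.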
Three, excluding punctuation, modal particles, and interjections as core words
Because the core mapping table does not list every case, its last rule simply uses r or l to take the rightmost or leftmost word as the core word. As a result, many sentences end up with a punctuation mark, modal particle, or interjection as the core word. These words only play an auxiliary role in a sentence and cannot serve as core words. When looking up core nodes with the core mapping table, the invention therefore excludes punctuation, modal particles, and interjections as core nodes, which makes the converted corpus more accurate, more standardized, and more reasonable.
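A minimal sketch of this exclusion, applied to the rightmost/leftmost fallback. The tag set is an assumption: it uses the Penn Chinese Treebank tags PU (punctuation), SP (sentence-final particle), and IJ (interjection) as stand-ins for the three excluded categories, and `fallback_core` is a hypothetical name. Returning the first scanned child when every child is excluded is also an assumption, made only so the function always yields a result.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed mapping of "punctuation, modal particle, interjection" onto POS tags.
EXCLUDED_TAGS = {"PU", "SP", "IJ"}

Child = Tuple[str, str]  # (tag, word)


def fallback_core(children: Sequence[Child], direction: str) -> Optional[Child]:
    """Pick the leftmost ("l") or rightmost ("r") child that is allowed as a core."""
    scan: List[Child] = list(children) if direction == "l" else list(reversed(children))
    for child in scan:
        if child[0] not in EXCLUDED_TAGS:
            return child
    # Every child is excluded: fall back to the plain rule (assumption).
    return scan[0] if scan else None
```

For example, with children tagged VV, SP, PU and direction r, the plain rule would pick the punctuation mark, while this version skips back to the verb.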
Four, handling the special grammatical constructions of Chinese with rules
Chinese has some special grammatical constructions for which the core mapping table alone gives inaccurate results. The invention studied these constructions in depth, derived a large number of rules, and built them into the dependency rule device, which produces more accurate and more reasonable results.
1. The 把 (ba) construction and the 被 (bei) construction
The 把 sentence and the 被 sentence are special sentence patterns in Chinese. They have received much attention in linguistics, and many researchers have studied them extensively. Given the way the Penn Chinese Treebank annotates them, finding their dependencies only with core mapping table rules, as PENN2MALT does, yields dependencies that do not conform to Chinese grammar and semantics. We therefore defined our own dependency standard for them, based on the grammatical features of Chinese, and implemented it with rules. The rule is: among the children of the node immediately following the 把 or 被 node, if there is a subject-predicate or subject-verb-object structure, then the subject and the predicate both depend on the 把 or 被 node, as its objects.
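The re-attachment described by this rule might look like the sketch below. It assumes that the core word of every subtree has already been computed and is stored as a token index, that the 把 and 被 words carry the tags BA, LB, or SB, and that the subject is marked with a -SBJ or -SUB function tag. All names (`Node`, `apply_ba_bei_rule`, `BA_BEI_TAGS`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    label: str  # phrase label or POS tag
    core: int   # token index of the subtree's core word (assumed precomputed)
    children: List["Node"] = field(default_factory=list)


# Assumption: these POS tags mark the ba word and the bei word.
BA_BEI_TAGS = {"BA", "LB", "SB"}

Arc = Tuple[int, str]  # (head token index, relation)


def apply_ba_bei_rule(parent: Node, arcs: Dict[int, Arc]) -> None:
    """Make the subject and predicate after a ba/bei word depend on that word.

    `arcs` maps a dependent token index to (head index, relation) and is
    updated in place.
    """
    kids = parent.children
    for marker, clause in zip(kids, kids[1:]):
        if marker.label not in BA_BEI_TAGS:
            continue
        subject = next(
            (c for c in clause.children if c.label.endswith(("-SBJ", "-SUB"))), None
        )
        predicate = next(
            (c for c in clause.children if c.label.split("-")[0] == "VP"), None
        )
        if subject is not None and predicate is not None:
            arcs[subject.core] = (marker.core, "object")
            arcs[predicate.core] = (marker.core, "object")
```

The rule only fires when both a subject and a predicate are found, which mirrors the "subject-predicate or subject-verb-object" condition in the text.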
2. The 得 (de) construction
The 得 construction occurs frequently in Chinese. It usually appears in a verb phrase, immediately after the verb, and acts as an auxiliary to that verb. The core word of the 得 node is clearly the verb before it, but the core word of the object that follows it is the 得 node.
3. Coordinate structures
Chinese sentences often contain coordinate structures, and in the Penn Chinese Treebank the coordinated nouns are annotated inside the same phrase structure. None of these coordinated nouns inherently depends on another, and such a structure has not just one core word but also several secondary core words, so a core mapping table alone cannot handle it. The invention defines a standard: the first noun is taken as the core word; a conjunction connecting coordinated nouns depends on the noun that follows it; and if the coordinated nouns are separated by a pause mark (、), the pause mark depends on the noun before it.
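A small sketch of this coordination standard over a flat list of tagged tokens. The noun tags (NN, NR, NT), the conjunction tag CC, and the representation of the pause mark as a PU token with the word 、 are assumptions, as is attaching the remaining coordinated nouns to the first noun, which the text implies but does not state. `attach_coordination` is a hypothetical name.

```python
from typing import Dict, List, Tuple

NOUN_TAGS = {"NN", "NR", "NT"}  # assumed noun tags
PAUSE_MARK = "、"


def attach_coordination(tokens: List[Tuple[str, str]]) -> Dict[int, int]:
    """Return {dependent index: head index} for one coordinated noun phrase.

    `tokens` is a list of (word, tag) pairs in sentence order.
    """
    nouns = [i for i, (_, tag) in enumerate(tokens) if tag in NOUN_TAGS]
    heads: Dict[int, int] = {}
    if not nouns:
        return heads
    first = nouns[0]  # the first noun is the core word
    for i in nouns[1:]:
        heads[i] = first  # assumption: other conjuncts attach to the first noun
    for i, (word, tag) in enumerate(tokens):
        if tag == "CC":
            following = [n for n in nouns if n > i]
            if following:
                heads[i] = following[0]  # conjunction -> noun after it
        elif tag == "PU" and word == PAUSE_MARK:
            preceding = [n for n in nouns if n < i]
            if preceding:
                heads[i] = preceding[-1]  # pause mark -> noun before it
    return heads
```

For a phrase of the form noun, pause mark, noun, conjunction, noun, the pause mark attaches to the first noun and the conjunction attaches to the last noun.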
4. Special verb phrase structures
The Penn Chinese Treebank has many annotation conventions, including labels that distinguish certain special verb structures. These labels can be used directly to standardize the dependency syntax. The labels of these special verb phrase structures are VCD, VRD, VSB, VCP, VPT, and VNV. Based on a study of these special verb phrase structures, we defined the dependency standard shown in Table 2:
Table 2 (dependency standard for VCD, VRD, VSB, VCP, VPT, and VNV; table image not reproduced)
Five, establishing the annotation standard for dependency relation types
Because PENN2MALT's rules are simple and Chinese itself is complex, the dependency relation types defined by PENN2MALT are not very accurate. Based on an in-depth study of the Penn Chinese Treebank corpus and of the converted dependency treebank corpus, the invention defines the annotation standard shown in Table 3:
Table 3 (annotation standard for dependency relation types; table image not reproduced)
Six, utilize the method for rule to determine the dependence type
The present invention is by the dependence normalizer, seeks dependence between word and the word from two aspects:
1. from PennChineseTreebank Chinese treebank mark, find their dependence.
2. find their dependence from the characteristics of the characteristics of word self and its dependence word.
For first aspect, its specific rules is:
1) in the PennChineseTreebank Chinese treebank, vertex ticks is that the dependence with its core word of DVP, ADVP is decided to be the adverbial modifier; Vertex ticks is that the dependence with its core word of DNP, DP, ADJP is decided to be attribute.
2) in the PennChineseTreebank Chinese treebank, vertex ticks is decided to be subject for the dependence with its core word of-SUB; Vertex ticks is decided to be object for the dependence with its core word of-OBJ; Vertex ticks is decided to be the adverbial modifier for the dependence with its core word of-ADV; Vertex ticks is decided to be complement for the dependence with its core word of-EXT.
3) in the PennChineseTreebank Chinese treebank, vertex ticks is that the dependence with its non-core node of VRD, VCP, VPT is decided to be complement; Vertex ticks is that the dependence with its non-core node of VCD is decided to be side by side; Vertex ticks is that the dependence with its non-core node of VSB is decided to be interlock; Vertex ticks is that the dependence with its non-core node of VNV is decided to be the query interlock.
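The three first-aspect rules amount to lookup tables keyed on treebank labels, as in the sketch below. The function and table names are hypothetical. Two assumptions are labeled in the code: the subject tag is accepted in both spellings, because the text writes -SUB while the worked example uses NP-SBJ, and rule 1) is tried before rule 2), following the order in which the rules are listed.

```python
from typing import Dict, Optional

# Rule 1): phrase label of the dependent node -> relation to its core word.
PHRASE_RELATIONS: Dict[str, str] = {
    "DVP": "adverbial", "ADVP": "adverbial",
    "DNP": "attribute", "DP": "attribute", "ADJP": "attribute",
}

# Rule 2): function tag of the dependent node -> relation to its core word.
# Assumption: "SBJ" is accepted alongside the "SUB" spelling used in the text.
FUNCTION_TAG_RELATIONS: Dict[str, str] = {
    "SUB": "subject", "SBJ": "subject",
    "OBJ": "object", "ADV": "adverbial", "EXT": "complement",
}

# Rule 3): label of the parent verb compound -> relation of its non-core child.
VERB_COMPOUND_RELATIONS: Dict[str, str] = {
    "VRD": "complement", "VCP": "complement", "VPT": "complement",
    "VCD": "coordination",
    "VSB": "serial verb",
    "VNV": "interrogative serial verb",
}


def relation_for_node(label: str) -> Optional[str]:
    """Apply rules 1) and 2) to a node label such as 'DNP' or 'NP-OBJ'."""
    base, _, tags = label.partition("-")
    if base in PHRASE_RELATIONS:
        return PHRASE_RELATIONS[base]
    for tag in tags.split("-"):
        if tag in FUNCTION_TAG_RELATIONS:
            return FUNCTION_TAG_RELATIONS[tag]
    return None


def relation_for_noncore_child(parent_label: str) -> Optional[str]:
    """Apply rule 3) to the non-core child of a special verb compound."""
    return VERB_COMPOUND_RELATIONS.get(parent_label.split("-")[0])
```

Both functions return `None` when no annotation rule applies, which leaves room for the second-aspect rules to decide the relation.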
For the second aspect, the specific rules are shown in Table 4:
Table 4
Because these rules can conflict with one another, they must be assigned priorities. From high to low, the priority is as follows. First come the rules in the second-aspect table whose dependency relation type is root node, tense, mood, exclamation, punctuation, two particle constructions whose particles are missing from this text, the 得 (de) construction, or the 地 (de) construction. Next come rules 1), 2), and 3) of the first aspect. Last come the rules in the second-aspect table whose dependency relation type is coordination, association, prepositional object, quantity, subject, object, attribute, adverbial, or complement. Applying the rules strictly in this priority order yields accurate dependency relations.
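One way to realize such a strict priority order is to group rules into tiers and return the first match. The sketch below shows only the mechanism: the three toy rules are hypothetical stand-ins for the rule tables (Table 4 is not available in this text), and the context keys `tag`, `label`, `head_tag`, and `left_of_head` are invented for illustration.

```python
from typing import Callable, Dict, List, Optional

Context = Dict[str, object]
Rule = Callable[[Context], Optional[str]]


def punctuation_rule(ctx: Context) -> Optional[str]:
    """Stand-in for a high-priority second-aspect rule."""
    return "punctuation" if ctx.get("tag") == "PU" else None


def annotation_rule(ctx: Context) -> Optional[str]:
    """Stand-in for the first-aspect rules (only -OBJ shown)."""
    return "object" if str(ctx.get("label", "")).endswith("-OBJ") else None


def subject_rule(ctx: Context) -> Optional[str]:
    """Stand-in for a low-priority second-aspect rule."""
    if ctx.get("tag") == "NN" and ctx.get("head_tag") == "VV" and ctx.get("left_of_head"):
        return "subject"
    return None


# Tiers are listed from highest to lowest priority.
PRIORITY_TIERS: List[List[Rule]] = [
    [punctuation_rule],
    [annotation_rule],
    [subject_rule],
]


def resolve_relation(ctx: Context, tiers: List[List[Rule]] = PRIORITY_TIERS) -> Optional[str]:
    """Return the relation from the first rule that matches, in priority order."""
    for tier in tiers:
        for rule in tier:
            relation = rule(ctx)
            if relation is not None:
                return relation
    return None
```

With this arrangement a conflict never needs explicit arbitration: a noun labeled NP-OBJ that also stands to the left of a verb is resolved by the middle tier before the lower tier is consulted.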
The invention also provides a system for converting a Chinese phrase-structure treebank into a dependency-structure treebank, characterized in that the system comprises:
a splitter, for splitting the long sentences in the treebank into short sentences;
a core mapping table, for obtaining the initial dependency head node of each word;
a dependency rule device, for determining the final dependency head node of each word;
a dependency relation normalizer, for determining the final dependency relation between words and forming the final dependency treebank.
The rule-based Chinese treebank conversion system and method provided by the invention convert the Penn Chinese Treebank phrase-structure treebank into a dependency treebank that is more accurate, more standardized, and more reasonable.
Description of drawings
Fig. 1: System flowchart.
Fig. 2: Example phrase-structure tree.
Fig. 3: Example dependency tree obtained using only the core mapping table.
Fig. 4: Example of the final dependency tree after processing.
Embodiment
The invention is described in further detail below with reference to the drawings and embodiments.
Fig. 1 is the flowchart of the system of the invention. Its concrete steps are:
a) read in the Penn Chinese Treebank, and use the splitter to split the long sentences in the treebank into short sentences;
b) determine the final core mapping table, and use the core mapping table to obtain the initial dependency head node of each word;
c) determine the final dependency head node of each word with the dependency rule device;
d) establish the annotation standard for dependency relation types, and use the dependency relation normalizer to determine the final dependency relation between words, forming the final dependency treebank.
Embodiment 1
Consider a sentence from the Penn Chinese Treebank that reads roughly "When feeding, the mosquito sucks the plant cell mycin into its body." Its phrase structure is shown in Fig. 2, and the dependency tree obtained after the final conversion is shown in Fig. 4. The concrete steps for converting this phrase-structure tree into a dependency tree are analyzed below.
First, the phrase-structure tree is read in as a tree. Because no child of its top-level node is a comma or semicolon, it is judged not to be a long sentence and does not need to go through the splitter.
Then it is processed with the rules of the core mapping table. The process is illustrated with one noun phrase from the sentence. As shown in Fig. 2, the three words "plant", "cell", and "mycin" form a noun phrase whose parent node is NP-SBJ, so the rule set for parent node NP is looked up in the mapping table. Because the first column of the rule set is r, scanning proceeds from right to left. In this rule set, the labels to look for are, in order, NP, NN, NT, NR, QP, IP, and PN. Because the part of speech of all three words is NN, the first scan finds no NP node. In the second scan, which looks for NN, the first word scanned is the core node sought, so "mycin" is taken as the core word of this phrase. Scanning of this phrase then ends, and the next phrase is processed. The whole phrase-structure tree is traversed in this way, and the resulting dependency tree is shown in Fig. 3.
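Once a core child can be chosen at every node, the traversal that turns the tree into dependency arcs is a short recursion: within each node, the core word of every non-core child depends on the core word of the core child. The sketch below illustrates this under assumptions. The core selection is passed in as a function, and `rightmost_child` is a hypothetical stand-in for the mapping-table lookup that happens to give the same answer for the noun phrase in this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    label: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


Arc = Tuple[str, str]  # (dependent word, head word)


def rightmost_child(node: Node) -> Node:
    """Hypothetical stand-in for the core mapping table lookup."""
    return node.children[-1]


def to_dependencies(node: Node, pick_core: Callable[[Node], Node], arcs: List[Arc]) -> str:
    """Return the core word of `node`, appending (dependent, head) arcs on the way."""
    if not node.children:
        return node.word or ""
    core_child = pick_core(node)
    core_word = to_dependencies(core_child, pick_core, arcs)
    for child in node.children:
        if child is not core_child:
            dependent = to_dependencies(child, pick_core, arcs)
            arcs.append((dependent, core_word))
    return core_word
```

For the noun phrase above, this yields the arcs plant → mycin and cell → mycin, with "mycin" returned as the core word that the enclosing phrase will attach further up.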
Next, the dependency rule device finds that the sentence matches the 把 (ba) construction, so the 把-construction rule in the rule device is applied. Because the children of the node immediately following the 把 node form a subject-verb-object structure, the subject and the predicate of that structure are both made to depend on the 把 word, as its objects. That is, the two words "mycin" and "suck" both depend on the 把 word, and their dependency relation is set to "object".
Finally, the dependency relation normalizer determines the dependency relations. For example, "mosquito" in this sentence is a noun, its core word is a verb, and it stands to the left of its core word, so its dependency relation is defined as "subject". Applying the rules in the dependency relation normalizer with their assigned priorities yields the final dependency tree shown in Fig. 4.
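The "mosquito" example corresponds to a rule based only on word features. A minimal sketch of that single rule follows; the noun and verb tag sets are assumptions, the function name is hypothetical, and no other second-aspect rule is shown because Table 4 is not available in this text.

```python
from typing import Optional

NOUN_TAGS = {"NN", "NR", "NT", "PN"}  # assumed noun-like tags
VERB_TAGS = {"VV", "VA", "VC", "VE"}  # assumed verb tags


def subject_relation(dep_tag: str, head_tag: str, dep_index: int, head_index: int) -> Optional[str]:
    """A noun that stands to the left of its verb core word is a subject."""
    if dep_tag in NOUN_TAGS and head_tag in VERB_TAGS and dep_index < head_index:
        return "subject"
    return None
```

A noun to the right of its verb is left undecided here and would fall through to other rules.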
Because no evaluation standard has yet been established for Chinese dependency treebanks, the conversion cannot be evaluated automatically. To verify the conversion, 400 samples were drawn at random from the final conversion results of the system and checked manually. The results are shown in Table 5:
Table 5 (manual verification results for the 400 samples; table image not reproduced)
The above data show that the invention achieves good results, with high accuracy and a high degree of standardization.

Claims (7)

1. A method for converting a Chinese phrase-structure treebank into a dependency-structure treebank, characterized in that the concrete steps are as follows:
a) read in the Penn Chinese Treebank, and use a splitter to split the long sentences in the treebank into short sentences;
b) determine the final core mapping table, and use the core mapping table to obtain the initial dependency head node of each word;
c) determine the final dependency head node of each word with a dependency rule device;
d) establish an annotation standard for dependency relation types, and use a dependency relation normalizer to determine the final dependency relation between words, forming the final dependency treebank.
2. The method according to claim 1, characterized in that: the splitter in step a), according to the characteristics of the tree structure, takes each child node of the root node that is a comma or a semicolon as a split point, splits the long sentence into short sentences, and gives each tree produced by the split the original root node as its root node.
3. The method according to claim 1, characterized in that: the core mapping table in step b) is modeled on the format of the core mapping table published with the PENN2MALT conversion tool and is a more accurate core mapping table determined according to the characteristics of the Penn Chinese Treebank and of dependency trees; it excludes punctuation, modal particles, and interjections as core words.
The method according to claim 1, characterized in that: the dependency rule device in step c), according to the characteristics of Chinese grammar and the annotation characteristics of the Penn Chinese Treebank, determines concrete rules for the dependency structures that cannot be determined with the core mapping table of step b) alone, and thereby determines the final dependency head node of each word; wherein the concrete rules are:
a) rule for the 把 (ba) construction and the 被 (bei) construction: among the children of the node immediately following the 把 or 被 node, if there is a subject-predicate or subject-verb-object structure, then the subject and the predicate both depend on the 把 or 被 node, as its objects;
b) rule for the 得 (de) construction: the 得 node takes the verb before it as its core word, and the object after it takes the 得 node as its core word;
c) rule for coordinate structures: the first noun is taken as the core word; a conjunction connecting coordinated nouns depends on the noun that follows it; if the coordinated nouns are separated by a pause mark, the pause mark depends on the noun before it;
d) rule for special verb phrases: the labels of special verb phrase structures include VCD, VRD, VSB, VCP, VPT, and VNV; from the study of these special verb phrase structures, the following rule table is obtained:
(rule table image not reproduced)
4. The method according to claim 1, characterized in that the annotation standard for dependency relation types in step d) is as shown in the following table:
(table image not reproduced)
5. The method according to claim 1, characterized in that: the dependency relation normalizer in step d) determines the dependency relation between words from two aspects:
1) from the annotation of the Penn Chinese Treebank;
2) from the features of the word itself and of the word it depends on;
wherein, for the first aspect, the specific rules are:
1. in the Penn Chinese Treebank, a node labeled DVP or ADVP has the relation "adverbial" to its core word; a node labeled DNP, DP, or ADJP has the relation "attribute" to its core word;
2. in the Penn Chinese Treebank, a node whose label carries the suffix -SUB, -OBJ, -ADV, or -EXT has the relation "subject", "object", "adverbial", or "complement", respectively, to its core word;
3. in the Penn Chinese Treebank, under a node labeled VRD, VCP, or VPT, the non-core node has the relation "complement"; under a node labeled VCD, the non-core node has the relation "coordination"; under a node labeled VSB, the non-core node has the relation "serial verb"; under a node labeled VNV, the non-core node has the relation "interrogative serial verb";
for the second aspect, the specific rules are given in the following rule table:
6. The method according to claim 5, characterized in that: because the rules of the first aspect and the second aspect can conflict, the rules are assigned priorities, which from high to low are: first, the second-aspect rules whose dependency relation type is root node, tense, mood, exclamation, punctuation, two particle constructions whose particles are missing from this text, the 得 (de) construction, or the 地 (de) construction; next, rules 1, 2, and 3 of the first aspect; last, the second-aspect rules whose dependency relation type is coordination, association, prepositional object, quantity, subject, object, attribute, adverbial, or complement; applying the rules strictly in this priority order yields accurate dependency relations.
7. A system for converting a Chinese phrase-structure treebank into a dependency-structure treebank, characterized in that the system comprises:
a splitter, for splitting the long sentences in the treebank into short sentences;
a core mapping table, for obtaining the initial dependency head node of each word;
a dependency rule device, for determining the final dependency head node of each word;
a dependency relation normalizer, for determining the final dependency relation between words and forming the final dependency treebank.
CN2012104798011A 2012-11-23 2012-11-23 System and method for converting Chinese phrase structure tree banks into interdependent structure tree banks Pending CN103020148A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012104798011A CN103020148A (en) 2012-11-23 2012-11-23 System and method for converting Chinese phrase structure tree banks into interdependent structure tree banks

Publications (1)

Publication Number Publication Date
CN103020148A true CN103020148A (en) 2013-04-03

Family

ID=47968752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012104798011A Pending CN103020148A (en) 2012-11-23 2012-11-23 System and method for converting Chinese phrase structure tree banks into interdependent structure tree banks

Country Status (1)

Country Link
CN (1) CN103020148A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740235A (en) * 2016-01-29 2016-07-06 昆明理工大学 Phrase tree to dependency tree transformation method capable of combining Vietnamese grammatical features
CN104199811B (en) * 2014-09-10 2017-06-16 上海携程商务有限公司 Short sentence analytic modell analytical model method for building up and system
CN107748742A (en) * 2017-06-16 2018-03-02 平安科技(深圳)有限公司 A kind of method, terminal and equipment based on syntax dependence extraction centre word
CN110457466A (en) * 2019-06-28 2019-11-15 谭浩 Generate method, computer readable storage medium and the terminal device of interview report
WO2020233261A1 (en) * 2019-07-12 2020-11-26 之江实验室 Natural language generation-based knowledge graph understanding assistance system
CN113486220A (en) * 2021-07-28 2021-10-08 平安国际智慧城市科技股份有限公司 Verb phrase component labeling method and device, electronic equipment and storage medium
CN115017913A (en) * 2022-04-21 2022-09-06 广州世纪华轲科技有限公司 Semantic component analysis method based on master-slave framework mode
CN115017902A (en) * 2022-06-09 2022-09-06 青海师范大学 Deep learning-based Tibetan phrase structure recognition model construction method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101201819A (en) * 2007-11-28 2008-06-18 北京金山软件有限公司 Method and system for transferring tree bank

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
周惠巍等: "短语结构到依存结构树库转换研究", 《大连理工大学学报》, vol. 50, no. 4, 31 July 2010 (2010-07-31), pages 610 - 613 *
李正华等: "短语结构树库向依存结构树库转化研究", 《中文信息学报》, vol. 22, no. 6, 30 November 2008 (2008-11-30), pages 14 - 19 *
王跃龙等: "短语结构树到依存树的转换", 《第三届学生计算语言学研讨会论文集》, 31 August 2006 (2006-08-31) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104199811B (en) * 2014-09-10 2017-06-16 上海携程商务有限公司 Short sentence analytic modell analytical model method for building up and system
CN105740235A (en) * 2016-01-29 2016-07-06 昆明理工大学 Phrase tree to dependency tree transformation method capable of combining Vietnamese grammatical features
CN105740235B (en) * 2016-01-29 2019-02-19 昆明理工大学 It is a kind of merge Vietnamese grammar property tree of phrases to dependency tree conversion method
CN107748742A (en) * 2017-06-16 2018-03-02 平安科技(深圳)有限公司 A kind of method, terminal and equipment based on syntax dependence extraction centre word
WO2018227995A1 (en) * 2017-06-16 2018-12-20 平安科技(深圳)有限公司 Method, terminal, device and storage medium for extracting head based on syntax dependency relationship
CN110457466A (en) * 2019-06-28 2019-11-15 谭浩 Generate method, computer readable storage medium and the terminal device of interview report
WO2020233261A1 (en) * 2019-07-12 2020-11-26 之江实验室 Natural language generation-based knowledge graph understanding assistance system
CN113486220A (en) * 2021-07-28 2021-10-08 平安国际智慧城市科技股份有限公司 Verb phrase component labeling method and device, electronic equipment and storage medium
CN113486220B (en) * 2021-07-28 2024-01-23 平安国际智慧城市科技股份有限公司 Verb phrase component labeling method, verb phrase component labeling device, electronic equipment and storage medium
CN115017913A (en) * 2022-04-21 2022-09-06 广州世纪华轲科技有限公司 Semantic component analysis method based on master-slave framework mode
CN115017902A (en) * 2022-06-09 2022-09-06 青海师范大学 Deep learning-based Tibetan phrase structure recognition model construction method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130403