CN107894982A

CN107894982A - Method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus

Info

Publication number: CN107894982A
Application number: CN201711005546.6A
Authority: CN
Inventors: 严馨; 李思远; 郭剑毅; 周枫; 王红斌
Original assignee: Kunming University of Science and Technology
Current assignee: Kunming University of Science and Technology
Priority date: 2017-10-25
Filing date: 2017-10-25
Publication date: 2018-04-10

Abstract

The present invention relates to a method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus, and belongs to the field of natural language processing. The invention first builds a Khmer-Chinese word-aligned parallel corpus: word alignment is first performed with GIZA++, and, because GIZA++ alignments suffer from data sparseness, fuzzy matching against a bilingual dictionary and word-vector similarity comparison are then applied to improve alignment accuracy. Once the word-aligned corpus is complete, a Chinese dependency tree corpus is built. The Khmer-Chinese word-aligned corpus and the Chinese dependency tree corpus are then combined to build a Khmer dependency tree corpus, which is manually adjusted to obtain the final Khmer dependency tree corpus. The method simplifies the manual annotation of dependency relations in Khmer sentences and saves a great deal of time; building the bilingual word-aligned corpus with dictionary matching and word-vector similarity effectively improves the accuracy of the dependency treebank.

Description

Method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus

Technical field

The present invention relates to a method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus, and belongs to the technical field of natural language processing.

Background art

Building a Khmer dependency treebank is an important step in Khmer-Chinese translation work and plays a vital role in research on Khmer. Political and economic exchanges between China and Southeast Asia are increasingly frequent, and Cambodia, as an important country in the region, has close relations with China, so research on Khmer is particularly important for exchanges between the two countries. Syntactic analysis of Khmer and the construction of a Khmer dependency treebank occupy a major place in Khmer research. A sound Khmer dependency annotation scheme and Khmer dependency treebank can substantially benefit Khmer-Chinese translation and higher-level Khmer applications such as lexical analysis, syntactic analysis, semantic analysis and machine translation.

Summary of the invention

The invention provides a method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus, to address problems such as the incompleteness of existing Khmer dependency treebanks and the difficulty of analyzing dependency relations in Khmer sentences.

The technical solution of the invention is a method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus, the method comprising the following steps:

Step1: build the Khmer-Chinese word-aligned parallel corpus;

Step1.1: collect Khmer-Chinese parallel sentence pairs;

Step1.2: perform word alignment training on the Khmer-Chinese parallel sentence pairs using GIZA++;

Step1.3: apply fuzzy dictionary matching to the sparse data using a bilingual dictionary;

Step1.4: for Khmer words that still cannot be aligned after fuzzy dictionary matching, apply word-vector similarity comparison to improve word alignment accuracy; here, word-vector similarity comparison means comparing the word vector of a Chinese word that cannot be aligned in the original sentence pair with the word vectors of the Chinese translations of the Khmer words that cannot be aligned in the same sentence pair;

Step2: build the Chinese dependency tree corpus;

Step2.1: perform Chinese word segmentation on the Chinese sentences of the Khmer-Chinese word-aligned parallel sentence-pair corpus;

Step2.2: perform part-of-speech tagging on the segmented Chinese corpus;

Step2.3: apply the LTP language processing platform to the part-of-speech-tagged Chinese corpus to build the Chinese dependency treebank and obtain the Chinese dependency relations;

Step3: combine the Khmer-Chinese word-aligned parallel corpus and the Chinese dependency tree corpus to build the Khmer dependency tree corpus;

Step3.1: map the Chinese dependency relations onto the Khmer sentences through the Khmer-Chinese word-aligned parallel sentence-pair corpus, thereby obtaining a Khmer dependency treebank;

Step3.2: build the dependency relations of each Khmer sentence from the Khmer dependency treebank, adjust them according to the changes in left and right adjunct relations in the Khmer sentence, and then apply manual correction to obtain the final Khmer dependency treebank.
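The mapping in Step3.1 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `project_dependencies`, the arc representation (dependent index, head index, relation label, with 1-based indices and head 0 for the root) and the restriction to one-to-one alignments are all assumptions made for the example.

```python
def project_dependencies(zh_arcs, alignment):
    """Project Chinese dependency arcs onto a Khmer sentence.

    zh_arcs:   list of (dependent, head, relation) with 1-based token
               indices; head 0 stands for the root (assumed convention).
    alignment: dict mapping a Chinese token index to the index of the
               Khmer token it is aligned with (one-to-one, assumed).
    Returns the projected Khmer arcs sorted by dependent index.
    """
    km_arcs = []
    for dep, head, rel in zh_arcs:
        if dep not in alignment:
            continue  # unaligned dependent: left for manual correction
        if head == 0:
            km_head = 0  # the root stays the root
        elif head in alignment:
            km_head = alignment[head]
        else:
            continue  # unaligned head: left for manual correction
        km_arcs.append((alignment[dep], km_head, rel))
    return sorted(km_arcs)


# Toy example: three Chinese tokens whose first two swap places in Khmer.
zh_arcs = [(1, 2, "ATT"), (2, 3, "SBV"), (3, 0, "HED")]
alignment = {1: 2, 2: 1, 3: 3}
print(project_dependencies(zh_arcs, alignment))
```

Arcs whose dependent or head has no aligned Khmer token are simply skipped here, which is one reason the alignment accuracy targeted by Step1 directly affects the quality of the projected treebank.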

The specific steps of Step1.3, in which fuzzy dictionary matching is applied to the sparse data using the bilingual dictionary, are as follows:

Step1.3.1: find the sparse data left after word alignment, i.e. every Chinese word that has no alignment relation with any Khmer word;

Step1.3.2: perform dictionary-based fuzzy-matching word alignment using the Khmer-Chinese dictionary: within the set of translations of a Khmer word, find the translation with the greatest similarity to the unaligned Chinese word in the original sentence pair, where the similarity is expressed as:

$$\mathrm{Sim}(c_1, c_2) = \frac{2 \cdot |c_1 \cap c_2|}{|c_1| + |c_2|}$$

In this formula, c₁ and c₂ denote the Chinese word in the original sentence pair and the Chinese word in the dictionary translation, respectively; |c₁ ∩ c₂| is the number of characters common to c₁ and c₂; |c₁| and |c₂| are the numbers of characters in c₁ and c₂; and Sim(c₁, c₂) is the fuzzy-matching similarity of the Chinese words c₁ and c₂. The matching similarity between a Khmer word k and a Chinese word c in the original sentence pair can then be defined as:

$$\mathrm{Sim}(k, c) = \max_{d \in DT_k} \mathrm{Sim}(d, c)$$

where d ∈ DT_k, DT_k is the set of all Chinese translations of the Khmer word k, Sim(d, c) is the similarity between a Chinese translation d of k and the Chinese word c, max takes the maximum, and Sim(k, c) is the matching similarity between the Khmer word k and the Chinese word c. To select the Khmer words whose matching similarity satisfies the alignment condition, a threshold θ is set, and

$$\mathrm{align}(k, c) = \begin{cases} 1, & \mathrm{Sim}(k, c) \ge \theta \\ 0, & \mathrm{Sim}(k, c) < \theta \end{cases}$$

The left-hand side of this formula is the alignment function of the Khmer word k and the Chinese word c, which takes the value 1 or 0: 1 means that k is semantically similar to the Chinese word c in the original sentence pair and the two can be matched and aligned; 0 means that k is semantically unrelated to c and the two cannot be matched and aligned.
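The fuzzy dictionary matching of Step1.3.2 can be sketched in Python as below. The sketch assumes that the common-character count |c₁ ∩ c₂| is a multiset intersection over characters; the function names and the threshold value 0.5 are illustrative choices, since the text does not state a value for θ.

```python
from collections import Counter


def char_sim(c1: str, c2: str) -> float:
    """Sim(c1, c2) = 2 * |c1 ∩ c2| / (|c1| + |c2|), counted over characters."""
    if not c1 and not c2:
        return 0.0
    # Multiset intersection: a repeated character counts once per shared copy.
    common = sum((Counter(c1) & Counter(c2)).values())
    return 2 * common / (len(c1) + len(c2))


def match_sim(translations, c: str) -> float:
    """Sim(k, c): the best score of c against any dictionary translation d of k."""
    return max((char_sim(d, c) for d in translations), default=0.0)


def align(translations, c: str, theta: float = 0.5) -> int:
    """Alignment function: 1 if Sim(k, c) >= theta, otherwise 0."""
    return 1 if match_sim(translations, c) >= theta else 0


# DT_k for a hypothetical Khmer word whose dictionary translations are given.
dt_k = ["味道", "滋味"]
print(match_sim(dt_k, "味儿"), align(dt_k, "味儿"))
```

In the example both translations share one character with "味儿", so the score is 2·1/(2+2) = 0.5, which just meets the illustrative threshold.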

The specific steps of Step1.4 are as follows:

Step1.4.1: train word2vec on Chinese data to obtain word vectors for Chinese words;

Step1.4.2: after training, compute the similarity between the word vector w₁ of a Chinese word that cannot be aligned in the original sentence pair and the word vector w₂ of a Chinese translation of a Khmer word that cannot be aligned in the same sentence pair; the similarity of the two word vectors w₁ and w₂ is expressed as:

$$\mathrm{Sim}(w_1, w_2) = \frac{w_1 \cdot w_2}{\|w_1\| \cdot \|w_2\|} = \frac{\sum_{i=1}^{n} (w_{1i} \cdot w_{2i})}{\sqrt{\sum_{i=1}^{n} w_{1i}^2} \cdot \sqrt{\sum_{i=1}^{n} w_{2i}^2}}$$

where the word vectors w₁ and w₂ are n-dimensional vectors, and the i in w₁ᵢ and w₂ᵢ indexes the vector dimensions, with i ∈ {1, 2, …, n}. The matching similarity between a Khmer word k that cannot be aligned in the original sentence pair and a Chinese word c that cannot be aligned in the same sentence pair is:

$$\mathrm{Sim}(k, c) = \max \mathrm{Sim}(w_1, w_2)$$

where w₁ is the word vector of the Chinese word c, w₂ is the word vector of a Chinese translation of the Khmer word k, and max takes the maximum: among all Chinese translations of the unaligned Khmer word k, it finds the one that is semantically most similar to the unaligned Chinese word c in the original sentence pair. This maximum similarity is Sim(k, c), the matching similarity between the Khmer word k and the Chinese word c;

To select the pairs of word vectors whose similarity satisfies the alignment condition, a threshold α is set, and

$$\mathrm{align}(k, c) = \begin{cases} 1, & \mathrm{Sim}(k, c) \ge \alpha \\ 0, & \mathrm{Sim}(k, c) < \alpha \end{cases}$$

The left-hand side of this formula is the alignment function of the Khmer word k and the Chinese word c, which takes the value 1 or 0: 1 means that k is semantically similar to the Chinese word c in the original sentence pair and the two can be matched and aligned; 0 means that k is semantically unrelated to c and the two cannot be matched and aligned;

If the matching similarities between one unaligned Chinese word in the original sentence pair and several unaligned Khmer words in the same sentence pair satisfy the threshold condition at the same time, i.e.

$$\begin{cases} \mathrm{Sim}(k_1, c_1) \ge \alpha \\ \mathrm{Sim}(k_2, c_1) \ge \alpha \\ \quad \vdots \\ \mathrm{Sim}(k_n, c_1) \ge \alpha \end{cases}$$

then the Chinese word c₁ is aligned with each of the Khmer words k₁, k₂, …, kₙ.
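The similarity comparison of Step1.4.2 can be sketched in plain Python as below. The word vectors are passed in as lists of floats, so no particular word2vec library is assumed; the function names, the placeholder Khmer identifiers `k1` to `k3` and the threshold 0.8 are illustrative assumptions, since the text gives no value for α.

```python
import math


def cosine(w1, w2):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(w1, w2))
    norm = math.sqrt(sum(a * a for a in w1)) * math.sqrt(sum(b * b for b in w2))
    return dot / norm if norm else 0.0


def vector_sim(c_vec, translation_vecs):
    """Sim(k, c): the best cosine between c and any Chinese translation of k."""
    return max((cosine(c_vec, w2) for w2 in translation_vecs), default=0.0)


def align_unmatched(c_vec, khmer_candidates, alpha=0.8):
    """Return every unaligned Khmer word k with Sim(k, c) >= alpha.

    khmer_candidates maps each unaligned Khmer word to the vectors of its
    Chinese translations. Several words may pass the threshold, in which
    case the Chinese word is aligned with all of them (one-to-many).
    """
    return [k for k, vecs in khmer_candidates.items()
            if vector_sim(c_vec, vecs) >= alpha]


candidates = {
    "k1": [[1.0, 0.0]],
    "k2": [[0.0, 1.0]],
    "k3": [[2.0, 0.0], [0.0, 1.0]],
}
print(align_unmatched([1.0, 0.0], candidates))
```

Returning a list rather than a single best match mirrors the one-to-many case described above, where one Chinese word is aligned with every Khmer word that meets the threshold.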

The beneficial effects of the invention are as follows. Starting from GIZA++, the invention introduces dictionary fuzzy matching and word-vector similarity matching and combines these methods to construct a high-accuracy Khmer-Chinese bilingual word-aligned parallel corpus. The proposed treebank construction method simplifies the manual annotation of dependency relations in Khmer sentences and saves a great deal of time, and it ultimately improves the accuracy of the resulting Khmer dependency treebank.

Brief description of the drawings

Fig. 1 is the overall flow chart for constructing the Khmer dependency treebank in the present invention;

Fig. 2 is a schematic diagram of Chinese dependency relations in the present invention;

Fig. 3 is a schematic diagram of the construction of Khmer dependency relations in the present invention.

Embodiment

Embodiment 1: as shown in Figs. 1-3, a method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus, the method comprising the following steps:

Step1: build the Khmer-Chinese word-aligned parallel corpus;

Step1.1: collect Khmer-Chinese parallel sentence pairs;

Step2: build the Chinese dependency tree corpus;

Step2.3: apply the LTP language processing platform to the part-of-speech-tagged Chinese corpus to build the Chinese dependency treebank and obtain the Chinese dependency relations, as shown in Fig. 2;

In Step3.2, the ways in which Khmer sentence dependency relations change can be summarized from the grammatical differences between Khmer and Chinese. Because the word order of Khmer sentences differs from that of Chinese, a left (right) adjunct relation in a Chinese sentence may no longer be a left (right) adjunct dependency after it is mapped onto the Khmer sentence, and may instead become a right (left) adjunct dependency. In that case the Chinese adjunct relation can no longer be applied directly, and the dependency relation between the words in the Khmer sentence must be corrected to the proper adjunct relation according to the adjustment criteria. Manual correction is then applied to obtain the final Khmer dependency treebank. The Khmer sentence dependency adjustment algorithm can be:

Input: Khmer sentences annotated with the dependency relations obtained by mapping

Output: Khmer sentences modified according to the dependency adjustment criteria

If the input sentence contains an RAD (LAD) dependency relation Then

Do adjust the left and right adjunct relations of the sentence

Else do not adjust

Endif
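One possible reading of this adjustment algorithm is sketched below in Python. It assumes that the adjustment criterion is a relabeling by position: an arc marked RAD (right adjunct) whose dependent now stands to the left of its head becomes LAD, and an arc marked LAD whose dependent now stands to the right of its head becomes RAD. The function name and the arc representation (dependent index, head index, relation label) are illustrative assumptions; the actual criteria may cover more cases.

```python
def adjust_adjuncts(km_arcs):
    """Relabel LAD/RAD arcs whose direction was reversed by the mapping.

    km_arcs: list of (dependent, head, relation) over Khmer token
             positions, 1-based, as produced by projecting Chinese arcs.
    """
    adjusted = []
    for dep, head, rel in km_arcs:
        if rel == "RAD" and dep < head:
            rel = "LAD"  # dependent moved to the left of its head
        elif rel == "LAD" and dep > head:
            rel = "RAD"  # dependent moved to the right of its head
        adjusted.append((dep, head, rel))  # all other arcs pass through
    return adjusted


arcs = [(1, 2, "RAD"), (3, 2, "LAD"), (4, 2, "RAD"), (2, 0, "HED")]
print(adjust_adjuncts(arcs))
```

Sentences without any RAD or LAD arc come back unchanged, which corresponds to the "do not adjust" branch; the result is still meant to be checked by manual correction.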

As shown in Fig. 3, in the sentence "This nut is hard, and does not have any taste", "hard" is the core of the whole sentence and is marked "ROOT"; "nut" depends on "this", and their relation is an attributive relation, marked "ATT"; "hard" depends on "nut", and their relation is a subject-predicate relation, marked "SBV"; the relation between "hard" and "does not have" is a coordinate relation, marked "COO"; the relation between "does not have" and "and" is an adverbial relation, marked "ADV"; the relation between "does not have" and "taste" is a verb-object relation, marked "VOB". In the Chinese dependency relations, "any" and "" stand in a right adjunct relation, marked "RAD", but the word order of the Khmer sentence obtained through the bilingual word-alignment mapping has changed. For example, the order of the Khmer words glossed "taste" and "" is reversed, so the Chinese right adjunct relation (RAD) can no longer be applied; instead, following the specified modification rule, the dependency relation between "taste" and "" in Khmer is corrected to a left adjunct relation (LAD). The dependency syntactic relations are shown in Table 1:

Table 1. Dependency syntactic relations

In the present invention, Khmer-Chinese parallel sentence pairs are collected from the internet, and after the three alignment procedures described above, about 10,000 Khmer-Chinese parallel sentence pairs are obtained and assembled into a Chinese-Khmer parallel sentence-pair corpus. The tool used to analyze the dependency relations of the Chinese sentences is the natural language processing cloud platform of Harbin Institute of Technology. To make better use of this tool, the invention modifies its annotation set to suit the characteristics of Khmer and, based on the Chinese-Khmer alignment relations, generates a Khmer dependency corpus of 10,000 sentences.

The invention introduces two methods, dictionary fuzzy matching and word-vector similarity matching, to improve the construction of the Khmer-Chinese word-aligned parallel corpus. The bilingual sentences are first word-aligned with the conventional GIZA++ method; because this method suffers from data sparseness, the resulting word alignments are not fully correct, so dictionary fuzzy matching is used for further correction. Since the dictionary may lack a word identical to the translation in the original sentence pair, word-vector similarity matching is applied as a final refinement, yielding an accurate Khmer-Chinese word-aligned parallel corpus. Compared with the prior art, when the Chinese dependency relations are mapped onto the Khmer sentences through the improved word-aligned parallel corpus, the accuracy of the mapping improves, and the accuracy of the Khmer dependency treebank obtained by mapping improves accordingly.
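The order in which the three alignment stages are tried can be sketched in Python as a simple fallback cascade. Everything named here is hypothetical: `giza_links` stands for alignments already extracted from GIZA++ output, and `dict_match` and `vector_match` stand for the dictionary and word-vector tests with their thresholds already applied; none of these names come from the patent or from any library.

```python
def cascade_align(c, khmer_words, giza_links, dict_match, vector_match):
    """Return (aligned Khmer words, stage name) for the Chinese word c.

    giza_links:   dict mapping a Chinese word to the Khmer words that the
                  statistical aligner linked it to (assumed precomputed).
    dict_match:   callable (k, c) -> bool, dictionary fuzzy-matching test.
    vector_match: callable (k, c) -> bool, word-vector similarity test.
    """
    if c in giza_links:
        return giza_links[c], "giza"
    # Stage 2: only reached for words the statistical aligner left unaligned.
    hits = [k for k in khmer_words if dict_match(k, c)]
    if hits:
        return hits, "dictionary"
    # Stage 3: only reached when the dictionary produced no match either.
    hits = [k for k in khmer_words if vector_match(k, c)]
    if hits:
        return hits, "vector"
    return [], "unaligned"


giza = {"c1": ["k1"]}
words = ["k1", "k2", "k3"]
by_dict = lambda k, c: (k, c) == ("k2", "c2")
by_vec = lambda k, c: (k, c) == ("k3", "c3")
for c in ["c1", "c2", "c3", "c4"]:
    print(c, cascade_align(c, words, giza, by_dict, by_vec))
```

Recording which stage produced each link is a design choice of this sketch; it makes it easy to see how much of the corpus each stage recovers and which words are left for manual correction.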

Specific embodiments of the invention have been described in detail above with reference to the accompanying drawings, but the invention is not limited to these embodiments; within the knowledge of a person of ordinary skill in the art, various changes may be made without departing from the spirit of the invention.

Claims

1. A method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus, characterized in that the method comprises the following steps:

Step1: build the Khmer-Chinese word-aligned parallel corpus;

Step1.1: collect Khmer-Chinese parallel sentence pairs;

Step1.2: perform word alignment training on the Khmer-Chinese parallel sentence pairs using GIZA++;

Step1.3: apply fuzzy dictionary matching to the sparse data using a bilingual dictionary;

Step1.4: for Khmer words that still cannot be aligned after fuzzy dictionary matching, apply word-vector similarity comparison to improve word alignment accuracy; here, word-vector similarity comparison means comparing the word vector of a Chinese word that cannot be aligned in the original sentence pair with the word vectors of the Chinese translations of the Khmer words that cannot be aligned in the same sentence pair;

Step2: build the Chinese dependency tree corpus;

Step2.1: perform Chinese word segmentation on the Chinese sentences of the Khmer-Chinese word-aligned parallel sentence-pair corpus;

Step2.2: perform part-of-speech tagging on the segmented Chinese corpus;

Step2.3: apply the LTP language processing platform to the part-of-speech-tagged Chinese corpus to build the Chinese dependency treebank and obtain the Chinese dependency relations;

Step3: combine the Khmer-Chinese word-aligned parallel corpus and the Chinese dependency tree corpus to build the Khmer dependency tree corpus;

Step3.1: map the Chinese dependency relations onto the Khmer sentences through the Khmer-Chinese word-aligned parallel sentence-pair corpus, thereby obtaining a Khmer dependency treebank;

Step3.2: build the dependency relations of each Khmer sentence from the Khmer dependency treebank, adjust them according to the changes in left and right adjunct relations in the Khmer sentence, and then apply manual correction to obtain the final Khmer dependency treebank.
2. The method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus according to claim 1, characterized in that the specific steps of Step1.3, in which fuzzy dictionary matching is applied to the sparse data using the bilingual dictionary, are as follows:

Step1.3.1: find the sparse data left after word alignment, i.e. every Chinese word that has no alignment relation with any Khmer word;

Step1.3.2: perform dictionary-based fuzzy-matching word alignment using the Khmer-Chinese dictionary: within the set of translations of a Khmer word, find the translation with the greatest similarity to the unaligned Chinese word in the original sentence pair, where the similarity is expressed as:

$$\mathrm{Sim}(c_1, c_2) = \frac{2 \cdot |c_1 \cap c_2|}{|c_1| + |c_2|}$$

In this formula, c₁ and c₂ denote the Chinese word in the original sentence pair and the Chinese word in the dictionary translation, respectively; |c₁ ∩ c₂| is the number of characters common to c₁ and c₂; |c₁| and |c₂| are the numbers of characters in c₁ and c₂; and Sim(c₁, c₂) is the fuzzy-matching similarity of the Chinese words c₁ and c₂. The matching similarity between a Khmer word k and a Chinese word c in the original sentence pair can then be defined as:

$$\mathrm{Sim}(k, c) = \max_{d \in DT_k} \mathrm{Sim}(d, c)$$

where d ∈ DT_k, DT_k is the set of all Chinese translations of the Khmer word k, Sim(d, c) is the similarity between a Chinese translation d of k and the Chinese word c, max takes the maximum, and Sim(k, c) is the matching similarity between the Khmer word k and the Chinese word c. To select the Khmer words whose matching similarity satisfies the alignment condition, a threshold θ is set, and

$$\mathrm{align}(k, c) = \begin{cases} 1, & \mathrm{Sim}(k, c) \ge \theta \\ 0, & \mathrm{Sim}(k, c) < \theta \end{cases}$$

The left-hand side of this formula is the alignment function of the Khmer word k and the Chinese word c, which takes the value 1 or 0: 1 means that k is semantically similar to the Chinese word c in the original sentence pair and the two can be matched and aligned; 0 means that k is semantically unrelated to c and the two cannot be matched and aligned.
3. The method for constructing a Khmer dependency treebank based on a Khmer-Chinese word-aligned corpus according to claim 1, characterized in that the specific steps of Step1.4 are as follows:

Step1.4.1: train word2vec on Chinese data to obtain word vectors for Chinese words;

Step1.4.2: after training, compute the similarity between the word vector w₁ of a Chinese word that cannot be aligned in the original sentence pair and the word vector w₂ of a Chinese translation of a Khmer word that cannot be aligned in the same sentence pair; the similarity of the two word vectors w₁ and w₂ is expressed as:

$$\mathrm{Sim}(w_1, w_2) = \frac{w_1 \cdot w_2}{\|w_1\| \cdot \|w_2\|} = \frac{\sum_{i=1}^{n} (w_{1i} \cdot w_{2i})}{\sqrt{\sum_{i=1}^{n} w_{1i}^2} \cdot \sqrt{\sum_{i=1}^{n} w_{2i}^2}}$$

where the word vectors w₁ and w₂ are n-dimensional vectors, and the i in w₁ᵢ and w₂ᵢ indexes the vector dimensions, with i ∈ {1, 2, …, n}. The matching similarity between a Khmer word k that cannot be aligned in the original sentence pair and a Chinese word c that cannot be aligned in the same sentence pair is:

$$\mathrm{Sim}(k, c) = \max \mathrm{Sim}(w_1, w_2)$$

where w₁ is the word vector of the Chinese word c, w₂ is the word vector of a Chinese translation of the Khmer word k, and max takes the maximum: among all Chinese translations of the unaligned Khmer word k, it finds the one that is semantically most similar to the unaligned Chinese word c in the original sentence pair. This maximum similarity is Sim(k, c), the matching similarity between the Khmer word k and the Chinese word c;

To select the pairs of word vectors whose similarity satisfies the alignment condition, a threshold α is set, and

$$\mathrm{align}(k, c) = \begin{cases} 1, & \mathrm{Sim}(k, c) \ge \alpha \\ 0, & \mathrm{Sim}(k, c) < \alpha \end{cases}$$

The left-hand side of this formula is the alignment function of the Khmer word k and the Chinese word c, which takes the value 1 or 0: 1 means that k is semantically similar to the Chinese word c in the original sentence pair and the two can be matched and aligned; 0 means that k is semantically unrelated to c and the two cannot be matched and aligned;

If the matching similarities between one unaligned Chinese word in the original sentence pair and several unaligned Khmer words in the same sentence pair satisfy the threshold condition at the same time, i.e.

$$\begin{cases} \mathrm{Sim}(k_1, c_1) \ge \alpha \\ \mathrm{Sim}(k_2, c_1) \ge \alpha \\ \quad \vdots \\ \mathrm{Sim}(k_n, c_1) \ge \alpha \end{cases}$$

then the Chinese word c₁ is aligned with each of the Khmer words k₁, k₂, …, kₙ.