CN1067782C - Inference technology on imperfect knowledge - Google Patents

Info

Publication number
CN1067782C
CN1067782C CN97111945A CN97111945A CN1067782C CN 1067782 C CN1067782 C CN 1067782C CN 97111945 A CN97111945 A CN 97111945A CN 97111945 A CN97111945 A CN 97111945A CN 1067782 C CN1067782 C CN 1067782C
Authority
CN
China
Prior art keywords
reduction
node
rule
tree
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN97111945A
Other languages
Chinese (zh)
Other versions
CN1175039A (en)
Inventor
陈肇雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Huajian long Technology Co. Ltd.
Original Assignee
HUAJIAN MACHINE TRANSLATION CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HUAJIAN MACHINE TRANSLATION CO Ltd
Priority to CN97111945A
Publication of CN1175039A
Application granted
Publication of CN1067782C
Anticipated expiration
Expired - Fee Related

Abstract

The present invention relates to a technique with the following steps: secondary features are defined as required; a partial structure tree is generated from the source sentence; the generated partial reduction structure tree is reduced starting from its top layer; the context-dependent test conditions in the existing rules are removed, and then each node is reduced; if the reduction does not succeed, the secondary features in the rules are also removed and reduction continues, until either the reduction succeeds or every node is a leaf node that can be reduced directly to a sentence. The technique is used in machine translation systems and can determine the solution closest to the correct one when knowledge is incomplete.

Description

Inference method based on imperfect knowledge in machine translation
The present invention relates to an analysis and inference method for machine translation, and belongs to the field of machine translation technology.
In a rule-based machine translation system, no matter how carefully and sensibly the rules are designed, some language phenomena are always left out, and a general machine translation system is helpless when it meets them.
The purpose of the present invention is to provide an inference technology based on imperfect knowledge, which can obtain the solution closest to the correct solution when knowledge is incomplete.
The present invention is realized by the following method:
An inference technology based on imperfect knowledge, carried out using a computer, takes the following steps:
(1) define secondary features as required;
(2) generate a partial structure tree from the source sentence;
(3) for the generated partial reduction structure tree, carry out the following algorithm steps:
(1) first remove the context-dependent test conditions from the existing rules, then reduce the top layer of the partial tree; if the reduction succeeds, the algorithm ends successfully; otherwise go to step (2);
(2) further remove the secondary features from the rule heads, then continue to reduce the top-layer nodes; if the reduction succeeds, the algorithm ends successfully; otherwise go to step (3);
(3) according to the partial tree structure, if every node in the top layer is a leaf node of the tree, reduce the layer directly to a sentence; otherwise restore each node in the top layer to its corresponding nodes in the next layer down, generating a new intermediate result, and go to step (1) to repeat the above process on the new partial tree.
By ignoring strict context-dependent conditions or secondary semantic features, and by using an approximation method that gradually relaxes the conditions so that reduction is carried out under looser constraints, the present invention achieves an approximate translation of untrained structures. This enlarges the coverage of the rule system and greatly improves the adaptivity of the machine translation system.
The present invention is described in detail below with reference to the accompanying drawings and an example.
Fig. 1 is the algorithm flow chart of the present invention; Fig. 2 is a partial structure tree.
The present invention is implemented on an ordinary computer, with the following steps:
1. Define the secondary features.
Secondary features can be defined arbitrarily as required. Here, we define the secondary features as all features in the feature classification of the grammar system other than the grammatical features.
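As a minimal sketch of this definition, the snippet below splits a node's features into principal (grammatical) and secondary ones. The feature names, the `GRAMMATICAL` set, and the `split_features` helper are hypothetical; the patent does not enumerate the features of its grammar system.

```python
# Hypothetical feature inventory; the patent does not list one.
GRAMMATICAL = {"category", "number", "tense"}

def split_features(features: dict) -> tuple:
    """Return (principal, secondary); secondary is every non-grammatical feature."""
    principal = {k: v for k, v in features.items() if k in GRAMMATICAL}
    secondary = {k: v for k, v in features.items() if k not in GRAMMATICAL}
    return principal, secondary

principal, secondary = split_features(
    {"category": "NP", "number": "plural", "animate": False}
)
print(principal)  # grammatical features only
print(secondary)  # everything else, treated as secondary
```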
2. Call the translation processing algorithm on the source sentence. If the sentence cannot be reduced with the existing rules, a complete reduction structure tree cannot be generated, but a partial structure tree (that is, an incomplete structure tree that lacks a root node) can be. The partial structure tree is shown in Fig. 2.
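One way to picture a tree without a root is as an ordered list of subtrees whose roots form the top layer. The sketch below assumes that representation; the `Node` class, the helper names, and the sample forest are illustrative and are not taken from Fig. 2.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list["Node"] = field(default_factory=list)

# With no root node, the partial structure tree is an ordered list of subtrees.
forest = [
    Node("NP", [Node("you")]),
    Node("VP", [Node("AUX", [Node("could")]), Node("VP", [Node("take note of")])]),
]

def top_layer(forest: list) -> list:
    """Labels of the highest nodes, the sequence that reduction works on."""
    return [node.label for node in forest]

def next_layer(forest: list) -> list:
    """Replace each non-leaf node by its children; leaf nodes stay in place."""
    return [c for node in forest for c in (node.children or [node])]

print(top_layer(forest))              # ['NP', 'VP']
print(top_layer(next_layer(forest)))  # ['you', 'AUX', 'VP']
```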
3. For the partial reduction structure tree generated in step 2, carry out the following algorithm steps (see Fig. 1):
(1) Let the top layer of the partial tree be P1 = Pi1 Pi2 ... Pik, and set the current intermediate result to Pi = P1.
(2) Since no existing rule can reduce the intermediate result Pi to a sentence, first remove the context-dependent functions from the existing rules, then reduce Pi. If the reduction succeeds, the algorithm ends successfully; otherwise go to step (3);
(3) Further remove the secondary features from the rule heads, then continue to reduce Pi. If the reduction succeeds, the algorithm ends successfully; otherwise go to step (4);
(4) According to the partial tree structure, if every node in Pi is a leaf node of the tree, reduce Pi directly to a sentence; otherwise restore each node in Pi to its corresponding nodes in the next layer down, generating a new intermediate result, and go to step (2) to repeat the above process on the new partial structure tree.
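The control flow of steps (1) to (4) can be sketched as a loop. This is only an illustration under stated assumptions: nodes are encoded as `(label, children)` tuples, and the actual rule matching is hidden behind a caller-supplied `try_reduce(layer, level)` function. Both the encoding and the names `infer`, `try_reduce`, and `demo_reduce` are hypothetical.

```python
from typing import Callable, Optional

def infer(top_layer: list, try_reduce: Callable[[list, str], Optional[str]]) -> tuple:
    """Follow steps (1)-(4); return ("S", how_the_sentence_was_obtained)."""
    current = list(top_layer)                       # step (1): Pi = P1
    while True:
        # step (2): rules with their context-dependent functions removed
        if try_reduce(current, "no_context") is not None:
            return "S", "no_context"
        # step (3): secondary features in the rule heads removed as well
        if try_reduce(current, "no_secondary") is not None:
            return "S", "no_secondary"
        # step (4): only leaf nodes left, so reduce directly to a sentence
        if all(not children for _, children in current):
            return "S", "forced"
        # otherwise restore every node to its next-layer nodes and repeat
        current = [c for node in current for c in (node[1] or [node])]

def demo_reduce(layer: list, level: str) -> Optional[str]:
    """Toy matcher: only 'NP VP' reduces, and only once context is ignored."""
    labels = [label for label, _ in layer]
    return "S" if labels == ["NP", "VP"] and level == "no_context" else None

tree = [("X", [("NP", []), ("VP", [])])]
print(infer(tree, demo_reduce))  # ('S', 'no_context')
```

Because every pass either returns or replaces at least one non-leaf node by its children, the loop stops after at most as many passes as the tree has layers.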
Because the translation processing mechanism has already tried to reduce the top layer of the partial tree with the existing rules and could not reduce it to a sentence, the existing rule knowledge is incomplete. The conditions on the rules must therefore first be relaxed so that they cover more language phenomena, and the top layer of the partial tree is then reduced again.
There are two ways to relax the conditions on a rule. The first is to remove the context-dependent function from the rule, so that the head pattern in the rule is no longer restricted to a specific context. The second is to remove the secondary features from the rule head, so that a match succeeds as long as the principal features are satisfied.
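The two relaxations can be sketched as small transformations on a rule record. The `Rule` layout below (a head of `(category, secondary_features)` pairs, an optional context string, and a result category) is an assumed encoding, and `animate` is a made-up secondary feature used only for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Rule:
    head: tuple              # items are (category, frozenset of secondary features)
    context: Optional[str]   # context-dependent function, None if the rule has none
    result: str

def drop_context(rule: Rule) -> Rule:
    """First relaxation: the head pattern no longer depends on its context."""
    return replace(rule, context=None)

def drop_secondary(rule: Rule) -> Rule:
    """Second relaxation: keep only the principal feature (the category)."""
    return replace(rule, head=tuple((cat, frozenset()) for cat, _ in rule.head))

strict = Rule(
    head=(("NP", frozenset({"animate"})), ("VP", frozenset())),
    context="SEARCH(L, (1,1), NP)",
    result="S",
)
relaxed = drop_secondary(drop_context(strict))
print(relaxed)
```

Returning new `Rule` objects instead of editing them in place keeps the original rule set intact, so the strict rules remain available for later sentences.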
The method expressed by the above algorithm is therefore as follows. First remove the context-dependent test conditions from the existing rules, then reduce the top layer of the partial tree; if the reduction succeeds, the algorithm ends successfully. If it does not, further remove the secondary features from the rule heads and continue to reduce the top-layer nodes; if the reduction succeeds, the algorithm ends successfully. If it still does not succeed, then according to the partial tree structure, if every node in the top layer is a leaf node of the tree, reduce the layer directly to a sentence; otherwise restore each node in the top layer to its corresponding nodes in the next layer down, generating a new intermediate result, and repeat the above process on the new partial structure tree.
The technique is illustrated below.
The sentence "You could take note of the signatures" is to be translated into Chinese as "you should note those signatures".
The existing dictionary entries are:
Entry 1: signature NP signature
Entry 2: the T
Entry 3: take note of VP note
Entry 4: could AUX can
Entry 5: you NP you
The existing rules are:
Rule 1: T NP → , NP, those NP
Rule 2: VP NP → , VP, VP NP
Rule 3: AUX VP → , VP, AUX VP
Rule 4: NP VP → SEARCH (L, (1,1), NP), S, NP VP
Here NP denotes a noun phrase, VP a verb phrase, AUX an auxiliary verb, T an article, and S a sentence.
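To make the dictionary concrete, the sketch below stores the five entries as a phrase-to-category map and looks up the example sentence with a greedy longest-match scan. The `categorize` function, the three-word window, and the crude plural handling (dropping a final "s") are assumptions for illustration; the patent does not describe its lookup procedure.

```python
# Dictionary entries from the example: source phrase -> category.
LEXICON = {
    "signature": "NP",
    "the": "T",
    "take note of": "VP",
    "could": "AUX",
    "you": "NP",
}

def categorize(sentence: str, max_len: int = 3) -> list:
    """Greedy longest-match lookup over the words of the sentence."""
    words = sentence.lower().split()
    out, i = [], 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            # Assumed normalization: try the phrase, then a form without final "s".
            key = phrase if phrase in LEXICON else phrase.rstrip("s")
            if key in LEXICON:
                out.append(LEXICON[key])
                i += n
                break
        else:
            raise KeyError(f"no entry for {words[i]!r}")
    return out

print(" ".join(categorize("You could take note of the signatures")))
# NP AUX VP T NP
```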
Reduction procedure:
(1) Using entries 1 to 5, the reduction result of the original sentence is: NP AUX VP T NP.
(2) Using rule 1, T NP is reduced to NP, and the reduction result of the whole sentence is: NP AUX VP NP.
(3) Using rule 2, VP NP is reduced to VP, and the reduction result of the whole sentence is: NP AUX VP.
(4) Using rule 3, AUX VP is reduced to VP, and the reduction result of the whole sentence is: NP VP.
(5) The current reduction result does not match the head of any other rule. It does match the head of rule 4, but because there is no NP to the left of NP VP, the SEARCH function cannot be satisfied. The sentence therefore cannot be reduced with the existing dictionary and rules; that is, this analysis is a case of inference based on imperfect knowledge. The algorithm is carried out as follows:
First remove the context-dependent functions from the rules. In this example, removing the context-dependent function of rule 4 gives the new rules:
Rule 1: T NP → , NP, those NP
Rule 2: VP NP → , VP, VP NP
Rule 3: AUX VP → , VP, AUX VP
Rule 4: NP VP → , S, NP VP
(6) With the new form of rule 4, the current reduction result can be reduced to a sentence, and the process ends successfully.
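The worked example can be replayed with a small sketch. It encodes each rule as `(head, context, result)` and omits the translation patterns. Two assumptions are made: `SEARCH (L, (1,1), NP)` is read as "an NP must occur to the left of the match" (the `(1,1)` argument is kept but ignored, since its meaning is not given here), and rules are tried in their listed order. The names `RULES`, `context_ok`, and `reduce_all` are illustrative.

```python
# Rules as (head, context, result); the context encoding is hypothetical.
RULES = [
    (("T", "NP"), None, "NP"),                   # rule 1
    (("VP", "NP"), None, "VP"),                  # rule 2
    (("AUX", "VP"), None, "VP"),                 # rule 3
    (("NP", "VP"), ("L", (1, 1), "NP"), "S"),    # rule 4, with its SEARCH test
]

def context_ok(ctx, seq: list, start: int) -> bool:
    """Assumed reading of SEARCH(L, ..., NP): an NP must precede the match."""
    if ctx is None:
        return True
    direction, _span, cat = ctx
    return direction == "L" and cat in seq[:start]

def reduce_all(seq: list, rules: list) -> list:
    """Apply the first applicable rule repeatedly until no rule applies."""
    seq = list(seq)
    changed = True
    while changed:
        changed = False
        for head, ctx, result in rules:
            n = len(head)
            for i in range(len(seq) - n + 1):
                if tuple(seq[i:i + n]) == head and context_ok(ctx, seq, i):
                    seq[i:i + n] = [result]
                    changed = True
                    break
            if changed:
                break
    return seq

start = ["NP", "AUX", "VP", "T", "NP"]
relaxed = [(head, None, result) for head, _, result in RULES]

print(reduce_all(start, RULES))    # ['NP', 'VP']  (rule 4 blocked by its context test)
print(reduce_all(start, relaxed))  # ['S']         (context-dependent function removed)
```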

Claims (1)

  1. An inference method based on imperfect knowledge in machine translation, carried out using a computer, the steps of which are:
    (1) define, as required, the features in the feature classification of the grammar system other than the grammatical features as secondary features;
    (2) generate a partial structure tree from the source sentence;
    (3) for the generated partial reduction structure tree, carry out the following algorithm steps:
    (1) first remove the context-dependent test conditions from the existing rules, then reduce the top layer of the partial tree; if the reduction succeeds, the algorithm ends successfully; otherwise go to step (2);
    (2) further remove the secondary features from the rule heads, then continue to reduce the top-layer nodes; if the reduction succeeds, the algorithm ends successfully; otherwise go to step (3);
    (3) according to the partial tree structure, if every node in the top layer is a leaf node of the tree, reduce the layer directly to a sentence; otherwise restore each node in the top layer to its corresponding nodes in the next layer down, generating a new intermediate result, and go to step (1) to repeat the above process on the new partial tree.
CN97111945A 1997-07-02 1997-07-02 Inference technology on imperfect knowledge Expired - Fee Related CN1067782C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN97111945A CN1067782C (en) 1997-07-02 1997-07-02 Inference technology on imperfect knowledge

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN97111945A CN1067782C (en) 1997-07-02 1997-07-02 Inference technology on imperfect knowledge

Publications (2)

Publication Number Publication Date
CN1175039A CN1175039A (en) 1998-03-04
CN1067782C true CN1067782C (en) 2001-06-27

Family

ID=5171968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN97111945A Expired - Fee Related CN1067782C (en) 1997-07-02 1997-07-02 Inference technology on imperfect knowledge

Country Status (1)

Country Link
CN (1) CN1067782C (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100437557C (en) * 2004-02-04 2008-11-26 北京赛迪翻译技术有限公司 Machine translation method and apparatus based on language knowledge base

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0651340A2 (en) * 1993-10-28 1995-05-03 International Business Machines Corporation Language translation apparatus and method using context-based translation models

Also Published As

Publication number Publication date
CN1175039A (en) 1998-03-04

Similar Documents

Publication Publication Date Title
CN1954315B (en) Systems and methods for translating chinese pinyin to chinese characters
JP3906356B2 (en) Syntax analysis method and apparatus
US6862566B2 (en) Method and apparatus for converting an expression using key words
KR100904049B1 (en) System and Method for Classifying Named Entities from Speech Recongnition
CN1950831A (en) Apparatus and method for handwriting recognition
CN1224954C (en) Speech recognition device comprising language model having unchangeable and changeable syntactic block
US20100057437A1 (en) Machine-translation apparatus using multi-stage verbal-phrase patterns, methods for applying and extracting multi-stage verbal-phrase patterns
CN113408307B (en) Neural machine translation method based on translation template
Hayes-Roth et al. An Automatically Compilable Recognition Network For Structured Patterns.
CN1067782C (en) Inference technology on imperfect knowledge
Alegria et al. Robustness and customisation in an analyser/lemmatiser for Basque
Doush et al. Improving post-processing optical character recognition documents with Arabic language using spelling error detection and correction
Kanoun et al. Affixal approach for Arabic decomposable vocabulary recognition a validation on printed word in only one font
Al-Qaraghuli et al. Correcting Arabic Soft Spelling Mistakes Using Transformers
CN1107276C (en) Fully automatic system for separating Chinese words from sentences
Jacquemont et al. Correct your text with Google
Vasiu et al. Enhancing tokenization by embedding romanian language specific morphology
Hoch et al. On virtual partitioning of large dictionaries for contextual post-processing to improve character recognition
Linares et al. A hybrid language model based on a combination of n-grams and stochastic context-free grammars
Smeaton et al. Using morpho-syntactic language analysis in phrase matching
Ciravegna et al. Knowledge extraction from texts by SINTESI
CN115496079B (en) Chinese translation method and device
KR19990015131A (en) How to translate idioms in the English-Korean automatic translation system
JP3698454B2 (en) Parallel phrase analysis device and learning data automatic creation device
Nakano et al. A grammar and a parser for spontaneous speech

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent for invention or patent application
CB02 Change of applicant information

Address after: 100083 Beijing City, Haidian District Xueyuan Road No. 32, West Building Huajian Corporation Li Hua

Applicant after: Huajian Machine Translation Co., Ltd.

Applicant before: Chen Zhaoxiong

COR Change of bibliographic data

Free format text: CORRECT: APPLICANT; FROM: CHEN ZHAOXIONG TO: HUAJIAN MACHINE TRANSLATION CO., LTD

C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: BEIJING HUAJIAN CHANGHE SCIENCE CO., LTD.

Free format text: FORMER OWNER: HUAJIAN MACHINE TRANSLATION CO., LTD

Effective date: 20090508

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20090508

Address after: Room 207, West Building, Kequn Building, 30 College Road, Haidian District, Beijing 100083

Patentee after: Beijing Huajian long Technology Co. Ltd.

Address before: West Building, Huajian Group Co., Ltd., Kequn Building, 30 College Road, Haidian District, Beijing 100083

Patentee before: Huajian Machine Translation Co., Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20010627

Termination date: 20160702
