KR19990071163A - Apparatus and method for translating source language to target language using collocation information - Google Patents

Apparatus and method for translating source language to target language using collocation information

Info

Publication number
KR19990071163A
Authority
KR
South Korea
Prior art keywords
language
collocation
word
source
target language
Prior art date
Application number
KR1019980006475A
Other languages
Korean (ko)
Other versions
KR100474824B1 (en)
Inventor
이재원
권철중
Original Assignee
윤종용
삼성전자 주식회사
Priority date
1998-02-27
Filing date
1998-02-27
Publication date
1999-09-15
Application filed by 윤종용, 삼성전자 주식회사
Priority to KR1019980006475A
Publication of KR19990071163A
Application granted
Publication of KR100474824B1

Landscapes

  • Machine Translation (AREA)

Abstract

The present invention discloses an apparatus and a method for translating a source language into a target language using collocation information. The apparatus includes: a collocation analyzer for determining a predetermined collocation that appears in context together with an ambiguous source-language word in an input sentence to be translated; a translation dictionary database for storing, for each meaning that the source-language word can have, a list of target-language synonyms and a list of source-language collocations; a source-language collocation information database for storing source-language collocation information; a source-language meaning determination unit for determining, for a tuple consisting of the source-language word and the collocation, the meaning of the source-language word using the translation dictionary and the source-language collocation information; a target-language collocation information database for storing target-language collocation information; and a target-language word selection unit for selecting, from among a plurality of target-language words having the meaning determined by the source-language meaning determination unit, a target-language word for the source-language word using the target-language collocation information.

Description

Apparatus and method for translating source language to target language using collocation information

The present invention relates to language translation, and more particularly, to an apparatus and method for translating a source language into a target language that use collocation information to select a translation for an ambiguous word.

With advances in computing, language translation apparatuses, in which a machine performs translation that was once possible only for humans, are being actively researched. For example, a language translation apparatus such as a Japanese translation apparatus or a Korean translation apparatus receives a source language, that is, the language to be translated, and selects suitable translation words in the target language.

An important point in such a language translation apparatus is to efficiently select the target-language word corresponding to a source-language word. A single source-language word can have several meanings, in which case several target-language words correspond to it. An ambiguous source-language word is therefore understood from its context when it is used in a sentence, and there is an appropriate target-language word for it in that context. In selecting the target-language word, collocation information, that is, the words that appear together with the ambiguous source-language word in context, provides very important information.

As described above, collocation information, which is context information, is very important in selecting a target-language word for a source-language word. Many conventional studies selected the target-language word using the ambiguous source-language word and the collocation information that frequently appears with it in context. Conventional language translation methods mainly use source-language collocation information, and some methods using target-language collocation information have also been attempted. However, because these conventional methods use information on only one of the two languages, they have the following problems.

In the conventional method that uses only source-language collocation information, the target-language word is selected simply from the source-language collocation information appearing in the context of the source-language word. Here, the source-language collocation information is obtained from a source-language corpus tagged with target-language words, but building such a corpus in large quantities is very difficult in practice. In addition, because the target-language word is selected from the viewpoint of the source language, the selected word may not be natural in the target language.

The conventional method that uses only target-language collocation information, on the other hand, can select natural words suited to the target language. However, because source-language collocation information is not considered, it may select target-language words that are unrelated to the original meaning of the source-language word.

SUMMARY OF THE INVENTION The present invention has been made to overcome the above problems, and an object of the present invention is to provide an apparatus for translating a source language into a target language using both source-language collocation information and target-language collocation information.

Another object of the present invention is to provide a method for translating a source language into a target language using collocation information, performed by the language translation apparatus.

FIG. 1 is a block diagram of a language translation apparatus according to the present invention.

FIG. 2 is a flowchart for explaining a language translation method according to the present invention.

FIG. 3 is a detailed flowchart of a preferred embodiment of the language translation method shown in FIG. 2.

According to one aspect of the present invention, there is provided an apparatus for translating a source language into a target language using collocation information, the apparatus comprising:

a collocation analyzer for determining a predetermined collocation that appears in context together with an ambiguous source-language word in an input sentence to be translated; a translation dictionary database for storing, for each meaning that the source-language word can have, a list of target-language synonyms and a list of source-language collocations; a source-language collocation information database for storing source-language collocation information; a source-language meaning determination unit for determining, for a tuple consisting of the source-language word and the collocation, the meaning of the source-language word using the translation dictionary and the source-language collocation information; a target-language collocation information database for storing target-language collocation information; and a target-language word selection unit for selecting, from among a plurality of target-language words having the meaning determined by the source-language meaning determination unit, a target-language word for the source-language word using the target-language collocation information.

According to another aspect of the present invention, there is provided a method for translating a source language into a target language using collocation information, the method comprising:

(a) determining a predetermined collocation that appears in context together with an ambiguous source-language word in an input sentence to be translated; (b) determining, for a tuple consisting of the source-language word and the collocation, the meaning of the source-language word using a translation dictionary and source-language collocation information; and (c) selecting a target-language word for the source-language word, using target-language collocation information, from among a plurality of target-language words having the determined meaning.

Hereinafter, the configuration and operation of an apparatus for translating a source language into a target language using collocation information according to the present invention, and the corresponding language translation method, will be described with reference to the accompanying drawings.

FIG. 1 is a block diagram of a language translation apparatus according to the present invention. The apparatus includes a collocation analyzer 102, a source-language meaning determination unit 104, a target-language word selection unit 106, a source-language collocation information database 108, a translation dictionary database 110, and a target-language collocation information database 112.

The present invention relates to a language translation apparatus and method that use source-language and target-language collocation information separately in order to select the optimum target-language word. Analysis of the collocation information of the source language and of the target language shows that the two kinds of information are suited to solving different problems. Source-language collocation information is suited to analyzing the meaning of a source-language word in context, whereas target-language collocation information is useful for selecting a natural target-language word, a choice that depends on the target language. Using this property, the present invention provides a language translation apparatus and method that efficiently select natural target-language words by using collocation information of the two languages, which is relatively easy to build.

More specifically, referring to FIG. 1, the collocation analyzer 102 determines a predetermined collocation that appears in context together with an ambiguous source-language word in an input sentence to be translated. Here, the predetermined collocation corresponds to a modifier connected before or after the ambiguous source-language word, and there may be one or more such collocations.
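As a rough illustration of this step, the sketch below treats the words immediately before and after the ambiguous word as its collocation candidates. The function name `find_collocates` and the `window` parameter are assumptions made for this example. The specification does not say how the collocation analyzer 102 locates modifiers, and a real analyzer would presumably rely on syntactic analysis rather than plain adjacency.

```python
from typing import List


def find_collocates(tokens: List[str], target_index: int, window: int = 1) -> List[str]:
    """Return the words within `window` positions before or after the target word.

    Adjacency is only a stand-in (an assumption of this sketch) for the
    "modifier connected before or after" the ambiguous word.
    """
    start = max(0, target_index - window)
    end = min(len(tokens), target_index + window + 1)
    return [tokens[i] for i in range(start, end) if i != target_index]


# Hypothetical input sentence, already split into words.
tokens = ["she", "wears", "gloves"]
collocates = find_collocates(tokens, target_index=1)
```

With the hypothetical sentence above, both neighbours of "wears" are returned, which reflects the statement that there may be one or more collocations.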

The source-language meaning determination unit 104 determines the meaning of the ambiguous source-language word for the tuple consisting of the ambiguous source-language word in the input sentence and the collocation determined by the collocation analyzer 102. To do so, it uses the per-meaning target-language synonym lists and collocation lists, which are built from a source-language-to-target-language dictionary, together with source-language collocation information learned from a source-language corpus. Here, the per-meaning target-language synonym lists and the collocation lists for the source language are stored in the translation dictionary database 110, and the source-language collocation information is stored in the source-language collocation information database 108.

Once the source-language meaning determination unit 104 has determined the meaning, the target-language word selection unit 106 selects the most appropriate target-language word, using the target-language collocation information, from among the plurality of target-language candidate words having that meaning. Here, the target-language collocation information is stored in the target-language collocation information database 112.

Table 1 shows an example of the per-meaning target-language synonym lists and the collocation lists for the source language stored in the translation dictionary database 110 described above. The translation dictionary database 110 comprises a list of ambiguous source-language words, lists of target-language synonyms, and lists of collocations. An ambiguous source-language word has several meanings. Each list of target-language synonyms contains the words that can appear in the target language for one of the meanings that the source-language word can have. Each list of collocations contains the words that are used together with the source-language word when it is used in a given context in one of its meanings. The frequency is a statistic obtained from the source-language corpus: it records how often the ambiguous source-language word, when used together with the collocation, is used in the corresponding meaning, that is, in one of its several meanings.

Table 1

Source-language word | Target-language synonym list per meaning | Source-language collocations | Frequency
SW | {t_11, t_12, ..., t_1k} | {tc_11, tc_12, ..., tc_1l} | f_1
 | ... | ... |
 | {t_i1, t_i2, ..., t_im} | {tc_i1, tc_i2, ..., tc_in} | f_i
wear | {Dress, shine, write, ...} | {short, ring, green, ...} |
 | ... | ... |
 | {Dig, dig, ...} | {hole, ...} |

Referring to Table 1, when the source language is English and the target language is Korean, the lists of target-language synonyms corresponding to the respective meanings of the ambiguous source-language word "wear" run from "Dress, ..." to "Dig, ...". The target-language synonyms within each of these meanings are further distinguished from one another. The source-language word "wear" takes one of these meanings when it is used in a given context together with the corresponding source-language collocations illustrated in Table 1.
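One way to picture a translation dictionary entry of this kind is sketched below. The class names `Meaning` and `DictionaryEntry`, the placeholder strings, and the counts are all hypothetical. Keeping a separate frequency for each collocate is an assumption of the sketch, whereas Table 1 shows a single frequency per meaning.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Meaning:
    # Target-language synonyms for this meaning, e.g. {t_i1, ..., t_im}.
    target_synonyms: List[str]
    # Source-language collocates for this meaning, mapped to a frequency.
    collocate_frequency: Dict[str, int]


@dataclass
class DictionaryEntry:
    source_word: str
    meanings: List[Meaning] = field(default_factory=list)

    def meanings_for(self, collocate: str) -> List[int]:
        """Return indices of the meanings whose collocation list contains `collocate`."""
        return [
            index
            for index, meaning in enumerate(self.meanings)
            if collocate in meaning.collocate_frequency
        ]


# Symbolic entry following the first rows of Table 1 (placeholder strings and counts).
entry = DictionaryEntry(
    source_word="SW",
    meanings=[
        Meaning(["t_11", "t_12"], {"tc_11": 5, "tc_12": 2}),
        Meaning(["t_21", "t_22"], {"tc_21": 4}),
    ],
)
```

Looking up a collocate with `entry.meanings_for(...)` narrows the entry down to the meanings that list that collocate, which is the information the later meaning-determination step starts from.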

The translation dictionary database 110 is built from a source-language-to-target-language dictionary (for example, an English-Korean dictionary) that stores in advance the list of ambiguous source-language words, the lists of target-language synonyms, and the lists of collocations. The source-language collocation information stored in the source-language collocation information database 108, in turn, consists of tuples, each made up of a source-language word and the source-language word of a collocation used together with it in a given context, and a frequency for each tuple. The source-language tuples and their frequencies are built from a source-language corpus, that is, text in the source language from which context information of the source language can be obtained. Here, the source-language corpus is, for example, a collection of English texts from a specific field or in general use. Whereas the tagged source-language corpus used in conventional language translation is difficult to build, the corpus of the present invention is relatively easy to build because it requires no special tagging.
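Because the corpus needs no tagging, the tuple frequencies can in principle be collected by simple counting. The following sketch assumes whitespace tokenization and an adjacency window, neither of which is specified in the text, and the function name `count_collocation_tuples` is made up for the example.

```python
from collections import Counter
from typing import Iterable, Tuple


def count_collocation_tuples(sentences: Iterable[str], window: int = 1) -> Counter:
    """Count (word, neighbouring word) tuples in an untagged corpus."""
    counts: Counter = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    # Each ordered pair (word, neighbour) is one tuple.
                    counts[(word, tokens[j])] += 1
    return counts


# Tiny hypothetical corpus, for illustration only.
corpus = ["wear gloves", "wear gloves", "wear hole"]
tuple_counts = count_collocation_tuples(corpus)
```

The same counting could be applied to a target-language corpus to obtain the target-language tuple frequencies.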

Meanwhile, the target-language collocation information stored in the target-language collocation information database 112 consists of tuples, each made up of one of a plurality of target-language words having a single meaning and the target-language word of a collocation used together with it in a given context, and a frequency for each tuple. The frequency is recorded for each of the plurality of target-language words paired with the target-language word of the collocation, and is obtained from the target-language corpus.

That is, the target-language collocation information database 112 is built from a target-language corpus having text in the target language from which context information of the target language can be known. Here, the target-language corpus may be, for example in the case of a Korean translation apparatus, a collection of Korean texts used in a specific field or in general use.

The source-language meaning determination unit 104 is now described again in light of the databases above. For the tuple consisting of the ambiguous source-language word and the collocation determined by the collocation analyzer 102, the unit determines, from among the plurality of meanings of the source-language word in the translation dictionary database 110, the meaning having the maximum probability value as the meaning of the source-language word, using the respective frequencies obtained from the source-language corpus.

Likewise, describing the target-language word selection unit 106 again in light of the target-language collocation information database 112, the unit selects, from among the plurality of target-language candidate words having the meaning determined by the source-language meaning determination unit 104, the target-language word having the maximum probability value, using the respective frequencies obtained from the target-language corpus.

FIG. 2 is a flowchart for explaining a language translation method according to the present invention.

First, a predetermined collocation that appears in context together with an ambiguous source-language word in an input sentence to be translated is determined (step 210). Next, for the tuple consisting of the ambiguous source-language word and the collocation, the meaning of the source-language word is determined using the translation dictionary and the source-language collocation information (step 220). Next, a target-language word for the source-language word is selected, using the target-language collocation information, from among the plurality of target-language candidate words having the meaning determined in step 220 (step 230).

FIG. 3 is a detailed flowchart of a preferred embodiment of the language translation method shown in FIG. 2. Referring to FIG. 3, steps 220 and 230 shown in FIG. 2 will be described in detail.

First, when a collocation has been determined for the source-language word in step 210, the plurality of meanings related to that collocation, from among the possible meanings of the ambiguous source-language word, are obtained from the translation dictionary database 110 (step 222). Next, for the tuple consisting of the source-language word and the collocation, the probability of having each of the plurality of meanings obtained in step 222 is calculated from the source-language corpus (step 224). Next, the meaning having the maximum probability value among the plurality of meanings is determined as the meaning of the ambiguous source-language word (step 226).
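Steps 224 and 226 can be sketched as follows, assuming the candidate meanings from step 222 and their frequencies are already available. The function name `determine_meaning`, the meaning labels, and the counts are hypothetical, and normalizing the frequencies so that they sum to one is an assumption of this sketch, not something the text spells out.

```python
from typing import Dict, Optional, Tuple


def determine_meaning(
    frequencies: Dict[str, int],
) -> Tuple[Optional[str], Dict[str, float]]:
    """Pick the meaning with the highest relative frequency.

    `frequencies` maps each candidate meaning (step 222) to the count n_i of
    the (source word, collocate) tuple under that meaning.
    """
    total = sum(frequencies.values())
    if total == 0:
        return None, {}
    # Step 224: turn counts into probabilities (assumed normalization).
    probabilities = {meaning: n / total for meaning, n in frequencies.items()}
    # Step 226: keep the meaning with the maximum probability.
    best = max(probabilities, key=probabilities.get)
    return best, probabilities


# Hypothetical counts for two candidate meanings of one ambiguous word.
best_meaning, meaning_probabilities = determine_meaning({"meaning_1": 8, "meaning_2": 2})
```

Returning `None` when no frequency is available is a design choice of the sketch. The text does not describe a fallback for unseen tuples.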

In FIG. 2, the source-language collocation information of step 220 is provided from the source-language corpus, and the translation dictionary used in step 222, which holds the list of ambiguous source-language words, the per-meaning target-language synonym lists, and the collocation lists, is provided from the source-language-to-target-language dictionary.

As an example of how the probability value is calculated in step 224, if k meanings are possible for an ambiguous source-language word, a probability value is obtained for each meaning as shown in Equation 1 below.
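Equation 1 itself is not reproduced in this text. A form consistent with the surrounding definitions, assuming the probability is simply the relative frequency of meaning i among the k possible meanings, would be:

```latex
P_i = \frac{n_i}{\sum_{j=1}^{k} n_j}, \qquad i = 1, \dots, k
```

Under this assumption the values P_1, ..., P_k sum to one, so the meaning with the largest frequency n_i is also the meaning with the maximum probability.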

In Equation 1, n_i denotes the frequency with which the source-language word and the collocation are used together with meaning i, and is obtained from the source-language corpus. The meaning i with the maximum probability value P_i is determined as the meaning of the source-language word (step 226).

After step 226, tuples are constructed from the target-language candidate words corresponding to the determined meaning of the source-language word and the target-language word of the collocation determined in step 210 (step 232). In general, several target-language words can be mapped to one meaning, and these words are synonyms of one another. For example, if the meaning of the source-language word "wear" is determined to be the meaning in the first row listed for it in Table 1, the words "Dress, shine, write, ..." become the target-language candidate words. After step 232, for each target-language candidate word, a probability value is calculated from the target-language collocation information database 112 for the tuple consisting of that candidate word and the target-language word of the collocation (step 234). The target-language candidate word having the maximum probability value among the calculated values is selected as the target-language word (step 236).

In FIG. 2, the target-language collocation information of step 230 consists of tuples, each made up of one of a plurality of target-language words having a single meaning and the target-language word of a collocation used together with it in a given context, together with their frequencies, and is provided from a target-language corpus having text in the target language.

The probability value in step 234 is calculated, for example, as follows. First, for each target-language candidate word, a tuple consisting of that candidate word and the target-language word of the collocation can be constructed. If there are m target-language words with meaning i, the m tuples can be expressed as follows.

(t_i1, t_c), (t_i2, t_c), (t_i3, t_c), ..., (t_im, t_c)

Here, t_ij denotes the j-th target-language candidate word having meaning i, and t_c denotes the target-language word of the collocation. That is, t_c is the result of converting the collocation of the ambiguous source-language word into a target-language word.

For each of these tuples, the probability of being used in the target language is obtained by Equation 2 below.
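Equation 2 is likewise not reproduced in this text. Assuming the same relative-frequency form, taken over the m candidate tuples (t_i1, t_c), ..., (t_im, t_c), a consistent reconstruction would be:

```latex
Q_k = \frac{n_{ik}}{\sum_{j=1}^{m} n_{ij}}, \qquad k = 1, \dots, m
```

In this sketch n_{ij} stands for the frequency with which the j-th candidate t_{ij} and the collocation's target-language word t_c appear together, so Q_k is the share of those co-occurrences that belongs to the k-th candidate.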

Here, Q_k denotes the probability value for the k-th target-language candidate word, and n_ij denotes the frequency with which the j-th target-language candidate word t_ij having the i-th meaning and the target-language word t_c of the collocation appear together. The frequency is easily obtained from the target-language corpus. The target-language word t_ik having the maximum probability value Q_k is selected as the translation of the source-language word.
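Steps 232 to 236 can be sketched in the same spirit. The function name `select_target_word`, the placeholder words, and the counts below are hypothetical, and the relative-frequency normalization is again an assumption.

```python
from typing import Dict, List, Optional, Tuple


def select_target_word(
    candidates: List[str],
    target_collocate: str,
    tuple_counts: Dict[Tuple[str, str], int],
) -> Optional[str]:
    """Choose the candidate that co-occurs most often with the collocate's target word."""
    # Step 232: pair every candidate t_ij with the collocate's target word t_c.
    counts = {c: tuple_counts.get((c, target_collocate), 0) for c in candidates}
    total = sum(counts.values())
    if total == 0:
        return None  # no evidence in the target-language corpus
    # Step 234: relative frequency of each tuple (assumed normalization).
    probabilities = {c: n / total for c, n in counts.items()}
    # Step 236: the candidate with the maximum probability is the translation.
    return max(probabilities, key=probabilities.get)


# Placeholder target-language words and counts, following the t_ij / t_c notation.
target_counts = {("t_i1", "t_c"): 3, ("t_i2", "t_c"): 7}
chosen = select_target_word(["t_i1", "t_i2", "t_i3"], "t_c", target_counts)
```

Because every candidate here already shares the meaning fixed in the earlier step, this selection only decides which synonym reads most naturally next to the collocate in the target language.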

An apparatus and a method for translating a source language into a target language using collocation information according to the present invention have been described above. Compared with the prior art, the present invention is characterized in that collocation information is used separately for the source language and for the target language. Although source-language and target-language collocation information have been described by way of example, the present invention is not limited thereto, and other variations and modifications are possible.

As described above, the apparatus and method for translating a source language into a target language using collocation information according to the present invention use collocation information separately for the source language and for the target language, and are thereby effective in selecting the natural target-language word that is optimal for a source-language word.

Claims (10)

1. An apparatus for translating a source language into a target language using collocation information, the apparatus comprising: a collocation analyzer for determining a predetermined collocation that appears in context together with an ambiguous source-language word in an input sentence to be translated; a translation dictionary database for storing, as a translation dictionary, a list of target-language synonyms for each meaning that the source-language word can have and a list of source-language collocations; a source-language collocation information database for storing source-language collocation information; a source-language meaning determination unit for determining, for a tuple consisting of the source-language word and the collocation, the meaning of the source-language word using the translation dictionary and the source-language collocation information; a target-language collocation information database for storing target-language collocation information; and a target-language word selection unit for selecting, from among a plurality of target-language words having the meaning determined by the source-language meaning determination unit, a target-language word for the source-language word using the target-language collocation information.

2. The apparatus of claim 1, wherein the translation dictionary database stores, as the translation dictionary, a list of ambiguous source-language words, a list of target-language synonyms corresponding to each of a plurality of meanings that the source-language word may have, and a list of collocations used together when the source-language word is used in a given context, these lists being provided from a source-language-to-target-language dictionary; and wherein the source-language collocation information database is provided from a source-language corpus and stores, as the source-language collocation information, a frequency for each tuple consisting of a source-language word and a collocation used together with it in context.

3. The apparatus of claim 2, wherein the source-language meaning determination unit determines, for the tuple consisting of the source-language word and the collocation, as the meaning of the source-language word, the meaning having the maximum probability value among a plurality of meanings of the source-language word related to the collocation, using the respective frequencies obtained from the source-language corpus.

4. The apparatus of claim 1, wherein the target-language collocation information database is built from a target-language corpus having text in the target language from which context information of the target language can be known, and stores, as the target-language collocation information, tuples each consisting of one of a plurality of target-language words having a single meaning and the target-language word of a collocation used together with it in a given context, together with a frequency for each tuple obtained from the target-language corpus.

5. The apparatus of claim 4, wherein the target-language word selection unit selects, as the target-language word for the source-language word, the target-language word having the maximum probability value among the plurality of target-language words having the meaning determined by the source-language meaning determination unit, using the respective frequencies obtained from the target-language corpus.

6. A method for translating a source language into a target language using collocation information, the method comprising: (a) determining a predetermined collocation that appears in context together with an ambiguous source-language word in an input sentence to be translated; (b) determining, for a tuple consisting of the source-language word and the collocation, the meaning of the source-language word using a translation dictionary and source-language collocation information; and (c) selecting a target-language word for the source-language word, using target-language collocation information, from among a plurality of target-language words having the determined meaning.

7. The method of claim 6, wherein the translation dictionary is stored in a translation dictionary database; the translation dictionary database includes a list of ambiguous source-language words, a list of target-language synonyms corresponding to each of a plurality of meanings that the source-language word may have, and a list of collocations used together when the source-language word is used in a given context, these lists being provided from a source-language-to-target-language dictionary; the source-language collocation information is stored in a source-language collocation information database; and the source-language collocation information database is provided from a source-language corpus and stores, as the source-language collocation information, a frequency for each tuple consisting of a source-language word and a collocation used together with it in context.

8. The method of claim 7, wherein step (b) comprises: (b1) obtaining, from the translation dictionary database, a plurality of meanings of the source-language word that are related to the collocation; (b2) calculating, for the tuple consisting of the source-language word and the collocation, a probability value of having each of the plurality of meanings, from the source-language collocation information database; and (b3) determining the meaning having the maximum probability value among the plurality of meanings as the meaning of the source-language word.

9. The method of claim 6, wherein the target-language collocation information is stored in a target-language collocation information database; the target-language collocation information database is provided from a target-language corpus having text in the target language from which context information of the target language can be known; and tuples each consisting of one of a plurality of target-language words having a single meaning and the target-language word of a collocation used together with it in a given context, together with a frequency for each tuple obtained from the target-language corpus, are stored as the target-language collocation information.

10. The method of claim 9, wherein step (c) comprises: (c1) constructing tuples consisting of the target-language candidate words corresponding to the meaning determined in step (b) and the target-language word of the collocation; (c2) calculating, for each target-language candidate word, a probability value for the tuple of that candidate word and the target-language word of the collocation, from the target-language collocation information database; and (c3) selecting the target-language candidate word having the maximum probability value as the target-language word.

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1019980006475A KR100474824B1 (en) 1998-02-27 1998-02-27 Apparatus and method for translating source language to target language using collocation information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1019980006475A KR100474824B1 (en) 1998-02-27 1998-02-27 Apparatus and method for translating source language to target language using collocation information

Publications (2)

Publication Number Publication Date
KR19990071163A (en) 1999-09-15
KR100474824B1 (en) 2005-03-16

Family

ID=37229480

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1019980006475A KR100474824B1 (en) 1998-02-27 1998-02-27 Apparatus and method for translating source language to target language using collocation information

Country Status (1)

Country Link
KR (1) KR100474824B1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100377902B1 (en) * 2000-05-25 2003-03-29 이근배 A Method for Constructing Korean Wordnet and A Korean Wordnet Constructed by Using the Same
WO2008070750A1 (en) * 2006-12-05 2008-06-12 Microsoft Corporation Web-based collocation error proofing
US7774193B2 (en) 2006-12-05 2010-08-10 Microsoft Corporation Proofing of word collocation errors based on a comparison with collocations in a corpus

Also Published As

Publication number Publication date
KR100474824B1 (en) 2005-03-16

Similar Documents

Publication Publication Date Title
JP4504555B2 (en) Translation support system
US5794177A (en) Method and apparatus for morphological analysis and generation of natural language text
US5109509A (en) System for processing natural language including identifying grammatical rule and semantic concept of an undefined word
US6101492A (en) Methods and apparatus for information indexing and retrieval as well as query expansion using morpho-syntactic analysis
US6055528A (en) Method for cross-linguistic document retrieval
JP2001043236A (en) Synonym extracting method, document retrieving method and device to be used for the same
KR20170106308A (en) Annotation assistance device and computer program therefor
KR20160060253A (en) Natural Language Question-Answering System and method
JPH02308370A (en) Machine translation system
US20080162115A1 (en) Computer program, apparatus, and method for searching translation memory and displaying search result
WO1997004405A9 (en) Method and apparatus for automated search and retrieval processing
JPH03185561A (en) Method for inputting european word
CN105573992A (en) Real-time translation method and apparatus
Barlow Parallel texts and corpus-based contrastive analysis
Kuo et al. Learning transliteration lexicons from the web
Harshawardhan et al. Phrase based English–Tamil Translation System by Concept Labeling using Translation Memory
Alkım et al. Machine translation infrastructure for Turkic languages (MT-Turk)
KR19990071163A (en) Apparatus and method for translating source language to target language using collocation information
KR100745367B1 (en) Method of index and retrieval of record based on template and question answering system using as the same
JPH0561902A (en) Mechanical translation system
Hlava et al. Cross-language retrieval-English/Russian/French
KR20040050394A (en) A Translation Engine Apparatus for Translating from Source Language to Target Language and Translation Method thereof
JPH08329059A (en) General purpose reference device
JP2994080B2 (en) Translation selection method
Kuo et al. Active learning for constructing transliteration lexicons from the Web

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20080115

Year of fee payment: 4

LAPS Lapse due to unpaid annual fee