WO2001055901A1 - Machine translation system, translation server thereof, and client thereof - Google Patents

Publication number
WO2001055901A1
Authority
WO
WIPO (PCT)
Prior art keywords
translation
sentence
word
language
unit
Prior art date
Application number
PCT/JP2001/000343
Other languages
French (fr)
Japanese (ja)
Inventor
Hiroyasu Kikuchi
Original Assignee
Joyport Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Joyport Incorporated
Priority to AU27073/01A
Publication of WO2001055901A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/55: Rule-based translation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language
    • G06F40/58: Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Definitions

  • Machine translation system, translation server thereof, and client thereof
  • Technical field: The present invention relates to a machine translation system that translates an original sentence into another language according to a translation program distributed over a network such as the Internet, to a translation server and a client thereof, and to a recording medium storing the program.
  • Background art: Translation means re-expressing content expressed in one language in another language. It consists of a process of understanding the content of the expression and a process of expressing the understood content again in the target language. It is difficult to realize this process rigorously on a computer, so machine translation replaces source-language expressions with target-language expressions by using the grammatical and semantic correspondences between the languages. Machine translation methods are broadly divided into the transfer method, which performs this replacement directly between two languages, and the intermediate-language method, which converts to the target language through a universal third language. In practice, the intermediate-language method is laborious, so the transfer method is the mainstream in most cases.
  • Transfer methods that have been devised and put into use include a direct translation method that replaces text at the word level, a semantic-analysis translation method that analyzes and reassembles the meaning of the language, an example-based translation method that uses a bilingual corpus, a knowledge-based translation method that draws even on common-sense knowledge, and a hybrid translation method that mixes these.
  • Machine translation is a representative application of natural language processing within computer processing.
  • The transfer method consists of three processes: first, language analysis such as morphological analysis, syntactic analysis, and semantic analysis; second, conversion such as syntax conversion and vocabulary conversion; third, generation such as translated-word selection and morpheme adjustment.
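The three processes of the transfer method can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the patent's implementation: the tiny dictionary, the romanised Japanese target words, and the single reordering rule exist only to show analysis, conversion, and generation as separate stages.

```python
# Toy dictionary: the romanised Japanese targets are illustrative assumptions.
TRANSLATED_WORD_DICT = {"this": "kore", "is": "desu", "an": "", "apple": "ringo"}


def analyze(sentence):
    """Stage 1, analysis: split the sentence into lowercase words."""
    words = (w.strip(".,!?").lower() for w in sentence.split())
    return [w for w in words if w]


def convert(tokens):
    """Stage 2, conversion: look up a target word for each source word."""
    return [(t, TRANSLATED_WORD_DICT.get(t, t)) for t in tokens]


def generate(pairs):
    """Stage 3, generation: drop empty translations, add a topic particle,
    and move the copula to the end of the sentence."""
    kept = [(src, tgt) for src, tgt in pairs if tgt]
    copula = [tgt for src, tgt in kept if src == "is"]
    rest = [tgt for src, tgt in kept if src != "is"]
    if rest:
        rest[0] += " wa"
    return " ".join(rest + copula)


def translate(sentence):
    return generate(convert(analyze(sentence)))


print(translate("This is an apple."))
```

With this toy data the call prints `kore wa ringo desu`. Keeping the stages as separate functions is what makes it possible to run them on different machines.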
  • A prior-art similar-example translation system comprises:
  • (1) an input device for inputting an original sentence to be translated;
  • (2) a single-sentence extracting means for cutting the input original sentence into individual sentences and outputting them;
  • (3) a database section consisting of a dictionary database and an example database that stores example sentences and their translations as pairs;
  • (4) a perfect-match sentence translating means for searching the example database and outputting the translation of an example sentence that completely matches the original sentence;
  • (5) a similar-example translating means for searching the example database, extracting an example sentence that differs only partially from the original sentence, and translating by modifying its translation according to the original sentence;
  • (6) a sentence dividing means for dividing the original sentence according to a predetermined rule when searching the example database finds no example sentence that matches the original sentence by a predetermined rate or more;
  • (7) a divided-sentence perfect-match translating means for searching the example database and outputting the translation of an example sentence that completely matches the divided original sentence;
  • (8) a divided-sentence similar-example translating means for searching the example database, extracting an example sentence that differs only partially from the divided original sentence, and translating by modifying its translation according to the divided original sentence;
  • (9) a new-sentence translating means for searching the dictionary database and translating when searching the example database finds no example sentence that matches the divided original sentence by a predetermined rate or more; and
  • (10) a divided-translation assembling means for assembling the translations of the divided original sentences and outputting a translation of the original sentence.
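The cascade described for this prior-art system (exact match, then similar example, then dictionary lookup) can be sketched as follows. This is a simplified illustration under stated assumptions: the example database, the word dictionary, the position-wise similarity measure, and the 0.5 threshold are all hypothetical, and the sentence-dividing step is omitted.

```python
# Hypothetical example database and word dictionary (romanised Japanese).
EXAMPLES = {
    "this is an apple": "kore wa ringo desu",
    "good morning": "ohayou gozaimasu",
}
WORD_DICT = {"this": "kore", "is": "desu", "a": "", "an": "",
             "apple": "ringo", "pen": "pen"}


def similarity(a, b):
    """Fraction of positions holding the same word (0.0 if lengths differ)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def translate(sentence, threshold=0.5):
    key = sentence.lower().strip(" .!?")
    words = key.split()
    # Perfect match: reuse the stored translation unchanged.
    if key in EXAMPLES:
        return "exact", EXAMPLES[key]
    # Similar example: take the closest example and swap the differing words.
    best = max(EXAMPLES, key=lambda ex: similarity(words, ex.split()))
    if similarity(words, best.split()) >= threshold:
        translation = EXAMPLES[best]
        for new, old in zip(words, best.split()):
            old_target = WORD_DICT.get(old, "")
            if new != old and old_target:
                translation = translation.replace(old_target, WORD_DICT.get(new, new))
        return "similar", translation
    # New sentence: fall back to word-by-word dictionary lookup.
    return "dictionary", " ".join(filter(None, (WORD_DICT.get(w, w) for w in words)))


print(translate("This is a pen"))
```

Here "This is a pen" differs from the stored example in two of four positions, so the stored translation is reused with "ringo" replaced by "pen".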
  • A machine translation server is installed at a central location so that the similar-example translation system can be used via the Internet, and terminals access it over the Internet to translate. It is stated that machine translation can be used from any terminal and from any location, as long as the terminal has a function for accessing the Internet.
  • The example database is searched, and a translation is obtained from an exactly matching example sentence or produced from a similar example sentence, so translation can be performed quickly with a simple configuration. Because previously translated sentences are reused, duplicated translation work can be omitted and translations can be made consistent. In post-editing, translations can be corrected by referring to the example database, which makes editing easy. It is also stated that, because translation can be performed via the Internet, users can use machine translation from anywhere with only a small terminal device.
  • However, when e-mail is sent across countries, it is desirable that it be translated into the local language while being delivered over the network, rather than displayed on the destination client in the original language. Simple and prompt processing was also desired.
  • The present invention applies to the transfer method of machine translation. Using a network as the medium for distributing the processing, it places the analysis unit on the client side and the conversion unit and generation unit on the server side. Its object is to reduce the processing at the server, where processing is concentrated, and to enable high-speed translation processing from the client.
  • The present invention provides a translation server used in a machine translation system that translates an original sentence into another language using a network, the translation server comprising:
  • a conversion unit that receives original sentence analysis data, obtained by analyzing the original sentence in units of words, and converts it into other-language analysis data for each word based on the appropriate translation dictionary and/or the original-language word dictionary and the translated word dictionary; and
  • a generating unit that generates a translated sentence in the other language from the other-language analysis data converted by the conversion unit, based on translation rules and the translated word dictionary, wherein the generating unit generates and outputs a composite sentence in which the original sentence and the translated sentence can be displayed as a pair.
  • The present invention also provides a client used in a machine translation system that analyzes the original text using a network, entrusts the analysis result to an agent as an analysis file, converts the analysis file into other-language analysis data for each word based on a dictionary database, generates a translation from the other-language analysis data, and sends the translation to a client at a destination address,
  • the client comprising a communication processing unit capable of communicating with the network;
  • the analysis unit separates words at each space or extracts words according to symbols, depending on the language.
  • The present invention also provides a machine translation system for translating an original sentence into another language using a network, comprising a client, a translation server, a destination client, and an agent mediating between them. The client comprises a communication processing unit capable of communicating with the network, an input unit for inputting the original sentence, an analyzing unit for separating and analyzing the original sentence in units of words, and an output unit for outputting the analysis data produced by the analyzing unit to the communication processing unit as an e-mail. The translation server receives the original sentence analysis data obtained by analyzing the original sentence in units of words and converts it based on the appropriate translation dictionary and/or the original-language word dictionary,
  • and comprises a generating unit for generating a translated sentence in the other language based on the translated word dictionary, wherein the generating unit generates and outputs a composite sentence in which the original sentence and the translated sentence can be displayed as a pair.
  • The present invention also provides a computer-readable recording medium storing a translation program to be installed on a translation server used in a machine translation system that translates an original sentence into another language using a network, wherein:
  • the translation server re-analyzes first analysis data, obtained by the client analyzing the original sentence into words of the minimum unit, based on the installed word dictionary of the same language, to obtain second analysis data;
  • the second analysis data is converted into translated-word data based on the installed translated word dictionary, and a translation is generated from the translated-word data based on the installed translation rules and appropriate translation rules; and
  • the translation is transmitted to a destination client.
  • FIG. 1 is a conceptual block diagram of a conceptual translation of the present invention.
  • FIG. 2 is a configuration diagram of the machine translation system of the present invention.
  • FIG. 3 is a configuration diagram of the machine translation system of the present invention.
  • FIG. 4 is a diagram showing a physical configuration of conceptual translation of the present invention and a conventional example.
  • FIG. 5 is a block diagram of the machine translation system of the present invention.
  • FIG. 6 is an overall schematic flowchart of the machine translation system of the present invention.
  • FIG. 7 is a flowchart of the analysis unit of the machine translation system of the present invention.
  • FIG. 8 is a flowchart of the conversion unit of the machine translation system of the present invention.
  • FIG. 9 is a flowchart of the generation unit of the machine translation system of the present invention.
  • FIG. 10 is an explanatory diagram of an actual translation example of the machine translation system of the present invention.
  • FIG. 1 shows a schematic configuration diagram of the machine translation system of the present invention.
  • In this example, an e-mail is sent from a client 100 in an English-speaking country to a client in Japan.
  • The client 100 creates "This is an apple" as the body of the mail, analyzes the body using the analysis dictionary, and sends the resulting analyzed intermediate file to the translation server 200 as an analysis file.
  • The machine translation server 200 is usually located in Japan. It converts the analyzed intermediate file into the target language based on the translated word dictionary
  • and generates a translation based on the translation rules and the appropriate translation dictionary. At that time, advertisement information may be included with the translation.
  • The client 300 in Japan checks whether an e-mail has arrived at the translation server 200.
  • The e-mail contains the original text "This is an apple" and the completed Japanese translation (rendered in this text as "Lingo is here."). A corporate advertisement commissioned by a company may, for example, be displayed alongside them.
  • The network includes public networks, the worldwide Internet, LANs, WANs, WLL (Wireless Local Loop), and the like, as well as networks in which these are connected in series or in parallel. If the translation program for the translation server is also installed on a client, the originating client may analyze the original text and then itself convert the analysis data and generate the translation.
  • FIG. 2 is a configuration diagram showing a flow of the machine translation system of the present invention. Although the configuration and operation of the general machine translation system have been described with reference to FIG. 1, further details will be described with reference to FIG.
  • A predetermined translation program is distributed to and installed on servers and clients on the Internet, so that the particular characteristics and features of each server and client are put to use.
  • A text is created by the text creation unit 11 of the client 100, which has predetermined resources.
  • The text is analyzed by the analysis unit 12, and the analysis result is entrusted to the agent 400.
  • An analysis file is created in the analysis file section 22 and sent to a URL (Uniform Resource Locator), that is, in the form used to address resources on the Internet. The analysis file is output to the conversion unit 23 on the server 200, which has predetermined resources, where it is converted into the predetermined language.
  • The generation unit 24 completes the translation from the one predetermined language into the other and generates translation display data that can be displayed by the translation generation unit 32 of the output client 300. The translated sentence is then displayed on the display of the client 300.
  • the parser generates translations while utilizing it as an auxiliary means.
  • The analysis unit 12 has analysis methods such as morphological analysis, syntactic analysis, and semantic analysis, and analyzes the text based on them.
  • The conversion unit 23 has techniques such as syntax conversion and vocabulary conversion, and converts each minimum-unit word into the other language using the translated word dictionary or the like.
  • The generation unit 24 has techniques such as translated-word selection and morpheme adjustment, and generates a translated sentence according to translation rules such as those governing word order.
  • The learning function generation unit 25 keeps the various dictionaries used by the client 100 and the server 200 up to date. For example, when the client that created the text offers a more appropriate translation, this is fed back as learning data 251, and the result of comparing the completed translation from the server 200 and the client 300 with the text is fed back as learning data 252. The dictionaries are updated to the latest versions with this data. This learning function is particularly useful when translating into many other languages.
  • The generation unit 24 of the server 200 outputs the composite sentence 26 so that the text 14 and the translation created by the generation unit 24 can be displayed in parallel on the display of the client 300. The composite sentence is generated using a page description language such as HTML. On the output client side this makes it easy to compare the text with the translation, and for users who have some knowledge of both languages it is also an effective means of checking the differences between the two.
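As a rough illustration of the composite sentence, the sketch below pairs an original text with its translation in a small HTML table. The function name `build_composite`, the table layout, and the optional advertisement row are assumptions made for this example; the source says only that a markup language such as HTML is used.

```python
from html import escape


def build_composite(original, translation, advert=None):
    """Return an HTML fragment showing original and translation side by side."""
    rows = [
        "<tr><th>Original</th><th>Translation</th></tr>",
        # escape() keeps markup characters in the mail body from breaking the page.
        f"<tr><td>{escape(original)}</td><td>{escape(translation)}</td></tr>",
    ]
    if advert:
        rows.append(f'<tr><td colspan="2">{escape(advert)}</td></tr>')
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(build_composite("This is an apple.", "kore wa ringo desu"))
```

Escaping both strings matters because the mail body is user input that ends up inside markup on the receiving client.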
  • The distributed translation program described above may be stored on a storage medium as a single software package containing the analysis, conversion, and generation programs, or
  • separate software packages may be used: an analysis program for the client and a conversion/generation program for the translation server.
  • The various dictionaries are also included. The analysis program need only be accompanied by the original-language word dictionary, and the conversion/generation program by the original-language word dictionary, the translated word dictionary, the translation rules, and the appropriate translation dictionary. These dictionaries also come with a program that updates them as needed to reflect the learning function.
  • As the recording medium storing the translation program, a hard disk, floppy disk, CD-ROM, DVD, MO, or the like is used.
  • A network such as the Internet may also be used as the transmission medium for delivering the translation program, in which case the translation program need not be stored as packaged software.
  • FIG. 3 shows an example in which the same reference numerals as in FIG. 2 have the same functions, and an analysis unit 12 of the client 100 is provided with a conversion unit 13 for adding a conversion function.
  • the conversion unit 13 may analyze the word in the minimum unit by the analysis unit, and may not translate a certain word. By performing the conversion process, the load on the translation server can be reduced.
  • FIG. 4 shows an illustration of the distributed processing of machine translation.
  • FIG. 4 (A) shows the case of a conventional stand-alone translation system.
  • The analysis unit 51 analyzes the text, and the conversion unit 52 converts each minimum-unit word of the analyzed data into the other language.
  • The generation unit 53 then generates a translated sentence from the converted data and outputs it.
  • In FIG. 4 (B), as described above, the client 100 analyzes the text, the conversion unit 23 of the server 200 converts the analysis result, and the generation unit generates a translated sentence, which is output to the destination clients 301 and 302 via the Internet.
  • There may be a single destination client, several in a multicast format, or a large number in a broadcast format. When there are multiple destination clients, they may all use one language or may each be in a different language area; there is no restriction.
  • This distributed processing can significantly reduce the load on the server, and can process translation requests from many clients at high speed.
  • FIG. 5 shows an overall block diagram of the machine translation system of the present embodiment.
  • The sender client 100 consists of a text creation unit 11 for inputting a text, which is the original, from an input device;
  • an analysis unit 12 that takes the text and analyzes it into minimum word units based on the word dictionary 14 in the same language as the text; and a file transmission/reception agent 15 that outputs the text and the text analysis data produced by the analysis unit 12 to the Internet.
  • The translation server 200 includes a file transmission/reception agent 16 that receives data from the file transmission/reception agent 15 via the Internet, and takes in the text and the text analysis data from the file transmission/reception agent 16.
  • The conversion unit 23 converts the text analysis data, word by word, into the language of the receiving client, and the generation unit 24 generates a translation in that language according to the translation rules and the appropriate translation dictionary.
  • The appropriate translation dictionary is also referred to as the appropriate translation rule.
  • A rule for determining which candidate to select is stored in a recording medium. Translation rules are stored as a database in the same way as dictionaries, so the terms "appropriate translation dictionary" and "appropriate translation rule" are used interchangeably.
  • The receiving client 300 consists of a file transmission/reception agent (not shown) corresponding to the file transmission/reception agent 17, and an output means 33 that receives the text and the translation from that agent and outputs them on a display or the like. Note that the clients 100 and 300 exchange roles when the direction of transmission and reception is reversed.
  • FIG. 6 shows a schematic flowchart of the entire machine translation system.
  • The sender client 100 creates the body in the text creation unit 11 and starts the process of requesting translation of the body of an e-mail addressed to another country (S11).
  • The original text of the body is extracted from the file (S12).
  • The original sentence is analyzed with reference to the client's original-language word dictionary 14. For example, an English sentence is split at spaces, and a Japanese sentence is examined starting from two characters, to determine whether each word is in the word dictionary (S13).
  • The presence or absence of unregistered words is determined from the analysis of the original sentence (S14).
  • If there is an unregistered word, an unprocessed flag is assigned to the entire original sentence, and the original sentence is adopted as it is (S15).
  • An analysis file is generated as a storage means by the client 100 or the file transmission/reception agent 15 (S16).
  • The file transmission/reception agent 15 transmits the original file and the analysis file to the translation server 200 via the Internet, using the address of the translation server 200 as the destination address (S17).
  • At the translation server 200, the file transmission/reception agent 16 receives the original file and the analysis file.
  • The conversion unit 23 determines whether the analysis file has an unprocessed flag (S18). If the flag is present, the unprocessed portion is re-analyzed
  • with reference to the original-language word dictionary 29 provided in the server (S19).
  • Words that remain unregistered are adopted as they are (S21).
  • Each minimum-unit word in the analysis file is converted into a translated word in accordance with the translated word dictionary (S22).
  • In the generation unit 24, the converted file is rearranged and combined according to the translation rules and the appropriate translation dictionary, together with symbols such as "." and "!",
  • and a translated text is generated (S23).
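The server-side handling of the unprocessed flag (S18 to S22) can be sketched as below. The analysis-file layout (a dict with `tokens` and an `unprocessed` flag), the function name, and the sample dictionaries are assumptions introduced for illustration; the source does not specify a file format.

```python
def convert_on_server(analysis_file, server_word_dict, translated_word_dict):
    """Re-check flagged words against the server dictionary, then convert."""
    converted = []
    for token in analysis_file["tokens"]:
        word, registered = token["word"], token["registered"]
        key = word.lower()
        # S18/S19: only re-analyze when the client flagged the file.
        if analysis_file["unprocessed"] and not registered:
            registered = key in server_word_dict
        if registered and key in translated_word_dict:
            converted.append(translated_word_dict[key])  # S22: convert the word
        else:
            converted.append(word)  # S21: adopt the original word as it is
    return converted


# Hypothetical analysis file as a client might send it.
analysis_file = {
    "original": "Joyport sells apples",
    "unprocessed": True,
    "tokens": [
        {"word": "Joyport", "registered": False},
        {"word": "sells", "registered": False},
        {"word": "apples", "registered": True},
    ],
}
result = convert_on_server(
    analysis_file,
    server_word_dict={"sells", "apples"},
    translated_word_dict={"sells": "uru", "apples": "ringo"},
)
print(result)
```

In this run "sells" is recovered by the larger server dictionary, while the proper noun "Joyport" stays unregistered and passes through untranslated.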
  • The file transmission/reception agent 17 transmits the original text and the translated text to the designated URL (Uniform Resource Locator).
  • A corporate advertisement, which provides advertising revenue, is added to the original and translated texts, and they are output via the Internet to the destination client address.
  • The destination client receives the original sentence, the translated sentence, and the corporate advertisement as an e-mail and displays them on an output means such as a display.
  • The original and translated texts are displayed side by side together with the corporate advertisement information, and the system operator who distributed the translation program to the clients and servers gains advertising revenue as well as public benefit.
  • The original sentence is extracted from the file in which it is stored (S31) and is decomposed into words by processing symbols and spaces. For example, English is split at spaces and at symbols placed before and after words, such as "." and quotation marks, and Japanese is split at the corresponding punctuation symbols (S32).
  • A word search is started from the beginning of the sentence with reference to the client's original-language word dictionary (S33), and the presence or absence of an unregistered word is determined (S34).
  • If there is an unregistered word, the number of characters is changed and the search is performed again with reference to the client's original-language word dictionary (S35),
  • and the presence or absence of an unregistered word is determined again (S36).
  • If an unregistered word still remains, an unregistered flag is added to the entire e-mail and the original text is kept as it is. The result is stored in the storage means as the analysis data, or the process proceeds directly to the next step (S37).
  • The analysis data is generated together with the original text as an analysis file (S38) and transmitted to a predetermined URL by the file transmission/reception agent 15 (S39).
  • The predetermined URL belongs to the operator of the translation system, who counts how much analysis data is being transferred. At this stage, a corporate advertisement may be added to the analysis data at the request of an advertiser who wishes to advertise internationally.
  • The unregistered words of the original sentence are fed back to the learning function unit 25 and reflected in the client's original-language word dictionary 14, so that these words can subsequently be analyzed as they are.
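The client-side analysis flow above can be sketched as follows. This is one possible reading under stated assumptions: English is split with a regular expression, Japanese lookup tries two characters first and then three, four, and one (the order of candidate lengths is a guess), and the analysis file is modelled as a plain dict with an `unprocessed` flag.

```python
import re


def split_english(text):
    """Split at spaces and symbols, keeping only word characters."""
    return re.findall(r"[A-Za-z0-9']+", text)


def split_japanese(text, dictionary, lengths=(2, 3, 4, 1)):
    """Dictionary lookup starting from two characters, then other lengths."""
    words, i = [], 0
    while i < len(text):
        for n in lengths:
            cand = text[i:i + n]
            if len(cand) == n and cand in dictionary:
                break
        else:
            cand = text[i]  # no dictionary hit: emit one unregistered character
        words.append(cand)
        i += len(cand)
    return words


def build_analysis_file(text, words, dictionary):
    """Bundle the original text, per-word flags, and the unprocessed flag."""
    tokens = [{"word": w, "registered": w.lower() in dictionary} for w in words]
    return {
        "original": text,
        "tokens": tokens,
        "unprocessed": any(not t["registered"] for t in tokens),
    }


ja_dict = {"これ", "は", "りんご", "です"}
print(split_japanese("これはりんごです", ja_dict))

text = "This is Joyport."
print(build_analysis_file(text, split_english(text), {"this", "is"}))
```

The second call marks "Joyport" as unregistered and sets the file-level flag, which is what lets the server decide whether re-analysis is needed.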
  • The conversion unit 23 receives the translated words and creates a target table for grammatical operations. For example, to clarify the arrangement of subject, predicate, object, complement, conjunction, and so on, the subject, predicate, object, complement, conjunction, etc. of the original words and those of the translated words are entered in the target table (S51). Next, the translation rules and the appropriate translation dictionary 28 are referenced (S52), the translation rules are applied to the translated word data (S53), and it is determined whether there is an unregistered rule (S54).
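A target table of grammatical roles and a word-order rule might look like the sketch below. The role names follow the text, but the table structure, the `TARGET_ORDER` rule (a subject-object-predicate order), and the way unknown roles are reported are assumptions for illustration only.

```python
# Assumed word-order rule for the target language (subject-object-predicate).
TARGET_ORDER = ["subject", "object", "complement", "predicate"]


def build_target_table(tagged_words):
    """Group translated words by grammatical role, keeping their order."""
    table = {}
    for word, role in tagged_words:
        table.setdefault(role, []).append(word)
    return table


def apply_rule(table, order=TARGET_ORDER):
    """Arrange words by the rule; report roles the rule does not cover."""
    unregistered = [role for role in table if role not in order]
    words = [w for role in order for w in table.get(role, [])]
    return words, unregistered


tagged = [("watashi", "subject"), ("taberu", "predicate"), ("ringo", "object")]
print(apply_rule(build_target_table(tagged)))
```

Returning the uncovered roles separately mirrors the check for an unregistered rule, so a caller can fall back to another strategy when the list is not empty.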
  • The example dictionary is referenced, and translations of idiomatic words and proverbs in the target table created from the analysis data are searched for in it (S55).
  • The example dictionary may be checked not only for each word but also for whole sentences.
  • It is determined whether there is an unregistered example (S56). If there is, an approximate example is extracted from an approximate example dictionary (not shown) (S57).
  • The translated words are combined with reference to the translation rules and the appropriate translation dictionary to obtain the translated sentence of the e-mail (S58).
  • When an approximate example is used in step S57, it is stored in the example dictionary as one of the dictionary's learning functions, so that it is available the next time the example dictionary is referenced.
  • The translated sentence is created, sent to the predetermined URL together with the original sentence, given a corporate advertisement, and output to the destination client via the Internet.
  • the translation program delivered by the operator of the machine translation system is divided into an analysis unit and a conversion / generation unit, and the analysis of the flow chart described in Fig. 7 is performed.
  • the operator of the machine translation system can also earn advertising revenue.
  • an example is shown in which the text and the translation are displayed in parallel on the display of the receiving client.
  • the translation server can simply generate the translation by ignoring the text and converting the data from the analysis data, so that the translation processing can be easily and quickly performed.
  • Figure 10 shows an example of the case of handling by this machine translation system in the case of converting from Korean to English and from Japanese to English.
  • Figure 10 shows the case of translation from Korean to English.
  • (a) In the analysis unit processing, the original sentence is divided into words separated by [violence] and analyzed.
  • (b) In the conversion unit processing, the original sentence is analyzed using the original-language word dictionary in the translation server and separated by [na], and the corresponding translations are then extracted.
  • (c) shows the result of generating the translated sentence by the generation unit processing.
  • Fig. 10 shows the case of translation from Japanese to English.
  • The analysis starts from two characters of the original sentence and is extended to three or four characters.
  • The original sentence is divided into words separated by [Okina] and analyzed.
  • In the conversion unit processing, the original sentence is further analyzed using the same-language word dictionary in the translation server and divided by [na] into words of the minimum unit, and the corresponding translated word is then extracted from the translated word dictionary.
  • Korean and Japanese have the same word order, such as subject, object, and verb, so mutual translation can be performed without using a translation table from the conversion unit to the generation unit. Direct translations can be combined.
  • The above describes the transfer method implemented in distributed form.
  • With the example-based method as well, the program of the sentence analysis unit on the client side can place a delimiter at each sentence, and the server side can use an example dictionary or the like
  • to translate according to the examples. So if the analysis unit is loaded on the client and the translation server is equipped with a conversion unit, a translated sentence can be output just as with the transfer method.
  • The e-mail to be translated into another language is not limited to one that needs to be translated in real time. For an e-mail with an attached document, the attached document may be translated, and the present invention may also be used to translate general documents. The machine translation system may also be applied not only between languages but also between varieties of one national language, such as standard Japanese, the Osaka dialect, and the Satsuma dialect.
  • A real-time Internet telephone system may also be operated by performing the conversion/generation processing in an intermediate processing server or the like such as J / RO and converting the result into a predetermined voice.

Abstract

A machine translation system using the transfer translation method, in which the units are distributed by using the Internet as the medium for distributing the processing, such that an analyzing unit (12) is provided in a client and a conversion unit (23) and a generating unit (24) are provided in a server. A translation server (200), used in a machine translation system for translating an original text into a text in a second language by use of the Internet, comprises a conversion unit (23) that receives original text analysis data representing the result of analyzing the original text in units of words and converts the original text analysis data into second language analysis data for each word in the second language according to an appropriate translation dictionary and/or an original text language word dictionary and a translated word dictionary, and a generating unit (24) that generates a translated text in the second language according to the second language analysis data, translation rules, and the translated word dictionary. The generating unit (24) generates a synthesis text for displaying the original text and the translated text as a pair and outputs the synthesis text.

Description

Specification
Machine translation system, its translation server, and its client
TECHNICAL FIELD
The present invention relates to a machine translation system that translates an original text into another language according to translation programs distributed over a network such as the Internet, and to its translation server, its client, and a recording medium storing the programs.
BACKGROUND ART
Translation means re-expressing content expressed in one language in another language. It consists of a process of understanding the content of the expression and a process of expressing the understood content anew in the target language. It is difficult to realize this process rigorously on a computer, so machine translation replaces source-language expressions with target-language expressions by using the correspondences in grammar and meaning between the languages. Machine translation methods are broadly divided into the transfer method, which performs this replacement between two languages, and the intermediate language (interlingua) method, which converts to the target language by way of a universal third language. In practice, the intermediate language method is labor-intensive, so the transfer method is used in most cases.
Transfer methods that have been devised and put into use include the direct translation method, which substitutes at the word level; the semantic-analysis translation method, which analyzes the meaning of the language and builds the translation from it; the example-based translation method, which uses a bilingual corpus (aggregated words); the knowledge-based translation method, which draws even on common sense; and the hybrid translation method, which combines these. Machine translation is a representative application of natural language processing in computing. The transfer method consists of three processes: first, language analysis such as morphological, syntactic, and semantic analysis; second, conversion such as syntactic and lexical conversion; and third, generation such as target-word selection and morphological adjustment.
In language analysis technology, analysis accuracy has improved greatly, and for long sentences the prospects for semantic analysis are opening up as semantic dictionaries containing large numbers of case frames are developed.
In conversion technology, wide-area conversion techniques that convert expressions spanning several predicates in a single step are being tried. In generation technology, generating determiners (articles, number) is a problem. Other techniques include ellipsis completion based on context processing, automatic source rewriting that replaces the original with a sentence that is easier to translate within the system, and automatic acquisition of grammatical and lexical knowledge from large corpora.
Translation requires a wide variety of knowledge, including linguistic, situational, and specialized knowledge, and this knowledge must be converted into the form of rules and machine dictionaries. Recently, e-mail translation services have been anticipated as an application that emphasizes immediacy of translation.
As prior art, Japanese Patent Application Laid-Open No. H10-312382 discloses a similar-example translation system.
According to that publication, the system comprises:
(1) an input device for inputting the original text to be translated;
(2) sentence extraction means that cuts the input text into sentences and outputs each original sentence;
(3) a database section consisting of a dictionary database and an example database that stores example sentences paired with their translations;
(4) exact-match translation means that searches the example database and outputs the translation of an example sentence that exactly matches the original sentence;
(5) similar-example translation means that searches the example database, extracts an example sentence that partly differs from the original sentence, and translates by modifying its translation according to the original sentence;
(6) sentence dividing means that divides the original sentence according to predetermined rules when no example sentence matching the original sentence at or above a predetermined rate is found in the example database;
(7) divided-sentence exact-match translation means that searches the example database and outputs the translation of an example sentence that exactly matches a divided original sentence;
(8) divided-sentence similar-example translation means that searches the example database, extracts an example sentence that partly differs from a divided original sentence, and translates by modifying its translation according to the divided original sentence;
(9) new-sentence translation means that searches the dictionary database and translates when no example sentence matching a divided original sentence at or above the predetermined rate is found in the example database; and
(10) divided-translation assembling means that assembles the translations of the divided original sentences and outputs the translation of the original sentence.
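The fallback order in items (4) to (10) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the method of the cited publication: the example database, the dictionary, the 0.8 threshold, the use of `difflib.SequenceMatcher` as the similarity measure, and splitting on commas are all hypothetical choices made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical example database: example sentence -> stored translation.
EXAMPLES = {
    "this is a pen": "これはペンです",
    "thank you": "ありがとう",
}
# Hypothetical word dictionary, used only as the last resort.
DICTIONARY = {"hello": "こんにちは"}
THRESHOLD = 0.8  # assumed value for the "predetermined rate"


def best_example(sentence):
    """Return (similarity, example) for the closest stored example."""
    return max((SequenceMatcher(None, sentence, ex).ratio(), ex) for ex in EXAMPLES)


def translate(sentence):
    """Return (method, translation) for one sentence."""
    if sentence in EXAMPLES:  # item (4): exact match
        return "exact", EXAMPLES[sentence]
    score, example = best_example(sentence)
    if score >= THRESHOLD:  # item (5): similar example
        # A real system would also edit the stored translation to fit the input.
        return "similar", EXAMPLES[example]
    parts = [p.strip() for p in sentence.split(",") if p.strip()]
    if len(parts) > 1:  # items (6)-(8), (10): divide, translate, assemble
        return "split", " ".join(translate(p)[1] for p in parts)
    # item (9): word-by-word dictionary lookup; unknown words pass through
    return "dictionary", " ".join(DICTIONARY.get(w, w) for w in sentence.split())
```

Each stage is tried only when the previous one fails, which is why previously translated sentences are reused before the dictionary is consulted.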
The publication further describes installing the machine translation server at a single location so that the similar-example translation system can be used via the Internet, with terminals accessing it over the Internet to obtain translations. It states that machine translation can be used from any terminal, anywhere, as long as the terminal can access the Internet.
Thus, according to the publication: translation is fast with a simple configuration, because the example database is searched and a translation is obtained from an exactly matching example sentence or produced using a similar one; duplicated translation work is avoided and translations are kept consistent, because previously translated sentences are reused; post-editing is easy, because translations are corrected with reference to the example database; and, because translation is processed via the Internet, users can use machine translation from anywhere with only a small terminal.
However, as described in Japanese Patent Application Laid-Open No. H10-312382, the network in conventional machine translation serves only as a path resource for the user interface linking the input/output device and the translation processing unit; there was no idea of distributing the machine translation processing itself. Effective use of the virtually unlimited server resources connected to the network is therefore desired. In particular, it has long been desired to configure a machine translation system as a distributed processing system on a network such as the Internet and to use huge databases in a distributed arrangement, thereby dispersing translation work that is concentrated at a single point.
Also, when e-mail is transmitted between countries, for example, translating it into the local language for display, rather than delivering it through the network and displaying it to the destination client in the original language, has been desired in order to handle e-mail simply and quickly.
An object of the present invention is, for the transfer method of machine translation, to use a network as the medium for distributing processing, placing the analysis unit on the client side and the conversion unit and generation unit on the server side, thereby reducing the load on the server where processing is concentrated and enabling large volumes of translation requests from clients to be processed at high speed.
A further object is to make it possible to build up a machine translation server that supports additional languages, for conversion from a given language into several other languages, by adding resources for conversion and generation into those languages to the conversion unit and generation unit on the server side, independently of the client.
DISCLOSURE OF THE INVENTION
To solve the above problems, the present invention provides a translation server for use in a machine translation system that translates an original text into another language using a network, the translation server comprising: a conversion unit that receives original-text analysis data obtained by analyzing the original text word by word and converts it into other-language analysis data for each word on the basis of an appropriate-translation dictionary and/or an original-language word dictionary and a translated-word dictionary; and a generation unit that generates a translated text, which is a text in the other language, from the other-language analysis data converted by the conversion unit on the basis of translation rules and the translated-word dictionary, wherein the generation unit generates and outputs a composite text in which the original text and the translated text can be displayed as a pair.
The present invention also provides a client for use in a machine translation system in which an original text is analyzed, the analysis result is entrusted to an agent as an analysis file over a network, a translation server converts the analysis file into other-language analysis data on the basis of a dictionary database, and a translated text is generated from the per-word other-language analysis data and sent to the client at the destination address, the client comprising: a communication processing unit capable of communicating with the network; an input unit for inputting the original text; an analysis unit that separates the original text into words and analyzes it; and an output unit that outputs the analysis data produced by the analysis unit to the communication processing unit as e-mail, wherein the analysis unit, depending on the language, analyzes the words at each space or extracts words according to symbols.
The present invention also provides a machine translation system that translates an original text into another language using a network, the system comprising a client, a translation server, a destination client, and an agent that mediates between them. The client comprises: a communication processing unit capable of communicating with the network; an input unit for inputting the original text; an analysis unit that separates the original text into words and analyzes it; and an output unit that outputs the analysis data produced by the analysis unit to the communication processing unit as e-mail. The translation server comprises: a conversion unit that receives original-text analysis data obtained by analyzing the original text word by word and converts it into other-language analysis data for each word on the basis of an appropriate-translation dictionary and/or an original-language word dictionary and a translated-word dictionary; and a generation unit that generates a translated text, which is a text in the other language, from the other-language analysis data converted by the conversion unit on the basis of translation rules and the translated-word dictionary. The generation unit generates and outputs a composite text in which the original text and the translated text can be displayed as a pair.
The present invention further provides a computer-readable recording medium storing a translation program to be installed on a translation server used in a machine translation system that translates an original text into another language using a network, wherein the translation server: re-analyzes, on the basis of an installed word dictionary of the same language as the original, first analysis data obtained by the client analyzing the original text into minimum-unit words; converts the resulting second analysis data into translated-word data on the basis of an installed translated-word dictionary; generates a translated text from the converted translated-word data on the basis of installed translation rules and appropriate-translation rules; and sends the generated translated text to the destination client.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a conceptual diagram of the physical configuration of translation according to the present invention.
FIG. 2 is a configuration diagram of the machine translation system of the present invention.
FIG. 3 is a configuration diagram of the machine translation system of the present invention.
FIG. 4 is a conceptual diagram of the physical configuration of translation in the present invention and in a conventional example.
FIG. 5 is a block diagram of the machine translation system of the present invention.
FIG. 6 is an overall schematic flowchart of the machine translation system of the present invention.
FIG. 7 is a flowchart of the analysis unit of the machine translation system of the present invention.
FIG. 8 is a flowchart of the conversion unit of the machine translation system of the present invention.
FIG. 9 is a flowchart of the generation unit of the machine translation system of the present invention.
FIG. 10 is an explanatory diagram of an actual translation example of the machine translation system of the present invention.
BEST MODE FOR CARRYING OUT THE INVENTION
[First Embodiment]
(1) Description of configuration
FIG. 1 shows a schematic configuration of the machine translation system of the present invention. In FIG. 1, when an e-mail is sent from a client 100 in an English-speaking country to a client in a Japanese-speaking country, the client 100 creates the mail body "This is an apple", analyzes the body with an analysis dictionary, and sends the body, with the analyzed intermediate file added, to the translation server 200 over the network as an analysis file. The machine translation server 200, which is normally located in the Japanese-speaking country, converts the analyzed intermediate file into the target language on the basis of the translated-word dictionary and generates a translated text on the basis of the translation rules and the appropriate-translation dictionary. Advertisement information may be included in the translated text at this point. The client 300 in the Japanese-speaking country checks whether e-mail has arrived at the translation server 200 and receives the e-mail from the client 100 in the English-speaking country. The e-mail displays together the mail body "This is an apple", the completed translation 「ここにリンゴがあるよ。」, and, for example, a corporate advertisement placed on behalf of a company.
Here, the network includes public networks, the worldwide Internet, LANs, WANs, WLL (Wireless Local Loop), and the like, as well as networks in which these are connected in series or in parallel. If the translation program for the translation server is installed on a client, that client may itself convert the analysis data and generate the translation after the originating client has analyzed the original text.
FIG. 2 is a configuration diagram showing the flow of the machine translation system of the present invention. The outline of the configuration and operation of the system was described with reference to FIG. 1; it is described in more detail below with reference to FIG. 2.
In FIG. 2, predetermined translation programs are first distributed to and installed on servers and clients on the Internet, so that the particular characteristics, specialized translations, and features of each server and client can be used. The client 100, which has predetermined resources, creates the text in the text creation unit 11 and analyzes the text to be sent in the analysis unit 12. The agency 400 creates an analysis file from the analysis result in the analysis file unit 22 and sends it, in the manner of resource use on the Internet, to the URL (Uniform Resource Locator) that has the predetermined resource and issued the translation program. The server 200 having that resource passes the analysis file to the conversion unit 23, which converts it into a predetermined language; the generation unit 24 translates that predetermined language into another predetermined language to complete the translation; and the translated-text creation unit 32 of the output-side client 300 generates displayable translated-text data and displays the translation on the display of the client 300.
Here, the text is regarded not as a mere sequence of words but as a set (corpus) having some structure. That structure is extracted and given an intermediate representation using diagrams or formula-like symbols, and the translated text is generated using this representation as the main aid. The analysis unit 12 may use morphological analysis, syntactic analysis, semantic analysis, or other methods; whichever is used, the analysis is based on minimum-unit words. The conversion unit 23 may use syntactic conversion, lexical conversion, or other methods, and converts each minimum-unit word into the other language using the translated-word dictionary and the like. The generation unit 24 may use target-word selection, morphological adjustment, or other methods, and generates the translated text according to translation rules such as word order.
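The division of labor between the analysis unit 12, the conversion unit 23, and the generation unit 24 can be sketched as three small functions. This is a toy illustration under stated assumptions: the English-to-French dictionary, the part-of-speech tags, and the single word-order rule are hypothetical and stand in for the dictionaries and translation rules of the specification.

```python
# Hypothetical translated-word dictionary: source word -> (target word, part of speech).
TRANSLATION_DICT = {
    "a": ("une", "DET"),
    "red": ("rouge", "ADJ"),
    "apple": ("pomme", "NOUN"),
}


def analyze(text):
    """Analysis (unit 12): split the text into minimum-unit words at spaces."""
    return text.lower().split()


def convert(words):
    """Conversion (unit 23): map each word to the target language.

    Words missing from the dictionary pass through unchanged, tagged "UNK".
    """
    return [TRANSLATION_DICT.get(w, (w, "UNK")) for w in words]


def generate(converted):
    """Generation (unit 24): apply one toy word-order rule (adjective after noun)."""
    items = list(converted)
    i = 0
    while i < len(items) - 1:
        if items[i][1] == "ADJ" and items[i + 1][1] == "NOUN":
            items[i], items[i + 1] = items[i + 1], items[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1
    return " ".join(word for word, _ in items)
```

In the distributed arrangement, `analyze` would run on the client and `convert` and `generate` on the translation server, with the word list travelling between them as the analysis file.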
The learning function generation unit 25 keeps the various dictionaries used by the client 100 and the server 300 up to date. It receives feedback as learning data 251 from the client that created the text, for example when outputting a passage in its original form is more appropriate as a translation, and as learning data 252 from the server 200 or the client 300, consisting of the result of comparing the completed translation with the text, and it uses this feedback to update the dictionaries. This learning function is particularly suitable when translating into many other languages.
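One way to picture the two feedback paths is a dictionary object that accepts both kinds of learning data. This is a minimal sketch, not the unit 25 itself: the class name, the method names, and the shape of the feedback (a single word, or a word with a corrected translation) are assumptions, since the specification does not define the format of learning data 251 and 252.

```python
class LearningDictionary:
    """Toy dictionary that is updated from sender and receiver feedback."""

    def __init__(self, entries):
        self.entries = dict(entries)  # source word -> translated word
        self.keep_as_is = set()       # words to output untranslated

    def feedback_from_sender(self, word):
        """Assumed form of learning data 251: keep this word in its original form."""
        self.keep_as_is.add(word)

    def feedback_from_receiver(self, word, better_translation):
        """Assumed form of learning data 252: a corrected translation for a word."""
        self.entries[word] = better_translation

    def lookup(self, word):
        """Return the current translation, honoring keep-as-is feedback first."""
        if word in self.keep_as_is:
            return word
        return self.entries.get(word, word)
```

For example, after `feedback_from_receiver("apple", "リンゴ")` later lookups of `"apple"` return the corrected entry, and after `feedback_from_sender("Internet")` that word is passed through unchanged.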
So that it can be output as a composite text 26, the generation unit 24 of the server 200 encodes the text 14 and the translation it has produced in a code such as HTML, a display-description language such as a page script, so that the two can be shown side by side on the display of the client 300. This makes it easy to compare the text and the translation at the output-side client, which is useful for a user who has some knowledge of both languages and wants to check the differences between them.
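A composite text of this kind could be produced as a two-column HTML table. The sketch below is an assumption about layout only; the specification says HTML or a similar code is used but does not give the markup, and the function name `build_composite` is hypothetical.

```python
from html import escape


def build_composite(original_sentences, translated_sentences):
    """Return an HTML table pairing each original sentence with its translation."""
    rows = "".join(
        f"<tr><td>{escape(orig)}</td><td>{escape(trans)}</td></tr>"
        for orig, trans in zip(original_sentences, translated_sentences)
    )
    return f"<table><tr><th>Original</th><th>Translation</th></tr>{rows}</table>"
```

Escaping both columns keeps characters such as `<` in the mail body from being interpreted as markup when the receiving client renders the table.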
The distributed translation program may be stored on a storage medium as a single software package containing the analysis, conversion, and generation programs, or it may be packaged separately as an analysis program for clients and a conversion and generation program for the translation server. In either case the dictionaries are included: the analysis program is accompanied by the original-language word dictionary, and the conversion and generation program by the original-language word dictionary, the translated-word dictionary, and the translation rules and appropriate-translation dictionary. A program that updates these dictionaries as needed is also included so that the learning function is reflected. The recording medium storing the translation program may be a hard disk, floppy disk, CD-ROM, DVD, MO, or the like. For delivering the translation program, using a network such as the Internet as the transmission medium is preferable because it saves the effort of storing the program in packaged software.
FIG. 3, in which the same reference numerals as in FIG. 2 denote the same functions, shows an example in which the client 100 is provided with a conversion unit 13 that adds a conversion function to the analysis unit 12. When, for example, it is better to carry part of the text into the translation as it is, that is, when the analysis unit has split the text into minimum-unit words and a certain word needs no translation, the conversion unit 13 performs the conversion processing for it, which reduces the load on the translation server.
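The idea of deciding on the client which words need no translation can be sketched as a flag attached to each analyzed word. The token format, the `NO_TRANSLATE` set, and both function names are hypothetical; the specification does not say how the conversion unit 13 marks such words.

```python
# Hypothetical set of words the client decides to leave untranslated.
NO_TRANSLATE = {"HTML", "URL"}


def client_analyze(text):
    """Client side: split into words and flag those that need no translation."""
    return [{"word": w, "translate": w not in NO_TRANSLATE} for w in text.split()]


def server_convert(tokens, dictionary):
    """Server side: look up only the words the client left flagged for translation."""
    return [
        dictionary.get(t["word"], t["word"]) if t["translate"] else t["word"]
        for t in tokens
    ]
```

Because flagged words skip the dictionary lookup, the server does less work per message, which is the load reduction the paragraph above describes.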
FIG. 4 illustrates distributed processing for machine translation. FIG. 4(A) shows a conventional stand-alone translation system. In transfer-method translation, the analysis unit 51 analyzes the text, the conversion unit 52 converts each minimum-unit word of the analysis data into the other language, and the generation unit 53 generates and outputs a translated text from the converted data.
In FIG. 4(B), by contrast, the client 100 analyzes the text as described above, the conversion unit 23 of the server 200 converts the analysis result, the generation unit generates the translated text, and the result is output via the Internet to the destination clients 301 and 302. There may be a single destination client, several in multicast form, or a large number in broadcast form. When there are several destination clients, they may all use one language or may each be in a different language area; there is no restriction. This distributed processing markedly reduces the load on the server, so that translation requests from many clients can be processed at high speed.
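With several destination clients in different language areas, the client-side analysis is done once and only conversion and generation are repeated per language. A minimal sketch, assuming hypothetical per-language dictionaries and mail addresses, and using plain word substitution in place of real conversion and generation:

```python
# Hypothetical per-language translated-word dictionaries held by the server.
DICTS = {
    "fr": {"apple": "pomme"},
    "de": {"apple": "Apfel"},
}


def distribute(analysis, destinations):
    """Translate one analyzed text for every destination.

    analysis:     list of minimum-unit words produced once by the sending client
    destinations: mapping of destination address -> language code
    """
    cache = {}    # language code -> translated text, so each language is done once
    results = {}
    for address, lang in destinations.items():
        if lang not in cache:
            cache[lang] = " ".join(DICTS[lang].get(w, w) for w in analysis)
        results[address] = cache[lang]
    return results
```

Caching by language means a broadcast to many recipients who share a language costs the server one conversion, not one per recipient.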
FIG. 5 is an overall block diagram of the machine translation system of this embodiment. Compared with FIG. 2, it additionally shows the various dictionaries in the database explicitly.
In FIG. 5, the sending client 100 comprises: a text creation unit 11 that receives the original text from an input device; an analysis unit 12 that receives the text and analyzes it into minimum-unit words on the basis of the client original-language word dictionary 14, a word dictionary in the same language as the text; and a file transmission/reception agency 15 that outputs the text and the text analysis data produced by the analysis unit 12 to the Internet.
The translation server 200 comprises: a file transmission/reception agency 16 that receives from the file transmission/reception agency 15 via the Internet; a conversion unit 23 that receives the text analysis data and the text from the file transmission/reception agency 16 and converts the text analysis data, word by word, into the language of the receiving client; a generation unit 24 that generates a translated text from the converted data in accordance with the translation rules and the appropriate-translation dictionary; a file transmission/reception agency 17 that receives the translated text and the text and outputs them to the Internet; a server original-language word dictionary 29 used by the conversion unit 23 for conversion; a translated-word dictionary 27 used for conversion in the conversion unit 23 and for generating the translated text in the generation unit 24; translation rules and an appropriate-translation dictionary 28 used for generating the translated text in the generation unit 24; and learning function means 25 that updates the dictionaries 27 to 29 to their latest versions according to translation data from clients, translation servers, and the like connected to the Internet.
The appropriate-translation dictionary is also commonly called the appropriate-translation rules. For example, when one word has several candidate translations, rules for deciding which one to select are stored on a recording medium. These rules are stored as a database in the same way as a dictionary, so the terms "appropriate-translation dictionary" and "appropriate-translation rules" are used interchangeably here.
The receiving client 300 comprises a file transmission/reception agency (not shown) corresponding to the file transmission/reception agency 17, and a display unit 33, which is output means such as a display that receives the body text and the translated text from that agency and presents them. The clients 100 and 300 can of course also operate with the sending and receiving roles reversed.
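The role of the appropriate-translation rules described above can be illustrated with a minimal sketch. The specification does not state how the rules are encoded, so the rule format here (a set of trigger words paired with a translation), the sample entries, and the name `choose_translation` are all assumptions made for illustration.

```python
# Hypothetical rule table: for each source word, a list of
# (context words that trigger the rule, translation to select).
RULES = {
    "bank": [
        ({"river", "water"}, "土手"),
        ({"money", "account"}, "銀行"),
    ],
}

# Hypothetical default translation when no rule fires.
DEFAULTS = {"bank": "銀行"}


def choose_translation(word, context, rules=RULES, defaults=DEFAULTS):
    """Pick one translation for `word` using the surrounding words."""
    context_set = set(context)
    for trigger_words, translation in rules.get(word, []):
        # A rule fires when any of its trigger words occurs in the context.
        if trigger_words & context_set:
            return translation
    # No rule fired: use the default, or keep the word unchanged.
    return defaults.get(word, word)
```

A word with no entry at all is returned unchanged, which mirrors the way the embodiment keeps unregistered words in their original form.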
(2) Description of operation
Next, FIG. 6 shows an outline flowchart of the entire machine translation system. First, at the sending client 100, the text creation unit 11 creates the body text, and processing starts to request translation of that text as the body of an e-mail addressed to another country (S11). The e-mail is stored in a file as the original file, and the original text of the body is extracted from that file (S12). The original text is then analyzed with reference to the client source-language word dictionary 14. For English, for example, the text is split at spaces; for Japanese, candidate strings starting from two characters are checked against the word dictionary, and the text is decomposed and analyzed into minimum units (S13). The analysis determines whether any unregistered word is present (S14). If so, an unprocessed flag is attached to the entire original text and the original text is adopted as it is (S15), and the sending client 100 or the file transmission/reception agency 15 generates an analysis file as storage means (S16). The file transmission/reception agency 15 sends the original file and the analysis file via the Internet to the translation server 200, using the address of the translation server 200 as the destination address (S17).
At the translation server 200, the file transmission/reception agency 16 receives the original file and the analysis file, and the conversion unit 23 determines whether the analysis file carries an unprocessed flag (S18). If it does, the unprocessed portion is re-analyzed with reference to the server's own source-language word dictionary 29 (S19). The conversion unit then determines whether the analysis file still contains unregistered words (S20); any such word is adopted in its original form (S21).
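The client-side part of this flow (S13 to S16) can be sketched as follows for the space-delimited case. The specification does not define the layout of the analysis file, so the `AnalysisFile` class, its fields, and the function `build_analysis_file` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisFile:
    """Hypothetical container for what the client sends in step S17."""
    original_text: str
    words: list[str] = field(default_factory=list)
    unregistered: list[str] = field(default_factory=list)

    @property
    def unprocessed(self) -> bool:
        # S15: the flag applies to the whole text when any word is unknown.
        return bool(self.unregistered)


def build_analysis_file(original_text: str, dictionary: set[str]) -> AnalysisFile:
    # S13: split at spaces (the English case described above).
    words = original_text.split()
    # S14: collect the words missing from the client dictionary.
    unregistered = [w for w in words if w.lower() not in dictionary]
    # S16: bundle the original text and the analysis result together.
    return AnalysisFile(original_text, words, unregistered)
```

Keeping the original text inside the same object reflects the fact that the original file and the analysis file travel to the server together.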
Next, each minimum-unit word in the analysis file is converted into a translated word according to the translated-word dictionary (S22). In the generation unit 24, the converted file is processed according to the translation rules and the appropriate-translation dictionary: words are reordered and combined, punctuation such as periods and commas is inserted where appropriate, and idioms, maxims, and similar expressions are translated according to an example dictionary storing a considerable number of sentence-level and phrase-level examples, thereby producing the translated text (S23).
When the text has been generated, the file transmission/reception agency 17 sends the original text and the translated text to a designated URL (Uniform Resource Locator). At the designated URL, a corporate advertisement, which provides advertising revenue, is added to the original and translated texts, and the result is output via the Internet to the destination client address. The destination client receives the original text, the translated text, and the corporate advertisement as an e-mail and presents them on output means such as a display. On a display, the original and translated texts are shown side by side together with the advertisement. The system operator who distributed the translation program to clients and servers thereby earns advertising revenue while also meeting a public, social demand.
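As a rough illustration of the final delivery step, the sketch below assembles one e-mail carrying the original text, the translated text, and the advertisement, using Python's standard `email.message.EmailMessage`. The message layout, the subject line, and the function name `compose_mail` are assumptions; the specification only states that the three items are delivered together.

```python
from email.message import EmailMessage


def compose_mail(original, translation, advertisement, sender, recipient):
    """Build one message holding original, translation, and advertisement."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Translated message"  # hypothetical subject line
    # Original first, translation second, advertisement last.
    msg.set_content(f"{original}\n\n{translation}\n\n--\n{advertisement}\n")
    return msg
```

A receiving client that displays the two texts side by side would need them as separate parts; a single plain-text body is used here only to keep the sketch short.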
Next, the operation of the analysis unit 12 of the sending client 100 is described with reference to the flowchart of FIG. 7. First, the original text is extracted from the file in which it is stored (S31) and is broken down into words by processing symbols and spaces. For English, for example, the text is split at the spaces and symbols placed before and after words, such as the period, the dash, and the quotation mark; for Japanese, words are extracted at symbols such as "、", "。", and "・" (S32). Next, a word search is started from the beginning of the sentence with reference to the client source-language word dictionary (S33), and it is determined whether any unregistered word is present (S34). If so, the search is repeated against the same dictionary with a different number of characters (S35). Then, in that case and also when no unregistered word was found, it is determined again whether any unregistered word remains (S36). If one remains, an unregistered flag is attached to the entire e-mail and the original text is adopted as it is; the result is stored in the storage means as analysis data, or processing moves directly to the next step (S37).
The analysis data is then generated together with the original text as an analysis file (S38), which the file transmission/reception agency 15 sends to a predetermined URL (S39). The predetermined URL belongs to the operator of the translation system, who counts how much analysis data is being transferred and may, at this stage, add a corporate advertisement to the analysis data at the request of an advertiser wishing to advertise internationally.
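Steps S32 to S37 can be sketched as a symbol-based split followed by a dictionary search that varies the number of characters. This is a simplified reading of the flowchart, not the disclosed program: the separator set, the 2-to-4 character window, and the names `split_on_symbols` and `segment` are assumptions.

```python
import re

# Illustrative separator set for S32 (an assumption, not an exhaustive list).
SEPARATORS = r'[\s.,\-"、。・]+'


def split_on_symbols(text: str) -> list[str]:
    """S32: cut the text at spaces and symbols, dropping empty pieces."""
    return [chunk for chunk in re.split(SEPARATORS, text) if chunk]


def segment(chunk: str, dictionary: set[str],
            min_len: int = 2, max_len: int = 4) -> tuple[list[str], list[str]]:
    """S33-S37: scan from the start, retrying with more characters."""
    words, unregistered = [], []
    pos = 0
    while pos < len(chunk):
        for length in range(min_len, max_len + 1):
            candidate = chunk[pos:pos + length]
            if candidate in dictionary:
                words.append(candidate)
                pos += len(candidate)
                break
        else:
            # No length matched: keep the remainder as-is and record it.
            rest = chunk[pos:]
            words.append(rest)
            unregistered.append(rest)
            break
    return words, unregistered
```

A non-empty `unregistered` list corresponds to the case in which the flag is attached and the text is passed on unchanged.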
Next, the operation of the conversion unit 23 of the translation server 200 is described with reference to the flowchart of FIG. 8. First, from the original text and the analysis data received by the file transmission/reception agency 16, it is determined whether the analysis data contains an unprocessed flag (S41). If it does, the unprocessed unregistered words are looked up in the word dictionary 29 held by the translation server 200, which is in the same language as the original text (S42). Then, in that case and also when no unprocessed flag was present, it is determined whether any unregistered word exists (S43); if so, the search is repeated against the source-language word dictionary 29 with a different number of characters (S44). It is determined once more whether any unregistered word remains (S45); if so, an inaccuracy notice is attached in the manner of a flag and the original text is passed unchanged to the next step. In this case, the unregistered words of the original text are sent to the learning function unit 25 as feedback and reflected in the client source-language word dictionary 14, so that those words are subsequently analyzed as they are.
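The server-side re-check can be sketched in a few lines. The function name `reanalyze_on_server`, the dictionary-shaped return value, and the use of a plain list as the feedback queue for the learning function are assumptions; the retry with a different number of characters (S44) is omitted for brevity.

```python
def reanalyze_on_server(unregistered, server_dictionary, learning_queue):
    """Re-check the client's unregistered words against the server dictionary."""
    # S42-S45: words the larger server dictionary still does not know.
    still_unknown = [w for w in unregistered if w not in server_dictionary]
    # Feedback: hand the remaining words to the learning function.
    learning_queue.extend(still_unknown)
    # The inaccuracy notice is modeled as a simple boolean.
    return {"inaccurate": bool(still_unknown), "unregistered": still_unknown}
```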
Next, both for e-mail analysis data with no unregistered words and for analysis data carrying an inaccuracy notice, the translated-word dictionary 27 is searched and each minimum-unit word is converted into a translated word (S47). It is then determined whether the translated-word data contains any unregistered word that is absent from the translated-word dictionary (S48). Any such word is kept in its original form (S49). Processing then ends and the data is output to the generation unit 24, while the result is also passed to the learning and maintenance processing for the other-language translated-word dictionary.
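A minimal sketch of the word-by-word conversion in S47 to S49 follows. It assumes the translated-word dictionary is a mapping from a source word to a single translated word, which the specification does not state; `convert_words` is a hypothetical name.

```python
def convert_words(words, translated_word_dictionary):
    """Convert each word; keep unknown words as-is and report them."""
    converted, missing = [], []
    for word in words:
        if word in translated_word_dictionary:
            # S47: replace the word with its dictionary translation.
            converted.append(translated_word_dictionary[word])
        else:
            # S49: no entry, so the original word is carried through.
            converted.append(word)
            missing.append(word)
    # `missing` is what would be passed to dictionary maintenance.
    return converted, missing
```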
The operation of the generation unit 24 of the translation server 200 is described next with reference to the flowchart of FIG. 9. First, the translated-word data from the conversion unit 23 is received and a correspondence table for grammatical manipulation is created. To make clear the placement of the subject, predicate, object, complement, conjunction, and so on, the table pairs these elements of the original text with the corresponding elements of the translated words (S51). Next, the translation rules and appropriate-translation dictionary 28 are consulted (S52) and the translation rules are applied to the translated-word data (S53). It is then determined whether any rule is unregistered (S54); if so, the example dictionary is consulted, and translations of idiomatic expressions, proverbs, and the like are searched for in it using the correspondence table created from the analysis data (S55). As with an ordinary example dictionary, the lookup may be made not only word by word but also at the sentence level.
Next, it is determined whether any example is unregistered (S56); if so, an approximate example is extracted from an approximate-example dictionary (not shown) (S57). Then, in that case and also when no example was unregistered, the translated words are combined with reference to the translation rules and the appropriate-translation dictionary to form the translated text of the e-mail (S58). When an approximate example has been used in step S57, it can be stored in the example dictionary as one of that dictionary's learning functions, so that it is available the next time the example dictionary is consulted.
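The example lookup with an approximate fallback (S55 to S57) and the learning step can be sketched as below. The specification refers to a separate approximate-example dictionary that is not shown; here string similarity from Python's standard `difflib` is swapped in as a stand-in, and the cutoff value and the name `lookup_example` are assumptions.

```python
import difflib


def lookup_example(phrase, example_dictionary, cutoff=0.8):
    """Return a stored translation, falling back to the closest example."""
    # S55: exact hit in the example dictionary.
    if phrase in example_dictionary:
        return example_dictionary[phrase]
    # S57 (stand-in): pick the most similar registered example, if any.
    close = difflib.get_close_matches(
        phrase, list(example_dictionary), n=1, cutoff=cutoff
    )
    if close:
        translation = example_dictionary[close[0]]
        # Learning: register the phrase so the next lookup is an exact hit.
        example_dictionary[phrase] = translation
        return translation
    # No usable example: the caller falls back to word-level output.
    return None
```

Writing the approximate match back into the dictionary is what makes the second lookup of the same phrase an exact hit.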
Combining the translated words in this way produces the translated text, which is sent together with the original text to the predetermined URL, where a corporate advertisement or the like is attached, and is then output via the Internet to the destination client.
The translation program delivered by the operator of this machine translation system is divided into an analysis part and a conversion/generation part. The analysis program following the flowchart of FIG. 7 is installed on the sending client, and the conversion/generation program following the flowcharts of FIGS. 8 and 9 is installed on the translation server of each country, so that processing is distributed in the manner of a client-server system.
In addition, by displaying a corporate advertisement alongside the original and translated texts in an e-mail or the like, the operator of the machine translation system can also earn advertising revenue.
The above embodiment shows an example in which the body text and the translated text are displayed side by side on the receiving client. If the receiving client has no need of the body text, however, the translated text alone may be displayed. In that case the translation server can ignore the body text and generate the translated text by conversion from the analysis data alone, which makes the translation processing simpler and faster.
(3) Example of machine translation
FIG. 10 shows examples of text handled by this machine translation system: translation from Korean into English and from Japanese into English.
FIG. 10(A) shows translation from Korean into English. In (a), the analysis unit processing, the original text is split into words separated by a delimiter symbol and analyzed. In (b), the conversion unit processing, the text is analyzed further with the source-language word dictionary in the translation server and delimited again, after which the corresponding translated words are extracted. Part (c) shows the translated text produced by the generation unit processing.
Similarly, FIG. 10(B) shows translation from Japanese into English. In (a), the analysis unit processing, analysis starts from two characters of the original text and proceeds to three or four characters; according to the result of matching against the source-language word dictionary, the original text is split into words separated by a delimiter symbol and analyzed. In (b), the conversion unit processing, the text is analyzed further with the source-language word dictionary in the translation server and delimited into minimum-unit words, after which the corresponding translated words are extracted from the translated-word dictionary. Part (c) shows the translated text produced by the generation unit processing according to the English translation rules and appropriate-translation dictionary, for example rules that capitalize the first letter, place a space between words, and append "?" to interrogative sentences.
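The three English generation rules named above (capitalize the first letter, separate words with spaces, append "?" to a question) can be sketched as follows. Ending a non-question with a period is an added assumption, as is the function name `assemble_english`; word reordering is outside the scope of this sketch.

```python
def assemble_english(words, is_question=False):
    """Join translated words into one English sentence."""
    if not words:
        return ""
    # Rule: one space between words.
    sentence = " ".join(words)
    # Rule: the first character is capitalized.
    sentence = sentence[0].upper() + sentence[1:]
    # Rule: "?" for questions; the period is an assumption of this sketch.
    return sentence + ("?" if is_question else ".")
```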
Because Korean and Japanese share the same word order for subject, object, verb, and so on, translation between them can combine the corresponding translated words directly, without using a translation table between the conversion unit and the generation unit.
Although the above embodiment describes distributed processing under the transfer method, the pivot method can be used as well. A sentence analysis program on the client side delimits the text sentence by sentence, and the server side converts it using an example dictionary or the like, which yields translations that follow the examples. Thus, if the client bears the analysis unit and the translation server is provided with the conversion unit, a translated text can be output just as with the transfer method.
The above embodiment mainly describes e-mail sent to a foreign country where another language is used. Translation into another language is not, however, limited to e-mail that must be translated in real time: for an e-mail with an attached document, the attachment may be translated as well, and the present invention may of course also be used to translate ordinary documents. Nor is the system limited to translation between national languages; it may also be applied within one country, for example among standard Japanese, the Osaka dialect, and the Satsuma dialect.
Furthermore, in telephony over the Internet, known as Internet telephony, a real-time Internet telephone system may be operated by recognizing the speech in the original language as text, analyzing that text with the machine translation system of the present invention, performing the conversion and generation processing on an intermediate processing server such as a provider or a bridge/router, and converting the result into the required speech.
Industrial applicability
According to the present invention described above, when a translated text is created from an original text by a sending client and a translation server via the Internet, the analysis part of the transfer method is executed on the client and the conversion and generation parts are processed on the translation server, instead of the whole translation being performed within the server. This distributes the processing and reduces the load on the translation server. In particular, because a translation server corresponding to each language is used, the dictionary of the language native to the place where the translation server is located can be used. The translated text can therefore be expressed in everyday language and, compared with stiff literary expressions, in a style characteristic of the Internet.

Claims

1. A translation server for use in a machine translation system that translates an original text into another language over a network, the translation server comprising: a conversion unit that receives original-text analysis data obtained by analyzing the original text into words and converts that data, based on an appropriate-translation dictionary and/or a source-language word dictionary and a translated-word dictionary, into other-language analysis data for each word; and a generation unit that generates, from the other-language analysis data converted by the conversion unit and based on translation rules and the translated-word dictionary, a translated text that is a sentence in the other language, wherein the generation unit generates and outputs a composite text in which the original text and the translated text can be displayed as a pair.
2. The translation server according to claim 1, wherein, for a word in the original-text analysis data that has not been analyzed, the original text may be output as it is, based on the source-language word dictionary.
3. A sending-side client for use in a machine translation system in which an original text is analyzed using a network, the analysis result is entrusted to an agent as an analysis file, a translation server converts the analysis file into other-language analysis data based on a dictionary database, and a translated text is generated from the word-by-word other-language analysis data and sent to the client at the destination address,
the client comprising: a communication processing unit capable of communicating over the network; an input unit for inputting the original text; an analysis unit that separates the original text into words and analyzes it; and an output unit that outputs the analysis data produced by the analysis unit to the communication processing unit as e-mail, wherein the analysis unit, depending on the language, separates and analyzes the words at each space or extracts the words according to symbols.
4. A machine translation system that translates an original text into another language over a network, the system comprising a client, a translation server, a destination client, and an agent that mediates among them, wherein:
the client comprises a communication processing unit capable of communicating over the network, an input unit for inputting the original text, an analysis unit that separates the original text into words and analyzes it, and an output unit that outputs the analysis data produced by the analysis unit to the communication processing unit as e-mail; and
the translation server comprises a conversion unit that receives original-text analysis data obtained by analyzing the original text into words and converts that data, based on an appropriate-translation dictionary and/or a source-language word dictionary and a translated-word dictionary, into other-language analysis data for each word, and a generation unit that generates, from the other-language analysis data converted by the conversion unit and based on translation rules and the translated-word dictionary, a translated text that is a sentence in the other language, the generation unit generating and outputting a composite text in which the original text and the translated text can be displayed as a pair.
5. The machine translation system according to claim 4, wherein an advertisement is presented together with the translated text displayed to the destination client.
6. A computer-readable recording medium storing a translation program to be installed on a translation server used in a machine translation system that translates an original text into another language over a network,
wherein the program causes the translation server to: analyze again, based on an installed word dictionary in the same language as the original text, first analysis data obtained by a sending client analyzing the original text into minimum-unit words; convert the resulting second analysis data into translated-word data based on an installed translated-word dictionary; generate a translated text from the translated-word data based on installed translation rules and appropriate-translation rules; and send the generated translated text to a destination client.
PCT/JP2001/000343 2000-01-25 2001-01-19 Machine translation system, translation server thereof, and client thereof WO2001055901A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU27073/01A AU2707301A (en) 2000-01-25 2001-01-19 Machine translation system, translation server thereof, and client thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2000-15340 2000-01-25
JP2000015340A JP2001209643A (en) 2000-01-25 2000-01-25 Machine translation system, translation server therefor and client therefor

Publications (1)

Publication Number Publication Date
WO2001055901A1 (en) 2001-08-02

Family

ID=18542660

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2001/000343 WO2001055901A1 (en) 2000-01-25 2001-01-19 Machine translation system, translation server thereof, and client thereof

Country Status (4)

Country Link
JP (1) JP2001209643A (en)
AU (1) AU2707301A (en)
TW (1) TW501030B (en)
WO (1) WO2001055901A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI385538B (en) * 2008-07-18 2013-02-11 Inventec Corp Translation system by words capturing and method thereof
KR101498456B1 (en) * 2010-07-06 2015-03-06 에스케이플래닛 주식회사 Apparatus and method for translating using encyclopedia
CN105740239A (en) * 2016-02-01 2016-07-06 中译语通科技(北京)有限公司 Translation method and system of character on webpage
TWI634435B (en) * 2017-07-31 2018-09-01 全家便利商店股份有限公司 Cloud translation and printing system and method
CN110598222B (en) * 2019-09-12 2023-05-30 北京金山数字娱乐科技有限公司 Language processing method and device, training method and device of language processing system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6347877A (en) * 1986-08-18 1988-02-29 Canon Inc Translation device
JPH02202143A (en) * 1989-01-31 1990-08-10 Toshiba Corp Electronic mail system
JPH1185753A (en) * 1997-09-11 1999-03-30 Technical Aatsu:Kk Multilanguage translating method with no erroneous translation
JPH11316720A (en) * 1998-02-05 1999-11-16 Nippon Computer Eididdo Design Kk Network communicating system
JPH11250066A (en) * 1998-03-04 1999-09-17 Casio Comput Co Ltd Electronic mail device and medium for recording electronic mail processing program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Interlingual MT - An industrial initiative", MACHINE TRANSLATION SUMMIT, 1987, pages 135 - 140, XP002937886 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7330810B2 (en) 2002-06-07 2008-02-12 International Business Machines Corporation Method and apparatus for developing a transfer dictionary used in transfer-based machine translation system
US7487082B2 (en) 2002-06-07 2009-02-03 International Business Machines Corporation Apparatus for developing a transfer dictionary used in transfer-based machine translation system

Also Published As

Publication number Publication date
TW501030B (en) 2002-09-01
AU2707301A (en) 2001-08-07
JP2001209643A (en) 2001-08-03

Similar Documents

Publication Publication Date Title
US7383542B2 (en) Adaptive machine translation service
CA2469593C (en) Adaptive machine translation
EP0519049B1 (en) Machine translation and telecommunications system
US5497319A (en) Machine translation and telecommunications system
US5845143A (en) Language conversion system and text creating system using such
CN100428241C (en) System and method for defining and translating chat abbreviations
US7848916B2 (en) System, method and program product for bidirectional text translation
US20020169592A1 (en) Open environment for real-time multilingual communication
US20020193986A1 (en) Pre-translated multi-lingual email system, method, and computer program product
JPH03278174A (en) Translation method and system for communication between different language
JP2004334791A (en) Machine translation apparatus, data processing method and program
JP2003529845A (en) Method and apparatus for providing multilingual translation over a network
WO2001055901A1 (en) Machine translation system, translation server thereof, and client thereof
JPH10312382A (en) Similar example translation system
WO2010142422A1 (en) A method for inter-lingual electronic communication
JP4940606B2 (en) Translation system, translation apparatus, translation method, and program
Brix et al. Suggestion for a more Productive Workflow and Infrastructure of the Permanent Commission on Standardization of Terminology
JP3311957B2 (en) User dictionary construction method, user dictionary construction device, translation method, and translation device
JP3389313B2 (en) Machine translation equipment
JP3892227B2 (en) Machine translation system
JPH1153363A (en) Translation machine and recording medium
JP2003114890A (en) Translation device, translation method, translation server, and program
JPH10187732A (en) Multilingual communication system
JP4033622B2 (en) Machine translation system
JP2008210216A (en) User retrieval device, method, and program

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
121 EP: the EPO has been informed by WIPO that EP was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: COMMUNICATION PURSUANT TO RULE 69 EPC (EPO FORM 1205A OF 151102)

122 EP: PCT application non-entry in European phase