JPS6133577A - Mechanical translator - Google Patents

Mechanical translator

Info

Publication number
JPS6133577A
Authority
JP
Japan
Prior art keywords
character string
katakana
dictionary
foreign language
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP15448584A
Other languages
Japanese (ja)
Other versions
JPH0346865B2 (en)
Inventor
Masato Obe
正人 小部
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1984-07-25
Filing date
1984-07-25
Publication date
1986-02-17
Application filed by Fujitsu Ltd
Priority to JP15448584A
Publication of JPS6133577A
Publication of JPH0346865B2
Granted

Abstract

PURPOSE: To reduce dictionary capacity and the load of machine translation by replacing each katakana character string in an input text that corresponds to a foreign-language word (katakana being the angular form of the Japanese kana syllabary) with that word written in the foreign language, and sending the resulting text to a translation unit. CONSTITUTION: A translation device 1 consists of a preprocessing unit 2 and a translation unit 3. The preprocessing unit 2 replaces katakana character strings that correspond to foreign words in a Japanese input text with English words written in the alphabet, and sends the resulting text to the translation unit 3. The translation unit 3 has a Japanese analysis unit 4 and an English generation unit 5. The Japanese analysis unit 4 determines the part of speech of each word and the dependency relations between words, and the English generation unit 5 generates English in accordance with the analysis result of the Japanese analysis unit 4.

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to a machine translation device that, as preprocessing for machine translation, converts katakana character strings representing foreign words in an input text into character strings written in the foreign language, and then sends the text to a translation unit.

[Prior art and problems]

In conventional machine translation devices that translate Japanese into English, katakana character strings are looked up only in the Japanese word dictionary, so English words written in katakana have to be registered in the Japanese word dictionary. Entering the katakana form of an English word in the Japanese word dictionary while also entering the alphabetic form of the same word in the English word dictionary has the drawback of not only increasing dictionary capacity but also complicating the translation process.

[Purpose of the invention]

The present invention eliminates the above drawback. Its object is to provide a machine translation device that allows dictionary capacity to be reduced and the load of machine translation to be lightened.

[Means to achieve the purpose]

To this end, the machine translation device of the present invention comprises a preprocessing unit having a katakana string extraction unit, a dictionary access unit, and a text merging unit; a translation unit; a Japanese dictionary; and a foreign language dictionary. In the foreign language dictionary, a character string in katakana notation is written in correspondence with each character string in foreign-language notation. The katakana string extraction unit is configured to send the katakana character strings in the input text to the dictionary access unit and everything other than katakana character strings to the text merging unit. The dictionary access unit is configured to check whether a katakana character string matching the one it receives exists in the Japanese dictionary; if not, to check whether a matching katakana character string exists in the foreign language dictionary; and if so, to send the corresponding character string in foreign-language notation to the text merging unit. The text merging unit is configured to replace the katakana character string in the input text that corresponds to the received foreign-language character string with that character string, and to send the text with the replacements made to the translation unit.
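The lookup order described for the dictionary access unit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `JAPANESE_DICT`, `FOREIGN_DICT`, and `access_dictionaries` are hypothetical, the entries are toy data, and the foreign language dictionary is shown already inverted (katakana to foreign spelling) for brevity.

```python
from typing import Optional

# Hypothetical toy dictionaries; names and entries are illustrative only.
JAPANESE_DICT = {"コンセント"}                # katakana words registered as Japanese
FOREIGN_DICT = {"コンピュータ": "computer"}  # katakana notation -> foreign spelling


def access_dictionaries(katakana: str) -> Optional[str]:
    """Return the foreign spelling to substitute, or None to keep the katakana."""
    # The Japanese dictionary is consulted first; a hit means no replacement.
    if katakana in JAPANESE_DICT:
        return None
    # Only on a miss is the foreign language dictionary consulted.
    return FOREIGN_DICT.get(katakana)
```

A string found in neither dictionary also yields `None` here, so it would pass through to the translation unit unchanged; the source does not spell out that case, so this behavior is an assumption of the sketch.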

[Embodiments of the invention]

Embodiments of the present invention will be described below with reference to the drawings.

FIG. 1 is a diagram showing an outline of one embodiment of the present invention. In FIG. 1, 1 denotes a translation device, 2 a preprocessing unit, 3 a translation unit, 4 a Japanese analysis unit, and 5 an English generation unit.

The translation device 1 consists of a preprocessing unit 2 and a translation unit 3. The preprocessing unit 2 replaces katakana character strings of loanwords in a Japanese input text with English words written in the alphabet, and sends the resulting text to the translation unit 3. The translation unit 3 has a Japanese analysis unit 4 and an English generation unit 5. Note that the input text is assumed to be already divided into words. The Japanese analysis unit 4 determines the part of speech of each word and the dependency relations between words. The English generation unit 5 generates English in accordance with the analysis result of the Japanese analysis unit 4.
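As a rough picture of how the units in FIG. 1 chain together, the following Python sketch composes three stages in the stated order. The function name `make_translation_device` and the stand-in stages are hypothetical and carry no linguistic logic; they only show the direction of data flow.

```python
from typing import Callable, List

Words = List[str]  # the input text, already divided into words


def make_translation_device(
    preprocess: Callable[[Words], Words],
    analyze: Callable[[Words], object],
    generate: Callable[[object], str],
) -> Callable[[Words], str]:
    """Compose preprocessing unit 2 with the two stages of translation unit 3."""

    def translate(words: Words) -> str:
        replaced = preprocess(words)  # katakana loanwords -> alphabetic English
        analysis = analyze(replaced)  # role of Japanese analysis unit 4
        return generate(analysis)     # role of English generation unit 5

    return translate


# Stand-in stages, only to show the order in which data flows.
device = make_translation_device(
    preprocess=lambda ws: ["computer" if w == "コンピュータ" else w for w in ws],
    analyze=lambda ws: tuple(ws),
    generate=lambda analysis: " | ".join(analysis),
)
```

Keeping the preprocessing stage separate means the analysis stage never sees katakana loanwords, which is the point of the arrangement described above.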

FIG. 2 is a block diagram of one embodiment of the preprocessing unit 2. In FIG. 2, 6 denotes a katakana string extraction unit, 7 a dictionary access unit, 8 a Japanese word dictionary, 9 an English word dictionary, and 10 a text merging unit.

The katakana string extraction unit 6 extracts katakana strings from the input text, sends the extracted katakana strings to the dictionary access unit 7, and sends everything other than katakana to the text merging unit 10. The dictionary access unit 7 searches the Japanese word dictionary 8 for the katakana string it receives. If the corresponding word exists in the Japanese word dictionary 8, it sends the katakana string unchanged to the text merging unit 10; if the word does not exist in the Japanese word dictionary 8, it searches the English word dictionary 9 and sends the corresponding English word in alphabetic notation to the text merging unit 10. The text merging unit 10 replaces the corresponding katakana string in the input text with the alphabetic English word it receives, and sends the resulting text to the translation unit 3.

The Japanese word dictionary 8 stores a plurality of Japanese word records. Each Japanese word record consists of a kana headword field, a kanji field storing the corresponding kanji, a field storing the part of speech, a field storing the usage count, a symbol field storing a symbol that represents the word, and so on. The English word dictionary 9 stores a plurality of English word records. Each English word record consists of an alphabetic headword field, a field storing the corresponding English word in katakana notation, a field storing the part of speech, a field storing the usage count, a symbol field, and so on.
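The two record layouts can be pictured as simple data classes. The sketch below is an assumption-laden illustration: the class names, field names, and field types are hypothetical, and the sample values (including the symbol) are invented. It also shows one possible way to look up an English word record by its katakana field.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JapaneseWordRecord:
    kana: str            # kana headword field
    kanji: str           # corresponding kanji
    part_of_speech: str  # part-of-speech type
    use_count: int       # usage count
    symbol: str          # symbol representing the word


@dataclass
class EnglishWordRecord:
    spelling: str        # alphabetic headword field
    katakana: str        # the same word in katakana notation
    part_of_speech: str
    use_count: int
    symbol: str


def index_by_katakana(records: List[EnglishWordRecord]) -> Dict[str, EnglishWordRecord]:
    """Build a katakana -> record map so a katakana string can be looked up directly."""
    return {record.katakana: record for record in records}


# Illustrative record; every value here is made up for the example.
english_records = [EnglishWordRecord("computer", "コンピュータ", "noun", 0, "101")]
by_katakana = index_by_katakana(english_records)
```

Because the katakana form lives inside the English word record, the same loanword does not need a second entry in the Japanese word dictionary.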

Note that wasei-eigo (English-derived words coined in Japan) such as コンセント (konsento, "electrical outlet") are stored in the Japanese word dictionary 8.

FIG. 3 is a diagram explaining the operation of the embodiment of FIG. 2. When text such as ■ is input, a katakana string such as ■ is sent from the katakana string extraction unit 6 to the dictionary access unit 7, character strings such as ■ are sent from the katakana string extraction unit 6 to the text merging unit 10, an alphabetic character string such as ■ is sent from the dictionary access unit 7 to the text merging unit 10, and text such as ■ is output from the text merging unit 10.
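The extract, look up, and merge flow can be approximated as follows. This is a minimal sketch, not the device's actual mechanism: it finds katakana runs with a regular expression over the Unicode katakana range instead of relying on pre-segmented words, and the names `extract` and `merge` and the sample replacement table are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Assumption: a katakana string is a run of U+30A1..U+30F6 plus the long-vowel mark.
KATAKANA_RUN = re.compile(r"[ァ-ヶー]+")


def extract(text: str) -> List[Tuple[str, bool]]:
    """Split text into (segment, is_katakana) pairs, preserving order."""
    segments: List[Tuple[str, bool]] = []
    pos = 0
    for match in KATAKANA_RUN.finditer(text):
        if match.start() > pos:
            segments.append((text[pos:match.start()], False))
        segments.append((match.group(), True))
        pos = match.end()
    if pos < len(text):
        segments.append((text[pos:], False))
    return segments


def merge(segments: List[Tuple[str, bool]], replacements: Dict[str, str]) -> str:
    """Rebuild the text, substituting katakana segments that have a replacement."""
    return "".join(
        replacements.get(segment, segment) if is_katakana else segment
        for segment, is_katakana in segments
    )


segments = extract("私はスーパーコンピュータを開発する。")
merged = merge(segments, {"スーパーコンピュータ": "Super Computer"})
```

Katakana segments without an entry in the replacement table are emitted unchanged, which corresponds to the case where the word is found in the Japanese word dictionary.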

The translation unit 3 operates as follows.

For example, assume that the character string "私はSuper Computerを開発する。" is input to the translation unit 3. The Japanese analysis unit 4 looks up the part of speech and the symbol of each word in the input text. "私" (I) is a pronoun, "は" is a binding particle, "Super Computer" is a noun, "を" is a case particle, and "開発する" (develop) is a verb.

In addition, "私" has, for example, the symbol 001, and "開発する" has, for example, the symbol 005. Next, the Japanese analysis unit 4 recognizes that "私" and "は" form a noun phrase (bunsetsu) and that "Super Computer" and "を" form a noun phrase, and further that the noun phrase "私は" modifies "開発する" and that the noun phrase "Super Computerを" also modifies "開発する". The analysis result of the Japanese analysis unit 4 is passed to the English generation unit 5, which generates the English sentence "I develop a Super Computer." on the basis of this result.
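The worked example can be traced with a toy analysis and generation step. The sketch below is heavily simplified and hypothetical: the tables `POS` and `LEXICON`, the functions `analyze` and `generate`, the fixed subject-verb-object template, and the hard-coded article "a" are all assumptions made for this one sentence, not a description of the actual analysis or generation method.

```python
from typing import Dict, List, Tuple

# Part-of-speech table for the example sentence (labels as in the text above).
POS: Dict[str, str] = {
    "私": "pronoun",
    "は": "binding particle",
    "Super Computer": "noun",
    "を": "case particle",
    "開発する": "verb",
}

# Hypothetical transfer lexicon; words without an entry are passed through as-is.
LEXICON: Dict[str, str] = {"私": "I", "開発する": "develop"}


def analyze(words: List[str]) -> Dict[str, object]:
    """Pair each nominal with the particle that follows it; all pairs modify the verb."""
    phrases: List[Tuple[str, str]] = []
    head = None
    for word in words:
        pos = POS[word]
        if pos in ("pronoun", "noun"):
            head = word
        elif pos.endswith("particle") and head is not None:
            phrases.append((head, word))
            head = None
        elif pos == "verb":
            return {"verb": word, "phrases": phrases}
    raise ValueError("no verb found")


def generate(analysis: Dict[str, object]) -> str:
    """Fill a fixed subject-verb-object template from the analysis result."""
    subject = obj = ""
    for head, particle in analysis["phrases"]:
        english = LEXICON.get(head, head)
        if particle == "は":
            subject = english
        elif particle == "を":
            obj = "a " + english
    return f"{subject} {LEXICON[analysis['verb']]} {obj}."


analysis = analyze(["私", "は", "Super Computer", "を", "開発する"])
sentence = generate(analysis)
```

Note that "Super Computer" needs no lexicon entry in this sketch, because the preprocessing step has already put it into its English form.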

[Effect of the invention]

As is clear from the above description, the present invention makes it possible to achieve effects such as reducing dictionary capacity and lightening the load of machine translation.

[Brief explanation of drawings]

FIG. 1 is a diagram showing an outline of one embodiment of the translation device of the present invention, FIG. 2 is a block diagram of one embodiment of the preprocessing unit of FIG. 1, and FIG. 3 is a diagram for explaining the operation of FIG. 2.

1: translation device, 2: preprocessing unit, 3: translation unit, 4: Japanese analysis unit, 5: English generation unit, 6: katakana string extraction unit, 7: dictionary access unit, 8: Japanese word dictionary, 9: English word dictionary, 10: text merging unit.

Claims (1)

[Claims]

A machine translation device comprising a preprocessing unit having a katakana string extraction unit, a dictionary access unit, and a text merging unit; a translation unit; a Japanese dictionary; and a foreign language dictionary, characterized in that: in the foreign language dictionary, a character string in katakana notation is written in correspondence with each character string in foreign-language notation; the katakana string extraction unit is configured to send the katakana character strings in the input text to the dictionary access unit and everything other than katakana character strings to the text merging unit; the dictionary access unit is configured to check whether a katakana character string matching the one it receives exists in the Japanese dictionary, if not, to check whether a matching katakana character string exists in the foreign language dictionary, and if so, to send the corresponding character string in foreign-language notation to the text merging unit; and the text merging unit is configured to replace the katakana character string in the input text that corresponds to the received foreign-language character string with that character string, and to send the text with the replacements made to the translation unit.
JP15448584A 1984-07-25 1984-07-25 Mechanical translator Granted JPS6133577A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP15448584A JPS6133577A (en) 1984-07-25 1984-07-25 Mechanical translator

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP15448584A JPS6133577A (en) 1984-07-25 1984-07-25 Mechanical translator

Publications (2)

Publication Number Publication Date
JPS6133577A true JPS6133577A (en) 1986-02-17
JPH0346865B2 JPH0346865B2 (en) 1991-07-17

Family

ID=15585271

Family Applications (1)

Application Number Title Priority Date Filing Date
JP15448584A Granted JPS6133577A (en) 1984-07-25 1984-07-25 Mechanical translator

Country Status (1)

Country Link
JP (1) JPS6133577A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03211669A (en) * 1990-01-17 1991-09-17 Nec Corp Mechanical translation device
JPH07271786A (en) * 1994-10-20 1995-10-20 Casio Comput Co Ltd Word processor

Also Published As

Publication number Publication date
JPH0346865B2 (en) 1991-07-17

Similar Documents

Publication Publication Date Title
JPH0689304A (en) Method and apparatus for preparing text used by text processing system
US20070179932A1 (en) Method for finding data, research engine and microprocessor therefor
Tufiş et al. Automatic diacritics insertion in Romanian texts
Marsi et al. Memory-based morphological analysis generation and part-of-speech tagging of Arabic
JPH05266069A (en) Two-way machie translation system between chinese and japanese languages
Tzoukermann et al. Combining linguistic knowledge and statistical learning in French part-of-speech tagging
JPS61100861A (en) Document editing device
Nongmeikapam et al. A transliteration of CRF based Manipuri POS tagging
Jha et al. Inflectional morphology analyzer for Sanskrit
JPS5892063A (en) Idiom processing system
JPS6133577A (en) Mechanical translator
Saito et al. Multi-language named-entity recognition system based on HMM
Kanaan et al. An improved algorithm for the extraction of triliteral Arabic roots
Hiro et al. Word‐sense disambiguation with a corpus‐based semantic network
Carter Lattice-based word identification in CLARE
Freigang Automation of translation: past, presence, and future
Myint et al. Morpheme-Based Myanmar Word Segmenter
JPS62130458A (en) Kana to kanji conversion processing system
KR100248386B1 (en) Japanese morpheme analyzing method and apparatus using human-readible morpheme connection information and character information
Modi POS Tagging and Structural Annotation of Handwritten Text Image Corpus of Devnagari Script
Pennanen What Happens in Conversion?.
JPS63115264A (en) Document processor
JPS6395573A (en) Method for processing unknown word in analysis of japanese sentence morpheme
Yona A finite-state based morphological analyzer for Hebrew
JPH0581313A (en) Dictionary preparing device