JP2830097B2 - Sentence search method - Google Patents
Sentence search method
- Publication number
- JP2830097B2 JP1176516A JP17651689A
- Authority
- JP
- Japan
- Prior art keywords
- search
- sentence
- information
- input sentence
- search condition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Description
[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to a method for retrieving sentences in a natural language processing system.
[Description of the Related Art] Conventionally, one kind of search condition for retrieving sentences is a regular expression over the surface character string. A search on surface characters also retrieves sentences that merely contain the pattern as a substring. Another approach uses morphological analysis to search in units of morphemes, but because morphemes generally have multiple possible parts of speech and multiple meanings, sentences other than the required ones are often retrieved.
Because the conventional methods described above retrieve many unnecessary sentences, they have the drawback that a further extraction pass is needed to obtain only the required sentences.
The sentence search method according to the present invention is used in a natural language analysis system that has means for reading an input sentence, means for performing dictionary lookup on the input sentence, and means for analyzing the input sentence using the dictionary information obtained by the lookup. It comprises means for specifying a search condition that combines regular expressions, dictionary information, syntactic information, and semantic information, and means for matching the input sentence against the search condition on the basis of the morphological analysis result, syntactic analysis result, and semantic analysis result of the input sentence.
Hereinafter, the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing one embodiment of the sentence search method according to the present invention. In the figure, a natural language input sentence read into the input sentence reading unit 1 through communication line 01 is passed to the morphological analysis unit 2 via communication line 12. The morphological analysis unit 2 obtains dictionary information for the input sentence from the dictionary unit 3 via communication line 23, performs morphological analysis, and passes the morphological analysis result to the search condition matching unit 5 and the syntactic analysis unit 6 via communication lines 25 and 26. The syntactic analysis unit 6 performs syntactic analysis and passes the syntactic analysis result to the search condition matching unit 5 and the semantic analysis unit 7 via communication lines 65 and 67. The semantic analysis unit 7 performs semantic analysis and passes the semantic analysis result to the search condition matching unit 5 via communication line 75.
The search condition read into the search condition specifying unit 4 through communication line 04 is passed to the search condition matching unit 5 via communication line 45. The search condition matching unit 5 matches the input sentence against the search condition using the morphological, syntactic, and semantic analysis results and, if the condition is satisfied, outputs the sentence via communication line 50.
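The data flow of FIG. 1 can be pictured with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the toy English dictionary, the `Token` class, the function names, and the one-line rules standing in for real syntactic and semantic analysis. The sketch also annotates tokens in place, whereas the figure shows each analysis unit sending its result to the matching unit separately.

```python
from dataclasses import dataclass, field

# Hypothetical toy dictionary (stands in for dictionary unit 3).
DICTIONARY = {
    "robot": {"pos": "noun", "meaning": "artifact"},
    "he": {"pos": "pronoun", "meaning": "human"},
    "moves": {"pos": "verb"},
    "the": {"pos": "determiner"},
    "box": {"pos": "noun", "meaning": "artifact"},
}

@dataclass
class Token:
    surface: str
    attrs: dict = field(default_factory=dict)  # dictionary information
    syntax: str = ""                           # syntactic role, e.g. "subject"
    case: str = ""                             # semantic case, e.g. "agent"

def morphological_analysis(sentence):
    """Unit 2: split into tokens and attach dictionary information."""
    return [Token(w, dict(DICTIONARY.get(w, {}))) for w in sentence.lower().split()]

def syntactic_analysis(tokens):
    """Unit 6 (toy rule): a nominal before the first verb is the subject, after it an object."""
    seen_verb = False
    for t in tokens:
        pos = t.attrs.get("pos")
        if pos == "verb":
            seen_verb = True
        elif pos in ("noun", "pronoun"):
            t.syntax = "object" if seen_verb else "subject"
    return tokens

def semantic_analysis(tokens):
    """Unit 7 (toy rule): a human subject is an agent, any other subject an instrument."""
    for t in tokens:
        if t.syntax == "subject":
            t.case = "agent" if t.attrs.get("meaning") == "human" else "instrument"
    return tokens

def search(sentences, condition):
    """Unit 5: output only the sentences whose analysis satisfies the condition."""
    hits = []
    for s in sentences:
        tokens = semantic_analysis(syntactic_analysis(morphological_analysis(s)))
        if condition(tokens):
            hits.append(s)
    return hits

def pronoun_agent(tokens):
    """Hypothetical condition: some pronoun fills the agent case."""
    return any(t.attrs.get("pos") == "pronoun" and t.case == "agent" for t in tokens)

print(search(["He moves the box", "The robot moves the box"], pronoun_agent))
```

Only the first sentence is output, because only there does a pronoun fill the agent case; a search on surface characters or on part of speech alone could not express that distinction.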
FIG. 2 is an explanatory diagram showing examples of search conditions and of sentences retrieved under those conditions. A search condition is written as a surface character string, the dictionary information for that string, and syntactic information, listed with commas and enclosed in parentheses. A surface character string not enclosed in parentheses may be contained in the sentence as a substring. Dictionary information is expressed as an attribute and an attribute value. The specification "part of speech・noun" indicates that the part of speech of the morpheme is a noun. The specification "meaning・artifact" indicates that the meaning of the morpheme is an artifact. Syntactic information is written as, for example, "subject" or "object". Surface characters are specified with regular expressions: a dot (.) denotes any character, and an asterisk (*) denotes zero or more repetitions. If no surface characters or dictionary information are specified inside the parentheses, that part of the condition is unconstrained. The logical operators & and | denote AND and OR, respectively. Dictionary information items separated by commas inside the parentheses are combined as an AND condition.
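As a rough illustration of how one parenthesized unit might be checked against one morpheme, the following Python sketch applies the rules just described: an empty field is unconstrained, comma-separated dictionary items are ANDed, and `|` gives alternatives. The function name `matches_unit`, the token layout, and the English attribute names are hypothetical. Anchoring the parenthesized surface pattern to the whole morpheme with `re.fullmatch` is an assumption, and the `&` operator is not handled.

```python
import re

def matches_unit(token, surface="", dictionary=(), syntax="", semantic=""):
    """Check one morpheme against one parenthesized unit of a search condition.

    token: dict with keys 'surface', 'attrs' (attribute -> value), 'syntax', 'semantic'.
    dictionary: comma-separated items of the unit, each like 'pos・noun|pos・pronoun'.
    """
    # An empty field places no constraint on the morpheme.
    if surface and not re.fullmatch(surface, token["surface"]):
        return False
    for item in dictionary:  # items listed with commas: AND
        alternatives = [alt.split("・") for alt in item.split("|")]  # '|': OR
        if not any(token["attrs"].get(a.strip()) == v.strip() for a, v in alternatives):
            return False
    if syntax and token["syntax"] != syntax:
        return False
    if semantic and token["semantic"] != semantic:
        return False
    return True

robot = {
    "surface": "robot",
    "attrs": {"pos": "noun", "meaning": "artifact"},
    "syntax": "subject",
    "semantic": "instrument",
}
# (noun or pronoun) AND artifact, used as the subject
print(matches_unit(robot, dictionary=["pos・noun|pos・pronoun", "meaning・artifact"], syntax="subject"))
```

Keeping each field optional mirrors the rule that an unspecified part of the unit is unconditional, so `matches_unit(robot)` with no constraints accepts any morpheme.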
The basic unit of a search condition has the following form. These basic units are written one after another in their order of appearance.
surface characters.*(surface characters, attribute・value | attribute・value | ……, syntactic information, semantic information).*
FIG. 2(a) is an example of retrieving sentences in which a noun or pronoun that is the subject and fills the object case of a predicate, and then an arbitrary verb, appear in this order. FIG. 2(b) is an example of retrieving sentences in which a pronoun that is the subject and fills the agent case of a predicate, and then an arbitrary verb, appear in this order. FIG. 2(c) is an example of retrieving sentences in which a noun that is the subject, denotes an artifact, and fills the instrument case of a predicate, and then an arbitrary verb, appear in this order.
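The "in this order" requirement of these examples amounts to an ordered match with arbitrary gaps between units, which is what the `.*` between units expresses. The Python sketch below shows that idea under stated assumptions: each unit is modeled as a predicate over an already analyzed token, and the token fields, the sample sentence, and the condition modeled on example (b) are invented for illustration.

```python
def matches_in_order(tokens, units):
    """True if tokens contain, in order, one token satisfying each unit.

    Tokens skipped between two matched units play the role of '.*'.
    """
    remaining = iter(tokens)
    for unit in units:
        # Consume tokens until one satisfies this unit; later units
        # can then only match tokens that come after it.
        if not any(unit(tok) for tok in remaining):
            return False
    return True

# Hypothetical analyzed sentence: "he quickly opens"
sentence = [
    {"surface": "he", "pos": "pronoun", "syntax": "subject", "case": "agent"},
    {"surface": "quickly", "pos": "adverb", "syntax": "", "case": ""},
    {"surface": "opens", "pos": "verb", "syntax": "", "case": ""},
]

# Modeled on example (b): a pronoun that is the subject and agent, then any verb.
condition_b = [
    lambda t: t["pos"] == "pronoun" and t["syntax"] == "subject" and t["case"] == "agent",
    lambda t: t["pos"] == "verb",
]

print(matches_in_order(sentence, condition_b))
```

Taking the earliest token that satisfies each unit is sufficient here, because each unit is tested against a single token independently of the others.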
As described in detail above, the sentence search method of the present invention allows a search condition to combine regular expressions over surface characters with dictionary information, syntactic information, and semantic information, and therefore has the effect of further improving both search precision and the expressive power of search specifications.
FIG. 1 is a block diagram showing one embodiment of the sentence search method according to the present invention, and FIG. 2 is an explanatory diagram showing examples of search conditions and retrieved sentences. 1: input sentence reading unit, 2: morphological analysis unit, 3: dictionary unit, 4: search condition specifying unit, 5: search condition matching unit, 6: syntactic analysis unit, 7: semantic analysis unit.
Continued from the front page (58) Fields searched (Int.Cl.6, DB name) G06F 17/30 G06F 17/27 JICST Science and Technology Literature File
Claims (1)
1. A sentence search method for use in a natural language analysis system having means for reading an input sentence, means for performing dictionary lookup on the input sentence, and means for analyzing the input sentence using the dictionary information obtained by the lookup, the method characterized by comprising: means for specifying a search condition that combines a regular expression, dictionary information, syntactic information, and semantic information; and means for matching the input sentence against the search condition on the basis of the morphological analysis result, syntactic analysis result, and semantic analysis result of the input sentence.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP1176516A JP2830097B2 (en) | 1989-07-06 | 1989-07-06 | Sentence search method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP1176516A JP2830097B2 (en) | 1989-07-06 | 1989-07-06 | Sentence search method |
Publications (2)
Publication Number | Publication Date |
---|---|
JPH0340067A (en) | 1991-02-20 |
JP2830097B2 (en) | 1998-12-02 |
Family
ID=16014991
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP1176516A Expired - Fee Related JP2830097B2 (en) | 1989-07-06 | 1989-07-06 | Sentence search method |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP2830097B2 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB9713019D0 (en) * | 1997-06-20 | 1997-08-27 | Xerox Corp | Linguistic search system |
JP5259462B2 (en) * | 2009-03-12 | 2013-08-07 | 株式会社東芝 | Apparatus, method and program for supporting search |
1989
- 1989-07-06 JP JP1176516A patent/JP2830097B2/en not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
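隅田、提, "A translation support system with a flexible text search function based on syntactic matching," Proceedings of the 37th National Convention of the Information Processing Society of Japan (II), pp. 978-979 (1988-09-12) |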
Also Published As
Publication number | Publication date |
---|---|
JPH0340067A (en) | 1991-02-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP2783558B2 (en) | Summary generation method and summary generation device | |
JP3691844B2 (en) | Document processing method | |
JP3372532B2 (en) | Computer-readable recording medium for emotion information extraction method and emotion information extraction program | |
JPS6318458A (en) | Method and apparatus for extracting feeling information | |
JP2830097B2 (en) | Sentence search method | |
JP2830098B2 (en) | Sentence search method | |
JP2830099B2 (en) | Sentence search method | |
JP2771976B2 (en) | Language analyzer | |
JP3016040B2 (en) | Natural language processing system | |
JP2000250913A (en) | Example type natural language translation method, production method and device for list of bilingual examples and recording medium recording program of the production method and device | |
JP4054353B2 (en) | Machine translation apparatus and machine translation program | |
JP2958044B2 (en) | Kana-Kanji conversion method and device | |
JPH0668070A (en) | Compound word dictionary registering device | |
JPH0571982B2 (en) | ||
JP3132563B2 (en) | Document creation support device | |
KOGKITSIDOU et al. | Normalisation of 16th and 17th century texts in French and geographical named entity recognition |
JP2928246B2 (en) | Translation support device | |
JPH0916593A (en) | Technical term extractor and document understanding supporting system | |
JP2655711B2 (en) | Homomorphic reading system | |
JP2838849B2 (en) | Language processor | |
JPH09185629A (en) | Machine translation method | |
JPH0244462A (en) | Natural language processor | |
JP2870278B2 (en) | Meaning selection device | |
JPS63109572A (en) | Derivative processing system | |
JPH0320866A (en) | Text base retrieval system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
LAPS | Cancellation because of no payment of annual fees |