JP6260208B2 - Text summarization device

Info

Publication number: JP6260208B2
Application number: JP2013231111A
Authority: JP
Inventors: 辰彦斉藤; 貴弘大塚; 正山浦
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2013-11-07
Filing date: 2013-11-07
Publication date: 2018-01-17
Anticipated expiration: 2033-11-07
Also published as: JP2015090663A

Description

The present invention relates to a text summarization device that summarizes input text data to generate summary text data.

Devices that summarize input text data and read the summary aloud as speech are known (see, for example, Patent Document 1). Such a summary read-aloud device uses the importance assigned to each word in the input text to summarize the text so that the summarization rate set by the user is achieved.

Speech synthesis devices are also known that divide input text data into a plurality of partial texts with a corresponding plurality of importance values and, based on a set speed command, skip the partial texts of low importance to perform speed reading (see, for example, Patent Document 2).

Patent Document 1: JP 2001-282815 A
Patent Document 2: JP H05-181491 A

However, although conventional devices can change the degree of summarization of the input text by changing the summarization rate or the speed command, they cannot dynamically change the importance assigned to the partial texts that make up the text data according to the situation. As a result, a partial text assigned low importance may be left out of the summary text even though, in some situations, it should be included and provided to the user. Conversely, a partial text assigned high importance may be included in the summary text even though, in some situations, it does not need to be provided to the user. Conventional devices therefore cannot always provide the content of the summary text, that is, the summary information, appropriately to the user.

The present invention has been made to solve the above problem, and its object is to provide a text summarization device that can appropriately provide summary information to a user.

A text summarization device according to the present invention includes: a data input unit to which text data composed of a plurality of partial text data is input; an importance storage unit that stores the importance assigned to the partial text data input to the data input unit; an importance changing unit that changes the importance based on history information of partial text data included in text data input in the past; and a data processing unit that, based on the importance, extracts one or more partial text data from the text data input to the data input unit to generate summary text data. When the text data input in the past is text data relating to emergency information, the importance changing unit increases the importance corresponding to the partial text data included in that text data.

According to the text summarization device of the present invention, the importance is changed based on history information of the partial text data included in past text data, so summary information can be provided appropriately to the user.

A diagram showing a configuration example of the text summarization device according to Embodiment 1.
A flowchart showing an operation example of updating the importance according to Embodiment 1.
A diagram showing an example of input text and summary text according to Embodiment 1.
A diagram showing an example of newly input text according to Embodiment 1.
A diagram showing another configuration example of the text summarization device according to Embodiment 1.
A diagram showing another configuration example of the text summarization device according to Embodiment 1.
A diagram showing another configuration example of the text summarization device according to Embodiment 1.
A diagram showing a configuration example of the text summarization device according to Embodiment 2.
A diagram showing an example of the input text 101 according to Embodiment 2.
A flowchart showing an operation example of the analysis unit 21 according to Embodiment 2.
A diagram showing an example of the analysis result text 102 according to Embodiment 2.
A flowchart showing an operation example of the importance assigning unit 22 according to Embodiment 2.
A diagram showing an example of the importance table 103 according to Embodiment 2.
A diagram showing an example of the text with importance 104 according to Embodiment 2.
A flowchart showing an operation example of the partial text data selection unit 23 according to Embodiment 2.
A diagram showing an example of the summary text 105 according to Embodiment 2.
A diagram showing an example of the text history 106 according to Embodiment 2.
A diagram showing another configuration example of the text summarization device according to Embodiment 2.
A diagram showing another configuration example of the text summarization device according to Embodiment 2.
A diagram showing another configuration example of the text summarization device according to Embodiment 2.

Embodiment 1.
Embodiment 1 of the present invention will be described below with reference to the drawings.

FIG. 1 is a diagram showing a configuration example of the text summarization device according to Embodiment 1. The text summarization device 100 includes a text data input unit 1, a data processing unit 2, an importance storage unit 3, a summarization level changing unit 4, a text history data storage unit 5, and an importance changing unit 6. The text summarization device 100 is, for example, a device mounted in a navigation device or the navigation device itself, but is not limited to these; it may be any device that receives text data and provides the user with summary information, that is, the content of the summary text. Providing summary information is not limited to presenting the content of the summary text as a document; it also includes presenting that content as speech.

Text data is input to the text data input unit 1. The text data is, for example, document data representing content such as Web information (news and the like), emergency information (earthquake early warnings and the like), weather information, or information on nearby facilities. The text data is composed of a plurality of partial text data. A partial text is, for example, a sentence, a phrase that makes up a sentence, or a word that makes up a phrase. The text data is input to the text data input unit 1 when, for example, the text summarization device 100 requests it from a server or the like.

Based on the importance assigned to the partial text data that make up the input text data, the data processing unit 2 extracts one or more partial text data from the text data input to the text data input unit 1 and generates summary text data. Here, the importance is an index of how strongly an item should be presented to the user. Partial text data of high importance is therefore data that should be included in the summary text data and presented to the user.

The importance storage unit 3 stores partial text data in association with importance, and is implemented by, for example, a memory. The importance may be set, for example, by learning from the number of occurrences of words in a large number of texts input in the past, or may be set arbitrarily by the user. Even when the partial text is a sentence or a phrase, an importance for the sentence or phrase can be obtained, for example, as the sum of the importance of its constituent words, or as that sum normalized by dividing by the number of words; the importance storage unit 3 can therefore be said to store the importance corresponding to the partial text data. The importance information stored in the importance storage unit 3 may instead be held by, for example, the data processing unit 2. When the partial text is a word, the importance may be stored taking into account not only the words contained in previously input texts but also co-occurrence with those words.
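
The two ways of deriving a sentence or phrase importance from word importance described above (the plain sum, and the sum divided by the number of words) can be sketched as follows. This is only an illustrative sketch: the function name `phrase_importance`, the sample words and values, and the default of 0.0 for unknown words are assumptions, not part of the patent.

```python
def phrase_importance(words, word_importance, normalize=True, default=0.0):
    """Importance of a sentence or phrase computed from its words.

    words           -- list of words that make up the sentence or phrase
    word_importance -- dict mapping word -> importance (hypothetical values)
    normalize       -- if True, divide the sum by the number of words
    default         -- importance assumed for words not in the table
    """
    total = sum(word_importance.get(w, default) for w in words)
    if normalize and words:
        return total / len(words)
    return total


# Hypothetical word-level importance table.
word_importance = {"earthquake": 4.0, "warning": 3.0, "the": 0.5}
words = ["the", "earthquake", "warning"]

total = phrase_importance(words, word_importance, normalize=False)
mean = phrase_importance(words, word_importance)
print(total, mean)
```

Normalizing by the word count keeps long sentences from outranking short ones merely because they contain more words.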

The importance may also be set from the number of occurrences of a word within a single input text. A value obtained by TF-IDF (Term Frequency-Inverse Document Frequency) may also be used as the importance. The importance storage unit 3 may store part-of-speech information together with the word information, and the importance of nouns and adjectives that characterize the document may be set higher.
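
As a rough illustration of using TF-IDF as the importance, the sketch below scores each word of each tokenized text. The patent does not state which TF-IDF variant is intended; this sketch assumes the common form tf = count / text length and idf = log(N / document frequency), and the function name `tf_idf` and the sample texts are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: score} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = log(N / df).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each word once per document

    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


docs = [
    ["quake", "warning", "quake"],
    ["weather", "warning"],
]
scores = tf_idf(docs)
print(scores[0])
```

With this variant, a word that appears in every text ("warning" here) scores zero, while a word concentrated in one text ("quake") scores higher.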

The summarization level changing unit 4 changes the value of the set summarization level. The summarization level is an index of how strongly the text presented to the user is summarized; the higher the value, the shorter the summary text provided to the user. If the text summarization device is a navigation device, for example, the user can set the summarization level with a dial or a button. The summarization level does not necessarily have to be changeable, however; the data processing unit 2 may store a preset summarization level as a fixed value. The following description assumes that the summarization level is a fixed value.

The text history data storage unit 5 stores the partial text data extracted by the data processing unit 2, together with their numbers of appearances, as text history data (history information). The following description treats the partial text data and their numbers of appearances as the history information, but the history information is not limited to this. For example, a weighting value for changing the importance, calculated from the number of appearances, may be stored in the text history data storage unit 5 as history information. When the partial text data is a sentence or a phrase, the number of appearances of each word that makes up the sentence or phrase may be stored in the text history data storage unit 5 as history information.

Of the importance values of the partial text data stored in the importance storage unit 3, the importance changing unit 6 changes the importance of the one or more partial text data extracted by the data processing unit 2, based on the history information of the partial text data stored in the text history data storage unit 5.

Next, the importance changing process in Embodiment 1 will be described. FIG. 2 is a flowchart showing an example of the operation for changing the importance according to Embodiment 1.

First, text data is input to the text data input unit 1 (step ST1). FIG. 3 is a diagram showing an example of input text and summary text according to Embodiment 1. As shown in FIG. 3, the input text is composed of a plurality of partial texts: partial text 1 is "ABCDEFG", partial text 2 is "HIJKLMN", and partial text 3 is "OPQRSTU".

Next, the data processing unit 2 extracts partial text data from the input text data (step ST2), obtains the importance of the extracted partial text data from the importance storage unit 3, and compares it with the summarization level (step ST3). In the example of FIG. 3, the data processing unit 2 extracts the partial text data corresponding to partial text 1 from the input text data and obtains the corresponding importance from the importance storage unit 3. Here, the importance of the partial text data corresponding to partial text 1 is 3.5, and the summarization level is 3.0. In the following, the importance assigned to partial text data may be referred to simply as the importance of the partial text.

When the importance of the extracted partial text data is higher than the summarization level (step ST4: Yes), the data processing unit 2 stores the partial text data and its number of appearances in the text history data storage unit 5 as history information (step ST5). In the example of FIG. 3, the importance of partial text 1 is higher than the summarization level, so the data processing unit 2 stores partial text 1 "ABCDEFG" and the number of appearances "1" in the text history data storage unit 5 as history information.

When partial text data remains (step ST6: Yes), the data processing unit 2 repeats the processing from step ST2 to step ST5. In the example of FIG. 3, partial text data other than partial text 1 remains, so the data processing unit 2 processes the data of partial text 2 in the same way as partial text 1, and then processes the data of partial text 3 in the same way. Here, the importance of partial text 2 is 2.0, which is lower than the summarization level of 3.0, and the importance of partial text 3 is 3.2, which is higher than the summarization level of 3.0. The data processing unit 2 therefore extracts the data of partial text 1 and partial text 3 from the input text data, and the text history data storage unit 5 ends up storing, as history information, partial text 1 "ABCDEFG" with the number of appearances "1" and partial text 3 "OPQRSTU" with the number of appearances "1".

When no partial text data remains (step ST6: No), the data processing unit 2 creates the summary text data (step ST7). In the example of FIG. 3, the partial text data of partial texts 1 and 3 have been extracted from the input text data, so summary text data corresponding to the summary text "ABCDEFGOPQRSTU" is created.
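
Steps ST2 through ST7 can be sketched as a single loop over the partial texts. This is a minimal sketch, not the patented implementation: the function name `summarize`, the use of plain dicts for the importance storage and the text history, and the default importance of 0.0 for unknown partial texts are all assumptions. The importance values and the summarization level are the ones used in the FIG. 3 example.

```python
def summarize(partial_texts, importance, summarization_level, history):
    """Select partial texts whose importance exceeds the summarization level.

    importance -- dict standing in for the importance storage unit
    history    -- dict standing in for the text history data storage unit
                  (partial text -> number of appearances); updated in place
    """
    selected = []
    for text in partial_texts:                                  # ST2 / ST6 loop
        if importance.get(text, 0.0) > summarization_level:     # ST3, ST4
            history[text] = history.get(text, 0) + 1            # ST5
            selected.append(text)
    return "".join(selected)                                    # ST7


importance = {"ABCDEFG": 3.5, "HIJKLMN": 2.0, "OPQRSTU": 3.2}
history = {}
summary = summarize(["ABCDEFG", "HIJKLMN", "OPQRSTU"], importance, 3.0, history)
print(summary)
print(history)
```

Partial text 2 (importance 2.0) falls below the level of 3.0 and is dropped, so only partial texts 1 and 3 reach the summary and the history.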

The importance changing unit 6 changes the importance stored in the importance storage unit 3 based on the history information of the partial text data stored in the text history data storage unit 5 (step ST8). In the example of FIG. 3, the text history data storage unit 5 stores the number of appearances "1" for each of partial text 1 "ABCDEFG" and partial text 3 "OPQRSTU", and the importance changing unit 6 changes the importance of the partial texts "ABCDEFG" and "OPQRSTU" stored in the importance storage unit 3 to 2.5 and 2.2, respectively.

The importance changing unit 6 raises the importance of the partial text data included in input text data if that text data is important information for the user, and lowers the importance of the partial text data if it is not. Details are given later; here, the description assumes that the more often partial text data appears in the text history data storage unit 5, the lower the importance changing unit 6 makes the importance of that partial text data in the importance storage unit 3.

The change in importance has been described here for the case where the text data shown in FIG. 3 is input, but the importance is updated every time new text data is input and summary text data is created. For example, when text data containing the partial text "ABCDEFG" is newly input and that partial text is extracted for the summary text data, the number of appearances of "ABCDEFG" stored in the text history data storage unit 5 is changed to "2". Based on the number of appearances "2", the importance changing unit 6 then lowers the importance of the partial text "ABCDEFG" stored in the importance storage unit 3 still further.
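
The text only requires that importance fall as the number of appearances grows; it does not give a formula. The sketch below assumes one possible rule, a fixed penalty of 1.0 per appearance, chosen because it reproduces the example values (3.5 to 2.5 and 3.2 to 2.2 after one appearance). The function names and the `penalty` parameter are hypothetical.

```python
def updated_importance(base_importance, appearance_count, penalty=1.0):
    """Lower the importance by a fixed penalty per recorded appearance.

    The linear rule and the penalty of 1.0 are assumptions for illustration.
    """
    return base_importance - penalty * appearance_count


def apply_history(base, history, penalty=1.0):
    """Return a new importance table adjusted by the appearance counts."""
    return {
        text: updated_importance(value, history.get(text, 0), penalty)
        for text, value in base.items()
    }


base = {"ABCDEFG": 3.5, "HIJKLMN": 2.0, "OPQRSTU": 3.2}
after_one = apply_history(base, {"ABCDEFG": 1, "OPQRSTU": 1})
after_two = apply_history(base, {"ABCDEFG": 2, "OPQRSTU": 1})
print(after_one)
print(after_two)
```

Partial texts with no recorded appearances, such as "HIJKLMN", keep their original importance, and a second appearance of "ABCDEFG" lowers its importance further.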

FIG. 2 has been described with the importance change of step ST8 performed after the summary text data is created in step ST7, but the order of these processes is arbitrary, and they may be performed in parallel. Steps ST3 to ST7 have been described as comparing the importance of the extracted partial text data with the summarization level and, when the importance is higher than the summarization level, storing history information in the text history data storage unit 5 and selecting that partial text data for the summary text data; however, the processing is not limited to this. For example, without comparing the importance with the summarization level, the n partial text data (n being an arbitrary integer) of relatively high importance among the partial text data that make up the input text data may be extracted to create the summary text data, and their history information may be stored in the text history data storage unit 5. In that case, the summarization level is not needed for extracting the partial text data.
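
The alternative of taking the n most important partial texts, with no summarization level involved, can be sketched as below. The function name `top_n_summary` is hypothetical, and keeping the selected partial texts in their original order is an assumption; the text does not say how the extracted items are ordered.

```python
import heapq


def top_n_summary(partial_texts, importance, n):
    """Pick the n partial texts with the highest importance.

    The selected items are returned in their original input order
    (an assumption made for readability of the summary).
    """
    best = heapq.nlargest(
        n,
        range(len(partial_texts)),
        key=lambda i: importance.get(partial_texts[i], 0.0),
    )
    return [partial_texts[i] for i in sorted(best)]


importance = {"ABCDEFG": 3.5, "HIJKLMN": 2.0, "OPQRSTU": 3.2}
texts = ["ABCDEFG", "HIJKLMN", "OPQRSTU"]
top_two = top_n_summary(texts, importance, 2)
top_one = top_n_summary(texts, importance, 1)
print(top_two, top_one)
```

Unlike the threshold comparison, this variant always yields a summary as long as the input contains at least one partial text.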

Next, the operation of generating summary text data when new text data is input will be described. FIG. 4 is a diagram showing an example of newly input text according to Embodiment 1. To distinguish it from the newly input text data shown in FIG. 4, the text data shown in FIG. 3 is referred to below as the previously input text data. The input text of FIG. 3 and the input text of FIG. 4 are similar in content, but the input text of FIG. 4 differs from the previously input text of FIG. 3 in that it contains a different partial text, "VWXYZ". Text data of similar content may be input, for example, when Web news items with similar content but different sources are received.

When new text data is input, steps ST1 to ST3 of FIG. 2 are performed first. In step ST3, the data processing unit 2 compares the importance of the partial text data extracted from the newly input text data with the summarization level. The importance used here is the value changed by the importance changing unit 6 based on the history information of the partial text data included in the previously input text data. In the example of FIG. 4, the importance of partial text 1 and partial text 3 has been lowered to 2.5 and 2.2, respectively, based on the history information of the previously input partial texts 1 and 3, so both are below the summarization level of 3.0. The importance of partial text 2 has not been changed by the previously input text data, so it remains 2.0. The importance of partial text 4 "VWXYZ" is 2.8.

None of the partial text data in the newly input text data has an importance higher than the summarization level, so the processing from step ST4 onward is not performed. The data processing unit 2 therefore does not generate summary text data from the newly input text data. This prevents a summary text with the same or similar content as one generated from previously input text data from being provided repeatedly, so summary information can be provided appropriately to the user.
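
The two passes described for FIG. 3 and FIG. 4 can be replayed with a small stateful sketch. Everything structural here is an assumption made for illustration: the class name `TextSummarizer`, the dict-based storage, the fixed penalty of 1.0 per appearance (chosen to reproduce 3.5 to 2.5 and 3.2 to 2.2), and the order of the partial texts in the second input.

```python
class TextSummarizer:
    """Threshold-based summarizer that lowers importance after each use."""

    def __init__(self, importance, summarization_level, penalty=1.0):
        self.importance = dict(importance)   # stands in for importance storage
        self.level = summarization_level
        self.penalty = penalty               # assumed update rule
        self.history = {}                    # partial text -> appearances

    def summarize(self, partial_texts):
        # ST2 to ST4: keep partial texts whose importance exceeds the level.
        selected = [
            t for t in partial_texts
            if self.importance.get(t, 0.0) > self.level
        ]
        for t in selected:
            self.history[t] = self.history.get(t, 0) + 1                   # ST5
            self.importance[t] = self.importance.get(t, 0.0) - self.penalty  # ST8
        return "".join(selected)                                           # ST7


device = TextSummarizer(
    {"ABCDEFG": 3.5, "HIJKLMN": 2.0, "OPQRSTU": 3.2, "VWXYZ": 2.8},
    summarization_level=3.0,
)
first = device.summarize(["ABCDEFG", "HIJKLMN", "OPQRSTU"])
second = device.summarize(["ABCDEFG", "HIJKLMN", "OPQRSTU", "VWXYZ"])
print(repr(first), repr(second))
```

The first call yields the summary of FIG. 3; on the second call every importance is at or below 3.0, so the result is empty and the similar content is not presented again.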

So far, an example has been described in which the importance changing unit 6 lowers the importance of partial texts based on the history information of previously input text data, but the importance may instead be raised. In that case, even when text data with content similar to previously input text data is newly input, the importance of the partial text data that make up the newly input text data has been raised, so the data processing unit 2 can generate a summary text with the same or similar content as the past summary text and provide it to the user.

In particular, when the input text data relates to emergency information such as an earthquake early warning, it often needs to be provided repeatedly even if it has already been provided to the user. Even in such cases, the content of the summary text relating to the emergency information can be provided repeatedly, so summary information can be provided appropriately to the user.

As described above, according to Embodiment 1, the importance changing unit 6 changes the importance of the partial text data stored in the importance storage unit 3 based on the history information of the partial text data included in previously input text data. When new text data is input, the summary information of that text is therefore actively provided if the user needs it and is not provided if the user does not, so summary information can be provided appropriately to the user.

FIG. 5 is a diagram showing another configuration example of the text summarization device according to Embodiment 1. As shown in FIG. 5, the text summarization device 110 may include a speech synthesis unit (speech generation unit) 7.

Based on the summary text data generated by the data processing unit 2, the speech synthesis unit 7 synthesizes the content of the summary text into speech and outputs it externally. In the example of FIG. 3, "ABCDEFGOPQRSTU" is provided to the user as speech.

This lets the user listen to the summary. If the text summarization device is a navigation device, for example, the user can receive summary information while driving without looking at the navigation screen, which allows safe driving.

FIG. 6 is a diagram showing another configuration example of the text summarization device according to Embodiment 1. As shown in FIG. 6, the text summarization device 120 may include an operation history storage unit 8 and a preference keyword extraction unit 9.

The operation history storage unit 8 stores the history of past user operations. In a navigation device, for example, the user operation history includes operations such as setting a destination and selecting a music CD to play in the vehicle.

The preference keyword extraction unit 9 extracts keywords representing the user's preferences from the operation history information stored in the operation history storage unit 8. For example, when the user selects a song by the artist "XXX" as the music CD, the preference keyword extraction unit 9 extracts "XXX" as a preference keyword and stores it in the text history data storage unit 5 as history information. Based on the history information stored in the text history data storage unit 5, the importance changing unit 6 raises, among the importance values of the partial text data stored in the importance storage unit 3, the importance of the partial text data that make up text data corresponding to the preference keyword.
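
Raising the importance of partial texts that relate to a preference keyword might look like the sketch below. The text does not specify how a partial text is matched to a keyword or by how much its importance is raised; the substring match, the boost of 1.0, the function name `boost_for_keywords`, and the sample sentences are all assumptions.

```python
def boost_for_keywords(importance, keywords, boost=1.0):
    """Return a new importance table with keyword-related entries raised.

    A partial text is treated as related to a keyword when it contains the
    keyword as a substring (a simplifying assumption).
    """
    return {
        text: value + boost if any(k in text for k in keywords) else value
        for text, value in importance.items()
    }


importance = {
    "XXX releases a new album": 2.5,
    "Rain expected tomorrow": 2.5,
}
boosted = boost_for_keywords(importance, ["XXX"])
print(boosted)
```

With a summarization level of 3.0, only the sentence mentioning "XXX" would now exceed the level and be included in a summary.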

As a result, when text data relating to the artist "XXX" is input, summary text data for it is more likely to be generated, and summary information on matters that interest the user can be provided appropriately.

FIG. 7 is a diagram showing another configuration example of the text summarization device according to Embodiment 1. As shown in FIG. 7, the text summarization device 130 may include a speech recognition keyword extraction unit 10.

The speech recognition keyword extraction unit 10 recognizes external speech, extracts the speech recognition result as text data, and stores history information of the partial text data that make up that text data in the text history data storage unit 5. External speech here means, for example, conversation inside or outside the vehicle, radio content, or audio from a CD. Such external speech can be regarded as content of interest to the user. Based on the history information stored in the text history data storage unit 5, the importance changing unit 6 therefore raises, among the importance values of the partial text data stored in the importance storage unit 3, the importance of the partial text data that make up text data corresponding to the keywords extracted by the speech recognition keyword extraction unit 10.

As a result, when text data about a speech-recognized keyword is input, summary text data for it is more readily generated, and summary information on matters of high interest to the user can be provided appropriately. If a speech-recognized keyword is instead regarded as a topic that has already been discussed, the importance level changing unit 6 may lower the importance level of the partial text data related to that keyword.

Embodiment 2.
Embodiment 2 of the present invention is described below with reference to the drawings.

FIG. 8 is a diagram showing a configuration example of the text summarization device according to Embodiment 2. The text summarization device 200 of Embodiment 2 differs from the text summarization device 100 of Embodiment 1 in that the data processing unit 2 includes an analysis unit 21, an importance level assigning unit 22, a partial text data selection unit 23, and a summary text data storage unit 24. The other components are the same as in Embodiment 1, so they are given the same reference numerals as in FIG. 1 and their description is omitted.

The analysis unit 21 performs sentence analysis (language analysis) on the text data input to the text data input unit 1. That is, the analysis unit 21 divides the text data into partial text data.

The importance level assigning unit 22 assigns importance levels to the partial text data produced by the sentence analysis in the analysis unit 21, using the importance levels stored in the importance level storage unit 3.

From the partial text data to which the importance level assigning unit 22 has assigned importance levels, the partial text data selection unit 23 selects the partial text data whose importance level is higher than the summarization level input from the summarization level changing unit 4, and includes it in the summary text data.

The summary text data storage unit 24 stores the summary text data made up of the partial text data selected by the partial text data selection unit 23.

Next, the operation in Embodiment 2 from the input of text data to the storing of history information on the partial text data is described. FIG. 9 is a diagram showing an example of the input text 101. In the following description, partial text is assumed to be expressed in phrase (bunsetsu) units; however, as described in Embodiment 1, partial text may instead be in sentence units or in word units.

FIG. 10 is a flowchart showing an operation example of the analysis unit 21 according to Embodiment 2. The analysis unit 21 first divides the text shown in FIG. 9, which was input to the text data input unit 1, into sentences (step ST21). Division into sentences can be achieved, for example, by splitting at each full stop (the Japanese period "。").
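Step ST21 can be sketched as below. This is an illustrative assumption about one way to split at the full stop; the function name `split_sentences` is hypothetical, and real input may need handling for quotations or other terminators that the description does not discuss.

```python
import re


def split_sentences(text):
    """Split Japanese text into sentences at the full stop "。".

    Each returned sentence keeps its trailing full stop; a final fragment
    without a full stop is returned as-is.
    """
    return re.findall(r"[^。]+。?", text)


sentences = split_sentences("今日は晴れ。明日は雨。")
```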

Next, the analysis unit 21 further divides the text, now split into sentences, into phrase units (step ST22). A syntactic parser such as KNP or CaboCha may be used, for example, to divide sentences into phrases. A syntactic parser analyzes how a sentence is structured and outputs that structure.

Subsequently, the analysis unit 21 further divides the text, now split into phrases, into word units (step ST23). A morphological analyzer such as MeCab may be used, for example, to divide phrases into words. A morphological analyzer analyzes which words and parts of speech a sentence consists of.

The analysis unit 21 creates an analysis result text as the result of the language analysis. FIG. 11 is a diagram showing an example of the analysis result text 102 according to Embodiment 2. As shown in FIG. 11, the analysis by the analysis unit 21 divides the input text 101 into word units, as in "新型/ロケット/「/イプシロン/」..." ("new / rocket / 「 / Epsilon / 」 ..."). In FIG. 11, "/" marks a word boundary, "//" marks a phrase boundary, and "///" marks a sentence boundary.
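The delimiter convention of the analysis result text can be read back into a nested structure as sketched here. The function name `parse_analysis_result` and the list-of-lists representation are assumptions for illustration, and the sketch assumes that no word itself contains a "/" character.

```python
def parse_analysis_result(text):
    """Parse text that uses "/" (word), "//" (phrase) and "///" (sentence)
    boundaries into a list of sentences, each a list of phrases, each a
    list of words.
    """
    sentences = []
    for sentence in text.split("///"):
        if not sentence:
            continue  # skip the empty piece after a trailing "///"
        phrases = [phrase.split("/") for phrase in sentence.split("//") if phrase]
        sentences.append(phrases)
    return sentences


parsed = parse_analysis_result("新型/ロケット//打ち上げ/られ/た///次/の/文///")
```

Splitting on the longest delimiter first keeps a "///" from being misread as a "//" followed by a "/".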

FIG. 12 is a flowchart showing an operation example of the importance level assigning unit 22 according to Embodiment 2. First, the analyzed text data that the analysis unit 21 has divided into word units is input to the importance level assigning unit 22 (step ST31).

Next, the importance level assigning unit 22 assigns an importance level to each word produced by the analysis unit 21, using the importance levels in the importance table stored in the importance level storage unit 3 (step ST32). FIG. 13 is a diagram showing an example of the importance table 103. In the example of FIG. 13, the importance level storage unit 3 stores an importance level for each word. For example, the word "新型" (new) is given an importance level of 15, and the word "ラーメン" (ramen) an importance level of 2.

The importance level assigning unit 22 creates a text with importance levels by assigning importance levels to the analysis result text 102. FIG. 14 is a diagram showing an example of the text with importance levels 104 according to Embodiment 2. As shown in FIG. 14, the importance level of the first phrase, "新型ロケット「イプシロン」初号機が" ("the first unit of the new rocket 'Epsilon'"), is 0.7, a value obtained by dividing the sum of the importance levels assigned to the words "新型" (new), "ロケット" (rocket), "「", "イプシロン" (Epsilon), "」", "初号" (first), "機" (unit), and "が" (subject particle) by the number of words to normalize it. Similarly, the importance level of the second phrase, "at 2 p.m. on the 14th,", is 0.2; that of the third phrase, "at the Japan Aerospace Exploration Agency's Uchinoura Space Observatory in Kimotsuki Town, Kagoshima Prefecture", is 0.4; and that of the fourth phrase, "was launched", is 1.0. In this way, the importance level assigning unit 22 assigns importance levels to phrases as partial text.

The importance level assigning unit 22 can also obtain the importance level of the sentence "The first unit of the new rocket 'Epsilon' was launched at 2 p.m. on the 14th at the Japan Aerospace Exploration Agency's Uchinoura Space Observatory in Kimotsuki Town, Kagoshima Prefecture" as 0.6, by dividing the sum of the importance levels of its phrases by the number of phrases to normalize it. Although the importance levels are obtained here by normalization, the invention is not limited to this.
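The averaging described above can be sketched as follows. The function names and the word scores in `table` are hypothetical; the phrase values 0.7, 0.2, 0.4, and 1.0 are the ones given for FIG. 14, and the description does not state how the table values (such as 15 for "新型") are scaled to the 0 to 1 range, so the sketch assumes word scores already on that scale.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


def phrase_importance(words, table, default=0.0):
    """Importance of a phrase: sum of its word scores divided by the word count.

    `default` (an assumption) is used for words missing from the table.
    """
    return mean(table.get(word, default) for word in words)


def sentence_importance(phrase_scores):
    """Importance of a sentence: sum of its phrase scores divided by the phrase count."""
    return mean(phrase_scores)


# Hypothetical word scores, already on a 0-to-1 scale.
table = {"new": 0.9, "rocket": 0.5}
phrase_score = phrase_importance(["new", "rocket"], table)

# Phrase importance levels taken from the FIG. 14 example.
sentence_score = sentence_importance([0.7, 0.2, 0.4, 1.0])
```

With the four phrase values from the example, the mean is 0.575, which matches the stated sentence importance of 0.6 when rounded to one decimal place.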

FIG. 15 is a flowchart showing an operation example of the partial text data selection unit 23 according to Embodiment 2. The partial text data selection unit 23 first compares the sentence summarization level input from the summarization level changing unit 4 with the importance level of the sentence (step ST41). If the comparison shows that the importance level of the sentence is equal to or greater than the sentence summarization level (step ST41-Yes), the partial text data selection unit 23 performs the processing of step ST42; if the importance level of the sentence is smaller than the sentence summarization level (step ST41-No), processing moves to step ST46. Here, the sentence summarization level is 0.5. In the example of FIG. 14, the importance level of the first sentence is 0.6, which is equal to or greater than the sentence summarization level, so processing moves to step ST42. The summarization level need not be the value input from the summarization level changing unit 4; it may be a value that the partial text data selection unit 23 holds in advance.

Next, the partial text data selection unit 23 compares the importance level of each phrase in the sentence selected in step ST41 with the phrase summarization level (step ST42). The phrase summarization level may be the same value as the sentence summarization level, or it may be set separately as a different value. Here, the phrase summarization level is 0.5, the same as the sentence summarization level. In the example of FIG. 14, the importance level of the first phrase, "the first unit of the new rocket 'Epsilon'", is 0.6, which is equal to or greater than the summarization level (step ST43-Yes), so the partial text data corresponding to the first phrase is stored in the summary text data storage unit 24 (step ST43), and that partial text data and its number of appearances are stored as history information in the text history data storage unit 5 (step ST44).

If the sentence whose importance level the partial text data selection unit 23 is comparing with the summarization level is the last sentence, processing ends (step ST46-Yes). Here it is not the last sentence (step ST46-No), so processing moves to the next phrase (step ST45).

Next, the partial text data selection unit 23 compares the importance level of the second phrase, "at 2 p.m. on the 14th,", with the phrase summarization level (step ST42). The importance level of the second phrase is 0.2, which is smaller than the phrase summarization level (step ST42-No), so processing moves to the next phrase (step ST45). The third phrase, "at the Japan Aerospace Exploration Agency's Uchinoura Space Observatory in Kimotsuki Town, Kagoshima Prefecture", also has an importance level of 0.4, which is smaller than the phrase summarization level, so it is handled in the same way as the second phrase. The fourth phrase, "was launched.", has an importance level of 1.0, which is equal to or greater than the phrase summarization level, so it is handled in the same way as the first phrase.

The second and subsequent sentences are processed in the same way as the first sentence, and once all sentences in the input text have been processed (step ST46-Yes), the processing of the partial text data selection unit 23 ends.

As a result of the partial text selection processing by the partial text data selection unit 23, summary text data is stored in the summary text data storage unit 24. FIG. 16 is a diagram showing an example of the summary text 105 according to Embodiment 2. As shown in FIG. 16, the summary text consists of the partial text selected by the partial text data selection unit 23, that is, the first phrase, "the first unit of the new rocket 'Epsilon'", and the fourth phrase, "was launched."
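The two-stage comparison of steps ST41 and ST42 can be sketched as follows. The function name `select_phrases`, the tuple representation of phrases, and the second example sentence are assumptions for illustration; the thresholds of 0.5 and the phrase scores of the first sentence follow the example in the description, and the history recording of step ST44 is left out.

```python
def select_phrases(sentences, sentence_level=0.5, phrase_level=0.5):
    """Select phrases for the summary.

    `sentences` is a list of sentences, each a list of (text, importance)
    pairs. A sentence is considered only if its mean phrase importance is
    at least `sentence_level`; within it, phrases with importance of at
    least `phrase_level` are kept.
    """
    summary = []
    for phrases in sentences:
        if not phrases:
            continue
        sentence_score = sum(score for _, score in phrases) / len(phrases)
        if sentence_score < sentence_level:
            continue  # sentence below the sentence summarization level
        for text, score in phrases:
            if score >= phrase_level:
                summary.append(text)
    return summary


example = [
    [
        ("the first unit of the new rocket 'Epsilon'", 0.7),
        ("at 2 p.m. on the 14th,", 0.2),
        ("at the Uchinoura Space Observatory", 0.4),
        ("was launched.", 1.0),
    ],
    # Hypothetical second sentence whose mean (0.35) is below the threshold.
    [("a low-importance phrase", 0.1), ("another phrase", 0.6)],
]
summary = select_phrases(example)
```

In this sketch a phrase in a rejected sentence is dropped even when its own importance exceeds the phrase threshold, which follows the order of the comparisons in FIG. 15.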

In addition, through the recording of history information by the partial text data selection unit 23 (step ST44), the number of appearances of the partial text data is stored in the text history data storage unit 5 as a text history. FIG. 17 is a diagram showing an example of the text history 106 according to Embodiment 2. As shown in FIG. 17, an appearance count of "1" is stored for each of the words "新型" (new), "ロケット" (rocket), "イプシロン" (Epsilon), "初号" (first), and "機" (unit) in the first phrase. An appearance count of "1" is also stored for the word "打ち上げ" (launch) in the fourth phrase. The same applies to the second and subsequent sentences.
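The appearance counts in the text history can be sketched with a counter, as below. The function name `record_history` and the `skip` set are assumptions: FIG. 17 lists only content words, but the description does not say how brackets, particles, and auxiliary words are excluded, so the sketch simply takes a caller-supplied set of words to ignore.

```python
from collections import Counter


def record_history(history, selected_phrases, skip=frozenset()):
    """Add the words of each selected phrase to the appearance counts.

    `skip` is a hypothetical set of words (brackets, particles and the
    like) that are not recorded.
    """
    for words in selected_phrases:
        history.update(word for word in words if word not in skip)
    return history


history = Counter()
record_history(
    history,
    [
        ["新型", "ロケット", "「", "イプシロン", "」", "初号", "機", "が"],
        ["打ち上げ", "られ", "た"],
    ],
    skip={"「", "」", "が", "られ", "た"},
)
```

Calling `record_history` again for later input text increments the counts of words that reappear, which is the information the importance level changing unit 6 relies on.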

As described above, according to Embodiment 2, the importance level assigning unit 22 assigns importance levels to the partial text data produced by the language analysis in the analysis unit 21, and the partial text data selection unit 23 selects partial text data based on those importance levels and stores the history information in the text history data storage unit 5, so the same effects as in Embodiment 1 are obtained. In addition, because the analysis unit 21 performs language analysis to divide the input text data into partial text data, history information can be stored for partial text data that reflects the sentence structure.

FIG. 18 is a diagram showing another configuration example of the text summarization device according to Embodiment 2. As shown in FIG. 18, the text summarization device 210 of Embodiment 2 may also include the speech synthesis unit 7. The speech synthesis unit 7 of Embodiment 2 is the same as in Embodiment 1, so it is given the same reference numeral as in FIG. 5 and its description is omitted.

FIG. 19 is a diagram showing another configuration example of the text summarization device according to Embodiment 2. As shown in FIG. 19, the text summarization device 220 of Embodiment 2 may also include the preference keyword extraction unit 9. The preference keyword extraction unit 9 of Embodiment 2 is the same as in Embodiment 1, so it is given the same reference numeral as in FIG. 6 and its description is omitted.

FIG. 20 is a diagram showing another configuration example of the text summarization device according to Embodiment 2. As shown in FIG. 20, the text summarization device 230 of Embodiment 2 may also include the speech recognition keyword extraction unit 10. The speech recognition keyword extraction unit 10 of Embodiment 2 is the same as in Embodiment 1, so it is given the same reference numeral as in FIG. 7 and its description is omitted.

DESCRIPTION OF SYMBOLS: 1 text data input unit, 2 data processing unit, 3 importance level storage unit, 4 summarization level changing unit, 5 text history data storage unit, 6 importance level changing unit, 7 speech synthesis unit, 8 operation history storage unit, 9 preference keyword extraction unit, 10 speech recognition keyword extraction unit, 21 analysis unit, 22 importance level assigning unit, 23 partial text data selection unit, 24 summary text data storage unit, 100, 110, 120, 130, 200, 210, 220, 230 text summarization device

Claims

A text summarization device comprising:
a data input unit to which text data composed of a plurality of partial text data is input;
an importance level storage unit that stores importance levels assigned to the partial text data input to the data input unit;
an importance level changing unit that changes the importance levels based on history information of the partial text data included in text data input in the past; and
a data processing unit that extracts one or more partial text data from the text data input to the data input unit based on the importance levels and generates summary text data,
wherein the importance level changing unit raises the importance level corresponding to the partial text data included in the text data input in the past when that text data is text data related to emergency information.

The text summarization device according to claim 1, wherein the importance level changing unit raises the importance level of the partial text data included in the text data input in the past when the information in that text data is important to the user, and lowers the importance level of the partial text data included in the text data input in the past when the information in that text data is not important to the user.

The text summarization device according to claim 2, wherein, when the text data input in the past is text data related to Web information, the importance level changing unit lowers, among the importance levels stored in the importance level storage unit, the importance level corresponding to the partial text data included in the text data input in the past.

The text summarization device according to claim 2, further comprising a preference information extraction unit that extracts the user's preference information as text data from the user's past operation history,
wherein the importance level changing unit raises, among the importance levels stored in the importance level storage unit, the importance level corresponding to the partial text data included in text data related to the preference information extracted by the preference information extraction unit.

The text summarization device according to claim 2, further comprising a speech recognition information extraction unit that recognizes external speech and extracts speech recognition information as text data,
wherein the importance level changing unit raises, among the importance levels stored in the importance level storage unit, the importance level corresponding to the partial text data included in text data related to the speech recognition information extracted by the speech recognition information extraction unit.

The text summarization device according to any one of claims 2 to 5, wherein the data processing unit comprises:
an analysis unit that analyzes the text data input to the data input unit and divides the text data into a plurality of partial text data that are sentences, phrases, or words;
an importance level assigning unit that assigns the importance levels stored in the importance level storage unit to the plurality of partial text data divided by the analysis unit; and
a partial text data selection unit that selects, from among the plurality of partial text data, the partial text data whose importance level assigned by the importance level assigning unit is greater than a set value.

The text summarization device according to any one of claims 1 to 6, further comprising a speech synthesis unit that synthesizes and outputs, as speech, the content of the summary text based on the summary text data generated by the data processing unit.