JP2018147411A5 - - Google Patents
- Publication number
- JP2018147411A5 (application JP2017044433A)
- Authority
- JP
- Japan
- Prior art keywords
- sentence
- posted
- noun
- short
- data processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Description
The data processing method according to the present invention is a method for automatically creating a short sentence using one or more data processing devices. The data processing device performs a sentence analysis step of analyzing each posted sentence in a group of posted sentences about an arbitrary event collected via the Internet and ranking the words contained in the posted sentences by appearance frequency, and a short sentence creation step of creating a short sentence about the posted sentence group based on the ranking data of the words.
In the data processing method of the present invention, the sentence analysis step may include a step of decomposing each posted sentence into parts of speech, a step of extracting nouns from the decomposed words, and a step of ranking the extracted nouns by appearance frequency, and the short sentence creation step may create the short sentence based on the ranking data of the nouns.
In the short sentence creation step, for example, a step of selecting nouns that represent specific content from among the nouns contained in the posted sentences of the group, based on the ranking data of the words, and a step of creating a short sentence using the selected nouns can be performed. In that case, higher-ranked nouns may be selected preferentially in the noun selection step.
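The noun selection and short sentence creation described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the category lexicon, the sentence template, and the function names `select_nouns` and `create_short_sentence` are all assumptions introduced here, and the rank data is a hand-written example.

```python
# Hypothetical lexicon (an assumption): which nouns can express which kind of content.
CATEGORY_LEXICON = {
    "place": {"Shibuya", "Osaka"},
    "event": {"fire", "accident", "festival"},
}

def select_nouns(ranked_nouns, lexicon):
    """For each content category, pick the highest-ranked noun belonging to it."""
    selected = {}
    # ranked_nouns is assumed to be sorted with the most frequent noun first,
    # so the first match per category is the highest-ranked one.
    for noun, _count in ranked_nouns:
        for category, members in lexicon.items():
            if noun in members and category not in selected:
                selected[category] = noun
    return selected

def create_short_sentence(selected):
    """Fill a fixed template with the selected nouns (the template is an assumption)."""
    if "event" in selected and "place" in selected:
        return f"Posts report a {selected['event']} in {selected['place']}."
    if "event" in selected:
        return f"Posts report a {selected['event']}."
    return ""

# Example rank data: (noun, appearance count), most frequent first.
ranked = [("fire", 5), ("Shibuya", 4), ("smoke", 3), ("Osaka", 1), ("accident", 1)]
chosen = select_nouns(ranked, CATEGORY_LEXICON)
print(create_short_sentence(chosen))
```

Because the rank data is traversed from the top, a lower-ranked noun such as "Osaka" is never chosen when a higher-ranked noun of the same category is available, which is one simple way to give priority to higher-ranked nouns.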
[Operation]
Next, the operation of the data processing device 1 of this embodiment, that is, a method of automatically creating a short sentence about a posted sentence group using the data processing device 1, will be described. In the data processing method of this embodiment, one or more data processing devices 1 perform a sentence analysis step and a short sentence creation step to automatically create a short sentence about the posted sentence group. FIG. 2 is a flowchart showing the data processing method of this embodiment, and FIG. 3 is an example of the rank data.
[Sentence analysis step]
The sentence analysis step is performed mainly by the article analysis unit 2 of the data processing device 1. It analyzes each posted sentence in a group of posted sentences about an arbitrary event collected via the Internet, and ranks the words contained in the posted sentences by appearance frequency. The method of extracting and ranking the words is not particularly limited. For example, as shown in FIG. 2, it can be carried out by decomposing each posted sentence into parts of speech (step S1), extracting nouns from the decomposed words (step S2), and ranking the extracted nouns by appearance frequency (step S3).
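Steps S1 to S3 can be sketched as below. This is a minimal illustration under stated assumptions: the source does not name a part-of-speech analyzer, so a toy dictionary (`TOY_POS`) stands in for a real morphological analyzer, English example posts are used for readability, and the function names are hypothetical.

```python
from collections import Counter
import re

# Toy part-of-speech dictionary (an assumption) standing in for a real analyzer.
TOY_POS = {
    "fire": "noun", "station": "noun", "smoke": "noun",
    "big": "adj", "near": "prep", "the": "det", "is": "verb",
    "rising": "verb", "from": "prep", "at": "prep",
}

def decompose(post):
    """Step S1: split a post into (word, part_of_speech) pairs."""
    words = re.findall(r"[a-z]+", post.lower())
    return [(w, TOY_POS.get(w, "unknown")) for w in words]

def extract_nouns(tagged):
    """Step S2: keep only the words tagged as nouns."""
    return [w for w, pos in tagged if pos == "noun"]

def rank_nouns(posts):
    """Step S3: count nouns over the whole group and rank them by frequency."""
    counts = Counter()
    for post in posts:
        counts.update(extract_nouns(decompose(post)))
    return counts.most_common()  # list of (noun, count), most frequent first

posts = [
    "Big fire near the station",
    "Smoke is rising from the station",
    "Fire at the station",
]
print(rank_nouns(posts))
```

The resulting list of (noun, count) pairs plays the role of the rank data that the short sentence creation step consumes. For Japanese posts, the `decompose` function would need to be replaced by a proper morphological analyzer.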
Claims (13)
A data processing device comprising:
a sentence analysis unit that analyzes each posted sentence of a posted sentence group regarding an arbitrary event collected via the Internet, and ranks the words included in the posted sentences by appearance frequency; and
a short sentence creation unit that creates a short sentence related to the posted sentence group based on the ranking data of the words.
The data processing device according to claim 1, wherein the sentence analysis unit comprises:
a part-of-speech decomposition unit that decomposes each posted sentence into parts of speech;
a noun extraction unit that extracts nouns from the decomposed words; and
a noun counting unit that ranks the extracted nouns by appearance frequency,
and wherein the short sentence creation unit creates the short sentence based on the ranking data of the nouns.
The data processing device according to claim 1 or 2, wherein the short sentence creation unit comprises:
a word selection unit that selects nouns representing specific content from among the nouns included in the posted sentences of the posted sentence group, based on the ranking data of the words; and
a short sentence generation unit that creates a short sentence using the nouns selected by the word selection unit,
and wherein the word selection unit preferentially selects higher-ranked nouns.
A data processing method for automatically creating a short sentence using one or more data processing devices, the method comprising performing, by the data processing device:
a sentence analysis step of analyzing each posted sentence of a posted sentence group regarding an arbitrary event collected via the Internet, and ranking the words included in the posted sentences by appearance frequency; and
a short sentence creation step of creating a short sentence related to the posted sentence group based on the ranking data of the words.
8. The data processing method according to claim 7, wherein the sentence analysis step comprises:
a step of decomposing each posted sentence into parts of speech;
a step of extracting nouns from the decomposed words; and
a step of ranking the extracted nouns by appearance frequency,
and wherein the short sentence creation step creates the short sentence based on the ranking data of the nouns.
9. The data processing method according to claim 7 or 8, wherein the short sentence creation step comprises:
a step of selecting nouns representing specific content from among the nouns included in the posted sentences of the posted sentence group, based on the ranking data of the words; and
a step of creating a short sentence using the selected nouns,
and wherein, in the step of selecting the nouns, higher-ranked nouns are preferentially selected.
A data processing system comprising:
a sentence analysis device that analyzes each posted sentence of a posted sentence group regarding an arbitrary event collected via the Internet, and ranks the words included in the posted sentences by appearance frequency; and
a short sentence creation device that creates a short sentence related to the posted sentence group based on the ranking data of the words.
The data processing system according to claim 10, further comprising a posted sentence classification device that classifies a plurality of posted sentences collected via the Internet and creates a posted sentence group for each event,
wherein the sentence analysis device analyzes the posted sentence group created by the posted sentence classification device.
A program that causes a computer to execute:
a sentence analysis function of analyzing each posted sentence of a posted sentence group regarding an arbitrary event collected via the Internet, and ranking the words included in the posted sentences by appearance frequency; and
a short sentence creation function of automatically creating a short sentence related to the posted sentence group based on the ranking data of the words.
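The claims mention a posted sentence classification device that creates a posted sentence group for each event, but this text does not say how the classification is done. The sketch below is therefore purely illustrative: keyword matching, the `EVENT_KEYWORDS` table, and the function name `classify_posts` are all assumptions, not the method of the patent.

```python
from collections import defaultdict

# Hypothetical keyword table (an assumption): event label -> trigger words.
EVENT_KEYWORDS = {
    "fire": {"fire", "smoke"},
    "train_delay": {"train", "delay"},
}

def classify_posts(posts, event_keywords):
    """Group posts by event; each post goes to the first event whose keywords it contains."""
    groups = defaultdict(list)
    for post in posts:
        words = set(post.lower().split())
        for event, keywords in event_keywords.items():
            if words & keywords:
                groups[event].append(post)
                break  # assign the post to one event only
    return dict(groups)  # posts that match no event are left out

posts = [
    "Fire near the station",
    "Train delay again",
    "Smoke everywhere",
    "Nice weather",
]
groups = classify_posts(posts, EVENT_KEYWORDS)
print(groups)
```

Each resulting group could then be passed to the sentence analysis function as one posted sentence group.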
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2017044433A JP7078244B2 (en) | 2017-03-08 | 2017-03-08 | Data processing equipment, data processing methods, data processing systems and programs |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2017044433A JP7078244B2 (en) | 2017-03-08 | 2017-03-08 | Data processing equipment, data processing methods, data processing systems and programs |
Publications (3)
Publication Number | Publication Date |
---|---|
JP2018147411A (en) | 2018-09-20 |
JP2018147411A5 (en) | 2020-02-06 |
JP7078244B2 (en) | 2022-05-31 |
Family
ID=63592194
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2017044433A Active JP7078244B2 (en) | 2017-03-08 | 2017-03-08 | Data processing equipment, data processing methods, data processing systems and programs |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP7078244B2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7333931B2 (en) * | 2019-02-08 | 2023-08-28 | 憲一 坂 | Post analysis system, post analysis device and post analysis method |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10134066A (en) * | 1996-10-29 | 1998-05-22 | Matsushita Electric Ind Co Ltd | Sentence summarizing up device |
JP2002163277A (en) * | 2000-11-28 | 2002-06-07 | Auto Network Gijutsu Kenkyusho:Kk | Document information supply system, information terminal unit and document information supply method |
JP2005250648A (en) * | 2004-03-02 | 2005-09-15 | Fuji Xerox Co Ltd | Article summarizing device and news distributing device |
KR100810999B1 (en) * | 2006-06-30 | 2008-03-11 | 엔에이치엔(주) | On-line e mail service system, and service method thereof |
CN101296128A (en) * | 2007-04-24 | 2008-10-29 | 北京大学 | Method for monitoring abnormal state of internet information |
KR101196935B1 (en) * | 2010-07-05 | 2012-11-05 | 엔에이치엔(주) | Method and system for providing reprsentation words of real-time popular keyword |
JP6225012B2 (en) * | 2013-07-31 | 2017-11-01 | 日本電信電話株式会社 | Utterance sentence generation apparatus, method and program thereof |
JP2015103101A (en) * | 2013-11-26 | 2015-06-04 | 日本電信電話株式会社 | Text summarization device, method, and program |
WO2016027364A1 (en) * | 2014-08-22 | 2016-02-25 | 株式会社日立製作所 | Topic cluster selection device, and search method |
JP2016110213A (en) * | 2014-12-02 | 2016-06-20 | シャープ株式会社 | Information processing device, information processing system, terminal device, information processing method, and information processing program |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9697477B2 (en) | Non-factoid question-answering system and computer program | |
EP3016002A1 (en) | Non-factoid question-and-answer system and method | |
US10366154B2 (en) | Information processing device, information processing method, and computer program product | |
JP2006344211A5 (en) | ||
JP6529761B2 (en) | Topic providing system and conversation control terminal device | |
CN110321553A (en) | Short text subject identifying method, device and computer readable storage medium | |
JP2006293767A (en) | Sentence categorizing device, sentence categorizing method, and categorization dictionary creating device | |
Atagün et al. | Topic modeling using LDA and BERT techniques: Teknofest example | |
CN109657181A (en) | Internet information chain type storage method, device, computer equipment and storage medium | |
WO2022134779A1 (en) | Method, apparatus and device for extracting character action related data, and storage medium | |
CN105956181A (en) | Searching method and apparatus | |
JP2018147411A5 (en) | ||
JP2006134183A (en) | Information classification method, system and program, and storage medium with program stored | |
CN113919305A (en) | Document generation method and device and computer readable storage medium | |
CN108415959B (en) | Text classification method and device | |
JP2008021139A (en) | Model construction apparatus for semantic tagging, semantic tagging apparatus, and computer program | |
JP2008065468A (en) | Device for multiple-classifying text, method for multiple-classifying text, program and storage medium | |
JP7078244B2 (en) | Data processing equipment, data processing methods, data processing systems and programs | |
JP5491446B2 (en) | Topic word acquisition apparatus, method, and program | |
CN104978375B (en) | A kind of language material filter method and device | |
KR102372629B1 (en) | Triple Extraction method using Pointer Network and the extraction apparatus | |
JP7171352B2 (en) | Workshop support system and workshop support method | |
KR20130113000A (en) | Apparatus for language processing and method thereof | |
CN109284364B (en) | Interactive vocabulary updating method and device for voice microphone-connecting interaction | |
CN111291186A (en) | Context mining method and device based on clustering algorithm and electronic equipment |