JP2002183175A5 - Google Patents
- Publication number
- JP2002183175A5 (application JP2000379770A)
- Authority
- JP
- Japan
- Prior art keywords
- words
- word
- storage medium
- program
- program storage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Claims (6)
A program storage medium storing a program for text mining that extracts characteristic information from at least two document sets, wherein the program causes a CPU to execute:
a step of extracting sets of co-occurring words from the two or more document sets stored in a file storage device; and
a step of extracting, for each of the partial document sets, characteristic word sets from among the extracted word sets.
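The two steps of this claim can be illustrated with a minimal sketch. The claim does not say how co-occurrence is counted or what makes a word set "characteristic", so everything specific here is an assumption: words are treated as co-occurring when they appear in the same document, word sets are limited to pairs, and a pair counts as characteristic of a document set when its per-document rate there is at least `min_ratio` times its smoothed rate in the other sets. The function names `cooccurring_pairs` and `characteristic_pairs` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(documents):
    """Count, for each unordered word pair, the documents containing both words."""
    counts = Counter()
    for text in documents:
        # Whitespace tokenization is an assumption; Japanese text would need
        # a morphological analyzer instead.
        words = sorted(set(text.lower().split()))
        counts.update(combinations(words, 2))
    return counts


def characteristic_pairs(document_sets, min_ratio=2.0):
    """Select, per document set, the pairs that are relatively frequent there.

    The ratio test below is an assumed criterion, not the one in the patent.
    """
    pair_counts = {name: cooccurring_pairs(docs) for name, docs in document_sets.items()}
    sizes = {name: len(docs) for name, docs in document_sets.items()}
    result = {}
    for name, counts in pair_counts.items():
        other_docs = sum(size for other, size in sizes.items() if other != name)
        selected = []
        for pair, count in counts.items():
            other_count = sum(
                pc[pair] for other, pc in pair_counts.items() if other != name
            )
            rate_here = count / sizes[name]
            # Add-one smoothing keeps the rate non-zero for pairs unseen elsewhere.
            rate_elsewhere = (other_count + 1) / (other_docs + 1)
            if rate_here >= min_ratio * rate_elsewhere:
                selected.append(pair)
        result[name] = sorted(selected)
    return result
```

Called with a dictionary that maps a set name to a list of document strings, `characteristic_pairs` returns a dictionary with the same keys, each holding a sorted list of word pairs.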
A program storage medium storing a program for text mining that extracts characteristic information from a set of documents to which at least one attribute is assigned, by focusing on the attribute, wherein the program causes a CPU to execute:
a step of dividing the document set stored in a file storage device into at least two partial document sets based on the attribute;
a step of extracting sets of co-occurring words from the two or more partial document sets; and
a step of extracting, for each of the partial document sets, characteristic word sets from among the extracted word sets.
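The dividing step that this claim adds can be sketched as a simple grouping by attribute value. The document representation (a dictionary with `text` and `attributes` keys) and the function name `split_by_attribute` are assumptions made for illustration; the claim only requires that the division is based on the attribute and yields at least two partial document sets.

```python
from collections import defaultdict


def split_by_attribute(documents, attribute):
    """Group document texts into partial document sets keyed by one attribute's value."""
    subsets = defaultdict(list)
    for doc in documents:
        # Each document is assumed to carry a value for the chosen attribute.
        subsets[doc["attributes"][attribute]].append(doc["text"])
    if len(subsets) < 2:
        # The claim calls for at least two partial document sets.
        raise ValueError("attribute does not divide the documents into two or more sets")
    return dict(subsets)
```

The returned dictionary maps each attribute value to the texts of its partial document set, which can then be passed to whatever co-occurrence extraction is in use.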
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2000379770A JP2002183175A (en) | 2000-12-08 | 2000-12-08 | Text mining method |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2002183175A JP2002183175A (en) | 2002-06-28 |
JP2002183175A5 JP2002183175A5 (en) | 2005-07-21 |
Family
ID=18848074
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2000379770A Pending JP2002183175A (en) | 2000-12-08 | 2000-12-08 | Text mining method |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP2002183175A (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3831319B2 (en) * | 2002-08-23 | 2006-10-11 | 株式会社東芝 | Text information analysis system and analysis result presentation method |
JP2004178123A (en) * | 2002-11-26 | 2004-06-24 | Hitachi Ltd | Information processor and program for executing information processor |
JP3600611B2 (en) | 2002-12-12 | 2004-12-15 | 本田技研工業株式会社 | Information processing apparatus, information processing method, and information processing program |
US8611676B2 (en) | 2005-07-26 | 2013-12-17 | Sony Corporation | Information processing apparatus, feature extraction method, recording media, and program |
CN101889281B (en) | 2008-03-10 | 2012-10-17 | 松下电器产业株式会社 | Content search device and content search method |
JP5964149B2 (en) * | 2012-06-20 | 2016-08-03 | 株式会社Nttドコモ | Apparatus and program for identifying co-occurrence words |
JP6764973B1 (en) * | 2019-04-25 | 2020-10-07 | みずほ情報総研株式会社 | Related word dictionary creation system, related word dictionary creation method and related word dictionary creation program |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4189416B2 (en) | Structured document management system and program | |
JP2006172450A5 (en) | ||
CN110083805A (en) | A kind of method and system that Word file is converted to EPUB file | |
CA2430802A1 (en) | Method and system for displaying and linking ink objects with recognized text and objects | |
US20130174024A1 (en) | Method and device for converting document format | |
US20120137207A1 (en) | Systems and methods for converting a pdf file | |
CN104063365B (en) | The method that object is inserted into PDF document | |
CN109344298A (en) | A kind of method and device converting unstructured data to structural data | |
CN105185377A (en) | Voice-based file generation method and device | |
TW201617940A (en) | Compression of cascading style sheet files | |
JP2002183175A5 (en) | ||
JP5950700B2 (en) | Image processing apparatus, image processing method, and program | |
CN107203509A (en) | Title generation method and device | |
US20160180849A1 (en) | Method for producing and recognizing barcode information based on voice, and recording medium | |
JP5618968B2 (en) | Similar page detection device, similar page detection method, and similar page detection program | |
JP2009140411A (en) | Text summarization device and text summarization method | |
CN104866607A (en) | Dongba character interpretation database building method | |
CN109857989A (en) | The font data compression method, apparatus and electronic equipment of pdf document | |
CN107610006A (en) | A kind of intellectual property service management system | |
JPS6154569A (en) | Document poicture processing system | |
TWI645304B (en) | Data extracting method for portable document format file corresponding to credit record of user and personal credit analysis system | |
CN108304401A (en) | E-book searching method and system | |
JP5366729B2 (en) | Semantic relationship information generating apparatus and program | |
JP2000285116A5 (en) | ||
JP2000311170A5 (en) |