TW200630825A - Method to automatically summarize chinese digital documents - Google Patents
Method to automatically summarize Chinese digital documents
- Publication number
- TW200630825A
- Authority
- TW
- Taiwan
- Prior art keywords
- document
- digital documents
- similarity
- importance
- words
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Document Processing Apparatus (AREA)
Abstract
A method for automatically summarizing Chinese digital documents is disclosed. The method evaluates the importance of each sentence in a digital document, computes the similarity between the document title and each sentence, and combines the title with a reduced sentence to form a summary candidate whose number of characters is below an assigned limit. Finally, it sorts the summary candidates according to the ratio between the number of characters and the similarity, and presents the result for the user to select from. The invention simultaneously considers the number of words after summarization, the importance of the content covered, readability, and coherence of meaning. It represents the key points of the original document within a limited number of words and greatly reduces manual effort.
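The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify how importance, similarity, sentence reduction, or the ranking ratio are computed, so everything below is an assumption made for illustration. Character bigrams stand in for word segmentation, similarity is Jaccard overlap of bigrams, importance is summed bigram frequency, reduction drops the least important comma-separated clauses, and candidates are ranked by similarity per character. The names `bigrams`, `similarity`, `reduce_sentence`, and `summarize` are hypothetical.

```python
import re
from collections import Counter

def bigrams(text):
    """Character bigrams over CJK characters; a crude stand-in for word segmentation."""
    chars = re.sub(r"[^\u4e00-\u9fff]", "", text)
    return {chars[i:i + 2] for i in range(len(chars) - 1)}

def similarity(a, b):
    """Jaccard overlap of character bigrams (an assumed similarity measure)."""
    x, y = bigrams(a), bigrams(b)
    union = x | y
    return len(x & y) / len(union) if union else 0.0

def reduce_sentence(sentence, freq, budget):
    """Keep the highest-scoring clauses that fit in `budget` characters, in original order."""
    clauses = [c for c in sentence.split("，") if c]
    # Assumed importance: summed document frequency of a clause's bigrams.
    order = sorted(range(len(clauses)),
                   key=lambda i: sum(freq[g] for g in bigrams(clauses[i])),
                   reverse=True)
    kept = []
    for i in order:
        cost = len(clauses[i]) + (1 if kept else 0)  # +1 for the joining comma
        if cost <= budget:
            kept.append(i)
            budget -= cost
    return "，".join(clauses[i] for i in sorted(kept))

def summarize(title, body, limit=30):
    """Return (score, candidate) pairs, best first; candidates have at most `limit` characters."""
    sentences = [s.strip() for s in re.split(r"[。！？]", body) if s.strip()]
    freq = Counter(g for s in sentences for g in bigrams(s))
    candidates = []
    for s in sentences:
        # Reserve room for the title and the separating colon.
        reduced = reduce_sentence(s, freq, limit - len(title) - 1)
        if not reduced:
            continue
        text = title + "：" + reduced
        # Assumed ranking: title similarity per character of the candidate.
        candidates.append((similarity(title, reduced) / len(text), text))
    return sorted(candidates, key=lambda p: p[0], reverse=True)

if __name__ == "__main__":
    body = ("本發明提出一種自動摘要方法，可以處理中文數位文件。"
            "今天天氣很好，我們去公園散步。"
            "摘要方法會計算句子的重要性，並且比較標題與句子的相似度。")
    for score, text in summarize("自動摘要方法", body, limit=30):
        print(f"{score:.4f}  {text}")
```

Each sentence yields at most one candidate, so the user is shown a short ranked list rather than a single summary, which matches the abstract's description of presenting results for selection. A real system would likely replace the bigram heuristics with proper Chinese word segmentation and a weighting scheme chosen for the corpus.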
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW94105191A TWI288335B (en) | 2005-02-22 | 2005-02-22 | Method to automatically summarize Chinese digital documents |
Publications (2)
Publication Number | Publication Date |
---|---|
TW200630825A (en) | 2006-09-01 |
TWI288335B (en) | 2007-10-11 |
Family
ID=39202987
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW94105191A TWI288335B (en) | 2005-02-22 | 2005-02-22 | Method to automatically summarize Chinese digital documents |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI288335B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI493364B (en) * | 2013-05-23 | 2015-07-21 | Loremaster Tech Inc | Management methods for subjective comments of articles, and related devices and computer program products |
CN110837556A (en) * | 2019-10-30 | 2020-02-25 | 深圳价值在线信息科技股份有限公司 | Abstract generation method and device, terminal equipment and storage medium |
Legal status: 2005-02-22, TW application TW94105191A, patent TWI288335B (en), active
Also Published As
Publication number | Publication date |
---|---|
TWI288335B (en) | 2007-10-11 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Piotrowski | Natural language processing for historical texts | |
CN106407236B (en) | A kind of emotion tendency detection method towards comment data | |
Topkara et al. | Information hiding through errors: a confusing approach | |
CN106407235B (en) | A kind of semantic dictionary construction method based on comment data | |
ATE381053T1 (en) | DATA ENTRY METHOD AND SYSTEM FOR PERSONAL COMPUTER, AND CORRESPONDING COMPUTER-READABLE MEDIUM | |
EP1887451A3 (en) | Data entry method and system for personal computer, and corresponding computer readable medium | |
Diemer | Corpus linguistics with Google | |
CN101038508A (en) | GB phoneticize input method | |
Winters | F. Scott Fitzgerald's Die Schönen und Verdammten: A corpus-based study of loan words and code switches as features of translators' style | |
TW200630825A (en) | Method to automatically summarize chinese digital documents | |
Coleman et al. | Historical dictionaries and historical dictionary research: papers from the International Conference on Historical Lexicography and Lexicology, at the University of Leicester, 2002 | |
CN103020046A (en) | Name transliteration method on the basis of classification of name origin | |
Yadav et al. | Indus script: A study of its sign design | |
Downey et al. | Lexomic analysis of medieval Latin texts | |
JP2009230561A (en) | Example-set-based translation device, method and program, and phrase translation device including the translation device | |
Cook | Lexical coinages in Mandarin Chinese and the problem of classification. | |
Bollmann | Spelling normalization of historical German with sparse training data | |
Church | Dedication to William A. Gale | |
US20040021641A1 (en) | Method for inputting a chinese character with phonetic symbols | |
Fragoulaki | History-(A.) Rengakos and (A.) Tsakmakis Eds. Brill's Companion to Thucydides.(Brill's Companions in Classical Studies). Leiden: Brill, 2006. Pp. xix+ 947.€ 249. 9789004136830. | |
Molina et al. | Evaluation of unambiguous virtual keyboards with character prediction | |
Stewart | A HISTORY OF THE HELLENISTIC PELOPONNESE-(I.) Kralli The Hellenistic Peloponnese: Interstate Relations. A Narrative and Analytic History, from the Fourth Century to 146 bc. Pp. xxxiv+ 556, maps. Swansea: The Classical Press of Wales, 2017. Cased,£ 75. ISBN: 978-1-910589-60-1. | |
Ngonyani | Sentential negation and verb movement in Bantu languages | |
McNamara | Multilingualism in Medieval Britain (c. 1066-1520): Sources and Analysis ed. by Judith A. Jefferson and Ad Putter | |
Everson | Proposal to encode the Wancho script in the UCS |