AU1907300A - Term-length term-frequency method for measuring document similarity and classifying text - Google Patents
Term-length term-frequency method for measuring document similarity and classifying text
- Publication number
- AU1907300A
- Authority
- AU
- Australia
- Prior art keywords
- term
- frequency method
- length
- document similarity
- classifying text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/313—Selection or weighting of terms for indexing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US20156998A | 1998-11-30 | 1998-11-30 | |
US09201569 | 1998-11-30 | | |
PCT/US1999/025686 WO2000033215A1 (en) | 1998-11-30 | 1999-11-01 | Term-length term-frequency method for measuring document similarity and classifying text |
Publications (1)
Publication Number | Publication Date |
---|---|
AU1907300A (en) | 2000-06-19 |
Family
ID=22746357
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU19073/00A Abandoned AU1907300A (en) | 1998-11-30 | 1999-11-01 | Term-length term-frequency method for measuring document similarity and classifying text |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU1907300A (en) |
WO (1) | WO2000033215A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105956010A (en) * | 2016-04-20 | 2016-09-21 | 浙江大学 | Distributed information retrieval set selection method based on distributed representation and local ordering |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3573688B2 (en) * | 2000-06-28 | 2004-10-06 | 松下電器産業株式会社 | Similar document search device and related keyword extraction device |
AUPR208000A0 (en) * | 2000-12-15 | 2001-01-11 | 80-20 Software Pty Limited | Method of document searching |
US7412453B2 (en) | 2002-12-30 | 2008-08-12 | International Business Machines Corporation | Document analysis and retrieval |
DE60315947T2 (en) * | 2003-03-27 | 2008-05-21 | Sony Deutschland Gmbh | Method for language modeling |
US7321880B2 (en) | 2003-07-02 | 2008-01-22 | International Business Machines Corporation | Web services access to classification engines |
JP2009516252A (en) * | 2005-11-15 | 2009-04-16 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | How to get a representation of text |
JP5027483B2 (en) * | 2006-11-10 | 2012-09-19 | 富士通株式会社 | Information search apparatus and information search method |
US8244767B2 (en) | 2009-10-09 | 2012-08-14 | Stratify, Inc. | Composite locality sensitive hash based processing of documents |
US9355171B2 (en) | 2009-10-09 | 2016-05-31 | Hewlett Packard Enterprise Development Lp | Clustering of near-duplicate documents |
CN103218435B (en) * | 2013-04-15 | 2017-01-25 | 上海嘉之道企业管理咨询有限公司 | Method and system for clustering Chinese text data |
US8837835B1 (en) * | 2014-01-20 | 2014-09-16 | Array Technology, LLC | Document grouping system |
CN114492446B (en) * | 2022-02-16 | 2023-06-16 | 平安科技(深圳)有限公司 | Legal document processing method and device, electronic equipment and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5748953A (en) * | 1989-06-14 | 1998-05-05 | Hitachi, Ltd. | Document search method wherein stored documents and search queries comprise segmented text data of spaced, nonconsecutive text elements and words segmented by predetermined symbols |
JP3270783B2 (en) * | 1992-09-29 | 2002-04-02 | ゼロックス・コーポレーション | Multiple document search methods |
US5642502A (en) * | 1994-12-06 | 1997-06-24 | University Of Central Florida | Method and system for searching for relevant documents from a text database collection, using statistical ranking, relevancy feedback and small pieces of text |
1999
- 1999-11-01 WO PCT/US1999/025686 patent/WO2000033215A1/en active Application Filing
- 1999-11-01 AU AU19073/00A patent/AU1907300A/en not_active Abandoned
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105956010A (en) * | 2016-04-20 | 2016-09-21 | 浙江大学 | Distributed information retrieval set selection method based on distributed representation and local ordering |
CN105956010B (en) * | 2016-04-20 | 2019-03-26 | 浙江大学 | Distributed information retrieval set option method based on distributed characterization and partial ordering |
Also Published As
Publication number | Publication date |
---|---|
WO2000033215A1 (en) | 2000-06-08 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
AU2001264928A1 (en) | System and method for automatically classifying text | |
AU4905997A (en) | Management and analysis of document information text | |
EP0996927A4 (en) | Text classification system and method | |
AU1357099A (en) | Method and device for classifying overhead objects | |
AU4698899A (en) | Computer audio reading device providing highlighting of either character or bitmapped based text images | |
AU8887298A (en) | Information processing device and information processing method | |
AU5251196A (en) | Method and system for two-dimensional visualization of an information taxonomy and of text documents based on topical content of the documents | |
AU1133200A (en) | Prescription-controlled data collection system and method | |
GB2345771B (en) | Apparatus for classifying or disambiguating data | |
AUPP764398A0 (en) | Method and apparatus for computing the similarity between images | |
AU4320299A (en) | Methods and apparatuses for processing security documents | |
GB2318439B (en) | Device and method for representing handwriting, and an alphabet therefor | |
AU6420699A (en) | Document facing method and apparatus | |
AU2001275422A1 (en) | Method and system for text analysis | |
AUPP603798A0 (en) | Automated image interpretation and retrieval system | |
AU1095200A (en) | Data exploration system and method | |
AU4043797A (en) | Method and apparatus for processing and determining the orientation of documents | |
AU6401599A (en) | Environmental material ticket reader (emtr) and environmental material ticket (emt) system | |
AU6265999A (en) | Computer curve construction system and method | |
AU4620899A (en) | Electronic file retrieval method and system | |
HK1038087A1 (en) | System and method for searching electronic documents created with optical character recognition. | |
AU2198300A (en) | Improved techniques for spatial representation of data and browsing based on similarity | |
AU2277900A (en) | Method and device for object recognition | |
AU7540996A (en) | Fingerprint characteristic extraction apparatus as well as fingerprint classification apparatus and fingerprint verification apparatus for use with fingerprint characteristic extraction apparatus | |
AU1907300A (en) | Term-length term-frequency method for measuring document similarity and classifying text |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | MK6 | Application lapsed section 142(2)(f)/reg. 8.3(3) - PCT application not entering national phase | |