CN103582880B - Compression match enumeration - Google Patents
Compression match enumeration
- Publication number
- CN103582880B
- Authority
- CN
- China
- Prior art keywords
- node
- trie
- suffix
- data
- suffix array
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
- H03M7/3086—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6011—Encoder aspects
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/152,733 | 2011-06-03 | | |
| US13/152,733 (US8493249B2, en) | 2011-06-03 | 2011-06-03 | Compression match enumeration |
| PCT/US2011/055532 (WO2012166190A1, en) | 2011-06-03 | 2011-10-09 | Compression match enumeration |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN103582880A (zh) | 2014-02-12 |
| CN103582880B (zh) | 2017-05-03 |
Family
ID=47259701
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201180071391.0A (CN103582880B, zh; Expired - Fee Related) | 2011-06-03 | 2011-10-09 | Compression match enumeration |
Country Status (6)
| Country | Link |
|---|---|
| US (2) | US8493249B2 |
| EP (1) | EP2715568A4 |
| JP (1) | JP5873925B2 |
| KR (2) | KR101865264B1 |
| CN (1) | CN103582880B |
| WO (1) | WO2012166190A1 |
Families Citing this family (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8493249B2 (en) | 2011-06-03 | 2013-07-23 | Microsoft Corporation | Compression match enumeration |
| JP5766588B2 (ja) * | 2011-11-16 | 2015-08-19 | クラリオン株式会社 (Clarion Co., Ltd.) | Search terminal device, search server device, and center-linked search system |
| US9231615B2 (en) | 2012-10-24 | 2016-01-05 | Seagate Technology Llc | Method to shorten hash chains in Lempel-Ziv compression of data with repetitive symbols |
| WO2014117353A1 (en) * | 2013-01-31 | 2014-08-07 | Hewlett-Packard Development Company, L.P. | Incremental update of a shape graph |
| US9760546B2 (en) * | 2013-05-24 | 2017-09-12 | Xerox Corporation | Identifying repeat subsequences by left and right contexts |
| US10565182B2 (en) * | 2015-11-23 | 2020-02-18 | Microsoft Technology Licensing, Llc | Hardware LZMA compressor |
| CN108664459B (zh) * | 2018-03-22 | 2021-09-17 | 佛山市顺德区中山大学研究院 (Sun Yat-sen University Research Institute, Shunde District, Foshan) | Adaptive suffix array merging method and device |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4906991A (en) * | 1988-04-29 | 1990-03-06 | Xerox Corporation | Textual substitution data compression with finite length search windows |
| US5406279A (en) * | 1992-09-02 | 1995-04-11 | Cirrus Logic, Inc. | General purpose, hash-based technique for single-pass lossless data compression |
| US7124034B2 (en) * | 1999-12-24 | 2006-10-17 | International Business Machines Corporation | Method for changing a target array, a method for analyzing a structure, and an apparatus, a storage medium and a transmission medium therefor |
| CN1894696A (zh) * | 2003-12-23 | 2007-01-10 | Intel Corporation | Method and apparatus for detecting patterns in a data stream |
Family Cites Families (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5442350A (en) | 1992-10-29 | 1995-08-15 | International Business Machines Corporation | Method and means providing static dictionary structures for compressing character data and expanding compressed data |
| US5978795A (en) | 1997-01-14 | 1999-11-02 | Microsoft Corporation | Temporally ordered binary search method and system |
| KR19990015114A (ko) * | 1997-08-01 | 1999-03-05 | 구자홍 | Character recognizer using character connection information |
| US6047283A (en) | 1998-02-26 | 2000-04-04 | Sap Aktiengesellschaft | Fast string searching and indexing using a search tree having a plurality of linked nodes |
| US6751624B2 (en) * | 2000-04-04 | 2004-06-15 | Globalscape, Inc. | Method and system for conducting a full text search on a client system by a server system |
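| JP2002269096A (ja) | 2001-03-08 | 2002-09-20 | Ricoh Co Ltd | Character string restoration method and apparatus, and recording medium |
| KR100793505B1 (ko) * | 2006-05-30 | 2008-01-14 | 울산대학교 산학협력단 (University of Ulsan Industry-Academic Cooperation Foundation) | Method for extracting siRNA base sequences applicable to multiple target mRNAs |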
| US7453377B2 (en) | 2006-08-09 | 2008-11-18 | Reti Corporation | Apparatus and methods for searching a pattern in a compressed data |
| US8099415B2 (en) * | 2006-09-08 | 2012-01-17 | Simply Hired, Inc. | Method and apparatus for assessing similarity between online job listings |
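| JP4714127B2 (ja) * | 2006-11-27 | 2011-06-29 | 株式会社日立製作所 (Hitachi, Ltd.) | Symbol string search method, program, and apparatus, and trie generation method, program, and apparatus therefor |
| JP4439013B2 (ja) | 2007-04-25 | 2010-03-24 | 株式会社エスグランツ (S. Grants Co., Ltd.) | Bit string search method and search program |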
| US8812508B2 (en) | 2007-12-14 | 2014-08-19 | Hewlett-Packard Development Company, L.P. | Systems and methods for extracting phases from text |
| US8676815B2 (en) * | 2008-05-07 | 2014-03-18 | City University Of Hong Kong | Suffix tree similarity measure for document clustering |
| US8108353B2 (en) * | 2008-06-11 | 2012-01-31 | International Business Machines Corporation | Method and apparatus for block size optimization in de-duplication |
| US8515961B2 (en) * | 2010-01-19 | 2013-08-20 | Electronics And Telecommunications Research Institute | Method and apparatus for indexing suffix tree in social network |
| US8493249B2 (en) | 2011-06-03 | 2013-07-23 | Microsoft Corporation | Compression match enumeration |
| TWI443539B (zh) * | 2012-01-06 | 2014-07-01 | Univ Nat Central | Method for data analysis by means of a weighted suffix tree |
2011
- 2011-06-03: US application US13/152,733, patent US8493249B2 (en), not active, Expired - Fee Related
- 2011-10-09: CN application CN201180071391.0A, patent CN103582880B (zh), not active, Expired - Fee Related
- 2011-10-09: WO application PCT/US2011/055532, publication WO2012166190A1 (en), not active, Ceased
- 2011-10-09: JP application JP2014513494A, patent JP5873925B2 (ja), not active, Expired - Fee Related
- 2011-10-09: KR application KR1020137032073A, patent KR101865264B1 (ko), not active, Expired - Fee Related
- 2011-10-09: EP application EP11867008.2A, publication EP2715568A4 (en), not active, Ceased
- 2011-10-09: KR application KR1020187015423A, patent KR101926324B1 (ko), not active, Expired - Fee Related

2013
- 2013-07-19: US application US13/946,168, patent US9065469B2 (en), active
Also Published As
| Publication number | Publication date |
|---|---|
| US20120306670A1 (en) | 2012-12-06 |
| US9065469B2 (en) | 2015-06-23 |
| KR101865264B1 (ko) | 2018-06-07 |
| KR20180066254A (ko) | 2018-06-18 |
| JP5873925B2 (ja) | 2016-03-01 |
| EP2715568A4 (en) | 2016-01-06 |
| US20130307710A1 (en) | 2013-11-21 |
| CN103582880A (zh) | 2014-02-12 |
| WO2012166190A1 (en) | 2012-12-06 |
| KR101926324B1 (ko) | 2019-02-26 |
| KR20140038441A (ko) | 2014-03-28 |
| JP2014520318A (ja) | 2014-08-21 |
| EP2715568A1 (en) | 2014-04-09 |
| US8493249B2 (en) | 2013-07-23 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN103582880B (zh) | Compression match enumeration | |
| Gueniche et al. | Compact prediction tree: A lossless model for accurate sequence prediction | |
| US8838551B2 (en) | Multi-level database compression | |
| CN111460311A (zh) | Trie-based search processing method, apparatus, device, and storage medium | |
| US8659451B2 (en) | Indexing compressed data | |
| US8698657B2 (en) | Methods and systems for compressing and decompressing data | |
| CN107291785A (zh) | Data lookup method and apparatus | |
| US20090063465A1 (en) | System and method for string processing and searching using a compressed permuterm index | |
| US20100278446A1 (en) | Structure of hierarchical compressed data structure for tabular data | |
| Ferragina et al. | On the bit-complexity of Lempel-Ziv compression | |
| US9720927B2 (en) | Method and system for database storage management | |
| Yamamoto et al. | Faster compact on-line Lempel-Ziv factorization | |
| Belazzougui et al. | Bidirectional variable-order de Bruijn graphs | |
| JPS6356726B2 | | |
| US8996531B1 (en) | Inverted index and inverted list process for storing and retrieving information | |
| US20120110025A1 (en) | Coding order-independent collections of words | |
| JP5544998B2 (ja) | Text processing device, text processing method, and text processing program | |
| Vey | Differential direct coding: a compression algorithm for nucleotide sequence data | |
| CN115617818B (zh) | Batch update method for MPT trees in a blockchain, electronic device, and storage medium | |
| JP5939259B2 (ja) | Collation control program, collation control device, and collation control method | |
| CN114443866B (zh) | Data processing method, apparatus, computing device, and medium | |
| Hoang et al. | Dictionary selection using partial matching | |
| CN110807092A (zh) | Data processing method and apparatus | |
| CN110545108B (zh) | Data processing method, apparatus, electronic device, and computer-readable storage medium | |
| JP5521064B1 (ja) | ID assignment device, method, and program | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | ASS | Succession or assignment of patent right | Owner name: MICROSOFT TECHNOLOGY LICENSING LLC; Free format text: FORMER OWNER: MICROSOFT CORP.; Effective date: 20150623 |
| | C41 | Transfer of patent application or patent right or utility model | |
| | TA01 | Transfer of patent application right | Effective date of registration: 20150623; Address after: Washington State; Applicant after: MICROSOFT TECHNOLOGY LICENSING, LLC; Address before: Washington State; Applicant before: Microsoft Corp. |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20170503 |