KR101564385B1 - Method and system for managing an archive for approximate string matching
- Publication number: KR101564385B1
- Authority: KR (South Korea)
- Prior art keywords: word, words, series, record, archive
- Legal status: Active (an assumption, not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
- G06F16/90344—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/113—Details of archiving
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3338—Query expansion
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/015,085 US8775441B2 (en) | 2008-01-16 | 2008-01-16 | Managing an archive for approximate string matching |
| US12/015,085 | 2008-01-16 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| KR20100116595A (ko) | 2010-11-01 |
| KR101564385B1 (ko) | 2015-10-29 |
Family
ID=40851547
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020107017207A Active KR101564385B1 (ko) | 2008-01-16 | 2008-12-30 | Method and system for managing an archive for approximate string matching |
Country Status (8)
| Country | Link |
|---|---|
| US (2) | US8775441B2 (en) |
| EP (1) | EP2235621A4 (en) |
| JP (1) | JP5603250B2 (en) |
| KR (1) | KR101564385B1 (en) |
| CN (2) | CN105373365B (en) |
| AU (1) | AU2008348066B2 (en) |
| CA (1) | CA2710882C (en) |
| WO (1) | WO2009091494A1 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20240025272A (ko) | 2022-08-18 | 2024-02-27 | Korea Electric Power Corporation | Unstructured-data-based approximate question answering system and method for natural language processing |
Families Citing this family (74)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7877350B2 (en) * | 2005-06-27 | 2011-01-25 | Ab Initio Technology Llc | Managing metadata for graph-based computations |
| US8572236B2 (en) * | 2006-08-10 | 2013-10-29 | Ab Initio Technology Llc | Distributing services in graph-based computations |
| CN101821721B (zh) | 2007-07-26 | 2017-04-12 | Ab Initio Technology LLC | Transactional graph-based computation with error handling |
| US8775441B2 (en) | 2008-01-16 | 2014-07-08 | Ab Initio Technology Llc | Managing an archive for approximate string matching |
| WO2009094649A1 (en) * | 2008-01-24 | 2009-07-30 | Sra International, Inc. | System and method for variant string matching |
| US8095773B2 (en) | 2008-02-26 | 2012-01-10 | International Business Machines Corporation | Dynamic address translation with translation exception qualifier |
| KR101491581B1 (ko) * | 2008-04-07 | 2015-02-24 | Samsung Electronics Co., Ltd. | Spelling error correction system and method |
| US8484215B2 (en) | 2008-10-23 | 2013-07-09 | Ab Initio Technology Llc | Fuzzy data operations |
| US9135396B1 (en) | 2008-12-22 | 2015-09-15 | Amazon Technologies, Inc. | Method and system for determining sets of variant items |
| EP2396724A4 (en) | 2009-02-13 | 2012-12-12 | Ab Initio Technology Llc | TASK EXECUTION MANAGEMENT |
| US9124431B2 (en) * | 2009-05-14 | 2015-09-01 | Microsoft Technology Licensing, Llc | Evidence-based dynamic scoring to limit guesses in knowledge-based authentication |
| US8856879B2 (en) | 2009-05-14 | 2014-10-07 | Microsoft Corporation | Social authentication for account recovery |
| US8667329B2 (en) * | 2009-09-25 | 2014-03-04 | Ab Initio Technology Llc | Processing transactions in graph-based applications |
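| CN102792298B (zh) | 2010-01-13 | 2017-03-29 | Ab Initio Technology LLC | Matching metadata sources using rules for characterizing matches |
| CN107066241B (zh) | 2010-06-15 | 2021-03-09 | Ab Initio Technology LLC | System and method for dynamically loading graph-based computations |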
| US9069767B1 (en) | 2010-12-28 | 2015-06-30 | Amazon Technologies, Inc. | Aligning content items to identify differences |
| US8798366B1 (en) * | 2010-12-28 | 2014-08-05 | Amazon Technologies, Inc. | Electronic book pagination |
| US9846688B1 (en) | 2010-12-28 | 2017-12-19 | Amazon Technologies, Inc. | Book version mapping |
| CA2823658C (en) | 2011-01-28 | 2018-03-13 | Ab Initio Technology Llc | Generating data pattern information |
| US9881009B1 (en) | 2011-03-15 | 2018-01-30 | Amazon Technologies, Inc. | Identifying book title sets |
| US9317544B2 (en) | 2011-10-05 | 2016-04-19 | Microsoft Corporation | Integrated fuzzy joins in database management systems |
| CN108388632B (zh) | 2011-11-15 | 2021-11-19 | Ab Initio Technology LLC | Data clustering, segmentation, and parallelization |
| US8788471B2 (en) * | 2012-05-30 | 2014-07-22 | International Business Machines Corporation | Matching transactions in multi-level records |
| US10108521B2 (en) | 2012-11-16 | 2018-10-23 | Ab Initio Technology Llc | Dynamic component performance monitoring |
| US9507682B2 (en) | 2012-11-16 | 2016-11-29 | Ab Initio Technology Llc | Dynamic graph performance monitoring |
| GB2508223A (en) | 2012-11-26 | 2014-05-28 | Ibm | Estimating the size of a joined table in a database |
| GB2508603A (en) | 2012-12-04 | 2014-06-11 | Ibm | Optimizing the order of execution of multiple join operations |
| US9274926B2 (en) | 2013-01-03 | 2016-03-01 | Ab Initio Technology Llc | Configurable testing of computer programs |
| US9063944B2 (en) | 2013-02-21 | 2015-06-23 | International Business Machines Corporation | Match window size for matching multi-level transactions between log files |
| US9317499B2 (en) * | 2013-04-11 | 2016-04-19 | International Business Machines Corporation | Optimizing generation of a regular expression |
| US9146946B2 (en) * | 2013-05-09 | 2015-09-29 | International Business Machines Corporation | Comparing database performance without benchmark workloads |
| US20140350919A1 (en) * | 2013-05-27 | 2014-11-27 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for word counting |
| CN104182383B (zh) * | 2013-05-27 | 2019-01-01 | Tencent Technology (Shenzhen) Company Limited | Text counting method and device |
| US20150046152A1 (en) * | 2013-08-08 | 2015-02-12 | Quryon, Inc. | Determining concept blocks based on context |
| US10043182B1 (en) * | 2013-10-22 | 2018-08-07 | Ondot System, Inc. | System and method for using cardholder context and preferences in transaction authorization |
| JP6626823B2 (ja) | 2013-12-05 | 2019-12-25 | Ab Initio Technology LLC | Managing interfaces for dataflow graphs composed of sub-graphs |
| US10521441B2 (en) * | 2014-01-02 | 2019-12-31 | The George Washington University | System and method for approximate searching very large data |
| MY173084A (en) * | 2014-05-23 | 2019-12-25 | Mimos Berhad | Adaptive-window edit distance algorithm computation |
| US9589074B2 (en) | 2014-08-20 | 2017-03-07 | Oracle International Corporation | Multidimensional spatial searching for identifying duplicate crash dumps |
| US10764265B2 (en) * | 2014-09-24 | 2020-09-01 | Ent. Services Development Corporation Lp | Assigning a document to partial membership in communities |
| US9805099B2 (en) * | 2014-10-30 | 2017-10-31 | The Johns Hopkins University | Apparatus and method for efficient identification of code similarity |
| US9679024B2 (en) * | 2014-12-01 | 2017-06-13 | Facebook, Inc. | Social-based spelling correction for online social networks |
| JP2015062146A (ja) * | 2015-01-05 | 2015-04-02 | Fujitsu Limited | Information generation program, information generation device, and information generation method |
| US9646061B2 (en) | 2015-01-22 | 2017-05-09 | International Business Machines Corporation | Distributed fuzzy search and join with edit distance guarantees |
| US20170004120A1 (en) * | 2015-06-30 | 2017-01-05 | Facebook, Inc. | Corrections for natural language processing |
| US9904672B2 (en) | 2015-06-30 | 2018-02-27 | Facebook, Inc. | Machine-translation based corrections |
| US10657134B2 (en) | 2015-08-05 | 2020-05-19 | Ab Initio Technology Llc | Selecting queries for execution on a stream of real-time data |
| US10140200B2 (en) * | 2015-10-15 | 2018-11-27 | King.Dom Ltd. | Data analysis |
| IL242218B (en) * | 2015-10-22 | 2020-11-30 | Verint Systems Ltd | A system and method for maintaining a dynamic dictionary |
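| CN105446957B (zh) | 2015-12-03 | 2018-07-20 | Xiaomi Inc. | Similarity determination method, device, and terminal |
| CN108475189B (zh) | 2015-12-21 | 2021-07-09 | Ab Initio Technology LLC | Method, system, and computer-readable medium for sub-graph interface generation |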
| US9881053B2 (en) * | 2016-05-13 | 2018-01-30 | Maana, Inc. | Machine-assisted object matching |
| US11176180B1 (en) * | 2016-08-09 | 2021-11-16 | American Express Travel Related Services Company, Inc. | Systems and methods for address matching |
| US10228955B2 (en) * | 2016-09-29 | 2019-03-12 | International Business Machines Corporation | Running an application within an application execution environment and preparation of an application for the same |
| US11222253B2 (en) * | 2016-11-03 | 2022-01-11 | Salesforce.Com, Inc. | Deep neural network model for processing data through multiple linguistic task hierarchies |
| US10394960B2 (en) | 2016-12-21 | 2019-08-27 | Facebook, Inc. | Transliteration decoding using a tree structure |
| US10402489B2 (en) | 2016-12-21 | 2019-09-03 | Facebook, Inc. | Transliteration of text entry across scripts |
| US10810380B2 (en) | 2016-12-21 | 2020-10-20 | Facebook, Inc. | Transliteration using machine translation pipeline |
| US11087210B2 (en) * | 2017-08-18 | 2021-08-10 | MyFitnessPal, Inc. | Context and domain sensitive spelling correction in a database |
| US10546062B2 (en) * | 2017-11-15 | 2020-01-28 | International Business Machines Corporation | Phonetic patterns for fuzzy matching in natural language processing |
| US11294943B2 (en) | 2017-12-08 | 2022-04-05 | International Business Machines Corporation | Distributed match and association of entity key-value attribute pairs |
| US11163952B2 (en) * | 2018-07-11 | 2021-11-02 | International Business Machines Corporation | Linked data seeded multi-lingual lexicon extraction |
| WO2020159772A1 (en) * | 2019-01-31 | 2020-08-06 | Optumsoft, Inc. | Approximate matching |
| US11269905B2 (en) * | 2019-06-20 | 2022-03-08 | International Business Machines Corporation | Interaction between visualizations and other data controls in an information system by matching attributes in different datasets |
| US12008141B2 (en) * | 2020-03-31 | 2024-06-11 | Intuit Inc. | Privacy preserving synthetic string generation using recurrent neural networks |
| CN112084771B (zh) * | 2020-07-22 | 2024-06-18 | Zhejiang University of Technology | Address-based single-character weight statistics method |
| US11886794B2 (en) * | 2020-10-23 | 2024-01-30 | Saudi Arabian Oil Company | Text scrambling/descrambling |
| US11556593B1 (en) | 2021-07-14 | 2023-01-17 | International Business Machines Corporation | String similarity determination |
| US12019701B2 (en) | 2021-07-27 | 2024-06-25 | International Business Machines Corporation | Computer architecture for string searching |
| US11615243B1 (en) * | 2022-05-27 | 2023-03-28 | Intuit Inc. | Fuzzy string alignment |
| US12406014B2 (en) | 2022-06-13 | 2025-09-02 | Rogda L.L.C. | Method and system for generating location information for an area |
| US12293153B2 (en) * | 2022-08-22 | 2025-05-06 | International Business Machines Corporation | Fuzzy matching of obscure texts with meaningful terms included in a glossary |
| US20240135392A1 (en) * | 2022-10-18 | 2024-04-25 | Pelatro Pte.Ltd. | System and Method for Real Time Scoring, Classification, Assortment, and Contextual Nurturing of Digital Engagements using Numerical, Statistical, and Heuristics-based Techniques |
| WO2025005812A1 (en) * | 2023-06-28 | 2025-01-02 | Xero Limited | Methods and systems for determining sets of second attributes associated with respective first attributes |
Family Cites Families (90)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH02129756A (ja) | 1988-11-10 | 1990-05-17 | Nippon Telegr & Teleph Corp <Ntt> | Word matching device |
| US5179643A (en) * | 1988-12-23 | 1993-01-12 | Hitachi, Ltd. | Method of multi-dimensional analysis and display for a large volume of record information items and a system therefor |
| US5388259A (en) * | 1992-05-15 | 1995-02-07 | Bell Communications Research, Inc. | System for accessing a database with an iterated fuzzy query notified by retrieval response |
| JPH0644309A (ja) | 1992-07-01 | 1994-02-18 | Nec Corp | Database management system |
| JPH0944518A (ja) | 1995-08-02 | 1997-02-14 | Adoin Kenkyusho:Kk | Method for constructing an image database, and method and device for searching an image database |
| US5832182A (en) * | 1996-04-24 | 1998-11-03 | Wisconsin Alumni Research Foundation | Method and system for data clustering for very large databases |
| JPH10275159A (ja) | 1997-03-31 | 1998-10-13 | Nippon Telegr & Teleph Corp <Ntt> | Information retrieval method and device |
| JP3466054B2 (ja) | 1997-04-18 | 2003-11-10 | Fujitsu Limited | Grouping and aggregation operation processing method |
| US6026398A (en) * | 1997-10-16 | 2000-02-15 | Imarket, Incorporated | System and methods for searching and matching databases |
| JPH11184884A (ja) | 1997-12-24 | 1999-07-09 | Ntt Data Corp | Same-person determination system and method |
| US6581058B1 (en) * | 1998-05-22 | 2003-06-17 | Microsoft Corporation | Scalable system for clustering of large databases having mixed data attributes |
| US6285995B1 (en) | 1998-06-22 | 2001-09-04 | U.S. Philips Corporation | Image retrieval system using a query image |
| US6742003B2 (en) * | 2001-04-30 | 2004-05-25 | Microsoft Corporation | Apparatus and accompanying methods for visualizing clusters of data and hierarchical cluster classifications |
| JP2000029899A (ja) | 1998-07-14 | 2000-01-28 | Hitachi Software Eng Co Ltd | Method for matching buildings with a map, and recording medium |
| US6658626B1 (en) * | 1998-07-31 | 2003-12-02 | The Regents Of The University Of California | User interface for displaying document comparison information |
| US6493709B1 (en) * | 1998-07-31 | 2002-12-10 | The Regents Of The University Of California | Method and apparatus for digitally shredding similar documents within large document sets in a data processing environment |
| US6317707B1 (en) * | 1998-12-07 | 2001-11-13 | At&T Corp. | Automatic clustering of tokens from a corpus for grammar acquisition |
| US7356462B2 (en) * | 2001-07-26 | 2008-04-08 | At&T Corp. | Automatic clustering of tokens from a corpus for grammar acquisition |
| US6456995B1 (en) * | 1998-12-31 | 2002-09-24 | International Business Machines Corporation | System, method and computer program products for ordering objects corresponding to database operations that are performed on a relational database upon completion of a transaction by an object-oriented transaction system |
| WO2001009765A1 (en) * | 1999-08-03 | 2001-02-08 | Compudigm International Limited | Method and system for matching data sets |
| AU1051101A (en) | 1999-10-27 | 2001-05-08 | Zapper Technologies Inc. | Context-driven information retrieval |
| JP2001147930A (ja) | 1999-11-19 | 2001-05-29 | Mitsubishi Electric Corp | Character string comparison method and information retrieval device using character string comparison |
| US7328211B2 (en) * | 2000-09-21 | 2008-02-05 | Jpmorgan Chase Bank, N.A. | System and methods for improved linguistic pattern matching |
| DE10048478C2 (de) * | 2000-09-29 | 2003-05-28 | Siemens Ag | Verfahren zum Zugriff auf eine Speichereinheit bei der Suche nach Teilzeichenfolgen |
| US6931390B1 (en) * | 2001-02-27 | 2005-08-16 | Oracle International Corporation | Method and mechanism for database partitioning |
| JP3605052B2 (ja) | 2001-06-20 | 2004-12-22 | Honda Motor Co., Ltd. | Drawing management system with fuzzy search function |
| US20030033138A1 (en) * | 2001-07-26 | 2003-02-13 | Srinivas Bangalore | Method for partitioning a data set into frequency vectors for clustering |
| US7043647B2 (en) | 2001-09-28 | 2006-05-09 | Hewlett-Packard Development Company, L.P. | Intelligent power management for a rack of servers |
| US6570511B1 (en) * | 2001-10-15 | 2003-05-27 | Unisys Corporation | Data compression method and apparatus implemented with limited length character tables and compact string code utilization |
| US7213025B2 (en) | 2001-10-16 | 2007-05-01 | Ncr Corporation | Partitioned database system |
| US20030120630A1 (en) * | 2001-12-20 | 2003-06-26 | Daniel Tunkelang | Method and system for similarity search and clustering |
| US20040024720A1 (en) * | 2002-02-01 | 2004-02-05 | John Fairweather | System and method for managing knowledge |
| CA2475319A1 (en) * | 2002-02-04 | 2003-08-14 | Cataphora, Inc. | A method and apparatus to visually present discussions for data mining purposes |
| AU2003243533A1 (en) * | 2002-06-12 | 2003-12-31 | Jena Jordahl | Data storage, retrieval, manipulation and display tools enabling multiple hierarchical points of view |
| US6961721B2 (en) * | 2002-06-28 | 2005-11-01 | Microsoft Corporation | Detecting duplicate records in database |
| US20050226511A1 (en) | 2002-08-26 | 2005-10-13 | Short Gordon K | Apparatus and method for organizing and presenting content |
| US7043476B2 (en) * | 2002-10-11 | 2006-05-09 | International Business Machines Corporation | Method and apparatus for data mining to discover associations and covariances associated with data |
| US7392247B2 (en) | 2002-12-06 | 2008-06-24 | International Business Machines Corporation | Method and apparatus for fusing context data |
| US20040139072A1 (en) * | 2003-01-13 | 2004-07-15 | Broder Andrei Z. | System and method for locating similar records in a database |
| US7912842B1 (en) | 2003-02-04 | 2011-03-22 | Lexisnexis Risk Data Management Inc. | Method and system for processing and linking data records |
| US7287019B2 (en) * | 2003-06-04 | 2007-10-23 | Microsoft Corporation | Duplicate data elimination system |
| US20050120011A1 (en) * | 2003-11-26 | 2005-06-02 | Word Data Corp. | Code, method, and system for manipulating texts |
| US7526464B2 (en) * | 2003-11-28 | 2009-04-28 | Manyworlds, Inc. | Adaptive fuzzy network system and method |
| US7283999B1 (en) * | 2003-12-19 | 2007-10-16 | Ncr Corp. | Similarity string filtering |
| US7472113B1 (en) * | 2004-01-26 | 2008-12-30 | Microsoft Corporation | Query preprocessing and pipelining |
| GB0413743D0 (en) * | 2004-06-19 | 2004-07-21 | Ibm | Method and system for approximate string matching |
| US8407239B2 (en) * | 2004-08-13 | 2013-03-26 | Google Inc. | Multi-stage query processing system and method for use with tokenspace repository |
| US7917480B2 (en) * | 2004-08-13 | 2011-03-29 | Google Inc. | Document compression system and method for use with tokenspace repository |
| US20080040342A1 (en) * | 2004-09-07 | 2008-02-14 | Hust Robert M | Data processing apparatus and methods |
| US8725705B2 (en) * | 2004-09-15 | 2014-05-13 | International Business Machines Corporation | Systems and methods for searching of storage data with reduced bandwidth requirements |
| US7523098B2 (en) * | 2004-09-15 | 2009-04-21 | International Business Machines Corporation | Systems and methods for efficient data searching, storage and reduction |
| US7290084B2 (en) * | 2004-11-02 | 2007-10-30 | Integrated Device Technology, Inc. | Fast collision detection for a hashed content addressable memory (CAM) using a random access memory |
| EP1866808A2 (en) * | 2005-03-19 | 2007-12-19 | ActivePrime, Inc. | Systems and methods for manipulation of inexact semi-structured data |
| US9110985B2 (en) * | 2005-05-10 | 2015-08-18 | Neetseer, Inc. | Generating a conceptual association graph from large-scale loosely-grouped content |
| US7584205B2 (en) | 2005-06-27 | 2009-09-01 | Ab Initio Technology Llc | Aggregating data with complex operations |
| US7658880B2 (en) * | 2005-07-29 | 2010-02-09 | Advanced Cardiovascular Systems, Inc. | Polymeric stent polishing method and apparatus |
| US7672833B2 (en) * | 2005-09-22 | 2010-03-02 | Fair Isaac Corporation | Method and apparatus for automatic entity disambiguation |
| US7890533B2 (en) * | 2006-05-17 | 2011-02-15 | Noblis, Inc. | Method and system for information extraction and modeling |
| US8175875B1 (en) * | 2006-05-19 | 2012-05-08 | Google Inc. | Efficient indexing of documents with similar content |
| US7634464B2 (en) | 2006-06-14 | 2009-12-15 | Microsoft Corporation | Designing record matching queries utilizing examples |
| US20080140653A1 (en) * | 2006-12-08 | 2008-06-12 | Matzke Douglas J | Identifying Relationships Among Database Records |
| US7739247B2 (en) * | 2006-12-28 | 2010-06-15 | Ebay Inc. | Multi-pass data organization and automatic naming |
| CA2675216A1 (en) * | 2007-01-10 | 2008-07-17 | Nick Koudas | Method and system for information discovery and text analysis |
| US8694472B2 (en) | 2007-03-14 | 2014-04-08 | Ca, Inc. | System and method for rebuilding indices for partitioned databases |
| US7711747B2 (en) * | 2007-04-06 | 2010-05-04 | Xerox Corporation | Interactive cleaning for automatic document clustering and categorization |
| WO2008146456A1 (ja) * | 2007-05-28 | 2008-12-04 | Panasonic Corporation | Information search support method and information search support device |
| US7769778B2 (en) * | 2007-06-29 | 2010-08-03 | United States Postal Service | Systems and methods for validating an address |
| US7788276B2 (en) * | 2007-08-22 | 2010-08-31 | Yahoo! Inc. | Predictive stemming for web search with statistical machine translation models |
| US7925652B2 (en) * | 2007-12-31 | 2011-04-12 | Mastercard International Incorporated | Methods and systems for implementing approximate string matching within a database |
| US8775441B2 (en) | 2008-01-16 | 2014-07-08 | Ab Initio Technology Llc | Managing an archive for approximate string matching |
| US8032546B2 (en) * | 2008-02-15 | 2011-10-04 | Microsoft Corp. | Transformation-based framework for record matching |
| US8266168B2 (en) * | 2008-04-24 | 2012-09-11 | Lexisnexis Risk & Information Analytics Group Inc. | Database systems and methods for linking records and entity representations with sufficiently high confidence |
| US7958125B2 (en) * | 2008-06-26 | 2011-06-07 | Microsoft Corporation | Clustering aggregator for RSS feeds |
| US20120191973A1 (en) * | 2008-09-10 | 2012-07-26 | National Ict Australia Limited | Online presence of users |
| US8150169B2 (en) * | 2008-09-16 | 2012-04-03 | Viewdle Inc. | System and method for object clustering and identification in video |
| US8484215B2 (en) | 2008-10-23 | 2013-07-09 | Ab Initio Technology Llc | Fuzzy data operations |
| US20100169311A1 (en) * | 2008-12-30 | 2010-07-01 | Ashwin Tengli | Approaches for the unsupervised creation of structural templates for electronic documents |
| JP5173898B2 (ja) | 2009-03-11 | 2013-04-03 | Canon Inc. | Image processing method, image processing device, and program |
| US8161048B2 (en) * | 2009-04-24 | 2012-04-17 | At&T Intellectual Property I, L.P. | Database analysis using clusters |
| US20100274770A1 (en) * | 2009-04-24 | 2010-10-28 | Yahoo! Inc. | Transductive approach to category-specific record attribute extraction |
| US8195626B1 (en) | 2009-06-18 | 2012-06-05 | Amazon Technologies, Inc. | Compressing token-based files for transfer and reconstruction |
| US8285681B2 (en) * | 2009-06-30 | 2012-10-09 | Commvault Systems, Inc. | Data object store and server for a cloud storage environment, including data deduplication and data management across multiple cloud storage sites |
| US8635223B2 (en) * | 2009-07-28 | 2014-01-21 | Fti Consulting, Inc. | System and method for providing a classification suggestion for electronically stored information |
| US9542647B1 (en) * | 2009-12-16 | 2017-01-10 | Board Of Regents, The University Of Texas System | Method and system for an ontology, including a representation of unified medical language system (UMLS) using simple knowledge organization system (SKOS) |
| US8375061B2 (en) * | 2010-06-08 | 2013-02-12 | International Business Machines Corporation | Graphical models for representing text documents for computer analysis |
| US8346772B2 (en) * | 2010-09-16 | 2013-01-01 | International Business Machines Corporation | Systems and methods for interactive clustering |
| US8463742B1 (en) | 2010-09-17 | 2013-06-11 | Permabit Technology Corp. | Managing deduplication of stored data |
| US8606771B2 (en) * | 2010-12-21 | 2013-12-10 | Microsoft Corporation | Efficient indexing of error tolerant set containment |
| US8612386B2 (en) * | 2011-02-11 | 2013-12-17 | Alcatel Lucent | Method and apparatus for peer-to-peer database synchronization in dynamic networks |
| CN108388632B (zh) | 2011-11-15 | 2021-11-19 | Ab Initio Technology LLC | Data clustering, segmentation, and parallelization |
-
2008
- 2008-01-16 US US12/015,085 patent/US8775441B2/en active Active
- 2008-12-30 CA CA2710882A patent/CA2710882C/en active Active
- 2008-12-30 CN CN201510647048.6A patent/CN105373365B/zh active Active
- 2008-12-30 CN CN200880128089.2A patent/CN101978348B/zh active Active
- 2008-12-30 KR KR1020107017207A patent/KR101564385B1/ko active Active
- 2008-12-30 AU AU2008348066A patent/AU2008348066B2/en active Active
- 2008-12-30 EP EP08870601A patent/EP2235621A4/en not_active Ceased
- 2008-12-30 WO PCT/US2008/088530 patent/WO2009091494A1/en not_active Ceased
- 2008-12-30 JP JP2010543117A patent/JP5603250B2/ja active Active
-
2014
- 2014-07-07 US US14/325,007 patent/US9563721B2/en active Active
Also Published As
| Publication number | Publication date |
|---|---|
| WO2009091494A1 (en) | 2009-07-23 |
| CN101978348B (zh) | 2015-11-25 |
| KR20100116595A (ko) | 2010-11-01 |
| US9563721B2 (en) | 2017-02-07 |
| CN105373365B (zh) | 2019-02-05 |
| US8775441B2 (en) | 2014-07-08 |
| CA2710882C (en) | 2017-01-17 |
| JP2011511341A (ja) | 2011-04-07 |
| CN105373365A (zh) | 2016-03-02 |
| CN101978348A (zh) | 2011-02-16 |
| JP5603250B2 (ja) | 2014-10-08 |
| EP2235621A4 (en) | 2012-08-29 |
| AU2008348066A1 (en) | 2009-07-23 |
| US20150066862A1 (en) | 2015-03-05 |
| AU2008348066B2 (en) | 2015-03-26 |
| CA2710882A1 (en) | 2009-07-23 |
| EP2235621A1 (en) | 2010-10-06 |
| US20090182728A1 (en) | 2009-07-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR101564385B1 (ko) | | Method and system for managing an archive for approximate string matching |
| Boytsov | | Indexing methods for approximate dictionary searching: Comparative analysis |
| US6470347B1 (en) | | Method, system, program, and data structure for a dense array storing character strings |
| US20040107205A1 (en) | | Boolean rule-based system for clustering similar records |
| Branting | | A comparative evaluation of name-matching algorithms |
| CN114580557A (zh) | | Method and device for determining document similarity based on semantic analysis |
| CN113032371A (zh) | | Database syntax analysis method, device, and computer equipment |
| Talburt et al. | | A practical guide to entity resolution with OYSTER |
| Berghel | | A logical framework for the correction of spelling errors in electronic documents |
| CN120723894B (zh) | | Retrieval-augmented generation method based on parent-child segmentation and multi-source recall |
| CN120316124B (zh) | | Data processing method and system for an enterprise digital transformation platform |
| CN114490599B (zh) | | Method for processing and retrieving certificate numbers |
| Martin et al. | | Incremental evolution of fuzzy grammar fragments to enhance instance matching and text mining |
| Wen et al. | | : An efficient entity extraction algorithm using two-level edit-distance |
| Luján-Mora et al. | | Reducing inconsistency in integrating data from different sources |
| JPH06282587A (ja) | | Method and device for automatic document classification, and method and device for creating a classification dictionary |
| AU2015202043B2 (en) | | Managing an archive for approximate string matching |
| Zhou et al. | | Implementing Boolean matching rules in an entity resolution system using XML scripts |
| Porwal et al. | | A comparative analysis of data cleaning approaches to dirty data |
| Kumar et al. | | Probabilistic management of OCR data using an RDBMS |
| Luján-Mora et al. | | Comparing string similarity measures for reducing inconsistency in integrating data from different sources |
| Wen et al. | | A technical report: entity extraction using both character-based and token-based similarity |
| Luján-Mora et al. | | Reducing inconsistency in data warehouses |
| Lemström et al. | | Approximate pattern matching and transitive closure logics |
| CN118550932A (zh) | | Method, system, and electronic device for querying keywords with a data structure shared by multiple lexicons |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PA0105 | International application | Patent event date: 20100730; Patent event code: PA01051R01D; Comment text: International Patent Application |
| | PG1501 | Laying open of application | |
| | A201 | Request for examination | |
| | PA0201 | Request for examination | Patent event code: PA02012R01D; Patent event date: 20131224; Comment text: Request for Examination of Application |
| | E902 | Notification of reason for refusal | |
| | PE0902 | Notice of grounds for rejection | Comment text: Notification of reason for refusal; Patent event date: 20141030; Patent event code: PE09021S01D |
| | E701 | Decision to grant or registration of patent right | |
| | PE0701 | Decision of registration | Patent event code: PE07011S01D; Comment text: Decision to Grant Registration; Patent event date: 20150730 |
| | GRNT | Written decision to grant | |
| | PR0701 | Registration of establishment | Comment text: Registration of Establishment; Patent event date: 20151023; Patent event code: PR07011E01D |
| | PR1002 | Payment of registration fee | Payment date: 20151023; End annual number: 3; Start annual number: 1 |
| | PG1601 | Publication of registration | |
| | FPAY | Annual fee payment | Payment date: 20181012; Year of fee payment: 4 |
| | PR1001 | Payment of annual fee | Payment date: 20181012; Start annual number: 4; End annual number: 4 |
| | FPAY | Annual fee payment | Payment date: 20191010; Year of fee payment: 5 |
| | PR1001 | Payment of annual fee | Payment date: 20191010; Start annual number: 5; End annual number: 5 |
| | PR1001 | Payment of annual fee | Payment date: 20201015; Start annual number: 6; End annual number: 6 |
| | PR1001 | Payment of annual fee | Payment date: 20211014; Start annual number: 7; End annual number: 7 |
| | PR1001 | Payment of annual fee | Payment date: 20221012; Start annual number: 8; End annual number: 8 |
| | PR1001 | Payment of annual fee | Payment date: 20231012; Start annual number: 9; End annual number: 9 |
| | PR1001 | Payment of annual fee | Payment date: 20241016; Start annual number: 10; End annual number: 10 |