JP2006072985A5 - Google Patents

Info

Publication number
JP2006072985A5
Authority
JP
Japan
Prior art keywords
tuple
tuples
neighborhood
data set
partition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2005221802A
Other languages
English (en)
Japanese (ja)
Other versions
JP4814570B2 (ja)
JP2006072985A (ja)
Filing date
Publication date
Priority claimed from US10/929,514 (US7516149B2)
Application filed
Publication of JP2006072985A
Publication of JP2006072985A5
Application granted
Publication of JP4814570B2
Anticipated expiration
Expired - Fee Related (Current)

JP2005221802A 2004-08-30 2005-07-29 Robust detector of fuzzy duplicates Expired - Fee Related JP4814570B2 (ja)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/929,514 US7516149B2 (en) 2004-08-30 2004-08-30 Robust detector of fuzzy duplicates
US10/929,514 2004-08-30

Publications (3)

Publication Number Publication Date
JP2006072985A JP2006072985A (ja) 2006-03-16
JP2006072985A5 JP2006072985A5 2008-09-11
JP4814570B2 JP4814570B2 (ja) 2011-11-16

Family

ID=35219700

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2005221802A Expired - Fee Related JP4814570B2 (ja) 2004-08-30 2005-07-29 Robust detector of fuzzy duplicates

Country Status (7)

Country Link
US (1) US7516149B2
EP (1) EP1630698B1
JP (1) JP4814570B2
KR (1) KR101153113B1
CN (1) CN100520776C
AT (1) ATE420406T1
DE (1) DE602005012192D1

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8732004B1 (en) 2004-09-22 2014-05-20 Experian Information Solutions, Inc. Automated analysis of data to generate prospect notifications based on trigger events
US20070016501A1 (en) 2004-10-29 2007-01-18 American Express Travel Related Services Co., Inc., A New York Corporation Using commercial share of wallet to rate business prospects
US8326672B2 (en) 2004-10-29 2012-12-04 American Express Travel Related Services Company, Inc. Using commercial share of wallet in financial databases
US7822665B2 (en) 2004-10-29 2010-10-26 American Express Travel Related Services Company, Inc. Using commercial share of wallet in private equity investments
US8630929B2 (en) 2004-10-29 2014-01-14 American Express Travel Related Services Company, Inc. Using commercial share of wallet to make lending decisions
US7792732B2 (en) 2004-10-29 2010-09-07 American Express Travel Related Services Company, Inc. Using commercial share of wallet to rate investments
US8204774B2 (en) * 2004-10-29 2012-06-19 American Express Travel Related Services Company, Inc. Estimating the spend capacity of consumer households
US8086509B2 (en) 2004-10-29 2011-12-27 American Express Travel Related Services Company, Inc. Determining commercial share of wallet
US7912770B2 (en) * 2004-10-29 2011-03-22 American Express Travel Related Services Company, Inc. Method and apparatus for consumer interaction based on spend capacity
US7788147B2 (en) 2004-10-29 2010-08-31 American Express Travel Related Services Company, Inc. Method and apparatus for estimating the spend capacity of consumers
US8543499B2 (en) 2004-10-29 2013-09-24 American Express Travel Related Services Company, Inc. Reducing risks related to check verification
US8131614B2 (en) 2004-10-29 2012-03-06 American Express Travel Related Services Company, Inc. Using commercial share of wallet to compile marketing company lists
US20070244732A1 (en) 2004-10-29 2007-10-18 American Express Travel Related Services Co., Inc., A New York Corporation Using commercial share of wallet to manage vendors
US7840484B2 (en) 2004-10-29 2010-11-23 American Express Travel Related Services Company, Inc. Credit score and scorecard development
US8326671B2 (en) 2004-10-29 2012-12-04 American Express Travel Related Services Company, Inc. Using commercial share of wallet to analyze vendors in online marketplaces
US20080243680A1 (en) * 2005-10-24 2008-10-02 Megdal Myles G Method and apparatus for rating asset-backed securities
US20080033852A1 (en) * 2005-10-24 2008-02-07 Megdal Myles G Computer-based modeling of spending behaviors of entities
US8036979B1 (en) 2006-10-05 2011-10-11 Experian Information Solutions, Inc. System and method for generating a finance attribute from tradeline data
US8239250B2 (en) 2006-12-01 2012-08-07 American Express Travel Related Services Company, Inc. Industry size of wallet
US8606666B1 (en) 2007-01-31 2013-12-10 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US8606626B1 (en) 2007-01-31 2013-12-10 Experian Information Solutions, Inc. Systems and methods for providing a direct marketing campaign planning environment
US7827153B2 (en) * 2007-12-19 2010-11-02 Sap Ag System and method to perform bulk operation database cleanup
US20110004578A1 (en) * 2008-02-22 2011-01-06 Michinari Momma Active metric learning device, active metric learning method, and program
US20100161542A1 (en) * 2008-12-22 2010-06-24 International Business Machines Corporation Detecting entity relevance due to a multiplicity of distinct values for an attribute type
US9910875B2 (en) 2008-12-22 2018-03-06 International Business Machines Corporation Best-value determination rules for an entity resolution system
US8200640B2 (en) 2009-06-15 2012-06-12 Microsoft Corporation Declarative framework for deduplication
US8176407B2 (en) * 2010-03-02 2012-05-08 Microsoft Corporation Comparing values of a bounded domain
US9652802B1 (en) 2010-03-24 2017-05-16 Consumerinfo.Com, Inc. Indirect monitoring and reporting of a user's credit data
US9361008B2 (en) * 2010-05-12 2016-06-07 Moog Inc. Result-oriented configuration of performance parameters
US8473410B1 (en) 2012-02-23 2013-06-25 American Express Travel Related Services Company, Inc. Systems and methods for identifying financial relationships
US8781954B2 (en) 2012-02-23 2014-07-15 American Express Travel Related Services Company, Inc. Systems and methods for identifying financial relationships
US9477988B2 (en) 2012-02-23 2016-10-25 American Express Travel Related Services Company, Inc. Systems and methods for identifying financial relationships
US8538869B1 (en) 2012-02-23 2013-09-17 American Express Travel Related Services Company, Inc. Systems and methods for identifying financial relationships
US9336302B1 (en) 2012-07-20 2016-05-10 Zuci Realty Llc Insight and algorithmic clustering for automated synthesis
CN104516900A (zh) * 2013-09-29 2015-04-15 International Business Machines Corporation Clustering method and apparatus for multiple sequence data
US9892158B2 (en) * 2014-01-31 2018-02-13 International Business Machines Corporation Dynamically adjust duplicate skipping method for increased performance
US10262362B1 (en) 2014-02-14 2019-04-16 Experian Information Solutions, Inc. Automatic generation of code for attributes
US10387389B2 (en) * 2014-09-30 2019-08-20 International Business Machines Corporation Data de-duplication
US10445152B1 (en) 2014-12-19 2019-10-15 Experian Information Solutions, Inc. Systems and methods for dynamic report generation based on automatic modeling of complex data structures
US11205103B2 (en) 2016-12-09 2021-12-21 The Research Foundation for the State University Semisupervised autoencoder for sentiment analysis
US11055327B2 (en) 2018-07-01 2021-07-06 Quadient Technologies France Unstructured data parsing for structured information
US11301440B2 (en) 2020-06-18 2022-04-12 Lexisnexis Risk Solutions, Inc. Fuzzy search using field-level deletion neighborhoods

Family Cites Families (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5924090A (en) 1997-05-01 1999-07-13 Northern Light Technology Llc Method and apparatus for searching a database of records
US5940821A (en) * 1997-05-21 1999-08-17 Oracle Corporation Information presentation in a knowledge base search and retrieval system
US5950186A (en) 1997-08-15 1999-09-07 Microsoft Corporation Database system index selection using cost evaluation of a workload for multiple candidate index configurations
US5913207A (en) 1997-08-15 1999-06-15 Microsoft Corporation Database system index selection using index configuration enumeration for a workload
US5960423A (en) 1997-08-15 1999-09-28 Microsoft Corporation Database system index selection using candidate index selection for a workload
US5926813A (en) 1997-08-15 1999-07-20 Microsoft Corporation Database system index selection using cost evaluation of a workload for multiple candidate index configurations
US5913206A (en) 1997-08-15 1999-06-15 Microsoft Corporation Database system multi-column index selection for a workload
US5966702A (en) 1997-10-31 1999-10-12 Sun Microsystems, Inc. Method and apparatus for pre-processing and packaging class files
US6182066B1 (en) * 1997-11-26 2001-01-30 International Business Machines Corp. Category processing of query topics and electronic document content topics
US6169983B1 (en) 1998-05-30 2001-01-02 Microsoft Corporation Index merging for database systems
US6223171B1 (en) 1998-08-25 2001-04-24 Microsoft Corporation What-if index analysis utility for database systems
US6460045B1 (en) 1999-03-15 2002-10-01 Microsoft Corporation Self-tuning histogram and database modeling
US6374241B1 (en) 1999-03-31 2002-04-16 Verizon Laboratories Inc. Data merging techniques
US6363371B1 (en) 1999-06-29 2002-03-26 Microsoft Corporation Identifying essential statistics for query optimization for databases
US6529901B1 (en) 1999-06-29 2003-03-04 Microsoft Corporation Automating statistics management for query optimizers
US6691108B2 (en) 1999-12-14 2004-02-10 Nec Corporation Focused search engine and method
US6266658B1 (en) 2000-04-20 2001-07-24 Microsoft Corporation Index tuner for given workload
US6356890B1 (en) 2000-04-20 2002-03-12 Microsoft Corporation Merging materialized view pairs for database workload materialized view selection
US6356891B1 (en) 2000-04-20 2002-03-12 Microsoft Corporation Identifying indexes on materialized views for database workload
US6513029B1 (en) 2000-04-20 2003-01-28 Microsoft Corporation Interesting table-subset selection for database workload materialized view selection
US6366903B1 (en) 2000-04-20 2002-04-02 Microsoft Corporation Index and materialized view selection for a given workload
US7007008B2 (en) 2000-08-08 2006-02-28 America Online, Inc. Category searching
GB0029159D0 (en) * 2000-11-29 2001-01-17 Calaba Ltd Data storage and retrieval system
US20020124214A1 (en) 2001-03-01 2002-09-05 International Business Machines Corporation Method and system for eliminating duplicate reported errors in a logically partitioned multiprocessing system
US20040128282A1 (en) 2001-03-07 2004-07-01 Paul Kleinberger System and method for computer searching
AU2002309152A1 (en) 2001-03-25 2002-10-08 Exiqon A/S Systems for analysis of biological materials
US6912549B2 (en) * 2001-09-05 2005-06-28 Siemens Medical Solutions Health Services Corporation System for processing and consolidating records
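JP3812818B2 (ja) * 2001-12-05 2006-08-23 Nippon Telegraph and Telephone Corporation Database generation device, database generation method, and database generation processing program
JP3803961B2 (ja) * 2001-12-05 2006-08-02 Nippon Telegraph and Telephone Corporation Database generation device, database generation processing method, and database generation program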
US7523127B2 (en) * 2002-01-14 2009-04-21 Testout Corporation System and method for a hierarchical database management system for educational training and competency testing simulations
US7139749B2 (en) 2002-03-19 2006-11-21 International Business Machines Corporation Method, system, and program for performance tuning a database query
US7152060B2 (en) * 2002-04-11 2006-12-19 Choicemaker Technologies, Inc. Automated database blocking and record matching
US6961721B2 (en) * 2002-06-28 2005-11-01 Microsoft Corporation Detecting duplicate records in database
US7953694B2 (en) * 2003-01-13 2011-05-31 International Business Machines Corporation Method, system, and program for specifying multidimensional calculations for a relational OLAP engine
US20050027717A1 (en) * 2003-04-21 2005-02-03 Nikolaos Koudas Text joins for data cleansing and integration in a relational database management system
US7774312B2 (en) 2003-09-04 2010-08-10 Oracle International Corporation Self-managing performance statistics repository for databases
US20050125401A1 (en) * 2003-12-05 2005-06-09 Hewlett-Packard Development Company, L. P. Wizard for usage in real-time aggregation and scoring in an information handling system
WO2005057364A2 (en) 2003-12-08 2005-06-23 Ebay Inc. Custom caching
US7281004B2 (en) 2004-02-27 2007-10-09 International Business Machines Corporation Method, system and program for optimizing compression of a workload processed by a database management system

Similar Documents

Publication Publication Date Title
JP2006072985A5
US10055458B2 (en) Data placement control for distributed computing environment
Hagedorn et al. The STARK framework for spatio-temporal data analytics on spark
Das et al. Shared-memory parallel maximal clique enumeration
CN102541992B (zh) Homomorphism theorem for efficiently querying databases
JP2019530068A5
Yu et al. A demonstration of GeoSpark: A cluster computing framework for processing big spatial data
US20130117227A1 (en) Cache based key-value store mapping and replication
Lan et al. High performance implementation of 3D convolutional neural networks on a GPU
Liagouris et al. An effective encoding scheme for spatial RDF data
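WO2019019574A1 (zh) Novel OLAP precomputation model and construction method
CN110059264A (zh) Knowledge-graph-based location retrieval method, device, and computer storage medium
CN110321446A (zh) Related data recommendation method, apparatus, computer device, and storage medium
CN104346444B (zh) Optimal site selection method based on reverse spatial keyword queries over a road network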
Alam et al. A performance study of big spatial data systems
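CN108710640A (zh) Method for improving the query efficiency of Spark SQL
CN102637227B (zh) Shortest-path-based method for partitioning the influence scope of land resource evaluation factors
CN103645948A (zh) Parallel computing method oriented to data-intensive workloads and dependencies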
Balaji et al. Distributed graph path queries using spark
Graux et al. SPARQLGX in action: Efficient distributed evaluation of SPARQL with Apache Spark
Abdullahi et al. Big data: performance profiling of meteorological and oceanographic data on hive
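CN104679889A (zh) Data storage method and apparatus for big data processing
CN117725811A (zh) Reinforcement-learning-based method and apparatus for tuning turbine blade cooling-effectiveness design
CN116663244A (zh) Method for storing post-processing data of ship simulation tests and query method therefor
CN103995869A (zh) Data caching method based on the Apriori algorithm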