CA2353095A1 - System and method for finding near matches among records in databases - Google Patents

System and method for finding near matches among records in databases

Info

Publication number
CA2353095A1
CA2353095A1 CA002353095A CA2353095A CA2353095A1 CA 2353095 A1 CA2353095 A1 CA 2353095A1 CA 002353095 A CA002353095 A CA 002353095A CA 2353095 A CA2353095 A CA 2353095A CA 2353095 A1 CA2353095 A1 CA 2353095A1
Authority
CA
Canada
Prior art keywords
record
records
data store
identifiers
creating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002353095A
Other languages
English (en)
Inventor
David Whipple
Joseph Carsanaro
Ken Young
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bloodhound Software Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24554 Unary operations; Data partitioning operations
    • G06F16/24556 Aggregation; Duplicate elimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a system and method for finding near matches among records in databases (104, 100, 116, 120, 112, 110) and in the data stores of computer systems. In particular, the system identifies near matches between the records of a data store and a selected record that has an associated coordinate set. A processor (96) creates identifiers associated with each record of the data store, maps each record into a discriminating space associated with those identifiers, and then retrieves every record of the data store whose associated coordinate set lies within a predetermined distance of the selected record's coordinate set in that discriminating space.
CA002353095A 1998-12-07 1999-12-06 System and method for finding near matches among records in databases Abandoned CA2353095A1 (fr)
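
The three steps named in the abstract (create identifiers, map records into a discriminating space, retrieve records within a predetermined distance) can be illustrated with a minimal Python sketch. The abstract does not say how identifiers or coordinates are derived, so everything specific here is an assumption made for illustration: the letter-count mapping in `coordinates`, the Euclidean distance, and the names `Record`, `build_index` and `near_matches` are hypothetical and are not taken from the patent.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Record:
    key: str   # hypothetical record identifier
    name: str  # the field compared for near matches


def coordinates(text: str) -> tuple[float, ...]:
    """Assumed mapping: one coordinate per letter a..z, holding its count."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return tuple(counts)


def build_index(store: list[Record]) -> dict[str, tuple[float, ...]]:
    """Associate each record's identifier with its coordinate set."""
    return {record.key: coordinates(record.name) for record in store}


def near_matches(
    index: dict[str, tuple[float, ...]],
    target: tuple[float, ...],
    max_distance: float,
) -> list[str]:
    """Return identifiers whose coordinates lie within max_distance of target."""
    return [key for key, point in index.items() if dist(point, target) <= max_distance]


# Usage: look up records that are close to a misspelled name.
store = [
    Record("r1", "Jon Smith"),
    Record("r2", "John Smith"),
    Record("r3", "Jane Doe"),
]
index = build_index(store)
print(near_matches(index, coordinates("John Smyth"), max_distance=2.0))
```

This sketch scans every indexed record linearly, which keeps it short; a real system would presumably need a spatial index to avoid that scan, but the abstract gives no detail on this point.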

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US11121298P 1998-12-07 1998-12-07
US60/111,212 1998-12-07
PCT/US1999/028870 WO2000034897A1 (fr) 1998-12-07 1999-12-06 System and method for finding near matches among records in databases

Publications (1)

Publication Number Publication Date
CA2353095A1 (fr) 2000-06-15

Family

ID=22337203

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002353095A Abandoned CA2353095A1 (fr) 1998-12-07 1999-12-06 System and method for finding near matches among records in databases

Country Status (4)

Country Link
EP (1) EP1138007A1 (fr)
AU (2) AU2166700A (fr)
CA (1) CA2353095A1 (fr)
WO (1) WO2000034897A1 (fr)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8423374B2 (en) * 2002-06-27 2013-04-16 Siebel Systems, Inc. Method and system for processing intelligence information
GB0220576D0 (en) * 2002-09-04 2002-10-09 Neural Technologies Ltd Data proximity detector
US8126739B2 (en) 2006-04-28 2012-02-28 MDI Technologies, Inc Method and system for tracking treatment of patients in a health services environment
US8126738B2 (en) 2006-04-28 2012-02-28 Mdi Technologies, Inc. Method and system for scheduling tracking, adjudicating appointments and claims in a health services environment
US9262475B2 (en) 2012-06-12 2016-02-16 Melissa Data Corp. Systems and methods for matching records using geographic proximity
US9563677B2 (en) * 2012-12-11 2017-02-07 Melissa Data Corp. Systems and methods for clustered matching of records using geographic proximity
CN113595805B (zh) * 2021-08-23 2024-01-30 海南房小云科技有限公司 Method for sharing personal computer data within a local area network
WO2023063971A1 (fr) * 2021-10-13 2023-04-20 Equifax Inc. Fragmented record detection based on record matching techniques

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5649183A (en) * 1992-12-08 1997-07-15 Microsoft Corporation Method for compressing full text indexes with document identifiers and location offsets
US5465353A (en) * 1994-04-01 1995-11-07 Ricoh Company, Ltd. Image matching and retrieval by multi-access redundant hashing
US6029167A (en) * 1997-07-25 2000-02-22 Claritech Corporation Method and apparatus for retrieving text using document signatures
US6026398A (en) * 1997-10-16 2000-02-15 Imarket, Incorporated System and methods for searching and matching databases

Also Published As

Publication number Publication date
WO2000034897A9 (fr) 2001-06-07
WO2000034897A1 (fr) 2000-06-15
AU6436599A (en) 2000-06-08
AU2166700A (en) 2000-06-26
EP1138007A1 (fr) 2001-10-04

Similar Documents

Publication Publication Date Title
EP3745276A1 (fr) Discovering a semantic meaning of data fields from profile data of the data fields
US6820079B1 (en) Method and apparatus for retrieving text using document signatures
Burrows et al. Efficient plagiarism detection for large code repositories
Doermann et al. The detection of duplicates in document image databases
US7173632B2 (en) Information display
US6678681B1 (en) Information extraction from a database
US5659731A (en) Method for rating a match for a given entity found in a list of entities
US6934634B1 (en) Address geocoding
Borges et al. Discovering geographic locations in web pages using urban addresses
US20030154181A1 (en) Document clustering with cluster refinement and model selection capabilities
US7565348B1 (en) Determining a document similarity metric
US7711719B1 (en) Massive multi-pattern searching
US9129010B2 (en) System and method of partitioned lexicographic search
KR100627195B1 (ko) Method and system for searching electronic documents generated by optical character recognition
US20130031083A1 (en) Determining keyword for a form page
EP1934829A2 (fr) Local search
US7240045B1 (en) Automatic system for configuring to dynamic database search forms
Keogh Efficiently finding arbitrarily scaled patterns in massive time series databases
US6691103B1 (en) Method for searching a database, search engine system for searching a database, and method of providing a key table for use by a search engine for a database
CA2353095A1 (fr) System and method for finding near matches among records in databases
CN107291951B (zh) Data processing method, device, storage medium and processor
WO2024064705A1 (fr) Techniques for discovering and updating semantic meaning of data fields
US8515987B1 (en) Database information consolidation
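CN111475464A (zh) Method for automatically discovering and mining Web component fingerprints
KR100490442B1 (ko) Apparatus and method for clustering identical/similar products using a vector document model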

Legal Events

Date Code Title Description
EEER Examination request
FZDE Dead