CN112514349B - Detect duplicates with exact and fuzzy matching on encrypted match indexes - Google Patents

Detect duplicates with exact and fuzzy matching on encrypted match indexes

Info

Publication number
CN112514349B
Authority
CN
China
Prior art keywords
encrypted
matching
index
match
fields
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201980051331.9A
Other languages
English (en)
Chinese (zh)
Other versions
CN112514349A (zh)
Inventor
A·赫尚斯
S·谢尔
C·克尔
P·V·瓦伊什纳芙
A·本-古尔
V·W·刘
D·麦加里
S·萨尼科穆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shuo Power Co
Original Assignee
Salesforce com Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-05-31
Filing date
2019-05-30
Publication date
2023-02-03
Application filed by Salesforce com Inc
Publication of CN112514349A
Application granted
Publication of CN112514349B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2237 Vectors, bitmaps or matrices
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2468 Fuzzy queries
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Automation & Control Theory (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • Storage Device Security (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
CN201980051331.9A 2018-05-31 2019-05-30 Detect duplicates with exact and fuzzy matching on encrypted match indexes Active CN112514349B (zh)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201862678812P 2018-05-31 2018-05-31
US62/678,812 2018-05-31
US16/026,819 US10942906B2 (en) 2018-05-31 2018-07-03 Detect duplicates with exact and fuzzy matching on encrypted match indexes
US16/026,819 2018-07-03
PCT/US2019/034585 WO2019232167A1 (en) 2018-05-31 2019-05-30 Detect duplicates with exact and fuzzy matching on encrypted match indexes

Publications (2)

Publication Number Publication Date
CN112514349A (zh) 2021-03-16
CN112514349B (zh) 2023-02-03

Family

ID=68694792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980051331.9A Active CN112514349B (zh) 2018-05-31 2019-05-30 Detect duplicates with exact and fuzzy matching on encrypted match indexes

Country Status (5)

Country Link
US (2) US10942906B2
EP (1) EP3804269B1
JP (1) JP7399889B2
CN (1) CN112514349B
WO (1) WO2019232167A1

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12518247B2 (en) * 2022-05-13 2026-01-06 Hubspot, Inc. System and method of translating a tracking module to a unique identifier
US10671604B2 (en) 2018-01-30 2020-06-02 Salesforce.Com, Inc. Using row value constructor (RVC) based queries to group records of a database for multi-thread execution
JP6841805B2 (ja) * 2018-10-03 2021-03-10 Fanuc Corporation Robot teaching device, robot teaching method, and method for storing operation commands
US11068448B2 (en) 2019-01-07 2021-07-20 Salesforce.Com, Inc. Archiving objects in a database environment
US20220207007A1 (en) * 2020-12-30 2022-06-30 Vision Insight Ai Llp Artificially intelligent master data management
US11677810B2 (en) * 2021-07-23 2023-06-13 International Business Machines Corporation Configuration tool for deploying an application on a server
CN113609147B (zh) * 2021-08-11 2024-09-03 北京自如信息科技有限公司 Data sharing method, apparatus, and electronic device
WO2023036399A1 (en) * 2021-09-07 2023-03-16 Mendix Technology B.V. Managing an app, especially developing an app comprising an event artifact, method and system
CN114693405B (zh) * 2022-04-12 2025-06-27 广州华多网络科技有限公司 E-commerce information security detection method and apparatus, device, medium, and product thereof
US12260434B2 (en) 2022-09-20 2025-03-25 Salesforce, Inc. System and method for a scalable pricing engine
US12136114B2 (en) 2022-09-20 2024-11-05 Salesforce, Inc. System and method for asynchronous pricing to improve pricing service scalability
US12235849B2 (en) 2022-11-23 2025-02-25 Salesforce, Inc. Multi-context stateful rule execution
US12106131B2 (en) 2022-11-23 2024-10-01 Salesforce, Inc. Metadata driven guided rules editor
US12572826B2 (en) 2022-11-23 2026-03-10 Salesforce, Inc. Asynchronous rule compilation in a multi-tenant environment

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6052693A (en) * 1996-07-02 2000-04-18 Harlequin Group Plc System for assembling large databases through information extracted from text sources
US5819291A (en) * 1996-08-23 1998-10-06 General Electric Company Matching new customer records to existing customer records in a large business database using hash key
US7657569B1 (en) * 2006-11-28 2010-02-02 Lower My Bills, Inc. System and method of removing duplicate leads
US8078651B2 (en) * 2008-01-24 2011-12-13 Oracle International Corporation Match rules to identify duplicate records in inbound data
CN103975300A (zh) 2011-12-08 2014-08-06 Empire Technology Development LLC Storage discounts for allowing cross-user deduplication
US8943059B2 (en) * 2011-12-21 2015-01-27 Sap Se Systems and methods for merging source records in accordance with survivorship rules
WO2014028524A1 (en) * 2012-08-15 2014-02-20 Visa International Service Association Searchable encrypted data
US9495552B2 (en) 2012-12-31 2016-11-15 Microsoft Technology Licensing, Llc Integrated data deduplication and encryption
JP6171649B2 (ja) 2013-07-16 2017-08-02 NEC Corporation Encryption device, decryption device, encryption method, and encryption program
EP3238368B1 (en) 2014-12-23 2020-09-02 Nokia Technologies Oy Method and apparatus for duplicated data management in cloud computing
WO2016130807A1 (en) 2015-02-11 2016-08-18 Visa International Service Association Increasing search ability of private, encrypted data
JP2016202341A (ja) 2015-04-16 2016-12-08 Canon Marketing Japan Inc. Medical image processing apparatus, control method for medical image processing apparatus, and program
US9894042B2 (en) * 2015-07-24 2018-02-13 Skyhigh Networks, Inc. Searchable encryption enabling encrypted search based on document type
US10430423B1 (en) * 2015-07-31 2019-10-01 Priceline.Com Llc Attribute translation and matching from a plurality of data records to create consolidated data records
US20170177899A1 (en) 2015-12-17 2017-06-22 Agency For Science, Technology And Research Encrypted data deduplication in cloud storage
US10222987B2 (en) * 2016-02-11 2019-03-05 Dell Products L.P. Data deduplication with augmented cuckoo filters
US10558669B2 (en) 2016-07-22 2020-02-11 National Student Clearinghouse Record matching system
US10223433B2 (en) * 2017-01-25 2019-03-05 International Business Machines Corporation Data mapper

Also Published As

Publication number Publication date
EP3804269B1 (en) 2024-01-24
US11748320B2 (en) 2023-09-05
CN112514349A (zh) 2021-03-16
WO2019232167A1 (en) 2019-12-05
EP3804269A1 (en) 2021-04-14
US20210182255A1 (en) 2021-06-17
US20190370363A1 (en) 2019-12-05
JP7399889B2 (ja) 2023-12-18
US10942906B2 (en) 2021-03-09
JP2021525915A (ja) 2021-09-27

Similar Documents

Publication Publication Date Title
CN112514349B (zh) Detect duplicates with exact and fuzzy matching on encrypted match indexes
CN109670049B (zh) Graph path query method, apparatus, computer device, and storage medium
US10114955B2 (en) Increasing search ability of private, encrypted data
US11354285B2 (en) Bulk duplication detection supporting data encryption
US10002171B1 (en) Flexible database schema
US12277105B2 (en) Methods and systems for improved search for data loss prevention
US9720946B2 (en) Efficient storage of related sparse data in a search index
CN118673522A (zh) Method, electronic device, and computer program product for data anonymization
US9734229B1 (en) Systems and methods for mining data in a data warehouse
US20230100289A1 (en) Searchable data processing operation documentation associated with data processing of raw data
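CN111221690B (zh) Model determination method, apparatus, and terminal for integrated circuit design
CN110020040A (zh) Method, apparatus, and system for querying data
CN113934729A (zh) Knowledge-graph-based data management method, related device, and medium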
US9286349B2 (en) Dynamic search system
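CN116186337B (zh) Business scenario data processing method, system, and electronic device
CN116827630A (zh) Searchable encryption method, apparatus, device, and storage medium for card service information
CN118277444A (zh) Abnormal behavior discovery method, apparatus, storage medium, and electronic device
CN101447886B (zh) Method and apparatus for comparing massive data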
US20250329074A1 (en) Uncovering patterns in text through clustering
CN117910055A (zh) Encrypted transmission method and apparatus for chip data, chip, and storage medium
CN117786731A (zh) Data search method, apparatus, and storage medium
Basheer Adaptive Tag Based Document Explorer and Duplicate Detection Engine for Cloud
Mahajan et al. Optimization of Association Rule in Horizontally Distributed Database using Unique Key Value

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: California, USA

Patentee after: Shuo Power Co.

Address before: California, USA

Patentee before: SALESFORCE.COM, Inc.
