TWI835300B - A data matching method, device, equipment and medium - Google Patents


Info

Publication number
TWI835300B
Authority
TW
Taiwan
Prior art keywords
vector
data
encrypted
target
distance
Prior art date
Application number
TW111135467A
Other languages
Chinese (zh)
Other versions
TW202336617A (en)
Inventor
劉紅寶
高鵬飛
鄭建賓
余蕭寒
邱震堯
周雍愷
程棟
趙慶航
Original Assignee
大陸商中國銀聯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202210191650.3A (CN114817943A)
Application filed by 大陸商中國銀聯股份有限公司
Publication of TW202336617A
Application granted
Publication of TWI835300B

Abstract

The invention discloses a data matching method, device, equipment and medium. In the method, first data and second data are each input into a pre-trained vector conversion model to obtain a corresponding first vector and second vector. The distance between the first vector and the second vector, encrypted under a first target public key, is obtained; the target distance between the two vectors is determined from the encrypted distance and a first target private key; and whether the two pieces of data match is determined from the target distance. Fuzzy matching is therefore possible even when the two pieces of data are not exactly the same, which broadens the usage scenarios. Because the first target public-private key pair is used for homomorphic encryption and decryption during fuzzy matching, secure intersection is achieved and the matching process is protected. Throughout matching, no data leaves its device in original form, so fuzzy matching is achieved without the original data leaving its database.

Description

A data matching method, device, equipment and medium

The present invention belongs to the field of data processing technology, and in particular relates to a data matching method, device, equipment and medium.

Privacy-preserving computation is currently applied mainly to secure intersection and federated learning. Secure intersection means identifying the intersection of two parties' data, for example the users common to institution A and institution B. It is also the first step of vertical federated learning: secure intersection is first performed on key information such as mobile phone numbers, ID card numbers and business license numbers, and only then do subsequent steps such as joint modeling proceed.

In the related art, common secure intersection algorithms for identifying the intersection of two parties' data, or for matching that data, include algorithms based on RSA encryption and algorithms based on the Oblivious Transfer (OT) protocol. However, current secure intersection algorithms succeed only when the two parties' data are exactly the same, that is, when the data types and the number of characters in the data are identical. In practice, many scenarios require matching data that are not exactly the same, so existing secure intersection algorithms severely limit the usage scenarios and narrow the range of business that matching can serve.

The present invention provides a data matching method, device, equipment and medium to solve the problem that secure intersection algorithms in the prior art work only when the two parties' data are exactly the same, which limits the usage scenarios and the business scope of data matching.

The present invention provides a data matching method applied to a first device. The method includes:
inputting first data to be matched into a pre-trained vector conversion model to obtain a first vector corresponding to the first data;
homomorphically encrypting the first vector with a self-generated first target public key to generate a first encrypted vector, and sending the first target public key to a second device;
obtaining an encrypted distance between the first vector and a second vector, determined based on the first encrypted vector and a second encrypted vector, where the second encrypted vector is obtained by homomorphically encrypting the second vector with the first target public key, and the second vector is obtained by inputting second data into a pre-trained vector conversion model in the second device;
determining a target distance between the first vector and the second vector based on the encrypted distance and a first target private key corresponding to the first target public key, and determining whether the first data and the second data match based on the target distance and a preset first distance threshold.

The present invention provides a data matching method applied to a second device. The method includes:
inputting second data to be matched into a pre-trained vector conversion model to obtain a second vector corresponding to the second data;
receiving a first target public key sent by a first device, and homomorphically encrypting the second vector with the first target public key to generate a second encrypted vector;
obtaining a target distance between a first vector and the second vector, determined based on a first encrypted vector and the second encrypted vector, where the first encrypted vector is obtained by encrypting the first vector with the first target public key, and the first vector is obtained by inputting first data into a pre-trained vector conversion model in the first device;
determining whether the first data and the second data match based on the target distance and a preset first distance threshold.

The present invention also provides a data matching device, which includes:
a first acquisition module, configured to input first data to be matched into a pre-trained vector conversion model and obtain a first vector corresponding to the first data;
a first processing module, configured to homomorphically encrypt the first vector with a self-generated first target public key to generate a first encrypted vector, and to send the first target public key to a second device;
the first acquisition module being further configured to obtain an encrypted distance between the first vector and a second vector, determined based on the first encrypted vector and a second encrypted vector, where the second encrypted vector is obtained by homomorphically encrypting the second vector with the first target public key, and the second vector is obtained by inputting second data into a pre-trained vector conversion model in the second device;
a first determination module, configured to determine a target distance between the first vector and the second vector based on the encrypted distance and a first target private key corresponding to the first target public key, and to determine whether the first data and the second data match based on the target distance and a preset first distance threshold.

The present invention also provides a data matching device, which includes:
a second acquisition module, configured to input second data to be matched into a pre-trained vector conversion model and obtain a second vector corresponding to the second data;
a second processing module, configured to receive a first target public key sent by a first device, and to homomorphically encrypt the second vector with the first target public key to generate a second encrypted vector;
the second acquisition module being further configured to obtain a target distance between a first vector and the second vector, determined based on a first encrypted vector and the second encrypted vector, where the first encrypted vector is obtained by encrypting the first vector with the first target public key, and the first vector is obtained by inputting first data into a pre-trained vector conversion model in the first device;
a second determination module, configured to determine whether the first data and the second data match based on the target distance and a preset first distance threshold.

The present invention also provides an electronic device including a processor, where the processor implements the steps of any of the data matching methods above when executing a computer program stored in a memory.

The present invention also provides a computer-readable storage medium storing a computer program executable by a terminal; when the program runs on the terminal, it causes the terminal to perform the steps of any of the data matching methods above.


In the present invention, the first data to be matched is input into the pre-trained vector conversion model to obtain the first vector corresponding to the first data. The first vector is homomorphically encrypted with the self-generated first target public key to generate the first encrypted vector, and the first target public key is sent to the second device. The encrypted distance between the first vector and the second vector, determined based on the first encrypted vector and the second encrypted vector, is then obtained, where the second encrypted vector is obtained by homomorphically encrypting the second vector with the first target public key, and the second vector is obtained by inputting the second data into the pre-trained vector conversion model in the second device. The target distance between the first vector and the second vector is determined from the encrypted distance and the first target private key corresponding to the first target public key, and whether the first data and the second data match is determined from the target distance and the preset first distance threshold.
In the embodiments of the present invention, the first data and the second data to be matched are each input into a pre-trained vector conversion model to obtain the first vector and the second vector. The encrypted distance between the two vectors is determined from the first encrypted vector and the second encrypted vector, the target distance is determined from that encrypted distance and the self-generated first target private key, and whether the first data and the second data match is determined from the target distance and the preset first distance threshold. Fuzzy matching is therefore possible even when the first data and the second data are not exactly the same, which broadens the usage scenarios. Because the first target public key and the first target private key are used for homomorphic encryption and decryption respectively, secure intersection is achieved and the matching process is protected. Throughout matching, neither the first data nor the second data leaves the first device or the second device in original form, so fuzzy matching is achieved without the original data leaving its database, which further protects the matching process.

To make the purpose, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without inventive effort fall within the scope of protection of the present invention.

To allow matching even when the two parties' data are not exactly the same, and thereby broaden the business scope of data matching, embodiments of the present invention provide a data matching method, device, equipment and medium.

In the present invention, the first data to be matched is input into the pre-trained vector conversion model to obtain the first vector corresponding to the first data. The first vector is homomorphically encrypted with the self-generated first target public key to generate the first encrypted vector, and the first target public key is sent to the second device. The encrypted distance between the first vector and the second vector, determined based on the first encrypted vector and the second encrypted vector, is then obtained, where the second encrypted vector is obtained by homomorphically encrypting the second vector with the first target public key, and the second vector is obtained by inputting the second data into the pre-trained vector conversion model in the second device. The target distance between the first vector and the second vector is determined from the encrypted distance and the first target private key corresponding to the first target public key, and whether the first data and the second data match is determined from the target distance and the preset first distance threshold.

Embodiment 1:
Figure 1 is a schematic diagram of a data matching process provided by an embodiment of the present invention. The process includes the following steps:
S101: Input the first data to be matched into the pre-trained vector conversion model to obtain the first vector corresponding to the first data.

The data matching method provided by this embodiment is applied to a first device, which may be a smart terminal, a PC, a server or a similar device.

To allow fuzzy matching even when the two parties' data are not exactly the same, a pre-trained vector conversion model is deployed on the first device in this embodiment. The model is used to obtain the vector corresponding to the data to be matched, and it outputs vectors of the same dimension for different data.

To obtain the first vector corresponding to the first data to be matched, the first data is input into the pre-trained vector conversion model, which outputs the first vector. Each component of the first vector is a number; in other words, the pre-trained vector conversion model quantifies the first data.

S102: Homomorphically encrypt the first vector with the self-generated first target public key to generate the first encrypted vector, and send the first target public key to the second device.

In this embodiment, to improve security, the first device generates a first target public-private key pair containing a first target public key and a first target private key, and encrypts the first vector with the self-generated first target public key to generate the first encrypted vector. The first target public-private key pair may be a symmetric key pair or an asymmetric key pair, and can be configured as required.

Generating the first target public-private key pair is prior art and is not described further here.

The second data to be matched against the first data is obtained by the second device. So that the target distance between the first vector (for the first data) and the second vector (for the second data) can be determined later, the first device in this embodiment also sends the first target public key to the second device, which uses it to homomorphically encrypt the second vector and generate the second encrypted vector. Specifically, homomorphic encryption is applied component by component: each component of the first vector and each component of the second vector is encrypted with the first target public key, yielding the first encrypted vector and the second encrypted vector.
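The component-wise encryption described above can be sketched as follows. The patent text does not name a homomorphic scheme, so this example assumes a textbook Paillier cryptosystem (additively homomorphic) with demo-sized keys; `SCALE`, `generate_keypair`, `encrypt_vector` and the other names are hypothetical, and fixed-point scaling of float components is also an assumption. It is an illustration only, not production cryptography.

```python
import math
import secrets

SCALE = 1000  # hypothetical fixed-point factor: a float x is encoded as round(x * SCALE)


def _is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # witness in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True


def _random_prime(bits):
    """Draw random odd candidates with the top bit set until one is prime."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if _is_probable_prime(candidate):
            return candidate


def generate_keypair(bits=512):
    """Return (n, (lam, mu)): public modulus n and the private key. Demo sizes only."""
    half = bits // 2
    p = _random_prime(half)
    q = _random_prime(half)
    while q == p:
        q = _random_prime(half)
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid for generator g = n + 1 (needs Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m):
    """Paillier encryption of integer m (negatives are encoded mod n) with g = n + 1."""
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1  # random blinding factor in [1, n - 1]
    return ((1 + (m % n) * n) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the signed integer plaintext from ciphertext c."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    m = ((u - 1) // n * mu) % n
    return m - n if m > n // 2 else m  # map the upper half of [0, n) to negatives


def add_encrypted(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


def encrypt_vector(n, vec):
    """Encrypt each component separately, as the embodiment describes."""
    return [encrypt(n, round(x * SCALE)) for x in vec]


def decrypt_vector(n, priv, enc_vec):
    return [decrypt(n, priv, c) / SCALE for c in enc_vec]
```

In this sketch the first device would call `generate_keypair`, keep the private key, and send only `n` to the second device, which can then call `encrypt_vector(n, second_vector)` without being able to decrypt anything.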

S103: Obtain the encrypted distance between the first vector and the second vector, determined based on the first encrypted vector and the second encrypted vector, where the second encrypted vector is obtained by homomorphically encrypting the second vector with the first target public key, and the second vector is obtained by inputting the second data into the pre-trained vector conversion model in the second device.

To achieve fuzzy matching between the first data and the second data, a pre-trained vector conversion model is also deployed on the second device to obtain the second vector corresponding to the second data to be matched. That is, the second data is input into the pre-trained vector conversion model, which outputs the second vector; the second device then homomorphically encrypts the second vector with the first target public key received from the first device to obtain the second encrypted vector.

Because the first vector and the second vector are both encrypted with the first target public key generated by the first device, one way to determine the target distance in this embodiment is for the first device to receive the second encrypted vector from the second device, decrypt it with the first target private key of its own first target public-private key pair to obtain the second vector, and then determine the target distance from the first vector and the second vector.

To improve security, this embodiment may instead first obtain the encrypted distance between the first vector and the second vector, determined based on the first encrypted vector and the second encrypted vector. This encrypted distance is not a definite numerical value but an expression that must be decrypted before the target distance between the first vector and the second vector is known. The encrypted distance may be determined by the first device, or determined by the second device and then sent to the first device.
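Why a distance can be evaluated on ciphertexts becomes clearer when the distance is expanded. The patent text does not fix a distance metric or a homomorphic scheme, so the sketch below rests on assumptions: squared Euclidean distance, an additively homomorphic scheme with ciphertext addition $\oplus$ and plaintext-scalar multiplication $\otimes$, and an evaluating party (here the second device) that also knows its own vector $y$ in plaintext.

```latex
\begin{align}
d^2(x, y) &= \sum_{i=1}^{n} (x_i - y_i)^2
           = \sum_{i=1}^{n} x_i^2 \;-\; 2\sum_{i=1}^{n} x_i y_i \;+\; \sum_{i=1}^{n} y_i^2 \\
\mathrm{Enc}\big(d^2(x, y)\big)
          &= \mathrm{Enc}\Big(\sum_{i=1}^{n} x_i^2\Big)
             \;\oplus\; \bigoplus_{i=1}^{n} \big((-2 y_i) \otimes \mathrm{Enc}(x_i)\big)
             \;\oplus\; \mathrm{Enc}\Big(\sum_{i=1}^{n} y_i^2\Big)
\end{align}
```

Under these assumptions the evaluating party needs only $\mathrm{Enc}(x_i)$ and $\mathrm{Enc}(\sum_i x_i^2)$ from the key owner, and the result stays encrypted until the holder of the first target private key decrypts it. Computing the cross term from two ciphertext vectors alone would instead require a scheme that supports ciphertext-by-ciphertext multiplication.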

S104: Determine the target distance between the first vector and the second vector based on the encrypted distance and the first target private key corresponding to the first target public key, and determine whether the first data and the second data match based on the target distance and the preset first distance threshold.

In this embodiment, to determine whether the first data and the second data match, the encrypted distance between the first vector and the second vector is first decrypted to obtain the target distance. Because the first encrypted vector and the second encrypted vector are both produced with the first target public key generated by the first device, the encrypted distance can be decrypted with the first target private key of the first target public-private key pair generated by the first device, which yields the target distance between the first vector and the second vector.

To determine whether the first data and the second data match, this embodiment compares the target distance between the first vector and the second vector with the preset first distance threshold and decides according to the comparison result. The smaller the target distance, the better the first data matches the second data.
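The threshold comparison can be sketched as below. The source states only that the target distance is compared with a preset first distance threshold and that smaller distances mean better matches; the Euclidean metric, the inclusive comparison (`<=`) and the function names are assumptions made for illustration.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Plaintext reference distance between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_match(target_distance: float, first_distance_threshold: float) -> bool:
    """Treat the two records as matching when the distance does not exceed the threshold."""
    return target_distance <= first_distance_threshold
```

A plaintext reference like this is also useful for checking that a decrypted target distance agrees with the distance computed directly on test vectors.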

In this embodiment, the first data and the second data to be matched are each input into a pre-trained vector conversion model to obtain the first vector and the second vector. The encrypted distance between the two vectors is determined from the first encrypted vector and the second encrypted vector, the target distance is determined from that encrypted distance and the self-generated first target private key, and whether the first data and the second data match is determined from the target distance and the preset first distance threshold. Fuzzy matching is therefore possible even when the first data and the second data are not exactly the same, which broadens the usage scenarios. Because the first target public key and the first target private key are used for homomorphic encryption and decryption respectively, secure intersection is achieved and the matching process is protected. Throughout matching, neither the first data nor the second data leaves the first device or the second device in original form, so fuzzy matching is achieved without the original data leaving its database, which further protects the matching process.

Embodiment 2:
To determine the first vector corresponding to the first data, building on the embodiment above, inputting the first data to be matched into the pre-trained vector conversion model to obtain the first vector includes:
determining a first target data type corresponding to the first data to be matched;
determining the pre-trained first target vector conversion model corresponding to the first data, according to the first target data type and a pre-saved correspondence between data types and pre-trained vector conversion models;
inputting the first data into the pre-trained first target vector conversion model to obtain the first vector corresponding to the first data.

In this embodiment, the first data to be matched may be text data, such as a name, gender or address, or numeric data, such as an ID card number, bank card number or admission ticket number. The pre-trained vector conversion model used to obtain the first vector therefore differs according to the data type of the first data.

Specifically, the first device may store a correspondence between data types and pre-trained vector transformation models. According to the first target data type of the obtained first data to be matched, the corresponding pre-trained vector transformation model is used to obtain the first vector corresponding to the first data. This corresponding model is the pre-trained first target vector transformation model.

To accurately determine the vector transformation model that converts the first data into the first vector, on the basis of the above embodiments, in this embodiment of the present invention, if the first target data type is the text type, the corresponding pre-trained first target vector transformation model is a word vector model or a sentence vector model; if the first target data type is the numeric type, the corresponding pre-trained first target vector transformation model is a one-hot encoding model.

Specifically, if the first data is text data, that is, the first target data type of the first data is the text type, the pre-trained first target vector transformation model corresponding to the first target data type is determined from the pre-saved correspondence between data types and pre-trained vector transformation models. This model is a word vector model or a sentence vector model, and the first vector corresponding to the first data is obtained from it. If the first data is numeric data, that is, the first target data type of the first data is the numeric type, the pre-trained first target vector transformation model is determined from the same correspondence. This model is a pre-trained one-hot encoding model, and the first vector corresponding to the first data is obtained from it.
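The lookup described above can be sketched as a small table keyed by data type. This is only an illustration: the function names, the type labels, and the rule "ASCII digits only means numeric" are assumptions, and the model names are placeholders rather than real trained models.

```python
# Hypothetical sketch: choose a vector transformation model by data type.
# The type-detection rule and all names here are illustrative assumptions.

# Pre-saved correspondence between data types and (names of) pre-trained models.
MODEL_FOR_DATA_TYPE = {
    "text": "word_or_sentence_vector_model",
    "numeric": "one_hot_encoding_model",
}


def detect_data_type(first_data: str) -> str:
    """Classify the data as 'numeric' (ASCII digits only) or 'text'."""
    if first_data.isascii() and first_data.isdigit():
        return "numeric"
    return "text"


def select_target_model(first_data: str) -> str:
    """Return the name of the first target vector transformation model."""
    first_target_data_type = detect_data_type(first_data)
    return MODEL_FOR_DATA_TYPE[first_target_data_type]


print(select_target_model("12345"))
print(select_target_model("上海市浦東新區晴天小賣部"))
```

In a real deployment the table values would be loaded model objects rather than strings; the lookup itself would stay the same.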

Take as an example a pre-trained vector transformation model that is a word vector model with an output dimension of 5. If the first data is text data and reads "上海市浦東新區晴天小賣部" (Sunny Day Shop, Pudong New District, Shanghai), this text is input into the pre-trained word vector model, and the output first vector corresponding to it is (1.0, 2.0, 1.5, 2.0, 3.5).

If the pre-trained vector transformation model is a one-hot encoding model, a one-hot code can be set in advance for each digit. For the digits 0 to 9, the one-hot codes are: 0 is 0000000001, 1 is 0000000010, 2 is 0000000100, 3 is 0000001000, 4 is 0000010000, 5 is 0000100000, 6 is 0001000000, 7 is 0010000000, 8 is 0100000000, and 9 is 1000000000. When numeric data is input into the one-hot encoding model, each first component of the output first vector is the one-hot code of the corresponding digit in the first data.

If the first data is numeric data and reads "12345", then "12345" is input into the pre-trained one-hot encoding model, and the output first vector corresponding to "12345" is (0000000010, 0000000100, 0000001000, 0000010000, 0000100000).
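The digit encoding above can be reproduced with a few lines of code. The sketch below follows the bit layout given in the text (digit d sets the bit d positions from the right); the function names are illustrative and not taken from the source.

```python
from typing import List


def one_hot_digit(digit: str) -> str:
    """Return the 10-character one-hot code of a single digit '0'..'9'."""
    if len(digit) != 1 or digit not in "0123456789":
        raise ValueError(f"not a single decimal digit: {digit!r}")
    bits = ["0"] * 10
    bits[9 - int(digit)] = "1"  # digit d sets the bit d places from the right
    return "".join(bits)


def one_hot_vector(numeric_data: str) -> List[str]:
    """Encode each digit of the numeric data as one component of the vector."""
    return [one_hot_digit(d) for d in numeric_data]


print(one_hot_vector("12345"))
```

Each component here is kept as a bit string to mirror the notation in the text; a numeric pipeline would more likely flatten the components into a single list of 0/1 integers.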

When the vector transformation model is trained, each piece of data can be labeled in advance with its corresponding label vector. Each piece of data and its label vector are input into the original vector transformation model, and the parameters of the original model are adjusted according to the predicted vector it outputs and the corresponding label vector. When the convergence condition is met, training of the vector transformation model is determined to be complete.

In this embodiment of the present invention, fuzzy matching can be achieved regardless of whether the first data and the second data are numeric data or text data, which further broadens the application scenarios.

Embodiment 3: To determine the encrypted distance between the first vector and the second vector, on the basis of the above embodiments, in this embodiment of the present invention, obtaining the encrypted distance between the first vector and the second vector determined from the first encrypted vector and the second encrypted vector includes: receiving the second encrypted vector sent by the second device, where the second encrypted vector is obtained by the second device homomorphically encrypting the second vector with the first target public key; and determining the encrypted distance between the first vector and the second vector from the first encrypted vector and the second encrypted vector.

In this embodiment of the present invention, the encrypted distance between the first vector and the second vector that the first device obtains, determined from the first encrypted vector and the second encrypted vector, may be determined by the first device itself, or may be determined by the second device and sent to the first device.

If the encrypted distance between the first vector and the second vector is determined by the first device, the first device needs to obtain the second encrypted vector from the second device. Specifically, after the first device sends the first target public key to the second device, the second device receives the key, encrypts the second vector with it to obtain the second encrypted vector, and sends the second encrypted vector to the first device. The second vector is the vector obtained by inputting the second data to be matched into the pre-trained vector transformation model in the second device.

The first device receives the second encrypted vector sent by the second device and, from the received second encrypted vector and the first encrypted vector that it determined itself, determines the encrypted distance between the first vector and the second vector locally on the first device.

If each component of the first vector and each component of the second vector is homomorphically encrypted with the first target public key to obtain the first encrypted vector and the second encrypted vector, then in one possible implementation the first device determines the encrypted distance between the first vector and the second vector from the first encrypted vector, the second encrypted vector, and the Euclidean distance formula. Specifically, the encrypted distance is determined according to $E(d) = \sqrt{\sum_{i=1}^{N} \left(E(x_i) - E(y_i)\right)^2}$, where $E(x_i)$ is the i-th component of the first encrypted vector, $E(y_i)$ is the i-th component of the second encrypted vector, $E(d)$ is the encrypted distance between the first vector and the second vector, and N is the number of components in the first encrypted vector or the second encrypted vector. The first encrypted vector and the second encrypted vector contain the same number of components, that is, the two encrypted vectors have equal length.

In another possible implementation, the first device may instead determine the encrypted distance between the first vector and the second vector from the first encrypted vector, the second encrypted vector, and the cosine distance formula or the Hamming distance formula.
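For reference, the three distance measures named above can be written out on plaintext vectors as follows. This is a sketch of the standard definitions only; it does not show how they are evaluated over ciphertexts, and the function names are illustrative.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Square root of the summed squared component differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """One minus the cosine similarity of the two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (norm_a * norm_b)


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(ca != cb for ca, cb in zip(a, b))


print(euclidean_distance([1.0, 2.0], [4.0, 6.0]))
print(hamming_distance("0000000010", "0000000100"))
```

Hamming distance is a natural fit for one-hot codes, where two different digits always differ in exactly two bit positions.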

Note that because each component of the first encrypted vector and the second encrypted vector has been homomorphically encrypted with the first target public key, the encrypted distance determined between the first vector and the second vector is not an actual numerical value but an expression that still has to be decrypted.

In this embodiment of the present invention, the second data never leaves the second device in original form, so fuzzy matching is achieved without the original data leaving its database, which further secures the matching process.

Embodiment 4: To determine the encrypted distance between the first vector and the second vector, on the basis of the above embodiments, in this embodiment of the present invention, sending the first target public key to the second device includes: sending the first encrypted vector and the first target public key to the second device. Obtaining the encrypted distance between the first vector and the second vector determined from the first encrypted vector and the second encrypted vector includes: receiving, from the second device, the encrypted distance between the first vector and the second vector determined from the first encrypted vector and the second encrypted vector, where the second encrypted vector is obtained by the second device homomorphically encrypting the second vector with the first target public key.

In this embodiment of the present invention, the encrypted distance between the first vector and the second vector that the first device obtains may also be determined by the second device and sent to the first device.

Specifically, so that the second device can obtain the encrypted distance between the first vector and the second vector, in this embodiment of the present invention the first device may send the first encrypted vector together with the generated first target public key to the second device. After receiving the first target public key and the first encrypted vector, the second device homomorphically encrypts the second vector with the first target public key to generate the second encrypted vector. The second device then determines the encrypted distance between the first vector and the second vector from the second encrypted vector and the received first encrypted vector, and sends this encrypted distance to the first device, which receives it.

If each component of the first vector and each component of the second vector is homomorphically encrypted with the first target public key to obtain the first encrypted vector and the second encrypted vector, then in one possible implementation the second device determines the encrypted distance between the first vector and the second vector from the first encrypted vector, the second encrypted vector, and the Euclidean distance formula. Specifically, the encrypted distance is determined according to $E(d) = \sqrt{\sum_{i=1}^{N} \left(E(x_i) - E(y_i)\right)^2}$, where $E(x_i)$ is the i-th component of the first encrypted vector, $E(y_i)$ is the i-th component of the second encrypted vector, $E(d)$ is the encrypted distance between the first vector and the second vector, and N is the number of components in the first encrypted vector or the second encrypted vector. The first encrypted vector and the second encrypted vector contain the same number of components, that is, the two encrypted vectors have equal length.

In another possible implementation, the second device may instead determine the encrypted distance between the first vector and the second vector from the first encrypted vector, the second encrypted vector, and the cosine distance formula or the Hamming distance formula.

Note that because each component of the first encrypted vector and the second encrypted vector has been homomorphically encrypted with the first target public key, the encrypted distance determined between the first vector and the second vector is not an actual numerical value but an expression that still has to be decrypted.

In this embodiment of the present invention, the first data never leaves the first device in original form, so fuzzy matching is achieved without the original data leaving its database, which further secures the matching process.
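The round trip in this embodiment can be illustrated with a toy additively homomorphic scheme. The source does not name a cryptosystem, so the sketch below swaps in textbook Paillier with tiny, insecure primes. Because Paillier supports only addition of ciphertexts and multiplication by a plaintext constant, the sketch makes two further assumptions that go beyond the text: the first device also sends the encryption of the sum of its squared components, and the second device combines ciphertexts with its own plaintext vector to obtain the encrypted squared Euclidean distance. All function and variable names are hypothetical.

```python
import math
import random


def keygen(p: int = 293, q: int = 433):
    """Toy Paillier keys from two small primes (insecure, demonstration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)   # public key n, private key (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt integer m under public key n."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m % n, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(n: int, sk, c: int) -> int:
    """Decrypt ciphertext c with private key sk = (lam, mu)."""
    lam, mu = sk
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: E(a) * E(b) decrypts to a + b."""
    return (c1 * c2) % (n * n)


def mul_const(n: int, c: int, k: int) -> int:
    """Homomorphic scaling: E(a) ** k decrypts to k * a (mod n)."""
    return pow(c, k % n, n * n)


def encrypted_sq_distance(n, enc_x, enc_x_sq_sum, y):
    """Second device: E(sum x^2 - 2*sum x*y + sum y^2), never seeing x."""
    acc = add(n, enc_x_sq_sum, encrypt(n, sum(yi * yi for yi in y)))
    for cx, yi in zip(enc_x, y):
        acc = add(n, acc, mul_const(n, cx, -2 * yi))
    return acc


# First device: generate keys, encrypt its integer-scaled vector, send it.
x = [10, 20, 15, 20, 35]
pk, sk = keygen()
enc_x = [encrypt(pk, xi) for xi in x]
enc_x_sq_sum = encrypt(pk, sum(xi * xi for xi in x))

# Second device: compute the encrypted squared distance and send it back.
y = [10, 21, 15, 18, 35]
enc_d2 = encrypted_sq_distance(pk, enc_x, enc_x_sq_sum, y)

# First device: decrypt and compare with a (squared) distance threshold.
d2 = decrypt(pk, sk, enc_d2)
SQ_DISTANCE_THRESHOLD = 9  # illustrative value
matched = d2 <= SQ_DISTANCE_THRESHOLD
print(d2, matched)
```

Working with the squared distance avoids a square root over ciphertexts; the threshold is squared to match. Real use would need large primes from a vetted library and vectors scaled to integers that keep the result below the modulus.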

Embodiment 5: So that the second device can also determine whether the first data and the second data match, on the basis of the above embodiments, the method further includes: receiving a third encrypted vector sent by the second device and a second target public key generated by the second device, where the third encrypted vector is obtained by the second device homomorphically encrypting the second vector with the second target public key; homomorphically encrypting the first vector with the second target public key to generate a fourth encrypted vector; and determining the encrypted distance between the second vector and the first vector from the third encrypted vector and the fourth encrypted vector, and sending this encrypted distance to the second device, so that the second device decrypts it with the second target private key corresponding to the second target public key to determine the target distance between the second vector and the first vector, and determines whether the first data and the second data match according to that target distance and the preset first distance threshold.

So that both the first device and the second device can determine whether the first data and the second data match, in this embodiment of the present invention the second device also needs to obtain the target distance between the first vector and the second vector; that is, the first device and the second device need to synchronize data. The target distance between the first vector and the second vector may be determined by the second device itself, or may be determined by the first device and then sent to the second device.

If the target distance between the first vector and the second vector is determined by the second device itself, the first device receives the second target public key and the third encrypted vector sent by the second device. The second target public key is generated by the second device itself, and the third encrypted vector is obtained by the second device homomorphically encrypting the second vector with the second target public key. After receiving the second target public key, the first device homomorphically encrypts the first vector with it to generate the fourth encrypted vector. The first device then determines the encrypted distance between the first vector and the second vector from the received third encrypted vector and its own fourth encrypted vector, and sends this encrypted distance to the second device. The second device decrypts the received encrypted distance with the second target private key corresponding to its self-generated second target public key, determines the target distance between the second vector and the first vector, and determines whether the first data and the second data match according to that target distance and the preset first distance threshold.

A specific example follows. The first device inputs the first data to be matched into the pre-trained vector transformation model deployed in the first device and obtains the first vector corresponding to the first data. The second device inputs the second data to be matched into the pre-trained vector transformation model deployed in the second device and obtains the second vector corresponding to the second data. Suppose the first vector corresponding to the first data U1 is (x1, x2, x3, ..., xm) and the second vector corresponding to the second data U2 is (y1, y2, y3, ..., ym).

The first device generates a first target public-private key pair A(pka1, ska1), where pka1 is the first target public key and ska1 is the first target private key. It homomorphically encrypts the first vector with the first target public key to generate the first encrypted vector: the first encrypted vector corresponding to the first vector (x1, x2, x3, ..., xm) is ($E_{pka1}(x_1)$, $E_{pka1}(x_2)$, ..., $E_{pka1}(x_m)$). It then sends the first encrypted vector and the first target public key to the second device.

After receiving the first target public key and the first encrypted vector from the first device, the second device homomorphically encrypts the second vector with the first target public key to obtain the second encrypted vector. Specifically, the second encrypted vector corresponding to the second vector (y1, y2, y3, ..., ym) is ($E_{pka1}(y_1)$, $E_{pka1}(y_2)$, ..., $E_{pka1}(y_m)$). The second device determines the encrypted distance between the first vector and the second vector from the second encrypted vector and the received first encrypted vector, and sends it to the first device.

The second device generates a second target public-private key pair B(pka2, ska2), where pka2 is the second target public key and ska2 is the second target private key. It homomorphically encrypts the second vector with the second target public key to generate the third encrypted vector: the third encrypted vector corresponding to the second vector (y1, y2, y3, ..., ym) is ($E_{pka2}(y_1)$, $E_{pka2}(y_2)$, ..., $E_{pka2}(y_m)$). The second device sends the second target public key and the third encrypted vector to the first device.

After receiving the second target public key and the third encrypted vector from the second device, the first device homomorphically encrypts the first vector with the second target public key to obtain the fourth encrypted vector. Specifically, the fourth encrypted vector corresponding to the first vector (x1, x2, x3, ..., xm) is ($E_{pka2}(x_1)$, $E_{pka2}(x_2)$, ..., $E_{pka2}(x_m)$). The first device determines the encrypted distance between the first vector and the second vector from the fourth encrypted vector and the third encrypted vector, and sends this encrypted distance to the second device.

After receiving the encrypted distance computed by the second device under the first target public key, the first device decrypts it with the first target private key corresponding to its self-generated first target public key and determines the target distance between the first vector and the second vector. The first device then determines whether the first data and the second data match according to this target distance and the preset first distance threshold.

After receiving the encrypted distance computed by the first device under the second target public key, the second device decrypts it with the second target private key corresponding to its self-generated second target public key and determines the target distance between the first vector and the second vector. The second device then determines whether the first data and the second data match according to this target distance and the preset first distance threshold.

Embodiment 6: So that the second device can also determine whether the first data and the second data match, on the basis of the above embodiments, in this embodiment of the present invention, after the target distance between the first vector and the second vector is determined, the method further includes: sending the target distance between the first vector and the second vector to the second device, so that the second device determines whether the first data and the second data match based on the target distance and the preset first distance threshold.

So that the second device can also determine whether the first data and the second data match, in this embodiment of the present invention the second device also needs to obtain the target distance between the first vector and the second vector. Specifically, this target distance may be obtained by the first device and then sent to the second device. After receiving the target distance from the first device, the second device determines whether the first data and the second data match based on the target distance and the preset first distance threshold.

A specific example follows. The first device inputs the first data to be matched into the pre-trained vector transformation model deployed in the first device and obtains the first vector corresponding to the first data. The second device inputs the second data to be matched into the pre-trained vector transformation model deployed in the second device and obtains the second vector corresponding to the second data. Suppose the first vector corresponding to the first data U1 is (x1, x2, x3, ..., xm) and the second vector corresponding to the second data U2 is (y1, y2, y3, ..., ym).

The first device generates a first target public-private key pair A(pka1, ska1), where pka1 is the first target public key and ska1 is the first target private key. It homomorphically encrypts the first vector with the first target public key to generate the first encrypted vector: the first encrypted vector corresponding to the first vector (x1, x2, x3, ..., xm) is ($E_{pka1}(x_1)$, $E_{pka1}(x_2)$, ..., $E_{pka1}(x_m)$). It then sends the first encrypted vector and the first target public key to the second device.

After receiving the first target public key and the first encrypted vector from the first device, the second device homomorphically encrypts the second vector with the first target public key to obtain the second encrypted vector. Specifically, the second encrypted vector corresponding to the second vector (y1, y2, y3, ..., ym) is ($E_{pka1}(y_1)$, $E_{pka1}(y_2)$, ..., $E_{pka1}(y_m)$). The second device determines the encrypted distance between the first vector and the second vector from the second encrypted vector and the received first encrypted vector, and sends it to the first device.

After receiving the encrypted distance between the first vector and the second vector, the first device decrypts it with the first target private key corresponding to its self-generated first target public key and determines the target distance between the first vector and the second vector. The first device can determine whether the first data and the second data match based on this target distance and the preset first distance threshold. The first device also sends the target distance to the second device, and the second device determines whether the first data and the second data match according to the received target distance and the preset first distance threshold.
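The final decision on both devices is a single comparison. The sketch below assumes that a target distance no greater than the threshold counts as a match; the direction of the comparison is not stated in this passage, and the function name and values are illustrative.

```python
def is_match(target_distance: float, first_distance_threshold: float) -> bool:
    """Assumed rule: the data match when the distance is within the threshold."""
    return target_distance <= first_distance_threshold


# The first device decrypts the distance, decides, and forwards the distance.
target_distance = 2.5           # illustrative decrypted value
FIRST_DISTANCE_THRESHOLD = 3.0  # illustrative preset threshold

first_device_decision = is_match(target_distance, FIRST_DISTANCE_THRESHOLD)

# The second device applies the same threshold to the received distance,
# so both devices reach the same conclusion.
second_device_decision = is_match(target_distance, FIRST_DISTANCE_THRESHOLD)

print(first_device_decision, second_device_decision)
```

Because both sides use the same distance and the same threshold, the two decisions always agree, which is the data synchronization this embodiment aims for.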

Embodiment 7: To determine the target distance between the first vector and the second vector, on the basis of the above embodiments, in this embodiment of the present invention, determining the target distance between the first vector and the second vector based on the encrypted distance between the first vector and the second vector and the first target private key corresponding to the first target public key includes: decrypting the encrypted distance between the first vector and the second vector with the first target private key corresponding to the first target public key generated by the first device itself, to determine the target distance between the first vector and the second vector.

In this embodiment of the present invention, the encrypted distance between the first vector and the second vector obtained by the first device is determined from the first encrypted vector and the second encrypted vector, and both encrypted vectors are produced with the first target public key generated by the first device. Therefore, after obtaining the encrypted distance, the first device decrypts it with the first target private key corresponding to that first target public key to determine the target distance between the first vector and the second vector.

Embodiment 8:
To determine the first encrypted vector, on the basis of the above embodiments, in this embodiment of the present invention, inputting the first data to be matched into the pre-trained vector transformation model to obtain the first vector corresponding to the first data includes:
for each first sub-data in the first data, inputting the first sub-data into the pre-trained vector transformation model to obtain the first sub-vector corresponding to the first sub-data, where the length of the first sub-vector corresponding to each first sub-data is the first preset length;
concatenating the first sub-vectors corresponding to the first sub-data to obtain the first vector corresponding to the first data.

In this embodiment of the present invention, the first data may contain one first sub-data or multiple first sub-data. For example, the first data may contain the single first sub-data "Shanghai Pudong New Area Qingtian Convenience Store", or it may contain three first sub-data: "Shanghai Pudong New Area Qingtian Convenience Store", "Shanghai Tiantian Restaurant", and "Gaoke Road Yang Guofu Malatang".

To determine the first vector corresponding to the first data, each first sub-data in the first data can be input into the pre-trained vector transformation model to obtain the first sub-vector corresponding to that first sub-data. The text, digits, or characters contained in different first sub-data may differ in length, but the first sub-vector corresponding to every first sub-data has the first preset length. The first preset length may be 3, 4, 6, and so on, and can be set as required.

As an illustration, assume the pre-trained vector transformation model is a word vector model whose output vectors have dimension 5. If the first data contains three first sub-data, all of which are text data, namely "Shanghai Pudong New Area Qingtian Convenience Store", "Shanghai Tiantian Restaurant", and "Gaoke Road Yang Guofu Malatang", then inputting "Shanghai Pudong New Area Qingtian Convenience Store" into the pre-trained word vector model yields the first sub-vector (1.0, 2.0, 1.5, 2.0, 3.5), inputting "Shanghai Tiantian Restaurant" yields the first sub-vector (3.0, 4.0, 2.5, 2.5, 1.5), and inputting "Gaoke Road Yang Guofu Malatang" yields the first sub-vector (4.5, 5.5, 7.5, 1.5, 0.5).

If the first data contains three first sub-data, all of which are numeric data, namely "12345", "11111", and "22233", then inputting "12345" into the pre-trained word vector model yields the first sub-vector (0000000010, 0000000100, 0000001000, 0000010000, 0000100000), inputting "11111" yields the first sub-vector (0000000010, 0000000010, 0000000010, 0000000010, 0000000010), and inputting "22233" yields the first sub-vector (0000000100, 0000000100, 0000000100, 0000001000, 0000001000).
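The numeric example is consistent with a one-hot style encoding in which digit d sets the bit at position d, counted from the right, of a 10-character string. The sketch below reproduces the three example outputs under that assumption. The source attributes these outputs to the pre-trained model and does not state an encoding rule, so the rule, the function names `encode_digit` and `encode_digits`, and the treatment of the digit 0 are all inferred for illustration.

```python
def encode_digit(digit: str, width: int = 10) -> str:
    """Return a width-character string with a single '1' at position
    int(digit), counted from the right (assumed rule, inferred from the examples)."""
    bits = ["0"] * width
    bits[width - 1 - int(digit)] = "1"
    return "".join(bits)


def encode_digits(number: str) -> list[str]:
    """Encode every digit of a numeric string, one component per digit."""
    return [encode_digit(d) for d in number]


sub_vector_12345 = encode_digits("12345")
sub_vector_22233 = encode_digits("22233")
```

Each component of the resulting sub-vector corresponds to one digit of the input, so a five-digit input gives a sub-vector of length 5.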

To determine the first vector corresponding to the first data, in this embodiment of the present invention, after the first sub-vector corresponding to each first sub-data in the first data is obtained, the first sub-vectors are concatenated and the result is taken as the first vector corresponding to the first data. Specifically, the first sub-data contained in the first data can first be sorted randomly; the corresponding first sub-vectors are then arranged in the same order and concatenated to obtain the first vector.

For example, suppose the first data contains the three first sub-data "Shanghai Pudong New Area Qingtian Convenience Store", "Shanghai Tiantian Restaurant", and "Gaoke Road Yang Guofu Malatang", whose first sub-vectors are (1.0, 2.0, 1.5), (3.0, 4.0, 2.5), and (4.5, 5.5, 7.5) respectively. If randomly sorting the first sub-data gives the order "Shanghai Pudong New Area Qingtian Convenience Store", "Gaoke Road Yang Guofu Malatang", "Shanghai Tiantian Restaurant", then concatenating the three first sub-vectors in that order gives the first vector (1.0, 2.0, 1.5, 4.5, 5.5, 7.5, 3.0, 4.0, 2.5) corresponding to the first data.
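The concatenation step can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `concatenate_sub_vectors` and its parameters are hypothetical, and the order is passed in explicitly (the source chooses it at random) so that the example result is reproduced.

```python
def concatenate_sub_vectors(sub_vectors, order, first_preset_length):
    """Concatenate the sub-vectors in the given order into one vector.

    Raises ValueError if any sub-vector does not have the first preset length.
    """
    result = []
    for key in order:
        sub_vector = sub_vectors[key]
        if len(sub_vector) != first_preset_length:
            raise ValueError(f"sub-vector for {key!r} has length {len(sub_vector)}")
        result.extend(sub_vector)
    return result


sub_vectors = {
    "Shanghai Pudong New Area Qingtian Convenience Store": [1.0, 2.0, 1.5],
    "Shanghai Tiantian Restaurant": [3.0, 4.0, 2.5],
    "Gaoke Road Yang Guofu Malatang": [4.5, 5.5, 7.5],
}
# Fixed here to match the example; the source sorts the sub-data randomly.
order = [
    "Shanghai Pudong New Area Qingtian Convenience Store",
    "Gaoke Road Yang Guofu Malatang",
    "Shanghai Tiantian Restaurant",
]
first_vector = concatenate_sub_vectors(sub_vectors, order, first_preset_length=3)
```

Because every sub-vector has the same length, the length of the concatenated vector is always an integer multiple of the first preset length.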

To achieve fuzzy matching between the first data and the second data, on the basis of the above embodiments, in this embodiment of the present invention, the first vector and the second vector both have the second preset length.

In this embodiment of the present invention, to achieve fuzzy matching between the first data and the second data, the first vector corresponding to the first data and the second vector corresponding to the second data must have the same length, namely the second preset length. The second preset length is not less than the first preset length and is an integer multiple of it; if the first data contains only one first sub-data, the first preset length equals the second preset length.

Because the first vector corresponding to the first data and the second vector corresponding to the second data both have the second preset length, fuzzy matching can be achieved even when the first data and the second data are not identical, which broadens the usage scenarios.

Embodiment 9:
To determine the first encrypted vector, on the basis of the above embodiments, in this embodiment of the present invention, homomorphically encrypting the first vector with the self-generated first target public key to generate the first encrypted vector includes:
for each first component in the first vector, determining the first square component corresponding to the first component;
inserting the first square component corresponding to each first component into the first vector according to a preset insertion rule, and updating the first vector to the vector obtained after the first square components are inserted;
homomorphically encrypting each first component and each first square component in the first vector with the first target public key to generate the first encrypted vector.

To generate the first encrypted vector, in this embodiment of the present invention, the first vector can be homomorphically encrypted directly with the first target public key. Alternatively, so that the encrypted distance between the first vector and the second vector can be determined from the first encrypted vector and the second encrypted vector without decrypting either of them, the first square component corresponding to each first component in the first vector can be determined first.

For example, if the first vector is (1, 2, 4, 5, 3), the first square component corresponding to the first component 1 is 1, that corresponding to the first component 2 is 4, that corresponding to the first component 4 is 16, that corresponding to the first component 5 is 25, and that corresponding to the first component 3 is 9.

In this embodiment of the present invention, after the first square component corresponding to each first component in the first vector is determined, the first square components are inserted into the first vector according to a preset rule, and the first vector is updated to the vector obtained after insertion. Specifically, the first square component corresponding to a first component can be inserted at any position in the first vector: for example, immediately before that first component, immediately after that first component, or, in order, after the first components. The only requirement is that the first device and the second device can both identify the first components and the first square components in each vector.

For example, if the first vector is (1, 2, 4, 5, 3), then after the first square component of each first component is determined and inserted into the first vector, the updated first vector may be (1, 9, 2, 16, 4, 4, 5, 25, 1, 3).

To make it easier to subsequently determine the encrypted distance between the first vector and the second vector from the first vector with its first square components inserted, in this embodiment of the present invention, the first square component corresponding to each first component can be inserted into the first vector according to the following preset insertion rule, and the first vector is updated to the vector obtained after insertion: the first square component corresponding to a first component is inserted at the position immediately after, and adjacent to, that first component.

For example, if the first vector is (1, 2, 4, 5, 3), then after the first square component of each first component is determined and inserted into the first vector under this rule, the updated first vector is (1, 1, 2, 4, 4, 16, 5, 25, 3, 9).
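The adjacent-insertion rule amounts to interleaving each component with its square. A minimal sketch, with the hypothetical function name `interleave_squares`:

```python
def interleave_squares(vector):
    """Insert each component's square immediately after the component."""
    updated = []
    for component in vector:
        updated.extend([component, component * component])
    return updated


updated_first_vector = interleave_squares([1, 2, 4, 5, 3])
# updated_first_vector is [1, 1, 2, 4, 4, 16, 5, 25, 3, 9]
```

Under this rule the updated vector is exactly twice as long as the original, with components at even indices and their squares at odd indices.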

After the updated first vector is determined, to determine the first encrypted vector, in this embodiment of the present invention, each first component and each first square component in the first vector can be homomorphically encrypted separately with the first target public key to generate the first encrypted vector.

For example, if the updated first vector is (2, 4, 3, 9), the first encrypted vector is (E(2), E(4), E(3), E(9)), where E(2) denotes the result of homomorphically encrypting the first component 2 with the target public key, E(4) denotes the result of homomorphically encrypting the first square component 4, E(3) denotes the result of homomorphically encrypting the first component 3, and E(9) denotes the result of homomorphically encrypting the first square component 9.

To determine the encrypted distance between the first vector and the second vector, on the basis of the above embodiments, in this embodiment of the present invention, obtaining the encrypted distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes:
obtaining each group of a first encrypted component and a first encrypted square component according to the preset insertion rule and each first component in the first encrypted vector, and obtaining each group of a second encrypted component and a second encrypted square component according to the preset insertion rule and each second component in the second encrypted vector;
determining, according to the preset insertion rule, each corresponding group of a first encrypted component, a first encrypted square component, a second encrypted component, and a second encrypted square component;
determining each encrypted sub-distance according to each group of a first encrypted component, a first encrypted square component, a second encrypted component, and a second encrypted square component;
determining the encrypted distance between the first vector and the second vector according to the sum of the sub-distances.

In this embodiment of the present invention, to determine the encrypted distance between the first vector and the second vector, the first device can obtain each group of a first encrypted component and a first encrypted square component according to the preset insertion rule and each first component in the first encrypted vector. Within a group, the first encrypted component and the first encrypted square component are obtained by homomorphically encrypting the same component and the square of that component. That is, in each group, the first encrypted square component is the homomorphic encryption of the first square component corresponding to the first component that the first encrypted component was encrypted from.

Specifically, if the preset insertion rule inserts the first square component corresponding to a first component at the position immediately after, and adjacent to, that first component, then each group of a first encrypted component and a first encrypted square component can be determined by starting from the first component of the first encrypted vector and pairing each not-yet-grouped component with the component that immediately follows it, proceeding in order until the first encrypted component and first encrypted square component of every group have been determined.

For example, if the first encrypted vector is (E(x1), E(x1²), E(x2), E(x2²)), two groups of a first encrypted component and a first encrypted square component are obtained: the first group is (E(x1), E(x1²)) and the second group is (E(x2), E(x2²)). In the first group, E(x1) is the first encrypted component and E(x1²) is the first encrypted square component; in the second group, E(x2) is the first encrypted component and E(x2²) is the first encrypted square component.
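Under the adjacent-insertion rule, the grouping reduces to taking consecutive pairs. The sketch below uses placeholder strings in place of real ciphertexts; the function name `group_pairs` is hypothetical.

```python
def group_pairs(encrypted_vector):
    """Split an interleaved vector into (encrypted component,
    encrypted square component) groups of two consecutive entries."""
    if len(encrypted_vector) % 2 != 0:
        raise ValueError("interleaved vector must have an even length")
    return [
        (encrypted_vector[i], encrypted_vector[i + 1])
        for i in range(0, len(encrypted_vector), 2)
    ]


# Placeholder labels stand in for ciphertexts.
groups = group_pairs(["E(x1)", "E(x1^2)", "E(x2)", "E(x2^2)"])
```

The same function applies unchanged to the second encrypted vector, since both vectors are built with the same insertion rule.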

In this embodiment of the present invention, after obtaining the second encrypted vector sent by the second device, the first device can likewise obtain each group of a second encrypted component and a second encrypted square component according to the preset insertion rule and each second component in the second encrypted vector. The preset insertion rule used to obtain the first vector is the same as the one used to obtain the second vector, and the process of obtaining each group of a second encrypted component and a second encrypted square component is the same as that for the first encrypted component and first encrypted square component, so it is not repeated here.

To determine the encrypted distance between the first vector and the second vector, in this embodiment of the present invention, each encrypted sub-distance can be determined from each group of a first encrypted component, a first encrypted square component, a second encrypted component, and a second encrypted square component. Specifically, for each group, the target sum of the first encrypted square component and the second encrypted square component is determined first; then the target product of the first encrypted component, the second encrypted component, and a preset value is determined, where the preset value is 2 in this embodiment; finally, the target difference between the target sum and the target product is determined and taken as the encrypted sub-distance of the group.

After the encrypted sub-distance of each group is determined, the encrypted distance between the first vector and the second vector can be determined from the sum of the sub-distances.
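In plaintext terms, the sum-minus-product construction for each group is the expansion of a squared difference, which is why only the components and their squares need to be encrypted. Writing x_i and y_i for the i-th components of the first and second vectors (symbols assumed here for illustration) and m for the vector length:

```latex
d_i = x_i^2 + y_i^2 - 2\,x_i y_i = (x_i - y_i)^2,
\qquad
D = \sum_{i=1}^{m} d_i = \sum_{i=1}^{m} (x_i - y_i)^2
```

For the vectors (1, 5) and (2, 3), this gives d_1 = 1 + 4 - 4 = 1 and d_2 = 25 + 9 - 30 = 4, so D = 5.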

For example, suppose the first vector corresponding to the first data is (1, 5) and the second vector corresponding to the second data is (2, 3). After the first square component of each first component is determined and inserted into the first vector, the updated first vector is (1, 1, 5, 25), and the first encrypted vector is (E(1), E(1), E(5), E(25)). After the second square component of each second component is determined and inserted into the second vector, the updated second vector is (2, 4, 3, 9), and the second encrypted vector is (E(2), E(4), E(3), E(9)). The encrypted distance between the first vector and the second vector is then [E(1) + E(4) − 2·E(1)·E(2)] + [E(25) + E(9) − 2·E(5)·E(3)].
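The worked example can be checked end to end with a toy additively homomorphic scheme. The sketch below assumes Paillier encryption, which the source does not name, and uses tiny primes that are insecure and only suitable for illustration. It also departs from the description in one respect: Paillier cannot multiply two ciphertexts, so the second device multiplies E(x_i) by its own plaintext value -2·y_i instead of forming a product of two encrypted components. All function names are hypothetical, and the components are assumed to be non-negative integers (real-valued vectors would need a fixed-point encoding).

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair (generator g = n + 1); primes are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m under public key n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return pow(n + 1, m % n, n2) * pow(r, n, n2) % n2


def decrypt(n, private_key, c):
    lam, mu = private_key
    return (pow(c, lam, n * n) - 1) // n * mu % n


def add(n, c1, c2):
    """Multiplying ciphertexts yields an encryption of the sum of the plaintexts."""
    return c1 * c2 % (n * n)


def mul_plain(n, c, k):
    """Raising a ciphertext to k yields an encryption of k times the plaintext."""
    return pow(c, k % n, n * n)


def encrypted_distance(n, enc_x_interleaved, y):
    """Second device: combine E(x_i), E(x_i^2) with its own plaintext y_i."""
    total = encrypt(n, 0)
    for i, y_i in enumerate(y):
        enc_x, enc_x_sq = enc_x_interleaved[2 * i], enc_x_interleaved[2 * i + 1]
        sub = add(n, enc_x_sq, encrypt(n, y_i * y_i))     # E(x^2 + y^2)
        sub = add(n, sub, mul_plain(n, enc_x, -2 * y_i))  # E(x^2 + y^2 - 2xy)
        total = add(n, total, sub)
    return total


n, private_key = keygen()
enc_x = [encrypt(n, v) for v in (1, 1, 5, 25)]    # first device: (1, 5) with squares inserted
enc_d = encrypted_distance(n, enc_x, (2, 3))      # second device holds (2, 3)
target_distance = decrypt(n, private_key, enc_d)  # first device recovers 5
```

The second device never sees the first vector in plaintext, and the first device only learns the decrypted distance, which matches the plaintext value (1 - 2)² + (5 - 3)² = 5.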

Because the first data to be matched may contain multiple first sub-data and the second data to be matched may contain multiple second sub-data, the encrypted distance between the first sub-vector of each first sub-data and the second sub-vector of each second sub-data can also be determined, and from it whether each first sub-data matches each second sub-data. For convenience, an encrypted distance matrix can be built from the encrypted distances between the first sub-vectors and the second sub-vectors, where each element of the encrypted distance matrix is the encrypted distance between the corresponding first sub-vector and second sub-vector. After the encrypted distance matrix is decrypted with the target private key, the target sub-distance matrix is obtained, where each element is the target sub-distance between the corresponding first sub-vector and second sub-vector.

After the target sub-distance between each first sub-vector and each second sub-vector is determined, the target distance between the first vector and the second vector can be determined from the sum of the target sub-distances, and whether each first sub-data matches each second sub-data can also be determined. Specifically, a sub-distance threshold can be set in advance; for each first sub-vector and each second sub-vector, it is determined whether their target sub-distance is less than the preset sub-distance threshold, and if so, the first sub-data corresponding to the first sub-vector is determined to match the second sub-data corresponding to the second sub-vector.

Figure 2a is a schematic diagram showing target sub-distances provided by some embodiments of the present invention, Figure 2b is a schematic diagram showing a target sub-distance matrix provided by some embodiments of the present invention, Figure 3a is a schematic diagram showing another set of target sub-distances provided by some embodiments of the present invention, and Figure 3b is a schematic diagram showing another target sub-distance matrix provided by some embodiments of the present invention. Figures 2a, 2b, 3a, and 3b are described below.

D(x, y) denotes the target sub-distance, where x denotes the first vector and y denotes the second vector. Suppose the first data contains three first sub-data and the second data contains three second sub-data, the first sub-vectors corresponding to the three first sub-data are denoted A1, A2, and A3, and the second sub-vectors corresponding to the three second sub-data are denoted B1, B2, and B3. The target sub-distance is 1 for A1 and B1, 3.64 for A1 and B2, 7.66 for A1 and B3, 3.9 for A2 and B1, 0 for A2 and B2, 5.7 for A2 and B3, 8.35 for A3 and B1, 5.16 for A3 and B2, and 8.18 for A3 and B3, as shown in Figure 2a. The corresponding target sub-distance matrix is shown in Figure 2b.

In another case, suppose the first data contains three first sub-data and the second data contains three second sub-data, the first sub-vectors corresponding to the three first sub-data are denoted A1, A2, and A3, and the second sub-vectors corresponding to the three second sub-data are denoted B1, B2, and B3. The target sub-distance is 2.82 for A1 and B1, 1 for A1 and B2, 3.16 for A1 and B3, 3 for A2 and B1, 1.73 for A2 and B2, 3 for A2 and B3, 0 for A3 and B1, 2.83 for A3 and B2, and 3 for A3 and B3, as shown in Figure 3a. The corresponding target sub-distance matrix is shown in Figure 3b.
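The per-pair threshold check on a decrypted target sub-distance matrix can be sketched as follows, using the Figure 3 values. The threshold of 1.5 and the function name `matching_pairs` are assumptions chosen for illustration; the source leaves the sub-distance threshold to be configured.

```python
# Target sub-distances from the Figure 3 example: rows A1..A3, columns B1..B3.
FIG3_SUB_DISTANCES = [
    [2.82, 1.0, 3.16],
    [3.0, 1.73, 3.0],
    [0.0, 2.83, 3.0],
]


def matching_pairs(matrix, sub_distance_threshold):
    """Return (Ai, Bj) labels whose target sub-distance is below the threshold."""
    return [
        (f"A{i + 1}", f"B{j + 1}")
        for i, row in enumerate(matrix)
        for j, distance in enumerate(row)
        if distance < sub_distance_threshold
    ]


pairs = matching_pairs(FIG3_SUB_DISTANCES, sub_distance_threshold=1.5)
# pairs is [("A1", "B2"), ("A3", "B1")]
```

With this threshold, A1 matches B2 (distance 1) and A3 matches B1 (distance 0), while A2 has no match because its smallest sub-distance, 1.73, is not below the threshold.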

Embodiment 10:
To determine whether the first data and the second data match, on the basis of the above embodiments, in this embodiment of the present invention, determining whether the first data and the second data match based on the target distance and the preset first distance threshold includes:
determining whether the target distance is less than the preset first distance threshold;
if so, determining that the first data matches the second data;
otherwise, determining that the first data does not match the second data.

To determine whether the first data and the second data match, in this embodiment of the present invention, the target distance is compared with the preset first distance threshold. If the target distance is less than the preset first distance threshold, the first data and the second data are determined to match; if the target distance is not less than the preset first distance threshold, they are determined not to match. The preset first distance threshold may be 1, 1.5, and so on, and can be set as required. The smaller the target distance, the more closely the first vector and the second vector match.

To determine whether the second data completely matches the first data, on the basis of the above embodiments, in this embodiment of the present invention, after the first data is determined to match the second data, the method further includes:
determining whether the target distance is equal to a preset second distance threshold, and if so, determining that the first data is the same as the second data.

In this embodiment, if the target distance is equal to the preset second distance threshold, the first data and the second data are identical, that is, they match exactly. The preset second distance threshold is less than the preset first distance threshold and is equal to 0.
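The two-threshold rule of Example 10 can be sketched as a small classification function. This is an illustrative sketch only: the names `MatchResult` and `classify_distance` are hypothetical, and the default thresholds (1.0 and 0) are simply the example values mentioned in the text.

```python
from enum import Enum


class MatchResult(Enum):
    NO_MATCH = "no_match"
    FUZZY_MATCH = "fuzzy_match"
    IDENTICAL = "identical"


def classify_distance(target_distance: float,
                      first_threshold: float = 1.0,
                      second_threshold: float = 0.0) -> MatchResult:
    """Classify a decrypted target distance using the two preset thresholds."""
    # The text requires the second threshold to be below the first one.
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be below the first threshold")
    # Not less than the first threshold: the two data items do not match.
    if target_distance >= first_threshold:
        return MatchResult.NO_MATCH
    # Equal to the second threshold (0 in the text): the data are identical.
    if target_distance == second_threshold:
        return MatchResult.IDENTICAL
    # Below the first threshold but not identical: a fuzzy match.
    return MatchResult.FUZZY_MATCH
```

In a deployment that works with floating-point distances, the exact equality test would probably be replaced by a small tolerance; that choice is not specified in the text.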

Example 11: To allow matching even when the data held by the two parties are not exactly identical, and thereby broaden the business scope of data matching, embodiments of the present invention provide a data matching method, apparatus, device, and medium.

Figure 4 is a schematic diagram of the process of a data matching method provided by an embodiment of the present invention. The process includes the following steps: S401: Input the second data to be matched into the pre-trained vector conversion model to obtain the second vector corresponding to the second data.

The data matching method provided by this embodiment is applied to a second device. The second device may be a smart terminal, a PC, a server, or a similar device, and it is a different device from the first device of the present invention.

To achieve fuzzy matching even when the data of the two parties are not exactly identical, in this embodiment a pre-trained vector conversion model is deployed in the second device to obtain the vector corresponding to the data to be matched. For different data, the vectors output by the pre-trained vector conversion model have the same dimension.

To obtain the second vector corresponding to the second data to be matched, the second data is input into the pre-trained vector conversion model, which outputs the second vector corresponding to the second data.

In this embodiment, the first data and the second data are generally of the same type; for example, both are text data or both are numeric data.

S402: Receive the first target public key sent by the first device, and homomorphically encrypt the second vector with the first target public key to generate a second encrypted vector.

In this embodiment, to determine the second encrypted vector, the second device, after receiving the first target public key sent by the first device, homomorphically encrypts the second vector with that key to generate the second encrypted vector.

S403: Obtain the target distance between the first vector and the second vector, determined based on a first encrypted vector and the second encrypted vector, where the first encrypted vector is obtained by encrypting the first vector with the first target public key, and the first vector is obtained by inputting the first data into the pre-trained vector conversion model in the first device.

In this embodiment, so that the second device can also determine whether the first data matches the second data, the second device also obtains the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector. The target distance may be determined by the first device and then sent to the second device, or it may be determined by the second device.

The first encrypted vector is obtained by the first device encrypting the first vector with the first target public key that the first device generated, and the first vector is obtained by inputting the first data into the pre-trained vector conversion model in the first device.

S404: Determine whether the first data and the second data match based on the target distance and the preset first distance threshold.

To determine whether the first data and the second data match, in this embodiment the target distance is compared with the preset first distance threshold, and whether the first data and the second data match is determined from the comparison result.

In this embodiment, the first data and the second data to be matched are each input into a pre-trained vector conversion model to obtain the first vector corresponding to the first data and the second vector corresponding to the second data. The encrypted distance between the first vector and the second vector is determined from the first encrypted vector (the encrypted first vector) and the second encrypted vector (the encrypted second vector), and the target distance between the first vector and the second vector is determined from that encrypted distance and the first target private key generated by the first device. Whether the first data and the second data match is then determined based on the target distance and the preset first distance threshold. As a result, fuzzy matching is possible even when the first data and the second data are not exactly identical, which broadens the usage scenarios. The first target public key and the first target private key are used for homomorphic encryption and decryption respectively during fuzzy matching, which realizes secure intersection and protects the matching process. Throughout the matching process, neither the first data nor the second data leaves its own device (the first device or the second device, respectively) in raw form, so fuzzy matching is achieved without raw data leaving its local store, which further protects the matching process.

Example 12: To determine the second vector corresponding to the second data, on the basis of the above embodiments, in this embodiment, inputting the second data to be matched into the pre-trained vector conversion model to obtain the second vector corresponding to the second data includes: determining the second target data type corresponding to the second data to be matched; determining the pre-trained second target vector conversion model corresponding to the second data according to the second target data type and a pre-stored correspondence between data types and pre-trained vector conversion models; and inputting the second data into the pre-trained second target vector conversion model to obtain the second vector corresponding to the second data.

In this embodiment, the second data to be matched may be text data, such as a name, gender, or address, or numeric data, such as an ID card number, bank card number, or admission ticket number. Accordingly, the pre-trained vector conversion model used to obtain the second vector differs for second data of different data types.

Specifically, the correspondence between data types and pre-trained vector conversion models may be stored in the second device. According to the second target data type of the obtained second data to be matched, the corresponding pre-trained vector conversion model, that is, the pre-trained second target vector conversion model, is used to obtain the second vector corresponding to the second data.
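The stored correspondence between data types and vector conversion models can be pictured as a lookup table keyed by data type. The sketch below is only an illustration: `detect_type`, `MODEL_BY_TYPE`, and both encoder functions are hypothetical names, and the two encoders are deliberately trivial stand-ins for the pre-trained models, which the text does not specify in detail.

```python
from typing import Callable, Dict, List

Vector = List[float]
Encoder = Callable[[str], Vector]


def encode_text_stub(value: str) -> Vector:
    # Stand-in for a pre-trained word or sentence vector model (assumption).
    return [float(len(value))]


def encode_number_stub(value: str) -> Vector:
    # Stand-in for the numeric-type model; here each digit becomes one entry.
    return [float(ch) for ch in value]


# Pre-stored correspondence between data types and conversion models.
MODEL_BY_TYPE: Dict[str, Encoder] = {
    "text": encode_text_stub,
    "numeric": encode_number_stub,
}


def detect_type(value: str) -> str:
    """Return the target data type; a string of ASCII digits counts as numeric."""
    return "numeric" if value.isascii() and value.isdigit() else "text"


def to_vector(value: str) -> Vector:
    """Select the model registered for the data type and apply it."""
    data_type = detect_type(value)
    encoder = MODEL_BY_TYPE.get(data_type)
    if encoder is None:
        raise ValueError(f"no vector conversion model registered for {data_type!r}")
    return encoder(value)
```

Keeping the table separate from the encoders means a new data type only needs a new entry, which mirrors the idea of a stored correspondence rather than hard-coded branching.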

Figure 5a is a schematic diagram of a process for obtaining the vector corresponding to text-type data provided by some embodiments of the present invention, and Figure 5b is a schematic diagram of a process for obtaining the vector corresponding to numeric-type data provided by some embodiments of the present invention. Figures 5a and 5b are described below.

If the data to be matched, which may be the first data or the second data, is text data, the pre-trained vector conversion model is a word vector model. The text data is input into the pre-trained word vector model, which outputs the vector corresponding to the text data.

If the data to be matched, which may be the first data or the second data, is numeric data, the pre-trained vector conversion model is a one-hot encoding model. The numeric data is input into the pre-trained one-hot encoding model, which outputs the vector corresponding to the numeric data.
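As an illustration of one-hot encoding for numeric data, the sketch below encodes each decimal digit as a 10-element indicator and concatenates the results. The per-digit layout is an assumption, since the text does not specify how the one-hot model arranges its output, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def one_hot_digits(number: str) -> List[int]:
    """Encode a digit string as concatenated 10-way one-hot blocks."""
    if not (number.isascii() and number.isdigit()):
        raise ValueError("expected a non-empty string of ASCII digits")
    vector: List[int] = []
    for ch in number:
        block = [0] * 10
        block[int(ch)] = 1  # set the position that corresponds to this digit
        vector.extend(block)
    return vector


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Under this layout, two numbers of equal length that differ in k digit positions are at distance sqrt(2k), so a single mistyped digit gives a distance of about 1.41. Whether that counts as a match depends on the chosen first distance threshold.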

To accurately determine the model that converts the second data into the second vector, on the basis of the above embodiments, if the second target data type is the text type, the corresponding pre-trained second target vector conversion model is a word vector model or a sentence vector model; if the second target data type is the numeric type, the corresponding pre-trained second target vector conversion model is a one-hot encoding model.

In this embodiment, if the second data is text data, the pre-trained vector conversion model deployed in the second device to obtain the corresponding second vector may be a word vector model or a sentence vector model; if the second data is numeric data, the pre-trained vector conversion model deployed in the second device to obtain the corresponding second vector may be a one-hot encoding model.

Specifically, if the second data is text data, the pre-trained second target vector conversion model corresponding to the second data is determined, from the pre-stored correspondence between data types and pre-trained vector conversion models, to be the word vector model, and the second vector corresponding to the second data is obtained with the pre-trained word vector model. If the second data is numeric data, the pre-trained second target vector conversion model is determined from the same correspondence to be the one-hot encoding model, and the second vector corresponding to the second data is obtained with the pre-trained one-hot encoding model.

When training the vector conversion model, each piece of data may be labeled in advance with its corresponding label vector. Each piece of data and its label vector are input into the original vector conversion model, and the parameters of the original vector conversion model are adjusted according to the predicted vector it outputs and the corresponding label vector. When the convergence condition is met, training of the vector conversion model is complete.
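The training procedure described above (predict, compare with the label vector, adjust parameters, stop on convergence) can be sketched with a deliberately simple model. The linear map, the squared-error loss, the per-sample gradient update, and all names below are assumptions chosen for illustration; the text does not state the model architecture, the loss, or the convergence condition.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Sample = Tuple[Sequence[float], Sequence[float]]  # (input features, label vector)


def predict(weights: List[Vector], features: Sequence[float]) -> Vector:
    """Predicted vector of a linear model: one dot product per output entry."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def train(samples: Sequence[Sample],
          learning_rate: float = 0.5,
          tolerance: float = 1e-10,
          max_epochs: int = 1000) -> List[Vector]:
    """Adjust the weights until the summed squared error falls below tolerance."""
    in_dim = len(samples[0][0])
    out_dim = len(samples[0][1])
    weights = [[0.0] * in_dim for _ in range(out_dim)]
    for _ in range(max_epochs):
        total_loss = 0.0
        for features, label in samples:
            predicted = predict(weights, features)
            errors = [p - t for p, t in zip(predicted, label)]
            total_loss += sum(e * e for e in errors)
            # Gradient step on the squared error between prediction and label.
            for i, err in enumerate(errors):
                for j, x in enumerate(features):
                    weights[i][j] -= learning_rate * err * x
        if total_loss < tolerance:  # convergence condition (assumed form)
            break
    return weights


# Two labeled samples: each input is paired with its label vector.
training_data: List[Sample] = [([1.0, 0.0], [0.2, 0.8]), ([0.0, 1.0], [0.6, 0.4])]
trained_weights = train(training_data)
```

After training, `predict(trained_weights, [1.0, 0.0])` is close to the label vector `[0.2, 0.8]`.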

Example 13: To obtain the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector, on the basis of the above embodiments, receiving the first target public key sent by the first device includes: receiving the first target public key and the first encrypted vector sent by the first device, where the first encrypted vector is obtained by homomorphically encrypting the first vector with the first target public key. Obtaining the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes: determining the encrypted distance between the first vector and the second vector according to the first encrypted vector and the second encrypted vector; sending the encrypted distance between the first vector and the second vector to the first device; and receiving the target distance sent by the first device, where the target distance is obtained by the first device decrypting the encrypted distance with the first target private key corresponding to the first target public key that the first device generated.

In this embodiment, the target distance between the first vector and the second vector obtained by the second device, determined based on the first encrypted vector and the second encrypted vector, may be determined by the first device and sent to the second device, or may be determined by the second device.

Because the first encrypted vector and the second encrypted vector are generated by homomorphically encrypting the first vector and the second vector with the first target public key generated by the first device, the encrypted distance between the first vector and the second vector, once determined, must be decrypted with the first target private key generated by the first device. If the target distance were determined by the second device, the second device would need to receive the first target private key from the first device; any vulnerability or attack during that transmission would compromise the information, so this approach offers weaker security.

To improve security, in this embodiment the target distance may be determined by the first device and sent to the second device. Specifically, when the second device receives the first target public key sent by the first device, it may also receive the first encrypted vector determined by the first device, where the first encrypted vector is obtained by the first device homomorphically encrypting the first vector with the first target public key.

After receiving the first encrypted vector and the first target public key sent by the first device, the second device first homomorphically encrypts the second vector with the first target public key to determine the second encrypted vector, and then determines the encrypted distance between the first vector and the second vector based on the first encrypted vector and the second encrypted vector. Because both encrypted vectors are produced with the first target public key of the first target public-private key pair generated by the first device, the second device sends the encrypted distance to the first device. The first device decrypts the encrypted distance with the first target private key of that key pair to obtain the target distance and sends the target distance to the second device, which receives it.
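The message flow of Example 13 can be sketched end to end with a toy additively homomorphic scheme. Everything specific here is an assumption made for illustration: the text does not name the homomorphic scheme, so a textbook Paillier construction is swapped in; vectors are assumed to hold small non-negative integers; and the second device uses its plaintext components as scalars for the cross term, because an additive scheme cannot multiply two ciphertexts. The sketch needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets
from typing import List, Sequence, Tuple

PrivateKey = Tuple[int, int]  # (lambda, mu)


def keygen(p: int, q: int) -> Tuple[int, PrivateKey]:
    """Toy Paillier key generation with g = n + 1; p and q must be primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse of lambda modulo n
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt integer m under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + (m % n) * n) % n2 * pow(r, n, n2) % n2


def decrypt(n: int, sk: PrivateKey, c: int) -> int:
    """Decrypt ciphertext c with the private key."""
    lam, mu = sk
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def encrypted_squared_distance(n: int, enc_x: Sequence[int],
                               enc_x_sq: Sequence[int], y: Sequence[int]) -> int:
    """Second device: Enc(sum_i (x_i - y_i)^2) from Enc(x_i), Enc(x_i^2) and y."""
    n2 = n * n
    acc = encrypt(n, 0)
    for cx, cx2, yi in zip(enc_x, enc_x_sq, y):
        cross = pow(cx, (-2 * yi) % n, n2)  # Enc(-2 * x_i * y_i)
        acc = acc * cx2 % n2 * cross % n2 * encrypt(n, yi * yi) % n2
    return acc


# First device: generate the key pair and encrypt components and their squares.
public_n, private_key = keygen(2**31 - 1, 2**61 - 1)  # two Mersenne primes, toy size
x: List[int] = [3, 0, 4, 1]
enc_x = [encrypt(public_n, v) for v in x]
enc_x_sq = [encrypt(public_n, v * v) for v in x]

# Second device: compute the encrypted squared distance and send it back.
y: List[int] = [1, 0, 4, 3]
enc_d2 = encrypted_squared_distance(public_n, enc_x, enc_x_sq, y)

# First device: decrypt, take the square root, and share the target distance.
squared_distance = decrypt(public_n, private_key, enc_d2)
target_distance = math.sqrt(squared_distance)
```

The key size used here is far too small for real use and was chosen only so the example runs instantly. Real-valued vectors would also need a fixed-point encoding before encryption.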

Because in this embodiment the first data never leaves the first device in raw form, fuzzy matching is achieved without raw data leaving its local store, which further protects the matching process.

Example 14: To obtain the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector, on the basis of the above embodiments, obtaining the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes: sending the second encrypted vector to the first device, so that the first device determines the encrypted distance between the first vector and the second vector based on the second encrypted vector and the first encrypted vector, the first encrypted vector being obtained by homomorphically encrypting the first vector with the first target public key generated by the first device; and receiving the target distance between the first vector and the second vector sent by the first device, where the target distance is obtained by the first device decrypting the encrypted distance with the first target private key corresponding to the first target public key that it generated.

In this embodiment, to obtain the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector, the first device has two options. It may send both the first target public key and the first encrypted vector to the second device, so that the second device determines the encrypted distance from the first encrypted vector and the second encrypted vector that it generates by encrypting the second vector with the first target public key. Alternatively, it may send only the first target public key to the second device.

After receiving the first target public key sent by the first device, the second device homomorphically encrypts the second vector with it to generate the second encrypted vector and sends the second encrypted vector to the first device. The first device determines the encrypted distance between the first vector and the second vector based on the second encrypted vector and the first encrypted vector, decrypts it with the first target private key of the first target public-private key pair that it generated to determine the target distance between the first vector and the second vector, and sends the target distance to the second device. The second device receives the target distance and can determine, based on it, whether the first data and the second data match.

Figure 6 is a schematic diagram of the overall process of fuzzy matching of the two parties' data provided by some embodiments of the present invention. Figure 6 is described below.

The first data and the second data to be matched are input into pre-trained vector conversion models, which output the first vector corresponding to the first data and the second vector corresponding to the second data. The first vector and the second vector are each homomorphically encrypted with the first target public key to obtain the first encrypted vector and the second encrypted vector. The encrypted distance between the first vector and the second vector is determined from the first encrypted vector and the second encrypted vector and is then decrypted with the first target private key to determine the target distance between the first vector and the second vector. The target distance is compared with the preset first distance threshold to determine the matching result.

Figure 7 is a schematic diagram of a specific process of fuzzy matching of the two parties' data provided by some embodiments of the present invention. Figure 7 is described below.

The first device inputs the first data to be matched into the pre-trained vector conversion model deployed in the first device to obtain the first vector corresponding to the first data, and the second device inputs the second data to be matched into the pre-trained vector conversion model deployed in the second device to obtain the second vector corresponding to the second data. As shown in Figure 7, the first data contains four first sub-data, U1, U2, U3, and U4. The first sub-vector corresponding to U1 is (x11, x12, x13, ..., x1m), the first sub-vector corresponding to U2 is (x21, x22, x23, ..., x2m), the first sub-vector corresponding to U3 is (x31, x32, x33, ..., x3m), and the first sub-vector corresponding to U4 is (x41, x42, x43, ..., x4m). The first vector corresponding to the first data is (x11, x12, x13, ..., x1m, x21, x22, x23, ..., x2m, x31, x32, x33, ..., x3m, x41, x42, x43, ..., x4m). The second data contains four second sub-data, U5, U6, U7, and U8. The second sub-vector corresponding to U5 is (y11, y12, y13, ..., y1m), the second sub-vector corresponding to U6 is (y21, y22, y23, ..., y2m), the second sub-vector corresponding to U7 is (y31, y32, y33, ..., y3m), and the second sub-vector corresponding to U8 is (y41, y42, y43, ..., y4m). The second vector corresponding to the second data is (y11, y12, y13, ..., y1m, y21, y22, y23, ..., y2m, y31, y32, y33, ..., y3m, y41, y42, y43, ..., y4m).

The first device generates a first target public-private key pair A(pka, ska), where pka is the first target public key and ska is the first target private key; the pair is a homomorphic encryption key pair. The first vector (x11, x12, x13, ..., x1m, x21, x22, x23, ..., x2m, x31, x32, x33, ..., x3m, x41, x42, x43, ..., x4m) is homomorphically encrypted with the first target public key to generate the corresponding first encrypted vector, which contains m/2 groups of encrypted first components and encrypted first squared components.

The first device sends the first target public key and the first encrypted vector to the second device. After receiving them, the second device homomorphically encrypts the second vector (y11, y12, y13, ..., y1m, y21, y22, y23, ..., y2m, y31, y32, y33, ..., y3m, y41, y42, y43, ..., y4m) with the first target public key to generate the corresponding second encrypted vector, which contains m/2 groups of encrypted second components and encrypted second squared components.

For each encrypted first sub-vector and each encrypted second sub-vector, the second device determines the encrypted distance between the first sub-vector and the second sub-vector, assembles these encrypted distances into an encrypted distance matrix, and sends the encrypted distance matrix to the first device.

After receiving the encrypted distance matrix sent by the second device, the first device decrypts it with the first target private key that it generated, that is, it decrypts the encrypted distance between each first sub-vector and each second sub-vector, to obtain a target sub-distance matrix in which each element is the corresponding target sub-distance. Based on the target sub-distance matrix, the first device determines whether each first sub-data matches each second sub-data. The first device also sends the target sub-distance matrix to the second device, which likewise determines, based on the target sub-distance matrix, whether each first sub-data matches each second sub-data.
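Once the target sub-distance matrix is available in plaintext, deciding which sub-data match is a threshold test over its entries. The sketch below is illustrative: the function names are hypothetical, and the matrix is computed directly from plaintext sub-vectors only as a stand-in for the decrypted matrix that the first device would actually hold. It uses `math.dist`, which requires Python 3.8 or later.

```python
import math
from typing import List, Sequence, Tuple


def distance_matrix(first_subvectors: Sequence[Sequence[float]],
                    second_subvectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Entry [i][j] is the distance between first sub-vector i and second sub-vector j."""
    return [[math.dist(u, v) for v in second_subvectors] for u in first_subvectors]


def matching_pairs(matrix: Sequence[Sequence[float]],
                   first_threshold: float = 1.0) -> List[Tuple[int, int]]:
    """Index pairs (i, j) whose target sub-distance is below the first threshold."""
    return [(i, j)
            for i, row in enumerate(matrix)
            for j, distance in enumerate(row)
            if distance < first_threshold]


# Stand-in for the decrypted target sub-distance matrix.
first = [[1.0, 0.0], [0.0, 1.0]]
second = [[0.0, 1.0], [1.0, 0.2], [5.0, 5.0]]
pairs = matching_pairs(distance_matrix(first, second))
```

Here `pairs` holds `(0, 1)` and `(1, 0)`: the first sub-vector is within the threshold of the second party's second entry, and the second sub-vector is identical to the second party's first entry.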

Example 15: To obtain the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector, on the basis of the above embodiments, obtaining the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes: homomorphically encrypting the second vector with a second target public key generated by the second device to obtain a third encrypted vector, and sending the second target public key and the third encrypted vector to the first device; receiving, from the first device, the encrypted distance between the second vector and the first vector determined based on the third encrypted vector and a fourth encrypted vector, where the fourth encrypted vector is generated by homomorphically encrypting the first vector with the second target public key; and determining the target distance between the first vector and the second vector based on the encrypted distance and the second target private key corresponding to the second target public key, and determining whether the first data and the second data match based on the target distance and the preset first distance threshold.

So that both the first device and the second device can determine whether the first data and the second data match, in this embodiment the second device can also obtain the target distance between the first vector and the second vector. The target distance may be obtained by the second device decrypting the encrypted distance between the first vector and the second vector received from the first device.

Specifically, the second device may homomorphically encrypt the second vector with the second target public key that it generated itself to obtain the third encrypted vector, and send the third encrypted vector and the second target public key to the first device. After receiving the third encrypted vector and the second target public key, the first device homomorphically encrypts the first vector with the second target public key to obtain the fourth encrypted vector. The first device may then determine the encrypted distance between the first vector and the second vector from the third encrypted vector and the fourth encrypted vector, and send that encrypted distance to the second device. On receiving it, the second device decrypts it with the second target private key corresponding to its self-generated second target public key to obtain the target distance between the first vector and the second vector, and determines whether the first data and the second data match based on that target distance and the preset first distance threshold.

The process by which the second device determines whether the first data and the second data match, based on the target distance between the first vector and the second vector and the preset first distance threshold, is the same as the corresponding process on the first device and is not repeated here.

In this embodiment of the present invention, the second data never leaves the second device in the form of original data, so fuzzy matching is achieved without the original data leaving the database, which further ensures the security of the matching process.

Embodiment 16:
To obtain the target distance between the first vector and the second vector determined from the first encrypted vector and the second encrypted vector, on the basis of the above embodiments, inputting the second data to be matched into the pre-trained vector transformation model to obtain the second vector corresponding to the second data includes:
for each second sub-data item in the second data, inputting that second sub-data item into the pre-trained vector transformation model to obtain the second sub-vector corresponding to it, where the second sub-vector corresponding to every second sub-data item has the first preset length;
concatenating the second sub-vectors corresponding to the second sub-data items to obtain the second vector corresponding to the second data.

In this embodiment of the present invention, a piece of second data may contain one second sub-data item or several. For example, the second data may contain the single second sub-data item "Shanghai Pudong New Area Qingtian Canteen" (上海市浦東新區晴天小賣部), or it may contain three second sub-data items, such as "Shanghai Pudong New Area Qingtian Canteen" (上海市浦東新區晴天小賣部), "Shanghai Tiantian Restaurant" (上海市天天餐飲店), and "Gaoke Road Yang Guofu Malatang" (高科路楊國福麻辣燙).

To determine the second vector corresponding to the second data, each second sub-data item in the second data may be input into the pre-trained vector transformation model to obtain the second sub-vector corresponding to it. The text, digits, or characters contained in different second sub-data items may differ in length, but the second sub-vector corresponding to every second sub-data item has the first preset length. The first preset length may be 3, 4, 6, and so on; it can be set as required.

For example, when the second data contains the three second sub-data items "Shanghai Pudong New Area Qingtian Canteen", "Shanghai Tiantian Restaurant", and "Gaoke Road Yang Guofu Malatang", "Shanghai Pudong New Area Qingtian Canteen" is input into the pre-trained word vector model and the output second sub-vector is (2.0, 3.0, 2.5, 1.0, 1.5); "Shanghai Tiantian Restaurant" is input into the pre-trained word vector model and the output second sub-vector is (2.3, 4.4, 3.5, 4.5, 2.5); and "Gaoke Road Yang Guofu Malatang" is input into the pre-trained word vector model and the output second sub-vector is (2.5, 2.7, 8.3, 4.5, 1.5).

If the second data contains three second sub-data items, each of them numeric, namely "54321", "00001", and "33322", then "54321" is input into the pre-trained word vector model and the output second sub-vector corresponding to "54321" is (0000100000, 0000010000, 0000001000, 0000000100, 0000000010); "00001" is input into the pre-trained word vector model and the output second sub-vector corresponding to "00001" is (0000000000, 0000000000, 0000000000, 0000000010, 0000000010); and "33322" is input into the pre-trained word vector model and the output second sub-vector corresponding to "33322" is (0000000010, 0000000010, 0000000010, 0000000100, 0000000100).

To determine the second vector corresponding to the second data, in this embodiment of the present invention, after the second sub-vector corresponding to each second sub-data item in the second data is obtained, the second sub-vectors are concatenated and the result is taken as the second vector corresponding to the second data. Specifically, the second sub-data items contained in the second data may first be put in a random order; the corresponding second sub-vectors are then arranged in that order and concatenated to obtain the second vector.

For example, suppose the second data contains the three second sub-data items "Shanghai Pudong New Area Qingtian Canteen", "Shanghai Tiantian Restaurant", and "Gaoke Road Yang Guofu Malatang", whose second sub-vectors are (1.0, 2.0, 1.5), (3.0, 4.0, 2.5), and (4.5, 5.5, 7.5) respectively. If randomly ordering the second sub-data items gives "Shanghai Pudong New Area Qingtian Canteen", "Gaoke Road Yang Guofu Malatang", "Shanghai Tiantian Restaurant", then concatenating the three second sub-vectors in that order gives the second vector (1.0, 2.0, 1.5, 4.5, 5.5, 7.5, 3.0, 4.0, 2.5) for the second data.
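The concatenation step can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `splice_subvectors` and its `order` parameter are hypothetical and do not come from the source, and the sub-vectors are taken as already produced by the vector transformation model.

```python
import random

def splice_subvectors(subvectors, sub_len, order=None):
    """Concatenate fixed-length sub-vectors in a given (or random) order.

    Hypothetical helper: `sub_len` plays the role of the first preset length.
    """
    if any(len(v) != sub_len for v in subvectors):
        raise ValueError("every sub-vector must have the first preset length")
    if order is None:
        # No order supplied: pick a random ordering of the sub-data items.
        order = list(range(len(subvectors)))
        random.shuffle(order)
    result = []
    for idx in order:
        result.extend(subvectors[idx])
    return result

# Sub-vectors from the example: canteen, restaurant, malatang.
subs = [(1.0, 2.0, 1.5), (3.0, 4.0, 2.5), (4.5, 5.5, 7.5)]
# Ordering from the example: canteen, malatang, restaurant.
spliced = splice_subvectors(subs, sub_len=3, order=[0, 2, 1])
```

With three sub-vectors of length 3, the result always has length 9, whichever order is drawn.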

To achieve fuzzy matching between the first data and the second data, on the basis of the above embodiments, the first vector and the second vector both have the second preset length.

In this embodiment of the present invention, to achieve fuzzy matching between the first data and the second data, the first vector obtained for the first data and the second vector obtained for the second data must have the same length, namely the second preset length. The second preset length is not less than the first preset length and is an integer multiple of it; if the second data contains only one second sub-data item, the first preset length equals the second preset length.

Because the first vector corresponding to the first data and the second vector corresponding to the second data both have the second preset length, fuzzy matching can be achieved even when the first data and the second data are not identical, which broadens the usage scenarios.

Embodiment 17:
To determine the second encrypted vector, on the basis of the above embodiments, homomorphically encrypting the second vector with the first target public key to generate the second encrypted vector includes:
for each second component in the second vector, determining the second squared component corresponding to that second component;
inserting the second squared component corresponding to each second component into the second vector according to a preset insertion rule, and updating the second vector to the vector obtained after the insertion;
homomorphically encrypting each second component and each second squared component in the second vector with the first target public key to generate the second encrypted vector.

To generate the second encrypted vector, in this embodiment of the present invention the second vector may be homomorphically encrypted directly with the first target public key to obtain the second encrypted vector. To ensure that the encrypted distance between the first vector and the second vector can still be determined from the first encrypted vector and the second encrypted vector without decrypting either of them, in this embodiment the second squared component corresponding to each second component in the second vector may also be determined first.

For example, if the second vector is (1, 2, 4, 5, 3), the second squared component corresponding to the second component 1 is 1, the second squared component corresponding to the second component 2 is 4, the second squared component corresponding to the second component 4 is 16, the second squared component corresponding to the second component 5 is 25, and the second squared component corresponding to the second component 3 is 9.

In this embodiment of the present invention, after the second squared component corresponding to each second component in the second vector is determined, the second squared components may be inserted into the second vector according to a preset rule, and the second vector is updated to the vector obtained after the insertion. Specifically, the second squared component corresponding to a second component may be inserted at any position in the second vector: for example, immediately before that second component, immediately after that second component, or, with all second squared components kept in order, after all of the second components. The only requirement is that the first device and the second device can both identify which entries of each vector are second components and which are second squared components.

For example, if the second vector is (1, 2, 4, 5, 3), then after the second squared component of each second component is determined and the second squared components are inserted into the second vector, the updated second vector may be (1, 9, 2, 16, 4, 4, 5, 25, 1, 3).

To make it easier to later determine the encrypted distance between the first vector and the second vector from the second vector with the second squared components inserted, in this embodiment of the present invention the preset insertion rule may specifically be to insert the second squared component corresponding to each second component immediately after that second component, adjacent to it, and to update the second vector to the vector obtained after the insertion.

For example, if the second vector is (1, 2, 4, 5, 3), then after the second squared component of each second component is determined and inserted into the second vector under this rule, the updated second vector is (1, 1, 2, 4, 4, 16, 5, 25, 3, 9).
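The adjacent-insertion rule is easy to express in code. The following Python sketch is illustrative only; the function name `insert_squares_adjacent` is an assumption and is not taken from the source.

```python
def insert_squares_adjacent(vec):
    """Place each component's square immediately after the component itself."""
    out = []
    for x in vec:
        out.extend([x, x * x])  # component, then its squared component
    return out

updated = insert_squares_adjacent([1, 2, 4, 5, 3])
```

Under this rule the updated vector is always twice as long as the original, with components at even positions and squared components at odd positions (counting from zero).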

After the updated second vector is determined, to determine the second encrypted vector, in this embodiment of the present invention each second component and each second squared component in the second vector may be homomorphically encrypted separately with the first target public key to generate the second encrypted vector.

For example, if the updated second vector is (2, 4, 3, 9), the second encrypted vector is $(E(2), E(4), E(3), E(9))$, where $E(2)$ denotes the result of homomorphically encrypting the component 2 of the second vector with the target public key, $E(4)$ denotes the result of homomorphically encrypting the component 4 of the second vector with the target public key, $E(3)$ denotes the result of homomorphically encrypting the component 3 of the second vector with the target public key, and $E(9)$ denotes the result of homomorphically encrypting the component 9 of the second vector with the target public key.

Embodiment 18:
To obtain the target distance between the first vector and the second vector determined from the first encrypted vector and the second encrypted vector, on the basis of the above embodiments, determining the encrypted distance between the first vector and the second vector from the first encrypted vector and the second encrypted vector includes:
obtaining each group of second encrypted component and second encrypted squared component according to the preset insertion rule and each second component in the second encrypted vector, and obtaining each group of first encrypted component and first encrypted squared component according to the preset insertion rule and each first component in the first encrypted vector;
determining, according to the preset insertion rule, each corresponding group of first encrypted component, first encrypted squared component, second encrypted component, and second encrypted squared component;
determining each encrypted sub-distance from each group of first encrypted component, first encrypted squared component, second encrypted component, and second encrypted squared component;
determining the encrypted distance between the first vector and the second vector from the sum of the sub-distances.

In this embodiment of the present invention, to determine the encrypted distance between the first vector and the second vector, the second device may obtain each group of second encrypted component and second encrypted squared component according to the preset insertion rule and each second component in the second encrypted vector. The two entries of a group are obtained by homomorphically encrypting one component and the square of that same component. That is, in each group, the second encrypted squared component is obtained by homomorphically encrypting the second squared component of the second component from which the second encrypted component was produced.

Specifically, suppose the preset insertion rule is to insert the second squared component corresponding to each second component immediately after that second component, adjacent to it. Then, to determine each group of second encrypted component and second encrypted squared component, one can start from the first entry of the second encrypted vector, put the entry still to be grouped and the entry immediately following it into one group, and continue in this way until the second encrypted component and second encrypted squared component of every group have been determined.

For example, if the second encrypted vector is $(c_1, c_2, c_3, c_4)$, two groups of second encrypted component and second encrypted squared component are obtained: the first group is $(c_1, c_2)$ and the second group is $(c_3, c_4)$. In the first group, $c_1$ is the second encrypted component and $c_2$ is the second encrypted squared component; in the second group, $c_3$ is the second encrypted component and $c_4$ is the second encrypted squared component.
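Under the adjacent-insertion rule, the grouping amounts to walking the encrypted vector two entries at a time. The Python sketch below illustrates this; the function name `split_into_groups` is hypothetical, and the strings stand in for ciphertexts purely as placeholders.

```python
def split_into_groups(encrypted_vec):
    """Pair each encrypted component with the encrypted square that follows it."""
    if len(encrypted_vec) % 2 != 0:
        raise ValueError("expected an even number of entries (component, square)")
    return [
        (encrypted_vec[i], encrypted_vec[i + 1])
        for i in range(0, len(encrypted_vec), 2)
    ]

# Placeholder ciphertexts: c1 and c3 encrypt components, c2 and c4 their squares.
groups = split_into_groups(["c1", "c2", "c3", "c4"])
```

A different insertion rule would need a different index mapping, which is why both devices must agree on the rule in advance.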

In this embodiment of the present invention, after obtaining the first encrypted vector sent by the first device, the second device may likewise obtain each group of first encrypted component and first encrypted squared component according to the preset insertion rule and each first component in the first encrypted vector. The preset insertion rule used to build the second vector is the same as the one used to build the first vector, and the process of obtaining each group of first encrypted component and first encrypted squared component is the same as the process of obtaining each group of second encrypted component and second encrypted squared component, so it is not repeated here.

To determine the encrypted distance between the first vector and the second vector, in this embodiment of the present invention each encrypted sub-distance may be determined from each group of first encrypted component, first encrypted squared component, second encrypted component, and second encrypted squared component. Specifically, for each group, first determine the target sum of the first encrypted squared component and the second encrypted squared component in the group; then determine the target product of the first encrypted component, the second encrypted component, and a preset value, which in this embodiment is 2; finally, determine the target difference between the target sum and the target product, and take that target difference as the encrypted sub-distance for the group.
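In plaintext terms, the sub-distance described above is the expansion of a squared difference. Writing $a_i$ and $b_i$ for the $i$-th components of the first and second vectors (symbols introduced here only for illustration; the source does not name them), the target sum minus the target product is:

```latex
d_i \;=\; \underbrace{a_i^{2} + b_i^{2}}_{\text{target sum}} \;-\; \underbrace{2\,a_i\,b_i}_{\text{target product}} \;=\; (a_i - b_i)^{2}
```

For instance, with $a_i = 5$ and $b_i = 3$, the target sum is $25 + 9 = 34$, the target product is $2 \cdot 5 \cdot 3 = 30$, and the sub-distance is $34 - 30 = 4 = (5 - 3)^2$. The embodiment performs the same arithmetic on the encrypted components.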

After the encrypted sub-distance of each group is determined, the encrypted distance between the first vector and the second vector can be determined from the sum of the sub-distances.
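The following Python sketch shows one way the encrypted distance could be computed end to end. It rests on several assumptions that are not in the source: the source does not name a homomorphic scheme, so a toy Paillier scheme with tiny, insecure primes is used; components are integers (real-valued embeddings would need fixed-point scaling); and because Paillier is only additively homomorphic, the cross term is formed by scaling the first device's encrypted component by the second device's plaintext component rather than by multiplying two ciphertexts. All function names are hypothetical.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key pair from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)   # public modulus, private key

def encrypt(n, m):
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    return (((pow(c, lam, n * n) - 1) // n) * mu) % n

def add(n, c1, c2):
    """Ciphertext of the sum of the two underlying plaintexts."""
    return (c1 * c2) % (n * n)

def scale(n, c, k):
    """Ciphertext of k times the underlying plaintext (k may be negative)."""
    return pow(c, k % n, n * n)

def encrypt_with_squares(n, vec):
    """Encrypt each component followed by its square (adjacent-insertion rule)."""
    out = []
    for x in vec:
        out += [encrypt(n, x), encrypt(n, x * x)]
    return out

def encrypted_distance(n, enc_first, second_plain):
    """Second device: sum of E(a^2) + E(b^2) - 2ab over all groups, still encrypted."""
    total = encrypt(n, 0)
    for i, b in enumerate(second_plain):
        enc_a, enc_a2 = enc_first[2 * i], enc_first[2 * i + 1]
        sub = add(n, enc_a2, encrypt(n, b * b))      # target sum
        sub = add(n, sub, scale(n, enc_a, -2 * b))   # minus target product
        total = add(n, total, sub)
    return total

# First device generates the key pair and encrypts its vector.
n, priv = keygen()
first_vec = [1, 2, 4, 5, 3]
second_vec = [1, 2, 4, 5, 1]
enc_first = encrypt_with_squares(n, first_vec)
# Second device computes the encrypted distance; first device decrypts it.
target_distance = decrypt(n, priv, encrypted_distance(n, enc_first, second_vec))
```

In this sketch the decrypted value is the squared Euclidean distance between the two integer vectors, and the first device never sees the second vector in the clear. A production system would use a vetted library and realistic key sizes instead of this toy arithmetic.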

Embodiment 19:
To determine whether the first data and the second data match, on the basis of the above embodiments, determining whether the first data and the second data match based on the target distance and the preset first distance threshold includes:
determining whether the target distance is less than the preset first distance threshold;
if so, determining that the first data matches the second data;
otherwise, determining that the first data does not match the second data.

To determine whether the first data and the second data match, in this embodiment of the present invention the target distance is compared with the preset first distance threshold. If the target distance is less than the preset first distance threshold, the first data and the second data are determined to match; if the target distance is not less than the preset first distance threshold, they are determined not to match. The preset first distance threshold may be 1, 1.5, and so on; it can be set as required. The smaller the target distance, the better the first vector and the second vector match.

To obtain the target distance between the first vector and the second vector determined from the first encrypted vector and the second encrypted vector, on the basis of the above embodiments, after it is determined that the first data matches the second data, the method further includes:
determining whether the target distance is equal to a preset second distance threshold, and if so, determining that the first data and the second data are identical.

In this embodiment of the present invention, if the target distance is equal to the preset second distance threshold, the first data and the second data are identical, that is, the first data completely matches the second data. The preset second distance threshold is less than the preset first distance threshold and is equal to 0.
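The two-threshold decision can be summarized in a short Python sketch. The function name `classify_match`, the returned labels, and the default first threshold of 1.0 are illustrative assumptions; the source only states that the first threshold is configurable (for example 1 or 1.5) and that the second threshold is 0.

```python
def classify_match(target_distance, first_threshold=1.0, second_threshold=0.0):
    """Return 'no match', 'identical', or 'match' for a decrypted target distance."""
    if target_distance >= first_threshold:
        return "no match"    # not less than the first distance threshold
    if target_distance == second_threshold:
        return "identical"   # equal to the second distance threshold (0)
    return "match"           # fuzzy match: below the first threshold, but not 0

results = [classify_match(d) for d in (0.0, 0.4, 1.0, 2.5)]
```

Checking the first threshold before the second mirrors the order in the text: the equality test is applied only after the data has already been found to match.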

Embodiment 20:
Figure 8 is a schematic structural diagram of a data matching apparatus provided by some embodiments of the present invention. The apparatus includes:
a first acquisition module 801, configured to input the first data to be matched into the pre-trained vector transformation model and obtain the first vector corresponding to the first data;
a first processing module 802, configured to homomorphically encrypt the first vector with a self-generated first target public key to generate the first encrypted vector, and to send the first target public key to the second device;
the first acquisition module 801, further configured to obtain the encrypted distance between the first vector and the second vector determined from the first encrypted vector and the second encrypted vector, where the second encrypted vector is obtained by homomorphically encrypting the second vector with the first target public key, and the second vector is obtained by inputting the second data into the pre-trained vector transformation model in the second device;
a first determination module 803, configured to determine the target distance between the first vector and the second vector based on the encrypted distance between the first vector and the second vector and on the first target private key corresponding to the first target public key, and to determine, based on the target distance and the preset first distance threshold, whether the first data and the second data match.

In a possible implementation, the first acquisition module 801 is specifically configured to: determine the first target data type corresponding to the first data; determine, from the first target data type and a pre-saved correspondence between data types and pre-trained vector transformation models, the pre-trained first target vector transformation model corresponding to the first data; and input the first data into the pre-trained first target vector transformation model to obtain the first vector corresponding to the first data.

In a possible implementation, the first processing module 802 is specifically configured to send the first encrypted vector and the first target public key to the second device;
the first acquisition module 801 is specifically configured to receive the encrypted distance between the first vector and the second vector that the second device determines from the first encrypted vector and the second encrypted vector and then sends, where the second encrypted vector is obtained by the second device homomorphically encrypting the second vector with the first target public key.

In a possible implementation, the first processing module 802 is further configured to: receive the third encrypted vector sent by the second device and the second target public key generated by the second device, where the third encrypted vector is obtained by the second device homomorphically encrypting the second vector with the second target public key; homomorphically encrypt the first vector with the second target public key to generate the fourth encrypted vector; and determine the encrypted distance between the second vector and the first vector from the third encrypted vector and the fourth encrypted vector, and send that encrypted distance to the second device, so that the second device decrypts the encrypted distance with the second target private key corresponding to the second target public key to determine the target distance between the second vector and the first vector, and determines, based on that target distance and the preset first distance threshold, whether the first data and the second data match.

In a possible implementation, the first processing module 802 is further configured to send the target distance between the first vector and the second vector to the second device, so that the second device determines, based on the target distance and the preset first distance threshold, whether the first data and the second data match.

In a possible implementation, the first acquisition module 801 is specifically configured to decrypt the encrypted distance between the first vector and the second vector using the first target private key corresponding to the first target public key generated by the first device itself, thereby determining the target distance between the first vector and the second vector.

In a possible implementation, the first acquisition module 801 is specifically configured to: for each first sub-data item in the first data, input the first sub-data item into the pre-trained vector conversion model to obtain the first sub-vector corresponding to that item, where every first sub-vector has the first preset length; and concatenate the first sub-vectors of all first sub-data items to obtain the first vector corresponding to the first data.
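The sub-vector concatenation described above can be sketched as follows. This is a minimal illustration, not the patented model: `toy_vector_model`, `build_vector` and the value of `SUB_VECTOR_LENGTH` are hypothetical names and choices standing in for the pre-trained vector conversion model and the first preset length.

```python
from typing import Callable, List, Sequence

SUB_VECTOR_LENGTH = 4  # hypothetical value for the "first preset length"


def toy_vector_model(sub_data: str) -> List[int]:
    """Hypothetical stand-in for the pre-trained vector conversion model.

    It folds character codes into a fixed number of buckets so that every
    sub-data item yields a sub-vector of the same length.
    """
    vec = [0] * SUB_VECTOR_LENGTH
    for i, ch in enumerate(sub_data):
        vec[i % SUB_VECTOR_LENGTH] += ord(ch)
    return vec


def build_vector(sub_items: Sequence[str],
                 model: Callable[[str], List[int]] = toy_vector_model) -> List[int]:
    """Convert each sub-data item to a sub-vector and concatenate them in order."""
    vector: List[int] = []
    for item in sub_items:
        sub_vec = model(item)
        if len(sub_vec) != SUB_VECTOR_LENGTH:
            raise ValueError("every sub-vector must have the preset length")
        vector.extend(sub_vec)
    return vector


# Example: a record with two sub-data items gives a vector of length 2 * 4.
first_vector = build_vector(["Alice", "Shanghai"])
```

Because every sub-vector has the same length, both devices can locate the components that belong to a given sub-data item by position alone.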

In a possible implementation, the first processing module 802 is specifically configured to: for each first component of the first vector, determine the first square component corresponding to that component; insert each first square component into the first vector according to a preset insertion rule, and take the vector obtained after the insertion as the updated first vector; and homomorphically encrypt each first component and each first square component of the first vector separately with the first target public key to generate the first encrypted vector.
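As an illustration of the square-component insertion, the sketch below assumes one particular insertion rule: each square is placed immediately after its component. The source only says the rule is preset, so this interleaved layout and the name `insert_squares` are assumptions made for the example.

```python
from typing import List


def insert_squares(vector: List[int]) -> List[int]:
    """Return the updated vector with each component followed by its square.

    Assumed insertion rule: [x1, x1^2, x2, x2^2, ...]. Any rule works as long
    as both devices apply the same one.
    """
    updated: List[int] = []
    for component in vector:
        updated.append(component)              # the component itself
        updated.append(component * component)  # its square component
    return updated


updated_first_vector = insert_squares([1, 2, 3])
```

Each element of the updated vector would then be encrypted separately with the first target public key.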

In a possible implementation, the first acquisition module 801 is specifically configured to: obtain, according to the preset insertion rule and each first component in the first encrypted vector, each group consisting of a first encrypted component and a first encrypted square component; obtain, according to the preset insertion rule and each second component in the second encrypted vector, each group consisting of a second encrypted component and a second encrypted square component; determine, according to the preset insertion rule, each corresponding group of first encrypted component, first encrypted square component, second encrypted component and second encrypted square component; determine each encrypted sub-distance from each such group; and determine the encrypted distance between the first vector and the second vector from the sum of the sub-distances.
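The grouping and summation above can be sketched with a toy additively homomorphic scheme. The example makes several assumptions that this passage does not state: it uses a textbook Paillier construction with tiny primes (insecure, for illustration only), takes the distance to be the squared Euclidean distance, uses the interleaved layout `[c, c^2, ...]` as the insertion rule, and lets the device doing the computation use its own plaintext components for the cross term, since an additive scheme cannot multiply two ciphertexts. All function names are hypothetical.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair; returns (public n, private (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m under public key n, using generator g = n + 1."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (1 + (m % n) * n) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Recover the plaintext (mod n) from ciphertext c."""
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n


def encrypt_with_squares(n, vector):
    """Encrypt each component and its square, interleaved as [c, c^2, ...]."""
    out = []
    for c in vector:
        out += [encrypt(n, c), encrypt(n, c * c)]
    return out


def encrypted_distance(n, enc_first, enc_second, second_plain):
    """Sum encrypted sub-distances Enc(x^2 + y^2 - 2xy) over all groups."""
    n2 = n * n
    total = encrypt(n, 0)
    for i, y in enumerate(second_plain):
        enc_x, enc_x2 = enc_first[2 * i], enc_first[2 * i + 1]
        enc_y2 = enc_second[2 * i + 1]
        cross = pow(enc_x, (-2 * y) % n, n2)    # Enc(-2 * x * y)
        sub = enc_x2 * enc_y2 % n2 * cross % n2  # one encrypted sub-distance
        total = total * sub % n2                 # homomorphic addition
    return total


n, priv = keygen()                 # first device: public key n, private key
x, y = [3, 5, 2], [1, 5, 6]        # first and second vectors
enc_d = encrypted_distance(n, encrypt_with_squares(n, x),
                           encrypt_with_squares(n, y), y)
target_distance = decrypt(n, priv, enc_d)  # only the key owner can do this
```

In this sketch the computing device never sees the other party's plaintext components, and only the holder of the private key learns the resulting distance.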

In a possible implementation, the first determination module 803 is specifically configured to determine whether the target distance is less than the preset first distance threshold; if so, to determine that the first data matches the second data; otherwise, to determine that the first data does not match the second data.

In a possible implementation, the first determination module 803 is further configured to determine whether the target distance is equal to a preset second distance threshold and, if so, to determine that the first data and the second data are identical.
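The two threshold checks can be combined in a small sketch. The function name `decide`, the returned tuple and the default second threshold of 0 are assumptions made for illustration; the source only states that both thresholds are preset.

```python
from typing import Tuple


def decide(target_distance: float,
           first_threshold: float,
           second_threshold: float = 0.0) -> Tuple[bool, bool]:
    """Return (matched, identical) for a decrypted target distance.

    matched:   the distance is strictly below the first distance threshold.
    identical: the distance equals the second distance threshold
               (assumed to be 0 here, meaning the vectors coincide).
    """
    matched = target_distance < first_threshold
    identical = target_distance == second_threshold
    return matched, identical


fuzzy = decide(20, first_threshold=25)   # close enough, but not the same
exact = decide(0, first_threshold=25)    # distance zero
miss = decide(30, first_threshold=25)    # too far apart
```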

Embodiment 21: Figure 9 is a schematic structural diagram of a data matching apparatus provided by some embodiments of the present invention. The apparatus includes: a second acquisition module 901, configured to input second data to be matched into a pre-trained vector conversion model to obtain a second vector corresponding to the second data; a second processing module 902, configured to receive a first target public key sent by a first device and to homomorphically encrypt the second vector with the first target public key to generate a second encrypted vector; the second acquisition module 901, further configured to obtain a target distance between a first vector and the second vector determined based on a first encrypted vector and the second encrypted vector, where the first encrypted vector is obtained by encrypting the first vector with the first target public key, and the first vector is obtained by inputting first data into the pre-trained vector conversion model in the first device; and a second determination module 903, configured to determine, based on the target distance and a preset first distance threshold, whether the first data and the second data match.

In a possible implementation, the second acquisition module 901 is specifically configured to: determine a second target data type corresponding to the second data; determine, according to the second target data type and a pre-stored correspondence between data types and pre-trained vector conversion models, the pre-trained second target vector conversion model corresponding to the second data; and input the second data into the pre-trained second target vector conversion model to obtain the second vector corresponding to the second data.
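Selecting a model by data type amounts to a lookup in a stored mapping. The sketch below is only illustrative: the two data types, the type-detection heuristic and the toy models (`name_model`, `amount_model`) are hypothetical and stand in for whatever types and pre-trained models a real deployment stores.

```python
from typing import Callable, Dict, List


def name_model(text: str) -> List[int]:
    """Hypothetical model for name-like data."""
    return [len(text), sum(map(ord, text)) % 97]


def amount_model(text: str) -> List[int]:
    """Hypothetical model for numeric amounts (stored in cents)."""
    return [round(float(text) * 100), 0]


# Pre-stored correspondence between data types and vector conversion models.
MODEL_BY_TYPE: Dict[str, Callable[[str], List[int]]] = {
    "name": name_model,
    "amount": amount_model,
}


def detect_data_type(data: str) -> str:
    """Toy heuristic: anything that parses as a number is an amount."""
    try:
        float(data)
        return "amount"
    except ValueError:
        return "name"


def to_vector(data: str) -> List[int]:
    """Determine the data type, pick the matching model, and apply it."""
    model = MODEL_BY_TYPE[detect_data_type(data)]
    return model(data)


second_vector = to_vector("12.50")
```

Both devices would need to hold the same correspondence so that data of the same type is mapped into the same vector space.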

In a possible implementation, the second processing module 902 is specifically configured to receive the first target public key and the first encrypted vector sent by the first device, where the first encrypted vector is obtained by homomorphically encrypting the first vector with the first target public key; the second acquisition module 901 is specifically configured to: determine, according to the first encrypted vector and the second encrypted vector, the encrypted distance between the first vector and the second vector; send that encrypted distance to the first device; and receive the target distance sent by the first device, where the target distance is obtained by the first device decrypting the encrypted distance with the first target private key corresponding to the first target public key that the first device itself generated.

In a possible implementation, the second acquisition module 901 is specifically configured to: send the second encrypted vector to the first device, so that the first device determines, based on the second encrypted vector and the first encrypted vector, the encrypted distance between the first encrypted vector and the second encrypted vector; and receive the target distance between the first vector and the second vector sent by the first device, where the target distance is obtained by the first device decrypting that encrypted distance with the first target private key corresponding to the first target public key that the first device itself generated.

In a possible implementation, the second acquisition module 901 is specifically configured to: homomorphically encrypt the second vector with a second target public key that it generates itself to obtain a third encrypted vector, and send the second target public key and the third encrypted vector to the first device; receive, from the first device, the encrypted distance between the second vector and the first vector determined based on the third encrypted vector and a fourth encrypted vector, where the fourth encrypted vector is generated by homomorphically encrypting the first vector with the second target public key; determine the target distance between the first vector and the second vector based on that encrypted distance and the second target private key corresponding to the second target public key; and determine, based on the target distance and the preset first distance threshold, whether the first data and the second data match.

In a possible implementation, the second acquisition module 901 is specifically configured to: for each second sub-data item in the second data, input the second sub-data item into the pre-trained vector conversion model to obtain the second sub-vector corresponding to that item, where every second sub-vector has the first preset length; and concatenate the second sub-vectors of all second sub-data items to obtain the second vector corresponding to the second data.

In a possible implementation, the second processing module 902 is specifically configured to: for each second component of the second vector, determine the second square component corresponding to that component; insert each second square component into the second vector according to the preset insertion rule, and take the vector obtained after the insertion as the updated second vector; and homomorphically encrypt each second component and each second square component of the second vector separately with the first target public key to generate the second encrypted vector.

In a possible implementation, the second acquisition module 901 is specifically configured to: obtain, according to the preset insertion rule and each second component in the second encrypted vector, each group consisting of a second encrypted component and a second encrypted square component; obtain, according to the preset insertion rule and each first component in the first encrypted vector, each group consisting of a first encrypted component and a first encrypted square component; determine, according to the preset insertion rule, each corresponding group of first encrypted component, first encrypted square component, second encrypted component and second encrypted square component; determine each encrypted sub-distance from each such group; and determine the encrypted distance between the first vector and the second vector from the sum of the sub-distances.

In a possible implementation, the second determination module 903 is specifically configured to determine whether the target distance is less than the preset first distance threshold; if so, to determine that the first data matches the second data; otherwise, to determine that the first data does not match the second data.

In a possible implementation, the second determination module 903 is further configured to determine whether the target distance is equal to a preset second distance threshold and, if so, to determine that the first data and the second data are identical.

Embodiment 22: On the basis of the above embodiments, some embodiments of the present invention further provide an electronic device which, as shown in Figure 10, includes a processor 1001, a communication interface 1002, a memory 1003 and a communication bus 1004, where the processor 1001, the communication interface 1002 and the memory 1003 communicate with one another through the communication bus 1004.

The memory 1003 stores a computer program which, when executed by the processor 1001, causes the processor 1001 to perform the following steps: input first data to be matched into a pre-trained vector conversion model to obtain a first vector corresponding to the first data; homomorphically encrypt the first vector with a self-generated first target public key to generate a first encrypted vector, and send the first target public key to a second device; obtain the encrypted distance between the first vector and a second vector determined based on the first encrypted vector and a second encrypted vector, where the second encrypted vector is obtained by homomorphically encrypting the second vector with the first target public key, and the second vector is obtained by inputting second data into the pre-trained vector conversion model in the second device; determine the target distance between the first vector and the second vector based on the encrypted distance and the first target private key corresponding to the first target public key; and determine, based on the target distance and a preset first distance threshold, whether the first data and the second data match.

Further, the processor 1001 is also configured to: determine a first target data type corresponding to the first data; determine, according to the first target data type and a pre-stored correspondence between data types and pre-trained vector conversion models, the pre-trained first target vector conversion model corresponding to the first data; and input the first data into the pre-trained first target vector conversion model to obtain the first vector corresponding to the first data.

Further, the processor 1001 is also configured to: receive the second encrypted vector sent by the second device, where the second encrypted vector is obtained by the second device homomorphically encrypting the second vector with the first target public key; and determine, based on the first encrypted vector and the second encrypted vector, the encrypted distance between the first vector and the second vector.

Further, the processor 1001 is also configured to: send the first encrypted vector and the first target public key to the second device; and receive, from the second device, the encrypted distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector, where the second encrypted vector is obtained by the second device homomorphically encrypting the second vector with the first target public key.

Further, the processor 1001 is also configured to: receive a third encrypted vector sent by the second device and a second target public key generated by the second device, where the third encrypted vector is obtained by the second device homomorphically encrypting the second vector with the second target public key; homomorphically encrypt the first vector with the second target public key to generate a fourth encrypted vector; determine, based on the third encrypted vector and the fourth encrypted vector, the encrypted distance between the second vector and the first vector; and send that encrypted distance to the second device, so that the second device decrypts it with the second target private key corresponding to the second target public key to determine the target distance between the second vector and the first vector, and determines, based on that target distance and the preset first distance threshold, whether the first data and the second data match.

Further, the processor 1001 is also configured to send the target distance between the first vector and the second vector to the second device, so that the second device determines, based on the target distance and the preset first distance threshold, whether the first data and the second data match.

Further, the processor 1001 is also configured to decrypt the encrypted distance between the first vector and the second vector using the first target private key corresponding to the first target public key generated by the first device itself, thereby determining the target distance between the first vector and the second vector.

Further, the processor 1001 is also configured to: for each first sub-data item in the first data, input the first sub-data item into the pre-trained vector conversion model to obtain the first sub-vector corresponding to that item, where every first sub-vector has the first preset length; and concatenate the first sub-vectors of all first sub-data items to obtain the first vector corresponding to the first data.

Further, the processor 1001 is also configured to: for each first component of the first vector, determine the first square component corresponding to that component; insert each first square component into the first vector according to a preset insertion rule, and take the vector obtained after the insertion as the updated first vector; and homomorphically encrypt each first component and each first square component of the first vector separately with the first target public key to generate the first encrypted vector.

Further, the processor 1001 is also configured to: obtain, according to the preset insertion rule and each first component in the first encrypted vector, each group consisting of a first encrypted component and a first encrypted square component; obtain, according to the preset insertion rule and each second component in the second encrypted vector, each group consisting of a second encrypted component and a second encrypted square component; determine, according to the preset insertion rule, each corresponding group of first encrypted component, first encrypted square component, second encrypted component and second encrypted square component; determine each encrypted sub-distance from each such group; and determine the encrypted distance between the first vector and the second vector from the sum of the sub-distances.

Further, the processor 1001 is also configured to determine whether the target distance is less than the preset first distance threshold; if so, to determine that the first data matches the second data; otherwise, to determine that the first data does not match the second data.

Further, the processor 1001 is also configured to determine whether the target distance is equal to a preset second distance threshold and, if so, to determine that the first data and the second data are identical.

The communication bus of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, it is drawn as a single thick line in the figure, but this does not mean that there is only one bus or only one type of bus.

The communication interface 1002 is used for communication between the above electronic device and other devices.

The memory may include Random Access Memory (RAM) and may also include Non-Volatile Memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

The above processor may be a general-purpose processor, including a central processing unit, a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.

Embodiment 23: On the basis of the above embodiments, some embodiments of the present invention further provide an electronic device which, as shown in Figure 11, includes a processor 1101, a communication interface 1102, a memory 1103 and a communication bus 1104, where the processor 1101, the communication interface 1102 and the memory 1103 communicate with one another through the communication bus 1104.

The memory 1103 stores a computer program which, when executed by the processor 1101, causes the processor 1101 to perform the following steps: input second data to be matched into a pre-trained vector conversion model to obtain a second vector corresponding to the second data; receive a first target public key sent by a first device, and homomorphically encrypt the second vector with the first target public key to generate a second encrypted vector; obtain a target distance between a first vector and the second vector determined based on a first encrypted vector and the second encrypted vector, where the first encrypted vector is obtained by encrypting the first vector with the first target public key, and the first vector is obtained by inputting first data into the pre-trained vector conversion model in the first device; and determine, based on the target distance and a preset first distance threshold, whether the first data and the second data match.

Further, the processor 1101 is also configured to: determine a second target data type corresponding to the second data; determine, according to the second target data type and a pre-stored correspondence between data types and pre-trained vector conversion models, the pre-trained second target vector conversion model corresponding to the second data; and input the second data into the pre-trained second target vector conversion model to obtain the second vector corresponding to the second data.

Further, the processor 1101 is also configured to: receive the first target public key and the first encrypted vector sent by the first device, where the first encrypted vector is obtained by homomorphically encrypting the first vector with the first target public key; determine, according to the first encrypted vector and the second encrypted vector, the encrypted distance between the first vector and the second vector; send that encrypted distance to the first device; and receive the target distance sent by the first device, where the target distance is obtained by the first device decrypting the encrypted distance with the first target private key corresponding to the first target public key that the first device itself generated.

Further, the processor 1101 is also configured to: send the second encrypted vector to the first device, so that the first device determines, based on the second encrypted vector and the first encrypted vector, the encrypted distance between the first encrypted vector and the second encrypted vector; and receive the target distance between the first vector and the second vector sent by the first device, where the target distance is obtained by the first device decrypting that encrypted distance with the first target private key corresponding to the first target public key that the first device itself generated.

Further, the processor 1101 is also configured to: homomorphically encrypt the second vector with a second target public key that it generates itself to obtain a third encrypted vector, and send the second target public key and the third encrypted vector to the first device; receive, from the first device, the encrypted distance between the second vector and the first vector determined based on the third encrypted vector and a fourth encrypted vector, where the fourth encrypted vector is generated by homomorphically encrypting the first vector with the second target public key; determine the target distance between the first vector and the second vector based on that encrypted distance and the second target private key corresponding to the second target public key; and determine, based on the target distance and the preset first distance threshold, whether the first data and the second data match.

Further, the processor 1101 is also configured to: for each second sub-data item in the second data, input the second sub-data item into the pre-trained vector conversion model to obtain the second sub-vector corresponding to that item, where every second sub-vector has the first preset length; and concatenate the second sub-vectors of all second sub-data items to obtain the second vector corresponding to the second data.

Further, the processor 1101 is also configured to: for each second component of the second vector, determine the second square component corresponding to that component; insert each second square component into the second vector according to the preset insertion rule, and take the vector obtained after the insertion as the updated second vector; and homomorphically encrypt each second component and each second square component of the second vector separately with the first target public key to generate the second encrypted vector.

進一步地,該處理器1101,還用於根據該預設的插入規則及該第二加密向量中的每個第二分量,獲取每組第二加密分量和第二加密平方分量;並根據該預設的插入規則及該第一加密向量中的每個第一分量,獲取每組第一加密分量和第一加密平方分量;根據該預設的插入規則,確定對應的每組第一加密分量、第一加密平方分量、第二加密分量和第二加密平方分量;根據每組第一加密分量、第一加密平方分量、第二加密分量和第二加密平方分量,確定加密後的每個子距離;根據每個子距離的和值,確定加密後的該第一向量和該第二向量的距離。Further, the processor 1101 is also configured to obtain each group of a second encrypted component and a second encrypted square component according to the preset insertion rule and each second component in the second encrypted vector; to obtain each group of a first encrypted component and a first encrypted square component according to the preset insertion rule and each first component in the first encrypted vector; to determine, according to the preset insertion rule, each corresponding group of a first encrypted component, a first encrypted square component, a second encrypted component and a second encrypted square component; to determine each encrypted sub-distance from each such group; and to determine the encrypted distance between the first vector and the second vector from the sum of the sub-distances.

進一步地,該處理器1101,還用於確定該目標距離是否小於預設的第一距離閾值;若是,則確定該第一資料與該第二資料匹配;否則,確定該第一資料與該第二資料不匹配。Further, the processor 1101 is also configured to determine whether the target distance is less than a preset first distance threshold; if so, to determine that the first data matches the second data; otherwise, to determine that the first data does not match the second data.

進一步地,該處理器1101,還用於確定該目標距離是否等於預設的第二距離閾值,若是,則確定該第一資料與該第二資料相同。Further, the processor 1101 is also used to determine whether the target distance is equal to a preset second distance threshold, and if so, determine that the first data and the second data are the same.

上述伺服器提到的通信匯流排可以是外設部件互連標準(Peripheral Component Interconnect,PCI)匯流排或延伸工業標準架構(Extended Industry Standard Architecture,EISA)匯流排等。該通信匯流排可以分為位址匯流排、資料匯流排、控制匯流排等。為便於表示,圖中僅用一條粗線表示,但並不表示僅有一根匯流排或一種類型的匯流排。The communication bus mentioned in the above-mentioned server can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The communication bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.

通信介面1102用於上述電子設備與其他設備之間的通信。The communication interface 1102 is used for communication between the above-mentioned electronic device and other devices.

記憶體可以包括隨機存取記憶體(Random Access Memory,RAM),也可以包括非易失性記憶體(Non-Volatile Memory,NVM),例如至少一個磁碟記憶體。可選地,記憶體還可以是至少一個位於遠離前述處理器的存儲裝置。The memory may include random access memory (Random Access Memory, RAM) or non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located far away from the aforementioned processor.

上述處理器可以是通用處理器,包括中央處理器、網路處理器(Network Processor,NP)等;還可以是數字指令處理器(Digital Signal Processing,DSP)、專用積體電路、現場可程式設計門陳列或者其他可程式設計邏輯器件、分立門或者電晶體邏輯器件、分立硬體元件等。The above processor may be a general-purpose processor, including a central processing unit, a network processor (Network Processor, NP), and the like; it may also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.

實施例24: 在上述各實施例的基礎上,本發明實施例還提供了一種電腦可讀存儲介質,該電腦可讀存儲介質內存儲有可由電子設備執行的電腦程式,當該程式在該電子設備上運行時,使得該電子設備執行時實現如下步驟: 將待匹配的第一資料登錄到預先訓練完成的向量轉化模型中,獲得該第一資料對應的第一向量; 採用自身生成的第一目標公開金鑰對該第一向量進行同態加密生成第一加密向量,並將該第一目標公開金鑰發送給第二設備; 獲取基於該第一加密向量和第二加密向量確定的加密後的該第一向量和第二向量的距離,其中該第二加密向量為採用該第一目標公開金鑰對該第二向量進行同態加密後得到的;該第二向量為將第二資料登錄到該第二設備中的預先訓練完成的向量轉化模型中獲得的; 基於該加密後的該第一向量和第二向量的距離及該第一目標公開金鑰對應的第一目標私密金鑰,確定該第一向量和該第二向量的目標距離,基於該目標距離以及預設的第一距離閾值,確定該第一資料以及該第二資料是否匹配。 Example 24: Based on the above embodiments, embodiments of the present invention also provide a computer-readable storage medium. The computer-readable storage medium stores a computer program that can be executed by an electronic device. When the program is run on the electronic device, , so that the electronic device can implement the following steps when executing: Log the first data to be matched into the pre-trained vector transformation model to obtain the first vector corresponding to the first data; Use the self-generated first target public key to perform homomorphic encryption on the first vector to generate a first encryption vector, and send the first target public key to the second device; Obtain the distance between the encrypted first vector and the second vector determined based on the first encrypted vector and the second encrypted vector, wherein the second encrypted vector is obtained by synchronizing the second vector with the first target public key. obtained after state encryption; the second vector is obtained by logging the second data into the pre-trained vector transformation model in the second device; Based on the encrypted distance between the first vector and the second vector and the first target private key corresponding to the first target public key, determine the target distance between the first vector and the second vector, based on the target distance and a preset first distance threshold to determine whether the first data and the second data match.

進一步地,該將待匹配的第一資料登錄到預先訓練完成的向量轉化模型中,獲得該第一資料對應的第一向量包括: 確定該第一資料對應的第一目標資料類型; 根據該第一目標資料類型以及預先保存的資料類型和預先訓練完成的向量轉化模型的對應關係,確定該第一資料對應的預先訓練完成的第一目標向量轉化模型; 將該第一資料登錄到該預先訓練完成的第一目標向量轉化模型中,獲得該第一資料對應的該第一向量。 Further, logging the first data to be matched into the pre-trained vector transformation model and obtaining the first vector corresponding to the first data includes: Determine the first target data type corresponding to the first data; Determine the pre-trained first target vector transformation model corresponding to the first data according to the corresponding relationship between the first target data type and the pre-saved data type and the pre-trained vector transformation model; Log the first data into the pre-trained first target vector transformation model to obtain the first vector corresponding to the first data.

進一步地,該第一目標資料類型為文字類型或數字類型。Further, the first target data type is a text type or a numeric type.

進一步地,若該第一目標資料類型為文字類型,對應的預先訓練完成的第一目標向量轉化模型為詞向量模型或句向量模型;若該第一目標資料類型為數字類型,對應的預先訓練完成的第一目標向量轉化模型為獨熱編碼模型。Further, if the first target data type is a text type, the corresponding pre-trained first target vector conversion model is a word vector model or a sentence vector model; if the first target data type is a numeric type, the corresponding pre-trained The completed first target vector conversion model is a one-hot encoding model.
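The type-based model selection described above can be sketched as a lookup from data type to encoder. This is only an illustration: `detect_type`, `MODEL_BY_TYPE` and `toy_text_embedding` are hypothetical names, the digit-by-digit one-hot layout is an assumption, and the text encoder is a placeholder standing in for a trained word-vector or sentence-vector model, which the source does not specify further.

```python
from typing import Callable, Dict, List

DIGITS = "0123456789"


def one_hot_digits(value: str) -> List[float]:
    """One-hot encode each decimal digit as a 10-dimensional block (assumed layout)."""
    vector: List[float] = []
    for ch in value:
        if ch not in DIGITS:
            raise ValueError(f"not a decimal digit: {ch!r}")
        block = [0.0] * 10
        block[DIGITS.index(ch)] = 1.0
        vector.extend(block)
    return vector


def toy_text_embedding(value: str, dim: int = 8) -> List[float]:
    """Placeholder for a trained word/sentence vector model (hypothetical stand-in)."""
    vector = [0.0] * dim
    for i, ch in enumerate(value):
        vector[i % dim] += ord(ch) / 1000.0
    return vector


# Pre-saved correspondence between data type and vector transformation model.
MODEL_BY_TYPE: Dict[str, Callable[[str], List[float]]] = {
    "numeric": one_hot_digits,
    "text": toy_text_embedding,
}


def detect_type(value: str) -> str:
    """Classify the data as numeric type or text type."""
    if value and all(ch in DIGITS for ch in value):
        return "numeric"
    return "text"


def to_vector(value: str) -> List[float]:
    """Pick the target model for the data type, then vectorize the data."""
    return MODEL_BY_TYPE[detect_type(value)](value)
```

For example, `to_vector("20")` produces two 10-dimensional one-hot blocks, while any non-numeric string is routed to the text encoder.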

進一步地,該獲取基於該第一加密向量和第二加密向量確定的加密後的該第一向量和第二向量的距離包括: 接收該第二設備發送的該第二加密向量,其中,該第二加密向量為該第二設備基於該第一目標公開金鑰對該第二向量進行同態加密後得到的; 基於該第一加密向量以及該第二加密向量,確定加密後的該第一向量和該第二向量的距離。 Further, obtaining the encrypted distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes: Receive the second encryption vector sent by the second device, wherein the second encryption vector is obtained by the second device after homomorphically encrypting the second vector based on the first target public key; Based on the first encrypted vector and the second encrypted vector, the distance between the encrypted first vector and the second vector is determined.

進一步地,該將該第一目標公開金鑰發送給第二設備包括: 將該第一加密向量以及該第一目標公開金鑰發送給該第二設備; 該獲取基於該第一加密向量和第二加密向量確定的加密後的該第一向量和第二向量的距離包括: 接收該第二設備發送的基於該第一加密向量和該第二加密向量確定的加密後的該第一向量和該第二向量的距離,其中,該第二加密向量為該第二設備基於該第一目標公開金鑰對該第二向量進行同態加密後得到的。 Further, sending the first target public key to the second device includes: Send the first encryption vector and the first target public key to the second device; The obtaining the encrypted distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes: Receive the encrypted distance between the first vector and the second vector sent by the second device and determined based on the first encrypted vector and the second encrypted vector, where the second encrypted vector is determined by the second device based on the The first target public key is obtained by homomorphically encrypting the second vector.

進一步地,該方法還包括: 接收該第二設備發送的第三加密向量以及該第二設備生成的第二目標公開金鑰;其中,該第三加密向量為該第二設備採用該第二目標公開金鑰對該第二向量進行同態加密後得到的; 基於該第二目標公開金鑰對該第一向量進行同態加密生成第四加密向量; 基於該第三加密向量以及該第四加密向量,確定加密後的該第二向量和該第一向量的距離,並將該加密後的該第二向量和該第一向量的距離發送給該第二設備,以使該第二設備根據加密後的該第二向量和該第一向量的距離以及該第二目標公開金鑰對應的第二目標私密金鑰,對該加密後的該第二向量和該第一向量的距離進行解密,確定該第二向量與該第一向量的目標距離,並根據該第二向量與該第一向量的目標距離及預設的第一距離閾值,確定該第一資料以及該第二資料是否匹配。 Further, the method also includes: Receive the third encryption vector sent by the second device and the second target public key generated by the second device; wherein the third encryption vector is the second device using the second target public key to encrypt the second vector. Obtained after homomorphic encryption; Perform homomorphic encryption on the first vector based on the second target public key to generate a fourth encrypted vector; Based on the third encrypted vector and the fourth encrypted vector, determine the distance between the encrypted second vector and the first vector, and send the encrypted distance between the second vector and the first vector to the third encrypted vector. Two devices, so that the second device calculates the encrypted second vector based on the distance between the encrypted second vector and the first vector and the second target private key corresponding to the second target public key. Decrypt the distance from the first vector to determine the target distance between the second vector and the first vector, and determine the third vector based on the target distance between the second vector and the first vector and the preset first distance threshold. Whether the first data and the second data match.

進一步地,該確定該第一向量和該第二向量的目標距離之後,該方法還包括: 將該第一向量和該第二向量的目標距離發送給該第二設備,使該第二設備基於該目標距離以及預設的第一距離閾值,確定該第一資料以及該第二資料是否匹配。 Further, after determining the target distance between the first vector and the second vector, the method further includes: The target distance of the first vector and the second vector is sent to the second device, so that the second device determines whether the first data and the second data match based on the target distance and the preset first distance threshold. .

進一步地,該基於該加密後的該第一向量和第二向量的距離及該第一目標公開金鑰對應的第一目標私密金鑰,確定該第一向量和第二向量的目標距離包括: 採用第一設備自身生成的該第一目標公開金鑰對應的第一目標私密金鑰對該加密後的該第一向量和第二向量的距離進行解密,確定該第一向量和該第二向量的目標距離。 Further, determining the target distance between the first vector and the second vector based on the encrypted distance between the first vector and the second vector and the first target private key corresponding to the first target public key includes: Use the first target private key corresponding to the first target public key generated by the first device itself to decrypt the distance between the encrypted first vector and the second vector to determine the first vector and the second vector. target distance.

進一步地,該將待匹配的第一資料登錄到預先訓練完成的向量轉化模型中,獲得該第一資料對應的第一向量包括: 針對該第一資料中的每個第一子資料,將該第一子資料登錄到預先訓練完成的向量轉化模型中,獲得該第一子資料對應的第一子向量;其中,每個第一子資料對應的第一子向量的長度均為第一預設長度; 將該每個第一子資料對應的第一子向量進行拼接,得到該第一資料對應的該第一向量。 Further, logging the first data to be matched into the pre-trained vector transformation model and obtaining the first vector corresponding to the first data includes: For each first sub-data in the first data, log the first sub-data into the pre-trained vector transformation model to obtain the first sub-vector corresponding to the first sub-data; wherein, each first sub-data is The length of the first sub-vector corresponding to the sub-data is the first preset length; The first sub-vectors corresponding to each first sub-data are spliced to obtain the first vector corresponding to the first data.

進一步地,該第一向量和該第二向量的長度均為第二預設長度。Further, the lengths of the first vector and the second vector are both second preset lengths.
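A minimal sketch of the concatenation step follows, assuming that a joined vector shorter than the second preset length is brought up to that length by zero padding. The source only states that both vectors have the second preset length; the padding strategy and the function name `concat_sub_vectors` are assumptions made for illustration.

```python
from typing import List, Sequence


def concat_sub_vectors(
    sub_vectors: Sequence[Sequence[float]],
    sub_len: int,      # first preset length (per sub-vector)
    total_len: int,    # second preset length (whole vector)
    pad_value: float = 0.0,
) -> List[float]:
    """Concatenate fixed-length sub-vectors and pad to the overall preset length."""
    for sv in sub_vectors:
        if len(sv) != sub_len:
            raise ValueError("every sub-vector must have the first preset length")
    joined = [component for sv in sub_vectors for component in sv]
    if len(joined) > total_len:
        raise ValueError("concatenated vector exceeds the second preset length")
    # Assumed padding rule: fill the tail with a constant value.
    return joined + [pad_value] * (total_len - len(joined))
```

Fixing both lengths up front means the two devices always compare vectors of equal dimension, which the component-wise distance computation requires.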

進一步地,該採用自身生成的第一目標公開金鑰對該第一向量進行同態加密生成第一加密向量包括: 針對該第一向量中的每個第一分量,確定該第一分量對應的第一平方分量; 將每個第一分量對應的第一平方分量按照預設的插入規則插入到該第一向量中,並將插入第一平方分量後獲得的向量更新為該第一向量; 基於該第一目標公開金鑰對該第一向量中的每個第一分量及每個第一平方分量分別進行同態加密,生成該第一加密向量。 Further, using the self-generated first target public key to homomorphically encrypt the first vector to generate the first encrypted vector includes: For each first component in the first vector, determine the first square component corresponding to the first component; Insert the first square component corresponding to each first component into the first vector according to the preset insertion rules, and update the vector obtained after inserting the first square component to the first vector; Homomorphic encryption is performed on each first component and each first square component in the first vector based on the first target public key to generate the first encrypted vector.
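The insertion of square components can be illustrated as follows, assuming the preset insertion rule places each square immediately after its component. The actual rule is not given in this passage, so this layout and the names `insert_squares` and `split_pairs` are hypothetical; encryption of each entry is omitted here.

```python
from typing import List, Tuple


def insert_squares(vector: List[int]) -> List[int]:
    """Return [x1, x1^2, x2, x2^2, ...] (assumed insertion rule)."""
    extended: List[int] = []
    for component in vector:
        extended.append(component)
        extended.append(component * component)
    return extended


def split_pairs(extended: List[int]) -> List[Tuple[int, int]]:
    """Invert the rule: recover (component, square) groups from the extended vector."""
    if len(extended) % 2 != 0:
        raise ValueError("extended vector must hold (component, square) pairs")
    return [(extended[i], extended[i + 1]) for i in range(0, len(extended), 2)]
```

Because both devices apply the same rule, the receiver can regroup an encrypted vector into (component, square) pairs by position alone, without decrypting anything.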

進一步地,該獲取基於該第一加密向量和第二加密向量確定的加密後的該第一向量和第二向量的距離包括: 根據該預設的插入規則及該第一加密向量中的每個第一分量,獲取每組第一加密分量和第一加密平方分量;並根據該預設的插入規則及該第二加密向量中的每個第二分量,獲取每組第二加密分量和第二加密平方分量; 根據該預設的插入規則,確定對應的每組第一加密分量、第一加密平方分量、第二加密分量和第二加密平方分量; 根據每組第一加密分量、第一加密平方分量、第二加密分量和第二加密平方分量,確定加密後的每個子距離; 根據每個子距離的和值,確定加密後的該第一向量和該第二向量的距離。 Further, obtaining the encrypted distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes: According to the preset insertion rule and each first component in the first encryption vector, obtain each set of first encryption components and first encryption square components; and according to the preset insertion rule and the second encryption vector For each second component of , obtain each set of second encrypted components and second encrypted square components; According to the preset insertion rule, determine each corresponding group of first encrypted component, first encrypted square component, second encrypted component and second encrypted square component; Determine each encrypted sub-distance according to each group of the first encrypted component, the first encrypted square component, the second encrypted component and the second encrypted square component; The distance between the encrypted first vector and the second vector is determined based on the sum of each sub-distance.
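To make the sub-distance computation concrete, here is a sketch under two loudly stated assumptions: the homomorphic scheme is taken to be Paillier (additively homomorphic; the source does not name a scheme), and the party computing the distance is assumed to still hold its own plaintext components so that the cross term can be formed by scalar multiplication. Each sub-distance is then Enc(x²) + Enc(y²) + (-2y)·Enc(x) = Enc((x - y)²), and the encrypted distance is the homomorphic sum of the sub-distances. The primes are toy values, far too small for real use, and components are integers (real embeddings would need fixed-point scaling).

```python
import math
import secrets
from typing import List, Tuple

# Toy key material: two Mersenne primes. NOT secure; illustration only.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p: int = P, q: int = Q) -> Tuple[int, Tuple[int, int]]:
    """Return (public modulus n, private key (lam, mu)) with generator g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Paillier encryption; negative m is represented modulo n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return ((1 + (m % n) * n) % n2) * pow(r, n, n2) % n2


def decrypt(n: int, priv: Tuple[int, int], c: int) -> int:
    """Paillier decryption, mapping the upper half of the range to negatives."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    m = ((u - 1) // n) * mu % n
    return m - n if m > n // 2 else m


def he_add(n: int, c1: int, c2: int) -> int:
    """Enc(a) (+) Enc(b) = Enc(a + b)."""
    return c1 * c2 % (n * n)


def he_scale(n: int, c: int, k: int) -> int:
    """k (*) Enc(a) = Enc(k * a) for a plaintext integer k."""
    return pow(c, k % n, n * n)


def encrypted_sq_distance(
    n: int, enc_first: List[Tuple[int, int]], second_plain: List[int]
) -> int:
    """Sum encrypted sub-distances Enc((x_i - y_i)^2) without seeing any x_i."""
    total = encrypt(n, 0)
    for (enc_x, enc_x2), y in zip(enc_first, second_plain):
        sub = he_add(n, enc_x2, encrypt(n, y * y))       # x^2 + y^2
        sub = he_add(n, sub, he_scale(n, enc_x, -2 * y))  # ... - 2xy
        total = he_add(n, total, sub)
    return total
```

In use, the first device would call `keygen()`, send `n` together with the pairs `(encrypt(n, x), encrypt(n, x * x))`, receive the result of `encrypted_sq_distance`, and recover the squared Euclidean distance with `decrypt`.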

進一步地,該基於該目標距離以及預設的第一距離閾值,確定該第一資料以及該第二資料是否匹配包括: 確定該目標距離是否小於預設的第一距離閾值; 若是,則確定該第一資料與該第二資料匹配; 否則,確定該第一資料與該第二資料不匹配。 Further, determining whether the first data and the second data match based on the target distance and a preset first distance threshold includes: Determine whether the target distance is less than a preset first distance threshold; If so, it is determined that the first data matches the second data; Otherwise, it is determined that the first data does not match the second data.

進一步地,該確定該第一資料與該第二資料匹配之後,該方法還包括: 確定該目標距離是否等於預設的第二距離閾值,若是,則確定該第一資料與該第二資料相同。 Further, after determining that the first data matches the second data, the method further includes: It is determined whether the target distance is equal to a preset second distance threshold, and if so, it is determined that the first data and the second data are the same.
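The two threshold checks can be combined into one small decision function. This is a sketch: the function name, the string labels, and the default second threshold of 0 are assumptions, since the source only says that both thresholds are preset.

```python
def classify(distance: float, match_threshold: float, same_threshold: float = 0.0) -> str:
    """Apply the first threshold (fuzzy match), then the second (identical data)."""
    if not distance < match_threshold:
        return "no_match"
    # Only reached once the data is already considered a match.
    if distance == same_threshold:
        return "identical"
    return "match"
```

With distances recovered from integer or fixed-point encodings the equality test is exact; for floating-point distances a small tolerance would likely be needed instead.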

實施例25: 在上述各實施例的基礎上,本發明實施例還提供了一種電腦可讀存儲介質,該電腦可讀存儲介質內存儲有可由電子設備執行的電腦程式,當該程式在該電子設備上運行時,使得該電子設備執行時實現如下步驟: 將待匹配的第二資料登錄到預先訓練完成的向量轉化模型中,獲得該第二資料對應的第二向量; 接收第一設備發送的第一目標公開金鑰,採用該第一目標公開金鑰對該第二向量進行同態加密生成第二加密向量; 獲取基於第一加密向量和該第二加密向量確定的第一向量和該第二向量的目標距離,其中,該第一加密向量為採用該第一目標公開金鑰對該第一向量加密後得到的,該第一向量為將第一資料登錄到該第一設備中的預先訓練完成的向量轉化模型中獲得的; 基於該目標距離以及預設的第一距離閾值,確定該第一資料以及該第二資料是否匹配。 Example 25: Based on the above embodiments, embodiments of the present invention also provide a computer-readable storage medium. The computer-readable storage medium stores a computer program that can be executed by an electronic device. When the program is run on the electronic device, , so that the electronic device can implement the following steps when executing: Log the second data to be matched into the pre-trained vector transformation model to obtain the second vector corresponding to the second data; Receive the first target public key sent by the first device, and use the first target public key to perform homomorphic encryption on the second vector to generate a second encryption vector; Obtain the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector, wherein the first encrypted vector is obtained by encrypting the first vector using the first target public key. , the first vector is obtained by logging the first data into the pre-trained vector transformation model in the first device; Based on the target distance and the preset first distance threshold, it is determined whether the first data and the second data match.

進一步地,該將待匹配的第二資料登錄到預先訓練完成的向量轉化模型中,獲得該第二資料對應的第二向量包括: 確定該第二資料對應的第二目標資料類型; 根據該第二目標資料類型以及預先保存的資料類型和預先訓練完成的向量轉化模型的對應關係,確定該第二資料對應的預先訓練完成的第二目標向量轉化模型; 將該第二資料登錄到該預先訓練完成的第二目標向量轉化模型中,獲得該第二資料對應的該第二向量。 Further, logging the second data to be matched into the pre-trained vector transformation model, and obtaining the second vector corresponding to the second data includes: Determine the second target data type corresponding to the second data; According to the second target data type and the corresponding relationship between the pre-saved data type and the pre-trained vector transformation model, determine the pre-trained second target vector transformation model corresponding to the second data; Log the second data into the pre-trained second target vector transformation model to obtain the second vector corresponding to the second data.

進一步地,該第二目標資料類型為文字類型或數字類型。Further, the second target data type is a text type or a numeric type.

進一步地,若該第二目標資料類型為文字類型,對應的預先訓練完成的第二目標向量轉化模型為詞向量模型或句向量模型;若該第二目標資料類型為數字類型,對應的預先訓練完成的第二目標向量轉化模型為獨熱編碼模型。Further, if the second target data type is a text type, the corresponding pre-trained second target vector conversion model is a word vector model or a sentence vector model; if the second target data type is a numeric type, the corresponding pre-trained The completed second target vector transformation model is a one-hot encoding model.

進一步地,該接收第一設備發送的第一目標公開金鑰包括: 接收該第一設備發送的該第一目標公開金鑰以及該第一加密向量,其中,該第一加密向量為採用該第一目標公開金鑰對該第一向量進行同態加密後得到的; 該獲取基於第一加密向量和該第二加密向量確定的第一向量和該第二向量的目標距離包括: 根據該第一加密向量以及該第二加密向量,確定加密後的該第一向量和第二向量的距離; 將該加密後的該第一向量和第二向量的距離發送給該第一設備; 接收該第一設備發送的該目標距離,其中,該目標距離為該第一設備採用自身生成的該第一目標公開金鑰對應的第一目標私密金鑰對該加密後的該第一向量和第二向量的距離進行解密後得到的。 Further, receiving the first target public key sent by the first device includes: Receive the first target public key and the first encryption vector sent by the first device, wherein the first encryption vector is obtained by homomorphically encrypting the first vector using the first target public key; The obtaining the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes: Determine the distance between the encrypted first vector and the second vector according to the first encrypted vector and the second encrypted vector; Send the encrypted distance between the first vector and the second vector to the first device; Receive the target distance sent by the first device, where the target distance is the encrypted sum of the first vector using the first target private key corresponding to the first target public key generated by the first device itself. The distance of the second vector is obtained after decryption.

進一步地,該獲取基於第一加密向量和該第二加密向量確定的第一向量和該第二向量的目標距離包括: 向該第一設備發送該第二加密向量,以使該第一設備基於該第二加密向量以及該第一加密向量,確定加密後的該第一加密向量和該第二加密向量的距離; 接收該第一設備發送的該第一向量和第二向量的目標距離,其中,該目標距離為該第一設備基於自身生成的該第一目標公開金鑰對應的第一目標私密金鑰對該加密後的該第一加密向量和該第二加密向量的距離解密得到的。 Further, obtaining the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes: Send the second encryption vector to the first device, so that the first device determines the distance between the encrypted first encryption vector and the second encryption vector based on the second encryption vector and the first encryption vector; Receive the target distance of the first vector and the second vector sent by the first device, wherein the target distance is the first target private key corresponding to the first target public key generated by the first device based on itself. The distance between the encrypted first encrypted vector and the second encrypted vector is obtained by decrypting.

進一步地,該獲取基於第一加密向量和該第二加密向量確定的第一向量和該第二向量的目標距離包括: 採用自身生成的第二目標公開金鑰對該第二向量進行同態加密,獲得第三加密向量,並將該第二目標公開金鑰以及該第三加密向量發送給該第一設備; 接收該第一設備發送的基於該第三加密向量以及第四加密向量,確定加密後的該第二向量和該第一向量的距離,其中,該第四加密向量為採用該第二目標公開金鑰對該第一向量進行同態加密生成的; 基於該加密後的該第一向量和第二向量的距離及該第二目標公開金鑰對應的第二目標私密金鑰,確定該第一向量和該第二向量的目標距離,基於該目標距離以及預設的第一距離閾值,確定該第一資料以及該第二資料是否匹配。 Further, obtaining the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector includes: homomorphically encrypting the second vector using a self-generated second target public key to obtain a third encrypted vector, and sending the second target public key and the third encrypted vector to the first device; receiving from the first device the encrypted distance between the second vector and the first vector, which the first device determines based on the third encrypted vector and a fourth encrypted vector, wherein the fourth encrypted vector is generated by homomorphically encrypting the first vector using the second target public key; determining the target distance between the first vector and the second vector based on that encrypted distance and the second target private key corresponding to the second target public key, and determining whether the first data and the second data match based on the target distance and a preset first distance threshold.

進一步地,該將待匹配的第二資料登錄到預先訓練完成的向量轉化模型中,獲得該第二資料對應的第二向量包括: 針對該第二資料中的每個第二子資料,將該第二子資料登錄到預先訓練完成的向量轉化模型中,獲得該第二子資料對應的第二子向量;其中,每個第二子資料對應的第二子向量的長度均為第一預設長度; 將該每個第二子資料對應的第二子向量進行拼接,得到該第二資料對應的該第二向量。 Further, logging the second data to be matched into the pre-trained vector transformation model, and obtaining the second vector corresponding to the second data includes: For each second sub-data in the second data, log the second sub-data into the pre-trained vector transformation model to obtain the second sub-vector corresponding to the second sub-data; wherein, each second sub-data is The lengths of the second sub-vectors corresponding to the sub-data are all the first preset length; The second sub-vectors corresponding to each second sub-data are spliced to obtain the second vector corresponding to the second data.

進一步地,該第一向量和該第二向量的長度均為第二預設長度。Further, the lengths of the first vector and the second vector are both second preset lengths.

進一步地,該採用該第一目標公開金鑰對該第二向量進行同態加密生成第二加密向量包括: 針對該第二向量中的每個第二分量,確定該第二分量對應的第二平方分量; 將每個第二分量對應的第二平方分量按照預設的插入規則插入到該第二向量中,並將插入第二平方分量後獲得的向量更新為該第二向量; 基於該第一目標公開金鑰對該第二向量中的每個第二分量及每個第二平方分量分別進行同態加密,生成該第二加密向量。 Further, using the first target public key to perform homomorphic encryption on the second vector to generate a second encrypted vector includes: For each second component in the second vector, determine the second square component corresponding to the second component; Insert the second square component corresponding to each second component into the second vector according to the preset insertion rules, and update the vector obtained after inserting the second square component to the second vector; Homomorphic encryption is performed on each second component and each second square component in the second vector based on the first target public key to generate the second encrypted vector.

進一步地,該根據該第一加密向量以及該第二加密向量,確定加密後的該第一向量和第二向量的距離包括:Further, determining the distance between the encrypted first vector and the second vector according to the first encrypted vector and the second encrypted vector includes:

根據該預設的插入規則及該第二加密向量中的每個第二分量,獲取每組第二加密分量和第二加密平方分量;並根據該預設的插入規則及該第一加密向量中的每個第一分量,獲取每組第一加密分量和第一加密平方分量; 根據該預設的插入規則,確定對應的每組第一加密分量、第一加密平方分量、第二加密分量和第二加密平方分量; 根據每組第一加密分量、第一加密平方分量、第二加密分量和第二加密平方分量,確定加密後的每個子距離; 根據每個子距離的和值,確定加密後的該第一向量和該第二向量的距離。 According to the preset insertion rule and each second component in the second encryption vector, obtain each set of second encryption components and second encryption square components; and according to the preset insertion rule and the first encryption vector For each first component of , obtain each group of first encrypted component and first encrypted square component; According to the preset insertion rule, determine each corresponding group of first encrypted component, first encrypted square component, second encrypted component and second encrypted square component; Determine each encrypted sub-distance according to each group of the first encrypted component, the first encrypted square component, the second encrypted component and the second encrypted square component; The distance between the encrypted first vector and the second vector is determined based on the sum of each sub-distance.

進一步地,該基於該目標距離以及預設的第一距離閾值,確定該第一資料以及該第二資料是否匹配包括: 確定該目標距離是否小於預設的第一距離閾值; 若是,則確定該第一資料與該第二資料匹配; 否則,確定該第一資料與該第二資料不匹配。 Further, determining whether the first data and the second data match based on the target distance and a preset first distance threshold includes: Determine whether the target distance is less than a preset first distance threshold; If so, it is determined that the first data matches the second data; Otherwise, it is determined that the first data does not match the second data.

進一步地,該確定該第一資料與該第二資料匹配之後,該方法還包括: 確定該目標距離是否等於預設的第二距離閾值,若是,則確定該第一資料與該第二資料相同。 Further, after determining that the first data matches the second data, the method further includes: It is determined whether the target distance is equal to a preset second distance threshold, and if so, it is determined that the first data and the second data are the same.

由於在本發明實施例中,分別將待匹配的第一資料和第二資料登錄到預先訓練完成的向量轉化模型中,獲得該第一資料對應的第一向量以及第二資料對應的第二向量,並基於第一目標公開金鑰獲取該第一向量加密後的第一加密向量,以及該第二向量加密後的第二加密向量,並確定加密後的第一向量和第二向量的距離,並基於加密後的第一向量和第二向量的距離以及自身生成的第一目標私密金鑰,確定第一向量和第二向量的目標距離,基於該目標距離以及預設的第一距離閾值確定第一資料和第二資料是否匹配,即實現了第一資料和第二資料不完全相同時,也能實現第一資料和第二資料的模糊匹配,拓寬了使用場景,且在進行模糊匹配過程中引入了第一目標公開金鑰和第一目標私密金鑰分別進行同態加密和解密,實現了安全求交,保證了匹配過程的安全性,且整個匹配的過程中,第一資料以及第二資料均未離開以原始資料的形式離開過對應的第一設備以及第二設備,實現了原始資料不出庫也能實現模糊匹配,進一步保證了匹配過程的安全性。In the embodiments of the present invention, the first data and the second data to be matched are each input into a pre-trained vector transformation model to obtain the first vector corresponding to the first data and the second vector corresponding to the second data. The first vector and the second vector are encrypted with the first target public key to give the first encrypted vector and the second encrypted vector, and the encrypted distance between the first vector and the second vector is determined from them. The target distance between the first vector and the second vector is then determined from the encrypted distance and the self-generated first target private key, and whether the first data and the second data match is determined from the target distance and the preset first distance threshold. Fuzzy matching is therefore possible even when the first data and the second data are not exactly the same, which broadens the usage scenarios. Because the first target public key and the first target private key are used for homomorphic encryption and decryption respectively during fuzzy matching, secure intersection is achieved and the security of the matching process is guaranteed. Throughout the matching process, neither the first data nor the second data leaves its corresponding first device or second device in raw form, so fuzzy matching is achieved without the original data leaving the database, which further secures the matching process.

本領域內的技術人員應明白,本發明的實施例可提供為方法、系統、或電腦程式產品。因此,本發明可採用完全硬體實施例、完全軟體實施例、或結合軟體和硬體方面的實施例的形式。而且,本發明可採用在一個或多個其中包含有電腦可用程式碼的電腦可用存儲介質(包括但不限於磁碟記憶體、CD-ROM、光學記憶體等)上實施的電腦程式產品的形式。Those skilled in the art will understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk memory, CD-ROM, optical memory, etc.) containing computer-usable program code.

本發明是參照根據本發明的方法、設備(系統)、和電腦程式產品的流程圖和/或方框圖來描述的。應理解可由電腦程式指令實現流程圖和/或方框圖中的每一流程和/或方框、以及流程圖和/或方框圖中的流程和/或方框的結合。可提供這些電腦程式指令到通用電腦、專用電腦、嵌入式處理機或其他可程式設計資料處理設備的處理器以產生一個機器,使得通過電腦或其他可程式設計資料處理設備的處理器執行的指令產生用於實現在流程圖一個流程或多個流程和/或方框圖一個方框或多個方框中指定的功能的裝置。The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the invention. It will be understood that each process and/or block in the flowchart illustrations and/or block diagrams, and combinations of processes and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing device to produce a machine that causes instructions to be executed by the processor of the computer or other programmable data processing device. Means are generated for implementing the functions specified in the process or processes of the flowchart diagram and/or the block or blocks of the block diagram.

這些電腦程式指令也可存儲在能引導電腦或其他可程式設計資料處理設備以特定方式工作的電腦可讀記憶體中,使得存儲在該電腦可讀記憶體中的指令產生包括指令裝置的製造品,該指令裝置實現在流程圖一個流程或多個流程和/或方框圖一個方框或多個方框中指定的功能。These computer program instructions may also be stored in a computer-readable memory that causes a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including the instruction device. , the instruction device implements the functions specified in one process or multiple processes in the flow chart and/or one block or multiple blocks in the block diagram.

這些電腦程式指令也可裝載到電腦或其他可程式設計資料處理設備上,使得在電腦或其他可程式設計設備上執行一系列操作步驟以產生電腦實現的處理,從而在電腦或其他可程式設計設備上執行的指令提供用於實現在流程圖一個流程或多個流程和/或方框圖一個方框或多個方框中指定的功能的步驟。These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, thereby causing the computer or other programmable device to perform a computer-implemented process. The instructions executed on provide steps for implementing the functions specified in a process or processes of the flow diagrams and/or a block or blocks of the block diagrams.

顯然,本領域的技術人員可以對本發明進行各種改動和變型而不脫離本發明的精神和範圍。這樣,倘若本發明的這些修改和變型屬於本發明申請專利範圍及其等同技術的範圍之內,則本發明也意圖包含這些改動和變型在內。Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the invention. In this way, if these modifications and variations of the present invention fall within the patent scope of the present invention and the scope of equivalent technologies, the present invention is also intended to include these modifications and variations.

S101~S104、S401~S404:步驟
801:第一獲取模組
802:第一處理模組
803:第一確定模組
901:第二獲取模組
902:第二處理模組
903:第二確定模組
1001:處理器
1002:通信介面
1003:記憶體
1004:通信匯流排
1101:處理器
1102:通信介面
1103:記憶體
1104:通信匯流排

S101~S104, S401~S404: Steps
801: First acquisition module
802: First processing module
803: First determination module
901: Second acquisition module
902: Second processing module
903: Second determination module
1001: Processor
1002: Communication interface
1003: Memory
1004: Communication bus
1101: Processor
1102: Communication interface
1103: Memory
1104: Communication bus

To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without inventive effort:
Figure 1 is a schematic diagram of a data matching process provided by an embodiment of the present invention;
Figure 2a is a schematic diagram showing target sub-distances provided by some embodiments of the present invention;
Figure 2b is a schematic diagram showing a target sub-distance matrix provided by some embodiments of the present invention;
Figure 3a is a schematic diagram showing another set of target sub-distances provided by some embodiments of the present invention;
Figure 3b is a schematic diagram showing another target sub-distance matrix provided by some embodiments of the present invention;
Figure 4 is a schematic diagram of the process of a data matching method provided by an embodiment of the present invention;
Figure 5a is a schematic diagram of a process for obtaining the vector corresponding to text-type data provided by some embodiments of the present invention;
Figure 5b is a schematic diagram of a process for obtaining the vector corresponding to numeric-type data provided by some embodiments of the present invention;
Figure 6 is a schematic diagram of the overall process of fuzzy matching between the data of two parties provided by some embodiments of the present invention;
Figure 7 is a schematic diagram of a specific process of fuzzy matching between the data of two parties provided by some embodiments of the present invention;
Figure 8 is a schematic structural diagram of a data matching apparatus provided by some embodiments of the present invention;
Figure 9 is a schematic structural diagram of a data matching apparatus provided by some embodiments of the present invention;
Figure 10 is a schematic structural diagram of an electronic device provided by some embodiments of the present invention;
Figure 11 is a schematic structural diagram of an electronic device provided by some embodiments of the present invention.

S101~S104: steps

Claims (30)

1. A data matching method, characterized in that it is applied to a first device and comprises: inputting first data to be matched into a pre-trained vector conversion model to obtain a first vector corresponding to the first data; homomorphically encrypting the first vector with a first target public key generated by the first device itself to generate a first encrypted vector, and sending the first target public key to a second device; obtaining an encrypted distance between the first vector and a second vector, determined based on the first encrypted vector and a second encrypted vector, wherein the second encrypted vector is obtained by homomorphically encrypting the second vector with the first target public key, and the second vector is obtained by inputting second data into a pre-trained vector conversion model in the second device; and determining a target distance between the first vector and the second vector based on the encrypted distance and a first target private key corresponding to the first target public key, and determining whether the first data and the second data match based on the target distance and a preset first distance threshold; wherein homomorphically encrypting the first vector with the first target public key generated by the first device itself to generate the first encrypted vector comprises: for each first component of the first vector, determining a first squared component corresponding to that first component; inserting the first squared component corresponding to each first component into the first vector according to a preset insertion rule, and updating the first vector to the vector obtained after the insertion; and homomorphically encrypting each first component and each first squared component of the first vector separately, based on the first target public key, to generate the first encrypted vector.

2. The data matching method of claim 1, wherein inputting the first data to be matched into the pre-trained vector conversion model to obtain the first vector corresponding to the first data comprises: determining a first target data type corresponding to the first data; determining, according to the first target data type and a pre-stored correspondence between data types and pre-trained vector conversion models, a pre-trained first target vector conversion model corresponding to the first data; and inputting the first data into the pre-trained first target vector conversion model to obtain the first vector corresponding to the first data.

3. The data matching method of claim 2, wherein the first target data type is a text type or a numeric type.
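The squared-component insertion step recited in claim 1 can be sketched as follows. This is only an illustration: the claims leave the "preset insertion rule" open, so the rule used here (each squared component placed immediately after its component) and the function names `insert_squares` and `split_pairs` are assumptions, and integer components are assumed for simplicity.

```python
from typing import List, Tuple


def insert_squares(vec: List[int]) -> List[int]:
    """Return [x1, x1^2, x2, x2^2, ...].

    Assumed insertion rule: each squared component directly follows
    the component it was computed from.
    """
    augmented: List[int] = []
    for x in vec:
        augmented.append(x)
        augmented.append(x * x)
    return augmented


def split_pairs(augmented: List[int]) -> List[Tuple[int, int]]:
    """Recover (component, squared component) groups under the same rule."""
    return [(augmented[i], augmented[i + 1]) for i in range(0, len(augmented), 2)]
```

Because both devices would apply the same rule, the party that later computes the distance can regroup each component with its square by position alone, without decrypting anything.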
4. The data matching method of claim 3, wherein if the first target data type is the text type, the corresponding pre-trained first target vector conversion model is a word vector model or a sentence vector model; and if the first target data type is the numeric type, the corresponding pre-trained first target vector conversion model is a one-hot encoding model.

5. The data matching method of claim 1, wherein obtaining the encrypted distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector comprises: receiving the second encrypted vector sent by the second device, wherein the second encrypted vector is obtained by the second device homomorphically encrypting the second vector based on the first target public key; and determining the encrypted distance between the first vector and the second vector based on the first encrypted vector and the second encrypted vector.
6. The data matching method of claim 1, wherein sending the first target public key to the second device comprises: sending the first encrypted vector and the first target public key to the second device; and obtaining the encrypted distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector comprises: receiving, from the second device, the encrypted distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector, wherein the second encrypted vector is obtained by the second device homomorphically encrypting the second vector based on the first target public key.

7. The data matching method of claim 6, further comprising: receiving a third encrypted vector sent by the second device and a second target public key generated by the second device, wherein the third encrypted vector is obtained by the second device homomorphically encrypting the second vector with the second target public key; homomorphically encrypting the first vector based on the second target public key to generate a fourth encrypted vector; and determining an encrypted distance between the second vector and the first vector based on the third encrypted vector and the fourth encrypted vector, and sending that encrypted distance to the second device, so that the second device decrypts the encrypted distance with a second target private key corresponding to the second target public key to determine a target distance between the second vector and the first vector, and determines whether the first data and the second data match according to that target distance and the preset first distance threshold.

8. The data matching method of claim 1, 5 or 6, wherein after determining the target distance between the first vector and the second vector, the method further comprises: sending the target distance between the first vector and the second vector to the second device, so that the second device determines whether the first data and the second data match based on the target distance and the preset first distance threshold.

9. The data matching method of claim 1, 5 or 6, wherein determining the target distance between the first vector and the second vector based on the encrypted distance and the first target private key corresponding to the first target public key comprises: decrypting the encrypted distance between the first vector and the second vector with the first target private key corresponding to the first target public key generated by the first device itself, to determine the target distance between the first vector and the second vector.
10. The data matching method of claim 1, wherein inputting the first data to be matched into the pre-trained vector conversion model to obtain the first vector corresponding to the first data comprises: for each first sub-data item in the first data, inputting the first sub-data item into the pre-trained vector conversion model to obtain a first sub-vector corresponding to that first sub-data item, wherein the first sub-vector corresponding to each first sub-data item has a first preset length; and concatenating the first sub-vectors corresponding to the first sub-data items to obtain the first vector corresponding to the first data.

11. The data matching method of claim 1 or 10, wherein the first vector and the second vector both have a second preset length.
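As an illustration of the sub-vector construction in claim 10 combined with the one-hot encoding model named in claim 4, the sketch below encodes each decimal digit of a numeric string as a fixed-length one-hot sub-vector and concatenates the results. Treating each digit as a "sub-data item", the constant `SUB_VECTOR_LENGTH`, and the function names are all assumptions made for the example; the claims do not fix how data is split or how long a sub-vector is.

```python
from typing import List

# Hypothetical "first preset length": one slot per decimal digit.
SUB_VECTOR_LENGTH = 10


def one_hot_digit(ch: str) -> List[int]:
    """One-hot sub-vector for a single decimal digit character."""
    if len(ch) != 1 or ch not in "0123456789":
        raise ValueError(f"expected a single decimal digit, got {ch!r}")
    sub = [0] * SUB_VECTOR_LENGTH
    sub[int(ch)] = 1
    return sub


def encode_number(text: str) -> List[int]:
    """Concatenate the fixed-length sub-vectors of every digit in `text`."""
    vec: List[int] = []
    for ch in text:
        vec.extend(one_hot_digit(ch))
    return vec
```

Under this encoding, two numeric strings of equal length that differ in one digit produce vectors whose squared Euclidean distance is 2, which is the kind of small distance a fuzzy-match threshold could accept.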
12. The data matching method of claim 1, wherein obtaining the encrypted distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector comprises: obtaining, according to the preset insertion rule and each first component in the first encrypted vector, each group of a first encrypted component and a first encrypted squared component; obtaining, according to the preset insertion rule and each second component in the second encrypted vector, each group of a second encrypted component and a second encrypted squared component; determining, according to the preset insertion rule, each corresponding group of a first encrypted component, a first encrypted squared component, a second encrypted component and a second encrypted squared component; determining each encrypted sub-distance from each such group; and determining the encrypted distance between the first vector and the second vector from the sum of the sub-distances.

13. The data matching method of claim 1, wherein determining whether the first data and the second data match based on the target distance and the preset first distance threshold comprises: determining whether the target distance is less than the preset first distance threshold; if so, determining that the first data matches the second data; otherwise, determining that the first data does not match the second data.
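The per-group sub-distance computation of claim 12 and the threshold test of claim 13 can be illustrated with a toy additively homomorphic scheme. Everything scheme-specific here is an assumption: the claims only say "homomorphic encryption", so the use of Paillier, the tiny insecure primes, the squared Euclidean distance, the "square directly after component" insertion rule, and the choice to form the cross term from the computing party's own plaintext component are illustrative, not the patent's stated method. The sketch relies on the identity (a - b)^2 = a^2 - 2ab + b^2, which is why each party encrypts squares alongside components.

```python
import math
import random
from typing import List, Tuple

PrivateKey = Tuple[int, int]  # (lambda, mu)


def keygen(p: int = 1009, q: int = 1013) -> Tuple[int, PrivateKey]:
    """Toy Paillier key pair; these primes are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Enc(m) = (n + 1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return pow(n + 1, m % n, n2) * pow(r, n, n2) % n2


def decrypt(n: int, priv: PrivateKey, c: int) -> int:
    lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n


def encrypt_augmented(n: int, vec: List[int]) -> List[int]:
    """Encrypt [x1, x1^2, x2, x2^2, ...] component by component."""
    out: List[int] = []
    for x in vec:
        out += [encrypt(n, x), encrypt(n, x * x)]
    return out


def encrypted_distance(n: int, own_plain: List[int],
                       own_enc: List[int], other_enc: List[int]) -> int:
    """Ciphertext of sum_i (a_i - b_i)^2, computed without decrypting b."""
    n2 = n * n
    total = encrypt(n, 0)
    for i, a in enumerate(own_plain):
        enc_a_sq = own_enc[2 * i + 1]
        enc_b, enc_b_sq = other_enc[2 * i], other_enc[2 * i + 1]
        cross = pow(enc_b, (-2 * a) % n, n2)   # Enc(-2ab) via scalar multiply
        sub = enc_a_sq * enc_b_sq * cross % n2  # Enc(a^2 + b^2 - 2ab)
        total = total * sub % n2                # homomorphic addition
    return total


def is_match(target_distance: int, threshold: int) -> bool:
    """Claim 13: the data match when the distance is below the threshold."""
    return target_distance < threshold
```

For example, with first vector [3, 7, 2] and second vector [4, 5, 2], decrypting the result of `encrypted_distance` yields 1 + 4 + 0 = 5, so `is_match(5, 10)` reports a match while `is_match(5, 5)` does not.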
14. The data matching method of claim 13, wherein after determining that the first data matches the second data, the method further comprises: determining whether the target distance is equal to a preset second distance threshold, and if so, determining that the first data is identical to the second data.

15. A data matching method, characterized in that it is applied to a second device and comprises: inputting second data to be matched into a pre-trained vector conversion model to obtain a second vector corresponding to the second data; receiving a first target public key sent by a first device, and homomorphically encrypting the second vector with the first target public key to generate a second encrypted vector; obtaining a target distance between a first vector and the second vector, determined based on a first encrypted vector and the second encrypted vector, wherein the first encrypted vector is obtained by homomorphically encrypting the first vector with the first target public key, and the first vector is obtained by inputting first data into a pre-trained vector conversion model in the first device; and determining whether the first data and the second data match based on the target distance and a preset first distance threshold; wherein homomorphically encrypting the second vector with the first target public key to generate the second encrypted vector comprises: for each second component of the second vector, determining a second squared component corresponding to that second component; inserting the second squared component corresponding to each second component into the second vector according to a preset insertion rule, and updating the second vector to the vector obtained after the insertion; and homomorphically encrypting each second component and each second squared component of the second vector separately, based on the first target public key, to generate the second encrypted vector.
16. The data matching method of claim 15, wherein inputting the second data to be matched into the pre-trained vector conversion model to obtain the second vector corresponding to the second data comprises: determining a second target data type corresponding to the second data; determining, according to the second target data type and a pre-stored correspondence between data types and pre-trained vector conversion models, a pre-trained second target vector conversion model corresponding to the second data; and inputting the second data into the pre-trained second target vector conversion model to obtain the second vector corresponding to the second data.

17. The data matching method of claim 16, wherein the second target data type is a text type or a numeric type.

18. The data matching method of claim 17, wherein if the second target data type is the text type, the corresponding pre-trained second target vector conversion model is a word vector model or a sentence vector model; and if the second target data type is the numeric type, the corresponding pre-trained second target vector conversion model is a one-hot encoding model.
19. The data matching method of claim 15, wherein receiving the first target public key sent by the first device comprises: receiving the first target public key and the first encrypted vector sent by the first device, wherein the first encrypted vector is obtained by homomorphically encrypting the first vector with the first target public key; and obtaining the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector comprises: determining an encrypted distance between the first vector and the second vector according to the first encrypted vector and the second encrypted vector; sending the encrypted distance to the first device; and receiving the target distance sent by the first device, wherein the target distance is obtained by the first device decrypting the encrypted distance with the first target private key corresponding to the first target public key generated by the first device itself.

20. The data matching method of claim 15, wherein obtaining the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector comprises: sending the second encrypted vector to the first device, so that the first device determines an encrypted distance between the first encrypted vector and the second encrypted vector based on the second encrypted vector and the first encrypted vector; and receiving the target distance between the first vector and the second vector sent by the first device, wherein the target distance is obtained by the first device decrypting the encrypted distance between the first encrypted vector and the second encrypted vector with the first target private key corresponding to the first target public key generated by the first device itself.

21. The data matching method of claim 15, wherein obtaining the target distance between the first vector and the second vector determined based on the first encrypted vector and the second encrypted vector comprises: homomorphically encrypting the second vector with a second target public key generated by the second device itself to obtain a third encrypted vector, and sending the second target public key and the third encrypted vector to the first device; receiving, from the first device, an encrypted distance between the second vector and the first vector determined based on the third encrypted vector and a fourth encrypted vector, wherein the fourth encrypted vector is generated by homomorphically encrypting the first vector with the second target public key; and determining the target distance between the first vector and the second vector based on the encrypted distance and a second target private key corresponding to the second target public key, and determining whether the first data and the second data match based on the target distance and the preset first distance threshold.

22. The data matching method of claim 15, wherein inputting the second data to be matched into the pre-trained vector conversion model to obtain the second vector corresponding to the second data comprises: for each second sub-data item in the second data, inputting the second sub-data item into the pre-trained vector conversion model to obtain a second sub-vector corresponding to that second sub-data item, wherein the second sub-vector corresponding to each second sub-data item has a first preset length; and concatenating the second sub-vectors corresponding to the second sub-data items to obtain the second vector corresponding to the second data.

23. The data matching method of claim 15 or 22, wherein the first vector and the second vector both have a second preset length.
24. The data matching method of claim 19, wherein determining the encrypted distance between the first vector and the second vector according to the first encrypted vector and the second encrypted vector comprises: obtaining, according to the preset insertion rule and each second component in the second encrypted vector, each group of a second encrypted component and a second encrypted squared component; obtaining, according to the preset insertion rule and each first component in the first encrypted vector, each group of a first encrypted component and a first encrypted squared component; determining, according to the preset insertion rule, each corresponding group of a first encrypted component, a first encrypted squared component, a second encrypted component and a second encrypted squared component; determining each encrypted sub-distance from each such group; and determining the encrypted distance between the first vector and the second vector from the sum of the sub-distances.
25. The data matching method of claim 15, wherein determining whether the first data and the second data match based on the target distance and the preset first distance threshold comprises: determining whether the target distance is less than the preset first distance threshold; if so, determining that the first data matches the second data; otherwise, determining that the first data does not match the second data.

26. The data matching method of claim 25, wherein after determining that the first data matches the second data, the method further comprises: determining whether the target distance is equal to a preset second distance threshold, and if so, determining that the first data is identical to the second data.

27. A data matching apparatus, characterized in that it is applied to a first device and comprises: a first acquisition module, configured to input first data to be matched into a pre-trained vector conversion model to obtain a first vector corresponding to the first data; a first processing module, configured to homomorphically encrypt the first vector with a first target public key generated by the first device itself to generate a first encrypted vector, and to send the first target public key to a second device; the first acquisition module being further configured to obtain an encrypted distance between the first vector and a second vector, determined based on the first encrypted vector and a second encrypted vector, wherein the second encrypted vector is obtained by homomorphically encrypting the second vector with the first target public key, and the second vector is obtained by inputting second data into a pre-trained vector conversion model in the second device; and a first determination module, configured to determine a target distance between the first vector and the second vector based on the encrypted distance and a first target private key corresponding to the first target public key, and to determine whether the first data and the second data match based on the target distance and a preset first distance threshold; wherein the first processing module is specifically configured to: for each first component of the first vector, determine a first squared component corresponding to that first component; insert the first squared component corresponding to each first component into the first vector according to a preset insertion rule, and update the first vector to the vector obtained after the insertion; and homomorphically encrypt each first component and each first squared component of the first vector separately, based on the first target public key, to generate the first encrypted vector.
28. A data matching apparatus, characterized in that it is applied to a second device and comprises: a second acquisition module, configured to input second data to be matched into a pre-trained vector conversion model to obtain a second vector corresponding to the second data; a second processing module, configured to receive a first target public key sent by a first device, and to homomorphically encrypt the second vector with the first target public key to generate a second encrypted vector; the second acquisition module being further configured to obtain a target distance between a first vector and the second vector, determined based on a first encrypted vector and the second encrypted vector, wherein the first encrypted vector is obtained by encrypting the first vector with the first target public key, and the first vector is obtained by inputting first data into a pre-trained vector conversion model in the first device; and a second determination module, configured to determine whether the first data and the second data match based on the target distance and a preset first distance threshold; wherein the first processing module is specifically configured to: for each second component of the second vector, determine a second squared component corresponding to that second component; insert the second squared component corresponding to each second component into the second vector according to a preset insertion rule, and update the second vector to the vector obtained after the insertion; and homomorphically encrypt each second component and each second squared component of the second vector separately, based on the first target public key, to generate the second encrypted vector.

29. An electronic device, characterized in that the electronic device comprises a processor and a memory, the memory being configured to store program instructions, and the processor being configured to implement, when executing the computer program stored in the memory, the steps of the data matching method of any one of claims 1 to 14 or the steps of the data matching method of any one of claims 15 to 26.

30. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the steps of the data matching method of any one of claims 1 to 14 or the steps of the data matching method of any one of claims 15 to 26.
TW111135467A 2022-02-28 2022-09-20 A data matching method, device, equipment and medium TWI835300B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210191650.3A CN114817943A (en) 2022-02-28 2022-02-28 Data matching method, device, equipment and medium
CN202210191650.3 2022-02-28

Publications (2)

Publication Number Publication Date
TW202336617A (en) 2023-09-16
TWI835300B (en) 2024-03-11

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160119119A1 (en) 2014-05-15 2016-04-28 Xerox Corporation Compact fuzzy private matching using a fully-homomorphic encryption scheme

Similar Documents

Publication Publication Date Title
CN111989893B (en) Method, system and computer readable device for generating and linking zero knowledge proofs
US10284372B2 (en) Method and system for secure management of computer applications
US20230087864A1 (en) Secure multi-party computation method and apparatus, device, and storage medium
WO2023159888A1 (en) Data matching method and apparatus, device, and medium
CN113159327A (en) Model training method and device based on federated learning system, and electronic equipment
CN114036565A (en) Private information retrieval system and private information retrieval method
CN115567188B (en) Multi-key value hiding intersection solving method and device and storage medium
CN111027981B (en) Method and device for multi-party joint training of risk assessment model for IoT (Internet of things) machine
CN109615376B (en) Transaction method and device based on zero-knowledge proof
CN115242553B (en) Data exchange method and system supporting safe multi-party calculation
WO2022186960A1 (en) Method and system for the atomic exchange of blockchain assets using transient key pairs
CN110704875B (en) Method, device, system, medium and electronic equipment for processing client sensitive information
CN114386058A (en) Model file encryption and decryption method and device
CN114491637A (en) Data query method and device, computer equipment and storage medium
TWI832640B (en) A data matching method, device, system, equipment and medium
EP2286610B1 (en) Techniques for performing symmetric cryptography
TWI835300B (en) A data matching method, device, equipment and medium
CN115599959A (en) Data sharing method, device, equipment and storage medium
CN110995440B (en) Work history confirming method, device, equipment and storage medium
CN113505348A (en) Data watermark embedding method, data watermark verifying method and data watermark verifying device
US11095429B2 (en) Circuit concealing apparatus, calculation apparatus, and program
CN109981547B (en) Logistics transmission method and device based on block chain
CN115587897B (en) Police tax joint analysis method based on privacy calculation
CN114896313B (en) Data transmission method, device, equipment and medium
US20240129113A1 (en) Method for providing oracle service of blockchain network by using zero-knowledge proof and aggregator terminal using the same