TW202416270A

TW202416270A - System and method for performing encrypted mixing based on big number format and additive homomorphism

Info

Publication number: TW202416270A
Application number: TW111137250A
Authority: TW
Inventors: 廖勝廉; 林逸修; 謝佳育; 何佩玲; 張景雄
Original assignee: 中華電信股份有限公司
Filing date: 2022-09-30
Publication date: 2024-04-16

Abstract

A system and a method for performing encrypted mixing based on big number format and additive homomorphism are provided. The method includes: performing a preprocessing-operation by a second-voice-information-processing-unit to obtain a voice-information-preprocessing-result, and storing the voice-information-preprocessing-result in a big number format; performing an encryption-operation on the voice-information-preprocessing-result by an encryption-and-transmission-unit to obtain a voice-information-encrypted-result in the big number format; performing an additive-homomorphism-operation on the voice-information-encrypted-result by a mixing-unit to obtain a mixed-ciphertext in the big number format; performing a decryption-operation on the mixed-ciphertext by a decryption-unit to obtain mixed-voice-information in the big number format; and performing a post-processing-operation on the mixed-voice-information by a first-voice-information-processing-unit to obtain a mixed-voice-information-post-processing-result.

Description

System and method for performing encrypted mixing based on big number format and additive homomorphism

The present invention relates to a system and a method for performing encrypted mixing based on big number format and additive homomorphism.

In a conventional encrypted mixing method, the user device encrypts the voice and transmits the encrypted voice to a server, and the server decrypts and mixes it. The server then encrypts the mixing result and transmits it to the user device, which decrypts and plays it. In other words, in a conventional encrypted mixing method the server performs decryption and obtains the plaintext voice information. If the server is monitored by a third party, this practice creates an information security risk.

The system of the present invention for performing encrypted mixing based on big number format and additive homomorphism includes a user device and a server. The user device includes a decryption unit, a first voice information processing unit, a second voice information processing unit, and an encryption and transmission unit. The server includes a mixing unit and is communicatively connected to the user device. The second voice information processing unit performs a pre-processing operation to obtain a voice information pre-processing result, and stores the voice information pre-processing result in a big number format; the encryption and transmission unit performs an encryption operation on the voice information pre-processing result to obtain a voice information encryption result in the big number format; the mixing unit performs an additive homomorphic operation on the voice information encryption result to obtain a mixed ciphertext in the big number format; the decryption unit performs a decryption operation on the mixed ciphertext to obtain mixed voice information in the big number format; and the first voice information processing unit performs a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result.

The method of the present invention for performing encrypted mixing based on big number format and additive homomorphism includes: performing, by the second voice information processing unit, a pre-processing operation to obtain a voice information pre-processing result, and storing the voice information pre-processing result in a big number format; performing, by the encryption and transmission unit, an encryption operation on the voice information pre-processing result to obtain a voice information encryption result in the big number format; performing, by the mixing unit, an additive homomorphic operation on the voice information encryption result to obtain a mixed ciphertext in the big number format; performing, by the decryption unit, a decryption operation on the mixed ciphertext to obtain mixed voice information in the big number format; and performing, by the first voice information processing unit, a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result.

FIG. 1 is a schematic diagram of a system 10 for performing encrypted mixing based on big number format and additive homomorphism according to an embodiment of the present invention. The system 10 may include a plurality of user devices 100 and a server 200. It should be noted that the present invention does not limit the number of user devices 100. The user devices 100 and the server 200 may support Voice over IP (VoIP).

The user device 100 (or the server 200) has the components necessary for its operation, such as a processing unit (for example, but not limited to, a processor), a communication unit (for example, but not limited to, various communication chips, Bluetooth chips, or WiFi chips), and a storage unit (for example, but not limited to, removable random access memory, flash memory, or a hard disk). The server 200 may be communicatively connected to each user device 100 (individually). For example, the server 200 may be communicatively connected to each user device 100 via the Internet 300.

The user device 100 may include a receiving unit 110 and a speaking unit 120. More specifically, the receiving unit 110 may include a sound receiving unit 111, a decryption unit 112, and a first voice information processing unit 113. The speaking unit 120 may include a recording unit 121, a second voice information processing unit 122, and an encryption and transmission unit 123. In this embodiment, the user device 100 may include an input-output device (not shown). The input-output device is, for example, a microphone or a speaker, that is, a device for inputting or outputting voice. The server 200, in turn, may include a transceiver unit 210 and a mixing unit 220.

FIG. 2 is an operation flow chart of the units of the system 10 shown in FIG. 1. Please refer to FIG. 1 and FIG. 2 together. In step S201, the recording unit 121 may receive voice information through the input-output device. In detail, the user may operate the user device 100 to input voice information (in the form of an analog signal) to the recording unit 121. The recording unit 121 may then convert the analog voice information into digital voice information, and send the (digital) voice information to the second voice information processing unit 122. It should be noted that the voice information may include one or more voice samples. The user may use the input-output device to set the number of bits of a voice sample. The number of bits is, for example, 8 bits or 16 bits, but the present invention is not limited thereto.

In step S202, the second voice information processing unit 122 may perform a pre-processing operation to obtain a voice information pre-processing result, and store the voice information pre-processing result in a big number format. Specifically, the second voice information processing unit 122 may perform a voice information segmentation operation on the voice information to obtain a voice information segmentation result, and perform a voice information upward shift operation on the voice information segmentation result to obtain a voice information upward shift result. The upward shift operation may be related to the number of bits of the voice sample. In detail, the second voice information processing unit 122 may segment the voice information into its individual voice samples. Then, if the number of bits of a voice sample is $n$ bits, the second voice information processing unit 122 may add $2^{n-1}$ to each voice sample in the voice information to perform the upward shift operation. In other words, because each voice sample in the (digital) voice information that the second voice information processing unit 122 obtains from the recording unit 121 in step S201 may be negative, the second voice information processing unit 122 may add $2^{n-1}$ to each voice sample so that no voice sample is negative. Furthermore, the second voice information processing unit 122 may add an overflow buffer code to the voice information upward shift result to obtain the voice information pre-processing result. The overflow buffer code may be related to the number of bits (of the voice sample). For example, the second voice information processing unit 122 may use Formula 1 below to obtain the number of bits (length) of the overflow buffer code. Finally, the second voice information processing unit 122 may concatenate the voice samples, store the voice information pre-processing result in the big number format, and send the voice information pre-processing result to the encryption and transmission unit 123.

……… (Formula 1)

where $k$ is the number of audio tracks before mixing.
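The pre-processing of step S202 can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `preprocess` and its parameters are hypothetical, the offset $2^{n-1}$ is inferred from the worked example given for FIG. 4A, and the overflow buffer length is passed in as a plain argument because the expression of Formula 1 is not available here.

```python
def preprocess(samples, n_bits, buffer_bits):
    """Shift signed samples up, pad each lane with buffer bits, pack into one big integer.

    samples     -- signed integers in the range -2^(n_bits-1) .. 2^(n_bits-1) - 1
    n_bits      -- number of bits per voice sample (for example 8 or 16)
    buffer_bits -- length of the overflow buffer code (assumed to be given)
    """
    lane_bits = n_bits + buffer_bits      # width of one sample plus its buffer
    offset = 1 << (n_bits - 1)            # 2^(n-1), the upward shift
    packed = 0
    for sample in samples:
        if not -offset <= sample < offset:
            raise ValueError("sample does not fit in n_bits")
        shifted = sample + offset         # now in 0 .. 2^n - 1
        # The buffer bits are the zero bits left above `shifted` inside the lane.
        packed = (packed << lane_bits) | shifted
    return packed


# Samples 11110110, 00000110, 00011110 read as 8-bit two's complement.
packed = preprocess([-10, 6, 30], n_bits=8, buffer_bits=1)
print(bin(packed)[2:])
```

Python integers are arbitrary precision, so the packed value plays the role of the big number format directly; in a language with fixed-width integers a big-integer type would be needed instead.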

It should be noted that $k$ (the number of audio tracks before mixing) in Formula 1 above may be the number of user devices 100 shown in FIG. 1. Further, in this embodiment, the server 200 may inform each user device 100 of the number of user devices 100 in advance, so that the second voice information processing unit 122 can use the value of $k$ when adding the overflow buffer code in step S202.

In step S203, the encryption and transmission unit 123 may perform an encryption operation on the voice information pre-processing result (stored in the big number format) to obtain a voice information encryption result in the big number format. Specifically, the encryption and transmission unit 123 may use the Paillier algorithm to encrypt the voice information pre-processing result. The encryption and transmission unit 123 may then transmit the voice information encryption result to the transceiver unit 210. In detail, the encryption and transmission unit 123 may transmit the voice information encryption result over the Internet 300 to the transceiver unit 210 of the server 200 according to the UDP/RTP transport protocol.
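Step S203 names the Paillier algorithm without giving parameters, so the following is only a toy sketch of Paillier encryption and decryption on a packed big number. It assumes the common simplification in which the generator is the modulus plus one, and it uses two small Mersenne primes as stand-ins for properly generated keys; the function names are hypothetical, and keys of this size are not secure.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy Paillier key pair from two distinct primes p and q."""
    modulus = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, modulus)            # valid because the generator is modulus + 1
    return modulus, (lam, mu)


def encrypt(modulus, m):
    """Encrypt an integer 0 <= m < modulus under the public modulus."""
    if not 0 <= m < modulus:
        raise ValueError("plaintext out of range")
    sq = modulus * modulus
    while True:
        r = secrets.randbelow(modulus - 1) + 1    # random value in 1 .. modulus - 1
        if math.gcd(r, modulus) == 1:
            break
    return pow(modulus + 1, m, sq) * pow(r, modulus, sq) % sq


def decrypt(modulus, private_key, c):
    """Recover the plaintext from ciphertext c with the private key (lam, mu)."""
    lam, mu = private_key
    u = pow(c, lam, modulus * modulus)
    return (u - 1) // modulus * mu % modulus


# Illustrative primes only: 2^31 - 1 and 2^61 - 1.
modulus, private_key = keygen(2**31 - 1, 2**61 - 1)
packed = int("1110110010000110010011110", 2)     # a pre-processing result
ciphertext = encrypt(modulus, packed)
recovered = decrypt(modulus, private_key, ciphertext)
```

The plaintext must be smaller than the modulus, which bounds how many padded samples fit into one ciphertext for a given key size.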

In step S204, the transceiver unit 210 may unpack the UDP/RTP packet according to the UDP/RTP transport protocol to obtain the voice information encryption result, and send the voice information encryption result to the mixing unit 220.

In step S205, the mixing unit 220 may perform an additive homomorphic operation on the voice information encryption result to obtain a mixed ciphertext in the big number format. Specifically, the mixing unit 220 may use the Paillier algorithm to perform the additive homomorphic operation on the voice information encryption result. In other words, the server 200 of the present invention does not need to perform decryption; it only needs to add the ciphertexts to obtain the mixed ciphertext. The transceiver unit 210 may then package the mixed ciphertext in the UDP/RTP transport protocol format and transmit it over the Internet 300 to the sound receiving unit 111.
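For reference, the additive homomorphism that step S205 relies on can be written out. This is the standard Paillier identity rather than anything specific to this document; the symbols $N$ (public modulus), $g$ (generator) and $r_i$ (per-message random values) are notation assumed here, and $m_i$ stands for the voice information pre-processing result of the $i$-th user device.

$$E(m_i) = g^{m_i}\, r_i^{N} \bmod N^2, \qquad \prod_{i=1}^{k} E(m_i) \bmod N^2 = g^{\sum_{i} m_i} \Big(\prod_{i=1}^{k} r_i\Big)^{N} \bmod N^2$$

$$D\Big(\prod_{i=1}^{k} E(m_i) \bmod N^2\Big) = \sum_{i=1}^{k} m_i \bmod N$$

Under this reading, "adding" ciphertexts on the server amounts to multiplying them modulo $N^2$, and the decrypted value equals the true sum as long as that sum stays below $N$.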

In step S206, after the sound receiving unit 111 receives the packet, the sound receiving unit 111 may unpack it according to the UDP/RTP transport protocol to obtain the mixed ciphertext. The sound receiving unit 111 may then send the mixed ciphertext to the decryption unit 112.

In step S207, the decryption unit 112 may perform a decryption operation on the mixed ciphertext to obtain mixed voice information in the big number format. Specifically, the decryption unit 112 may use the Paillier algorithm to decrypt the mixed ciphertext. In detail, the decryption unit 112 may decrypt the mixed ciphertext with the Paillier key to obtain the mixed voice information, and send the mixed voice information to the first voice information processing unit 113.

In step S208, the first voice information processing unit 113 may perform a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result. Specifically, the first voice information processing unit 113 may perform a mixed voice information segmentation operation on the mixed voice information according to the overflow buffer code to obtain a mixed voice information segmentation result, and perform a voice information downward shift operation on the mixed voice information segmentation result to obtain the mixed voice information post-processing result. The downward shift operation may be related to the number of audio tracks before mixing and the number of bits of the voice sample. In detail, if the number of bits of a voice sample (in the mixed voice information segmentation result) is $n$ bits and $k$ is the number of audio tracks before mixing (which, as described in the foregoing embodiment, may be the number of user devices 100 shown in FIG. 1), the first voice information processing unit 113 may subtract $k \cdot 2^{n-1}$ from each voice sample (in the mixed voice information segmentation result) to perform the downward shift operation. Finally, the first voice information processing unit 113 may output the mixed voice information post-processing result through the input-output device.
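The post-processing of step S208 can be sketched as the inverse of the packing done before encryption. The sketch below is illustrative only: `postprocess` and its parameters are hypothetical names, the offset $k \cdot 2^{n-1}$ is inferred from the worked example given for FIG. 4B, and the number of samples in the packed value is assumed to be known to the receiver. It splits with bit masks from the low end instead of the string segmentation the text describes, which gives the same lanes even when leading zero bits have been dropped.

```python
def postprocess(mixed, n_bits, buffer_bits, k, count):
    """Split a decrypted big integer into lanes and shift each lane back down.

    mixed       -- decrypted mixed voice information as one integer
    n_bits      -- number of bits per original voice sample
    buffer_bits -- length of the overflow buffer code used when packing
    k           -- number of audio tracks that were mixed
    count       -- number of samples packed into the integer
    """
    lane_bits = n_bits + buffer_bits
    mask = (1 << lane_bits) - 1
    offset = k * (1 << (n_bits - 1))      # k * 2^(n-1), the downward shift
    lanes = []
    for _ in range(count):
        lanes.append(mixed & mask)        # take the lowest lane
        mixed >>= lane_bits
    lanes.reverse()                       # restore the original sample order
    return [lane - offset for lane in lanes]


mixed = int("110110100110011100100111101", 2)
print(postprocess(mixed, n_bits=8, buffer_bits=1, k=3, count=3))
```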

FIG. 3 is a schematic diagram of a group encrypted mixing call according to an embodiment of the present invention. Please refer to FIG. 1, FIG. 2 and FIG. 3 together. In the embodiment shown in FIG. 3, three users, "Xiaohua", "Xiaoming" and "Azhong", each operate a user device 100 to carry out a group encrypted mixing call. In other words, in the embodiment of FIG. 3, the value of $k$ in Formula 1 above (the number of audio tracks before mixing, that is, the number of user devices 100) is 3. It is assumed here that the user devices 100 and the server 200 shown in FIG. 3 have exchanged homomorphic encryption keys in advance, and that the number of bits of a voice sample set on each user device 100 is 8 bits.

As in the foregoing embodiment, the users "Xiaohua", "Xiaoming" and "Azhong" may each start talking into their own user device 100. The conversation may first be recorded by the recording unit 121 in the speaking unit 120 to obtain a digital signal made up of a series of 8-bit voice samples. The second voice information processing unit 122 may then perform the pre-processing operation on this digital signal to obtain the voice information pre-processing result. Next, the encryption and transmission unit 123 may perform the encryption operation on the voice information pre-processing result to obtain the voice information encryption result, and transmit the voice information encryption result to the transceiver unit 210 of the server 200. After the transceiver unit 210 has received the voice information encryption result of each user device 100, the mixing unit 220 of the server 200 may perform the additive homomorphic operation on the voice information encryption results of the user devices 100 to obtain the mixed ciphertext, and the transceiver unit 210 may then transmit the mixed ciphertext back to the sound receiving unit 111 of each user device 100 used by "Xiaohua", "Xiaoming" and "Azhong". Each user device 100 may then decrypt the mixed ciphertext with its decryption unit 112 to obtain the mixed voice information in the big number format, which has not yet been post-processed. Finally, the first voice information processing unit 113 of each user device 100 may perform the post-processing operation on the mixed voice information to obtain the mixed voice information post-processing result (which can be output or played), and output it through the input-output device of that user device 100.

FIG. 4A is a schematic diagram of the first stage of operation of the system 10 shown in FIG. 1 during a group encrypted mixing call. FIG. 4B is a schematic diagram of the second stage of operation of the system 10 shown in FIG. 1 during a group encrypted mixing call. Continuing from the embodiment of FIG. 3, in the embodiments of FIG. 4A and FIG. 4B the value of $k$ in Formula 1 above (the number of audio tracks before mixing, that is, the number of user devices 100) is 3, and the number of bits of a voice sample set on each user device 100 is 8 bits (that is, $n = 8$).

Please refer first to FIG. 1, FIG. 2, FIG. 3 and FIG. 4A. The description continues with the user "Xiaohua" as an example. As shown in FIG. 4A, assume that the voice information received by the recording unit 121 of the user device 100 held by "Xiaohua" includes three 8-bit voice samples [11110110, 00000110, 00011110], that is, -10, 6 and 30 in two's complement. The second voice information processing unit 122 of that user device 100 may add $2^{n-1} = 128$ to each of the three voice samples to perform the voice information upward shift operation and obtain the voice information upward shift result (that is, the three voice samples [118, 134, 158]). The three voice samples are thereby guaranteed to be non-negative (integers from 0 to 255). Next, to avoid overflow in the subsequent mixing (on the server 200), when the second voice information processing unit 122 concatenates the three voice samples into the big number format, it may add an overflow buffer code to the voice information upward shift result to obtain the voice information pre-processing result. In detail, according to Formula 1 above with $k = 3$, the second voice information processing unit 122 may add an overflow buffer code with a bit number (length) of 1 at the beginning of each of the three voice samples to obtain the voice information pre-processing result (that is, 1110110010000110010011110), which completes the conversion to the big number format. The encryption and transmission unit 123 of the user device 100 held by "Xiaohua" may then perform the encryption operation (that is, homomorphic encryption) on the voice information pre-processing result (in the big number format) to obtain the voice information encryption result in the big number format, that is, E(1110110010000110010011110). Similarly, the user device 100 held by "Xiaoming" and the user device 100 held by "Azhong" may obtain their own voice information encryption results in the same way. Each user device 100 may then transmit its voice information encryption result to the transceiver unit 210 of the server 200.

Please continue to refer to FIG. 1, FIG. 2, FIG. 3, FIG. 4A and FIG. 4B. After the server 200 receives the voice information encryption results from the user devices 100, the mixing unit 220 may perform the additive homomorphic operation on the voice information encryption results to obtain the mixed ciphertext (in the big number format), that is, E(110110100110011100100111101). The transceiver unit 210 may then return the mixed ciphertext to the sound receiving unit 111 of each user device 100. The decryption unit 112 may then decrypt the mixed ciphertext to obtain the mixed voice information (in the big number format), that is, 110110100110011100100111101. The first voice information processing unit 113 may then perform the post-processing operation on the mixed voice information to obtain the mixed voice information post-processing result. In detail, the first voice information processing unit 113 may split the mixed voice information according to the bit width of each voice sample including its overflow buffer code (that is, string segmentation), to obtain the mixed voice information segmentation result [110110100, 110011100, 100111101]. Furthermore, the first voice information processing unit 113 may perform the voice information downward shift operation on the mixed voice information segmentation result to obtain the mixed voice information post-processing result. In other words, the first voice information processing unit 113 may subtract $k \cdot 2^{n-1} = 3 \times 128 = 384$ from each of the three voice samples in the mixed voice information segmentation result (436, 412 and 317 in decimal) to obtain the mixed voice information post-processing result [52, 28, -67]. Finally, the first voice information processing unit 113 may output or play the mixed voice information post-processing result through the input-output device.
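The role of the overflow buffer code in this example can be checked with a short sketch. The text gives only the samples of "Xiaohua", so the tracks used below for "Xiaoming" and "Azhong" are hypothetical values chosen to reproduce the lane totals 436, 412 and 317 of the example; `lane_sums` and `fits` are likewise hypothetical helper names.

```python
def lane_sums(tracks, n_bits):
    """Sum the upward-shifted samples of all tracks, position by position."""
    offset = 1 << (n_bits - 1)
    return [sum(sample + offset for sample in column) for column in zip(*tracks)]


def fits(tracks, n_bits, buffer_bits):
    """Return True if every lane total fits in n_bits + buffer_bits bits."""
    limit = 1 << (n_bits + buffer_bits)
    return all(total < limit for total in lane_sums(tracks, n_bits))


tracks = [
    [-10, 6, 30],     # "Xiaohua", taken from the example
    [30, 10, -50],    # "Xiaoming", hypothetical values
    [32, 12, -47],    # "Azhong", hypothetical values
]

print(lane_sums(tracks, 8))
print(fits(tracks, 8, buffer_bits=1), fits(tracks, 8, buffer_bits=0))
```

Without any buffer bits the totals would exceed the 8-bit lane width and carry into the neighbouring sample of the big number, which is the overflow the buffer code is meant to absorb. Whether a given buffer length is sufficient for every possible input depends on the number of tracks and is not something this sketch establishes.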

FIG. 5 is a flow chart of a method for performing encrypted mixing based on big number format and additive homomorphism according to an embodiment of the present invention, where the method may be implemented by the system 10 shown in FIG. 1. In step S501, the second voice information processing unit performs a pre-processing operation to obtain a voice information pre-processing result, and stores the voice information pre-processing result in a big number format. In step S502, the encryption and transmission unit performs an encryption operation on the voice information pre-processing result to obtain a voice information encryption result in the big number format. In step S503, the mixing unit performs an additive homomorphic operation on the voice information encryption result to obtain a mixed ciphertext in the big number format. In step S504, the decryption unit performs a decryption operation on the mixed ciphertext to obtain mixed voice information in the big number format. In step S505, the first voice information processing unit performs a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result. The method has been described in the foregoing embodiments and is not repeated here.

In summary, in the system and method of the present invention for performing encrypted mixing based on big number format and additive homomorphism, the user device first performs pre-processing operations on the voice information, such as the voice information upward shift operation and the addition of the overflow buffer code, and then performs the encryption operation to obtain the voice information encryption result. The server does not need to perform decryption; instead, it performs the additive homomorphic operation on the voice information encryption results to obtain the mixed ciphertext. Finally, the user device decrypts the mixed ciphertext, performs the post-processing operation, and outputs or plays the result. In other words, because the server does not perform decryption, it never obtains the plaintext voice information, which improves the security of the voice information of the user device.

10: System for performing encrypted mixing based on big number format and additive homomorphism
100: User device
110: Receiving unit
111: Sound receiving unit
112: Decryption unit
113: First voice information processing unit
120: Speaking unit
121: Recording unit
122: Second voice information processing unit
123: Encryption and transmission unit
200: Server
210: Transceiver unit
220: Mixing unit
300: Internet
S201~S208, S501~S505: Steps

FIG. 1 is a schematic diagram of a system for performing encrypted mixing based on big number format and additive homomorphism according to an embodiment of the present invention.
FIG. 2 is an operation flow chart of the units of the system shown in FIG. 1.
FIG. 3 is a schematic diagram of a group encrypted mixing call according to an embodiment of the present invention.
FIG. 4A is a schematic diagram of the first stage of operation of the system shown in FIG. 1 during a group encrypted mixing call.
FIG. 4B is a schematic diagram of the second stage of operation of the system shown in FIG. 1 during a group encrypted mixing call.
FIG. 5 is a flow chart of a method for performing encrypted mixing based on big number format and additive homomorphism according to an embodiment of the present invention.

S501~S505: Steps

Claims

A system for performing encrypted mixing based on large number form and additive homomorphism, comprising: A user device, comprising a decryption unit, a first voice information processing unit, a second voice information processing unit, and an encryption and transmission unit; and A server, comprising a mixing unit, wherein the server is communicatively connected to the user device, wherein The second voice information processing unit performs a pre-processing operation to obtain a voice information pre-processing result, and stores the voice information pre-processing result in large number form; The encryption and transmission unit performs an encryption operation on the voice information pre-processing result to obtain the voice information encryption result in the large number form; The mixing unit performs an additive homomorphic operation on the voice information encryption result to obtain the ciphertext mixing in the large number form; The decryption unit performs a decryption operation on the ciphertext mixture to obtain the mixed voice information in the form of a large number; The first voice information processing unit performs a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result.

The system as described in claim 1, wherein the user device further includes an input-output device and a recording unit, wherein the recording unit receives voice information through the input-output device.

The system as described in claim 2, wherein the voice information includes a voice sample, wherein the second voice information processing unit performs a voice information segmentation operation on the voice information to obtain a voice information segmentation result, and performs a voice information upward shift operation on the voice information segmentation result to obtain a voice information upward shift result, wherein the voice information upward shift operation is associated with the number of bits of the voice sample.

The system as described in claim 3, wherein the second voice information processing unit adds an overflow buffer to the voice information upward shift result to obtain the voice information pre-processing result, wherein the overflow buffer is associated with the number of bits.

The system as described in claim 4, wherein the second voice information processing unit concatenates the voice samples to form the voice information pre-processing result in the large number form.
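
The pre-processing recited in claims 3 to 5 can be illustrated with a minimal Python sketch. It assumes signed 16-bit PCM samples and treats the overflow buffer as a fixed field of 8 headroom bits per sample; the function name `pack_samples` and the parameters `bits` and `guard_bits` are hypothetical and do not appear in the claims, which only state that the upward shift and the overflow buffer are associated with the number of bits of the voice sample.

```python
def pack_samples(samples, bits=16, guard_bits=8):
    """Pack signed PCM samples into one non-negative Python int.

    Assumed layout (illustrative only): each sample occupies a slot of
    bits + guard_bits bits, with the first sample in the highest slot.
    """
    offset = 1 << (bits - 1)   # upward shift: [-2^(b-1), 2^(b-1)) -> [0, 2^b)
    slot = bits + guard_bits   # sample width plus overflow headroom
    packed = 0
    for s in samples:
        if not -offset <= s < offset:
            raise ValueError(f"sample {s} does not fit in {bits} signed bits")
        # Concatenate: make room for one slot, then place the shifted sample.
        packed = (packed << slot) | (s + offset)
    return packed
```

Under these assumptions, each shifted sample is below 2^16, so up to 2^8 = 256 packed values can be added together before one slot carries into its neighbor.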

The system as described in claim 1, wherein the server further includes a transceiver unit, wherein the user device further includes a radio unit, wherein the encryption and transmission unit transmits the voice information encryption result to the transceiver unit, and the transceiver unit transmits the mixed ciphertext to the radio unit.

The system as described in claim 1, wherein the first voice information processing unit performs a mixed voice information segmentation operation on the mixed voice information according to the overflow buffer code to obtain a mixed voice information segmentation result, and performs a voice information downward shift operation on the mixed voice information segmentation result to obtain the mixed voice information post-processing result, wherein the voice information downward shift operation is associated with the number of audio tracks before mixing and the number of bits of the voice sample.
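
The post-processing in this claim can be sketched as follows. This is an illustration under stated assumptions rather than the claimed implementation: samples are signed 16-bit values, every slot has a fixed width of `bits + guard_bits`, and segmentation simply cuts the large number at slot boundaries. The name `unpack_mixed` and all parameters are hypothetical. Because each of the `tracks` inputs was shifted upward by 2^(bits-1) before mixing, the downward shift subtracts `tracks * 2^(bits-1)` from each slot.

```python
def unpack_mixed(packed, count, tracks, bits=16, guard_bits=8):
    """Split a mixed large number into `count` summed samples.

    `tracks` is the number of audio tracks that were added together.
    """
    offset = 1 << (bits - 1)
    slot = bits + guard_bits
    mask = (1 << slot) - 1
    mixed = []
    for _ in range(count):
        # Segment the lowest slot, then undo the accumulated upward shifts.
        mixed.append((packed & mask) - tracks * offset)
        packed >>= slot
    mixed.reverse()            # slots were read last-sample-first
    return mixed


# Two tracks of two samples each, packed with the same assumed layout.
track_a = ((100 + 32768) << 24) | (-200 + 32768)
track_b = ((50 + 32768) << 24) | (25 + 32768)
print(unpack_mixed(track_a + track_b, count=2, tracks=2))  # [150, -175]
```

The returned values are plain sums and may exceed the 16-bit range; clamping or scaling before playback is left out of this sketch.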

The system as described in claim 1, wherein the user device further includes an input-output device, wherein the first voice information processing unit outputs the mixed voice information post-processing result through the input-output device.

The system as described in claim 1, wherein the encryption and transmission unit uses the Paillier algorithm to perform the encryption operation on the voice information pre-processing result; the mixing unit uses the Paillier algorithm to perform the additive homomorphic operation on the voice information encryption result; and the decryption unit uses the Paillier algorithm to perform the decryption operation on the mixed ciphertext.
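
The additive homomorphism that this claim relies on can be shown with a textbook Paillier sketch in Python. It is a toy illustration only: it uses the common simplification g = n + 1 and two small Mersenne primes chosen here for brevity, which is far too weak for real use. The function names (`keygen`, `encrypt`, `add_encrypted`, `decrypt`) are hypothetical and are not taken from the patent or from any library. Multiplying two ciphertexts modulo n^2 yields a ciphertext of the sum of the plaintexts, so the mixing unit never needs the private key.

```python
import math
import secrets

# Toy primes (2^61 - 1 and 2^89 - 1); an assumption for demonstration only.
P = (1 << 61) - 1
Q = (1 << 89) - 1


def keygen(p=P, q=Q):
    """Return (public modulus n, private key (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)   # valid because g = n + 1 (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt a large number m with 0 <= m < n."""
    if not 0 <= m < n:
        raise ValueError("plaintext out of range")
    n2 = n * n
    while True:            # pick random r that is invertible modulo n
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts m1 + m2."""
    return c1 * c2 % (n * n)


def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


n, priv = keygen()
c_mix = add_encrypted(n, encrypt(n, 1200), encrypt(n, 345))
print(decrypt(n, priv, c_mix))  # 1545
```

The decrypted sum is correct only while it stays below n, which bounds how many concatenated samples fit in one plaintext for a given key size.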

A method for performing encrypted mixing based on large number form and additive homomorphism, applicable to a system including a user device and a server, wherein the user device includes a decryption unit, a first voice information processing unit, a second voice information processing unit, and an encryption and transmission unit, and the server includes a mixing unit, wherein the method includes: performing, by the second voice information processing unit, a pre-processing operation to obtain a voice information pre-processing result, and storing the voice information pre-processing result in the large number form; performing, by the encryption and transmission unit, an encryption operation on the voice information pre-processing result to obtain a voice information encryption result in the large number form; performing, by the mixing unit, an additive homomorphic operation on the voice information encryption result to obtain a mixed ciphertext in the large number form; performing, by the decryption unit, a decryption operation on the mixed ciphertext to obtain mixed voice information in the large number form; and performing, by the first voice information processing unit, a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result.