TWI823592B

TWI823592B - System and method for performing encrypted mixing based on big number format and additive homomorphism

Info

Publication number: TWI823592B
Application number: TW111137250A
Authority: TW
Inventors: 廖勝廉; 林逸修; 謝佳育; 何佩玲; 張景雄
Original assignee: 中華電信股份有限公司
Priority date: 2022-09-30
Filing date: 2022-09-30
Publication date: 2023-11-21

Abstract

A system and a method for performing encrypted mixing based on big number format and additive homomorphism are provided. The method includes: performing a preprocessing operation by a second voice information processing unit to obtain a voice information preprocessing result, and storing the voice information preprocessing result in a big number format; performing an encryption operation on the voice information preprocessing result by an encryption and transmission unit to obtain a voice information encryption result in the big number format; performing an additive homomorphic operation on the voice information encryption result by a mixing unit to obtain a ciphertext mix in the big number format; performing a decryption operation on the ciphertext mix by a decryption unit to obtain mixed voice information in the big number format; and performing a post-processing operation on the mixed voice information by a first voice information processing unit to obtain a mixed voice information post-processing result.

Description

System and method for performing encrypted mixing based on big number format and additive homomorphism

The present invention relates to a system and a method for performing encrypted mixing based on big number format and additive homomorphism.

In a conventional encrypted mixing method, the user equipment encrypts the voice and transmits the encrypted voice to a server, and the server performs decryption and mixing. The server then encrypts the mixing result and transmits it to the user equipment, which decrypts and plays it. In other words, in the conventional method the server performs decryption and therefore obtains the plaintext voice information. If the server is monitored by a third party, this approach creates an information security risk.

The system of the present invention for performing encrypted mixing based on big number format and additive homomorphism includes user equipment and a server. The user equipment includes a decryption unit, a first voice information processing unit, a second voice information processing unit, and an encryption and transmission unit. The server includes a mixing unit and is communicatively connected to the user equipment. The second voice information processing unit performs a preprocessing operation to obtain a voice information preprocessing result, and stores the voice information preprocessing result in a big number format. The encryption and transmission unit performs an encryption operation on the voice information preprocessing result to obtain a voice information encryption result in the big number format. The mixing unit performs an additive homomorphic operation on the voice information encryption result to obtain a ciphertext mix in the big number format. The decryption unit performs a decryption operation on the ciphertext mix to obtain mixed voice information in the big number format. The first voice information processing unit performs a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result.

The method of the present invention for performing encrypted mixing based on big number format and additive homomorphism includes: performing a preprocessing operation by a second voice information processing unit to obtain a voice information preprocessing result, and storing the voice information preprocessing result in a big number format; performing an encryption operation on the voice information preprocessing result by an encryption and transmission unit to obtain a voice information encryption result in the big number format; performing an additive homomorphic operation on the voice information encryption result by a mixing unit to obtain a ciphertext mix in the big number format; performing a decryption operation on the ciphertext mix by a decryption unit to obtain mixed voice information in the big number format; and performing a post-processing operation on the mixed voice information by a first voice information processing unit to obtain a mixed voice information post-processing result.

FIG. 1 is a schematic diagram of a system 10 for performing encrypted mixing based on big number format and additive homomorphism according to an embodiment of the present invention. The system 10 may include a plurality of user equipments 100 and a server 200. It should be noted that the present invention does not limit the number of user equipments 100. The user equipments 100 and the server 200 may support Voice over IP (VoIP).

The user equipment 100 (or the server 200) has the components necessary to operate it, such as a processing unit (for example, but not limited to, a processor), a communication unit (for example, but not limited to, various communication chips, Bluetooth chips, or WiFi chips), and a storage unit (for example, but not limited to, removable random access memory, flash memory, or a hard disk). The server 200 may be communicatively connected to each user equipment 100 (respectively). For example, the server 200 may be communicatively connected to each user equipment 100 via the Internet 300.

The user equipment 100 may include a receiving unit 110 and a speaking unit 120. The receiving unit 110 may include a sound receiving unit 111, a decryption unit 112, and a first voice information processing unit 113. The speaking unit 120 may include a recording unit 121, a second voice information processing unit 122, and an encryption and transmission unit 123. In this embodiment, the user equipment 100 may include an input/output device (not shown), for example a microphone or a speaker used to input or output voice. The server 200, in turn, may include a transceiver unit 210 and a mixing unit 220.

FIG. 2 is an operation flowchart of the units of the system 10 shown in FIG. 1. Please refer to FIG. 1 and FIG. 2 together. In step S201, the recording unit 121 may receive voice information through the input/output device. In detail, the user may operate the user equipment 100 to input voice information (in the form of an analog signal) to the recording unit 121. The recording unit 121 may then convert the analog voice information into digital voice information and send the (digital) voice information to the second voice information processing unit 122. It should be noted that the voice information may include (a plurality of) voice samples. The user may use the input/output device to set the number of bits of a voice sample. The number of bits is, for example, 8 bits or 16 bits, but the present invention is not limited thereto.

In step S202, the second voice information processing unit 122 may perform a preprocessing operation to obtain a voice information preprocessing result, and store the voice information preprocessing result in the big number format. Specifically, the second voice information processing unit 122 may perform a voice information segmentation operation on the voice information to obtain a voice information segmentation result, and perform a voice information upward shift operation on the voice information segmentation result to obtain a voice information upward shift result. The upward shift operation may be associated with the number of bits of the voice sample. In detail, the second voice information processing unit 122 may segment the voice information into its individual voice samples. Then, if the number of bits of a voice sample is $n$ bits, the second voice information processing unit 122 may add $2^{n-1}$ to each voice sample in the voice information to perform the upward shift operation. In other words, because each voice sample in the (digital) voice information that the second voice information processing unit 122 obtains from the recording unit 121 in step S201 may be negative, the second voice information processing unit 122 adds $2^{n-1}$ to each voice sample so that no voice sample is negative.

Furthermore, the second voice information processing unit 122 may add an overflow buffer code to the voice information upward shift result to obtain the voice information preprocessing result. The overflow buffer code may be associated with the number of bits (of the voice sample). For example, the second voice information processing unit 122 may use Formula 1 below to obtain the number of bits (length) of the overflow buffer code. Finally, the second voice information processing unit 122 may concatenate these voice samples, store the voice information preprocessing result in the big number format, and send the voice information preprocessing result to the encryption and transmission unit 123.

……… (Formula 1)

where $k$ is the number of tracks before mixing.
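The preprocessing in step S202 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `preprocess` and its parameters are hypothetical, the input samples are assumed to be signed integers in two's-complement range, and the overflow buffer length is passed in as `buffer_bits` because Formula 1 is not reproduced in this text. Python's built-in `int` is arbitrary precision, so it stands in for the big number format.

```python
def preprocess(samples, n, buffer_bits):
    """Pack signed n-bit samples into one non-negative big integer.

    Each sample is shifted up by 2**(n-1) so that it is non-negative, then
    stored in a field of (buffer_bits + n) bits. The extra high bits of each
    field are the overflow buffer and start out as zero.
    """
    offset = 1 << (n - 1)            # upward shift, 2**(n-1)
    field_bits = buffer_bits + n     # width of one sample plus its buffer
    packed = 0
    for s in samples:
        if not -offset <= s < offset:
            raise ValueError(f"sample {s} does not fit in {n} signed bits")
        # Append the shifted sample as the next (lower) field.
        packed = (packed << field_bits) | (s + offset)
    return packed
```

Because the buffer bits of the first field are zero, the leading zeros disappear when the packed value is written out in binary, which is why the receiver needs to know the field width to split the value again.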

It should be noted that $k$ (the number of tracks before mixing) in Formula 1 may be the number of user equipments 100 shown in FIG. 1. In this embodiment, the server 200 may inform each user equipment 100 of the number of user equipments 100 in advance, so that the second voice information processing unit 122 can use the value of $k$ when adding the overflow buffer code in step S202.

In step S203, the encryption and transmission unit 123 may perform an encryption operation on the voice information preprocessing result (stored in the big number format) to obtain a voice information encryption result in the big number format. Specifically, the encryption and transmission unit 123 may use the Paillier algorithm to encrypt the voice information preprocessing result. The encryption and transmission unit 123 may then transmit the voice information encryption result to the transceiver unit 210. In detail, the encryption and transmission unit 123 may transmit the voice information encryption result over the Internet 300 to the transceiver unit 210 of the server 200 according to the UDP/RTP transport protocol.
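As an illustration of the encryption step, the following is a minimal textbook Paillier sketch with the common generator choice g = N + 1. The function names `keygen`, `encrypt`, and `decrypt` are hypothetical, and the two Mersenne primes in the usage lines are toy values chosen only so the example runs quickly; a real deployment would use a vetted library and far larger primes.

```python
import math
import secrets

def keygen(p, q):
    """Build a Paillier key pair from two distinct primes (generator g = N + 1)."""
    N = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, N)  # modular inverse of lam mod N (Python 3.8+)
    return N, (lam, mu)

def encrypt(N, m):
    """Encrypt 0 <= m < N as (1 + m*N) * r**N mod N**2 with a random r."""
    if not 0 <= m < N:
        raise ValueError("plaintext must lie in [0, N)")
    N2 = N * N
    while True:
        r = secrets.randbelow(N - 1) + 1  # 1 <= r < N
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2

def decrypt(N, priv, c):
    """Recover m as L(c**lam mod N**2) * mu mod N, where L(x) = (x - 1) // N."""
    lam, mu = priv
    return (pow(c, lam, N * N) - 1) // N * mu % N

# Toy key: 2**31 - 1 and 2**61 - 1 are primes, but far too small for real use.
N, priv = keygen(2**31 - 1, 2**61 - 1)
packed = 0b1110110010000110010011110   # a packed preprocessing result
assert decrypt(N, priv, encrypt(N, packed)) == packed
```

The plaintext must be smaller than the modulus N, so the number of samples packed into one big number is bounded by the key size.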

In step S204, the transceiver unit 210 may unpack the UDP/RTP packet according to the UDP/RTP transport protocol to obtain the voice information encryption result, and send the voice information encryption result to the mixing unit 220.
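The text does not describe how the big number ciphertext is laid out inside the RTP payload. The sketch below is one possible arrangement, stated here as an assumption: a fixed 12-byte RTP version 2 header (no CSRC entries, no extension) followed by the ciphertext as big-endian bytes. The names `pack_rtp` and `unpack_rtp` and the dynamic payload type 96 are hypothetical.

```python
import struct

# Fixed RTP header: V/P/X/CC byte, M/PT byte, sequence number, timestamp, SSRC.
RTP_HEADER = struct.Struct("!BBHII")

def pack_rtp(ciphertext, seq, timestamp, ssrc, payload_type=96):
    """Wrap a big-integer ciphertext in a minimal RTP packet (assumed layout)."""
    length = (ciphertext.bit_length() + 7) // 8 or 1
    payload = ciphertext.to_bytes(length, "big")
    header = RTP_HEADER.pack(
        0x80,                    # version 2, no padding, no extension, CC = 0
        payload_type & 0x7F,     # marker bit clear
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload

def unpack_rtp(packet):
    """Return (ciphertext, seq, timestamp, ssrc); assumes CC = 0, no extension."""
    vpxcc, _mpt, seq, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    if vpxcc >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    return int.from_bytes(packet[RTP_HEADER.size:], "big"), seq, timestamp, ssrc
```

The resulting bytes would then be sent as a UDP datagram, for example with `socket.sendto`.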

In step S205, the mixing unit 220 may perform an additive homomorphic operation on the voice information encryption results to obtain a ciphertext mix in the big number format. Specifically, the mixing unit 220 may use the Paillier algorithm to perform the additive homomorphic operation on the voice information encryption results. In other words, the server 200 of the present invention does not need to perform decryption; it only needs to add the ciphertexts to obtain the ciphertext mix. The transceiver unit 210 may then pack the ciphertext mix into the UDP/RTP format and transmit it over the Internet 300 to the sound receiving unit 111.
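The text does not spell out the ciphertext operation itself. In the textbook Paillier scheme, "adding" ciphertexts means multiplying them modulo $N^2$, where $N$ is the public modulus (written $N$ here to avoid confusion with the sample width $n$). Assuming the common generator $g = N + 1$ and encryption $E(m; r) = g^{m} r^{N} \bmod N^2$, the property can be derived as:

```latex
\begin{aligned}
E(m_1; r_1) \cdot E(m_2; r_2) \bmod N^2
  &= \left(g^{m_1} r_1^{N}\right)\left(g^{m_2} r_2^{N}\right) \bmod N^2 \\
  &= g^{m_1 + m_2} \, (r_1 r_2)^{N} \bmod N^2 \\
  &= E\!\left((m_1 + m_2) \bmod N;\; r_1 r_2\right)
\end{aligned}
```

Decrypting the product of $k$ ciphertexts therefore yields $\sum_{i=1}^{k} m_i \bmod N$, which equals the plain sum of the packed values as long as that sum stays below $N$.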

In step S206, after the sound receiving unit 111 receives the packet, the sound receiving unit 111 may unpack it according to the UDP/RTP transport protocol to obtain the ciphertext mix. The sound receiving unit 111 may then send the ciphertext mix to the decryption unit 112.

In step S207, the decryption unit 112 may perform a decryption operation on the ciphertext mix to obtain mixed voice information in the big number format. Specifically, the decryption unit 112 may use the Paillier algorithm to decrypt the ciphertext mix. In detail, the decryption unit 112 may decrypt the ciphertext mix with the Paillier key to obtain the mixed voice information, and send the mixed voice information to the first voice information processing unit 113.

In step S208, the first voice information processing unit 113 may perform a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result. Specifically, the first voice information processing unit 113 may perform a mixed voice information segmentation operation on the mixed voice information according to the overflow buffer code to obtain a mixed voice information segmentation result, and perform a voice information downward shift operation on the segmentation result to obtain the mixed voice information post-processing result. The downward shift operation may be associated with the number of tracks before mixing and the number of bits of the voice sample. In detail, if the number of bits of a voice sample (in the mixed voice information segmentation result) is $n$ bits and $k$ is the number of tracks before mixing (as explained in the foregoing embodiment, $k$ may be the number of user equipments 100 shown in FIG. 1), the first voice information processing unit 113 may subtract $k \times 2^{n-1}$ from each voice sample (in the mixed voice information segmentation result) to perform the downward shift operation. Finally, the first voice information processing unit 113 may output the mixed voice information post-processing result through the input/output device.
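The post-processing in step S208 can be sketched as the inverse of the packing step. The function name `postprocess` and its parameters are hypothetical; in particular, the number of samples per big number (`count`) is assumed to be known to the receiver, and `buffer_bits` is again passed in rather than derived from Formula 1.

```python
def postprocess(mixed, count, n, buffer_bits, k):
    """Split a decrypted mix into `count` fields and shift each one back down.

    Every field is (buffer_bits + n) bits wide. Each of the k mixed tracks
    contributed an upward shift of 2**(n-1), so k * 2**(n-1) is subtracted.
    """
    field_bits = buffer_bits + n
    mask = (1 << field_bits) - 1
    offset = k * (1 << (n - 1))
    samples = []
    for _ in range(count):
        samples.append((mixed & mask) - offset)  # lowest field is the last sample
        mixed >>= field_bits
    return samples[::-1]                          # restore original sample order
```

Splitting from the low end with a mask avoids any dependence on leading zeros that are lost when the big number is stored as an integer.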

FIG. 3 is a schematic diagram of a group encrypted mixing call according to an embodiment of the present invention. Please refer to FIG. 1, FIG. 2, and FIG. 3 together. In the embodiment shown in FIG. 3, three users, "Xiaohua", "Xiaoming", and "Azhong", each operate a user equipment 100 to carry out a group encrypted mixing call. In other words, in the embodiment of FIG. 3, the value of $k$ in Formula 1 (the number of tracks before mixing, that is, the number of user equipments 100) is 3. It is assumed here that the user equipments 100 and the server 200 shown in FIG. 3 have exchanged the homomorphic encryption keys in advance, and that the number of bits of a voice sample set on each user equipment 100 is 8 bits.

Similar to the foregoing embodiment, the users "Xiaohua", "Xiaoming", and "Azhong" may each start talking into their own user equipment 100. The conversation is first captured by the recording unit 121 in the speaking unit 120 to obtain a digital signal composed of a series of 8-bit voice samples. The second voice information processing unit 122 then preprocesses this digital signal to obtain a voice information preprocessing result. Next, the encryption and transmission unit 123 encrypts the voice information preprocessing result to obtain a voice information encryption result and transmits it to the transceiver unit 210 of the server 200. After the transceiver unit 210 has received the voice information encryption result of each user equipment 100, the mixing unit 220 of the server 200 performs the additive homomorphic operation on the voice information encryption results to obtain a ciphertext mix, and the transceiver unit 210 sends the ciphertext mix back to the sound receiving unit 111 of each user equipment 100 used by "Xiaohua", "Xiaoming", and "Azhong". Each user equipment 100 may then decrypt the ciphertext mix through its decryption unit 112 to obtain mixed voice information in the big number format that has not yet been post-processed. Finally, the first voice information processing unit 113 of each user equipment 100 may post-process the mixed voice information to obtain a mixed voice information post-processing result (that can be output or played), and output it through the input/output device of that user equipment 100.

FIG. 4A is a schematic diagram of the first-stage operation of the system 10 shown in FIG. 1 during a group encrypted mixing call. FIG. 4B is a schematic diagram of the second-stage operation of the system 10 shown in FIG. 1 during a group encrypted mixing call. Continuing from the embodiment of FIG. 3, in the embodiment of FIG. 4A and FIG. 4B the value of $k$ in Formula 1 (the number of tracks before mixing, that is, the number of user equipments 100) is 3, and the number of bits of a voice sample set on each user equipment 100 is 8 bits (that is, $n = 8$).

Please refer first to FIG. 1, FIG. 2, FIG. 3, and FIG. 4A. The following continues the description using the user "Xiaohua" as an example. As shown in FIG. 4A, assume that the voice information received by the recording unit 121 of the user equipment 100 held by "Xiaohua" includes three 8-bit voice samples [11110110, 00000110, 00011110]. The second voice information processing unit 122 of that user equipment 100 may add $2^{8-1} = 128$ to each of the three voice samples to perform the upward shift operation and obtain the voice information upward shift result (that is, the three voice samples [118, 134, 158]). The three voice samples are thereby guaranteed to be non-negative (integers from 0 to 255). Next, to avoid overflow in the subsequent mixing process (on the server 200), when the second voice information processing unit 122 concatenates the three voice samples into the big number format, it may add the overflow buffer code to the upward shift result to obtain the voice information preprocessing result. In detail, according to Formula 1 with $k = 3$, the second voice information processing unit 122 may add an overflow buffer code with a length of 1 bit in front of each of the three voice samples to obtain the voice information preprocessing result (that is, 1110110010000110010011110), which completes the conversion to the big number format.

The encryption and transmission unit 123 of the user equipment 100 held by "Xiaohua" may then perform the encryption operation (that is, homomorphic encryption) on the voice information preprocessing result (in the big number format) to obtain the voice information encryption result in the big number format, namely E(1110110010000110010011110). Similarly, the user equipment 100 held by "Xiaoming" and the user equipment 100 held by "Azhong" may obtain their own voice information encryption results in the same way as "Xiaohua". Each user equipment 100 may then transmit its voice information encryption result to the transceiver unit 210 of the server 200.
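The arithmetic of this first stage can be written out step by step. The sketch below assumes the three bit strings are 8-bit two's-complement values, which is consistent with the shifted results [118, 134, 158] given in the text:

```latex
\begin{aligned}
11110110_2 &= -10, & -10 + 2^{7} &= 118 = 01110110_2 \\
00000110_2 &= 6,   & 6 + 2^{7}   &= 134 = 10000110_2 \\
00011110_2 &= 30,  & 30 + 2^{7}  &= 158 = 10011110_2
\end{aligned}
\qquad
\begin{aligned}
&\underbrace{0\,01110110}_{118}\;\underbrace{0\,10000110}_{134}\;\underbrace{0\,10011110}_{158} \\
&= 118 \cdot 2^{18} + 134 \cdot 2^{9} + 158 = 31001758 \\
&= 1110110010000110010011110_2
\end{aligned}
```

Each field is 9 bits wide (1 buffer bit plus 8 sample bits), and the two leading zeros of the first field vanish when the value is written as a single binary number, leaving 25 digits.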

Please continue to refer to FIG. 1, FIG. 2, FIG. 3, FIG. 4A, and FIG. 4B. After the server 200 receives the voice information encryption results from the user equipments 100, the mixing unit 220 may perform the additive homomorphic operation on the voice information encryption results to obtain the ciphertext mix (in the big number format), namely E(110110100110011100100111101). The transceiver unit 210 may then return the ciphertext mix to the sound receiving unit 111 of each user equipment 100. The decryption unit 112 may decrypt the ciphertext mix to obtain the mixed voice information (in the big number format), namely 110110100110011100100111101. The first voice information processing unit 113 may then post-process the mixed voice information to obtain the mixed voice information post-processing result. In detail, the first voice information processing unit 113 may segment the mixed voice information according to the bit width of each voice sample including its overflow buffer code (that is, string splitting) to obtain the mixed voice information segmentation result [110110100, 110011100, 100111101]. The first voice information processing unit 113 may then perform the downward shift operation on the segmentation result: it subtracts $3 \times 2^{8-1} = 384$ from each of the three voice samples in the segmentation result to obtain the mixed voice information post-processing result [52, 28, -67]. Finally, the first voice information processing unit 113 may output or play the mixed voice information post-processing result through the input/output device.
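The second stage can be checked the same way. Writing $s_i$ for the original signed sample of track $i$, the downward shift removes exactly the $k$ upward shifts that were summed during mixing, and the three 9-bit fields of the example then give the stated result:

```latex
\sum_{i=1}^{k}\left(s_i + 2^{n-1}\right) - k \cdot 2^{n-1} = \sum_{i=1}^{k} s_i
\qquad\Longrightarrow\qquad
\begin{aligned}
110110100_2 &= 436, & 436 - 3 \cdot 2^{7} &= 52 \\
110011100_2 &= 412, & 412 - 3 \cdot 2^{7} &= 28 \\
100111101_2 &= 317, & 317 - 3 \cdot 2^{7} &= -67
\end{aligned}
```

All three field sums are below $2^{9} = 512$, so in this example no carry crosses from one field into the next.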

FIG. 5 is a flowchart of a method for performing encrypted mixing based on big number format and additive homomorphism according to an embodiment of the present invention, where the method may be implemented by the system 10 shown in FIG. 1. In step S501, the second voice information processing unit performs a preprocessing operation to obtain a voice information preprocessing result, and stores the voice information preprocessing result in the big number format. In step S502, the encryption and transmission unit performs an encryption operation on the voice information preprocessing result to obtain a voice information encryption result in the big number format. In step S503, the mixing unit performs an additive homomorphic operation on the voice information encryption result to obtain a ciphertext mix in the big number format. In step S504, the decryption unit performs a decryption operation on the ciphertext mix to obtain mixed voice information in the big number format. In step S505, the first voice information processing unit performs a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result. The method has been described in the foregoing embodiments and is not repeated here.

In summary, in the system and method of the present invention for performing encrypted mixing based on big number format and additive homomorphism, the user equipment first applies preprocessing operations to the voice information, such as the upward shift operation and the addition of the overflow buffer code, and then performs the encryption operation to obtain the voice information encryption result. The server does not need to perform decryption; instead, it performs the additive homomorphic operation on the voice information encryption results to obtain the ciphertext mix. Finally, the user equipment decrypts the ciphertext mix and outputs or plays it after performing the post-processing operation. Because the server does not need to perform decryption, the server never obtains the plaintext voice information, which improves the security of the voice information of the user equipment.

10: system for performing encrypted mixing based on big number format and additive homomorphism
100: user equipment
110: receiving unit
111: sound receiving unit
112: decryption unit
113: first voice information processing unit
120: speaking unit
121: recording unit
122: second voice information processing unit
123: encryption and transmission unit
200: server
210: transceiver unit
220: mixing unit
300: Internet
S201~S208, S501~S505: steps

FIG. 1 is a schematic diagram of a system for performing encrypted mixing based on big number format and additive homomorphism according to an embodiment of the present invention. FIG. 2 is an operation flowchart of the units of the system shown in FIG. 1. FIG. 3 is a schematic diagram of a group encrypted mixing call according to an embodiment of the present invention. FIG. 4A is a schematic diagram of the first-stage operation of the system shown in FIG. 1 during a group encrypted mixing call. FIG. 4B is a schematic diagram of the second-stage operation of the system shown in FIG. 1 during a group encrypted mixing call. FIG. 5 is a flowchart of a method for performing encrypted mixing based on big number format and additive homomorphism according to an embodiment of the present invention.

S501~S505: steps

Claims

A system for performing encrypted mixing based on big number format and additive homomorphism, including: user equipment, including a decryption unit, a first voice information processing unit, a second voice information processing unit, and an encryption and transmission unit; and a server, including a mixing unit, wherein the server is communicatively connected to the user equipment, wherein the second voice information processing unit performs a preprocessing operation to obtain a voice information preprocessing result, and stores the voice information preprocessing result in a big number format; the encryption and transmission unit performs an encryption operation on the voice information preprocessing result to obtain a voice information encryption result in the big number format; the mixing unit performs an additive homomorphic operation on the voice information encryption result to obtain a ciphertext mix in the big number format; the decryption unit performs a decryption operation on the ciphertext mix to obtain mixed voice information in the big number format; and the first voice information processing unit performs a post-processing operation on the mixed voice information to obtain a mixed voice information post-processing result, wherein the first voice information processing unit performs a mixed voice information segmentation operation on the mixed voice information according to an overflow buffer code to obtain a mixed voice information segmentation result, and performs a voice information downward shift operation on the mixed voice information segmentation result to obtain the mixed voice information post-processing result, wherein the voice information downward shift operation is associated with a number of tracks before mixing and a number of bits of a voice sample.

The system of claim 1, wherein the user equipment further includes an input and output device and a recording unit, and the recording unit receives voice information through the input and output device.

The system of claim 2, wherein the voice information includes the voice sample, wherein the second voice information processing unit performs a voice information segmentation operation on the voice information to obtain a voice information segmentation result, and performs a voice information upward translation operation on the voice information segmentation result to obtain a voice information upward translation result, wherein the voice information upward translation operation is associated with the number of bits of the voice sample.

The system according to claim 3, wherein the second voice information processing unit adds the overflow buffer code to the voice information upward translation result to obtain the voice information pre-processing result, wherein the overflow buffer code is associated with the number of bits.

The system of claim 4, wherein the second voice information processing unit concatenates the voice samples to form the voice information pre-processing result in the large number type.
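The pre-processing recited in the preceding claims (segmentation, upward translation, overflow buffer code, concatenation) can be illustrated with a minimal Python sketch. The claims do not fix the exact arithmetic, so the following are assumptions made only for illustration: the upward translation adds 2^(bits-1) so that each signed sample becomes non-negative, the overflow buffer code is a run of `guard_bits` zero bits placed above each sample, and the function name `pack_samples` and its parameters are hypothetical.

```python
def pack_samples(samples, bits=16, guard_bits=8):
    """Pack signed PCM samples into one non-negative big integer.

    Assumed layout (not taken from the patent text): each sample occupies a
    slot of `bits + guard_bits` bits; the guard bits start at zero and absorb
    carries when several packed integers are later added together.
    """
    offset = 1 << (bits - 1)      # assumed upward translation amount
    slot = bits + guard_bits      # slot width including the overflow buffer
    packed = 0
    for s in samples:
        if not -offset <= s < offset:
            raise ValueError(f"sample {s} does not fit in {bits} signed bits")
        # Shift earlier samples up one slot and append the translated sample.
        packed = (packed << slot) | (s + offset)
    return packed


print(hex(pack_samples([0, -1])))  # 0x8000007fff
```

Python integers have arbitrary precision, so the packed value plays the role of the large number type here; other languages would need a big-integer library.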

The system of claim 1, wherein the server further includes a transmitting unit and the user equipment further includes a sound receiving unit, wherein the encryption and transmission unit transmits the voice information encryption result to the transmitting unit, and the transmitting unit transmits the mixed ciphertext to the sound receiving unit.

The system according to claim 1, wherein the user equipment further includes an input and output device, wherein the first voice information processing unit outputs the mixed voice information post-processing result through the input and output device.

The system according to claim 1, wherein the encryption and transmission unit uses a Paillier algorithm to perform the encryption operation on the voice information pre-processing result; the mixing unit uses the Paillier algorithm to perform the additive homomorphic operation on the voice information encryption result; and the decryption unit uses the Paillier algorithm to perform the decryption operation on the mixed ciphertext.
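The additive homomorphism that the Paillier algorithm provides can be shown with a small textbook sketch: multiplying two ciphertexts modulo n² yields a ciphertext of the sum of the plaintexts, so a server can mix without decrypting. This is an illustrative toy and not the patented implementation. The function names are hypothetical, the generator is assumed to be g = n + 1, and the two Mersenne primes are far too small for real security.

```python
import math
import secrets


def keygen(p, q):
    """Build a Paillier key pair from two distinct primes, assuming g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, (n, lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public modulus n."""
    if not 0 <= m < n:
        raise ValueError("plaintext out of range")
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    # (n + 1)^m mod n^2 equals 1 + m*n, which avoids one exponentiation.
    return (1 + m * n) * pow(r, n, n2) % n2


def add(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts m1 + m2."""
    return c1 * c2 % (n * n)


def decrypt(priv, c):
    """Recover the plaintext with the private key (n, lam, mu)."""
    n, lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


# Toy parameters: two Mersenne primes, for demonstration only.
n, priv = keygen(2**61 - 1, 2**89 - 1)
mixed = add(n, encrypt(n, 1200), encrypt(n, 34))
print(decrypt(priv, mixed))  # 1234
```

The sum is only recovered correctly while it stays below n, so in this sketch the packed big integer must be shorter than the modulus.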

A method for performing encrypted mixing based on large number types and additive homomorphism, suitable for a system including user equipment and a server, wherein the user equipment includes a decryption unit, a first voice information processing unit, a second voice information processing unit, and an encryption and transmission unit, and the server includes a mixing unit, wherein the method includes: performing a pre-processing operation by the second voice information processing unit to obtain a voice information pre-processing result, and storing the voice information pre-processing result in a large number type; performing an encryption operation on the voice information pre-processing result by the encryption and transmission unit to obtain a voice information encryption result in the large number type; performing an additive homomorphic operation on the voice information encryption result by the mixing unit to obtain a mixed ciphertext in the large number type; performing a decryption operation on the mixed ciphertext by the decryption unit to obtain mixed voice information in the large number type; and performing a post-processing operation on the mixed voice information by the first voice information processing unit to obtain a mixed voice information post-processing result, wherein the first voice information processing unit performs a mixed voice information segmentation operation on the mixed voice information according to an overflow buffer code to obtain a mixed voice information segmentation result, and performs a voice information downward translation operation on the mixed voice information segmentation result to obtain the mixed voice information post-processing result, wherein the voice information downward translation operation is associated with the number of audio tracks before mixing and the number of bits of a voice sample.
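The post-processing step can be sketched in Python on the decrypted sum. Because the homomorphic operation adds the packed plaintexts, plain integer addition stands in for the encrypt, mix, and decrypt stages here. The slot layout, the helper names `pack` and `unpack_mixed`, and the reading of the downward translation as subtracting `tracks * 2^(bits-1)` are assumptions for illustration; the claims only state that the operation depends on the number of audio tracks before mixing and the number of bits of the voice sample.

```python
def pack(samples, bits=16, guard_bits=8):
    """Assumed pre-processing: shift samples up by 2^(bits-1) and concatenate."""
    offset = 1 << (bits - 1)
    slot = bits + guard_bits
    packed = 0
    for s in samples:
        packed = (packed << slot) | (s + offset)
    return packed


def unpack_mixed(mixed, count, tracks, bits=16, guard_bits=8):
    """Split the mixed big integer into slots and translate each one down.

    `count` is the number of samples per track and `tracks` is the number
    of audio tracks that were added together.
    """
    if tracks > (1 << guard_bits):
        raise ValueError("too many tracks for the overflow buffer width")
    slot = bits + guard_bits
    mask = (1 << slot) - 1
    offset = 1 << (bits - 1)
    out = []
    for _ in range(count):
        # Each slot holds the sum of `tracks` translated samples, so the
        # combined upward translation to remove is tracks * offset.
        out.append((mixed & mask) - tracks * offset)
        mixed >>= slot
    out.reverse()  # the lowest slot holds the last sample
    return out


track_a = [100, -200, 300]
track_b = [-50, 25, -32768]
mixed = pack(track_a) + pack(track_b)  # stands in for the decrypted mix
print(unpack_mixed(mixed, count=3, tracks=2))  # [50, -175, -32468]
```

The recovered values are raw sums and can exceed the original sample range; any clipping or scaling before playback is outside this sketch.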