EP2014066A1

EP2014066A1 - Method and apparatus for voice signal encryption/decryption

Info

Publication number: EP2014066A1
Application number: EP07746030A
Authority: EP
Inventors: Won Yong Choi
Original assignee: Prestige Corp
Current assignee: Prestige Corp
Priority date: 2006-04-18
Filing date: 2007-04-17
Publication date: 2009-01-14
Also published as: KR20070103113A; WO2007120006A1; KR100836942B1

Abstract

A voice signal encryption/decryption method and a device for real-time encrypted phone conversation using the method are provided. A real-time voice is converted to digital data, and the amount of data is reduced by performing a time-scale modification (TSM) by a predetermined ratio. The time-scale-modified digital voice signal is separated into contiguous pre-encryption frames, each frame is separated into shuffling blocks, the time order of the shuffling blocks is mixed, and post-encryption frames are then generated by adding dummy voice signals to each of the pre-encryption frames. A predetermined amount of synchronization data is added to the signal, which is then transmitted to the other side. These processes are performed according to predetermined encryption rules using an encryption key set by the user. The encrypted voice signal from the transmitting side is decrypted on the receiving side by applying the encryption rules in reverse with the same encryption key.

Description

METHOD AND APPARATUS FOR VOICE SIGNAL ENCRYPTION/DECRYPTION

Technical Field

[1] The invention relates to a method and apparatus for voice signal encryption/decryption. More particularly, the invention relates to a method and apparatus for voice signal encryption/decryption for securing real-time voice communication.

Background Art

[2] Eavesdropping on cellular phones recently became a public issue in Korea. The cellular phone service company claimed that it is impossible to eavesdrop on a cellular phone, but it turned out that eavesdropping had actually occurred. Thus, many people worry about eavesdropping and, because they distrust the cellular phone service company, want a personal anti-eavesdropping means. Eavesdropping is a problem not only for cellular phone communication but also for other communication devices such as a regular telephone, an internet phone, a two-way radio, and a walkie-talkie. One anti-eavesdropping measure that users can take is to prevent the communication from being disclosed to others by using encryption technology.

[3] Representative available voice signal encryption technologies include an analog voice scrambler type and a digital voice encryptor type. The analog voice scrambler type makes the communication contents indecipherable to eavesdroppers by transforming the voice signal into several elements in the frequency domain or time domain using random numbers generated by a pseudo-random number generator. This type has a problem of residual intelligibility, in which some remnants of the voice signal remain even after the encryption process, reducing confidence in the security of the communication, and therefore its usage is decreasing.

[4] The digital voice signal encryptor type has the advantage of increased security during the encryption process, but since it uses digital signals it cannot use a regular voice communication device and therefore has to use an additional data communication channel. With current cellular communication types such as CDMA or GSM, the digital voice signal encryptor type is difficult to apply to conventional voice communication networks (regular communication networks) due to the increased amount of data, and commercialization is not easy because it is possible only with additional digital data communication. Also, since the digital voice signal encryptor type processes digital values, the security of the entire system would be easily and seriously breached if the way the digital values are processed were exposed to the public or deciphered.

Disclosure of Invention

Technical Problem

[5] An object of the invention is to provide a method and apparatus for voice signal encryption/decryption for securing real-time voice communication by encrypting the real-time voice through mixing the real-time voice with the past voice of the speaker.

[6] Another object of the invention is to provide the above method and apparatus for voice signal encryption/decryption, in which the apparatus is connected externally to conventional communication devices, including a cellular telephone, wireless/wired phones, a two-way radio, and a walkie-talkie, and therefore increases the security of real-time voice communication without any change to the communication devices.

Technical Solution

[7] To achieve the above goals, in the present invention, the analog voice signal of the user is converted to a digital voice signal; then, according to encryption rules using an encryption key value, the amount of data is reduced by time-scale modification of the converted digital voice signal, and the time order of the voice signal is mixed randomly by a shuffling process. Separately, the invention extracts a dummy voice signal from the real-time or past voice signal of the user. The dummy voice signal is added to the shuffled voice signal such that the two voice signals are mixed randomly. It is desirable to add the dummy voice signal to the digital voice signal in an amount equal to the amount by which the data was reduced in the time-scale modification.
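The relationship between the time-scale ratio and the amount of dummy data can be illustrated with a small calculation. The following is only a sketch: the function name `dummy_budget`, the 8000-sample frame, and the ratio 0.8 are illustrative assumptions and do not come from the specification.

```python
from typing import Tuple


def dummy_budget(frame_samples: int, tsm_ratio: float) -> Tuple[int, int]:
    """Return (samples kept after TSM, dummy samples needed to refill the frame).

    tsm_ratio is assumed to mean output length divided by input length.
    """
    if not 0.0 < tsm_ratio <= 1.0:
        raise ValueError("tsm_ratio must be in (0, 1]")
    kept = round(frame_samples * tsm_ratio)
    # The dummy signal fills exactly the amount removed by the TSM step.
    return kept, frame_samples - kept


kept, dummy = dummy_budget(frame_samples=8000, tsm_ratio=0.8)
print(kept, dummy)  # 6400 1600
```

Under these assumptions the encrypted frame occupies the same number of samples as the original speech, which is consistent with the stated aim of adding dummy data in the amount removed by the time-scale modification.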

[8] Since the result is a mix of the current and past voice signals, and even the current voice signal is shuffled in time, the sound extracted from such an encrypted voice signal is unintelligible and sounds like mere noise. Also, since both the voice signal to be encrypted and the dummy voice signal are voice signals of the same person, their frequency characteristics are the same. Therefore, it is nearly impossible to extract only the real voice signal of the user from the encrypted voice signal.

[9] One aspect of the invention provides a method for voice signal encryption/decryption, which comprises: reducing the amount of data by performing a time-scale modification (TSM) on a digital voice signal of the speaker; shuffling the time-scale-modified digital voice signal by separating it into a plurality of contiguous pre-encryption frames, separating each of the plurality of pre-encryption frames into a plurality of shuffling blocks, and shuffling the order of the plurality of shuffling blocks in time; and generating a plurality of post-encryption frames by adding dummy voice signals to each of the plurality of pre-encryption frames after shuffling, whereby the voice signal is converted to an unintelligible encrypted voice signal.
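The shuffling and dummy-insertion steps for a single pre-encryption frame might be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `shuffle_and_pad`, the fixed block length, and the use of a key-seeded `random.Random` generator are assumptions, and Python's `random` module is not cryptographically secure.

```python
import random
from typing import List, Sequence


def shuffle_and_pad(frame: Sequence[int], dummy: Sequence[int],
                    block_len: int, key: str) -> List[int]:
    """Split a frame into shuffling blocks, permute them, and interleave dummy blocks."""
    rng = random.Random(key)  # assumption: rules derived from a key-seeded PRNG
    blocks = [list(frame[i:i + block_len]) for i in range(0, len(frame), block_len)]
    rng.shuffle(blocks)  # mix the time order of the shuffling blocks
    dummies = [list(dummy[i:i + block_len]) for i in range(0, len(dummy), block_len)]
    total = len(blocks) + len(dummies)
    # Choose which output slots carry dummy blocks instead of real speech.
    dummy_slots = set(rng.sample(range(total), len(dummies)))
    real_iter, dummy_iter = iter(blocks), iter(dummies)
    out: List[int] = []
    for slot in range(total):
        out.extend(next(dummy_iter) if slot in dummy_slots else next(real_iter))
    return out


frame = list(range(12))   # stand-in for TSM-processed PCM samples
dummy = [-1] * 4          # stand-in for stored past-voice samples
post = shuffle_and_pad(frame, dummy, block_len=4, key="ABCDE1234")
print(len(post))  # 16
```

Because the generator is seeded with the key, a receiver holding the same key could regenerate the same permutation and dummy positions, which is the property the decryption side relies on.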

[10] Also, according to another embodiment of the present invention, provided is a method for encrypting/decrypting a real-time telephone conversation voice between a first user using a first encrypting/decrypting device built in or connected to a first communication device and a second user using a second encrypting/decrypting device built in or connected to a second communication device, the method comprising: the first and second users setting the first and second encrypting/decrypting devices respectively to an identical encryption key; in the first encrypting/decrypting device of the first user, transmitting an unintelligible encrypted voice signal to the second communication device of the second user by converting the voice of the first user to a digital voice signal, reducing the amount of the data with a predetermined time scale ratio by performing a time-scale-modification (TSM) on the digital voice signal according to a set of encryption rules determined using the encryption key, separating each of the plurality of pre-encryption frames into a plurality of shuffling blocks, shuffling the order of the plurality of shuffling blocks in time, converting the plurality of pre-encryption frames into a plurality of post-encryption frames by adding dummy voice signals to each of the plurality of pre-encryption frames after shuffling, and forming the unintelligible encrypted voice signal by adding synchronization signals for decryption periodically or aperiodically to the post-encryption frames; and, in the second encrypting/decrypting device of the second user, restoring the unintelligible encrypted voice signal to the original digital voice signal according to the decryption rules using the encryption key and applying the encryption rules reversely by detecting the synchronization signals in the encrypted voice received from the second communication device, separating the encrypted voice signal of the first user into the plurality of post-encryption frames using the detected synchronization signals as a base, extracting the shuffling blocks from each of the plurality of post-encryption frames and removing the dummy voice signal blocks, restoring the time order of the shuffling blocks and converting into the pre-encryption frames, and applying the predetermined time scale ratio reversely and performing the time-scale-modification (TSM) on the pre-encryption frames.
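The receiving-side steps of removing dummy blocks and restoring the block order can be sketched for one post-encryption frame. In this sketch the function name `restore_frame` and its parameters are hypothetical; the dummy positions and the permutation are passed in explicitly, whereas in the described method both sides would obtain them from the shared encryption key.

```python
from typing import List, Sequence


def restore_frame(post_frame: Sequence[int], block_len: int,
                  dummy_slots: Sequence[int], order: Sequence[int]) -> List[int]:
    """Drop dummy blocks, then undo the block permutation.

    order[j] is the original index of the j-th real block as transmitted.
    """
    blocks = [list(post_frame[i:i + block_len])
              for i in range(0, len(post_frame), block_len)]
    skip = set(dummy_slots)
    real = [b for slot, b in enumerate(blocks) if slot not in skip]
    restored: List[List[int]] = [[] for _ in real]
    for sent_pos, original_pos in enumerate(order):
        restored[original_pos] = real[sent_pos]
    return [sample for block in restored for sample in block]


# Original blocks A=[1,2], B=[3,4], C=[5,6] were sent as C, dummy, A, B.
post = [5, 6, 0, 0, 1, 2, 3, 4]
print(restore_frame(post, block_len=2, dummy_slots=[1], order=[2, 0, 1]))
# [1, 2, 3, 4, 5, 6]
```

The restored frame would then be passed to the inverse time-scale modification, which is not shown here.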

[11] The method may further comprise performing an analog/digital signal conversion and a digital/analog signal conversion for communicating the voice signal in analog between the first encryption/decryption device and the first communication device and between the second encryption/decryption device and the second communication device.

[12] During encryption, the encryption rules may randomly change at least some of the following: the number of samples forming the pre-encryption frame, the number of samples forming the shuffling blocks, the rearrangement order of the shuffling blocks, the position at which the dummy voice signal is added and the number of samples in each block of the added dummy voice signal, the position of the synchronization signal, and the time-scale ratio for the time-scale modification.
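One way such per-frame rule changes could be kept identical on both sides is to derive them deterministically from the shared key and a frame counter. The sketch below assumes exactly that; the `FrameRules` fields, the candidate values, and the SHA-256 seeding are illustrative choices, not rules taken from the specification, and `random` is used only for clarity.

```python
import hashlib
import random
from dataclasses import dataclass
from typing import List


@dataclass
class FrameRules:
    frame_len: int          # samples in the pre-encryption frame
    block_len: int          # samples per shuffling block
    tsm_ratio: float        # time-scale ratio for this frame
    block_order: List[int]  # rearrangement order of the shuffling blocks


def rules_for_frame(key: str, frame_index: int) -> FrameRules:
    """Derive one frame's rules from the key and frame counter (hypothetical scheme)."""
    seed = hashlib.sha256(f"{key}:{frame_index}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(seed, "big"))
    block_len = rng.choice([80, 160, 240])
    n_blocks = rng.randint(4, 8)
    order = list(range(n_blocks))
    rng.shuffle(order)
    return FrameRules(frame_len=block_len * n_blocks, block_len=block_len,
                      tsm_ratio=rng.choice([0.6, 0.7, 0.8]), block_order=order)


rules = rules_for_frame("ABCDE1234", frame_index=0)
print(rules)
```

Two devices calling `rules_for_frame` with the same key and frame index obtain the same rules, while the rules differ from frame to frame.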

[13] An algorithm for processing the time-scale modification may comprise an overlap-and-add (OLA) type algorithm, the OLA algorithm comprising: an analysis step of cutting the PCM data of the digital voice signal into a plurality of contiguous analysis windows, each of a predetermined size, with neighboring analysis windows overlapping by a predetermined distance; and a combining step of expanding the overlapping segments of the neighboring analysis windows according to a given time-scale ratio (a), combining the data in the overlapping segment with weighting, and combining the remaining data without weighting.
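For orientation, a generic windowed overlap-and-add time-scale modification can be sketched as below. This is a textbook-style OLA with a Hann window and 50% synthesis overlap, not necessarily the exact variant described above; the function name `ola_time_scale`, the window size, and the convention that `ratio` means output length divided by input length are assumptions.

```python
import math
from typing import List, Sequence


def ola_time_scale(x: Sequence[float], ratio: float, win: int = 256) -> List[float]:
    """Plain overlap-add TSM; the output is roughly len(x) * ratio samples long."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    syn_hop = win // 2                         # synthesis hop: 50% overlap
    ana_hop = max(1, round(syn_hop / ratio))   # analysis hop; ratio < 1 compresses
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win) for n in range(win)]
    n_frames = max(1, (len(x) - win) // ana_hop + 1)
    out_len = (n_frames - 1) * syn_hop + win
    out = [0.0] * out_len
    norm = [0.0] * out_len
    for m in range(n_frames):
        a, s = m * ana_hop, m * syn_hop
        for n in range(win):
            if a + n < len(x):
                out[s + n] += window[n] * x[a + n]  # weighted sum in overlaps
                norm[s + n] += window[n]
    # Normalize by the summed window weights so overlaps do not change the gain.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]


x = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(8000)]
y = ola_time_scale(x, ratio=0.8)
print(len(x), len(y))  # 8000 6400
```

Plain OLA does not align waveforms between windows, so it can introduce audible artifacts on voiced speech; it is shown here only to make the analysis and combining steps concrete.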

[14] To increase the security, the dummy voice signal may be obtained from the user's digital voice signal stored in a storing device after being taken prior to the present encryption or from the user's voice signal taken in real-time during the present encryption.

[15] On the other hand, another aspect of the invention provides a voice signal encryption device, which comprises: a microphone configured to convert the voice into an analog voice signal; an encryption key provider configured to allow the user to set the encryption key by inputting it; an analog/digital converter (ADC) configured to convert the analog voice signal from the microphone into a digital voice signal by sampling the analog voice signal; a storing device configured to provide a space for processing data and to store the voice signal, the encryption key, and an encryption program; and a central processing unit (CPU) configured to perform controlling and calculating for encrypting the digital voice signal by executing the encryption program. The encryption program may comprise an encryption function, which comprises: reducing the amount of the data with a predetermined time scale ratio by performing a time-scale-modification (TSM) on the digital voice signal according to a set of encryption rules using the encryption key, separating the time-scale-modification processed digital voice signal into a plurality of contiguous pre-encryption frames, separating each of the plurality of pre-encryption frames into a plurality of shuffling blocks and shuffling the order of the plurality of shuffling blocks in time, converting the plurality of pre-encryption frames into a plurality of post-encryption frames by adding dummy voice signal blocks selected from the storing device to at least part of the plurality of pre-encryption frames, and forming the unintelligible encrypted voice signal by adding synchronization signals for decryption periodically or aperiodically to the post-encryption frames.

[16] Still another aspect of the present invention provides a voice signal encryption/decryption device to enable a real-time secured conversation between users using the same encryption key with their communication devices. The voice signal encryption/decryption device comprises: an interface configured to exchange a voice signal with the communication device; a storing device configured to provide a space for processing data and to store the voice signal, the encryption key, and an encryption/decryption program; a central processing unit (CPU) configured to perform controlling and calculating for encrypting the digital voice signal and decrypting the encrypted voice signal by executing the encryption/decryption program; an input device configured to set the encryption key used for the encryption and the decryption; a microphone configured to convert the user's voice into an analog voice signal; a speaker configured to convert the input analog voice signal into a voice; an analog/digital converter (ADC) configured to convert the analog voice signal from the microphone into a digital voice signal by sampling the analog voice signal and to provide the digital voice signal to the CPU; and a digital/analog converter (DAC) configured to convert the decrypted voice signal from the CPU into an analog voice signal and to provide the analog voice signal to the speaker. Also, the encryption/decryption program may comprise: an encryption function comprising reducing the amount of the data with a predetermined time scale ratio by performing a time-scale-modification (TSM) on the digital voice signal of the user provided by the ADC according to a set of encryption rules using the encryption key, separating the time-scale-modification processed digital voice signal into a plurality of contiguous pre-encryption frames, separating each of the plurality of pre-encryption frames into a plurality of shuffling blocks and shuffling the order of the plurality of shuffling blocks in time, converting the plurality of pre-encryption frames into a plurality of post-encryption frames by adding dummy voice signal blocks to at least part of the plurality of pre-encryption frames, and forming the unintelligible encrypted voice signal by adding synchronization signals for decryption periodically or aperiodically to the post-encryption frames and transmitting the user's encrypted voice signals to a communication device of the other user through the interface; and a decryption function comprising detecting the synchronization signals in the encrypted voice received from the other side, separating the encrypted voice signal into the plurality of post-encryption frames using the detected synchronization signals as a base, extracting the shuffling blocks from each of the plurality of post-encryption frames and removing the dummy voice signal blocks, restoring the time order of the shuffling blocks and converting into the pre-encryption frames, applying the predetermined time scale ratio reversely and performing the time-scale-modification (TSM) on the pre-encryption frames, restoring the unintelligible encrypted voice signal to the original digital voice signal according to the decryption rules using the encryption key and applying the encryption rules reversely, and converting the original digital voice signal into an analog voice signal in the DAC and providing the analog voice signal to the speaker.

[17] The encryption/decryption program in the voice signal encryption/decryption device is configured to: in the encryption process, convert the encrypted voice signal of the user into an analog voice signal through the DAC and provide the analog voice signal to the interface; and, in the decryption process, receive through the interface the encrypted analog voice signal of the other side provided by the communication device of the other user and convert the analog voice signal to a digital voice signal with the ADC, so that the voice signal is exchanged in analog form between the encryption/decryption programs and the communication devices of the users on both sides. Also, the encryption/decryption device provides a function to set different encryption keys for different opposite parties so as to enable a secured phone conversation with each of a plurality of users.

[18] The interfacing between the encryption/decryption device and the communication device provided by the interface may be either a wired type, connecting the earphone jacks of the encryption/decryption device and the communication device with a cable, or a wireless type, connecting wireless communication modules of the encryption/decryption device and the communication device.

Advantageous Effects

[19] The voice signal encryption/decryption technology according to the present invention provides improved security by removing almost all of the residual intelligibility through the use of time-scale modification of the audio signal in the encryption algorithm. Also, since the encryption rules applied in the encryption process are not static but change dynamically according to various factors, it is very hard to figure out the encryption rules; and even if the rules are figured out, they become ineffective due to other factors such as the passage of time, so strong encryption/decryption is provided.

[20] Also, the encryption/decryption device according to the invention has a very wide range of application since it can be provided as not only a built-in type for conventional communication devices, but also an external type that does not require any change in the conventional communication devices.

[21] The encryption/decryption device according to the invention makes up for the amount of data reduced from the real-time analog voice signal by adding the past voice signal of the user as dummy data, such that it can exchange the voice information in the form of an analog signal with external devices without delay or interruption of the voice information. Therefore, it is nearly impossible to break the encryption, since doing so would require applying various decryption rules while listening to the analog signal. Furthermore, since twofold or threefold encryption is applied, with a synthesis technique that makes it hard to distinguish the real-time voice signal from the dummy data and a shuffling technique on the real-time voice signal, decryption becomes increasingly difficult and takes a very long time. Therefore, embodiments of the invention provide a real-time secured phone conversation.

Brief Description of the Drawings

[22] Fig. 1 is a function block diagram illustrating an encryption device for converting voice signal to encrypted data through encryption process;

[23] Fig. 2 is a function block diagram illustrating a decryption device for restoring original voice signal by decrypting the encrypted data;

[24] Fig. 3 is a perspective view illustrating an ear-microphone type or hand-held type encryption/decryption device using the encryption device and the decryption device shown in Fig. 1 and Fig. 2, which is connected with a cell phone;

[25] Fig. 4 is a block diagram illustrating a hardware structure of an encryption/decryption device;

[26] Fig. 5 is a flowchart illustrating an encryption algorithm for encrypting the voice of a user;

[27] Fig. 6 is a flowchart illustrating a decryption algorithm for decrypting the encrypted voice of another user in the other side;

[28] Fig. 7 is a diagram for explaining an encryption of a voice signal according to the invention, in which (A) shows a dummy data block that is separated and stored according to energy levels, (B) and (C) show a concept of the dynamic TSM process, and (D) shows concepts of shuffling the shuffling blocks of a pre-encryption frame and adding dummy data blocks among the shuffling blocks; and

[29] Fig. 8 is a diagram illustrating adding dummy data by OLA using pre-encryption frame EF 2.

Best Mode for Carrying Out the Invention

[30] Inventive embodiments of the present invention are described in detail referring to the figures below.

[31] (1) Structure of encryption/decryption device

[32] Fig. 1 is a function block diagram illustrating an encryption device for converting voice signal to encrypted data through encryption process.

[33] An encryption device comprises an encryption engine (100) for converting pulse code modulation (PCM) data into encrypted digital data. For example, the PCM data inputted to the encryption engine (100) may be provided through an analog-digital converter (ADC) (150) for receiving analog voice signal and converting it into digital signal by sampling. Furthermore, the analog voice signal may be provided by a microphone (426). The encrypted digital voice data produced by the encryption engine (100) may be converted into an encrypted analog voice signal through the DAC (170) if necessary.

[34] The encryption engine (100) comprises a buffer (110), a time-scale modification (TSM) processor (120), a dummy stack (130), and an encryptor (140). The buffer (110) temporarily stores the input PCM data. The TSM processor (120) reads the PCM data from the buffer (110) into a processing space and modifies the time scale by a predetermined ratio such that the amount of data is reduced. The dummy stack (130) is a storage space for dummy data, which is used to increase the degree of encryption by disguising the user's real-time voice data. The dummy stack (130) comprises a nonvolatile memory such that the stored data is maintained even when no power is supplied. The encryption engine (100) is provided with an encryption key for encrypting the PCM data. According to the rules determined by the encryption key, the encryption engine (100) performs a series of processes (described in detail below), such as reducing the amount of data through the TSM process on the PCM data, which is the voice signal to be encrypted, shuffling to mix the time order of the reduced voice signal, and adding the dummy voice signal and the synchronization signal.

[35] Fig. 2 is a function block diagram illustrating a decryption device for restoring the original voice signal by decrypting the encrypted data. The decryption device comprises a decryption engine (200), which restores unencrypted PCM data by decrypting the encrypted digital voice data. If the encrypted voice signal is given as an analog voice signal, an ADC (240) needs to be disposed in front of the decryption engine (200). In such a case, the analog voice signal is converted to digital voice data while passing through the ADC (240) and then provided to the decryption engine (200). The final data from the decryption engine (200) is PCM data, and in order to convert this to voice through a speaker (410), the decryption engine (200) comprises a digital-analog converter (DAC) (250) at the output, through which the PCM data is converted to the analog voice signal.

[36] The decryption engine (200) comprises a decryptor (210), a TSM processor (220), and a buffer (230). The decryptor (210) extracts, from the input encrypted digital voice data, only the real-time voice data of the user, to which both the TSM process and shuffling have been applied, excluding the dummy data; performs deshuffling on the real-time voice data; and thereby restores the TSM data produced by the TSM processor (120) of the encryption engine (100). The TSM processor (220) restores the voice data (PCM data) as it was before the TSM process from the TSM data originally produced by the TSM processor (120). The buffer (230) buffers the PCM data and provides it serially to the DAC (250). The decryption engine (200), like the encryption engine (100), is provided with an encryption key from outside for decrypting the encrypted digital voice data, and performs the decryption according to the rules determined by the encryption key. The encryption engine (100) and the decryption engine (200) use the same encryption key.

[37] The encryption/decryption device of the present invention can be embodied, for example, as an ear-microphone type or hand-held type encryption/decryption device, which is connected to a cell phone and used conveniently when necessary. Fig. 3 is a perspective view illustrating an ear-microphone type (400a, 400b) or hand-held type encryption/decryption device (500a, 500b) using the encryption device and the decryption device shown in Fig. 1 and Fig. 2, which is connected with a cell phone (300a, 300b). The ear-microphone type encryption/decryption device (400a, 400b) includes cables connected to both sides of the body (420) of the ear-microphone type encryption/decryption device; an earphone speaker (410) is connected to one of the cables, and an earphone jack (430) is connected to the other cable. Inside or outside a case (428) of the ear-microphone type encryption/decryption device body (420), a selection button (422), a display (424), an encryption conversation button (425), a microphone (426), and a volume control dial (429) are installed. In the case of the ear-microphone type encryption/decryption device (400a, 400b), the signal that it exchanges with the cell phone (300a, 300b) is an analog voice signal. However, since the output of the encryption engine (100) is encrypted 'digital voice data', the DAC (170) is needed to convert it to an analog signal. Also, since the input to the decryption engine (200) is the encrypted 'digital voice data', the ADC (240) is needed to convert the encrypted 'analog signal' provided by the cell phone to a digital signal.

[38] Fig. 4 is a block diagram illustrating a hardware structure of an encryption/decryption device (400a, 400b). Inside the ear-microphone type encryption/decryption device case (428), there are installed two ADCs (150, 240), two DACs (170, 250), a RAM (446), a flash memory (448), a key input device (422-1) for converting an operation of the selection button (422) to an electrical signal, a display driver (424-1) for driving the display (424), and a central processing unit (CPU) (450) installed on a PCB. The first ADC (150) is connected between the microphone (426) and the CPU (450), and the second ADC (240) is connected between the earphone jack (430) and the CPU (450). The first DAC (170) is connected between the CPU (450) and the earphone jack (430), and the second DAC (250) is connected between the CPU (450) and the speaker (410). The CPU (450) is connected with the two ADCs (150, 240) and the two DACs (170, 250) and further with the key input device (422-1), the display driver (424-1), the RAM (446), the flash memory (448), and others. In this embodiment, the path through which the voice signal of the user from the microphone (426) is encrypted in the CPU (450) and output to the outside through the earphone jack (430) is completely separate from the path through which the encrypted voice signal is transferred to the CPU (450) via the earphone jack (430) to be decrypted and converted to a voice by the speaker (410); it is therefore possible to provide full duplex communication.

[39] In order to enable the ear-microphone type encryption/decryption device (400a) to be used as a regular earphone, the encryption conversation button (425) is provided in the ear-microphone type encryption/decryption device (400a, 400b). The encryption conversation button (425) need not be provided separately; instead, a function for activating/deactivating the encryption/decryption device may be added to the selection button (422), using the selection button (422) and the key input device (422-1). The flash memory (448) stores the encryption/decryption program described below, dummy data, and the encryption key used to determine the encryption/decryption process rules described below. The flash memory (448) is provided as a nonvolatile memory, and other types of nonvolatile memory can be used as the storing device. The RAM (446) provides an operating space for the TSM processors (120, 220), the encryptor (140), and the decryptor (210), and further provides a space for the buffers (110, 230). The encryption engine (100) and the decryption engine (200) in Fig. 1 and Fig. 2 are provided by the CPU (450), the flash memory (448), and the RAM (446), which are combined with the encryption/decryption program.

[40] The selection button (422) provides functions for moving an active cell on a screen of the display (424), cyclically changing the letters and numbers displayed on a top line, and setting an encryption key by selecting the letters and numbers displayed in the active cell. The display (424) may include a screen comprising a top line to display candidates for the encryption key and a bottom line to display an encryption key selected from the candidates. By operating the selection button (422), for example by cycling the letters from A to Z in the five cells on the left side and the numbers from 0 to 9 in the four cells on the right side, and finally selecting each letter or number when the desired one is displayed, the user causes the selected letter or number of each cell to be displayed in the corresponding cell right below, on the bottom line. The combination of letters and numbers displayed in the bottom line is stored in the flash memory (448) and becomes the encryption key, which is provided to the encryption engine (100) and the decryption engine (200).
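For the example layout of five letter cells followed by four digit cells, the number of distinct keys follows directly from the cell counts. The snippet below is an illustrative calculation under that assumed layout; the helper names `key_space` and `is_valid_key` are hypothetical.

```python
import string

LETTER_CELLS = 5  # cells cycling through A to Z (example layout)
DIGIT_CELLS = 4   # cells cycling through 0 to 9 (example layout)


def key_space() -> int:
    """Number of distinct keys for the example cell layout."""
    return (len(string.ascii_uppercase) ** LETTER_CELLS
            * len(string.digits) ** DIGIT_CELLS)


def is_valid_key(key: str) -> bool:
    """Check that a key matches the five-letter, four-digit example format."""
    letters, digits = key[:LETTER_CELLS], key[LETTER_CELLS:]
    return (len(key) == LETTER_CELLS + DIGIT_CELLS
            and all(c in string.ascii_uppercase for c in letters)
            and all(c in string.digits for c in digits))


print(key_space())               # 118813760000
print(is_valid_key("ABCDE1234"))  # True
```

That is, 26^5 x 10^4 combinations for this particular layout; a different number of cells would change the count accordingly.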

[41] The encryption key is used to determine the rules used by the encryption engine (100) to convert the PCM voice data to the encrypted digital data, and is also used to determine the rules used by the decryption engine (200) to restore the original PCM data from the encrypted digital voice data. The users taking part in the encrypted conversation have to use the same encryption key. In order to communicate securely and separately with a plurality of users using one ear-microphone type encryption/decryption device, a separate encryption key needs to be used for each of the plurality of other users. For this purpose, the key input means comprising the selection button (422) and the key input device (422-1) includes a function for entering information (for example, telephone number, name, etc.) specific to each of the plurality of other users and setting a corresponding separate encryption key. The user stores the selected data in the flash memory (448) using this function. When a user wants to have an encrypted conversation with someone, the user can activate the corresponding encryption key by displaying a registered encryption key list on the display (424) and selecting the encryption key.
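The per-party key handling could be modeled as a simple registry that maps identifying information to a key and tracks which key is active. This is only a sketch of the bookkeeping; the class name `KeyRegistry`, its methods, and the sample entries are hypothetical, and an in-memory dictionary stands in for the flash memory.

```python
from typing import Dict, List, Optional


class KeyRegistry:
    """Maps a contact identifier (e.g. name or phone number) to an encryption key."""

    def __init__(self) -> None:
        self._keys: Dict[str, str] = {}
        self.active_key: Optional[str] = None

    def register(self, contact: str, key: str) -> None:
        """Store a separate key for one conversation partner."""
        self._keys[contact] = key

    def contacts(self) -> List[str]:
        """List registered partners, as would be shown on the display."""
        return sorted(self._keys)

    def activate(self, contact: str) -> str:
        """Select the key to be used by both the encryption and decryption engines."""
        self.active_key = self._keys[contact]
        return self.active_key


registry = KeyRegistry()
registry.register("Alice", "ABCDE1234")
registry.register("Bob", "QWERT9876")
print(registry.contacts())        # ['Alice', 'Bob']
print(registry.activate("Bob"))   # QWERT9876
```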

[42] The encryption/decryption device according to the invention can be designed as a hand-held type encryption/decryption device (500a, 500b) having the look of a cell phone, as well as an ear-microphone type encryption/decryption device (400a, 400b). The hand-held type encryption/decryption device (500a, 500b) comprises a microphone (510), a speaker (515), a keypad (520), and a display screen (525), which are housed in a cell phone type case. The encryption key may thus be set more conveniently. The user connects the earphone jack (530) to the cell phone (300a, 300b) and can then use the hand-held type encryption/decryption device (500a, 500b), carrying it instead of the cell phone (300a, 300b).

[43] The interface for exchanging the voice signal between the communication device (for example, a cell phone) and the device according to the invention can be of a wired type, connecting the earphone jacks of the two devices with cables as described above, or of a wireless type, connecting the two devices wirelessly using short-range wireless modules (for example, Bluetooth modules) built into the two devices.

[44] On the other hand, the encryption/decryption device according to the invention can be provided as a built-in device in a communication device such as a cell phone, wired or wireless telephone, and a two-way radio, or as an external device connected to the communication device by wire or wireless. For example, a cell phone with a function of an encryption/decryption device according to the invention may be provided. Conventional cell phone includes hardware needed to make an encryption/decryption device according to the invention, that is, all the elements shown in Fig. 4. Therefore, if one utilizes the given hardware properly, it is possible to make a cell phone having functions of the encryption/decryption device simply by installing software (refer to the encryption program and decryption program described below) which imparts the functions of encryption/decryption device. If a function to command start and finish of the encrypted conversation is given to a specific button of the cell phone, one can have an encrypted telephone conversation only when one wants to do so.

[45] In this case, the following differences can be considered further, comparing to the encryption/decryption device (400a, 400b or 500a, 500b). In the encryption/decryption device (400a, 400b or 500a, 500b), since the voice signal for the earphone jack of the cell phone (300a, 300b) is analog signal, the encrypted digital voice data from the encryption engine (100) has to go through a post-process converting the encrypted digital voice data to analog signal using the DAC (170) (refer to Fig. 1). However, in case of a cell phone having a built-in encryption/decryption device, the above process is not necessary, but it uses the encrypted digital voice data produced by the encryption engine (100) of the cell phone having a built-in encryption/decryption device as it is. Also, in case of decrypting the encryption of the received encrypted analog voice signal, since the encryption/decryption device (400a, 400b or 500a, 500b) receives the encrypted voice signal in analog, the analog signal goes through a pre-process converting it to the encrypted digital voice data using the ADC (240) before decrypting (refer to Fig. 2). However, in case of a cell phone having a built-in encryption/ decryption device, the above pre-process is not necessary, but the decryption engine (200) of the cell phone having a built-in encryption/decryption device can use the encrypted digital voice data received wirelessly. It would be obvious to a person skilled in the art to provide other communication devices having a built-in encryption/ decryption device by applying the method of providing a cell phone having a built-in encryption/decryption device.

[46] (2) Encryption/decryption Method

[47] Next, referring to Figs. 5 to 7, encrypting a voice signal according to the invention and decrypting the encrypted voice data are explained below. As an example, the exchange of a real-time encrypted conversation is described with reference to Fig. 4, in which the encryption/decryption devices (400a or 500a) according to the invention are connected to the cell phones (300a, 300b) of the two users talking to each other. Fig. 5 is a flowchart illustrating an encryption algorithm for encrypting the voice of a user, and Fig. 6 is a flowchart illustrating a decryption algorithm for decrypting the encrypted voice of another user on the other side.

[48] 1) Encryption algorithm

[49] In Fig. 5, in order to encrypt voice, the user on a transmitting side sets an encryption key to be used to determine the encryption rules in the encryption/decryption device (400a) (step S10). The encryption key is set as described above. For example, the encryption key is "CHOIS3625". Once the encryption key is set, a random number corresponding to each digit of the encryption key is generated, for example, as follows. Factors such as the number of digits of the encryption key used in the invention, the type and range of data used as the encryption key, and the random number corresponding to each of the digits of the encryption key are determined arbitrarily and reflected in the encryption/decryption program.

[50] Table 1

[51] This encryption key is stored in the flash memory (448) of the encryption/ decryption device (400a), and can be used frequently thereafter. In order to talk to a plurality of other users with one encryption/decryption device, one must apply different encryption key to each of the other users. In this case, setting encryption keys for the other users in advance and associating specific information (for example, serial number, name, or phone number) with the encryption keys to distinguish one from the others can be used conveniently. Further, in setting the encryption keys, it may be needed to provide a function to set and impart the specific additional information of the other users to the encryption/decryption device, which would not be described in detail because it is obvious to a skilled person of the art.

[52] When the user on the transmitting side speaks using the encryption/decryption device (400a) with the encryption/decryption function activated, the voice is converted to an analog voice signal through the microphone (426), and the analog voice signal is sampled in the ADC (440) and converted to the PCM data (step S15). Any sampling rate is sufficient if it can transfer the voice information precisely; for example, it may be from about 16 kHz to about 48 kHz. The converted PCM data is delivered to the buffer (110) of the encryption engine (100).

[53] A part of the PCM data buffered in the buffer (110) is extracted as dummy data and stored in the dummy stack (130) (step S20). The dummy data is data to be mixed with the real-time voice in an encrypted conversation and to work as noise that interferes with comprehension of the real-time voice. Therefore, the dummy data needs to be camouflaged so that it cannot be distinguished from the real-time voice data. For this, it is desirable to use the voice signal of the present user which is being encrypted, or a past voice signal of the user, as the dummy data (that is, a dummy voice signal). In other words, a part of the past voice signal of the same person whose voice is being encrypted can be taken as a dummy voice signal and stored in the memory, and then retrieved and used whenever needed. Alternatively, part of the real-time voice signal that is being encrypted can be taken in real time and then used as dummy data. To obtain a dummy voice signal, a predetermined amount of the voice data in the buffer (110) can be taken and either stored in the dummy stack (130) or used in a real-time encryption.

[54] Also, it is desirable to avoid a small or very weak voice with low sound energy. For example, data for 100 to 150 msec (called a 'dummy data block') is taken from the buffer (110) at a time, and loaded into the dummy stack (130) only if the sum of the absolute values of its samples (Σ|y|), taken as the energy of the dummy data block, is larger than a predetermined value. For a first encrypted phone call, it may be necessary to save enough dummy data beforehand, and thereafter more dummy data can be collected continually during encrypted phone calls or regular phone calls. The dummy stack (130) comprises the flash memory (448) and can keep using the dummy data once it has been collected and stored. The dummy stack can be provided in the form of a circular buffer.
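The collection rule above can be illustrated with a minimal Python sketch. The function names, the block length, the threshold, and the stack capacity are all hypothetical values chosen for illustration; a bounded `deque` stands in for the circular buffer.

```python
from collections import deque

def block_energy(samples):
    """Sum of absolute sample values, used here as the block's energy measure."""
    return sum(abs(s) for s in samples)

def collect_dummy_blocks(pcm, block_len, threshold, stack):
    """Cut pcm into consecutive blocks and keep only those above the threshold."""
    for start in range(0, len(pcm) - block_len + 1, block_len):
        block = pcm[start:start + block_len]
        if block_energy(block) > threshold:
            # A deque created with maxlen drops its oldest entry when full,
            # which gives circular-buffer behaviour.
            stack.append(block)
    return stack

# Hypothetical toy data: two quiet blocks and two loud blocks of 4 samples each.
quiet = [1, -1, 0, 1]
loud = [900, -800, 700, -600]
dummy_stack = deque(maxlen=4)
collect_dummy_blocks(quiet + loud + quiet + loud, block_len=4,
                     threshold=100, stack=dummy_stack)
```

Only the two loud blocks end up in `dummy_stack`; the quiet blocks are skipped because their summed magnitude stays below the threshold.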

[55] As a way to strengthen the camouflage of the real-time voice by the dummy data, when the dummy data block is added to the real-time voice data, a data block, having an energy level which is closest to the energy level of the real-time voice data block (called 'addend data block') which is located next to the dummy data block, is used. For this, when the dummy data is stored in the dummy stack (130), the dummy data block is needed to be categorized according to the energy levels. Fig. 7(A) shows a case in which the dummy data blocks are categorized by the energy levels, which is categorized to three levels; high, intermediate, and low. In Fig. 7(A), DMs, DMm, and DMw stand for the sets of dummy data blocks having the high, intermediate, or low energy level, respectively. It is desirable to store hundreds of thousands of dummy data blocks per the dummy data block set and retrieve/use them whenever needed.
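As a rough sketch of this categorization and selection, the following Python fragment sorts blocks into three sets and picks the dummy block whose energy is closest to that of a neighboring voice block. The two threshold values and the labels "s", "m", "w" (mirroring DMs, DMm, DMw) are assumptions made for illustration only.

```python
def energy(block):
    """Sum of absolute sample values of a block."""
    return sum(abs(s) for s in block)

def classify(block, low_limit, high_limit):
    """Return 's' (high), 'm' (intermediate) or 'w' (low) energy level."""
    e = energy(block)
    if e >= high_limit:
        return "s"
    if e >= low_limit:
        return "m"
    return "w"

def closest_dummy(dummy_blocks, voice_block):
    """Pick the dummy block whose energy is closest to the voice block's energy."""
    target = energy(voice_block)
    return min(dummy_blocks, key=lambda d: abs(energy(d) - target))

# Hypothetical thresholds and toy blocks.
stack = {"s": [], "m": [], "w": []}
for blk in ([900, -900], [300, -200], [20, -10], [450, -350]):
    stack[classify(blk, low_limit=100, high_limit=1000)].append(blk)

voice = [400, -300]
chosen = closest_dummy(stack[classify(voice, 100, 1000)], voice)
```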

[56] A second way to strengthen the camouflage of the real-time voice by the dummy data is to set a limit on the number of times each dummy data block is used. That is, the number of times a dummy data block stored in the dummy stack (130) is used in encryption is counted, and if the number of uses of a dummy data block exceeds a predetermined value, the dummy data block is replaced with a new dummy data block, that is, a reserved dummy data block prepared beforehand. Dummy data blocks that are used too many times may weaken the camouflage. In such a case, as shown in Fig. 7(A), the hundreds of thousands of dummy data blocks reserved according to the energy levels replace the dummy data blocks whose number of uses has exceeded the predetermined value. In Fig. 7(A), ADMs, ADMm, and ADMw stand for the sets of reserved dummy data blocks having the high, intermediate, or low energy level, respectively.
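The use-count rule can be sketched as a small Python class. The class name, its fields, and the exact boundary (a block is swapped out once it has been used `max_uses` times) are assumptions for illustration; the text only states that a block is replaced when its use count exceeds a predetermined value.

```python
class DummyPool:
    """Active dummy blocks with per-block use counters and a reserve list."""

    def __init__(self, active, reserve, max_uses):
        self.active = list(active)
        self.reserve = list(reserve)
        self.max_uses = max_uses
        self.uses = [0] * len(self.active)

    def take(self, index):
        """Return the block at index and replace it once it is worn out."""
        block = self.active[index]
        self.uses[index] += 1
        if self.uses[index] >= self.max_uses and self.reserve:
            # Swap in a reserved block and restart its counter.
            self.active[index] = self.reserve.pop(0)
            self.uses[index] = 0
        return block

pool = DummyPool(active=[[1, 2]], reserve=[[7, 8]], max_uses=2)
first, second, third = pool.take(0), pool.take(0), pool.take(0)
```

Here the first two calls return the original block and the third returns the reserved block that replaced it.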

[57] Applying encryption to a real-time phone conversation is allowed to be performed after an enough number of dummy data blocks are reserved. A first step to encrypt voice is that the TSM processor (120) modifies the time scale of the PCM data stored in the buffer (110) so as to reduce the amount of data (step S30). The TSM process is for obtaining a time space for adding the dummy data block. For example, if an original real-time voice data (PCM data retrieved from the buffer (110)) is processed with the TSM process applying a time scale ratio of 1/2, the amount of data is reduced to a half, and dummy data blocks are added in the saved room of time. In TSM process, the time scale ratio may be set in a range from about 1/4.0 to about 1/1.2. If the time scale ratio is smaller than 1/4.0 then too much of the original sound information may be lost, and if the time ratio is larger than 1/1.2 then the amount of dummy data to be added may be little. Additionally, in order to make breaking of the encryption harder, the time scale ratio may not be fixed, but varying by performing, for example, a modular calculation using the selected encryption key (described later) or applying the selected encryption key to a random number table.

[58] The invention is not limited to one TSM method for this TSM process. Any TSM technology can be used if it can reduce the amount of the PCM data to a desired ratio. In order to make it difficult to distinguish the real-time voice data from the dummy data, it is desirable to adopt an algorithm that keeps the pitch information of the original sound while reducing the amount of data. The conventional TSM technology includes an overlap-add (OLA) method for modifying the scale in time; a synchronized overlap and add (SOLA) method and a waveform similarity based overlap and add (WSOLA) method, which are OLA-series TSM techniques improved from the OLA method; and a technique modifying the time scale in frequency space using FFT technology. These methods modify the time scale without losing much of the pitch information of the original sound. There is another method which reduces the amount of data using interpolation. One of these can be adopted considering the performance and quality of sound required for the encryption/decryption device. A variable speed audio signal reproducing method using time-scale modification has been disclosed in the inventor's prior PCT application publications WO 2004/015688 (Title: Audio Signal Time-Scale Modification Method Using Variable Length Synthesis and Reduced Cross-Correlation Computations) and WO 2005/045830 (Title: Time-Scale Modification Method for Digital Audio Signal and Digital Audio/Video Signal, and Variable Speed Reproducing Method of Digital Television Signal by Using the Same Method). A detailed disclosure of the TSM processor (120) for the present invention can be found in these patent application publications. One skilled in the art will be able to realize the TSM processor (120) of the present invention by referring to the disclosure of the above application publications and the explanation below, and therefore the TSM technique of the OLA series is described only briefly.

[59] A basic concept of the OLA series TSM techniques is cutting the input PCM data to obtain consecutive analysis windows, each having a predetermined size, such that neighboring analysis windows overlap by a predetermined length (analyzing step). Then, according to the given value of the time-scale ratio a (this becomes a ratio of variable speed reproducing mode to speed of normal reproducing mode, and called 'variable speed ratio'), the overlapping lengths between neighboring analysis windows are readjusted and summed for a plurality of analysis windows obtained in the analysis step. That is, according to the time-scale ratio a, the overlapping length between the neighboring analysis windows can be lengthened or shortened. The overlapping segment is combined by applying weight function to the neighboring windows (synthesis step). Non-overlapped segments are added as they are. If the amount of the PCM data is increased then the reproducing speed of the audio data gets slower by the increased amount, and if the amount is decreased then the reproducing speed of the audio data gets faster by the decreased amount. When a TSM technique of OLA series is used, in order to decrease the amount of the PCM data, the overlapping length between the neighboring analysis windows in the synthesis step should be longer than the original overlapping length between the neighboring analysis windows in the analysis step. The time- scale ratio a is defined theoretically by the ratio between the synthesis interval Ss and the analysis interval Sa (a=Ss/Sa), where the synthesis interval Ss is an interval between two starting points of the neighboring analysis windows when the plurality of analysis windows are rearranged in the synthesis step, and the analysis interval Sa is an interval between two starting points of the neighboring analysis windows when the original PCM data is separated into the plurality of analysis windows in the analysis step. 
In the OLA series TSM process, the synthesis interval Ss is a value given fixed, the value of the time-scale ratio a is a value given variable, and the analysis interval Sa is a value calculated by a function of the two values. Therefore, the amount of the PCM data can be reduced by a desired ratio by changing the value of time-scale ratio a as properly as desired.
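The relationship a = Ss/Sa can be illustrated with a minimal plain-OLA sketch in Python. This is not the method of the cited publications: it omits the waveform-similarity search of SOLA/WSOLA, uses a linear weight function, and the function name, window size, and synthesis interval are hypothetical. The synthesis interval Ss is fixed and the analysis interval Sa is derived from the requested ratio.

```python
def ola_time_scale(pcm, ratio, window=400, ss=200):
    """Plain overlap-add: the output is roughly ratio * len(pcm) samples long."""
    sa = round(ss / ratio)        # analysis interval, from a = Ss / Sa
    overlap = window - ss         # overlap between neighbouring windows in synthesis
    out = list(pcm[:window])      # the first analysis window is copied as it is
    pos = sa
    while pos + window <= len(pcm):
        frame = pcm[pos:pos + window]
        tail = len(out) - overlap
        for k in range(overlap):
            w = k / overlap       # linear weight: old data fades out, new fades in
            out[tail + k] = (1 - w) * out[tail + k] + w * frame[k]
        out.extend(frame[overlap:])   # the non-overlapped part is appended unchanged
        pos += sa
    return out

# A constant test signal stays constant, and a ratio of 1/2 roughly halves it.
signal = [1.0] * 8000
shortened = ola_time_scale(signal, ratio=0.5)
```

With these parameters the 8,000-sample input yields 4,200 samples; the excess over exactly one half comes from the first window being copied whole.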

[60] In the TSM process, for example, if PCM data of 200 msec is modified with the ratio of 1/1.5, the amount of data is reduced to about 130 msec. A space in time (free interval) of about 70 msec is thus obtained for every 200 msec of PCM data. This space can be filled with the dummy data. If the dummy data camouflaging the real-time voice data is hard to distinguish from the original data, the residual intelligibility can be reduced significantly. In order to increase the degree of encryption in the TSM processing of the PCM data, the value of the time-scale ratio a can be varied, not fixed, when the PCM data is separated into a plurality of analysis windows in the analysis step. A modular calculation may be used for this.
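The arithmetic of the free interval can be checked with a short Python sketch. The function name is hypothetical; the range check reflects the range of about 1/4.0 to about 1/1.2 stated earlier for the time-scale ratio.

```python
def free_interval_ms(frame_ms, ratio):
    """Return (compressed length, free interval) in msec for one frame."""
    if not (1 / 4.0 <= ratio <= 1 / 1.2):
        raise ValueError("time-scale ratio outside the range 1/4.0 to 1/1.2")
    compressed = frame_ms * ratio
    return compressed, frame_ms - compressed

compressed, free = free_interval_ms(200, 1 / 1.5)
```

For 200 msec at a ratio of 1/1.5 this gives roughly 133 msec of voice and 67 msec of free space, which the text rounds to about 130 msec and about 70 msec.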

[61] Fig. 7 is a diagram for explaining a concept of the dynamic TSM process. As described above, the TSM process is performed on a predetermined amount of data each time. Let the unit amount of PCM data processed at a time be called a TSM frame (TF). First, the size of the TSM frame TF is determined using the modular calculation as follows. The basic value of the size of the TSM frame TF and a divisor are determined in advance as, for example, 5,000 and 673, respectively. Also, which digits of the encryption key are used to determine the size of the TSM frame TF should be determined in advance; for example, the values of #1, #2, and #3 in Table 1 are used. The key values in #1, #2, and #3, that is, 31, 11, and 23, are multiplied (giving 7,843) and then divided by the divisor 673 to obtain the remainder. The calculated remainder 440 is added to the basic value 5,000, so that the size of the corresponding TSM frame is determined to be 5,440. In particular, to make the size different for different TSM frames, the digits of the key used to determine the size of the TSM frame may be changed according to a predetermined rule (for example, the position of the digit is increased by a predetermined value). Since a different key value changes the remainder of the modular calculation, the size of the TSM frame is also determined differently. The time-scale ratio a is also determined differently for different TSM frames. Then, the sizes of the frames EF right after the TSM process (pre-encryption frames) become different from one another. Fig. 7(B) and (C), which show the TSM process described above, can be summarized in the following tables. As a result of the TSM process, the room of time per frame is obtained as in the following Table 2.
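The modular calculation for the frame size can be reproduced with a few lines of Python. The function name and the 0-based position list are illustrative assumptions. Only the first three key values (31, 11, 23) come from the text; the remaining entries of `key_numbers` are hypothetical placeholders standing in for Table 1.

```python
def tsm_frame_size(key_numbers, positions, base=5000, divisor=673):
    """Base size plus the remainder of the product of the selected key values."""
    product = 1
    for p in positions:
        product *= key_numbers[p % len(key_numbers)]
    return base + product % divisor

# First three values are from the text; the rest are made-up placeholders.
key_numbers = [31, 11, 23, 17, 29, 13, 7, 19, 37]

size_frame_1 = tsm_frame_size(key_numbers, [0, 1, 2])   # uses #1, #2, #3
size_frame_2 = tsm_frame_size(key_numbers, [1, 2, 3])   # positions shifted by one
```

`size_frame_1` evaluates to 5,440, matching the worked example (31 × 11 × 23 = 7,843, and 7,843 mod 673 = 440). Shifting the positions by one for the next frame is one possible instance of the "predetermined rule" and yields a different size.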

[62] Table 2

[63] Next, the encryptor (140) receives the TSM processed data from the TSM processor (120), and then performs a shuffling process on the TSM data (step S40). The shuffling is performed for each frame after the TSM process (EF_i); in this case, the frame after the TSM process (EF_i) is the basic processing unit of the encryptor (140) and is called a 'pre-encryption frame'. Of course, the shuffling can be performed based on a different criterion. Each of the pre-encryption frames (EF_1, EF_2, ..., EF_i, ...) is separated into a plurality of data blocks (called 'shuffling blocks') for shuffling. For example, each of the TSM processed pre-encryption frames is separated into about 4 to 8 shuffling blocks having a size of 20 msec. The number of shuffling blocks may be fixed as a constant, or varied using the modular calculation. Suppose that each of the pre-encryption frames (EF_1, EF_2, ..., EF_i, ...) is separated into, for example, five shuffling blocks. These five shuffling blocks are mixed in time by the shuffling process. As a result, in this shuffling step, the TSM processed data is separated into a plurality of consecutive data blocks, that is, 'shuffling blocks', a predetermined number of shuffling blocks are grouped into a group (that is, a pre-encryption frame), and the time order of the shuffling blocks in each of the groups is mixed.
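A keyed shuffle and its inverse can be sketched in Python as follows. The text leaves the shuffling rule open, so the rule used here (seeding Python's `random.Random` with the key and the frame index) is purely an illustrative assumption and is not cryptographically strong; all function names are hypothetical.

```python
import random

def shuffle_order(key, frame_index, n_blocks):
    """Derive a block order from the shared key and the frame index."""
    rng = random.Random(f"{key}:{frame_index}")   # same key -> same order
    order = list(range(n_blocks))
    rng.shuffle(order)
    return order

def shuffle_blocks(blocks, order):
    """Position k of the output holds the original block number order[k]."""
    return [blocks[i] for i in order]

def deshuffle_blocks(shuffled, order):
    """Inverse of shuffle_blocks for the same order."""
    original = [None] * len(order)
    for pos, i in enumerate(order):
        original[i] = shuffled[pos]
    return original

blocks = ["blk1", "blk2", "blk3", "blk4", "blk5"]
order = shuffle_order("CHOIS3625", frame_index=2, n_blocks=5)
mixed = shuffle_blocks(blocks, order)
restored = deshuffle_blocks(mixed, order)
```

Because both sides can recompute `order` from the shared key, the receiving side can restore the original time order without any side information being transmitted.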

[64] Since decryption is possible only when the shuffling order used in the encryption is known, it is desirable that the shuffling rule is determined using the encryption key as explained above. There are many possible rules by which the shuffling blocks can be mixed using the encryption key, and any of them can be used. The two users on both sides of the encrypted phone conversation can precisely deshuffle the shuffled blocks as long as they share the encryption key.

[65] After rearranging (that is, shuffling) the time order of the shuffling blocks belonging to each pre-encryption frame (EF_1, EF_2, ..., EF_i, ...), all or part of the shuffling blocks are pulled apart in time, and the dummy data is filled into the resulting empty spaces to obtain the encrypted data (step S50). Not every pre-encryption frame has to have dummy data added. Adding the dummy data irregularly may increase the degree of encryption. Since the voice of the transmitting side goes through several AD/DA conversions to reach the receiving side, there may be a difficulty in representing the TSM processed digital voice signal in terms of analog audio or converting the analog audio to a precise digital voice. However, by filling the dummy data in the room obtained by pulling all or part of the shuffling blocks apart in time, the sounds can be connected naturally and the original PCM data can be delivered without distortion. For adding such dummy data, it is desirable to select the dummy data blocks to add from the dummy stack (130), determine the adding position and size of each dummy block, combine the shuffling blocks and dummy blocks, and vary these steps using the encryption key.

[66] The criterion for selecting the dummy data to add is to select randomly out of the dummy blocks having the same energy level as the shuffling block to which the dummy block is added. To increase the degree of encryption, it is desirable to add dummy blocks in as many locations as possible. For example, as shown in Fig. 7(D), all shuffling blocks are pulled apart in time to form empty spaces in between, and the dummy data blocks are filled into all the resulting empty spaces in between as well as in front of the first shuffling block and behind the last shuffling block. The size of the added dummy data block (this is equal to the size of the empty space between the shuffling blocks) can be fixed, but in order to increase the degree of encryption it is desirable to vary the size. The dummy data blocks are added in front of the first shuffling block of each pre-encryption frame and behind the last shuffling block. In particular, synchronization data (described later) is added in front of the first shuffling block instead of the dummy data. Furthermore, the amounts of data before and after the encryption process are not required to be the same, but it is desirable to make them the same. In that way, the real-time phone conversation is not interfered with, and it becomes more difficult to distinguish the dummy data from the original voice data. To this end, the amount of added dummy data is preferably the same as the amount of data removed by the TSM process. Fig. 7(D) shows that the shuffling blocks and the dummy blocks are disposed alternately. It is the case in which each of the pre-encryption frames (EF_1, EF_2, ..., EF_i, ...) is separated into five shuffling blocks, the five shuffling blocks are pulled apart in time, and the dummy data blocks are filled between the five shuffling blocks and in front of the first shuffling block. For example, the second pre-encryption frame EF_2 is separated into five shuffling blocks B_21, B_22, B_23, B_24, B_25, the dummy data blocks D'_m25, D'_w9, D'_w21, D'_m38 are added to the four empty spaces provided between B_25, B_22, B_21, B_23, B_24, and the dummy block D'_s20 is added to an external space provided in front of the first shuffling block B_25, so as to form a post-encryption frame SF_2. The size of the dummy block D'_s20 added to the external space in front of the post-encryption frame SF_2 is fixed to a constant value, and the sizes of the five shuffling blocks B_25, B_22, B_21, B_23, B_24 and of the dummy data blocks D'_m25, D'_w9, D'_w21, D'_m38 added to the four empty spaces are varied.

[67] The amount of data of the post-encryption frames (SF_1, SF_2, ..., SF_i, ...) obtained by adding dummy data to the shuffling blocks in this way is made equal to the amount of data of the corresponding TSM frames (TF_1, TF_2, ..., TF_i, ...) before the encryption process. For example, in the case of the post-encryption frame SF_2, since the frame size before the TSM process is 4,550 [sample] and the sum of the sample numbers of the five shuffling blocks is 2,758, when the size of the dummy data block added to the empty space in the front is 400 [sample], the size of the remaining empty space is 1,392 [sample]. Thus, the average size of the dummy data blocks (D'_m25, D'_w9, D'_w21, D'_m38) becomes 348 [sample], and the sizes of the dummy data blocks D'_m25, D'_w9, D'_w21, D'_m38 are allotted as the values 310, 357, 344, 381 [sample], which are obtained by adding the average size to the remainders obtained from the modular calculations using the encryption key as described above. Furthermore, in order to vary the size of the dummy data block for each pre-encryption frame, the serial number of the encryption key used in the modular calculation can be changed according to a predetermined rule, for example, for each of the pre-encryption frames. The sizes of the five shuffling blocks (B_21, B_22, B_23, B_24, B_25) can be determined using a method similar to that used in determining the sizes of the dummy data blocks. Table 3 below shows a structure of the post-encryption frame SF_2 obtained in the above way.

[68] Table 3

[69] Structure of post-encryption frame SF

[70]
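The sizing arithmetic for the post-encryption frame described above can be verified with a short Python sketch. The function name is hypothetical. The signed offsets passed in below are simply the differences between the stated sizes (310, 357, 344, 381) and the average 348; how the modular calculation produces such offsets from the key is not specified in the text, so they are supplied directly here.

```python
def dummy_sizes(frame_size, shuffled_total, front_size, offsets):
    """Split the free space of one frame among the inner dummy blocks."""
    remaining = frame_size - shuffled_total - front_size
    average = remaining // len(offsets)
    sizes = [average + off for off in offsets]
    return remaining, average, sizes

remaining, average, sizes = dummy_sizes(
    frame_size=4550,       # TSM frame size before the TSM process
    shuffled_total=2758,   # samples in the five shuffling blocks
    front_size=400,        # fixed-size block in front of the frame
    offsets=[-38, 9, -4, 33],
)
```

The offsets sum to zero, so the four inner dummy blocks exactly fill the 1,392 remaining samples and the post-encryption frame has the same length as the original TSM frame.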

[71] When the dummy blocks are added to the shuffling blocks, the shuffling block, which is real-time voice data, and the dummy data block should be combined smoothly, leaving no trace of a boundary, so as to make it hard to distinguish the two blocks. For this, the dummy data block and the neighboring shuffling block are overlapped by a predetermined length and combined by applying a weight function. In particular, the connection can be smoother if the overlap-add (OLA) is applied at a point where the shuffling block and the dummy block have a maximum cross-correlation. Here, varying the overlapping length for every pre-encryption frame makes breaking the encryption more difficult. In order to vary the overlapping length, for example, candidate values for the overlapping length and rules to select one of the values are determined beforehand, and one value for the overlapping length is selected from the candidates according to the rules. Further, as discussed above, in selecting a dummy data block to fill a specific empty space, it is desirable to select a dummy data block which has an energy level closest to the energy level of one or both of the two shuffling blocks on the right and left sides of the empty space.

[72] Fig. 8 is a diagram illustrating adding dummy data by OLA using the pre-encryption frame EF_2. First, when the pre-encryption frame is separated into five shuffling blocks (B_21, B_22, B_23, B_24, B_25), further data is taken from the neighboring shuffling blocks on the right and left sides of each shuffling block. It does not matter if the additional data is taken from only one of the right and left sides. The additional data is overlap-added with the dummy data and thus becomes a part of the dummy data, which gives a smooth connection between the shuffling block and the dummy data block. For example, predetermined amounts of additional data E_x1 and E_x2 are added to the two sides of the shuffling block B_22, respectively, and the dummy data blocks D_m25 and D_w9 added to both sides of the shuffling block B_22 are overlap-added with the additional data E_x1 and E_x2. In overlap-adding, it is desirable to find a maximum cross-correlation point between the dummy data block D_m25 and the additional data E_x1 and a maximum cross-correlation point between the dummy data block D_w9 and the additional data E_x2, and to perform the overlap-adding at a location closest to the maximum cross-correlation points. Overlap-adding is performed using, for example, a linear function as the weight function. For example, synthesis data (OA) obtained by overlap-adding the dummy data block D_m25 and the additional data E_x1 is disposed immediately to the left of the shuffling block B_22, the entire remaining empty space to the left is filled with a part of the dummy data block D_m25, and the remaining data is discarded. The dummy data block D_w9 is also added after overlap-adding with the additional data E_x2 in a similar way. Dummy data blocks are added to the other shuffling blocks in the same way, so as to obtain the post-encryption frame SF_2 having the structure of Fig. 8(C). Of course, instead of using the data taken from the neighboring shuffling blocks as the shuffling block data overlapping with the dummy data block, sample data of the shuffling block itself to which the dummy data block is added can be used.
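The overlap-add join on the left side of a shuffling block can be sketched in Python as follows. This is an illustrative simplification, not the procedure of Fig. 8 itself: the cross-correlation is an unnormalized dot product, the weight function is linear, and all function names and the toy data are assumptions.

```python
def best_offset(dummy, extra, start):
    """Offset in dummy (>= start) where extra has the largest cross-correlation."""
    n = len(extra)
    best, best_score = start, float("-inf")
    for off in range(start, len(dummy) - n + 1):
        score = sum(dummy[off + k] * extra[k] for k in range(n))
        if score > best_score:
            best, best_score = off, score
    return best

def crossfade(dummy_seg, extra):
    """Linear weights: the dummy data fades out while the voice-side data fades in."""
    n = len(extra)
    return [((n - k) * dummy_seg[k] + k * extra[k]) / n for k in range(n)]

def fill_left_gap(dummy, extra, gap):
    """Fill `gap` samples to the left of a block: plain dummy data, then the OA part."""
    n = len(extra)
    lead = gap - n                        # samples filled with unmodified dummy data
    off = best_offset(dummy, extra, start=lead)
    return dummy[off - lead:off] + crossfade(dummy[off:off + n], extra)

dummy = [0, 0, 0, 1, 2, 3, 0, 0, 0, 0]   # toy dummy data block
extra = [1, 2, 3]                        # additional data next to the voice block
filled = fill_left_gap(dummy, extra, gap=5)
```

In this toy case the best match is found at offset 3 of the dummy block, so the gap is filled with two plain dummy samples followed by three crossfaded samples; dummy samples outside that range are discarded.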

[73] As explained above, in the encryption method according to the invention, a precise decryption is possible only when the same encryption key is known. In particular, by changing the encryption rules dynamically, with various values changing while the phone conversation is in progress, the security of the encryption is heightened further. Considering practicability, it is desirable to change, for example, at least the size and the overlapping interval of the TSM frame in the TSM process (that is, the number of samples in the pre-encryption frame), the number of samples in the shuffling blocks and the dummy data blocks, and the order of rearranging the shuffling blocks. The rules for changing can be made by properly combining the encryption key which the users share, modular calculation, tables prepared beforehand, the elapsed phone conversation time (that is, the increase in the number of processed frames), etc.

[74] In a real-time phone conversation between two users, in order to decrypt the encrypted voice data of the transmitting side precisely on the receiving side, a precise synchronization between the transmitting and receiving sides is needed. The encryption/decryption device on the transmitting side transmits to the receiving side the encrypted voice data comprising synchronization data, and the encryption/decryption device on the receiving side, if set with the same encryption key as the transmitting side, finds the synchronization data in the encrypted voice data and extracts the real-time voice data of the transmitting side based on the found synchronization data.

[75] The synchronization data, for example, can be made using a sinusoidal wave having a predetermined frequency. That is, for example, sampled sine waves of several periods can be used as the synchronization data. One period of sine wave has three points, starting point, a middle point (zero-crossing point), and an ending point, at which the sample value is 0, and these can be used to detect the synchronization data in the voice data in the form of sine wave easily. The synchronization data, as shown in Fig. 7(D), can be added instead of dummy data block to the empty space in front of the first post-encryption frame after the encryption conversation button (425) is pressed, and the synchronization data can be configured to have the same size as the dummy data block. When the encrypted voice data comprising such synchronization data is received and decrypted, for example, the starting point of the real-time voice data can be calculated by finding the zero-crossing point of the synchronization data and considering the total number of samples of the synchronization data. Since the synchronization may not be maintained due to problems such as noise during the encrypted conversation, it is desirable to insert the synchronization data to the encryption data periodically. The encryption/decryption devices of the users in the transmitting and receiving sides are configured to monitor periodically whether the received voice data contains synchronization data of the above. If a new synchronization data is detected, the decryption is performed based on it. 
In order to make it simple to initiate the encrypted phone conversation mode, when one side activates the encryption/decryption function of its own encryption/decryption device by pressing the encryption conversation button (425), the device is configured to generate the synchronization data right away, and the encryption/decryption device of the other side enters the encrypted conversation mode automatically once it detects the synchronization data in the encrypted voice data, even though its encryption conversation button (425) has not been pressed.
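A sine-wave synchronization marker and its detection can be sketched in Python. Note the swapped-in technique: the text locates the marker through its zero-crossing points, whereas this sketch uses a normalized cross-correlation search as a simpler stand-in. The function names, the amplitude, the period length, and the toy signals are all hypothetical.

```python
import math

def make_sync(periods, samples_per_period, amplitude=10000):
    """Sampled sine wave of several whole periods."""
    n = periods * samples_per_period
    return [round(amplitude * math.sin(2 * math.pi * k / samples_per_period))
            for k in range(n)]

def find_sync(signal, sync):
    """Return (start index, score) of the best normalized-correlation match."""
    n = len(sync)
    sync_energy = sum(s * s for s in sync)
    best, best_score = -1, 0.0
    for start in range(len(signal) - n + 1):
        seg = signal[start:start + n]
        seg_energy = sum(x * x for x in seg)
        if seg_energy == 0:
            continue
        dot = sum(a * b for a, b in zip(seg, sync))
        score = dot / math.sqrt(seg_energy * sync_energy)
        if score > best_score:
            best, best_score = start, score
    return best, best_score

sync = make_sync(periods=3, samples_per_period=16)
noise = [((i * 37) % 11) - 5 for i in range(40)]          # toy data before the marker
voice = [((i * 53) % 17) * 100 - 800 for i in range(30)]  # toy data after the marker
start, score = find_sync(noise + sync + voice, sync)
voice_start = start + len(sync)   # the voice data begins right after the marker
```

Once the marker position is known, the start of the real-time voice data follows from the known number of samples in the synchronization data, as the text describes.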

[76] The encryptor (140) produces the encrypted data in the way described above, and then converts it to the encrypted analog voice signal using the DAC (170) before providing it to the cell phone (300a) (step S60). The encrypted analog voice signal obtained this way is provided to the cell phone (300a) through the earphone jack (430), and then transmitted to the cell phone (300b) on the receiving side after a predetermined process for transmission (step S70).

[77] 2) Decryption algorithm

[78] Next, referring to Figs. 2-4 and 6, a decryption algorithm is described by which the encrypted analog voice signal of the user on the transmitting side is received from the cell phone (300b) on the receiving side and decrypted to restore the original voice signal before the encryption.

[79] For decryption, the encryption rules applied in the encryption process must be known. The encryption rules are determined by the encryption key. Therefore, the same encryption key as applied in the encryption is used in the decryption. For this, first of all, the user on the receiving side sets the same encryption key as the user on the transmitting side in his encryption/decryption device (400b or 500b) (step S100). In this state, the cell phone (300b) on the receiving side receives the encrypted voice signal from the cell phone (300a) on the transmitting side (step S105). The encryption/decryption device (400b or 500b) connected to the cell phone (300b) on the receiving side is provided with the encrypted voice signal from the cell phone (300b) as an analog signal. This encrypted analog voice signal is converted to digital voice data by the ADC (240), and then provided to the decryptor (210) (step S110). The ADC (240) applies the same sampling rate and resolution (for example, a sampling rate of 44.1 kHz and a sampling resolution of 16 bits) as applied in the encryption process.

[80] The decryptor (210) removes the dummy data contained in the encrypted digital voice data and extracts only the shuffling blocks (that is, the shuffling-processed TSM data) by applying reversely, using the encryption key, the rules for adding dummy data used in the encryption (step S120).

[81] For such data processing to be possible, first of all, the encryption data in the encrypted digital voice data must be found. For example, in a case where the synchronization data is made by sampling a sine wave, the synchronization data can be found, as described above, by finding the zero-crossing point of the sine wave and determining the starting point of the synchronization data based on the zero-crossing point. Once the starting point of the encryption data is found, the size of each of the post-encryption frames (SF1, SF2, ..., SFi, ...) is calculated based on it. Since the size of each of the post-encryption frames is equal to that of each of the TSM frames before the TSM process shown in Fig. 7(B), it can be found by applying the rules for determining the size of the TSM frame. After the size of each of the post-encryption frames is found, the locations of the starting and ending points of the shuffling blocks contained in each of the post-encryption frames are calculated. Because the same amount of data (for example, 200 samples) is always placed in front of the first shuffling block of every post-encryption frame, the starting point of the first shuffling block is easy to determine. Also, the rules for determining the size of each of the shuffling blocks and the dummy data blocks are the same in the encryption/decryption device (400a or 500a) on the transmitting side and the encryption/decryption device (400b or 500b) on the receiving side if the encryption key is the same. Therefore, once the starting point of the synchronization data is known, only the shuffling blocks can be extracted from each of the post-encryption frames.
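The extraction described in [80] and [81] can be sketched as follows. This is a hedged illustration only: the `layout` list of (shuffling block length, dummy block length) pairs stands in for whatever sizes the key-derived rules would produce, and the function name and the tiny frame are hypothetical.

```python
def extract_shuffling_blocks(frame, lead_len, layout):
    """Drop dummy samples and return the shuffling blocks of one post-encryption frame.

    frame    : samples of one post-encryption frame
    lead_len : fixed number of samples in front of the first shuffling block
               (the description gives 200 samples as an example)
    layout   : assumed list of (block_len, dummy_len) pairs in frame order
    """
    blocks, pos = [], lead_len
    for block_len, dummy_len in layout:
        blocks.append(frame[pos:pos + block_len])
        pos += block_len + dummy_len  # skip the dummy data behind this block
    if pos != len(frame):
        raise ValueError("layout does not match the frame length")
    return blocks

D = -1  # marker for dummy samples in this toy frame
frame = [D, D, 1, 2, 3, D, 4, 5, D, D]
blocks = extract_shuffling_blocks(frame, lead_len=2, layout=[(3, 1), (2, 2)])
```

Because both sides derive the same layout from the same key, the receiver needs only the frame start to locate every block boundary.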

[82] The decryptor (210), after extracting the shuffling blocks of each of the post-encryption frames (SF1, SF2, ..., SFi, ...), restores their time order to the original state by rearranging the shuffling blocks contained in each of the post-encryption frames in the original order of time (step S130). This is done by applying reversely, using the encryption key, the shuffling rules which were applied in the encryption. By this deshuffling process, the pre-encryption frames (EF1, EF2, ..., EFi, ...) right after the TSM process shown in Fig. 7(C) are obtained.
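The deshuffling of [82] amounts to applying the inverse of a key-dependent permutation. The sketch below is an assumption-laden illustration: the description does not say how the order is derived from the key, so `block_order` uses a SHA-256 ranking purely as a stand-in, and all function names are hypothetical.

```python
import hashlib

def block_order(key, frame_index, n_blocks):
    """Hypothetical rule: order[j] is the original index of the block sent in slot j."""
    def rank(i):
        return hashlib.sha256(f"{key}:{frame_index}:{i}".encode()).digest()
    return sorted(range(n_blocks), key=rank)

def shuffle_blocks(blocks, order):
    """Transmitting side: emit the blocks in the key-derived order."""
    return [blocks[i] for i in order]

def deshuffle_blocks(shuffled, order):
    """Receiving side: put every block back at its original position."""
    restored = [None] * len(order)
    for slot, original_index in enumerate(order):
        restored[original_index] = shuffled[slot]
    return restored

original = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
order = block_order("1234", frame_index=0, n_blocks=len(original))
sent = shuffle_blocks(original, order)
received = deshuffle_blocks(sent, order)
```

Both devices compute the same `order` from the shared key, so no permutation data has to be transmitted.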

[83] The pre-encryption frames (EF1, EF2, ..., EFi, ...) extracted by the decryptor (210) are delivered to the TSM processor (220), and the TSM processor (220) performs the TSM process by applying reversely, using the encryption key, the time-scale ratio applied in the encryption/decryption device (400a) on the transmitting side (step S140). Through this TSM process, each of the pre-encryption frames (EF1, EF2, ..., EFi, ...) is restored to have almost the same amount of data, and approximately the same number of samples, as the TSM frames (TF1, TF2, ..., TFi, ...) before the TSM process. If these TSM frames (TF1, TF2, ..., TFi, ...) are connected serially, they form the PCM data before the encryption in the encryption/decryption device (400a) on the transmitting side.
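To make the "reverse ratio" of [83] concrete, the following Python sketch compresses a signal by 1/2 and then expands it by 2 with a plain windowed overlap-add. It is only one simple member of the OLA family named in the claims and is not claimed to be the algorithm of the embodiment; `ola_time_scale`, the Hann window, and the frame length of 256 are assumptions.

```python
import math

def ola_time_scale(samples, ratio, frame_len=256):
    """Change duration by `ratio` (>1 lengthens, <1 shortens) using overlap-add."""
    hop_out = frame_len // 2                  # output frames overlap by half
    hop_in = max(1, round(hop_out / ratio))   # input hop sets the time-scale ratio
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    n_frames = max(1, (len(samples) - frame_len) // hop_in + 1)
    out_len = (n_frames - 1) * hop_out + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len
    for k in range(n_frames):
        src, dst = k * hop_in, k * hop_out
        for i in range(frame_len):
            if src + i < len(samples):
                out[dst + i] += window[i] * samples[src + i]
                norm[dst + i] += window[i]
    # Divide by the summed window weights so overlapping regions keep their level.
    return [o / n if n > 1e-9 else 0.0 for o, n in zip(out, norm)]

# 0.1 s of a 220 Hz tone at 44.1 kHz as a stand-in for voice PCM data.
voice = [round(3000 * math.sin(2 * math.pi * 220 * i / 44100)) for i in range(4410)]
compressed = ola_time_scale(voice, 0.5)     # encryption side: about half the samples
restored = ola_time_scale(compressed, 2.0)  # decryption side: ratio applied reversely
```

The restored signal has approximately, not exactly, the original number of samples, which is consistent with the "almost the same amount of data" wording above.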

[84] The restored PCM data is temporarily buffered in the buffer (230), delivered to the DAC (250), and converted to the analog voice signal (step S150). The analog voice signal is then delivered to the earphone speaker (410) and restored to a voice (step S160).

[85] If a post-encryption frame comprising new synchronization data is encountered during decryption, the decryption process described above is performed anew from that point on. The encryption process and the decryption process are separated between the transmitting and receiving paths in both hardware and software, so that a full-duplex phone conversation is possible.

Industrial Applicability

[86] Although the invention has been explained above using some embodiments, one skilled in the art will clearly recognize that many changes are possible without departing from the scope of the invention. For example, in order to strengthen the degree of encryption, the post-encryption frames can be formed by performing, in addition to the inter-block shuffling (that is, shuffling between the shuffling blocks in the pre-encryption frame), an intra-block shuffling that mixes the time order of the samples of at least a part of the shuffling blocks in the post-encryption frame. These intra-block shuffling rules can use a modular calculation or other calculations using the encryption key. Since a reasonable degree of encryption can be obtained with only the TSM process, inter-block shuffling, and intra-block shuffling, without adding the dummy data, an appropriate combination of the above encryption methods can be used according to the required encryption level. Also, the method of adding the dummy data block can be changed in many ways. For example, the dummy data blocks can be filled in only a part of the empty spaces between the shuffling blocks in each of the pre-encryption frames, and a dummy block can be added at the end of each of the pre-encryption frames.
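One way to read "a modular calculation" for intra-block shuffling is an affine index map modulo the block length. The sketch below is an illustrative assumption, not the rule of the embodiment: sample i moves to position (a*i + b) mod n, where the hypothetical helper `affine_params` picks a multiplier a coprime to n and an offset b from an integer key.

```python
import math

def affine_params(key, n):
    """Hypothetical rule: derive a multiplier coprime to n and an offset from the key."""
    a = key % n or 1
    while math.gcd(a, n) != 1:  # a coprime multiplier makes the map a permutation
        a += 1
    b = (key // n) % n
    return a, b

def intra_shuffle(block, key):
    """Move sample i to position (a*i + b) mod n."""
    n = len(block)
    if n < 2:
        return list(block)
    a, b = affine_params(key, n)
    out = [0] * n
    for i, sample in enumerate(block):
        out[(a * i + b) % n] = sample
    return out

def intra_deshuffle(block, key):
    """Read sample i back from position (a*i + b) mod n."""
    n = len(block)
    if n < 2:
        return list(block)
    a, b = affine_params(key, n)
    return [block[(a * i + b) % n] for i in range(n)]

block = list(range(10))
mixed = intra_shuffle(block, key=1234)
unmixed = intra_deshuffle(mixed, key=1234)
```

Because the receiver recomputes the same (a, b) from the shared key, it can undo the map without computing a modular inverse.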

[87] Also, the rules obtained using the encryption key set by the user can be configured to change according to information such as the time, day, or date of the phone conversation, or the number of phone calls with the other user. The encryption rules can also be made by using a unique number given to the encryption/decryption device along with the encryption key. Using these methods may make it harder for a third party to eavesdrop. Further, the above embodiments have been explained with reference to a cell phone, but they can be applied to other communication devices including wired telephones and VoIP phones. Also, the above embodiments concern a one-to-one phone conversation, but the invention can be applied to a conference call in which three or more persons converse, or to one-to-many two-way radio communication.
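The idea in [87] of varying the rules with call information can be sketched as a deterministic derivation from the key plus that information. Everything in this example is hypothetical, including the function name, the use of SHA-256, the rule fields, and the block-count range; only the time-scale ratio range of about 1/4.0 to about 1/1.2 is taken from the claims.

```python
import hashlib

def derive_rules(key, call_date, call_count, device_id=""):
    """Hypothetical derivation of per-call encryption rules from shared information."""
    material = f"{key}|{call_date}|{call_count}|{device_id}".encode()
    seed = hashlib.sha256(material).digest()
    # Map one byte onto a divisor x in [1.2, 4.0], giving a ratio of 1/x.
    x = 1.2 + (seed[0] / 255) * (4.0 - 1.2)
    return {
        "tsm_ratio": 1 / x,
        "n_blocks": 4 + seed[1] % 5,                    # assumed range: 4 to 8 blocks
        "order_seed": int.from_bytes(seed[2:6], "big"),  # seed for the block order
    }

rules_today = derive_rules("1234", "2007-04-17", call_count=3)
rules_again = derive_rules("1234", "2007-04-17", call_count=3)
rules_next_day = derive_rules("1234", "2007-04-18", call_count=3)
```

Both parties obtain identical rules as long as they agree on the key and on the call information, while a recording from one call gives no direct view of the rules used for the next.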

Claims

[1] A method for voice signal encryption/decryption, the method comprising steps of: reducing the amount of data by performing a time-scale modification (TSM) on a digital voice signal of a user in transmitting side; shuffling the time-scale modification processed digital voice signal by separating the time-scale modification processed digital voice signal into a plurality of contiguous pre-encryption frames, separating each of the plurality of pre-encryption frames into a plurality of shuffling blocks, and shuffling order of the plurality of shuffling blocks in time; generating a plurality of post-encryption frames by adding dummy voice signals to each of the plurality of pre-encryption frame after shuffling, wherein the voice signal is converted to the encrypted voice signal which is unintelligible.

[2] The method of Claim 1, wherein at least part of the steps including the time-scale modification process, the shuffling process, and the dummy voice signal add process are performed according to a predetermined encryption rules which are set by an encryption key.

[3] The method of Claim 2, further comprising adding synchronization signal needed for decrypting the encrypted voice signal periodically or aperiodically to the plurality of pre-encryption frame along with adding the dummy voice signal.

[4] The method of Claim 3, further comprising a step of decrypting the encrypted voice signal of the user in transmitting side according to decryption rules which apply the encryption rules reversely using the same encryption key.

[5] The method of Claim 4, wherein the step of decrypting comprises steps of: detecting the synchronization signal contained in the encrypted voice signal; converting to the pre-encryption frames by separating the encrypted voice signal into a plurality of post-encryption frames based on the detected synchronization signal, extracting the shuffling blocks from each of the post-encryption frames with the dummy voice signal excluded, and restoring the order of time of the shuffling blocks; and restoring the original digital voice signal of the user by performing the time-scale modification on the pre-encryption frames with the time-scale ratio applied reversely.

[6] The method of any one of Claims 2-5, wherein the encryption rules change randomly in the process of the encryption at least part of number of samples forming the pre-encryption frame, number of samples forming the shuffling blocks, rearranging order of the shuffling blocks, number and position of samples of the added dummy voice signal blocks, position of the synchronization signal, and the time-scale ratio applied for the time-scale-modification.

[7] The method of Claim 6, wherein the time-scale ratio is in the range from about 1/4.0 to about 1/1.2.

[8] The method of Claim 6, further comprising a step of rearranging at least part of the shuffling blocks after mixing the order of time of the samples in the shuffling blocks.

[9] The method of Claim 6, wherein an algorithm used for the time-scale-modification comprises an overlap and add (OLA) type algorithm, the OLA algorithm comprising: analyzing by cutting out PCM data of the digital voice signal into a plurality of contiguous analysis windows, each of which has a predetermined size and neighboring analysis windows overlap by a predetermined distance; and combining by expanding the overlapping segments of the neighboring analysis windows according to a given time-scale ratio (a), combining data in the overlapping segment with weight, and combining the remaining data without weight.

[10] The method of Claim 6, wherein the dummy voice signal is obtained from the user's digital voice signal stored in a storing device after being taken prior to the encryption or from the user's voice signal taken in real-time during the encryption.

[11] The method of Claim 6, wherein adding the dummy voice signal comprises separating the dummy voice signal into blocks and assigning a level out of a plurality of energy levels according to absolute value of energy to each of the blocks; adding the dummy voice signal in empty spaces provided by pulling at least part of the shuffling blocks of at least part of the pre-encryption frames apart in time, wherein the dummy voice signal block, having an energy level closest to energy levels of the shuffling block to which the dummy voice signal block is added, is selected and added.

[12] The method of Claim 6 or Claim 11, wherein when the dummy voice signal block is added the dummy voice signal block is overlapped with a part of samples of the shuffling block to which the dummy voice signal block is added and the samples in the overlapped segments are combined with weight.

[13] A method for encrypting/decrypting a real-time telephone conversation voice between a first user using a first encrypting/decrypting device built in or connected to a first communication device and a second user using a second encrypting/decrypting device built in or connected to a second communication device, the method comprising: the first and second users' setting the first and second encrypting/decrypting devices respectively to an identical encryption key; in the first encrypting/decrypting device of the first user, transmitting an unintelligible encrypted voice signal to the second communication device of the second user by converting the voice of the first user to a digital voice signal, reducing the amount of the data with a predetermined time scale ratio by performing a time-scale-modification (TSM) on the digital voice signal according to a set of encryption rules determined using the encryption key, separating each of the plurality of pre-encryption frames into a plurality of shuffling blocks, shuffling the order of the plurality of shuffling blocks in time, converting the plurality of pre-encryption frames into a plurality of post-encryption frames by adding dummy voice signals to each of the plurality of pre-encryption frame after shuffling, and forming the encrypted voice signal which is unintelligible by adding synchronization signals for decryption periodically or aperiodically to the post-encryption frames; and in the second encrypting/decrypting device of the second user, restoring the unintelligible encrypted voice signal to the original digital voice signal according to the decryption rules using the encryption key and applying the encryption rules reversely by detecting the synchronization signals in the encrypted voice received from the second communication device, separating the encrypted voice signal of the first user into the plurality of post-encryption frames using the detected synchronization signals as a base, extracting the shuffling blocks from each of the plurality of post-encryption frames and removing the dummy voice signal blocks, restoring the time order of the shuffling blocks and converting into the pre-encryption frames, and applying the predetermined time scale ratio reversely and performing the time-scale-modification (TSM) on the pre-encryption frames.

[14] The method of Claim 13, further comprising performing an analog/digital signal conversion and a digital/analog signal conversion for communicating the voice signal in analog between the first encryption/decryption device and the first communication device and between the second encryption/decryption device and the second communication device.

[15] The method of Claim 13 or Claim 14, wherein the encryption rules change randomly in the process of the encryption at least part of a number of samples forming the pre-encryption frame, a number of samples forming the shuffling blocks, a rearranging order of the shuffling blocks, a position to add the dummy voice signal and a number of samples each block of the added dummy voice signal, a position of the synchronization signal, and the time-scale ratio for the time-scale-modification.

[16] The method of Claim 13 or Claim 14, wherein the algorithm for processing the time-scale-modification comprises an overlap and add (OLA) type algorithm, the OLA algorithm comprising: analyzing by cutting out PCM data of the digital voice signal into a plurality of contiguous analysis windows, each of which has a predetermined size and neighboring analysis windows overlap by a predetermined distance; and combining by expanding the overlapping segments of the neighboring analysis windows according to a given time-scale ratio (a), combining data in the overlapping segment with weight, and combining the remaining data without weight.

[17] The method of Claim 13 or Claim 14, wherein the dummy voice signal is obtained from the user's digital voice signal stored in a storing device after being taken prior to the present encryption or from the user's voice signal taken in real-time during the present encryption.

[18] The method of Claim 13 or Claim 14, wherein the dummy voice signal is added in empty spaces provided by pulling at least part of the shuffling blocks of at least part of the pre-encryption frames apart in time, wherein the dummy voice signal block is overlapped with a part of samples of the shuffling block to which the dummy voice signal block is added and the samples in the overlapped segments are combined with weight.

[19] The method of Claim 13 or Claim 14, wherein if the second encryption/decryption device detects the synchronization data in the encrypted voice data of the first user the second encryption/decryption device enters to the encrypted conversation mode automatically even though the encryption conversation button is not pressed.

[20] The method of Claim 13 or Claim 14, wherein the amount of the added dummy data is same as the amount of reduced data by the TSM process.

[21] A voice signal encryption device comprising: a microphone configured to convert the voice into an analog voice signal; an encryption key provider configured to make the user set the encryption key by inputting; an analog/digital converter (ADC) configured to convert the analog voice signal from the microphone into a digital voice signal by sampling the analog voice signal; a storing device configured to provide a space for processing data and to store the voice signal, the encryption key, and an encryption program; and a central processing unit (CPU) configured to perform controlling and calculating for encrypting the digital voice signal by executing the encryption program, wherein the encryption program comprises an encryption function, which comprises: reducing the amount of the data with a predetermined time scale ratio by performing a time-scale-modification (TSM) on the digital voice signal according to a set of encryption rules using the encryption key, separating the time-scale-modification processed digital voice signal into a plurality of contiguous pre-encryption frames, separating each of the plurality of pre-encryption frames into a plurality of shuffling blocks and shuffling the order of the plurality of shuffling blocks in time, converting the plurality of pre-encryption frames into a plurality of post-encryption frames by adding dummy voice signal blocks selected from the storing device to at least part of the plurality of pre-encryption frame, and forming the encrypted voice signal which is unintelligible by adding synchronization signals for decryption periodically or aperiodically to the post-encryption frames.

[22] The device of Claim 21, wherein the encryption rules change randomly in the process of the encryption at least part of number of samples forming the pre-encryption frame, number of samples forming the shuffling blocks, rearranging order of the shuffling blocks, number and position of samples of the added dummy voice signal blocks, position of the synchronization signal, and the time-scale ratio applied for the time-scale-modification.

[23] The device of Claim 21 or Claim 22, wherein an algorithm used for the time-scale-modification comprises an overlap and add (OLA) type algorithm, the OLA algorithm comprising: analyzing by cutting out PCM data of the digital voice signal into a plurality of contiguous analysis windows, each of which has a predetermined size and neighboring analysis windows overlap by a predetermined distance; and combining by expanding the overlapping segments of the neighboring analysis windows according to a given time-scale ratio (a), combining data in the overlapping segment with weight, and combining the remaining data without weight.

[24] The device of any one of Claims 21-23, wherein the dummy voice signal is obtained from the user's digital voice signal stored in a storing device after being taken prior to the encryption or from the user's voice signal taken in real-time during the encryption.

[25] A voice signal encryption/decryption device to enable a real-time secured conversation between users using a same encryption key with the communication devices, the device comprising: an interface configured to provide an interface to communicate a voice signal with the communication device; a storing device configured to provide a space for processing data and to store the voice signal, the encryption key, and an encryption program; a central processing unit (CPU) configured to perform controlling and calculating for encrypting the digital voice signal and decrypting the encrypted voice signal by executing the encryption/decryption program; an input device configured to set the encryption key used for the encryption and the decryption; a microphone configured to convert the user's voice into an analog voice signal; a speaker configured to convert the analog voice signal input to a voice; an analog/digital converter (ADC) configured to convert the analog voice signal from the microphone into a digital voice signal by sampling the analog voice signal and providing the digital voice signal to the CPU; and a digital/analog converter (DAC) configured to convert the decrypted voice signal from the CPU into an analog voice signal and providing the analog voice signal to the speaker, wherein the encryption/decryption program comprises: an encryption function comprising reducing the amount of the data with a predetermined time scale ratio by performing a time-scale-modification (TSM) on the digital voice signal of the user provided by the ADC according to a set of encryption rules using the encryption key, separating the time-scale-modification processed digital voice signal into a plurality of contiguous pre-encryption frames, separating each of the plurality of pre-encryption frames into a plurality of shuffling blocks and shuffling the order of the plurality of shuffling blocks in time, converting the plurality of pre-encryption frames into a plurality of post-encryption frames by adding dummy voice signal blocks to at least part of the plurality of pre-encryption frame, and forming the encrypted voice signal which is unintelligible by adding synchronization signals for decryption periodically or aperiodically to the post-encryption frames and transmitting the user's encrypted voice signals to a communication device of the other user through the interface; and a decryption function comprising detecting the synchronization signals in the encrypted voice received from the other side, separating the encrypted voice signal into the plurality of post-encryption frames using the detected synchronization signals as a base, extracting the shuffling blocks from each of the plurality of post-encryption frames and removing the dummy voice signal blocks, restoring the time order of the shuffling blocks and converting into the pre-encryption frames, applying the predetermined time scale ratio reversely and performing the time-scale-modification (TSM) on the pre-encryption frames, restoring the unintelligible encrypted voice signal to the original digital voice signal according to the decryption rules using the encryption key and applying the encryption rules reversely, and converting the original digital voice signal into an analog voice signal in the DAC and providing the analog voice signal to the speaker.

[26] The device of Claim 25, wherein the encryption/decryption program in the voice signal encryption/decryption device is configured to: convert the encrypted voice signal of the user into an analog voice signal through the DAC and provide the analog voice signal to the interface in encryption process; and receive through the interface the encrypted analog voice signal of the other side provided by the communication device of the other user, convert the analog voice signal to a digital voice signal with the ADC, and provide the voice signal in the form of analog signal between the encryption/decryption programs and the communication devices of the users on both sides in decryption process.

[27] The device of Claim 25 or Claim 26, wherein the ADC comprises: a first ADC for sampling the analog voice signal of the user provided by the microphone, converting it to digital voice signal, and providing the digital voice signal to the CPU; and a second ADC for sampling the analog voice signal of the other side provided through the interface, converting it to digital voice signal, and providing the digital voice signal to the CPU, wherein the DAC comprises: a first DAC for converting the encryption voice signal of the user provided by the CPU into analog voice signal and providing the analog voice signal to the interface; and a second DAC for converting the decrypted voice signal of the other side provided by the CPU into analog voice signal and providing the analog voice signal to the speaker.

[28] The device of Claim 25 or Claim 26, wherein the encryption/decryption device provides a function to set different encryption keys for different other sides so as to enable a secured phone conversation with each of a plurality of users in other sides.

[29] The device of Claim 25 or Claim 26, wherein the encryption rules change randomly in the process of the encryption at least part of number of samples forming the pre-encryption frame, number of samples forming the shuffling blocks, rearranging order of the shuffling blocks, number and position of samples of the added dummy voice signal blocks, position of the synchronization signal, and the time-scale ratio applied for the time-scale-modification.

[30] The device of Claim 25 or Claim 26, wherein the dummy voice signal is obtained from the user's digital voice signal stored in a storing device after being taken prior to the encryption or from the user's voice signal taken in real-time during the encryption.

[31] The device of Claim 25 or Claim 26, wherein interfacing provided by the interface between the encryption/decryption device and the communication device is provided by one of a wire type connecting the earphone jacks of the encryption/decryption device and the communication device with cable and a wireless type connecting wireless communication modules of the encryption/decryption device and the communication device.