CN106157961B - Voice signal processing method and device

Info

Publication number: CN106157961B
Application number: CN201510167312.6A
Authority: CN
Inventors: 姜山; 刘智勇; 郎冬玲
Original assignee: Spreadtrum Communications Shanghai Co Ltd
Current assignee: Spreadtrum Communications Shanghai Co Ltd
Priority date: 2015-04-09
Filing date: 2015-04-09
Publication date: 2020-01-31
Anticipated expiration: 2035-04-09
Also published as: CN106157961A

Abstract

A voice signal processing method includes: converting collected time domain voice signals into digital voice signals and performing framing processing to obtain corresponding digital voice frames; performing frequency domain conversion on the obtained digital voice frames to obtain corresponding frequency domain voice data; within a preset encryption period, encrypting the frequency domain voice data corresponding to a preset number of digital voice frames by using corresponding encryption information respectively, wherein the encryption information comprises an encryption key and an encryption algorithm, and the frequency domain voice data corresponding to the digital voice frames with the same bit sequence (that is, the same position) in each encryption period are encrypted by using the same encryption key and the same encryption algorithm; performing time domain conversion on the encrypted frequency domain voice data to obtain encrypted digital voice data frames; and compressing and sending the encrypted digital voice data frames.

Description

Voice signal processing method and device

Technical Field

The present invention relates to the field of voice communication technologies, and in particular, to a voice signal processing method and device.

Background

When a mobile terminal performs voice communication over an operator network, the voice signal is short-time stationary and partially quasi-periodic, so it contains redundancy. The prior art therefore performs compression coding on the voice source at the sending end device and decodes it after it reaches the receiving end, so as to improve the transmission efficiency of the voice signal and the utilization efficiency of the transmission medium.

In order to improve the security of voice communication, a voice encryption technology is usually adopted to encrypt a voice source sent by a sending end device. However, the voice encryption technology in the prior art is limited by the used key and encryption and decryption algorithm, and has the problem of poor voice communication security.

Disclosure of Invention

The embodiment of the invention addresses the problem of how to improve the security of voice data transmission between mobile terminals.

To solve the above problem, an embodiment of the present invention provides a method for processing a voice signal, where the method includes:

converting the collected time domain voice signals into digital voice signals and performing framing processing to obtain corresponding digital voice frames;

performing frequency domain conversion on the obtained digital voice frame to obtain corresponding frequency domain voice data, wherein the frequency domain voice data comprises frequency information of the corresponding digital voice frame;

encrypting the frequency domain voice data corresponding to a preset number of digital voice frames by using corresponding encryption information in a preset encryption period respectively, wherein the encryption information comprises an encryption key and an encryption algorithm, and the frequency domain voice data corresponding to the digital voice frames with the same bit sequence in each encryption period are encrypted by using the same encryption key and the same encryption algorithm;

carrying out time domain conversion on the encrypted frequency domain voice data to obtain an encrypted digital voice data frame;

and compressing and transmitting the encrypted digital voice data frame.

Optionally, the frequency domain voice data corresponding to each digital voice frame includes an invalid voice frequency band and an effective voice frequency band, the invalid voice frequency band does not include voice information, and the effective voice frequency band includes voice information;

the encrypting of the frequency domain voice data corresponding to the preset number of digital voice frames in the preset encryption period by using corresponding encryption information respectively comprises:

and in the encryption period, adding frame synchronization information in the invalid voice frequency band of the frequency domain voice data corresponding to each digital voice frame, and encrypting the valid voice frequency band of the frequency domain voice data corresponding to each digital voice frame by adopting corresponding encryption information.

Optionally, the method further comprises:

when the preset renegotiation condition is met, sending a request for changing the encryption information to receiving end equipment through in-band signaling;

when receiving information of agreeing to change the encrypted information replied by the receiving end equipment through in-band signaling, sending the information of the switching time point of the changed encrypted information to the receiving end equipment through in-band signaling;

and when receiving the confirmation information that the receiving end equipment receives the switching time point information, determining that the negotiation with the receiving end equipment is successful in changing the encrypted information.

Optionally, the preset renegotiation condition includes: a preset time being reached, or a renegotiation request from the user to change the encryption information being received.
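The sending-end renegotiation steps above can be sketched as a small state machine. This is only an illustration: the message names (`CHANGE_REQUEST`, `AGREE`, `SWITCH_POINT`, `SWITCH_POINT_ACK`), the `send` callback, and the choice to express the switching time point as a frame number are all assumptions, since the text does not define the in-band signaling format.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    REQUEST_SENT = auto()
    SWITCH_POINT_SENT = auto()
    NEGOTIATED = auto()


class SenderRenegotiation:
    """Sending-end view of the three-step renegotiation (hypothetical message names)."""

    def __init__(self, send, switch_frame):
        self.send = send                  # callable that delivers one in-band message
        self.switch_frame = switch_frame  # assumed: switching time point as a frame number
        self.state = State.IDLE

    def start(self):
        # Called once a preset renegotiation condition is met.
        if self.state is State.IDLE:
            self.send(("CHANGE_REQUEST", None))
            self.state = State.REQUEST_SENT

    def on_message(self, kind):
        if self.state is State.REQUEST_SENT and kind == "AGREE":
            # The peer agreed, so announce when the new encryption information takes effect.
            self.send(("SWITCH_POINT", self.switch_frame))
            self.state = State.SWITCH_POINT_SENT
        elif self.state is State.SWITCH_POINT_SENT and kind == "SWITCH_POINT_ACK":
            # The peer confirmed the switching time point: negotiation succeeded.
            self.state = State.NEGOTIATED


outbox = []
sender = SenderRenegotiation(outbox.append, switch_frame=300)
sender.start()
sender.on_message("AGREE")
sender.on_message("SWITCH_POINT_ACK")
```

Messages that arrive in the wrong state are ignored here; a real implementation would also need timeouts and retransmission, which the text does not discuss.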

The embodiment of the invention also provides a voice signal processing method, which comprises:

decompressing the received digital voice data frame which is subjected to the compression and encryption processing to obtain the digital voice data frame which is subjected to the encryption processing;

converting the encrypted digital voice data frame into corresponding frequency domain voice data, and decrypting the frequency domain voice data by adopting corresponding decryption information, wherein the decryption information comprises a decryption key and a decryption algorithm;

and converting the decrypted frequency domain voice data into a corresponding digital voice frame and outputting the digital voice frame.

Optionally, the frequency domain voice data includes an invalid voice frequency band and an effective voice frequency band, the invalid voice frequency band does not include voice information, and the effective voice frequency band includes voice information;

the converting the encrypted digital voice data frame into corresponding frequency domain voice data and decrypting the frequency domain voice data by using corresponding decryption information includes:

and analyzing corresponding frame synchronization information from the invalid voice frequency band of the frequency domain voice data, and decrypting the valid voice frequency band in the frequency domain voice data by adopting decryption information corresponding to the analyzed frame synchronization information to obtain the corresponding decrypted frequency domain voice data.

Optionally, the method further comprises:

when a renegotiation request for changing the encryption information is received through an in-band signaling, replying information for agreeing to change the encryption information through the in-band signaling, wherein the encryption information comprises an encryption key and an encryption algorithm;

receiving information of a switching time point for enabling the changed encryption information through an in-band signaling;

and when receiving the replied confirmation information of receiving the switching time point information, successfully negotiating and changing the encrypted information with the receiving terminal equipment.
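A matching receiving-end sketch of the same exchange is given below. The message names, the `send` callback, and the use of a frame number as the switching time point are assumptions for illustration, not part of the described method.

```python
class ReceiverRenegotiation:
    """Receiving-end view of the renegotiation (hypothetical message names)."""

    def __init__(self, send):
        self.send = send                  # callable that delivers one in-band message
        self.agreed = False
        self.pending_switch_frame = None  # set once the switching time point is known

    def on_message(self, kind, payload=None):
        if kind == "CHANGE_REQUEST":
            # Agree to change the encryption information.
            self.agreed = True
            self.send(("AGREE", None))
        elif kind == "SWITCH_POINT" and self.agreed:
            # Remember when to enable the changed information, then confirm receipt.
            self.pending_switch_frame = payload
            self.send(("SWITCH_POINT_ACK", payload))


replies = []
receiver = ReceiverRenegotiation(replies.append)
receiver.on_message("CHANGE_REQUEST")
receiver.on_message("SWITCH_POINT", 300)
```

A `SWITCH_POINT` message that arrives before any request has been agreed to is ignored in this sketch.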

Optionally, before the digital voice frame converted from the decrypted frequency domain voice data is output, the method further includes: performing voice enhancement processing on the digital voice frame.

The embodiment of the invention also provides a voice signal processing device, which comprises:

the framing processing unit is suitable for converting the collected time domain voice signals into digital voice signals and performing framing processing to obtain corresponding digital voice frames;

a frequency domain conversion unit, adapted to perform frequency domain conversion on the obtained digital voice frames to obtain corresponding frequency domain voice data, where the frequency domain voice data includes frequency information of the corresponding digital voice frames;

the encryption processing unit is suitable for encrypting the frequency domain voice data corresponding to a preset number of digital voice frames respectively by adopting corresponding encryption information in a preset encryption period, wherein the encryption information comprises an encryption key and an encryption algorithm, and the frequency domain voice data corresponding to the digital voice frames with the same bit sequence in each encryption period are encrypted by adopting the same encryption key and the same encryption algorithm;

a time domain conversion unit, adapted to perform time domain conversion on the encrypted frequency domain voice data to obtain an encrypted digital voice data frame;

and the compression sending unit is suitable for compressing and sending the encrypted digital voice data frames.

the encryption processing unit is suitable for adding frame synchronization information in the invalid voice frequency band of the frequency domain voice data corresponding to each digital voice frame in the encryption period, and encrypting the valid voice frequency band of the frequency domain voice data corresponding to each digital voice frame by adopting corresponding encryption information.

Optionally, the apparatus further comprises:

the renegotiation request unit is suitable for sending a request for changing the encrypted information to the receiving terminal equipment through in-band signaling when a preset renegotiation condition is met;

the switching time negotiation unit is suitable for sending the information of the switching time point for starting the changed encryption information to the receiving end equipment through the in-band signaling when receiving the information which agrees to change the encryption information and is replied by the receiving end equipment through the in-band signaling;

and the receiving confirmation unit is suitable for confirming that the negotiation with the receiving terminal equipment for changing the encrypted information is successful when receiving the confirmation information of the switching time point information received by the receiving terminal equipment.

the receiving and decompressing unit is suitable for decompressing the received digital voice data frame which is subjected to the compression and encryption processing to obtain the digital voice data frame which is subjected to the encryption processing;

a second frequency domain converting unit adapted to convert the digital voice data frame which has been subjected to the encryption processing into corresponding frequency domain voice data;

the decryption unit is suitable for decrypting the frequency domain voice data by adopting corresponding decryption information, and the decryption information comprises a decryption key and a decryption algorithm;

and the second time domain conversion output unit is suitable for converting the decrypted frequency domain voice data into a corresponding digital voice frame and outputting the digital voice frame.

the decryption unit is suitable for analyzing corresponding frame synchronization information from the invalid voice frequency band of the frequency domain voice data, and decrypting the valid voice frequency band in the frequency domain voice data by adopting decryption information corresponding to the analyzed frame synchronization information to obtain correspondingly decrypted frequency domain voice data.

Optionally, the apparatus further comprises:

a request receiving reply unit adapted to reply information agreeing to change the encryption information, including an encryption key and an encryption algorithm, by in-band signaling when a renegotiation request to change the encryption information is received by the in-band signaling;

a switching time receiving unit adapted to receive information of a switching time point for enabling the changed encryption information through in-band signaling;

and the renegotiation confirmation unit is suitable for successfully negotiating and changing the encrypted information with the receiving terminal equipment when receiving the replied confirmation information of receiving the switching time point information.

Optionally, the apparatus further comprises: an enhancement processing unit, adapted to perform voice enhancement processing on the digital voice frame converted from the decrypted frequency domain voice data before the digital voice frame is output.

Compared with the prior art, the technical scheme of the invention has the following advantages:

according to the scheme, the frequency domain voice data corresponding to the adjacent digital voice frames in each encryption period are encrypted by using different encryption keys and encryption algorithms, and the frequency domain voice data corresponding to the digital voice frames with the same bit sequence in different encryption periods are encrypted by using the same encryption key and encryption algorithm, so that the variation diversity of the used keys and encryption and decryption algorithms can be increased, and the safety of voice communication can be improved.

Further, because frame synchronization information is added in the invalid voice frequency band of the frequency domain voice data corresponding to each digital voice frame, frame synchronization of voice frame transmission between the sending and receiving mobile terminals can be achieved. This prevents frame loss from misaligning the decryption key and decryption algorithm in use, and thus avoids abnormal encrypted communication caused by decryption failure, so the quality of voice communication can be effectively improved.

Further, when the renegotiation condition between the sending and receiving mobile terminals is met, the encryption key and encryption algorithm, and the decryption key and decryption algorithm, are replaced. When the preset switching time arrives, the replaced keys and algorithms are used to encrypt and decrypt the voice data, so the keys and the encryption and decryption algorithms in use vary more, and the security of voice transmission can be further improved.

Further, because the decrypted digital voice frame undergoes corresponding enhancement processing, the voice quality of the output voice data frame can be effectively improved, which improves the user experience.

Drawings

Fig. 1 is a schematic diagram of a frame structure of a voice signal transmission system in an embodiment of the present invention;

fig. 2 is a flowchart of a voice signal processing method of a transmitting-end device in an embodiment of the present invention;

fig. 3 is a flowchart of a voice signal processing method of a receiving end device in an embodiment of the present invention;

fig. 4 is a flowchart of a renegotiation process between a sending end device and a receiving end device in an embodiment of the present invention;

fig. 5 is a schematic structural diagram of a speech signal processing apparatus of a transmitting-end device in an embodiment of the present invention;

fig. 6 is a schematic structural diagram of a speech signal processing apparatus of a receiving end device in an embodiment of the present invention.

Detailed Description

When voice communication is carried out, the sending end equipment firstly carries out corresponding processing on voice input by a user, and then sends a processed voice signal to the receiving end equipment through a network. When receiving the voice signal sent by the sending end device, the receiving end device performs corresponding processing and outputs the processed voice signal to the user, so that voice communication between the sending end device and the receiving end device can be realized.

A common approach in the prior art is to compress and encode the voice source input at the sending end device first, and then decode it after it reaches the receiving end device, so as to improve the transmission efficiency of the voice signal and the utilization efficiency of the transmission medium.

Meanwhile, in order to improve the security of communication, a voice encryption technology is introduced to encrypt voice transmission between a sending end device and a receiving end device. This raises another problem: whether to encrypt before or after compression encoding.

If the voice is encrypted after compression coding, when the mobile terminal performs cross-system or cross-core network switching, a network side network element may be triggered to decompress the compressed voice and then recompress the voice under certain conditions. Since the network side network element does not know that the voice data is encrypted, the re-compression operation of the voice data may cause the situation that the voice signal is damaged and the encrypted voice is interrupted.

If the voice data is encrypted before compression coding, the encrypted voice data must then undergo compression coding and decoding, which may damage it. The voice signal output after decryption by the receiving end device may then be severely distorted, so that normal communication cannot be performed.

In order to solve the above problems, a key-based encryption operation can be performed on the frequency domain signal of the time domain voice source, which resists the influence of compression coding and decoding on the encrypted voice well. However, in existing digital voice frequency domain encryption algorithms, the key and the encryption algorithm change little during a conversation, so the security of voice communication is poor, which seriously limits the application of the frequency domain voice encryption method.

In addition, in order to meet the requirement of real-time transmission of voice data packets, the voice data is transmitted in the network using Transparent Mode (TM), in which the sending end device and the receiving end device do not check the transmission result. That is, the transmission protocol layer of voice communication has no means for synchronizing the data packets of the sending end device and the receiving end device.

Therefore, the voice transmission method in the prior art has the problems of poor security and poor voice call quality.

In order to solve the above problems in the prior art, in the technical scheme adopted in the embodiment of the present invention, the frequency domain voice data corresponding to the adjacent digital voice frames in each encryption period are encrypted by using different encryption keys and encryption algorithms, and the frequency domain voice data corresponding to the digital voice frames with the same bit sequence in different encryption periods are encrypted by using the same encryption key and encryption algorithm, so that the diversity of the changes of the used keys and encryption and decryption algorithms can be increased, and the security of voice communication can be improved.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.

In order to explain the voice signal processing method and apparatus in the embodiment of the present invention more clearly, a voice communication system in the embodiment of the present invention is first described.

Referring to Fig. 1, a voice communication system according to an embodiment of the present invention includes a sending-end device 101, an operator network 102, and a receiving-end device 103. The sending-end device 101 and the receiving-end device 103 realize voice communication through the operator network 102.

Fig. 2 shows a flowchart of a voice signal processing method of a sending-end device in an embodiment of the present invention, where the voice signal processing method shown in Fig. 2 may include:

Step S201: converting the collected time domain voice signals into digital voice signals and performing framing processing to obtain corresponding digital voice frames.

In a specific implementation, when the sending-end device 101 and the receiving-end device 103 perform voice communication, the sending-end device 101 first receives a time-domain analog voice signal input by a user, and performs analog-to-digital conversion processing to obtain a corresponding digital voice signal. Then, the digital voice signal obtained by conversion is subjected to framing processing to obtain a plurality of corresponding digital voice frames.
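The framing step can be illustrated with a minimal sketch. The frame length, the use of non-overlapping frames, and zero-padding of the last frame are assumptions; the text does not specify how frames are formed.

```python
def split_into_frames(samples, frame_len):
    """Split digital voice samples into consecutive, non-overlapping frames.

    The final frame is zero-padded so every frame has exactly frame_len samples.
    """
    frames = []
    for start in range(0, len(samples), frame_len):
        frame = list(samples[start:start + frame_len])
        frame.extend([0] * (frame_len - len(frame)))
        frames.append(frame)
    return frames


# Ten samples with a frame length of four yield three frames.
frames = split_into_frames(list(range(10)), frame_len=4)
```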

Step S202: performing frequency domain conversion on the obtained digital voice frames to obtain corresponding frequency domain voice data.

In a specific implementation, in order to prevent the subsequent encoding and decoding processes from damaging the voice data, after the corresponding digital voice frame is obtained, the sending-end device 101 may perform frequency domain conversion to obtain frequency domain voice data corresponding to each digital voice frame. Wherein the frequency domain voice data includes frequency information of a corresponding digital voice frame.
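As an illustration of the frequency domain conversion, the sketch below uses a plain discrete Fourier transform. The text does not name the transform, so the DFT is an assumption, as are the helper names; a production implementation would use an FFT.

```python
import cmath


def dft(frame):
    """Naive O(n^2) discrete Fourier transform of one digital voice frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bin_frequency_hz(k, n, sample_rate_hz):
    """Centre frequency of bin k for an n-point transform."""
    return k * sample_rate_hz / n


# A four-sample frame; its spectrum is approximately [10, -2+2j, -2, -2-2j].
spectrum = dft([1.0, 2.0, 3.0, 4.0])
```

Each bin of the result carries the frequency information of the frame at `bin_frequency_hz(k, n, sample_rate_hz)`.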

Step S203: encrypting the frequency domain voice data corresponding to the preset number of digital voice frames by using corresponding encryption information respectively in a preset encryption period.

In the embodiment of the present invention, within one encryption period, the sending-end device 101 encrypts the frequency domain voice data corresponding to each digital voice frame with different encryption information. That is, the frequency domain voice data corresponding to each digital voice frame in the encryption period corresponds one-to-one with the encryption information.

For example, when the first encryption period includes frequency domain voice data corresponding to 3 digital voice frames, the corresponding encryption information also includes 3 different sets of encryption keys and encryption algorithms. When encryption is performed, the first set of encryption key and encryption algorithm is used to encrypt the frequency domain voice data corresponding to the first digital voice frame, the second set is used to encrypt the frequency domain voice data corresponding to the second digital voice frame, and the third set is used to encrypt the frequency domain voice data corresponding to the third digital voice frame.

Then, the frequency domain voice data corresponding to the 3 digital voice frames in the next encryption period are encrypted by using the same encryption information. That is, in the next encryption period, the frequency domain voice data corresponding to the first digital voice frame is encrypted by using the first set of encryption key and encryption algorithm, the frequency domain voice data corresponding to the second digital voice frame is encrypted by using the second set, and the frequency domain voice data corresponding to the third digital voice frame is encrypted by using the third set.

Therefore, the frequency domain voice data corresponding to the adjacent digital voice frames in each encryption period are encrypted by adopting different encryption keys and encryption algorithms, so that the diversity of used encryption information can be increased, and the safety of voice transmission can be improved.
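The cyclic assignment in the three-frame example can be sketched as a lookup by frame position. The `EncryptionInfo` type, the key values, and the algorithm labels are hypothetical placeholders; the text does not say which keys or algorithms are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncryptionInfo:
    key: bytes
    algorithm: str  # placeholder label, not a real algorithm name


def info_for_frame(frame_index, schedule):
    """Pick the encryption information by the frame's position in its period."""
    return schedule[frame_index % len(schedule)]


# One encryption period of three frames, as in the example above.
schedule = [
    EncryptionInfo(b"key-1", "alg-A"),
    EncryptionInfo(b"key-2", "alg-B"),
    EncryptionInfo(b"key-3", "alg-C"),
]

# Frames 0..5 span two periods: adjacent frames differ, same positions match.
assigned = [info_for_frame(i, schedule).algorithm for i in range(6)]
```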

Meanwhile, because one piece of encryption information, comprising a group of encryption keys and encryption algorithms, is used cyclically to encrypt the frequency domain voice data corresponding to the digital voice frames in different encryption periods, the length of the encryption information can be kept within a controllable range so as to meet the actual storage requirement of the mobile terminal.

In a specific implementation, the length of the encryption period, and the corresponding encryption information including the encryption key and the encryption algorithm, and the number of the frequency domain voice data corresponding to the corresponding digital voice frame to be encrypted may be set according to actual needs, so as to further increase the possibility of the change of the used encryption information, and thus may further improve the security of the voice transmission.

To further improve the quality of voice communication, in an embodiment of the present invention, the synchronization of voice transmission can be achieved by adding corresponding frame synchronization information to the digital voice frames.

In a specific implementation, the frequency domain voice data corresponding to each digital voice frame is obtained through frequency domain conversion, and the frequency domain voice data comprises corresponding frequency information. The maximum frequency of the effective voice signal is 3400 Hz; the frequency band above 3400 Hz carries no voice information and does not affect the effective voice frequency band. Therefore, in the embodiment of the present invention, the frequency domain voice data corresponding to each digital voice frame is divided into an effective voice band carrying voice information and an invalid voice band not carrying voice information. The frame number information of the corresponding digital voice frame is added as synchronization information to the invalid voice band of each frequency domain voice data, while the effective voice band is encrypted by using the corresponding encryption key and encryption algorithm.

Therefore, the setting of the frame synchronization information can realize the synchronization of voice transmission, and can avoid the problems that a decryption key and a decryption algorithm used during analysis due to frame loss are unmatched with an encryption key and an encryption algorithm used during encryption, thereby avoiding the occurrence of abnormal communication conditions, effectively improving the quality of voice communication and improving the use experience of users.
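The band split and the placement of the frame number can be sketched as follows. Several things here are assumptions rather than details from the text: the 8000 Hz sampling rate, the 160-point frame, writing the frame number as one bit per invalid-band bin, and a key-seeded bin permutation as a toy stand-in for the unspecified encryption algorithm. The sketch works on the one-sided spectrum (bins 0 to n/2) and does not model what survives a real voice codec.

```python
import random

SAMPLE_RATE_HZ = 8000  # assumed narrowband sampling rate
VOICE_LIMIT_HZ = 3400  # upper edge of the effective voice band (from the text)


def split_bins(n_fft):
    """Return (valid, invalid) bin indices of the one-sided spectrum."""
    valid, invalid = [], []
    for k in range(n_fft // 2 + 1):
        freq_hz = k * SAMPLE_RATE_HZ / n_fft
        (valid if freq_hz <= VOICE_LIMIT_HZ else invalid).append(k)
    return valid, invalid


def scramble_valid_band(spectrum, valid, key):
    """Toy 'encryption': permute the valid-band bins with a key-seeded shuffle."""
    targets = list(valid)
    random.Random(key).shuffle(targets)
    out = list(spectrum)
    for src, dst in zip(valid, targets):
        out[dst] = spectrum[src]
    return out


def embed_frame_number(spectrum, invalid, frame_number):
    """Write the frame number into the invalid band, one bit per bin."""
    out = list(spectrum)
    for bit_pos, k in enumerate(invalid):
        out[k] = 1.0 if (frame_number >> bit_pos) & 1 else 0.0
    return out


def read_frame_number(spectrum, invalid):
    """Recover the frame number written by embed_frame_number."""
    return sum(1 << pos for pos, k in enumerate(invalid) if abs(spectrum[k]) > 0.5)


valid, invalid = split_bins(160)              # 50 Hz per bin at 8000 Hz
spectrum = [float(k) for k in range(81)]      # stand-in magnitudes for bins 0..80
protected = embed_frame_number(
    scramble_valid_band(spectrum, valid, key=1234), invalid, frame_number=5
)
```

With these assumed parameters the invalid band has 12 bins, so only frame numbers below 4096 fit; a longer call would need the number to wrap.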

Step S204: performing time domain conversion on the encrypted frequency domain voice data to obtain an encrypted digital voice data frame.

In a specific implementation, when the encryption of the frequency domain voice data corresponding to all the digital voice frames is completed, the sending end device 101 performs time domain conversion on the encrypted frequency domain data to obtain the encrypted digital voice frames. Therefore, the damage to the voice information caused by subsequent voice compression and decoding can be effectively avoided, and the communication quality is improved.
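A minimal sketch of the time domain conversion is shown below, assuming an inverse discrete Fourier transform applied to a one-sided spectrum (the text does not name the transform). The point it illustrates is that the upper half of the spectrum must be rebuilt with conjugate symmetry so that the output frame is real-valued.

```python
import cmath


def real_idft(half_spectrum, n):
    """Inverse DFT from bins 0..n//2 of a real signal (n must be even).

    Bins above n//2 are rebuilt as complex conjugates of their mirror bins,
    which is what makes the time-domain result real.
    """
    full = list(half_spectrum) + [
        half_spectrum[n - k].conjugate() for k in range(n // 2 + 1, n)
    ]
    return [
        sum(full[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


# Bins 0..2 of the 4-point DFT of [1, 2, 3, 4] are 10, -2+2j and -2.
frame = real_idft([10 + 0j, -2 + 2j, -2 + 0j], n=4)
```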

Step S205: compressing and transmitting the encrypted digital voice data frame.

In a specific implementation, after obtaining the encrypted digital speech frame, the sending-end device 101 may use a corresponding compression technique to compress and send the encrypted digital speech frame.

The above describes in detail the voice signal processing procedure of the sending-end device 101. For the corresponding voice signal processing method of the receiving-end device 103, refer to Fig. 3.

Fig. 3 shows a flowchart of a voice signal processing method of a receiving-end device according to an embodiment of the present invention, where the voice signal processing method shown in Fig. 3 may include:

Step S301: decompressing the received digital voice data frame which has been subjected to compression and encryption processing to obtain the digital voice data frame which has been subjected to encryption processing.

In a specific implementation, when the receiving end device 103 receives the compressed and encrypted digital voice data frame sent by the sending end device 101, it may first perform decompression by using a corresponding decompression technique to obtain the encrypted digital voice data frame.

Step S302: converting the encrypted digital voice data frame into corresponding frequency domain voice data, and decrypting the frequency domain voice data by using corresponding decryption information.

In a specific implementation, the receiving end device 103 performs frequency domain conversion on the encrypted digital voice data frame obtained by decompression to obtain corresponding encrypted frequency domain voice data. Then, the receiving end device 103 decrypts the obtained frequency domain voice data by using corresponding decryption information including a decryption key and a decryption algorithm.

In the embodiment of the present invention, the sending end device 101 divides the frequency domain voice data corresponding to each digital voice frame into an effective voice band carrying voice information and an invalid voice band not carrying voice information, adds the frame number information of the corresponding digital voice frame as synchronization information to the invalid voice band of each frequency domain voice data, and encrypts the effective voice band by using the corresponding encryption key and encryption algorithm.

Accordingly, when the obtained frequency domain voice data is encrypted frequency domain voice data, the receiving end device 103 may analyze the corresponding synchronization information, i.e., the frame number, from the invalid voice frequency band corresponding to the frequency domain voice data. And then, decrypting the effective voice frequency band of the frequency domain voice data by using decryption information which corresponds to the analyzed frame number and comprises a decryption key and a decryption algorithm to obtain the decrypted frequency domain voice data.
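The benefit of selecting decryption information by the parsed frame number, rather than by a local receive counter, can be sketched as follows. The key values and the key-seeded permutation used as a stand-in cipher are assumptions; the text does not specify the actual algorithms.

```python
import random


def permutation(key, n):
    """Key-seeded permutation of range(n); the same key always gives the same result."""
    perm = list(range(n))
    random.Random(key).shuffle(perm)
    return perm


def scramble(band, key):
    """Toy stand-in for encrypting the effective voice band."""
    perm = permutation(key, len(band))
    out = [None] * len(band)
    for src, dst in enumerate(perm):
        out[dst] = band[src]
    return out


def unscramble(band, key):
    """Inverse of scramble for the same key."""
    return [band[dst] for dst in permutation(key, len(band))]


keys = [11, 22, 33]  # hypothetical: one key per frame position in the period

original = [[10 * n + i for i in range(4)] for n in range(5)]
sent = [(n, scramble(band, keys[n % len(keys)])) for n, band in enumerate(original)]

# Frame 2 is lost in transit; each surviving frame still carries its own number.
received = [(n, band) for n, band in sent if n != 2]

# Decryption information is looked up from the parsed frame number.
decoded = [unscramble(band, keys[n % len(keys)]) for n, band in received]
```

Because each frame carries its own number, frames 3 and 4 are still decrypted with the keys for positions 0 and 1, whereas a plain receive counter would have advanced one position too few after the loss.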

Step S303: converting the decrypted frequency domain voice data into a corresponding digital voice frame.

In a specific implementation, when obtaining the decrypted frequency domain speech data, the receiving end device 103 may use a corresponding time domain conversion means to obtain a corresponding digital speech frame.

In a specific implementation, to further improve the quality of voice communication, the voice signal processing method in the embodiment of the present invention may further include:

step S304: and carrying out voice enhancement processing on the digital voice frame obtained after conversion.

In a specific implementation, operations such as encoding and decoding of voice information during transmission may weaken to a certain extent to the voice signal, and the receiving end device 103 may improve the quality of the output voice signal by performing voice enhancement processing on the digital voice frame by using a corresponding voice enhancement technique.

Step S305: outputting the digital voice frame.

In a specific implementation, the receiving end device 103 may output the digital speech frame obtained after decryption and enhancement processing to the user after digital-to-analog conversion, so that the user of the receiving end device 103 may receive the corresponding speech information.

Fig. 4 is a flowchart illustrating a renegotiation process between a sending end device and a receiving end device in an embodiment of the present invention. The renegotiation process illustrated in fig. 4 may include:

Step S401: when the preset renegotiation condition is met, sending a request for changing the encryption information to the receiving end device through in-band signaling.

In a specific implementation, the preset renegotiation condition includes: a preset time being reached, or a renegotiation request from the user to change the encryption information being received. The encryption information includes an encryption key and an encryption algorithm.

In a specific implementation, when the preset time is reached or a renegotiation request from the user to change the encryption information is received, the sending end device 101 sends the renegotiation request to the corresponding receiving end device 103 through in-band signaling, and waits for a response from the receiving end device 103.

Step S402: when a renegotiation request for changing encryption information is received through in-band signaling, information agreeing to change encryption information is replied through in-band signaling.

In a specific implementation, when receiving a renegotiation request sent by the sending end device 101 through in-band signaling, the receiving end device 103 replies corresponding information agreeing to change the encryption information to the sending end device 101 through in-band signaling.

Step S403: when receiving the information agreeing to change the encryption information that the receiving end device replies through in-band signaling, sending the information of the switching time point to the receiving end device through in-band signaling.

In a specific implementation, when receiving the information agreeing to change the encryption information that the receiving end device 103 replies through in-band signaling, the sending end device 101 sends the information of the switching time point to the receiving end device 103 through in-band signaling, and waits for the receiving end device 103 to acknowledge receipt of the switching time point information.

In a specific implementation, the switching time point is a time point for encrypting by using the changed encryption information, and may be set according to actual needs.

Step S404: information of a switching time point enabling the changed encryption information is received through in-band signaling.

In a specific implementation, when the sending end device 101 receives the information of agreeing to change the encryption information sent by the receiving end device 103, the switching time point is sent to the receiving end device 103 through in-band signaling. After receiving the information of the switching time point, the receiving end device 103 may reply an acknowledgement to the sending end device 101.

Step S405: when receiving, through in-band signaling, the information of the switching time point for enabling the changed encryption information, replying with confirmation information that the switching time point information has been received.

Step S406: when receiving, through in-band signaling, the replied confirmation information that the switching time point information has been received, ending the renegotiation process between the sending end device and the receiving end device.

In a specific implementation, when the receiving end device 103 replies to the sending end device 101 through in-band signaling with the confirmation information that the switching time point information has been received, the renegotiation process between the sending end device 101 and the receiving end device 103 ends.

When the switching time point determined in the renegotiation process arrives, the sending end device 101 may encrypt subsequent digital speech frames using encryption information different from that currently in use, and accordingly the receiving end device 103 may decrypt the subsequent digital speech frames using the decryption information corresponding to the changed encryption information. This further increases the difficulty of cracking the speech data and improves the security of speech information transmission.
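The message exchange of steps S401 to S406 can be pictured as a small two-party handshake. The sketch below is illustrative only and rests on several assumptions that the embodiment does not state: the in-band channel is replaced by a direct method call, messages are plain dictionaries with hypothetical type names, the request carries an identifier of the new encryption information, and the switching time point is expressed as a frame number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Endpoint:
    current_info: str = "info-A"         # hypothetical identifier of the information in use
    new_info: Optional[str] = None
    switch_frame: Optional[int] = None   # assumed form of the switching time point

    def info_for(self, frame_number):
        """Encryption or decryption information in force for a given frame."""
        if self.switch_frame is not None and frame_number >= self.switch_frame:
            return self.new_info
        return self.current_info


class Receiver(Endpoint):
    def handle(self, message):
        if message["type"] == "CHANGE_REQUEST":      # step S402
            self.new_info = message["new_info"]
            return {"type": "AGREE"}
        if message["type"] == "SWITCH_POINT":        # steps S404 and S405
            self.switch_frame = message["frame"]
            return {"type": "SWITCH_POINT_ACK"}
        raise ValueError("unexpected message type")


class Sender(Endpoint):
    def renegotiate(self, receiver, new_info, switch_frame):
        reply = receiver.handle({"type": "CHANGE_REQUEST", "new_info": new_info})  # S401
        if reply["type"] != "AGREE":
            return False
        reply = receiver.handle({"type": "SWITCH_POINT", "frame": switch_frame})   # S403
        if reply["type"] != "SWITCH_POINT_ACK":
            return False
        self.new_info, self.switch_frame = new_info, switch_frame                  # S406
        return True
```

Because both ends record the same switching point before either of them changes keys, frames sent before that point are still decrypted with the old information and frames from that point on with the new information.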

Fig. 5 shows a schematic structural diagram of a speech signal processing apparatus in the embodiment of the present invention. The speech signal processing apparatus 500 shown in fig. 5 may include a framing processing unit 501, a frequency domain converting unit 502, an encryption processing unit 503, a time domain converting unit 504, and a compression transmitting unit 505, wherein:

the framing processing unit 501 is adapted to convert the acquired time domain speech signal into a digital speech signal and perform framing processing to obtain a corresponding digital speech frame.

The frequency domain converting unit 502 is adapted to perform frequency domain conversion on the obtained digital speech frames to obtain corresponding frequency domain speech data, where the frequency domain speech data includes frequency information of the corresponding digital speech frames.

The encryption processing unit 503 is adapted to encrypt the frequency domain voice data corresponding to a preset number of digital voice frames in a preset encryption period respectively by using corresponding encryption information, where the encryption information includes an encryption key and an encryption algorithm, and the frequency domain voice data corresponding to the digital voice frames at the same position in each encryption period are encrypted by using the same encryption key and the same encryption algorithm.

In a specific implementation, the frequency domain voice data corresponding to each digital voice frame includes an invalid voice frequency band and an effective voice frequency band; the invalid voice frequency band does not include voice information, and the effective voice frequency band includes voice information. The encryption processing unit 503 is adapted to add frame synchronization information to the invalid voice frequency band of the frequency domain voice data corresponding to each digital voice frame in the encryption period, and to encrypt the effective voice frequency band of the frequency domain voice data corresponding to each digital voice frame by using the corresponding encryption information.

The time domain converting unit 504 is adapted to perform time domain conversion on the encrypted frequency domain speech data to obtain an encrypted digital speech data frame.

The compression transmitting unit 505 is adapted to compress and transmit the encrypted digital voice data frame.
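The way the encryption processing unit 503 might select encryption information by position within the encryption period can be sketched as follows. The schedule contents, the key values, and the algorithm names `permute` and `invert` are hypothetical placeholders; the embodiment only requires that frames at the same position in every period use the same encryption key and the same encryption algorithm.

```python
from typing import NamedTuple


class EncryptionInfo(NamedTuple):
    key: int
    algorithm: str


# Assumed schedule: one entry per frame position within an encryption period.
SCHEDULE = [
    EncryptionInfo(key=0x1A, algorithm="permute"),
    EncryptionInfo(key=0x2B, algorithm="invert"),
    EncryptionInfo(key=0x3C, algorithm="permute"),
]


def info_for_frame(frame_number, schedule=SCHEDULE):
    """Return the encryption information for a frame.

    Frames whose numbers are congruent modulo the period length occupy the
    same position in their periods and therefore share one entry.
    """
    return schedule[frame_number % len(schedule)]
```

With a period of three frames, frames 0, 3 and 6 share the first entry, frames 1, 4 and 7 share the second, and so on.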

In a specific implementation, the speech signal processing apparatus 500 in the embodiment of the present invention may further include a renegotiation request unit 506, a switching time negotiation unit 507, and a reception confirmation unit 508, where:

a renegotiation request unit 506, adapted to send a request for changing the encryption information to the receiving end device through in-band signaling when a preset renegotiation condition is satisfied. In a specific implementation, the preset renegotiation condition includes: a preset time being reached, or a renegotiation request from the user to change the encryption information being received.

The switching time negotiation unit 507 is adapted to send the information of the switching time point for enabling the changed encryption information to the receiving end device through the in-band signaling when receiving the information of agreeing to change the encryption information replied by the receiving end device through the in-band signaling.

A receiving confirmation unit 508, adapted to determine that the negotiation with the receiving end device for changing the encryption information is successful when receiving the confirmation information that the receiving end device receives the switching time point information.

Fig. 6 shows a schematic structural diagram of a speech signal processing apparatus in the embodiment of the present invention. The speech signal processing apparatus 600 shown in fig. 6 may include a receiving decompression unit 601, a second frequency domain converting unit 602, a decryption unit 603, and a second time domain converting output unit 604, where:

the receiving and decompressing unit 601 is adapted to decompress the received digital voice data frame that has been subjected to the compression and encryption processing, so as to obtain the digital voice data frame that has been subjected to the encryption processing.

A second frequency domain converting unit 602, adapted to convert the digital voice data frames that have been subjected to the encryption processing into corresponding frequency domain voice data.

A decryption unit 603, adapted to decrypt the frequency domain speech data with corresponding decryption information, the decryption information comprising a decryption key and a decryption algorithm.

In a specific implementation, the frequency domain voice data includes an invalid voice frequency band and an effective voice frequency band; the invalid voice frequency band does not include voice information, and the effective voice frequency band includes voice information.

The decryption unit 603 is adapted to parse the corresponding frame synchronization information from the invalid voice frequency band of the frequency domain voice data, and decrypt the valid voice frequency band of the frequency domain voice data by using the decryption information corresponding to the parsed frame synchronization information, so as to obtain the corresponding decrypted frequency domain voice data.

A second time domain conversion output unit 604, adapted to convert the decrypted frequency domain voice data into a corresponding digital voice frame and output the digital voice frame.

In a specific implementation, the speech signal processing apparatus 600 in the embodiment of the present invention may further include an enhancement processing unit 605, where:

the enhancement processing unit 605 is adapted to perform speech enhancement processing on the digital speech frame obtained by converting the decrypted frequency domain speech data, before the digital speech frame is output.

In a specific implementation, the voice signal processing apparatus 600 may further include a request receiving reply unit 606, a switching time receiving unit 607, and a renegotiation confirmation unit 608, where:

a request reception replying unit 606, adapted to reply, through in-band signaling, with information agreeing to change the encryption information when a renegotiation request to change the encryption information is received through in-band signaling, where the encryption information includes the encryption key and the encryption algorithm.

A switching time receiving unit 607 adapted to receive information of a switching time point enabling the changed encryption information through in-band signaling.

A renegotiation confirmation unit 608, adapted to determine that the negotiation with the receiving end device to change the encryption information is successful when receiving the replied confirmation information that the switching time point information has been received.

Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include: ROM, RAM, magnetic or optical disks, and the like.

The method and system of the embodiments of the present invention have been described in detail, but the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A speech signal processing method, comprising:

and compressing and transmitting the encrypted digital voice data frame.

2. The speech signal processing method according to claim 1, wherein the frequency-domain speech data corresponding to each data speech frame includes an inactive speech band and an active speech band, the inactive speech band does not include speech information, and the active speech band includes speech information;

3. The speech signal processing method according to claim 1 or 2, further comprising:

4. The speech signal processing method according to claim 3, wherein the preset renegotiation condition comprises: a preset time being reached, or a renegotiation request from the user to change the encryption information being received.

5. A speech signal processing method, comprising:

decompressing the received digital voice data frame which is subjected to compression and encryption processing to obtain a digital voice data frame which is subjected to encryption processing, wherein the digital voice data frame which is subjected to compression and encryption processing is the digital voice data frame which is obtained by processing according to the method of claim 1;

6. The speech signal processing method according to claim 5, wherein the frequency-domain speech data includes an inactive speech band and an active speech band, the inactive speech band does not include speech information, and the active speech band includes speech information;

7. The speech signal processing method according to claim 5 or 6, further comprising:

and when receiving the replied confirmation information of receiving the switching time point information, successfully negotiating with the receiving end equipment to change the encrypted information.

8. The speech signal processing method according to claim 5, further comprising, before outputting the digital voice frame obtained by converting the decrypted frequency domain voice data: performing voice enhancement processing on the digital voice frame.

9. A speech signal processing apparatus, comprising:

10. The speech signal processing apparatus of claim 9, wherein the frequency-domain speech data corresponding to each data speech frame comprises an inactive speech band and an active speech band, the inactive speech band does not include speech information, and the active speech band includes speech information;

11. The speech signal processing apparatus according to claim 9 or 10, further comprising:

12. The speech signal processing apparatus of claim 11, wherein the preset renegotiation condition comprises: a preset time being reached, or a renegotiation request from the user to change the encryption information being received.

13. A speech signal processing apparatus, comprising:

a receiving and decompressing unit, adapted to decompress a received digital voice data frame that has been subjected to compression and encryption processing to obtain an encrypted digital voice data frame, where the compressed and encrypted digital voice data frame is a digital voice data frame obtained by processing with the apparatus disclosed in claim 9;

14. The speech signal processing apparatus according to claim 13, wherein the frequency-domain speech data includes an inactive speech band and an active speech band, the inactive speech band does not include speech information, and the active speech band includes speech information;

15. The speech signal processing apparatus according to claim 13 or 14, further comprising:

16. The speech signal processing apparatus of claim 13, further comprising: an enhancement processing unit, adapted to perform voice enhancement processing on the digital voice frame obtained by converting the decrypted frequency domain voice data, before the digital voice frame is output.