CN106157961A - Audio signal processing method and device

Info

Publication number: CN106157961A
Application number: CN201510167312.6A
Authority: CN
Inventors: 姜山; 刘智勇; 郎冬玲
Original assignee: Spreadtrum Communications Shanghai Co Ltd
Current assignee: Spreadtrum Communications Shanghai Co Ltd
Priority date: 2015-04-09
Filing date: 2015-04-09
Publication date: 2016-11-23
Anticipated expiration: 2035-04-09
Also published as: CN106157961B

Abstract

An audio signal processing method and device. The method includes: converting a collected time-domain voice signal into a digital voice signal and framing it to obtain corresponding digital voice frames; performing frequency-domain conversion on the obtained digital voice frames to obtain corresponding frequency-domain voice data, the frequency-domain voice data including the frequency information of the corresponding digital voice frame; within a preset encryption period, encrypting the frequency-domain voice data corresponding to a preset number of digital voice frames, each with its corresponding encryption information, wherein the encryption information includes an encryption key and an encryption algorithm, and the frequency-domain voice data corresponding to digital voice frames at the same sequence position in each encryption period is encrypted with the same encryption key and encryption algorithm; performing time-domain conversion on the encrypted frequency-domain voice data to obtain encrypted digital voice data frames; and compressing and sending the encrypted digital voice data frames. The scheme can improve the security of voice communication.

Description

Audio signal processing method and device

Technical field

The present invention relates to the field of voice communication technology, and in particular to an audio signal processing method and device.

Background art

When a mobile terminal carries out voice communication in an operator network, the voice signal is short-term stationary and partly quasi-periodic, and is therefore redundant. The common practice in the prior art is to compress and encode the voice source at the sending-end device and decode it after it reaches the receiving end, so as to improve the transmission efficiency of the voice signal and the utilization of the transmission medium.

To improve the security of voice communication, voice encryption is generally used to encrypt the voice source sent by the sending-end device. However, prior-art voice encryption is limited by the keys and the encryption and decryption algorithms used, so the security of voice communication is poor.

Summary of the invention

The problem solved by embodiments of the present invention is how to improve the security of voice data transmission between mobile terminals.

To solve the above problem, an embodiment of the present invention provides an audio signal processing method, the method including:

Converting a collected time-domain voice signal into a digital voice signal and framing it to obtain corresponding digital voice frames;

Performing frequency-domain conversion on the obtained digital voice frames to obtain corresponding frequency-domain voice data, the frequency-domain voice data including the frequency information of the corresponding digital voice frame;

Within a preset encryption period, encrypting the frequency-domain voice data corresponding to a preset number of digital voice frames, each with its corresponding encryption information, wherein the encryption information includes an encryption key and an encryption algorithm, and the frequency-domain voice data corresponding to digital voice frames at the same sequence position in each encryption period is encrypted with the same encryption key and encryption algorithm;

Performing time-domain conversion on the encrypted frequency-domain voice data to obtain encrypted digital voice data frames;

Compressing and sending the encrypted digital voice data frames.

Optionally, the frequency-domain voice data corresponding to each digital voice frame includes an invalid voice frequency band and an effective voice frequency band; the invalid voice frequency band does not include voice information, and the effective voice frequency band includes voice information;

Encrypting, within the preset encryption period, the frequency-domain voice data corresponding to the preset number of digital voice frames, each with its corresponding encryption information, includes:

Within the encryption period, adding frame synchronization information to the invalid voice frequency band of the frequency-domain voice data corresponding to each digital voice frame, and encrypting the effective voice frequency band of the frequency-domain voice data corresponding to each digital voice frame with the corresponding encryption information.

Optionally, the method further includes:

When a preset renegotiation condition is met, sending a request to change the encryption information to the receiving-end device by in-band signaling;

When information agreeing to change the encryption information, replied by the receiving-end device through in-band signaling, is received, sending information on the switching time point at which the changed encryption information is to be enabled to the receiving-end device by in-band signaling;

When confirmation that the receiving-end device has received the switching time point information is received, determining that the negotiation with the receiving-end device to change the encryption information has succeeded.

Optionally, the preset renegotiation condition includes: a preset time being reached, or a renegotiation request from the user to change the encryption information being received.

An embodiment of the present invention further provides an audio signal processing method, the method including:

Decompressing the received compressed and encrypted digital voice data frames to obtain encrypted digital voice data frames;

Converting the encrypted digital voice data frames into corresponding frequency-domain voice data, and decrypting the frequency-domain voice data with corresponding decryption information, the decryption information including a decryption key and a decryption algorithm;

Converting the decrypted frequency-domain voice data into corresponding digital voice frames and outputting them.

Optionally, the frequency-domain voice data includes an invalid voice frequency band and an effective voice frequency band; the invalid voice frequency band does not include voice information, and the effective voice frequency band includes voice information;

Converting the encrypted digital voice data frames into corresponding frequency-domain voice data, and decrypting the frequency-domain voice data with corresponding decryption information, includes:

Parsing the corresponding frame synchronization information from the invalid voice frequency band of the frequency-domain voice data, and decrypting the effective voice frequency band of the frequency-domain voice data with the decryption information corresponding to the parsed frame synchronization information, to obtain the corresponding decrypted frequency-domain voice data.

Optionally, the method further includes:

When a renegotiation request to change the encryption information is received by in-band signaling, replying by in-band signaling with information agreeing to change the encryption information, the encryption information including an encryption key and an encryption algorithm;

Receiving, by in-band signaling, information on the switching time point at which the changed encryption information is to be enabled;

When confirmation that the switching time point information has been received is replied, the negotiation with the receiving-end device to change the encryption information succeeds.

Optionally, before the digital voice frames converted from the decrypted frequency-domain voice data are output, the method further includes: performing speech enhancement on those digital voice frames.

An embodiment of the present invention further provides an audio signal processing device, the device including:

A framing unit, adapted to convert a collected time-domain voice signal into a digital voice signal and frame it to obtain corresponding digital voice frames;

A first frequency-domain conversion unit, adapted to perform frequency-domain conversion on the obtained digital voice frames to obtain corresponding frequency-domain voice data, the frequency-domain voice data including the frequency information of the corresponding digital voice frame;

An encryption processing unit, adapted to, within a preset encryption period, encrypt the frequency-domain voice data corresponding to a preset number of digital voice frames, each with its corresponding encryption information, wherein the encryption information includes an encryption key and an encryption algorithm, and the frequency-domain voice data corresponding to digital voice frames at the same sequence position in each encryption period is encrypted with the same encryption key and encryption algorithm;

A first time-domain conversion unit, adapted to perform time-domain conversion on the encrypted frequency-domain voice data to obtain encrypted digital voice data frames;

A compression and sending unit, adapted to compress and send the encrypted digital voice data frames.

The encryption processing unit is adapted to, within the encryption period, add frame synchronization information to the invalid voice frequency band of the frequency-domain voice data corresponding to each digital voice frame, and encrypt the effective voice frequency band of the frequency-domain voice data corresponding to each digital voice frame with the corresponding encryption information.

Optionally, the device further includes:

A renegotiation request unit, adapted to send a request to change the encryption information to the receiving-end device by in-band signaling when a preset renegotiation condition is met;

A switching time negotiation unit, adapted to, when information agreeing to change the encryption information, replied by the receiving-end device through in-band signaling, is received, send information on the switching time point at which the changed encryption information is to be enabled to the receiving-end device by in-band signaling;

A reception confirmation unit, adapted to, when confirmation that the receiving-end device has received the switching time point information is received, determine that the negotiation with the receiving-end device to change the encryption information has succeeded.

A receiving and decompression unit, adapted to decompress the received compressed and encrypted digital voice data frames to obtain encrypted digital voice data frames;

A second frequency-domain conversion unit, adapted to convert the encrypted digital voice data frames into corresponding frequency-domain voice data;

A decryption unit, adapted to decrypt the frequency-domain voice data with corresponding decryption information, the decryption information including a decryption key and a decryption algorithm;

A second time-domain conversion and output unit, adapted to convert the decrypted frequency-domain voice data into corresponding digital voice frames and output them.

The decryption unit is adapted to parse the corresponding frame synchronization information from the invalid voice frequency band of the frequency-domain voice data, and to decrypt the effective voice frequency band of the frequency-domain voice data with the decryption information corresponding to the parsed frame synchronization information, to obtain the corresponding decrypted frequency-domain voice data.

Optionally, the device further includes:

A request receiving and reply unit, adapted to, when a renegotiation request to change the encryption information is received by in-band signaling, reply by in-band signaling with information agreeing to change the encryption information, the encryption information including an encryption key and an encryption algorithm;

A switching time receiving unit, adapted to receive, by in-band signaling, information on the switching time point at which the changed encryption information is to be enabled;

A renegotiation confirmation unit, adapted to determine, when confirmation that the switching time point information has been received is replied, that the negotiation with the receiving-end device to change the encryption information has succeeded.

Optionally, the device further includes: an enhancement processing unit, adapted to perform speech enhancement on the digital voice frames converted from the decrypted frequency-domain voice data before those digital voice frames are output.

Compared with the prior art, the technical solution of the present invention has the following advantages:

In the above scheme, the frequency-domain voice data corresponding to adjacent digital voice frames within each encryption period is encrypted with different encryption keys and encryption algorithms, while the frequency-domain voice data corresponding to digital voice frames at the same sequence position in different encryption periods is encrypted with the same encryption key and encryption algorithm. This increases the variety of keys and encryption and decryption algorithms used, and thus improves the security of voice communication.

Further, because frame synchronization information is added to the invalid voice frequency band of the frequency-domain voice data corresponding to each digital voice frame, the voice data frames can be synchronized between the sending and receiving mobile terminals. This prevents frame loss from misaligning the decryption key and decryption method used, and in turn avoids abnormal encrypted voice communication caused by decryption failure, so the quality of voice communication can be effectively improved.

Further, the encryption key and encryption algorithm, and the decryption key and decryption algorithm, used between the sending and receiving mobile terminals are replaced when a preset renegotiation condition is met, and the changed encryption key and encryption algorithm, and decryption key and decryption algorithm, are used to encrypt and decrypt the voice data once the preset switching time point arrives. The keys and the encryption and decryption algorithms used therefore vary more widely, which further improves the security of voice transmission.
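The renegotiation exchange can be sketched from the sending side as a small state machine. This is only an illustration under stated assumptions: the message type names (`change_request`, `agree`, `switch_point`, `switch_point_ack`) and the idea of expressing the switching time point as a frame number are hypothetical and are not specified in the source.

```python
class SenderRenegotiation:
    """Sender-side view of the three-step change of encryption information."""

    def __init__(self):
        self.state = "idle"

    def start(self):
        """Renegotiation condition met: ask the receiver to change."""
        self.state = "awaiting_agreement"
        return {"type": "change_request"}

    def on_message(self, message, switch_frame):
        """Advance on an in-band reply; return the next message to send, if any."""
        if self.state == "awaiting_agreement" and message["type"] == "agree":
            # Receiver agreed: announce when the new information takes effect.
            self.state = "awaiting_confirmation"
            return {"type": "switch_point", "frame": switch_frame}
        if (self.state == "awaiting_confirmation"
                and message["type"] == "switch_point_ack"):
            # Receiver confirmed the switching point: negotiation succeeded.
            self.state = "done"
        return None
```

Both ends keep using the old encryption information until the agreed switching point, so a reply that never arrives leaves the call on the existing keys in this sketch.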

Further, because corresponding enhancement processing is applied to the decrypted digital voice frames, the voice quality of the output voice data frames can be effectively improved, improving the user experience.

Brief description of the drawings

Fig. 1 is a schematic diagram of the architecture of the voice signal transmission system in an embodiment of the present invention;

Fig. 2 is a flowchart of the audio signal processing method of the sending-end device in an embodiment of the present invention;

Fig. 3 is a flowchart of the audio signal processing method of the receiving-end device in an embodiment of the present invention;

Fig. 4 is a flowchart of the renegotiation process between the sending-end device and the receiving-end device in an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of the audio signal processing device of the sending-end device in an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of the audio signal processing device of the receiving-end device in an embodiment of the present invention.

Detailed description of the invention

In voice communication, the sending-end device first processes the voice input by the user and then sends the processed voice signal to the receiving-end device over the network. On receiving the voice signal sent by the sending-end device, the receiving-end device processes it correspondingly and outputs the resulting voice signal to the user, thereby realizing voice communication between the sending-end device and the receiving-end device.

A voice signal is short-term stationary and partly quasi-periodic, which makes it redundant. The common practice in the prior art is to compress and encode the voice source input at the sending-end device and decode it after it reaches the receiving-end device, so as to improve the transmission efficiency of the voice signal and the utilization of the transmission medium.

Meanwhile, to improve the security of communication, voice encryption is introduced to encrypt the voice transmitted between the sending-end device and the receiving-end device. This raises another question: whether to encrypt before or after compression coding.

If encryption is performed after compression coding, then when the mobile terminal performs an inter-system or inter-core-network handover, a network-side element may, under certain conditions, be triggered to decompress the compressed voice and then recompress it. Because the network-side element does not know that the voice data is encrypted, the recompression may destroy the encrypted voice signal and interrupt the call.

If encryption is performed before compression coding, the encrypted voice data has to pass through compression encoding and decoding, which may corrupt it, so that the voice signal output by the receiving-end device after decryption is severely distorted and normal communication becomes impossible.

To address these problems, a key-based encryption operation can be applied to the frequency-domain signal of the time-domain voice source, which better resists the effect of compression encoding and decoding on the encrypted voice. However, in existing digital voice frequency-domain encryption algorithms the keys and encryption algorithms used during a call vary little, so the security of voice communication is poor, which has seriously limited the application of frequency-domain voice encryption.

On the other hand, to meet the real-time transmission requirement of voice packets, voice data is transmitted in the network in transparent mode (TM), and neither the sending-end device nor the receiving end checks the transmission result; that is, the transport protocol layer of voice communication provides no means of synchronizing packets between the sending-end device and the receiving end. If the sending-end device encrypts different packets with different keys and different encryption operations, and packet loss causes the receiving end's packet count to fall out of step with that of the sending-end device, encryption and decryption become misaligned and the encrypted voice communication becomes abnormal.

Therefore, prior-art voice transmission methods suffer from poor security and poor voice call quality.

To solve the above problems in the prior art, the technical solution adopted by embodiments of the present invention encrypts the frequency-domain voice data corresponding to adjacent digital voice frames within each encryption period with different encryption keys and encryption algorithms, and encrypts the frequency-domain voice data corresponding to digital voice frames at the same sequence position in different encryption periods with the same encryption key and encryption algorithm. This increases the variety of keys and encryption and decryption algorithms used and improves the security of voice communication.

To make the above objects, features and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

To describe the audio signal processing method and device in the embodiments of the present invention clearly and in detail, a voice communication system in an embodiment of the present invention is introduced first.

Referring to Fig. 1, the voice communication system in the embodiment of the present invention includes a sending-end device 101, an operator network 102 and a receiving-end device 103. The sending-end device 101 and the receiving-end device 103 carry out voice communication through the operator network 102.

Fig. 2 shows a flowchart of the audio signal processing method of a sending-end device in an embodiment of the present invention. The audio signal processing method shown in Fig. 2 may include:

Step S201: the collected time-domain voice signal is converted into a digital voice signal and framed to obtain corresponding digital voice frames.

In specific implementations, when the sending-end device 101 and the receiving-end device 103 carry out voice communication, the sending-end device 101 first receives the time-domain analog voice signal input by the user and performs analog-to-digital conversion to obtain the corresponding digital voice signal. The converted digital voice signal is then framed to obtain a plurality of corresponding digital voice frames.
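The framing in step S201 can be sketched as follows. This is a minimal illustration, not the patented implementation: the 8 kHz sampling rate, the 20 ms frame duration and the zero-padding of the last frame are assumptions chosen for the example, and `split_into_frames` is a hypothetical helper name.

```python
def split_into_frames(samples, frame_len):
    """Split a list of PCM samples into consecutive frames of frame_len samples.

    The last frame is zero-padded so that every frame has the same length.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    frames = []
    for start in range(0, len(samples), frame_len):
        frame = list(samples[start:start + frame_len])
        frame.extend([0] * (frame_len - len(frame)))  # pad a short final frame
        frames.append(frame)
    return frames


# Assumed values: 8 kHz sampling with 20 ms frames gives 160 samples per frame.
SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000
```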

Step S202: frequency-domain conversion is performed on the obtained digital voice frames to obtain corresponding frequency-domain voice data.

In specific implementations, to resist the damage that the subsequent encoding and decoding process causes to the voice data, the sending-end device 101 can perform frequency-domain conversion after obtaining the digital voice frames, to obtain the frequency-domain voice data corresponding to each digital voice frame. The frequency-domain voice data includes the frequency information of the corresponding digital voice frame.
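The source does not name the transform used for the frequency-domain conversion. As one possible reading, the sketch below assumes a plain discrete Fourier transform of each frame; `dft` and `bin_frequency_hz` are hypothetical helper names, and the O(n²) loop is for clarity rather than speed.

```python
import cmath


def dft(frame):
    """Naive discrete Fourier transform of one frame (O(n^2), for illustration)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bin_frequency_hz(k, n, sample_rate_hz):
    """Frequency represented by bin k for a frame of n samples."""
    return k * sample_rate_hz / n
```

With the assumed 160-sample frames at 8 kHz, adjacent bins are 50 Hz apart, which is the spacing used when deciding which bins lie above 3400 Hz.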

Step S203: within a preset encryption period, the frequency-domain voice data corresponding to a preset number of digital voice frames is encrypted, each with its corresponding encryption information.

In an embodiment of the present invention, within one encryption period the sending-end device 101 encrypts the frequency-domain voice data corresponding to each digital voice frame with different encryption information; that is, the frequency-domain voice data corresponding to the digital voice frames in an encryption period is in one-to-one correspondence with the encryption information. The encryption information includes an encryption key and an encryption algorithm. Meanwhile, across different encryption periods, the frequency-domain voice data corresponding to digital voice frames at the same sequence position is encrypted with the same encryption information.

For example, when the first encryption period includes the frequency-domain voice data corresponding to 3 digital voice frames, the corresponding encryption information likewise includes 3 different groups of encryption keys and encryption algorithms. During encryption, the frequency-domain voice data corresponding to the first digital voice frame is encrypted with the first group of encryption key and encryption algorithm, that corresponding to the second digital voice frame with the second group, and that corresponding to the third digital voice frame with the third group.

The frequency-domain voice data corresponding to the 3 digital voice frames in the next encryption period is then encrypted with the same encryption information: the frequency-domain voice data corresponding to the first digital voice frame in the next encryption period is encrypted with the first group of encryption key and encryption algorithm, that corresponding to the second digital voice frame with the second group, and that corresponding to the third digital voice frame with the third group.

In this way, the frequency-domain voice data corresponding to adjacent digital voice frames within each encryption period is encrypted with different encryption keys and encryption algorithms, which increases the variety of encryption information used and thereby improves the security of voice transmission.
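The cyclic assignment in the 3-frame example can be expressed as an index taken modulo the period length. The following is a sketch only; the key and algorithm names are placeholders, and `encryption_info_for` is a hypothetical helper.

```python
def encryption_info_for(frame_index, schedule):
    """Return the (key, algorithm) pair for a frame, cycling through the schedule.

    schedule holds one entry per position in the encryption period, so frames at
    the same position in different periods get the same entry.
    """
    return schedule[frame_index % len(schedule)]


# Hypothetical schedule for a period of 3 frames; the names are placeholders.
SCHEDULE = [
    ("key-1", "algorithm-A"),
    ("key-2", "algorithm-B"),
    ("key-3", "algorithm-C"),
]
```

Frames 0, 3, 6 and so on share the first entry, while frames 0, 1 and 2 within one period each use a different entry.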

Meanwhile, cyclically encrypting the frequency-domain voice data corresponding to the digital voice frames in different encryption periods with the same encryption information, comprising the groups of encryption keys and encryption algorithms, keeps the length of the encryption information within a controllable range and so meets the actual storage requirements of the mobile terminal.

In specific implementations, the length of the encryption period, the corresponding encryption information comprising encryption keys and encryption algorithms, and the number of frequency-domain voice data items to be encrypted can be set according to actual needs, to further increase the variability of the encryption information used and thus further improve the security of voice transmission.

To further improve the quality of voice communication, in an embodiment of the present invention, synchronization of voice transmission can be achieved by adding corresponding frame synchronization information to the digital voice frames.

In specific implementations, the frequency-domain voice data corresponding to each digital voice frame, obtained by frequency-domain conversion, includes the corresponding frequency information. The maximum frequency of an effective voice signal is 3400 Hz; the band above 3400 Hz carries no voice information, and using it does not affect the effective voice band. Therefore, in embodiments of the present invention, the frequency-domain voice data corresponding to each digital voice frame is divided into an effective voice frequency band that carries voice information and an invalid voice frequency band that does not. The frame number of the corresponding digital voice frame is added as synchronization information to the invalid voice frequency band of each item of frequency-domain voice data, while the effective voice frequency band is encrypted with the corresponding encryption key and encryption algorithm.

In this way, the frame synchronization information synchronizes voice transmission. It avoids the problem that, because of frame loss, the decryption key and decryption algorithm used in parsing do not match the encryption key and encryption algorithm used in encryption, thereby avoiding abnormal communication, effectively improving the quality of voice communication, and improving the user experience.
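The split into an effective and an invalid band, with the frame number carried in the invalid band, might look like the sketch below. Several choices here are assumptions made for illustration and are not taken from the source: the encryption algorithm is a keyed permutation of the effective-band bins, the frame number is written as one bit per invalid-band bin (least significant bit first), and `split_bands` and `protect_frame` are hypothetical names. Whether such markers survive a particular voice codec is not addressed here.

```python
import random

VOICE_MAX_HZ = 3400  # top of the effective voice band, per the description


def split_bands(num_bins, bin_width_hz):
    """Return (effective, invalid) lists of bin indices."""
    effective = [k for k in range(num_bins) if k * bin_width_hz <= VOICE_MAX_HZ]
    invalid = [k for k in range(num_bins) if k * bin_width_hz > VOICE_MAX_HZ]
    return effective, invalid


def protect_frame(bins, frame_number, key, bin_width_hz, mark=1.0):
    """Scramble the effective band with a keyed permutation and write the frame
    number, as bits, into the invalid band."""
    effective, invalid = split_bands(len(bins), bin_width_hz)
    if frame_number >= 2 ** len(invalid):
        raise ValueError("frame number does not fit in the invalid band")
    order = effective[:]
    random.Random(key).shuffle(order)      # keyed permutation of effective bins
    out = list(bins)
    for dst, src in zip(effective, order):
        out[dst] = bins[src]
    for i, k in enumerate(invalid):        # least significant bit first
        out[k] = mark if (frame_number >> i) & 1 else 0.0
    return out
```

With 81 bins spaced 50 Hz apart (0 to 4000 Hz), bins 0 to 68 form the effective band and bins 69 to 80 leave 12 bits for the frame number.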

Step S204: time-domain conversion is performed on the encrypted frequency-domain voice data to obtain encrypted digital voice data frames.

In specific implementations, once the frequency-domain voice data corresponding to all digital voice frames has been encrypted, the sending-end device 101 performs time-domain conversion on the encrypted frequency-domain data to obtain the encrypted digital voice frames. This effectively avoids the damage that subsequent voice compression and decoding would cause to the voice information, and improves the quality of communication.
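For the time-domain conversion to yield real-valued samples again, the modified spectrum has to stay conjugate-symmetric. The sketch below assumes an inverse discrete Fourier transform (the source does not name the transform) applied to the lower half of the spectrum, with the upper half rebuilt by symmetry; `idft_real` is a hypothetical helper name.

```python
import cmath


def idft_real(half_spectrum, n):
    """Inverse DFT of a real signal given bins 0..n//2 (n even).

    The remaining bins are filled in by conjugate symmetry, which keeps the
    time-domain result real after the lower bins have been modified. The DC
    and Nyquist bins are expected to be real; any imaginary residue is dropped.
    """
    full = list(half_spectrum) + [
        half_spectrum[n - k].conjugate() for k in range(n // 2 + 1, n)
    ]
    return [
        (sum(full[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]
```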

Step S205: the digital voice data frame after encryption is compressed and sends.

In being embodied as, sending ending equipment 101 is after obtaining the digital voice frame after encryption, just Can use corresponding compress technique that the digital voice frame after encryption is compressed and is sent.

During above-mentioned, the Speech processing process of sending ending equipment 101 is carried out detailed Introduce.Correspondingly, the flow chart of the audio signal processing method of receiving device 103 refers to Fig. 3 institute Show.

Fig. 3 shows that the embodiment of the present invention additionally provides the audio signal processing method of a kind of receiving device Flow chart.Audio signal processing method as shown in Figure 3, may include that

Step S301: the digital voice data frame of the overcompression received and encryption is solved Compression, obtains through the digital voice data frame of encryption.

In being embodied as, when receiving device 103 receives passing through of sending ending equipment 101 transmission When compression and the digital voice data frame of encryption, can carry out initially with corresponding decompression technique Deciphering, obtains through the digital voice data frame of encryption.

Step S302: by described through the digital voice data frame of encryption, be converted into correspondence Frequency domain speech data, and use corresponding decryption information that described frequency domain speech data are decrypted.

In being embodied as, receiving device 103 by through decompression after obtain through encryption Digital voice data frame, changed by frequency domain, obtain correspondence through encryption frequency domain speech data. Meanwhile, receiving device 203 uses the decryption information including decruption key and decipherment algorithm accordingly to obtaining To frequency domain speech data be decrypted.

In an embodiment of the present invention, the sending-end device 101 divides the frequency domain speech data corresponding to each digital voice frame into a valid voice band, which carries voice information, and an invalid voice band, which carries no voice information. It adds the frame number of the corresponding digital voice frame, as synchronization information, to the invalid voice band of each piece of frequency domain speech data, and encrypts the valid voice band using the corresponding encryption algorithm and encryption information.

Correspondingly, when the frequency domain speech data it obtains is encrypted, the receiving-end device 103 can parse the frame number, carried as synchronization information, from the invalid voice band of that data. It then decrypts the valid voice band of the frequency domain speech data using the decryption information (decryption key and decryption algorithm) that corresponds to the parsed frame number, obtaining the decrypted frequency domain speech data.
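The band split and frame-number lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of the embodiment: the bin layout (`N_BINS`, `VALID`, `SYNC_BIN`), the key table `KEYS`, and the use of a keyed permutation of frequency bins as the cipher are all hypothetical, and the sketch ignores the conjugate symmetry that the spectrum of a real signal must keep.

```python
import random

# Hypothetical layout of one frame's frequency domain data (a list of bins).
N_BINS = 16
VALID = range(2, 12)      # assumed valid voice band: bins that carry speech
SYNC_BIN = 14             # assumed bin in the invalid voice band for the frame number
KEYS = [101, 202, 303]    # assumed keys, one per position in a 3-frame period


def _permutation(key, n):
    """Derive a deterministic permutation of n positions from an integer key."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order


def encrypt_frame(bins, frame_no):
    """Sender: write the frame number into the invalid band, scramble the valid band."""
    out = list(bins)
    out[SYNC_BIN] = complex(frame_no, 0)
    valid = [bins[i] for i in VALID]
    order = _permutation(KEYS[frame_no % len(KEYS)], len(valid))
    for dst, src in zip(VALID, order):
        out[dst] = valid[src]
    return out


def decrypt_frame(bins):
    """Receiver: parse the frame number, pick the matching key, unscramble."""
    frame_no = int(round(bins[SYNC_BIN].real))
    valid = [bins[i] for i in VALID]
    order = _permutation(KEYS[frame_no % len(KEYS)], len(valid))
    out = list(bins)
    for pos, src in enumerate(order):
        out[VALID[0] + src] = valid[pos]   # undo the sender's permutation
    return frame_no, out
```

Because the frame number travels in the clear in the invalid band, the receiver can select the correct decryption information even when frames are lost or arrive out of step.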

Step S303: convert the decrypted frequency domain speech data into the corresponding digital voice frame.

In a specific implementation, after obtaining the decrypted frequency domain speech data, the receiving-end device 103 may use a corresponding time domain conversion method to obtain the corresponding digital voice frame.

In a specific implementation, to further improve the quality of voice communication, the audio signal processing method in the embodiment of the present invention may also include:

Step S304: perform speech enhancement on the digital voice frame obtained by the conversion.

In a specific implementation, operations such as encoding and decoding during transmission may degrade the voice signal to some extent. By applying a corresponding speech enhancement technique to the digital voice frame, the receiving-end device 103 can improve the quality of the output voice signal.

Step S305: output the digital voice frame.

In a specific implementation, the receiving-end device 103 may output the digital voice frame obtained after decryption and enhancement to the user after digital-to-analog conversion, so that the user of the receiving-end device 103 receives the corresponding voice information.
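Steps S301 to S303 can be sketched end to end as follows, together with a matching sender so that the round trip can be checked. Everything concrete here is an assumption made for illustration: `zlib` stands in for the unspecified compression technique, a naive DFT stands in for the frequency and time domain conversions, and the cipher is a toy keyed sign mask chosen because it keeps the time domain signal real. Speech enhancement (S304) and digital-to-analog output (S305) are omitted because the text does not specify them.

```python
import cmath
import random
import struct
import zlib


def dft(x, inverse=False):
    """Naive O(n^2) DFT; adequate for a short illustrative frame."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def sign_mask(key, n):
    """Assumed toy cipher: keyed +/-1 per bin, mirrored so the signal stays real."""
    rng = random.Random(key)
    mask = [1] * n
    for k in range(1, n // 2 + 1):
        mask[k] = mask[n - k] = rng.choice((1, -1))
    return mask


def send(samples, key):
    """Sender: frequency conversion, encryption, time conversion, compression."""
    spectrum = [v * m for v, m in zip(dft(samples), sign_mask(key, len(samples)))]
    encrypted = [v.real for v in dft(spectrum, inverse=True)]
    return zlib.compress(struct.pack(f"<{len(encrypted)}d", *encrypted))


def receive(payload, key):
    """Receiver: S301 decompress, S302 convert and decrypt, S303 convert back."""
    raw = zlib.decompress(payload)                         # S301
    encrypted = struct.unpack(f"<{len(raw) // 8}d", raw)
    spectrum = dft(encrypted)                              # S302: frequency conversion
    mask = sign_mask(key, len(encrypted))
    spectrum = [v * m for v, m in zip(spectrum, mask)]     # S302: mask is its own inverse
    return [v.real for v in dft(spectrum, inverse=True)]   # S303
```

Calling `receive(send(frame, key), key)` recovers the original samples up to floating-point error; a real system would use a speech codec rather than lossless compression, so the recovery would be approximate.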

Fig. 4 shows a flowchart of a renegotiation process between a sending-end device and a receiving-end device in an embodiment of the present invention. The renegotiation process shown in Fig. 4 may include:

Step S401: when a preset renegotiation condition is met, send a request to change the encryption information to the receiving-end device via in-band signaling.

In a specific implementation, the preset renegotiation condition includes: a preset time being reached, or a renegotiation request from the user to change the encryption information being received. The encryption information includes an encryption key and an encryption algorithm.

In a specific implementation, when the preset time is reached, or when a user's renegotiation request to change the encryption information is received, the sending-end device 101 sends the renegotiation request to the corresponding receiving-end device 103 via in-band signaling and waits for the response of the receiving-end device 103.

Step S402: upon receiving, via in-band signaling, the renegotiation request to change the encryption information, reply via in-band signaling with information agreeing to the change.

In a specific implementation, upon receiving the renegotiation request sent by the sending-end device 101 via in-band signaling, the receiving-end device 103 replies to the sending-end device 101, via in-band signaling, with the corresponding information agreeing to change the encryption information.

Step S403: upon receiving the information, replied by the receiving-end device via in-band signaling, agreeing to change the encryption information, send the switching time point information to the receiving-end device via in-band signaling.

In a specific implementation, upon receiving the information, replied by the receiving-end device 103 via in-band signaling, agreeing to change the encryption information, the sending-end device 101 sends the switching time point information to the receiving-end device 103 via in-band signaling and waits for the receiving-end device 103 to confirm that it has received the switching time point information.

In a specific implementation, the switching time point is the point in time from which the changed encryption information is used for encryption; it can be set according to actual needs.

Step S404: receive, via in-band signaling, the information on the switching time point at which the changed encryption information is enabled.

In a specific implementation, when the sending-end device 101 receives the information, sent by the receiving-end device 103, agreeing to change the encryption information, it may send the switching time point to the receiving-end device 103 via in-band signaling. After receiving the switching time point information, the receiving-end device 103 may reply to the sending-end device 101 with a confirmation.

Step S405: upon receiving, via in-band signaling, the information on the switching time point at which the changed encryption information is enabled, reply with a confirmation that the switching time point information has been received.

Step S406: when the confirmation, replied via in-band signaling, that the switching time point information has been received arrives, the renegotiation process between the sending-end device and the receiving-end device ends.

In a specific implementation, once the receiving-end device 103 has replied to the sending-end device 101, via in-band signaling, with the confirmation that the switching time point information has been received, the renegotiation process between the sending-end device 101 and the receiving-end device 103 ends.
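The message exchange of steps S401 to S406 can be sketched as a two-round handshake. The sketch makes several assumptions that are not in the text: the in-band channel is modeled as a direct method call, the message names in `Msg` are hypothetical, and the switching time point is represented as a frame number.

```python
from enum import Enum, auto


class Msg(Enum):
    CHANGE_REQUEST = auto()      # S401: sender asks to change the encryption information
    CHANGE_AGREED = auto()       # S402: receiver agrees to the change
    SWITCH_POINT = auto()        # S403: sender announces the switching time point
    SWITCH_POINT_ACK = auto()    # S405: receiver confirms receipt


class Receiver:
    """Receiving-end side of the renegotiation."""

    def __init__(self):
        self.switch_point = None

    def on_message(self, msg, payload=None):
        """Return the in-band reply to each message from the sender."""
        if msg is Msg.CHANGE_REQUEST:
            return Msg.CHANGE_AGREED
        if msg is Msg.SWITCH_POINT:
            self.switch_point = payload      # S404: store the announced point
            return Msg.SWITCH_POINT_ACK
        raise ValueError(f"unexpected message: {msg}")


def renegotiate(receiver, switch_point):
    """Sending-end side of S401 to S406; True when renegotiation completes."""
    if receiver.on_message(Msg.CHANGE_REQUEST) is not Msg.CHANGE_AGREED:
        return False
    reply = receiver.on_message(Msg.SWITCH_POINT, switch_point)
    return reply is Msg.SWITCH_POINT_ACK     # S406: confirmation received
```

A real implementation would also need timeouts and retransmission for lost in-band messages, which the text does not discuss.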

When the switching time point determined in the renegotiation process arrives, the sending-end device 101 can encrypt subsequent digital voice frames using encryption information different from that currently in use. Correspondingly, the receiving-end device 103 can decrypt subsequent digital voice frames using the decryption information corresponding to the changed encryption information. This further increases the difficulty of cracking the speech data and improves the security of voice information transmission.
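The effect of the switching time point can be illustrated with a small selector. This is a sketch under assumptions: the switching time point is modeled as a frame number, and `current` and `pending` are hypothetical names for the encryption information before and after the change.

```python
def active_info(frame_no, current, pending, switch_point):
    """Return the encryption information in force for a given frame.

    Frames numbered below `switch_point` keep `current`; frames at or after it
    use `pending`. With no negotiated switch point, `current` always applies.
    """
    if switch_point is not None and frame_no >= switch_point:
        return pending
    return current


# Both ends evaluate the same rule, so they change over on the same frame.
old, new = ("key-1", "alg-1"), ("key-2", "alg-2")
choices = [active_info(n, old, new, switch_point=3) for n in range(5)]
```

Here frames 0 to 2 are handled with the old information and frames 3 and 4 with the new one, on both the sending and the receiving side.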

Fig. 5 shows a schematic structural diagram of a speech signal processing device in an embodiment of the present invention. The speech signal processing device 500 shown in Fig. 5 may include a framing processing unit 501, a first frequency domain conversion unit 502, an encryption processing unit 503, a first time domain conversion unit 504 and a compression and sending unit 505, wherein:

The framing processing unit 501 is adapted to convert the collected time domain speech signal into a digital speech signal and perform framing to obtain the corresponding digital voice frames.

The first frequency domain conversion unit 502 is adapted to perform frequency domain conversion on the obtained digital voice frames to obtain the corresponding frequency domain speech data, the frequency domain speech data including the frequency information of the corresponding digital voice frame.

The encryption processing unit 503 is adapted to encrypt, within a preset encryption period, the frequency domain speech data corresponding to a preset number of digital voice frames, each with its corresponding encryption information, wherein the encryption information includes an encryption key and an encryption algorithm, and the frequency domain speech data corresponding to digital voice frames at the same position in each encryption period is encrypted with the same encryption key and encryption algorithm.

In a specific implementation, the frequency domain speech data corresponding to each digital voice frame includes an invalid voice band and a valid voice band; the invalid voice band contains no voice information and the valid voice band contains voice information. The encryption processing unit 503 is adapted to add, within the encryption period, frame synchronization information to the invalid voice band of the frequency domain speech data corresponding to each digital voice frame, and to encrypt the valid voice band of the frequency domain speech data corresponding to each digital voice frame using the corresponding encryption information.
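The rule that frames at the same position in each encryption period share a key and an algorithm amounts to indexing a fixed schedule by the frame's position within the period. The sketch below assumes a period of three frames; the `EncryptionInfo` type, the key bytes and the algorithm names are placeholders, not values from the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncryptionInfo:
    key: bytes
    algorithm: str


# Assumed schedule: an encryption period of 3 frames, one entry per position.
SCHEDULE = (
    EncryptionInfo(b"key-A", "permute-bins"),
    EncryptionInfo(b"key-B", "invert-spectrum"),
    EncryptionInfo(b"key-C", "permute-bins"),
)


def info_for_frame(frame_no, schedule=SCHEDULE):
    """Frames at the same position within each period share key and algorithm."""
    return schedule[frame_no % len(schedule)]
```

With this schedule, frames 0, 3, 6 and so on use the first entry, while adjacent frames within a period use different entries.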

The first time domain conversion unit 504 is adapted to perform time domain conversion on the encrypted frequency domain speech data to obtain the encrypted digital voice data frames.

The compression and sending unit 505 is adapted to compress and send the encrypted digital voice data frames.

In a specific implementation, the speech signal processing device 500 in the embodiment of the present invention may also include a renegotiation request unit 506, a switching time negotiation unit 507 and a reception confirmation unit 508, wherein:

The renegotiation request unit 506 is adapted to send a request to change the encryption information to the receiving-end device via in-band signaling when a preset renegotiation condition is met. In a specific implementation, the preset renegotiation condition includes: a preset time being reached, or a renegotiation request from the user to change the encryption information being received.

The switching time negotiation unit 507 is adapted to send to the receiving-end device, via in-band signaling, the information on the switching time point at which the changed encryption information is enabled, upon receiving the information, replied by the receiving-end device via in-band signaling, agreeing to change the encryption information.

The reception confirmation unit 508 is adapted to determine, upon receiving confirmation that the receiving-end device has received the switching time point information, that the negotiation with the receiving-end device to change the encryption information has succeeded.

Fig. 6 shows a schematic structural diagram of a speech signal processing device in an embodiment of the present invention. The speech signal processing device 600 shown in Fig. 6 may include a receiving and decompression unit 601, a second frequency domain conversion unit 602, a decryption unit 603 and a second time domain conversion and output unit 604, wherein:

The receiving and decompression unit 601 is adapted to decompress the received compressed and encrypted digital voice data frame to obtain the encrypted digital voice data frame.

The second frequency domain conversion unit 602 is adapted to convert the encrypted digital voice data frame into the corresponding frequency domain speech data.

The decryption unit 603 is adapted to decrypt the frequency domain speech data using the corresponding decryption information, the decryption information including a decryption key and a decryption algorithm.

In a specific implementation, the frequency domain speech data includes an invalid voice band and a valid voice band; the invalid voice band contains no voice information and the valid voice band contains voice information.

The decryption unit 603 is adapted to parse the corresponding frame synchronization information from the invalid voice band of the frequency domain speech data, and to decrypt the valid voice band of the frequency domain speech data using the decryption information corresponding to the parsed frame synchronization information, obtaining the corresponding decrypted frequency domain speech data.

The second time domain conversion and output unit 604 is adapted to convert the decrypted frequency domain speech data into the corresponding digital voice frame and output it.

In a specific implementation, the speech signal processing device 600 in the embodiment of the present invention may also include an enhancement processing unit 605, wherein:

The enhancement processing unit 605 is adapted to perform speech enhancement on the digital voice frame converted from the decrypted frequency domain speech data, before that digital voice frame is output.

In a specific implementation, the speech signal processing device 600 may also include a request receiving and reply unit 606, a switching time receiving unit 607 and a renegotiation confirmation unit 608, wherein:

The request receiving and reply unit 606 is adapted to reply via in-band signaling, upon receiving via in-band signaling a renegotiation request to change the encryption information, with information agreeing to change the encryption information, the encryption information including an encryption key and an encryption algorithm.

The switching time receiving unit 607 is adapted to receive, via in-band signaling, the information on the switching time point at which the changed encryption information is enabled.

The renegotiation confirmation unit 608 is adapted to determine, when confirmation that the switching time point information has been received is replied, that the negotiation with the receiving-end device to change the encryption information has succeeded.

A person of ordinary skill in the art will appreciate that all or some of the steps of the methods in the above embodiments can be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include ROM, RAM, a magnetic disk, an optical disc, and the like.

The method and system of the embodiments of the present invention have been described in detail above, but the present invention is not limited thereto. Any person skilled in the art may make various changes and modifications without departing from the spirit and scope of the present invention; the scope of protection of the present invention shall therefore be as defined by the claims.

Claims

1. An audio signal processing method, characterized by comprising:

converting a collected time domain speech signal into a digital speech signal and performing framing to obtain corresponding digital voice frames;

performing frequency domain conversion on the obtained digital voice frames to obtain corresponding frequency domain speech data, the frequency domain speech data including the frequency information of the corresponding digital voice frame;

within a preset encryption period, encrypting the frequency domain speech data corresponding to a preset number of digital voice frames, each with its corresponding encryption information, wherein the encryption information includes an encryption key and an encryption algorithm, and the frequency domain speech data corresponding to digital voice frames at the same position in each encryption period is encrypted with the same encryption key and encryption algorithm;

performing time domain conversion on the encrypted frequency domain speech data to obtain encrypted digital voice data frames; and compressing and sending the encrypted digital voice data frames.

2. The audio signal processing method according to claim 1, characterized in that the frequency domain speech data corresponding to each digital voice frame includes an invalid voice band and a valid voice band, the invalid voice band containing no voice information and the valid voice band containing voice information;

within the encryption period, frame synchronization information is added to the invalid voice band of the frequency domain speech data corresponding to each digital voice frame, and the valid voice band of the frequency domain speech data corresponding to each digital voice frame is encrypted with the corresponding encryption information.

3. The audio signal processing method according to claim 1 or 2, characterized by further comprising: when a preset renegotiation condition is met, sending a request to change the encryption information to a receiving-end device via in-band signaling;

upon receiving information, replied by the receiving-end device via in-band signaling, agreeing to change the encryption information, sending to the receiving-end device, via in-band signaling, information on the switching time point at which the changed encryption information is enabled;

upon receiving confirmation that the receiving-end device has received the switching time point information, determining that the negotiation with the receiving-end device to change the encryption information has succeeded.

4. The audio signal processing method according to claim 3, characterized in that the preset renegotiation condition includes: a preset time being reached, or a renegotiation request from the user to change the encryption information being received.

5. An audio signal processing method, characterized by comprising:

decompressing a received compressed and encrypted digital voice data frame to obtain the encrypted digital voice data frame;

converting the encrypted digital voice data frame into corresponding frequency domain speech data, and decrypting the frequency domain speech data using corresponding decryption information, the decryption information including a decryption key and a decryption algorithm;

6. The audio signal processing method according to claim 5, characterized in that the frequency domain speech data includes an invalid voice band and a valid voice band, the invalid voice band containing no voice information and the valid voice band containing voice information;

the corresponding frame synchronization information is parsed from the invalid voice band of the frequency domain speech data, and the valid voice band of the frequency domain speech data is decrypted using the decryption information corresponding to the parsed frame synchronization information, obtaining the corresponding decrypted frequency domain speech data.

7. The audio signal processing method according to claim 5 or 6, characterized by further comprising: upon receiving, via in-band signaling, a renegotiation request to change the encryption information, replying via in-band signaling with information agreeing to change the encryption information, the encryption information including an encryption key and an encryption algorithm;

when confirmation that the switching time point information has been received is replied, the negotiation with the receiving-end device to change the encryption information succeeds.

8. The audio signal processing method according to claim 5, characterized in that, before the digital voice frame converted from the decrypted frequency domain speech data is output, the method further comprises: performing speech enhancement on that digital voice frame.

9. A speech signal processing device, characterized by comprising:

a framing processing unit, adapted to convert a collected time domain speech signal into a digital speech signal and perform framing to obtain corresponding digital voice frames;

an encryption processing unit, adapted to encrypt, within a preset encryption period, the frequency domain speech data corresponding to a preset number of digital voice frames, each with its corresponding encryption information, wherein the encryption information includes an encryption key and an encryption algorithm, and the frequency domain speech data corresponding to digital voice frames at the same position in each encryption period is encrypted with the same encryption key and encryption algorithm;

a first time domain conversion unit, adapted to perform time domain conversion on the encrypted frequency domain speech data to obtain encrypted digital voice data frames;

10. The speech signal processing device according to claim 9, characterized in that the frequency domain speech data corresponding to each digital voice frame includes an invalid voice band and a valid voice band, the invalid voice band containing no voice information and the valid voice band containing voice information;

the encryption processing unit is adapted to add, within the encryption period, frame synchronization information to the invalid voice band of the frequency domain speech data corresponding to each digital voice frame, and to encrypt the valid voice band of the frequency domain speech data corresponding to each digital voice frame with the corresponding encryption information.

11. The speech signal processing device according to claim 9 or 10, characterized by further comprising: a renegotiation request unit, adapted to send a request to change the encryption information to a receiving-end device via in-band signaling when a preset renegotiation condition is met;

a switching time negotiation unit, adapted to send to the receiving-end device, via in-band signaling, information on the switching time point at which the changed encryption information is enabled, upon receiving information, replied by the receiving-end device via in-band signaling, agreeing to change the encryption information;

a reception confirmation unit, adapted to determine, upon receiving confirmation that the receiving-end device has received the switching time point information, that the negotiation with the receiving-end device to change the encryption information has succeeded.

12. The speech signal processing device according to claim 11, characterized in that the preset renegotiation condition includes: a preset time being reached, or a renegotiation request from the user to change the encryption information being received.

13. A speech signal processing device, characterized by comprising:

a receiving and decompression unit, adapted to decompress a received compressed and encrypted digital voice data frame to obtain the encrypted digital voice data frame;

a second frequency domain conversion unit, adapted to convert the encrypted digital voice data frame into corresponding frequency domain speech data;

a decryption unit, adapted to decrypt the frequency domain speech data using corresponding decryption information, the decryption information including a decryption key and a decryption algorithm;

a second time domain conversion and output unit, adapted to convert the decrypted frequency domain speech data into the corresponding digital voice frame and output it.

14. The speech signal processing device according to claim 13, characterized in that the frequency domain speech data includes an invalid voice band and a valid voice band, the invalid voice band containing no voice information and the valid voice band containing voice information;

the decryption unit is adapted to parse the corresponding frame synchronization information from the invalid voice band of the frequency domain speech data, and to decrypt the valid voice band of the frequency domain speech data using the decryption information corresponding to the parsed frame synchronization information, obtaining the corresponding decrypted frequency domain speech data.

15. The speech signal processing device according to claim 13 or 14, characterized by further comprising: a request receiving and reply unit, adapted to reply via in-band signaling, upon receiving via in-band signaling a renegotiation request to change the encryption information, with information agreeing to change the encryption information, the encryption information including an encryption key and an encryption algorithm;

a switching time receiving unit, adapted to receive, via in-band signaling, information on the switching time point at which the changed encryption information is enabled;

a renegotiation confirmation unit, adapted to determine, when confirmation that the switching time point information has been received is replied, that the negotiation with the receiving-end device to change the encryption information has succeeded.

16. The speech signal processing device according to claim 13, characterized by further comprising: an enhancement processing unit, adapted to perform speech enhancement on the digital voice frame converted from the decrypted frequency domain speech data, before that digital voice frame is output.