CN1901442A - Camouflage communication method based on voice identification - Google Patents



Publication number
CN1901442A
Authority
CN
China
Prior art keywords
prime
ciphertext
code stream
frame
speech recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2006100855931A
Other languages
Chinese (zh)
Other versions
CN100550723C (en)
Inventor
邓宗元
杨震
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University
Priority to CNB2006100855931A (patent/CN100550723C/en)
Publication of CN1901442A (patent/CN1901442A/en)
Application granted
Publication of CN100550723C (patent/CN100550723C/en)
Expired - Fee Related
Anticipated expiration

Landscapes

  • Telephonic Communication Services (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention relates to a camouflage communication method based on speech recognition, for use in military secure communication and facsimile systems. The method comprises: generating the secret-information code stream by speech recognition; serial-to-parallel conversion and generation of the ciphertext to be embedded using a random key; detecting the formants of voiced frames of the plaintext speech and embedding the ciphertext at a randomly selected formant; and extracting the secret information. It proceeds in two steps: 1, watermark generation and embedding; 2, watermark extraction.

Description

Camouflage communication method based on speech recognition
Technical field
The present invention relates to a real-time secure communication scheme based on speech recognition for military secure communication and facsimile systems, and belongs to the fields of information hiding and speech processing.
Background art
Information hiding, a branch of information security, has developed rapidly over the past decade. The technique embeds additional information in digital media without unduly affecting the carrier signal, in order to provide functions such as copyright protection and covert communication. Research on information hiding concentrates on two areas. The first is digital audio watermarking that is transparent, secure and robust; its results can be applied to copyright protection, data annotation, tamper prevention and automatic monitoring of wireless transmission. The second is the use of information camouflage to realize a new mode of secure voice communication. This requires research into new core algorithms for voice camouflage, speech compression at very low bit rates, and techniques such as identity-code authentication. The goal is to realize two kinds of voice camouflage communication, non-real-time and real-time. The results can be used in systems such as military secure voice communication.
With the development of computers, conventional encryption algorithms face great challenges, and the study of new digital secure voice communication has become a challenging problem. Little research has so far addressed real-time information hiding and voice camouflage, mainly because the amount of information to be hidden in real time is much larger than that of a watermark. Most existing real-time hiding schemes are limited by the trade-off between the amount of hidden information on the one hand and transparency and robustness on the other. One approach is least-significant-bit (LSB) replacement, which can embed more data and is simple to implement, but is not robust against interference on the communication line. Another common approach hides information in the transform domain by modifying the DCT (discrete cosine transform) coefficients of the plaintext carrier. Its embedding is more robust, but it carries less data than the LSB scheme. Therefore, only by adopting a speech compression algorithm with a very low bit rate, so that the ciphertext bit rate is reduced as far as possible, does it become feasible to find a more robust hiding scheme that achieves real-time, reliable communication of secret information.
Summary of the invention
Technical problem: the object of the invention is to propose a camouflage communication method based on speech recognition for use in secure communication. The method aims to achieve transparency, robustness and real-time operation of the information hiding, together with self-recovery at the receiving end; it has good practical value and offers a new approach to the study and design of secure communication.
Technical solution: the camouflage communication method based on speech recognition of the present invention comprises four main parts: generation of the secret-information code stream by speech recognition, random key generation, embedding of the ciphertext at a formant of voiced frames, and extraction of the secret information. The steps of the method are as follows:
1.) Watermark generation and embedding:
A. The user sends to the system the phrase command that is to be transmitted secretly.
B. The system first performs speech recognition based on DTW (dynamic time warping) and converts the command into the corresponding text; any error found is corrected. The text is then represented as a binary code stream.
C. Binary-to-quaternary serial-to-parallel conversion is performed according to the formula:
$$S' = s'(i), \quad 0 \le i < M/2, \quad s'(i) \in \{0,1,2,3\}$$
where $s'(i)$ is a quaternary code-stream value obtained by combining the binary code-stream values in pairs, $S'$ is the resulting quaternary code stream, and $M$ is the number of bits needed to encode the command in binary. Encryption is then performed according to the formula:
$$S = (s'(k) + K_1) \bmod 4, \quad s'(k) \in \{0,1,2,3\}, \quad 0 \le k < M/2$$
giving the encrypted ciphertext $S$, where $K_1$ is a predetermined random encryption seed, which is an even number, and $s'(k)$ is an unencrypted quaternary code-stream value.
D. The plaintext speech is divided into frames; each frame is classified as unvoiced or voiced according to its energy, and the voiced frames are identified.
E. A DFT (discrete Fourier transform) is applied to each voiced frame, and the first or second formant is selected according to the control factor $K_2$.
F. The watermark generated in step C is embedded according to the formula:
[Equation image A20061008559300062: embedding formula, not reproduced in this text]
where $C_k$ is the coefficient obtained from the DFT of the original voiced speech frame, $C'_k$ is the coefficient after the ciphertext is embedded, and $\beta$ is the embedding depth, i.e. the scale factor of the change in coefficient amplitude before and after embedding, determined by experiment. The $n$ in $4n + s'(k)$ takes the smallest value for which the inequality on the right-hand side of the formula just holds, $s'(k)$ is an encrypted quaternary ciphertext value, and
[Equation image A20061008559300071: rounding operator, not reproduced in this text]
denotes taking the integer part.
G. An inverse DFT is applied to the watermarked plaintext to obtain the mixed speech, which is then transmitted.
2.) Watermark extraction:
H. The received mixed speech is first divided into frames in the same way.
I. Each frame is likewise classified as unvoiced or voiced according to its energy, and the voiced frames are identified.
J. A DFT (discrete Fourier transform) is applied to each voiced frame, and the first or second formant is selected according to the same control factor $K_2$.
K. The secret information is extracted according to the formula:
$$s''(i) = \left( \mathrm{round}\left[ \frac{C'_k}{\beta} \right] \right) \bmod 4, \quad 0 \le i, k < M/2$$
where $C'_k$ is the coefficient after the ciphertext is embedded, $\beta$ is the embedding depth, and $M$ is the number of bits of the binary-coded command, so that in quaternary representation the coded command has $M/2$ symbols; $s''(i)$ is an extracted encrypted quaternary value, and round[] denotes rounding to the nearest integer.
L. Decryption is performed according to the formula:
$$S'' = (s''(i) + K_1) \bmod 4, \quad 0 \le i < M/2, \quad s''(i) \in \{0,1,2,3\}, \quad K_1 \text{ even}$$
where $S''$ is the extracted quaternary code stream after decryption and $K_1$ is the same predetermined random encryption seed as at the sender, which is an even number; the other variables are as above.
M. Quaternary-to-binary conversion is performed according to the formula:
$$S''' = s'''(i), \quad 0 \le i < M, \quad s'''(i) \in \{0,1\}$$
giving the true binary ciphertext values $s'''(i)$, which correspond one-to-one with the text; the other variables are as above.
N. The text is displayed on the screen.
Generation of the secret-information code stream by speech recognition: the system is trained in advance on a fairly large amount of data from the speaker. The person conveying the secret information speaks the command into the terminal's microphone in a quiet environment. After recognition by the DTW (dynamic time warping) speech recognition system, any minor errors are corrected in advance via the keyboard, and the secret speech information S is then encoded into the secret code stream.
The random key generation part produces the key that scrambles the above speech recognition result and finally generates the ciphertext to be hidden. The recognition result sequence S first undergoes quaternary serial-to-parallel conversion to give a new ciphertext code stream $S'$; a key $K_1$ is then generated at random and used to scramble $S'$, producing the ciphertext to be hidden, where $K_1$ is an even number.
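The conversion and scrambling just described can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and pairing the bits most-significant-bit first is an assumption, since the text only says that binary values are combined in pairs. The sketch also shows why an even $K_1$ matters: $2K_1 \equiv 0 \pmod 4$, so adding the key a second time undoes the scrambling.

```python
def bits_to_quaternary(bits):
    """Combine bits in pairs (first bit taken as the high bit, an assumption)
    into symbols in {0, 1, 2, 3}."""
    if len(bits) % 2:
        raise ValueError("M must be even so that the bits pair up")
    return [2 * bits[i] + bits[i + 1] for i in range(0, len(bits), 2)]


def quaternary_to_bits(symbols):
    """Inverse of bits_to_quaternary."""
    bits = []
    for s in symbols:
        bits.extend([s >> 1, s & 1])
    return bits


def scramble(symbols, k1):
    """Add the even key K1 modulo 4. Applying it twice restores the input,
    because 2 * K1 is a multiple of 4 whenever K1 is even."""
    if k1 % 2:
        raise ValueError("K1 must be even")
    return [(s + k1) % 4 for s in symbols]


bits = [1, 0, 0, 1, 1, 1, 0, 0]       # M = 8 bits of an encoded command
s_plain = bits_to_quaternary(bits)    # M/2 = 4 quaternary symbols
s_enc = scramble(s_plain, k1=6)       # ciphertext to be hidden
s_dec = scramble(s_enc, k1=6)         # receiver applies the same key again
```

Here `s_plain` is `[2, 1, 3, 0]`, `s_enc` is `[0, 3, 1, 2]`, and `s_dec` equals `s_plain` again.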
In the stage of embedding the ciphertext at a voiced-frame formant, the plaintext speech is first divided into frames and each frame is classified as unvoiced or voiced. The voiced frames, which account for about 70% of the frames, are searched for their first and second formants. In accordance with the masking effect of the human ear, the coefficient of the frequency point corresponding to the first or second formant of the voiced frame is adaptively selected, under control of the second random key $K_2$, and modified: if $K_2 = 0$, the frequency at the first formant is selected; otherwise the frequency at the second formant is selected for embedding. Using the 3 dB bandwidth criterion, the DFT (discrete Fourier transform) coefficient nearest the first or second formant position is located and modified so as to hide the ciphertext, i.e. the ciphertext $S'$ to be hidden is embedded in the coefficient $C_k$. After the replacement, an IDFT (inverse discrete Fourier transform) is applied to the plaintext speech carrying the hidden ciphertext to obtain the mixed speech, which is transmitted over a PSTN (public switched telephone network) channel.
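The framing and energy-based voicing decision at the start of this stage might look like the sketch below. It is illustrative only: the frame length of 160 samples (20 ms at 8 kHz), the non-overlapping frames, the fixed energy threshold and the function names are all assumptions, since the text states only that the decision is made according to frame energy.

```python
import math


def split_frames(samples, frame_len):
    """Cut the signal into consecutive non-overlapping frames; drop any tail."""
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def voiced_flags(frames, threshold):
    """True for frames whose energy exceeds the (assumed) threshold."""
    return [frame_energy(f) > threshold for f in frames]


fs = 8000          # sampling rate in Hz
frame_len = 160    # assumed frame length: 20 ms
# Toy signal: a loud 200 Hz tone stands in for voiced speech, a faint one for the rest.
loud = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(frame_len)]
quiet = [0.001 * math.sin(2 * math.pi * 200 * n / fs) for n in range(frame_len)]

frames = split_frames(loud + quiet + loud, frame_len)
flags = voiced_flags(frames, threshold=0.01)   # hypothetical threshold
```

With this toy input, `flags` is `[True, False, True]`: only the first and third frames would go on to the formant search and embedding.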
For extraction of the secret information, the received mixed speech is first divided into frames with the same frame length as at the embedding end, and the voiced/unvoiced decision is made. An N-point DFT (discrete Fourier transform) is applied to each voiced frame, and the frequency point corresponding to the first or second formant of each voiced frame is found using the same search method as during embedding. The coefficient found is processed to extract the secret information, which is then decrypted with the key; finally, parallel-to-serial conversion yields the originally transmitted code stream, and the secret information sent by the sender is displayed on the receiving terminal's screen.
Beneficial effect:
1, Speech recognition is introduced into information hiding as a compression scheme with a very low bit rate. It greatly reduces the bit rate of the secret speech information and thereby creates the precondition for a real-time hiding scheme that is transparent, robust and highly secure.
2, Existing information hiding technology concentrates mainly on digital watermarking, whereas this scheme achieves high-capacity real-time information hiding and has high practical value in fields such as military secure communication.
3, Most existing audio information hiding schemes based on the DFT (discrete Fourier transform) embed at a fixed intermediate frequency; because the position is fixed, security is poor. This scheme uses adaptive frequency selection together with encryption before embedding, so that a two-level key guarantees the security of the secret information. It also makes full use of the characteristics of the human auditory system (HAS) and adopts multi-level modulation to increase the capacity of the hidden information.
Description of drawings
Fig. 1 is a system block diagram of the present invention,
Fig. 2 shows the time-domain waveforms of the original speech (top) and the speech carrying the ciphertext (bottom),
Fig. 3 shows the spectrograms of the original speech (top) and the speech carrying the ciphertext (bottom),
Fig. 4 is a performance plot of the system after low-pass filtering,
Fig. 5 is a performance plot of the system under white Gaussian noise interference.
Embodiment:
For a system carrying out secure communication, what matters is that the secret information reaches its destination accurately; the form in which it is transmitted is secondary. This scheme processes the secret speech by speech recognition, which greatly reduces its bit rate, and provides an embedding scheme with the highest possible transparency and robustness so as to realize real-time secure communication. From the viewpoint of information theory, speech recognition can compress the bit rate because speech contains not only semantic information but also characteristic information about the speaker, such as tone, intonation and emotion. In military secure communication these speaker characteristics are 'redundant' relative to the semantic command, so recognizing the secret speech, converting it into command text and then encoding and transmitting that text greatly reduces the bit rate of the secret information. By calculation, after compression with this scheme the ciphertext bit rate can be kept within 100 bit/s, which conventional speech compression coding schemes cannot currently achieve. Given the present level of speech recognition technology, it is reasonable to assume that in a military secure communication system the commands transmitted can be restricted to a limited vocabulary, in which case speech recognition can reach very high accuracy.
To guarantee the transparency and robustness of the hidden information after embedding, the information hiding scheme is realized in the frequency domain by adaptive selection of the embedding point and multi-level symbol modulation. Transform-domain hiding usually embeds at a fixed intermediate-frequency position in the DFT (discrete Fourier transform) domain; because the position is fixed, security is poor. This scheme differs from conventional frequency-domain hiding schemes in three main respects: (1) The embedding position is not fixed; the embedding point is produced by an adaptive search, which is equivalent to a key. (2) For each selected embedding point, modifying one frequency coefficient can convey several symbol states (for example four), realizing multi-level modulation and increasing the bit rate of the embedded hidden information. (3) The perceptual characteristics of the human ear are exploited; since the speech after embedding is transmitted over a PSTN (public switched telephone network) channel, the scheme is strongly robust against the various kinds of interference that may exist in the channel (companding, low-pass filtering, white noise, etc.).
The semantics (words) obtained by speech recognition are encoded to give the secret-information code stream S, which is hidden in the open speech V and transmitted over a PSTN (public switched telephone network) channel. To satisfy the transparency requirement and disperse the secret information as far as possible, the plaintext V is divided into frames, the voiced/unvoiced decision is made, and voiced plaintext frames are selected for embedding the ciphertext. According to the auditory masking effect of the human ear, the greater the spectral energy at a frequency point, the stronger its masking effect, so more noise can be introduced there without being noticed by the ear. In a voiced frame the spectral energy at the first and second formants is a local maximum, so the frequency point corresponding to the first or second formant of a voiced frame of the open speech (selected by key $K_2$) is chosen and its coefficient is modified to embed the secret information. Because modifying coefficients at these frequencies can introduce considerable distortion without being noticed by the ear, one modified coefficient can convey $2^N$ states (for example $N = 2$) so as to raise the bit rate of the embedded ciphertext, rather than only two states as in conventional schemes. Masking is thus fully exploited, and embedding efficiency is improved while transparency is preserved. At the receiving end the voiced/unvoiced decision is made by the same strategy, the formant search locates the modified coefficient, the hidden information is decided, and it is finally decoded into semantics and displayed on the terminal screen. The overall system framework is shown in Fig. 1.
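A rough estimate of the embedding capacity follows from figures given in this description: 8 kHz sampling, about 70% voiced frames, and one quaternary symbol (2 bits) per voiced frame. The frame length $L = 160$ samples (20 ms, non-overlapping) used below is an assumption for illustration only; the description does not state a frame length.

```latex
R \;\approx\; p_v \cdot \frac{f_s}{L} \cdot \log_2 4
  \;=\; 0.7 \times \frac{8000}{160} \times 2
  \;=\; 70 \ \text{bit/s}
```

Under that assumed frame length, the capacity is of the same order as the ciphertext rate of under 100 bit/s cited in this description; a shorter frame would raise it proportionally.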
A. Generation of the secret-information code stream
The invention uses speech recognition as a compression algorithm with a very low bit rate to improve the camouflage efficiency of the voice camouflage system. Because applications such as military secure communication need only small-vocabulary recognition, DTW (dynamic time warping) is used for small-vocabulary speech recognition. DTW is a relatively mature speech recognition technique and achieves a high recognition success rate in small-vocabulary systems such as military secure communication. The system designed in the invention is trained in advance on a fairly large amount of data from the speaker. The person conveying the secret information speaks the command into the terminal's microphone in a quiet environment (such as a secure bunker); after recognition by the DTW speech recognition system, any minor errors are corrected in advance via the keyboard, and the secret speech information S is then encoded into a secret code stream and embedded in the plaintext V. It can be assumed that the speaker issues the secret command in the following form: a certain army unit + preposition (for example "to") + place name (for example Nanjing) + the military action to take (for example "transfer"). With such a form, and with the help of the semantic pauses, DTW achieves a very high recognition rate and has definite practical value.
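For readers unfamiliar with DTW, the following is a minimal sketch of template matching by dynamic time warping. It is not the recognizer used in the invention: real systems compare sequences of spectral feature vectors, whereas this sketch uses one scalar feature per frame, and the template words and values are invented for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two 1-D feature sequences, |x - y| local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def recognise(utterance, templates):
    """Return the template word with the smallest DTW distance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))


# Hypothetical per-speaker templates recorded during training.
templates = {"advance": [1, 3, 4, 3, 1], "retreat": [4, 2, 1, 2, 4]}
utterance = [1, 1, 3, 4, 4, 3, 1]   # "advance" spoken more slowly
word = recognise(utterance, templates)
```

Because the warping path may repeat frames, the slower utterance aligns with the "advance" template at zero cost, so `word` is `"advance"`. A small, fixed vocabulary keeps the number of templates to compare low, which is one reason the description restricts the command set.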
In tests, the speech recognition scheme adopted here still achieved a very high recognition rate in the presence of a certain amount of noise. Moreover, the system designed here communicates over wired PSTN (public switched telephone network) channels and can therefore withstand strong electronic jamming in a wartime environment.
B. Secret information embedding process
(1) The secret speech captured by the system in real time is recognized and encoded into an M-bit ciphertext code stream S:
$$S = s(i), \quad 0 \le i < M, \quad s(i) \in \{0,1\} \qquad (1)$$
where $s(i)$ is a binary code-stream value.
(2) The number of bits conveyed by one modified coefficient is determined. The system can modulate the ciphertext with multi-level symbols; the coefficient modification is illustrated here with the quaternary case.
Serial-to-parallel conversion of S gives the new ciphertext code stream $S'$:
$$S' = s'(i), \quad 0 \le i < M/2, \quad s'(i) \in \{0,1,2,3\} \qquad (2)$$
where $s'(i)$ is the value obtained by converting the binary code stream of formula (1) into a quaternary code stream, $S'$ is the unencrypted quaternary ciphertext code stream, and $M$ is the number of bits of the binary-coded command after speech recognition.
A key $K_1$ ($K_1$ is an even number) is generated at random and used to scramble $S'$. The algorithm is:
$$S' = (s'(i) + K_1) \bmod 4, \quad 0 \le i < M/2, \quad s'(i) \in \{0,1,2,3\} \qquad (3)$$
where $S'$ now denotes the encrypted quaternary ciphertext code stream and the other variables are as above.
Thus, even if the algorithm is public, an eavesdropper can at most obtain the encrypted code stream and cannot obtain any useful information.
(3) The open plaintext speech (sampled at 8 kHz) is divided into frames, the voiced/unvoiced decision is made, and the voiced frames that meet the requirements ($V_k$, of frame length $L$) are identified for information hiding. An N-point DFT (discrete Fourier transform) of each selected frame $V_k$ gives
$$F = \mathrm{DFT}(V_k) = \{ f_k(i),\ 0 \le i \le N \} \qquad (4)$$
where $f_k(i)$ denotes the i-th DFT coefficient of the k-th frame used for information hiding and $F$ is the transform result. If the number of DFT points $N > L$ (the number of samples in the voiced frame), the frame is zero-padded at the end before the DFT.
(4) The embedding position in each frame is determined and the coefficient is modified to hide the information. Choosing a suitable frequency for embedding is very important. In accordance with the masking effect of the human ear, the coefficient of the frequency point corresponding to the first or second formant of the voiced frame (selected by $K_2$) is adaptively chosen and modified. If $K_2 = 0$, the frequency at the first formant is selected; otherwise the frequency at the second formant is selected for embedding. Using the 3 dB bandwidth criterion, the DFT coefficient nearest the first or second formant position is located and modified so as to hide the ciphertext. The encrypted ciphertext $S'$ to be hidden is embedded in the plaintext transform coefficient $C_k$. So that the secret information can be detected blindly at the receiving end, the embedding method applies quantization with embedding depth $\beta$. The coefficient after quantization is
[Equation image A20061008559300121: quantized coefficient, not reproduced in this text]
where the bracket operator in the formula denotes rounding up and the embedding depth $\beta$ is determined by experiment. The coefficient after embedding is given by formula (5) [equation image not reproduced in this text],
where $C_k$ is the coefficient obtained from the DFT of the original voiced speech frame, $C'_k$ is the coefficient after the ciphertext is embedded, and $\beta$ is the embedding depth, i.e. the scale factor of the change in coefficient amplitude before and after embedding, determined by experiment. The $n$ in $4n + s'(k)$ takes the smallest value for which the inequality on the right-hand side of the formula just holds, $s'(k)$ is an encrypted quaternary ciphertext value, and
[Equation image A20061008559300124: rounding operator, not reproduced in this text]
denotes taking the integer part; the other variables are as above.
Because the first or second formant frequency of voiced speech lies in the low-to-mid frequency range (200-1000 Hz), embedding information in this range avoids the loss that high-frequency components suffer during filtering or quantization. Because the positions of the first and second formants differ from frame to frame, the choice of hiding frequency is adaptive; this is equivalent to adding one more level of key and further strengthens the security of the secret information. In addition, because the spectral components at the first and second formants of voiced speech are much larger than those at other frequencies, multi-level modulation can be realized and robustness guaranteed while transparency is satisfied; provided a suitable embedding depth $\beta$ is chosen by experiment, the effect of embedding on transparency can be controlled.
(5) The coefficient $C'_k$ after embedding replaces the original plaintext speech coefficient $C_k$, and an IDFT (inverse discrete Fourier transform) is applied to the modified transform result $F$ to obtain the mixed speech $V'$, which is transmitted over a PSTN (public switched telephone network) channel.
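The embedding steps can be illustrated with the sketch below, which rests on several loudly stated assumptions. The embedding formula (5) survives in this text only as an image, so the rule used here, $|C'_k| = \beta\,(4n + s')$ with the smallest $n$ such that $4n + s' \ge \lfloor |C_k| / \beta \rfloor$, is a guess chosen to be consistent with the extraction formula (6), not the patented formula. The strongest low-frequency DFT bin stands in for the formant search controlled by $K_2$, only the magnitude is changed while the phase is kept, and all function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive N-point DFT (O(N^2)); adequate for one short demo frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse DFT; the real part is the signal when spec is conjugate-symmetric."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def embed_symbol(frame, symbol, beta):
    """Hide one quaternary symbol in the magnitude of the peak DFT bin (assumed rule)."""
    spec = dft(frame)
    n = len(frame)
    # Stand-in for the formant search: strongest bin below Nyquist, DC excluded.
    k = max(range(1, n // 2), key=lambda i: abs(spec[i]))
    q = math.floor(abs(spec[k]) / beta)
    level = q + (symbol - q) % 4              # smallest 4n + symbol that is >= q
    spec[k] *= level * beta / abs(spec[k])    # rescale magnitude, keep phase
    spec[n - k] = spec[k].conjugate()         # mirror bin keeps the signal real
    return idft(spec), k


n = 64
# Toy "voiced frame": a strong component at bin 8 plus a weaker one at bin 20.
frame = [math.cos(2 * math.pi * 8 * t / n) + 0.3 * math.cos(2 * math.pi * 20 * t / n)
         for t in range(n)]
beta = 3.0   # embedding depth, chosen arbitrarily for the demo
marked, k = embed_symbol(frame, symbol=3, beta=beta)

# Blind check in the style of formula (6): round(|C'_k| / beta) mod 4
recovered = round(abs(dft(marked)[k]) / beta) % 4
```

In this toy frame the peak is at bin 8, its magnitude moves from about 32 to 33, and `recovered` equals the embedded symbol 3, while each sample of the frame changes by less than 0.05. A larger `beta` tolerates more channel noise at the cost of a larger change to the coefficient.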
C. Secret information extraction process
This scheme allows blind detection of the ciphertext at the receiving end. As shown in Fig. 1, the received mixed speech $V''$ is first divided into frames with the same frame length as at the embedding end, the voiced/unvoiced decision is made, and an N-point DFT (discrete Fourier transform) of each voiced frame gives the corresponding $F'$. Because the frequency-domain embedding point is not fixed, the frequency point corresponding to the first or second formant of each voiced frame must also be found, using the same search method as during embedding. It can be shown that the local maximum modulus of the DFT coefficients of the mixed speech $V'$ after embedding still lies at the corresponding frequency of $V$, that is,
$$\arg\max_i \left( |f_k(i)| \right) = \arg\max_i \left( |f'_k(i)| \right), \quad i \in \text{3 dB width}$$
so the search method used for embedding applies equally to detection. Once $C'_k$ has been found, the secret information is extracted as follows:
$$S'' = s''(i) = \left( \mathrm{round}\left[ \frac{C'_k}{\beta} \right] \right) \bmod 4, \quad 0 \le i, k < M/2 \qquad (6)$$
where round[] denotes rounding to the nearest integer, $C'_k$ is the DFT coefficient of the received speech after the ciphertext has been embedded, $s''(i)$ is an extracted encrypted quaternary ciphertext value, $S''$ is the extracted encrypted quaternary ciphertext code stream, and $M$ is the number of bits of the binary-coded command after speech recognition.
$S''$ is decrypted with the key, giving
$$S'' = (s''(i) + K_1) \bmod 4, \quad 0 \le i < M/2, \quad s''(i) \in \{0,1,2,3\} \qquad (7)$$
where $K_1$ is the same encryption seed as at the sender and is an even number, and $S''$ now denotes the extracted, decrypted quaternary code stream. Because $K_1$ is even, $2K_1 \equiv 0 \pmod 4$, so adding $K_1$ a second time restores the unencrypted values. Parallel-to-serial conversion of $S''$ gives
$$S''' = s'''(i), \quad 0 \le i < M, \quad s'''(i) \in \{0,1\} \qquad (8)$$
where $s'''(i)$ is an extracted binary ciphertext value, corresponding one-to-one with the text, $S'''$ is the extracted binary ciphertext code stream, which can be translated or displayed directly as text, and $M$ is the number of bits of the binary-coded command after speech recognition.
Finally the stream is decoded, and the secret information sent by the sender is displayed on the receiving terminal's screen.
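The receiver-side chain of formulas (6) to (8) can be sketched as follows, working directly on coefficient magnitudes. The numbers are invented for illustration, the function names are hypothetical, and the unpacking order of each quaternary symbol into two bits (high bit first) is an assumption. The example also shows why the quantized embedding is blind and tolerant of noise: any disturbance of the magnitude smaller than $\beta / 2$ rounds back to the same level.

```python
def extract_symbol(magnitude, beta):
    """Formula (6): round the scaled coefficient magnitude and reduce modulo 4."""
    return round(magnitude / beta) % 4


def descramble(symbols, k1):
    """Formula (7): add the same even key K1 modulo 4 to undo the scrambling."""
    return [(s + k1) % 4 for s in symbols]


def symbols_to_bits(symbols):
    """Formula (8): unpack each quaternary symbol into two bits (high bit first)."""
    bits = []
    for s in symbols:
        bits.extend([s >> 1, s & 1])
    return bits


beta = 3.0
k1 = 2
# Sent magnitudes were 36, 33, 39, 30 (levels 12, 11, 13, 10 times beta);
# the values below include channel noise smaller than beta / 2.
received = [36.9, 32.1, 40.2, 29.0]

encrypted = [extract_symbol(c, beta) for c in received]
plain = descramble(encrypted, k1)
bits = symbols_to_bits(plain)
```

Here `encrypted` is `[0, 3, 1, 2]`, `plain` is `[2, 1, 3, 0]` and `bits` is `[1, 0, 0, 1, 1, 1, 0, 0]`, despite the noise on every coefficient.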

Claims (5)

1. A camouflage communication method based on speech recognition, characterized in that the method comprises four main parts: generation of the secret-information code stream by speech recognition; serial-to-parallel conversion and generation of the ciphertext to be embedded using a random key; formant detection in voiced frames of the plaintext and random embedding of the ciphertext; and extraction of the secret information. The steps of the method are as follows:
1.) Watermark generation and embedding:
A. The user sends to the system the phrase command that is to be transmitted secretly,
B. The system first performs speech recognition based on DTW (dynamic time warping) and converts the command into the corresponding text; any error found is corrected. The text is then represented as a binary code stream,
C. Binary-to-quaternary serial-to-parallel conversion is performed according to the formula:
$$S' = s'(i), \quad 0 \le i < M/2, \quad s'(i) \in \{0,1,2,3\}$$
where $s'(i)$ is the value obtained after converting the binary code stream into a quaternary code stream, $S'$ is the unencrypted quaternary ciphertext code stream, and $M$ is the number of bits of the binary-coded command after speech recognition;
encryption is then performed according to the formula:
$$S' = (s'(i) + K_1) \bmod 4, \quad 0 \le i < M/2, \quad s'(i) \in \{0,1,2,3\}$$
to obtain the ciphertext, where $S'$ now denotes the encrypted quaternary ciphertext code stream, $K_1$ is the key and is an even number, and the other variables are as above,
D. The plaintext speech is divided into frames; each frame is classified as unvoiced or voiced according to its energy, and the voiced frames are identified,
E. A DFT (discrete Fourier transform) is applied to each voiced frame, and the first or second formant is selected according to the control factor $K_2$,
F. The watermark generated in step C is embedded according to the formula [equation image not reproduced in this text], where $C_k$ is the coefficient obtained from the DFT of the original voiced speech frame, $C'_k$ is the coefficient after the ciphertext is embedded, and $\beta$ is the embedding depth, i.e. the scale factor of the change in coefficient amplitude before and after embedding, determined by experiment; the $n$ in $4n + s'(k)$ takes the smallest value for which the inequality on the right-hand side of the formula just holds, $s'(k)$ is an encrypted quaternary ciphertext value, and the bracket operator in the formula denotes taking the integer part,
the other variables being as above;
G. An inverse DFT is applied to the watermarked plaintext to obtain the mixed speech, which is then transmitted;
2.) watermark extracting:
H. at first the mixing voice that receives is carried out the branch frame equally,
I. each frame carries out voiceless sound and voiced sound judgement according to energy equally, finds out unvoiced frame,
J. unvoiced frame is DFT, and selects first or second formant according to same controlling elements,
K. according to formula:
S &prime; &prime; = s &prime; &prime; ( i ) = ( round [ C k &prime; &beta; ] ) mod 4 , ( 0 &le; i , k < M / 2 )
Extract secret information, wherein round[] the expression round, C ' kBe to receive the coefficient that voice carry out discrete fourier transform after embedding ciphertext, s " (i) is the quaternary ciphertext code stream numerical value of the encryption that extracts; S " be the quaternary ciphertext code stream of the encryption that extracts, M is the bit number of the coded command of binary representation after the speech recognition;
1. according to formula:
S &prime; &prime; = ( s &prime; &prime; ( i ) + K 1 ) mod ( 4 ) , 0 &le; i < M 2 , s &prime; &prime; ( i ) &Element; { 0,1,2,3 } , K is an even number
Deciphering, K 1Being the encryption seed identical with transmit leg, being even number, S, " be the quaternary ciphertext code stream of having deciphered that extracts, other variable implication is the same.
M. according to the formula:
$$S = s(i), \quad 0 \le i < M, \quad s(i) \in \{0, 1\}$$
the quaternary-to-binary conversion is carried out to obtain the actual secret text, where $s(i)$ is a value of the extracted binary ciphertext code stream, in one-to-one correspondence with the text, $S$ is the extracted binary ciphertext code stream, which can be directly translated into or displayed as text, and M is the number of bits of the binary-coded command obtained from speech recognition;
N. the text is displayed on the screen;
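Steps K to N can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names are hypothetical, the coefficient values are made up for the example, each quaternary symbol is assumed to carry its high bit first, and the recovered bits are assumed to encode 8-bit ASCII characters, none of which is stated in the claims.

```python
def extract_symbol(c_embedded: float, beta: float) -> int:
    """Step K: s''(i) = round(C'_k / beta) mod 4, applied to a coefficient magnitude."""
    return round(c_embedded / beta) % 4


def decrypt(symbols: list[int], k1: int) -> list[int]:
    """Step L: add the shared even seed K1 modulo 4."""
    if k1 % 2 != 0:
        raise ValueError("K1 must be even")
    return [(s + k1) % 4 for s in symbols]


def quaternary_to_bits(symbols: list[int]) -> list[int]:
    """Step M: one quaternary symbol becomes two bits (assumed high bit first)."""
    bits: list[int] = []
    for s in symbols:
        bits.extend([s >> 1, s & 1])
    return bits


def bits_to_text(bits: list[int]) -> str:
    """Step N: assumed 8-bit ASCII; trailing bits that do not fill a byte are ignored."""
    chars = []
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        chars.append(chr(byte))
    return "".join(chars)


# Made-up received coefficient magnitudes, slightly perturbed by channel noise.
beta = 0.5
received = [7.52, 6.98, 6.51, 6.49, 7.51, 7.03, 5.97, 6.52]

encrypted = [extract_symbol(c, beta) for c in received]
plain_symbols = decrypt(encrypted, k1=2)
recovered = bits_to_text(quaternary_to_bits(plain_symbols))
print(recovered)
```

Because the symbol is read from the rounded quotient, a perturbation of the coefficient smaller than β/2 leaves the extracted value unchanged, which is why a larger β trades audibility for robustness.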
2. The camouflage communication method based on speech recognition according to claim 1, characterized in that, for the generation of the secret information code stream based on speech recognition, training on a relatively large amount of data is carried out in advance for the speaker; the person conveying the secret information speaks the command into the terminal's microphone in a quiet environment; after recognition by the DTW (dynamic time warping) speech recognition system, possible minor errors are corrected via the keyboard before encoding; the secret speech information S is then encoded into the secret code stream.
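The claim names DTW but does not describe the recognizer itself. The following is a minimal sketch of textbook dynamic time warping used for template matching, assuming one-dimensional feature sequences and an absolute-difference local cost; the template words and feature values are hypothetical.

```python
def dtw_distance(a: list[float], b: list[float]) -> float:
    """Classic DTW: cumulative cost of the cheapest monotonic alignment of a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed moves: stretch a, stretch b, or advance both sequences.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(features: list[float], templates: dict[str, list[float]]) -> str:
    """Return the template word whose features align most cheaply with the input."""
    return min(templates, key=lambda word: dtw_distance(features, templates[word]))


# Hypothetical speaker-trained templates and a slower utterance of the first word.
templates = {"advance": [1.0, 2.0, 3.0], "retreat": [3.0, 2.0, 1.0]}
spoken = [1.0, 2.0, 2.0, 3.0, 3.0]
word = recognize(spoken, templates)
print(word)
```

A real recognizer would compare multi-dimensional frame features rather than scalars, but the alignment recurrence is the same.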
3. The camouflage communication method based on speech recognition according to claim 1, characterized in that the serial-to-parallel conversion and random-key-based generation of the ciphertext to be embedded scrambles the above speech recognition result with a key and finally generates the ciphertext to be hidden: first, a quaternary serial-to-parallel conversion is applied to the recognition result sequence S to obtain the new ciphertext code stream S'; then a key $K_1$ is generated at random and used to encrypt and scramble S', producing the ciphertext to be hidden, where $K_1$ is even.
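A minimal sketch of this stage follows. The encryption formula is not given in this claim, so the code assumes the scrambling mirrors the decryption formula of claim 1, i.e. adding $K_1$ modulo 4; the function names and the high-bit-first pairing are likewise assumptions.

```python
import random


def bits_to_quaternary(bits: list[int]) -> list[int]:
    """Serial-to-parallel conversion: each pair of bits becomes one symbol in 0..3."""
    if len(bits) % 2 != 0:
        raise ValueError("bit stream length must be even")
    return [2 * bits[i] + bits[i + 1] for i in range(0, len(bits), 2)]


def random_even_key() -> int:
    """Draw a random even seed K1 (the key range is an arbitrary choice for this sketch)."""
    return random.randrange(0, 1 << 16, 2)


def scramble(symbols: list[int], k1: int) -> list[int]:
    """Assumed encryption: add K1 modulo 4, matching the receiver's decryption formula."""
    if k1 % 2 != 0:
        raise ValueError("K1 must be even")
    return [(s + k1) % 4 for s in symbols]


bits = [0, 1, 0, 0, 1, 1, 1, 1]
symbols = bits_to_quaternary(bits)
ciphertext = scramble(symbols, k1=2)
restored = scramble(ciphertext, k1=2)
print(symbols, ciphertext, restored)
```

Under this assumption the requirement that $K_1$ be even has a simple explanation: applying the operation twice adds $2K_1$, which is a multiple of 4, so the receiver recovers the original symbols with the same formula and the same seed.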
4. The camouflage communication method based on speech recognition according to claim 1, characterized in that, in the stage of plaintext voiced-frame formant detection and random embedding of ciphertext information, the plaintext speech is first divided into frames, and each frame is then classified as unvoiced or voiced; the voiced frames, which account for about 70%, are searched for the first and second formants; in accordance with the masking effect of the human ear, the first or second formant of the voiced frame is selected adaptively, i.e. a second random key $K_2$ controls which frequency point's coefficient is modified: if $K_2 = 0$, the frequency at the first formant is selected; otherwise the frequency at the second formant is selected for information embedding; according to the 3 dB bandwidth theory, the DFT coefficient closest to the first or second formant position is found and modified to hide the ciphertext information, i.e. S' is embedded into $C_k$; after the replacement, the inverse discrete Fourier transform (IDFT) is applied to the plaintext speech in which the ciphertext has been hidden, yielding the mixed speech, which is transmitted over a PSTN (public switched telephone network) channel.
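The embedding step can be sketched as follows. The exact embedding formula appears earlier in claim 1 and is not reproduced here, so this sketch assumes it has the form $C'_k = \beta(4n + s'(k))$ applied to the coefficient magnitude, with n chosen so that the new magnitude is as close as possible to the old one. The naive DFT, the function names, and the test signal are illustrative only.

```python
import cmath
import math


def dft(x: list[float]) -> list[complex]:
    """Naive O(N^2) DFT, adequate for a short illustrative frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum: list[complex]) -> list[float]:
    """Naive inverse DFT; the real part is returned because the spectrum is kept symmetric."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def select_bin(k2: int, first_formant_bin: int, second_formant_bin: int) -> int:
    """K2 = 0 selects the first formant's bin, any other value the second."""
    return first_formant_bin if k2 == 0 else second_formant_bin


def quantize_magnitude(mag: float, symbol: int, beta: float) -> float:
    """Assumed rule: nearest value of the form beta * (4n + symbol) with n >= 0."""
    n = max(0, round((mag / beta - symbol) / 4))
    return beta * (4 * n + symbol)


def embed_symbol(frame: list[float], k: int, symbol: int, beta: float) -> list[float]:
    """Hide one quaternary symbol in the magnitude of bin k (0 < k < N/2), keeping its phase."""
    spectrum = dft(frame)
    new_coeff = cmath.rect(quantize_magnitude(abs(spectrum[k]), symbol, beta),
                           cmath.phase(spectrum[k]))
    spectrum[k] = new_coeff
    spectrum[-k] = new_coeff.conjugate()  # mirror bin, so the output stays real
    return idft(spectrum)


# Synthetic "voiced" frame with spectral peaks at bins 3 and 7.
N = 32
frame = [math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 7 * t / N)
         for t in range(N)]
k = select_bin(0, first_formant_bin=3, second_formant_bin=7)
mixed = embed_symbol(frame, k, symbol=3, beta=0.5)
extracted = round(abs(dft(mixed)[k]) / 0.5) % 4
print(extracted)
```

Modifying only the magnitude and mirroring the change into the conjugate bin keeps the time-domain frame real, and choosing the nearest admissible n keeps the change in the coefficient small.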
5. The camouflage communication method based on speech recognition according to claim 1, characterized in that, for secret information extraction, the received mixed speech is first divided into frames with the same frame length as at the embedding end; a voiced/unvoiced decision is then made, an N-point DFT is applied to each voiced frame, and the frequency point corresponding to the first or second formant of each voiced frame is found, using the same search method as in embedding; the coefficient found is then processed to extract the secret information, which is decrypted by reversing the key scrambling; finally, a parallel-to-serial conversion yields the originally transmitted speech code stream, and the secret information sent by the sender is obtained on the receiving terminal's screen.
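The front end of the receiver (framing, voicing decision, locating the frequency point) might look like the sketch below. The claims do not give the frame length, the energy threshold, or the formant search procedure, so the non-overlapping frames, the mean-energy threshold, and the use of the strongest bin in a band as a crude stand-in for a formant are all assumptions of this example.

```python
import cmath
import math


def split_frames(samples: list[float], frame_len: int) -> list[list[float]]:
    """Cut the signal into non-overlapping frames of equal length, dropping any remainder."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def is_voiced(frame: list[float], threshold: float) -> bool:
    """Energy-based decision: a frame counts as voiced if its mean energy exceeds the threshold."""
    return sum(x * x for x in frame) / len(frame) > threshold


def strongest_bin(frame: list[float], lo: int, hi: int) -> int:
    """Return the DFT bin in [lo, hi) with the largest magnitude (stand-in for a formant search)."""
    n = len(frame)

    def magnitude(k: int) -> float:
        return abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))

    return max(range(lo, hi), key=magnitude)


# One loud tonal frame followed by one very quiet frame.
N = 32
loud = [math.sin(2 * math.pi * 3 * t / N) for t in range(N)]
quiet = [0.01 * math.sin(2 * math.pi * 5 * t / N) for t in range(N)]

frames = split_frames(loud + quiet, N)
voiced_flags = [is_voiced(f, threshold=0.01) for f in frames]
peak = strongest_bin(frames[0], 1, N // 2)
print(voiced_flags, peak)
```

Whatever search is actually used, sender and receiver must run the same procedure with the same frame length so that both select the same coefficient in every voiced frame.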
CNB2006100855931A 2006-06-26 2006-06-26 Camouflage communication method based on speech recognition Expired - Fee Related CN100550723C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100855931A CN100550723C (en) 2006-06-26 2006-06-26 Camouflage communication method based on speech recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006100855931A CN100550723C (en) 2006-06-26 2006-06-26 Camouflage communication method based on speech recognition

Publications (2)

Publication Number Publication Date
CN1901442A CN1901442A (en) 2007-01-24
CN100550723C CN100550723C (en) 2009-10-14

Family

ID=37657199

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100855931A Expired - Fee Related CN100550723C (en) 2006-06-26 2006-06-26 Camouflage communication method based on speech recognition

Country Status (1)

Country Link
CN (1) CN100550723C (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101577619A (en) * 2008-05-08 2009-11-11 吴志军 Real-time speech secret communication system based on information hiding
CN101577605A (en) * 2008-05-08 2009-11-11 吴志军 Speech LPC hiding and extraction algorithm based on filter similarity
CN102916803A (en) * 2012-10-30 2013-02-06 山东省计算中心 File implicit transfer method based on public switched telephone network
CN104852799A (en) * 2015-05-12 2015-08-19 陕西师范大学 Digital audio camouflage and reconstruction method based on segmented sequences
CN106875954A (en) * 2017-03-27 2017-06-20 中国农业大学 The speech hiding circuit structure and its control method of a kind of anti vocoder treatment
CN106910508A (en) * 2017-01-23 2017-06-30 哈尔滨工程大学 A kind of hidden underwater acoustic communication method of imitative ocean piling sound source
CN107547196A (en) * 2017-08-29 2018-01-05 中国民航大学 Speech hiding algorithm based on parameters revision
CN108962239A (en) * 2018-06-08 2018-12-07 四川斐讯信息技术有限公司 A kind of quick distribution method and system based on voice masking
CN109119086A (en) * 2017-06-24 2019-01-01 天津大学 A kind of breakable watermark voice self-restoring technology of multilayer least significant bit
CN110176235A (en) * 2019-05-23 2019-08-27 腾讯科技(深圳)有限公司 Methods of exhibiting, device, storage medium and the computer equipment of speech recognition text
CN110739984A (en) * 2019-11-08 2020-01-31 江苏科技大学 camouflage communication method based on wavelet analysis
CN111669394A (en) * 2020-06-04 2020-09-15 西安空间无线电技术研究所 Method for hiding and transmitting image and voice information of satellite communication

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101577605A (en) * 2008-05-08 2009-11-11 吴志军 Speech LPC hiding and extraction algorithm based on filter similarity
CN101577619B (en) * 2008-05-08 2013-05-01 吴志军 Real-time speech secret communication system based on information hiding
CN101577605B (en) * 2008-05-08 2014-06-18 吴志军 Speech LPC hiding and extraction algorithm based on filter similarity
CN101577619A (en) * 2008-05-08 2009-11-11 吴志军 Real-time speech secret communication system based on information hiding
CN102916803A (en) * 2012-10-30 2013-02-06 山东省计算中心 File implicit transfer method based on public switched telephone network
CN102916803B (en) * 2012-10-30 2015-06-10 山东省计算中心 File implicit transfer method based on public switched telephone network
CN104852799B (en) * 2015-05-12 2017-12-29 陕西师范大学 DAB camouflage and reconstructing method based on fragment sequence
CN104852799A (en) * 2015-05-12 2015-08-19 陕西师范大学 Digital audio camouflage and reconstruction method based on segmented sequences
CN106910508B (en) * 2017-01-23 2020-04-03 哈尔滨工程大学 Hidden underwater acoustic communication method for imitating marine pile driving sound source
CN106910508A (en) * 2017-01-23 2017-06-30 哈尔滨工程大学 A kind of hidden underwater acoustic communication method of imitative ocean piling sound source
CN106875954A (en) * 2017-03-27 2017-06-20 中国农业大学 The speech hiding circuit structure and its control method of a kind of anti vocoder treatment
CN109119086A (en) * 2017-06-24 2019-01-01 天津大学 A kind of breakable watermark voice self-restoring technology of multilayer least significant bit
CN107547196A (en) * 2017-08-29 2018-01-05 中国民航大学 Speech hiding algorithm based on parameters revision
CN108962239A (en) * 2018-06-08 2018-12-07 四川斐讯信息技术有限公司 A kind of quick distribution method and system based on voice masking
CN110176235A (en) * 2019-05-23 2019-08-27 腾讯科技(深圳)有限公司 Methods of exhibiting, device, storage medium and the computer equipment of speech recognition text
CN110176235B (en) * 2019-05-23 2022-02-01 腾讯科技(深圳)有限公司 Method and device for displaying voice recognition text, storage medium and computer equipment
CN110739984A (en) * 2019-11-08 2020-01-31 江苏科技大学 camouflage communication method based on wavelet analysis
CN110739984B (en) * 2019-11-08 2021-07-02 江苏科技大学 Camouflage communication method based on wavelet analysis
CN111669394A (en) * 2020-06-04 2020-09-15 西安空间无线电技术研究所 Method for hiding and transmitting image and voice information of satellite communication
CN111669394B (en) * 2020-06-04 2022-03-04 西安空间无线电技术研究所 Method for hiding and transmitting image and voice information of satellite communication

Also Published As

Publication number Publication date
CN100550723C (en) 2009-10-14

Similar Documents

Publication Publication Date Title
CN1901442A (en) Camouflage communication method based on voice identification
Djebbar et al. A view on latest audio steganography techniques
CN1279512C (en) Methods for improving high frequency reconstruction
Huang et al. Steganography integration into a low-bit rate speech codec
CN103430234B (en) Voice transformation with encoded information
KR100595202B1 (en) Apparatus of inserting/detecting watermark in Digital Audio and Method of the same
CN1290290C (en) Method and device for computerized voice data hidden
Nematollahi et al. An overview of digital speech watermarking
Sarreshtedari et al. A watermarking method for digital speech self-recovery
CN107886962B (en) High-security steganography method for IP voice
Shirali-Shahreza et al. High capacity error free wavelet domain speech steganography
EP2787503A1 (en) Method and system of audio signal watermarking
Kheddar et al. High capacity speech steganography for the G723. 1 coder based on quantised line spectral pairs interpolation and CNN auto-encoding
CN111246469B (en) Artificial intelligence secret communication system and communication method
Kreuk et al. Hide and speak: Deep neural networks for speech steganography
Kanhe et al. DCT based audio steganography in voiced and un-voiced frames
CN107689226A (en) A kind of low capacity Methods of Speech Information Hiding based on iLBC codings
Hu et al. Incorporating spectral shaping filtering into DWT-based vector modulation to improve blind audio watermarking
Wu Information hiding in speech signals for secure communication
Datta et al. Robust multi layer audio steganography
Dastoor Comparative analysis of Steganographic algorithms intacting the information in the speech signal for enhancing the message security in next generation mobile devices
Singh A survey on audio steganography approaches
Liu et al. Adaptive audio steganography scheme based on wavelet packet energy
Amiri et al. DWT-GBT-SVD-based robust speech steganography
Wu et al. Comparison of two speech content authentication approaches

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Assignee: Nanjing 6902 Technology Co., Ltd.

Assignor: Nanjing Post & Telecommunication Univ.

Contract record no.: 2011320000147

Denomination of invention: Camouflage communication method based on voice identification

Granted publication date: 20091014

License type: Exclusive License

Open date: 20070124

Record date: 20110303

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20091014

Termination date: 20180626