CN115514603A - Voice-like modulation and demodulation method and device - Google Patents

Info

Publication number
CN115514603A
Authority
CN
China
Prior art keywords
user terminal
demodulation
voice
voice data
modulation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210956470.XA
Other languages
Chinese (zh)
Inventor
俞佳宝
丁艳军
胡爱群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Network Communication and Security Zijinshan Laboratory
Original Assignee
Network Communication and Security Zijinshan Laboratory
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Network Communication and Security Zijinshan Laboratory
Priority to CN202210956470.XA
Publication of CN115514603A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L27/00 Modulated-carrier systems
    • H04L27/18 Phase-modulated carrier systems, i.e. using phase-shift keying
    • H04L27/20 Modulator circuits; Transmitter circuits
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L27/00 Modulated-carrier systems
    • H04L27/18 Phase-modulated carrier systems, i.e. using phase-shift keying
    • H04L27/22 Demodulator circuits; Receiver circuits
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L5/00 Arrangements affording multiple use of the transmission path
    • H04L5/14 Two-way operation using the same type of signal, i.e. duplex
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/04 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks
    • H04L63/0428 Network architectures or network communication protocols for network security for providing a confidential data exchange among entities communicating through data packet networks wherein the data content is protected, e.g. by encrypting or encapsulating the payload

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Digital Transmission Methods That Use Modulated Carrier Waves (AREA)

Abstract

The invention provides a voice-like modulation and demodulation method and device, relating to the technical field of voice communication. The method comprises: controlling a first user terminal and a second user terminal to each perform differential phase modulation on an encrypted voice signal to be sent, based on different modulation frequency points, and to transmit the modulated voice signal to the opposite user terminal after time domain conversion; when it is determined that the second user terminal has received the first type of voice data symbols sent by the first user terminal, controlling the second user terminal to perform differential phase demodulation on the first type of voice data symbols; and when it is determined that the first user terminal has received the second type of voice data symbols sent by the second user terminal, controlling the first user terminal to perform differential phase demodulation on the second type of voice data symbols. By performing differential phase modulation on the encrypted voice signals, the invention generates voice-like signals with time-frequency characteristics, and it realizes full-duplex voice communication through modulation and demodulation on interleaved frequency points.

Description

Voice-like modulation and demodulation method and device
Technical Field
The present invention relates to the field of voice communication technologies, and in particular to a voice-like modulation and demodulation method and device.
Background
In a conventional voice transmission system, a vocoder is generally used to compression-code the voice signal in order to save transmission channel bandwidth. After digital encryption, the compression-coded voice signal no longer has voice characteristics, so a vocoder cannot extract the basic features of a voice signal from it and may treat it as noise. The encrypted voice signal therefore needs to be converted into a voice-like signal with voice characteristics so that encrypted transmission over a voice channel becomes possible.
Existing voice-like algorithms perform well in simulation, but in a real environment hardware deviation and the voice enhancement processing of communication software strongly affect voice-like signals, resulting in a high bit error rate. Moreover, existing voice communication software usually performs voice detection on the calls of both parties, and if the voices of the two parties are similar, one of them is suppressed, so that only half-duplex communication can be realized. In addition, the voice enhancement techniques used by existing voice communication software strongly affect the amplitude of the voice signal, causing large amplitude distortion, but have only a small effect on its phase.
In view of the above, a voice-like modulation and demodulation scheme based on phase modulation is needed that converts an encrypted voice signal into a voice-like signal with certain time-frequency characteristics, so that full-duplex communication can be carried out over a real voice channel.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a voice-like modulation and demodulation method and device.
In a first aspect, the present invention provides a speech-like modulation and demodulation method, including:
determining a first encrypted voice signal to be sent by a first user terminal and a second encrypted voice signal to be sent by a second user terminal;
controlling the first user terminal to perform differential phase modulation on the first encrypted voice signal based on a first modulation frequency point and a first modulation order to obtain a first frequency domain voice signal, performing time domain conversion on the first frequency domain voice signal to obtain a first type of voice data symbols, and transmitting the first type of voice data symbols to a second user terminal;
controlling the second user terminal to perform differential phase modulation on the second encrypted voice signal based on a second modulation frequency point and a first modulation order to obtain a second frequency domain voice signal, performing time domain conversion on the second frequency domain voice signal to obtain a second type of voice data symbols, and transmitting the second type of voice data symbols to the first user terminal;
under the condition that the second user terminal is determined to receive the first type of voice data symbols, controlling the second user terminal to carry out differential phase demodulation on the first type of voice data symbols based on a first demodulation frequency point and a first demodulation order so as to obtain first demodulated voice data;
under the condition that the first user terminal is determined to receive the second type of voice data symbols, controlling the first user terminal to carry out differential phase demodulation on the second type of voice data symbols based on a second demodulation frequency point and a first demodulation order so as to obtain second demodulated voice data;
the first modulation frequency point is different from the second modulation frequency point, the first modulation frequency point is the same as the first demodulation frequency point, the second modulation frequency point is the same as the second demodulation frequency point, and the first modulation order is the same as the first demodulation order.
Optionally, according to a speech-like modulation and demodulation method provided by the present invention, transmitting the first type of speech data symbols to a second user terminal includes:
controlling the first user terminal to perform differential phase modulation on first control information required by key synchronization based on the first modulation frequency point and the second modulation order, and performing time domain conversion on a first frequency domain control signal obtained after modulation to obtain a first time domain control signal;
forming a first preamble symbol by a first synchronization signal of the first user terminal and the first time domain control signal;
forming a first type voice signal frame from a first blank symbol, the first preamble symbol and the first type of voice data symbols;
and controlling the first user terminal to transmit the first type of voice signal frame to a second user terminal.
Optionally, according to a voice-like modulation and demodulation method provided by the present invention, controlling the second user terminal to perform differential phase demodulation on the first type of voice data symbols based on the first demodulation frequency point and the first demodulation order, so as to obtain first demodulated voice data, includes:
controlling the second user terminal to perform frame synchronization on the first type voice signal frame based on the first blank symbol and the first preamble symbol in the received first type voice signal frame, determining a first starting position of the first preamble symbol, and determining the first type of voice data symbols in the first type voice signal frame based on the first starting position;
and controlling the second user terminal to perform differential phase demodulation on the first type of voice data symbols based on the first demodulation frequency point and the first demodulation order so as to obtain the first demodulated voice data.
Optionally, after determining the first starting position of the first preamble symbol, the voice-like modulation and demodulation method further includes:
controlling the second user terminal to determine a first time domain control signal in the first type of voice signal frame based on the first starting position;
and controlling the second user terminal to carry out differential phase demodulation on the first time domain control signal based on a first demodulation frequency point and a second demodulation order so as to obtain first control information, and carrying out decryption and compression decoding on the first demodulated voice data based on the first control information so as to obtain first original voice data.
Optionally, according to a speech-like modulation and demodulation method provided by the present invention, transmitting the second-type speech data symbols to the first user terminal includes:
controlling the second user terminal to perform differential phase modulation on second control information required by key synchronization based on the second modulation frequency point and the second modulation order, and performing time domain conversion on a second frequency domain control signal obtained after modulation to obtain a second time domain control signal;
forming a second preamble symbol by a second synchronization signal of the second user terminal and the second time domain control signal;
forming a second type voice signal frame from a second blank symbol, the second preamble symbol and the second type of voice data symbols;
and controlling the second user terminal to transmit the second type of voice signal frame to the first user terminal.
Optionally, according to a voice-like modulation and demodulation method provided by the present invention, controlling the first user terminal to perform differential phase demodulation on the second type of voice data symbols based on the second demodulation frequency point and the first demodulation order, so as to obtain second demodulated voice data, includes:
controlling the first user terminal to perform frame synchronization on the second type voice signal frame based on the second blank symbol and the second preamble symbol in the received second type voice signal frame, determining a second starting position of the second preamble symbol, and determining the second type of voice data symbols in the second type voice signal frame based on the second starting position;
and controlling the first user terminal to perform differential phase demodulation on the second type of voice data symbols based on the second demodulation frequency point and the first demodulation order so as to obtain the second demodulated voice data.
Optionally, after determining the second starting position of the second preamble symbol, the voice-like modulation and demodulation method further includes:
controlling the first user terminal to determine a second time domain control signal in the second type voice signal frame based on the second starting position;
and controlling the first user terminal to perform differential phase demodulation on the second time domain control signal based on a second demodulation frequency point and a second demodulation order to acquire second control information, and performing decryption and compression decoding on the second demodulated voice data based on the second control information to acquire second original voice data.
Optionally, according to the speech-like modulation and demodulation method provided by the present invention, the second modulation order is smaller than or equal to the first modulation order.
In a second aspect, the present invention further provides a speech-like modulation/demodulation apparatus, comprising:
the determining module is used for determining a first encrypted voice signal to be sent by a first user terminal and a second encrypted voice signal to be sent by a second user terminal;
the first control module is used for controlling the first user terminal to perform differential phase modulation on the first encrypted voice signal based on a first modulation frequency point and a first modulation order, so as to obtain a first frequency domain voice signal, perform time domain conversion on the first frequency domain voice signal, obtain a first type of voice data symbol, and transmit the first type of voice data symbol to the second user terminal;
the second control module is used for controlling the second user terminal to perform differential phase modulation on the second encrypted voice signal based on a second modulation frequency point and a first modulation order, so as to obtain a second frequency domain voice signal, perform time domain conversion on the second frequency domain voice signal, obtain a second type of voice data symbol, and transmit the second type of voice data symbol to the first user terminal;
the third control module is configured to, when it is determined that the second user terminal receives the first type of voice data symbols, control the second user terminal to perform differential phase demodulation on the first type of voice data symbols based on a first demodulation frequency point and a first demodulation order, so as to obtain first demodulated voice data;
the fourth control module is configured to control the first user terminal to perform differential phase demodulation on the second type voice data symbol based on a second demodulation frequency point and a first demodulation order to obtain second demodulated voice data when it is determined that the first user terminal receives the second type voice data symbol;
the first modulation frequency point is different from the second modulation frequency point, the first modulation frequency point is the same as the first demodulation frequency point, the second modulation frequency point is the same as the second demodulation frequency point, and the first modulation order is the same as the first demodulation order.
In a third aspect, the present invention also provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the speech-like modulation and demodulation method according to the first aspect when executing the computer program.
In a fourth aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the speech-like modulation and demodulation method according to the first aspect.
In a fifth aspect, the present invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the speech-like modulation and demodulation method according to the first aspect.
According to the voice-like modulation and demodulation method and device provided by the invention, the first user terminal and the second user terminal are controlled to perform differential phase modulation on their encrypted voice signals based on different modulation frequency points, generating voice-like signals with time-frequency characteristics. Each terminal transmits its voice-like signal to the opposite terminal, which then performs differential phase demodulation. This effectively avoids the influence of hardware deviation and of the voice enhancement processing at the communication terminals, and the interleaved frequency points used for modulation and demodulation make full-duplex voice communication possible.
Drawings
In order to more clearly illustrate the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of a speech-like modulation and demodulation method provided by the present invention;
FIG. 2 is a schematic flow chart of differential phase modulation provided by the present invention;
FIG. 3 is a schematic phase diagram of a modulated speech-like signal of a first user terminal according to the present invention;
FIG. 4 is a schematic phase diagram of a modulated speech-like signal of a second user terminal according to the present invention;
FIG. 5 is a schematic phase diagram of a full-duplex aliased speech-like signal provided by the present invention;
FIG. 6 is a schematic flow chart of differential phase demodulation provided by the present invention;
FIG. 7 is a schematic diagram of a time domain waveform of a synchronization signal provided by the present invention;
FIG. 8 is a schematic diagram of the structure of a 400 ms speech-like signal frame provided by the present invention;
FIG. 9 is a schematic diagram illustrating a synchronization process of a speech-like signal frame provided by the present invention;
FIG. 10 is a schematic diagram of the coarse synchronization result based on energy detection provided by the present invention;
FIG. 11 is a schematic diagram of the fine synchronization result based on cross-correlation provided by the present invention;
FIG. 12 is a schematic diagram of the operation of a full duplex voice-like system provided by the present invention;
FIG. 13 is a schematic structural diagram of a speech-like modem apparatus provided in the present invention;
FIG. 14 is a schematic physical structure diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
To facilitate a clearer understanding of embodiments of the present invention, some relevant background information is first presented below.
Voice communication is widely used in daily life, but eavesdropping on voice communication sometimes occurs. For example, once a trojan is installed on a mobile phone, the user's calls can easily be eavesdropped over the wireless network. If the voice is encrypted before transmission, third-party eavesdropping can be prevented in subsequent voice processing and voice transmission.
In a conventional voice transmission system, a vocoder is used to compression-code the voice in order to save transmission channel bandwidth. Most vocoders process and model speech with a hybrid parameter coding algorithm and extract the model parameters for transmission. After digital encryption, the compression-coded voice signal no longer has voice characteristics; when it passes through a vocoder, the basic features of a voice signal cannot be extracted from it, or it is treated as noise, so it is difficult to transmit effectively over a voice channel. The encrypted voice signal therefore needs to be processed further and converted into a voice-like signal with voice characteristics so that encrypted transmission over the voice channel can be realized.
Existing voice-like algorithms perform well in simulation, but in a real environment hardware deviation and the voice enhancement processing of communication software strongly affect voice-like signals, so these algorithms work poorly and have a high bit error rate in various voice communication software. For example, because the crystal oscillators of the sound cards of the transmitting and receiving parties differ in accuracy, a frequency offset arises during transmission of the voice-like modulated data, which causes errors when the voice-like data is demodulated. In addition, existing voice communication software usually performs voice detection on the calls of both parties, and if the voices of both parties are similar, one of them is treated as an echo and suppressed, so that only half-duplex communication can be realized. Furthermore, the voice enhancement techniques used by existing voice communication software strongly affect the amplitude of the voice signal, causing large amplitude distortion, but have only a small effect on its phase.
In order to overcome the above-mentioned drawbacks, the present invention provides a speech-like modulation and demodulation method and apparatus. The following describes the speech-like modulation and demodulation method and apparatus provided by the present invention with reference to fig. 1-14.
Fig. 1 is a schematic flow chart of a speech-like modulation and demodulation method provided by the present invention. As shown in Fig. 1, the method includes:
Step 100, determining a first encrypted voice signal to be sent by a first user terminal and a second encrypted voice signal to be sent by a second user terminal.
Specifically, the first user terminal and the second user terminal may each use their own sound card to collect voice data of a preset duration ($t$ seconds) at a preset sampling rate $f_s$, perform voice activity detection on the collected voice data to identify valid voice data, compression-code each segment of valid voice data according to the voice transmission code rate ($r$) to obtain $m$ bits of voice data, and then encrypt the result. This yields the first encrypted voice signal to be sent by the first user terminal and the second encrypted voice signal to be sent by the second user terminal.
For example, the first user terminal and the second user terminal each use their own USB sound card to acquire 20 ms of voice data at a sampling rate of 8 kHz, that is, 160 sampling points (the acquisition duration can be set according to the length requirement of the subsequently selected compression coding algorithm). They then use the voice activity detection algorithm in WebRTC (Web Real-Time Communications) to perform voice activity detection on the acquired voice data. If the voice data is valid, the frame is stored for subsequent processing; otherwise an all-zero blank signal is output.
Further, the 20 ms of valid voice data obtained above may be compression-coded with codec2 at a voice transmission code rate of $r = 2.4$ kbps to obtain $m = r \times t = 48$ bits of voice data, where the compression rate of the compression coding is $160 \div 48 \approx 3.33$ (160 sampling points per 48 coded bits). The $m$ bits of voice data are then encrypted with the ZUC-256 (Zu Chongzhi cipher, 256-bit) stream encryption algorithm to obtain the first encrypted voice signal to be sent by the first user terminal and the second encrypted voice signal to be sent by the second user terminal.
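The frame-size arithmetic in this example can be checked with a short sketch. It only reproduces the numbers stated above (8 kHz, 20 ms, 2.4 kbps); the function and constant names are hypothetical, and no actual voice activity detection, codec2 coding, or ZUC-256 encryption is performed.

```python
# Frame-size arithmetic for the example above; all names are illustrative.
SAMPLE_RATE_HZ = 8_000      # f_s, sampling rate from the example
FRAME_DURATION_MS = 20      # t, acquisition duration from the example
BIT_RATE_BPS = 2_400        # r, voice transmission code rate from the example


def samples_per_frame(sample_rate_hz: int, duration_ms: int) -> int:
    """Number of sampling points collected in one frame."""
    return sample_rate_hz * duration_ms // 1000


def bits_per_frame(bit_rate_bps: int, duration_ms: int) -> int:
    """Number of coded bits m = r * t produced for one frame."""
    return bit_rate_bps * duration_ms // 1000


n_samples = samples_per_frame(SAMPLE_RATE_HZ, FRAME_DURATION_MS)
m_bits = bits_per_frame(BIT_RATE_BPS, FRAME_DURATION_MS)
ratio = n_samples / m_bits  # sampling points per coded bit

print(n_samples, m_bits, round(ratio, 2))
```

Because a stream cipher leaves the length of the data unchanged, the encrypted frame would still be 48 bits under these assumptions.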
Step 110, controlling the first user terminal to perform differential phase modulation on the first encrypted voice signal based on a first modulation frequency point and a first modulation order to obtain a first frequency domain voice signal, performing time domain conversion on the first frequency domain voice signal to obtain a first type of voice data symbol, and transmitting the first type of voice data symbol to a second user terminal.
Specifically, the first user terminal may be controlled to perform differential phase modulation on the first encrypted voice signal determined in step 100 based on the first modulation frequency point and the first modulation order, to obtain a first frequency domain voice signal, perform time domain conversion on the first frequency domain voice signal, and transmit the obtained first type of voice data symbol to the second user terminal.
Step 120, controlling the second user terminal to perform differential phase modulation on the second encrypted voice signal based on a second modulation frequency point and a first modulation order to obtain a second frequency domain voice signal, performing time domain conversion on the second frequency domain voice signal to obtain a second type of voice data symbols, and transmitting the second type of voice data symbols to the first user terminal.
Specifically, the second user terminal may be controlled to perform differential phase modulation on the second encrypted voice signal determined in step 100 based on the second modulation frequency point and the first modulation order, to obtain a second frequency domain voice signal, perform time domain conversion on the second frequency domain voice signal, and transmit the obtained second type voice data symbol to the first user terminal.
Optionally, a first modulation frequency point selected when the first user terminal performs differential phase modulation on the first encrypted voice signal is different from a second modulation frequency point selected when the second user terminal performs differential phase modulation on the second encrypted voice signal, so that a frequency point staggered modulation mode can be implemented.
For example, the first user terminal and the second user terminal may each perform $2^k$-order differential phase modulation on the encrypted voice data in the frequency domain. The modulation frequency points lie in the range 300 Hz to 3400 Hz, the pitch frequency is $\Delta f$, and each terminal selects $N_f + 1$ frequency points for modulation. For example, if the initial frequency point selected by the first user terminal is $f_a$ and the initial frequency point selected by the second user terminal is $f_b = f_a + \Delta f$, then the first user terminal uses the frequency points $f_a + i \times 2 \times \Delta f$ ($0 \le i \le N_f$) and the second user terminal uses the frequency points $f_a + (i \times 2 + 1) \times \Delta f$ ($0 \le i \le N_f$). The $2 \times (N_f + 1)$ frequency points of the two terminals are interleaved, so the voice-like signals can be effectively separated even after being mixed together, which achieves the effect of full-duplex communication.
Optionally, the first user terminal and the second user terminal each select $N_f + 1$ frequency points for $2^k$-order differential phase modulation. Every two adjacent frequency points, spaced $2 \times \Delta f$ apart, form a group, giving $N_f$ groups per terminal. Each $k$ bits of data are mapped to one of $2^k$ selectable differential phase values, which is used as the phase difference between the adjacent frequency points of a group, and the initial frequency points $f_a$ and $f_b$ are modulated with a fixed phase.
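The frequency-point plan described above can be sketched as follows. The helper names are hypothetical, and the numeric values (an initial point of 300 Hz, a pitch frequency of 100 Hz, 15 groups, 2 bits per group) are illustrative assumptions chosen to stay inside the 300 Hz to 3400 Hz range.

```python
def frequency_points(f_start, delta_f, n_f):
    """Return the N_f + 1 points f_start + i * 2 * delta_f for 0 <= i <= N_f."""
    return [f_start + i * 2 * delta_f for i in range(n_f + 1)]


def adjacent_groups(points):
    """Pair every two adjacent points (spaced 2 * delta_f apart) into a group."""
    return list(zip(points[:-1], points[1:]))


# Illustrative values, assumed for this sketch only.
f_a, delta_f, n_f, k = 300, 100, 15, 2
f_b = f_a + delta_f                               # second terminal is offset by delta_f

points_a = frequency_points(f_a, delta_f, n_f)    # first user terminal
points_b = frequency_points(f_b, delta_f, n_f)    # second user terminal
groups_a = adjacent_groups(points_a)              # N_f groups for the first terminal
bits_per_symbol = k * n_f                         # k bits per group, N_f groups

print(points_a[:3], points_b[:3], len(groups_a), bits_per_symbol)
```

Since $f_b + i \times 2 \times \Delta f = f_a + (i \times 2 + 1) \times \Delta f$, the second list is exactly the set of odd multiples of $\Delta f$ above $f_a$, which is why the two sets never share a frequency point.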
After the frequency domain differential phase modulation is completed, the first user terminal and the second user terminal may respectively convert the modulated frequency domain signal into a time domain through Inverse Fast Fourier Transform (IFFT) to obtain a time domain data symbol modulated like speech.
Fig. 2 is a schematic flow chart of the differential phase modulation provided by the present invention. As shown in Fig. 2, the first user terminal and the second user terminal may each perform 4-order differential phase modulation (DQPSK) in the frequency domain, each selecting 16 frequency points for modulation within the range 300 Hz to 3400 Hz. For example, the initial frequency point selected by the first user terminal is 300 Hz, the initial frequency point selected by the second user terminal is 400 Hz, and both sides use the same pitch frequency (minimum frequency spacing) of 100 Hz. The first user terminal therefore uses the frequency points 300 + 200 × i Hz (0 ≤ i ≤ 15) and the second user terminal uses the frequency points 400 + 200 × i Hz (0 ≤ i ≤ 15), so that the 32 selected frequency points are interleaved and the voice-like signals can be effectively separated even when mixed together, which achieves the effect of full-duplex communication.
In this embodiment of the present invention, the first user terminal and the second user terminal modulate their initial frequency points, 300 Hz and 400 Hz respectively, with a fixed phase of 0. Every two adjacent frequency points spaced 200 Hz apart form a group, giving 15 groups per terminal: the first user terminal modulates on the 15 groups {(300, 500), (500, 700), ..., (3100, 3300)}, and the second user terminal modulates on the 15 groups {(400, 600), (600, 800), ..., (3200, 3400)}. Each group of frequency points carries 2 bits of data. Gray-coded mapping is used to reduce the bit error rate, and the mapping between the modulation bits of the encrypted voice signal and the differential phase of adjacent frequency points is shown in Table 1:
Table 1. Mapping between modulation bits of the encrypted voice signal and the differential phase of adjacent frequency points
2-bit data               | 00 | 10  | 11 | 01
DQPSK differential phase | 0  | π/2 | π  | -π/2
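As a minimal sketch of how Table 1 could be applied, the following Python snippet turns a bit string into one absolute phase per frequency point. The names `GRAY_DQPSK` and `dqpsk_phases` are hypothetical, and the sign convention (the phase of the higher frequency point minus the phase of the lower one) is an assumption, since the text does not state it. The fixed initial phase of 0 comes from the text.

```python
import math

# Table 1: Gray-coded bit pair -> differential phase between adjacent frequency points.
GRAY_DQPSK = {"00": 0.0, "10": math.pi / 2, "11": math.pi, "01": -math.pi / 2}

def dqpsk_phases(bits, initial_phase=0.0):
    """Map an even-length bit string to absolute phases, one per frequency point.

    Assumption: differential phase = phase(higher point) - phase(lower point).
    """
    if len(bits) % 2:
        raise ValueError("DQPSK needs an even number of bits")
    phases = [initial_phase]
    for i in range(0, len(bits), 2):
        delta = GRAY_DQPSK[bits[i:i + 2]]
        phases.append((phases[-1] + delta) % (2 * math.pi))
    return phases

# 30 bits give 15 differential phases, i.e. 16 absolute phases for 16 frequency points.
symbol_phases = dqpsk_phases("01" * 15)
```

Adjacent entries of Table 1 differ in exactly one bit, so a decision error between neighbouring phases corrupts only one of the two bits.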
Fig. 3 is a schematic phase diagram of the modulated voice signal of the first user terminal provided by the present invention, and Fig. 4 is a schematic phase diagram of the modulated voice signal of the second user terminal provided by the present invention. Fig. 3 and Fig. 4 show the phases obtained after the first user terminal and the second user terminal, respectively, perform DQPSK modulation on the 30-bit data { 10 110 0 10 10 100 1 }. When the first user terminal and the second user terminal are in full-duplex communication, the transmitted signals may mix together. Fig. 5 is a schematic phase diagram of the mixed full-duplex voice-like signal provided by the present invention, in which the solid line represents the phase of the first user terminal's signal and the dotted line represents the phase of the second user terminal's signal. As Fig. 5 shows, the frequency points occupied by the signals of the two terminals are staggered with each other and do not interfere.
In the embodiment of the present invention, after the 30-bit encrypted data is subjected to frequency domain differential phase modulation, the modulated frequency domain signal may be converted into the time domain through inverse fast Fourier transform to obtain a voice-like data symbol with a length of 1/(100 Hz) = 10 ms. Thus, a 20 ms voice-like data symbol can carry 60 bits of encrypted data.
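The claim that the mixed signals do not interfere can be checked numerically. The sketch below is an illustration under stated assumptions, not the patent's code: it assumes an 8 kHz sampling rate (the value used in the text's synchronization example), unit-amplitude tones, and arbitrary test phases. All function and variable names are hypothetical. Each terminal's 10 ms symbol is synthesized as a sum of cosines, the two symbols are added, and the phase at every frequency point is read back from the mixture.

```python
import cmath
import math

FS = 8000   # sampling rate in Hz (assumed)
N = 80      # samples in one 10 ms symbol at 8 kHz, so DFT bins are 100 Hz apart

def synthesize(freqs, phases):
    """Sum one unit-amplitude cosine per frequency point over a 10 ms symbol."""
    return [sum(math.cos(2 * math.pi * f * n / FS + p) for f, p in zip(freqs, phases))
            for n in range(N)]

def phase_at(signal, freq):
    """Phase of the single DFT bin located at `freq`."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / FS) for n, x in enumerate(signal))
    return cmath.phase(acc)

first_freqs = [300 + 200 * i for i in range(16)]
second_freqs = [400 + 200 * i for i in range(16)]
first_phases = [(i % 4) * math.pi / 2 for i in range(16)]          # arbitrary test phases
second_phases = [((3 * i) % 4) * math.pi / 2 for i in range(16)]   # arbitrary test phases

# Full-duplex mixture: both terminals' symbols added sample by sample.
mixed = [a + b for a, b in zip(synthesize(first_freqs, first_phases),
                               synthesize(second_freqs, second_phases))]

def max_phase_error(freqs, phases):
    """Largest unit-circle distance between the sent and the recovered phases."""
    return max(abs(cmath.exp(1j * phase_at(mixed, f)) - cmath.exp(1j * p))
               for f, p in zip(freqs, phases))
```

Over a window of exactly 10 ms, tones at distinct multiples of 100 Hz are mutually orthogonal, so each terminal's phases are recovered from the mixture up to floating-point error.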
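The symbol length and payload figures follow directly from the parameters above. In the worked arithmetic below, the symbols $T_s$ (symbol length), $N_g$ (number of frequency-point groups), and $B$ (bits carried) are introduced only for illustration:

```latex
T_s = \frac{1}{\Delta f} = \frac{1}{100\,\text{Hz}} = 10\,\text{ms}, \qquad
B_{10\,\text{ms}} = N_g \times 2 = 15 \times 2 = 30\ \text{bits}, \qquad
B_{20\,\text{ms}} = 2 \times B_{10\,\text{ms}} = 60\ \text{bits}
```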
Step 130, controlling the second user terminal to perform differential phase demodulation on the first type voice data symbols based on a first demodulation frequency point and a first demodulation order under the condition that it is determined that the second user terminal receives the first type voice data symbols, so as to obtain first demodulated voice data.
Specifically, under the condition that it is determined that the second user terminal receives the first type of voice data symbols transmitted by the first user terminal, the second user terminal may be controlled to perform differential phase demodulation on the first type of voice data symbols based on the first demodulation frequency point and the first demodulation order, so as to obtain the first demodulated voice data, where the first demodulation frequency point is the same as the first modulation frequency point, and the first demodulation order is the same as the first modulation order.
Step 140, under the condition that it is determined that the first user terminal receives the second type voice data symbol, controlling the first user terminal to perform differential phase demodulation on the second type voice data symbol based on a second demodulation frequency point and a first demodulation order, so as to obtain second demodulated voice data.
Specifically, under the condition that it is determined that the first user terminal receives the second type of voice data symbol transmitted by the second user terminal, the first user terminal may be controlled to perform differential phase demodulation on the second type of voice data symbol based on the second demodulation frequency point and the first demodulation order, so as to obtain the second demodulated voice data, where the second demodulation frequency point is the same as the second modulation frequency point, and the first demodulation order is the same as the first modulation order.
For example, the first user terminal and the second user terminal convert the received second type voice data symbol and first type voice data symbol, respectively, from the time domain into the frequency domain through Fourier transform and perform $2^k$-order differential phase demodulation. The initial frequency point selected by the first user terminal is $f_b = f_a + \Delta f$, the initial frequency point selected by the second user terminal is $f_a$, and the fundamental tone frequency is $\Delta f$. That is, the frequency points used by the first user terminal for demodulation are the frequency points used by the second user terminal for modulation, $f_a + (i \times 2 + 1) \times \Delta f$ ($0 \le i \le N_f$), and the frequency points used by the second user terminal for demodulation are the frequency points used by the first user terminal for modulation, $f_a + i \times 2 \times \Delta f$ ($0 \le i \le N_f$). Each terminal performs $2^k$-order differential phase demodulation on each group of adjacent frequency points spaced $2 \times \Delta f$ apart to obtain the demodulated voice data.
Fig. 6 is a schematic flow chart of differential phase demodulation provided by the present invention. For the 16 voice data symbols spanning 320 ms, voice data symbols of 10 ms length are selected in sequence according to the flow shown in Fig. 6 and transformed to the frequency domain for 4-order differential phase demodulation, yielding 30 bits of data each. During demodulation, the first user terminal uses the modulation parameters of the second user terminal, and the second user terminal uses the modulation parameters of the first user terminal. That is, the initial demodulation frequency point selected by the first user terminal is 400 Hz, and the initial demodulation frequency point selected by the second user terminal is 300 Hz. The first user terminal calculates the differential phase of every two adjacent frequency points spaced 200 Hz apart among the frequency points 400 + 200 × i Hz (0 ≤ i ≤ 15), and the second user terminal does the same among the frequency points 300 + 200 × i Hz (0 ≤ i ≤ 15). The mapping between the differential phase of adjacent frequency points and the demodulated 2-bit data is shown in Table 2:
Table 2. Mapping between the differential phase of adjacent frequency points and the 2-bit DQPSK demodulation data
[Table 2 is an image in the original publication (Figure BDA0003791567840000101) and is not reproduced here.]
30 bits of encrypted voice data can be demodulated from each 10 ms voice data symbol, so 960 bits of encrypted voice data can be demodulated in total from the 320 ms of data symbols.
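The demodulation of one 10 ms symbol can be sketched as follows. The contents of Table 2 are not available in this text, so the decision rule here is an assumption: each measured differential phase is mapped to the nearest of the four phases of Table 1. The 8 kHz sampling rate and all names (`bin_value`, `demodulate_dqpsk`, `DECISIONS`) are likewise hypothetical.

```python
import cmath
import math

FS = 8000   # sampling rate in Hz (assumed)
N = 80      # samples in one 10 ms symbol

# Assumed decision table: the inverse of Table 1 (differential phase -> bit pair).
DECISIONS = [(0.0, "00"), (math.pi / 2, "10"), (math.pi, "11"), (-math.pi / 2, "01")]

def bin_value(samples, freq):
    """Complex DFT value of `samples` at the single frequency `freq`."""
    return sum(x * cmath.exp(-2j * math.pi * freq * n / FS) for n, x in enumerate(samples))

def demodulate_dqpsk(samples, start_hz, pitch_hz=100, points=16):
    """Recover 2 bits from every pair of adjacent frequency points."""
    freqs = [start_hz + 2 * pitch_hz * i for i in range(points)]
    values = [bin_value(samples, f) for f in freqs]
    pairs = []
    for low, high in zip(values, values[1:]):
        diff = cmath.phase(high * low.conjugate())   # phase(high) - phase(low)
        # Nearest constellation phase, measured as a distance on the unit circle.
        _, bits = min(DECISIONS,
                      key=lambda d: abs(cmath.exp(1j * diff) - cmath.exp(1j * d[0])))
        pairs.append(bits)
    return "".join(pairs)

# Test symbol on the second terminal's grid: every adjacent phase step is +pi/2.
test_freqs = [400 + 200 * i for i in range(16)]
test_symbol = [sum(math.cos(2 * math.pi * f * n / FS + i * math.pi / 2)
                   for i, f in enumerate(test_freqs)) for n in range(N)]
decoded = demodulate_dqpsk(test_symbol, start_hz=400)
```

Because only phase differences between neighbouring frequency points are used, a phase offset common to all points (for example from a timing or sound-card offset) cancels out, which is consistent with the robustness the text attributes to differential modulation.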
Specifically, the prior art has two defects: hardware deviation and the voice enhancement processing of communication software can strongly distort voice-like signals, and only half-duplex communication can be achieved when both parties use voice-like signals. To overcome these defects, the present invention modulates the encrypted voice signals to be sent by both communication parties by differential phase modulation, generating voice-like signals that have the parameter characteristics of ordinary voice signals. This effectively resists the influence of sound-card hardware deviation at both ends and of the voice enhancement processing in actual voice communication software, realizes full-duplex encrypted communication through staggered frequency points, and can be widely used for confidential communication over end-to-end voice channels.
According to the voice-like modulation and demodulation method provided by the invention, the first user terminal and the second user terminal are controlled to perform differential phase modulation on the encrypted voice signals respectively based on different modulation frequency points to generate voice-like signals with time-frequency characteristics, and the first user terminal and the second user terminal are controlled to transmit the generated voice-like signals to the opposite communication terminal and then perform differential phase demodulation, so that not only can the hardware deviation and the voice enhancement processing influence of the communication terminal be effectively avoided, but also full-duplex voice communication can be realized based on the frequency point staggered modulation and demodulation mode.
Optionally, transmitting the first type of voice data symbols to a second user terminal includes:
controlling the first user terminal to perform differential phase modulation on first control information required by key synchronization based on the first modulation frequency point and the second modulation order, and performing time domain conversion on a first frequency domain control signal obtained after modulation to obtain a first time domain control signal;
forming a first preamble symbol by a first synchronization signal of the first user terminal and the first time domain control signal;
forming a first type voice signal frame by the first blank symbol, the first leading symbol and the first type voice data symbol;
and controlling the first user terminal to transmit the first type of voice signal frame to a second user terminal.
It is to be understood that the encrypted voice signal must be decrypted at the correspondent node after transmission. Therefore, in the embodiment of the present invention, when the first user terminal transmits the first type voice data symbol to the second user terminal, it simultaneously transmits the first control information required for key synchronization, and, to ensure signal continuity, a first blank symbol is added to the transmitted first type voice signal frame.
Specifically, the first user terminal may be first controlled to perform differential phase modulation on first control information required for key synchronization based on a first modulation frequency point and a second modulation order, and perform time domain conversion on a first frequency domain control signal obtained after modulation to obtain a first time domain control signal; then, the first synchronization signal and the first time domain control signal of the first user terminal form a first preamble symbol; further combining the first blank symbol, the first leading symbol and the first voice data symbol into a first voice signal frame; and finally, controlling the first user terminal to transmit the first type of voice signal frame to the second user terminal.
It is understood that the first control information required for key synchronization is distributed by a dedicated key distribution center for the first user terminal, and the first control information required for key synchronization is used for the second user terminal to decrypt the encrypted voice signal transmitted by the first user terminal.
It can be understood that, in the embodiment of the present invention, to keep the clocks of the sender (first user terminal) and the receiver (second user terminal) aligned and the transmission synchronous with no gaps between successive characters, the first user terminal is controlled to generate a first synchronization signal and transmit it to the second user terminal together with the voice-like data to be transmitted. The first synchronization signal notifies the second user terminal that a data frame has arrived and keeps the sampling rate of the second user terminal consistent with the arrival rate of the bits, so that the first user terminal and the second user terminal are synchronized.
Optionally, in this embodiment of the present invention, in order to ensure signal continuity, a first blank symbol is added to a first type of speech signal frame sent by a first user terminal, where a length of the first blank symbol may be determined according to an actually transmitted first type of speech signal frame.
Optionally, the first blank symbol added in the first type voice signal frame may be an all-zero signal.
It can be understood that, by adding the first blank symbol in the first type voice signal frame sent by the first user terminal, the embodiment of the present invention can ensure continuous transmission of the voice-like signal in the voice communication channel.
Optionally, in this embodiment of the present invention, the first user terminal may be controlled to perform differential phase modulation on the first control information required for key synchronization based on the first modulation frequency point and a second modulation order, where the second modulation order may be the same as or different from the first modulation order, and the size of the second modulation order is not specifically limited in this embodiment of the present invention.
For example, the first user terminal performs $2^p$-order differential phase modulation in the frequency domain on the first control information required for key synchronization, converts the modulated first frequency domain control signal into the time domain to obtain a first time domain control signal, and then forms a first preamble symbol from the first synchronization signal of the first user terminal and the first time domain control signal.
Optionally, the first user terminal may apply voice-like modulation to the first control information required for key synchronization in the same way as it modulates the first encrypted voice signal.
Optionally, in the embodiment of the present invention, in order to reduce the error rate and ensure the correct delivery of the control information, the modulation order p of the first control information required for key synchronization may be less than or equal to the modulation order k of the first encrypted voice signal, that is, the second modulation order is less than or equal to the first modulation order.
Optionally, after the frequency-domain differential phase modulation of the first control information is completed, the first user terminal may convert the modulated first frequency-domain control signal into a time-domain through inverse fast fourier transform to obtain a first time-domain control signal modulated like speech, and then combine the first synchronization signal and the first time-domain control signal of the first user terminal into a first preamble symbol.
For example, the first user terminal performs Cyclic Redundancy Check (CRC) coding with a code rate of 2/3 on 10 bits of control information (including a 2-bit key number and an 8-bit frame number) required for key synchronization, and may obtain the coded 15 bits of control information.
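A rate-2/3 CRC encoding of this kind appends 5 check bits to the 10 information bits. The text does not name the generator polynomial, so the sketch below assumes $x^5 + x^2 + 1$ with a zero initial register and no bit inversion, purely for illustration. The function names are also hypothetical.

```python
POLY = 0b100101  # x^5 + x^2 + 1: an ASSUMED generator, not specified in the text

def crc5(bits):
    """Remainder of bits * x^5 divided by POLY over GF(2), as a 5-character bit string."""
    reg = 0
    for b in bits + "00000":          # append 5 zeros, then do long division
        reg = (reg << 1) | int(b)
        if reg & 0b100000:            # degree-5 term present: subtract the generator
            reg ^= POLY
    return format(reg, "05b")

def encode_control(key_number, frame_number):
    """Pack a 2-bit key number and an 8-bit frame number, then append the 5-bit CRC."""
    if not 0 <= key_number < 4 or not 0 <= frame_number < 256:
        raise ValueError("key_number must fit in 2 bits and frame_number in 8 bits")
    payload = format(key_number, "02b") + format(frame_number, "08b")
    return payload + crc5(payload)

def check_control(codeword):
    """Return True when the 5 check bits match the 10 information bits."""
    return len(codeword) == 15 and crc5(codeword[:10]) == codeword[10:]

codeword = encode_control(key_number=2, frame_number=77)
```

The resulting 15 bits match the 15 groups of adjacent frequency points, so with 1 bit per group they fill exactly one 10 ms symbol.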
Optionally, in this embodiment of the present invention, the first user terminal may perform 2-order Differential Phase Shift Keying (DBPSK) on the coded 15-bit control information in the frequency domain, where frequency point selection, initial frequency point modulation Phase, and the like are the same as those of the first encrypted voice signal, but each group of adjacent frequency points modulates only 1-bit data, and a mapping relationship between the modulation bits of the control information and the Differential Phase of the adjacent frequency points is shown in table 3:
Table 3. Mapping between modulation bits of the control information and the differential phase of adjacent frequency points
1-bit data               | 0 | 1
DBPSK differential phase | 0 | π
Optionally, after frequency-domain differential phase modulation is performed on the encoded 15-bit control information, the modulated frequency-domain signal may be converted into the time domain by inverse fast Fourier transform to obtain 10 ms of voice-like modulated control information.
Fig. 7 is a schematic diagram of the time-domain waveform of the synchronization signal provided by the present invention. The 10 ms of voice-like modulated control information can be combined with the 10 ms synchronization signal shown in Fig. 7 to form a 20 ms preamble symbol. The synchronization signal consists of 4 sine-wave segments of 2.5 ms each, with frequencies of 500 Hz, 1000 Hz, 1500 Hz, and 2000 Hz in sequence.
Fig. 8 is a schematic structural diagram of a 400 ms voice-like signal frame provided by the present invention. As shown in Fig. 8, 3 blank symbols (60 ms in total), a 20 ms preamble symbol, and 16 voice data symbols (320 ms in total) are combined into a 400 ms voice-like signal frame, which is then transmitted in segments, cyclically, through the communication channel. Each blank symbol is a 20 ms all-zero signal.
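The synchronization signal described above can be generated as in the following sketch. The 8 kHz sampling rate (taken from the text's later synchronization example), the unit amplitude, and the zero starting phase of each segment are assumptions; the function name `sync_signal` is hypothetical.

```python
import math

FS = 8000  # sampling rate in Hz (assumed)

def sync_signal(freqs=(500, 1000, 1500, 2000), segment_ms=2.5):
    """Concatenate one sine segment per frequency; each segment restarts at phase 0."""
    seg_len = round(FS * segment_ms / 1000)   # 20 samples per 2.5 ms segment
    samples = []
    for f in freqs:
        samples.extend(math.sin(2 * math.pi * f * n / FS) for n in range(seg_len))
    return samples

sync = sync_signal()   # 4 segments x 20 samples = 80 samples = 10 ms
```

At 8 kHz the 80 samples produced here have the same length as the 10 ms of modulated control information, so the two together fill one 20 ms preamble symbol.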
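The frame layout can be illustrated with a short sketch that concatenates the three parts in the order given: blank symbols, preamble symbol, voice data symbols. It assumes an 8 kHz sampling rate, so that one 20 ms symbol is 160 samples; the function name `build_frame` and the constants are hypothetical.

```python
FS = 8000                 # sampling rate in Hz (assumed)
SYMBOL = FS * 20 // 1000  # 160 samples per 20 ms symbol
BLANK_SYMBOLS = 3         # 60 ms of all-zero signal
DATA_SYMBOLS = 16         # 320 ms of voice data symbols

def build_frame(preamble, data_symbols):
    """Assemble blank symbols, the preamble symbol and the data symbols into one frame."""
    if len(preamble) != SYMBOL:
        raise ValueError("the preamble must be exactly one 20 ms symbol")
    if len(data_symbols) != DATA_SYMBOLS or any(len(s) != SYMBOL for s in data_symbols):
        raise ValueError("expected 16 data symbols of 20 ms each")
    frame = [0.0] * (BLANK_SYMBOLS * SYMBOL)   # blank symbols come first
    frame.extend(preamble)
    for symbol in data_symbols:
        frame.extend(symbol)
    return frame

# Placeholder contents, only to show the layout: 3200 samples = 400 ms.
frame = build_frame([1.0] * SYMBOL, [[0.5] * SYMBOL for _ in range(DATA_SYMBOLS)])
```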
According to the invention, differential phase modulation similar to the first encrypted voice signal is carried out on the first control information required by key synchronization of the first user terminal, the first synchronization signal of the first user terminal and the first time domain control signal obtained after differential phase modulation and time domain conversion form a first leading symbol, and then the first blank symbol, the first leading symbol and the first type voice data symbol form a complete voice signal frame which is transmitted to the second user terminal through a voice channel, so that the second user terminal can conveniently decrypt the received voice signal in the subsequent process, and the continuity of the signal can be ensured.
Optionally, controlling the second user terminal to perform differential phase demodulation on the first type of voice data symbols based on a first demodulation frequency point and a first demodulation order, and acquiring first demodulated voice data includes:
controlling the second user terminal to perform frame synchronization on the first type voice signal frame based on a first blank symbol and a first leading symbol in the received first type voice signal frame, determining a first initial position of the first leading symbol, and determining a first type voice data symbol in the first type voice signal frame based on the first initial position;
and controlling the second user terminal to carry out differential phase demodulation on the first type of voice data symbols based on the first demodulation frequency point and the first demodulation order so as to obtain first demodulation voice data.
Specifically, under the condition that it is determined that the second user terminal receives the first type of voice signal frame sent by the first user terminal, the second user terminal may be controlled to perform frame synchronization on the first type of voice signal frame based on a first blank symbol and a first preamble symbol in the received first type of voice signal frame, so as to determine a first starting position of the first preamble symbol, and determine a first type of voice data symbol in the first type of voice signal frame based on the first starting position; and then controlling the second user terminal to carry out differential phase demodulation on the first type of voice data symbols based on the first demodulation frequency point and the first demodulation order, thereby obtaining first demodulation voice data.
Optionally, an energy detection method may be used to detect the energy jump from the first blank symbol to the first preamble symbol and find the approximate position of the first preamble symbol through coarse synchronization. Fine synchronization is then performed by a sliding cross-correlation against the synchronization signal stored locally at the second user terminal to determine the first starting position of the first preamble symbol more precisely.
Fig. 9 is a schematic diagram of the synchronization process of a voice-like signal frame provided by the present invention. As shown in Fig. 9, frame synchronization of the voice-like signal frame can be completed in two stages: coarse synchronization and fine synchronization.
Optionally, when frame synchronization is performed on the first type voice signal frame, coarse synchronization may be performed with an energy detection algorithm based on the energy difference between the first blank symbol and the first preamble symbol.
Fig. 10 is a schematic diagram of the coarse synchronization result based on energy detection provided by the present invention. As shown in Fig. 10, for example, at a sampling rate of 8 kHz, 160 points of the first type voice signal frame are taken, corresponding to a single symbol length of 20 ms, and the ratio of the energy of the last 80 points to the energy of the first 80 points is calculated. If the ratio is greater than the threshold of 5, the first starting position of the first preamble symbol is considered to lie within these 160 points; otherwise, the window is moved forward by 40 points and the coarse synchronization operation continues.
Optionally, after the coarse synchronization is completed, a fine synchronization operation may be started on the data segment containing the coarse start position of the first preamble symbol. The algorithm calculates the cross-correlation coefficient between the 80 points starting at the coarse start position and the 10 ms synchronization signal, slides forward 1 sampling point at a time and recalculates the coefficient, and after sliding 120 points in total finds the position of the maximum cross-correlation coefficient. Fig. 11 is a schematic diagram of the fine synchronization result based on cross-correlation provided by the present invention. As shown in Fig. 11, if the maximum cross-correlation coefficient is greater than the threshold of 0.8, the fine synchronization is successful and the first starting position of the first preamble symbol is determined to have been found accurately; otherwise, the coarse synchronization is restarted.
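The coarse synchronization step can be sketched as a sliding energy-ratio test. The window of 160 points, the hop of 40 points, and the threshold of 5 come from the text. The function name `coarse_sync` is hypothetical, and comparing `last > threshold * first` instead of dividing is a choice made here to avoid division by zero on an all-zero blank symbol.

```python
def coarse_sync(samples, window=160, hop=40, threshold=5.0):
    """Return the start of the first window whose second half carries more than
    `threshold` times the energy of its first half, or None if there is none."""
    half = window // 2
    for start in range(0, len(samples) - window + 1, hop):
        first = sum(x * x for x in samples[start:start + half])
        last = sum(x * x for x in samples[start + half:start + window])
        if last > threshold * first:   # same test as last / first > threshold
            return start
    return None

# Toy input: 480 blank samples followed by 160 samples of constant amplitude.
toy_signal = [0.0] * 480 + [1.0] * 160
coarse_start = coarse_sync(toy_signal)
```

In this toy input the preamble begins at sample 480, and the returned window start is the first window that contains that transition in its second half. The exact start is left to the fine synchronization stage.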
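The fine synchronization step can be sketched as follows. The text does not define the cross-correlation coefficient precisely, so this sketch assumes the Pearson correlation coefficient. It also interprets "slide 120 points in total" as trying shifts 0 through 120, and assumes an 8 kHz sampling rate with a zero-phase, unit-amplitude reference signal. All names are hypothetical.

```python
import math

FS = 8000  # sampling rate in Hz (assumed)

# Local copy of the 10 ms synchronization signal: four 2.5 ms sine segments.
reference = [math.sin(2 * math.pi * f * n / FS)
             for f in (500, 1000, 1500, 2000) for n in range(20)]

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences (0.0 if undefined)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0

def fine_sync(samples, coarse_start, ref, max_shift=120, threshold=0.8):
    """Slide one sample at a time from coarse_start and return the best-matching
    position, or None when the best coefficient does not exceed the threshold."""
    best_pos, best_coef = None, -1.0
    for shift in range(max_shift + 1):
        pos = coarse_start + shift
        segment = samples[pos:pos + len(ref)]
        if len(segment) < len(ref):
            break
        coef = correlation(segment, ref)
        if coef > best_coef:
            best_pos, best_coef = pos, coef
    return best_pos if best_coef > threshold else None

# Toy input: blank samples, then the synchronization signal starting at sample 480.
toy_signal = [0.0] * 480 + reference + [0.0] * 80
preamble_start = fine_sync(toy_signal, coarse_start=360, ref=reference)
```

A return value of None corresponds to the failure case in the text, after which coarse synchronization would be restarted.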
It can be understood that, in the embodiment of the present invention, in order to ensure that the second user terminal can restore the voice data sent by the first user terminal, the second user terminal is controlled to perform differential phase demodulation on the first type voice data symbols based on the first demodulation frequency point, where the first demodulation frequency point is the same as the first modulation frequency point, that is, the frequency point used when the second user terminal demodulates the voice data is the frequency point used when the first user terminal modulates the voice data, so as to ensure that the voice data demodulated by the second user terminal is the same as the original voice data to be sent by the first user terminal.
Optionally, after determining the first starting position of the first preamble symbol, further comprising:
controlling the second user terminal to determine a first time domain control signal in the first type of voice signal frame based on the first starting position;
and controlling the second user terminal to carry out differential phase demodulation on the first time domain control signal based on a first demodulation frequency point and a second demodulation order so as to obtain first control information, and carrying out decryption and compression decoding on the first demodulated voice data based on the first control information so as to obtain first original voice data.
Specifically, after the first start position of the first preamble symbol is determined, the second user terminal may be controlled to determine the first time domain control signal in the first type of voice signal frame based on the first start position, and further controlled to perform differential phase demodulation on the first time domain control signal based on the first demodulation frequency point and the second demodulation order, so as to obtain the first control information, and decrypt and compress-decode the first demodulated voice data based on the first control information, so as to obtain the first original voice data, where the first original voice data is original voice data sent by the first user terminal and restored by the second user terminal.
Optionally, the first demodulated voice data may be decrypted based on the ZUC-256 stream decryption algorithm, and the decrypted voice data may be further compression-decoded based on the Codec2 decoding algorithm, so that the first original voice data may be obtained.
It can be understood that, in the embodiment of the present invention, in order to ensure that the second user terminal can restore the first control information required for key synchronization, the second user terminal is controlled to perform differential phase demodulation on the first time domain control signal based on the first demodulation frequency point, where the first demodulation frequency point is the same as the first modulation frequency point, that is, the frequency point used when the second user terminal demodulates the control information is the frequency point used when the first user terminal modulates the control information, so as to ensure that the control information demodulated by the second user terminal is the same as the original control information to be sent by the first user terminal.
Optionally, transmitting the second type voice data symbol to the first user terminal includes:
controlling the second user terminal to perform differential phase modulation on second control information required by key synchronization based on the second modulation frequency point and the second modulation order, and performing time domain conversion on a second frequency domain control signal obtained after modulation to obtain a second time domain control signal;
forming a second preamble symbol by a second synchronization signal of the second user terminal and the second time domain control signal;
forming a second type voice signal frame by the second blank symbol, the second leading symbol and the second type voice data symbol;
and controlling the second user terminal to transmit the second type of voice signal frame to the first user terminal.
It is understood that the encrypted voice signal must be decrypted at the correspondent node after transmission. Therefore, in the embodiment of the present invention, when the second user terminal transmits the second type voice data symbol to the first user terminal, it simultaneously transmits the second control information required for key synchronization, and, to ensure signal continuity, a second blank symbol is added to the transmitted second type voice signal frame.
Specifically, the second user terminal may first be controlled to perform differential phase modulation on the second control information required for key synchronization based on the second modulation frequency point and the second modulation order, and to perform time domain conversion on the second frequency domain control signal obtained after modulation to obtain a second time domain control signal; then, a second preamble symbol is formed from the second synchronization signal of the second user terminal and the second time domain control signal; the second blank symbol, the second preamble symbol, and the second type voice data symbol are further combined into a second type voice signal frame; and finally, the second user terminal is controlled to transmit the second type voice signal frame to the first user terminal.
It is understood that the second control information required for key synchronization is distributed by a dedicated key distribution center for the second user terminal, and the second control information required for key synchronization is used for the first user terminal to decrypt the encrypted voice signal transmitted by the second user terminal.
It can be understood that, in the embodiment of the present invention, to keep the clocks of the sender (second user terminal) and the receiver (first user terminal) aligned and the transmission synchronous with no gaps between successive characters, the second user terminal is controlled to generate a second synchronization signal and transmit it to the first user terminal together with the voice-like data to be transmitted. The second synchronization signal notifies the first user terminal that a data frame has arrived and keeps the sampling rate of the first user terminal consistent with the arrival rate of the bits, so that the second user terminal and the first user terminal are synchronized.
Optionally, in this embodiment of the present invention, in order to ensure signal continuity, a second blank symbol is added to a second type speech signal frame sent by a second user terminal, where a length of the second blank symbol may be determined according to an actually transmitted second type speech signal frame.
Optionally, the second blank symbol added in the second type voice signal frame may be an all-zero signal.
It can be understood that, in the embodiment of the present invention, by adding the second blank symbol to the frame of the second type voice signal transmitted by the second user terminal, the continuous transmission of the voice-like signal in the voice communication channel can be ensured.
Optionally, in this embodiment of the present invention, the second user terminal may be controlled to perform differential phase modulation on the second control information required for key synchronization based on the second modulation frequency point and the second modulation order, where the second modulation order may be the same as or different from the first modulation order, and the size of the second modulation order is not specifically limited in this embodiment of the present invention.
For example, the second user terminal performs $2^p$-order differential phase modulation in the frequency domain on the second control information required for key synchronization, converts the modulated second frequency domain control signal into the time domain to obtain a second time domain control signal, and then forms a second preamble symbol from the second synchronization signal of the second user terminal and the second time domain control signal.
Optionally, the second user terminal may apply voice-like modulation to the second control information required for key synchronization in the same way as it modulates the second encrypted voice signal.
Optionally, in the embodiment of the present invention, in order to reduce the error rate and ensure the correct delivery of the control information, the modulation order p of the second control information required for key synchronization may be less than or equal to the modulation order k of the second encrypted voice signal, that is, the second modulation order is less than or equal to the first modulation order.
Optionally, after the frequency domain differential phase modulation of the second control information is completed, the second user terminal may convert the modulated second frequency domain control signal into a time domain through inverse fast fourier transform to obtain a second time domain control signal modulated like speech, and then combine a second synchronization signal and the second time domain control signal of the second user terminal into a second preamble symbol.
According to the invention, differential phase modulation similar to the second encrypted voice signal is carried out on the second control information required by key synchronization of the second user terminal, the second synchronization signal of the second user terminal and the second time domain control signal obtained after differential phase modulation and time domain conversion form a second leading symbol, and then the second blank symbol, the second leading symbol and the second voice data symbol form a complete voice signal frame which is transmitted to the first user terminal through the voice channel, so that the first user terminal can conveniently decrypt the received voice signal and the continuity of the signal can be ensured.
Optionally, controlling the first user terminal to perform differential phase demodulation on the second type of voice data symbols based on a second demodulation frequency point and a first demodulation order, and acquiring second demodulated voice data, includes:
controlling the first user terminal to perform frame synchronization on a second type voice signal frame based on a second blank symbol and a second preamble symbol in the received second type voice signal frame, determining a second start position of the second preamble symbol, and determining a second type voice data symbol in the second type voice signal frame based on the second start position;
and controlling the first user terminal to perform differential phase demodulation on the second type of voice data symbols based on a second demodulation frequency point and a first demodulation order so as to obtain second demodulated voice data.
Specifically, when it is determined that the first user terminal has received a second type voice signal frame sent by the second user terminal, the first user terminal may be controlled to perform frame synchronization on the second type voice signal frame based on the second blank symbol and the second preamble symbol in the received frame, so as to determine a second start position of the second preamble symbol and, based on that start position, the second type of voice data symbols in the frame; the first user terminal is then controlled to perform differential phase demodulation on the second type of voice data symbols based on the second demodulation frequency point and the first demodulation order, thereby obtaining second demodulated voice data.
It can be understood that, in the embodiment of the present invention, in order to ensure that the voice data sent by the second user terminal can be restored at the first user terminal, the first user terminal is controlled to perform differential phase demodulation on the second type voice data symbol based on the second demodulation frequency point, where the second demodulation frequency point and the second modulation frequency point are the same, that is, the frequency point used when the first user terminal demodulates the voice data is the frequency point used when the second user terminal modulates the voice data, so as to ensure that the voice data demodulated by the first user terminal is the same as the original voice data to be sent by the second user terminal.
Optionally, an energy detection method may be used to detect the abrupt energy change from the second blank symbol to the second preamble symbol and thereby find the approximate position of the second preamble symbol (coarse synchronization); fine synchronization is then performed with a sliding cross-correlation method using the synchronization signal stored locally at the first user terminal, to further determine the second start position of the second preamble symbol.
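The two-stage synchronization described above can be illustrated with the following minimal sketch. It is not the implementation of the invention: the function names, the window length, the energy threshold and the random ±1 reference sequence are all assumptions chosen for illustration only.

```python
import random


def coarse_sync(samples, window, threshold):
    """Return the start of the first window whose mean energy exceeds threshold, or -1."""
    for start in range(0, len(samples) - window + 1, window):
        energy = sum(x * x for x in samples[start:start + window]) / window
        if energy > threshold:
            return start
    return -1


def fine_sync(samples, reference, lo, hi):
    """Return the offset in [lo, hi) with the largest cross-correlation against reference."""
    best_offset, best_score = lo, float("-inf")
    last = min(hi, len(samples) - len(reference) + 1)
    for offset in range(lo, last):
        score = sum(samples[offset + n] * reference[n] for n in range(len(reference)))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


# Toy frame (assumption): silence as the blank symbol, then a locally known sync sequence.
rng = random.Random(0)
reference = [rng.choice((-1.0, 1.0)) for _ in range(64)]
true_start = 205
frame = [0.0] * true_start + reference + [0.0] * 100

coarse = coarse_sync(frame, window=32, threshold=0.05)
start = fine_sync(frame, reference, max(0, coarse - 32), coarse + 32)
print(coarse, start)
```

Coarse detection only locates the window in which the energy rises; the cross-correlation search around that window then recovers the exact start position.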
Optionally, after determining the second starting position of the second preamble symbol, further comprising:
controlling the first user terminal to determine a second time domain control signal in the second type of voice signal frame based on the second starting position;
and controlling the first user terminal to perform differential phase demodulation on the second time domain control signal based on a second demodulation frequency point and a second demodulation order to acquire second control information, and performing decryption and compression decoding on the second demodulated voice data based on the second control information to acquire second original voice data.
Specifically, after the second start position of the second preamble symbol is determined, the first user terminal may be controlled to determine the second time domain control signal in the second type voice signal frame based on the second start position. The first user terminal is then controlled to perform differential phase demodulation on the second time domain control signal based on the second demodulation frequency point and the second demodulation order to obtain the second control information, and to decrypt and compression-decode the second demodulated voice data based on the second control information to obtain second original voice data, where the second original voice data is the original voice data sent by the second user terminal as restored by the first user terminal.
Optionally, the second demodulated voice data may be decrypted based on the ZUC-256 stream decryption algorithm, and the decrypted voice data may be further compression-decoded based on the Codec2 decoding algorithm, so that the second original voice data may be obtained.
It can be understood that, in the embodiment of the present invention, in order to ensure that the first user terminal can restore the second control information required for key synchronization, the first user terminal is controlled to perform differential phase demodulation on the second time domain control signal based on the second demodulation frequency point, where the second demodulation frequency point is the same as the second modulation frequency point, that is, the frequency point used when the first user terminal demodulates the control information is the frequency point used when the second user terminal modulates the control information, so as to ensure that the control information demodulated by the first user terminal is the same as the original control information to be sent by the second user terminal.
For example, the first user terminal and the second user terminal respectively convert the first key synchronization control signal and the second key synchronization control signal into the frequency domain by Fourier transform for $2^{p}$-order differential phase demodulation. The initial frequency point selected by the first user terminal is $f_b = f_a + \Delta f$, the initial frequency point selected by the second user terminal is $f_a$, and the fundamental frequency is $\Delta f$. That is, the frequency points used by the first user terminal for demodulation are the frequency points used by the second user terminal for modulation, $f_a + (i \times 2 + 1) \times \Delta f$ $(0 \le i \le N_f)$, and the frequency points used by the second user terminal for demodulation are the frequency points used by the first user terminal for modulation, $f_a + i \times 2 \times \Delta f$ $(0 \le i \le N_f)$. The first user terminal and the second user terminal each perform $2^{p}$-order differential phase demodulation on each pair of adjacent frequency points spaced $2 \times \Delta f$ apart, obtaining a first demodulated synchronization control signal and a second demodulated synchronization control signal.
For example, according to the differential phase demodulation process shown in fig. 6, the time domain control signal (10 ms, 160 sampling points) in the preamble symbol may be converted into the frequency domain by Fourier transform and then subjected to 2-order differential phase demodulation. During demodulation, the first user terminal demodulates according to the modulation parameters of the second user terminal, and the second user terminal demodulates according to the modulation parameters of the first user terminal; that is, the initial demodulation frequency point selected by the first user terminal is 400 Hz, and the initial demodulation frequency point selected by the second user terminal is 300 Hz. The first user terminal calculates the differential phase between every two adjacent frequency points, spaced 200 Hz apart, on the frequency points 400 + 200 × i Hz (0 ≤ i ≤ 15), and the second user terminal does the same on the frequency points 300 + 200 × i Hz (0 ≤ i ≤ 15). The mapping relationship between the differential phase of adjacent frequency points and the 1-bit demodulated data is shown in table 4:
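The staggering of frequency points can be checked numerically with the short sketch below. The values f_a = 300 Hz, Δf = 100 Hz and 16 frequency points per terminal are taken from the numerical example of this embodiment; the variable names are assumptions.

```python
f_a = 300        # initial frequency point of the even-indexed grid, in Hz
delta_f = 100    # fundamental frequency, in Hz
count = 16       # number of frequency points per terminal (0 <= i <= 15)

# Frequency points on which the first user terminal modulates (second terminal demodulates).
first_tx = [f_a + i * 2 * delta_f for i in range(count)]
# Frequency points on which the second user terminal modulates (first terminal demodulates).
second_tx = [f_a + (i * 2 + 1) * delta_f for i in range(count)]

first_rx, second_rx = second_tx, first_tx

# The two grids interleave and never collide, which is what allows full-duplex operation.
overlap = set(first_tx) & set(second_tx)
print(first_tx[:3], second_tx[:3], overlap)
```

Each terminal's points are spaced 2 × Δf apart, and the two grids are offset from each other by Δf.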
TABLE 4 mapping relation between differential phase of adjacent frequency points and 1-bit DBPSK demodulation data
[Table 4 is provided as an image in the original publication: Figure BDA0003791567840000191]
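As an illustration of 2-order differential phase demodulation across adjacent frequency points, the following sketch modulates 15 bits onto the 16 frequency points 400 + 200 × i Hz and recovers them from a 160-sample block. Several details are assumptions not confirmed by the text: a 16 kHz sampling rate (inferred from 160 sampling points in 10 ms), a mapping of differential phase 0 to bit 0 and π to bit 1 (the actual mapping is the one given in table 4), and a direct single-bin DFT used in place of a fast Fourier transform.

```python
import cmath
import math

FS = 16000                                   # assumed sampling rate: 160 samples in 10 ms
N = 160                                      # samples per control symbol
FREQS = [400 + 200 * i for i in range(16)]   # demodulation points of the first terminal


def modulate(bits):
    """Encode each bit as the phase difference between adjacent frequency points."""
    phases = [0.0]
    for bit in bits:
        phases.append(phases[-1] + (math.pi if bit else 0.0))  # assumed mapping
    return [
        sum(math.cos(2 * math.pi * f * n / FS + ph) for f, ph in zip(FREQS, phases))
        for n in range(N)
    ]


def dft_bin(samples, freq):
    """Single-frequency DFT coefficient of the block."""
    return sum(x * cmath.exp(-2j * math.pi * freq * n / FS) for n, x in enumerate(samples))


def demodulate(samples):
    """Recover one bit from the differential phase of each adjacent pair of points."""
    spectrum = [dft_bin(samples, f) for f in FREQS]
    bits = []
    for prev, cur in zip(spectrum, spectrum[1:]):
        diff = cmath.phase(cur * prev.conjugate())
        bits.append(1 if abs(diff) > math.pi / 2 else 0)
    return bits


tx_bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]
rx_bits = demodulate(modulate(tx_bits))
print(rx_bits == tx_bits)
```

Sixteen frequency points yield fifteen adjacent pairs, which matches the 15 bits of CRC encoded control information carried by one control symbol.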
In the embodiment of the present invention, 2-order differential phase demodulation yields 15 bits of CRC encoded control information, and CRC decoding of this information yields the 10 bits of control information required for key synchronization, consisting of a 2-bit key number and an 8-bit frame number. Based on the 10-bit control information, the 960 bits of demodulated encrypted voice data (the first demodulated voice data or the second demodulated voice data) are decrypted with the ZUC-256 stream decryption algorithm to obtain 960 bits of compressed voice data. The compressed data is then decoded 48 bits at a time with the Codec2 decoding algorithm, each 48 bits producing 160 bits (20 ms) of original voice data, so that 960 bits of compressed voice data correspond to 3200 bits (400 ms) of original voice data, which is the same length as a complete voice-like signal frame in fig. 8. Finally, the original voice data can be sent through a USB sound card to a loudspeaker for playback. In tests in a real environment, the end-to-end delay is less than 100 ms, the error rate is below 0.5%, and the MOS value of the voice quality is greater than 3.6.
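The frame arithmetic above, and the layout of the 10-bit control word, can be sketched as follows. The constants come from the text; the bit order of the control word (key number in the two most significant bits) and the helper names are assumptions, and the CRC step is omitted because its polynomial is not stated here.

```python
CODEC2_BITS_PER_FRAME = 48     # compressed bits decoded per Codec2 call
CODEC2_FRAME_MS = 20           # duration represented by each decoded block
RAW_UNITS_PER_FRAME = 160      # the text gives "160 bits (20 ms)" per decoded block
ENCRYPTED_BITS = 960           # demodulated encrypted payload per signal frame

codec_frames = ENCRYPTED_BITS // CODEC2_BITS_PER_FRAME       # 20 decoder calls
duration_ms = codec_frames * CODEC2_FRAME_MS                 # 400 ms
raw_units = codec_frames * RAW_UNITS_PER_FRAME               # 3200
bitrate_bps = CODEC2_BITS_PER_FRAME * 1000 // CODEC2_FRAME_MS  # 2400 bit/s


def pack_control(key_number, frame_number):
    """Pack a 2-bit key number and an 8-bit frame number into a 10-bit word (assumed order)."""
    if not 0 <= key_number < 4:
        raise ValueError("key number must fit in 2 bits")
    if not 0 <= frame_number < 256:
        raise ValueError("frame number must fit in 8 bits")
    return (key_number << 8) | frame_number


def unpack_control(word):
    """Split a 10-bit control word back into (key_number, frame_number)."""
    return (word >> 8) & 0x3, word & 0xFF


word = pack_control(2, 200)
print(codec_frames, duration_ms, raw_units, bitrate_bps, unpack_control(word))
```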
Fig. 12 is a schematic workflow diagram of a full-duplex voice-like system provided by the present invention. As shown in fig. 12, the workflow of the system includes the following. First, the sound card of the first user terminal collects voice data, and voice activity detection is performed on the collected data. When the collected data is determined to be effective voice data, it is compression-coded and encrypted to obtain encrypted voice data, which is then differential phase modulated to obtain voice-like data symbols; a blank symbol, a preamble symbol and the voice-like data symbols are combined into a complete voice-like signal frame and transmitted to the second user terminal over the voice channel. When the second user terminal receives the voice-like signal frame, it first performs frame synchronization and key synchronization and then differential phase demodulation of the frame; the demodulated voice data is decrypted and compression-decoded to recover the original voice data collected by the first user terminal, which is finally output through the sound card of the second user terminal.
Optionally, the first user terminal and the second user terminal firstly perform voice activity detection on signals acquired by respective sound cards, and perform compression coding and encryption on detected effective voice data; carrying out frequency domain differential phase modulation on the encrypted voice and converting the voice into a time domain to obtain a voice-like signal, wherein when carrying out differential phase modulation, two communication parties select different initial frequency points and the same frequency point interval so as to stagger the frequency points; carrying out frequency domain differential phase modulation on control information required by key synchronization, converting the control information into a time domain, and forming a preamble symbol together with a synchronization signal; blank symbols, leading symbols and n data symbols (voice-like signals) are combined into a complete voice-like signal frame to be transmitted through a voice channel in sequence; and at a demodulation end (opposite communication end), the received similar voice signal is subjected to frame synchronization, key synchronization, similar voice differential phase demodulation, decryption, compression decoding and the like to obtain an original voice signal.
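The frame structure described above (blank symbol, preamble symbol made of a synchronization signal and a time domain control signal, then n data symbols) can be illustrated with the sketch below. All lengths and sample values are hypothetical placeholders; the real symbol durations are those of the frame format in the drawings, and the function names are assumptions.

```python
def build_frame(blank_len, sync, control, data_symbols):
    """Concatenate the blank symbol, the preamble (sync + control) and the data symbols."""
    frame = [0.0] * blank_len          # blank symbol: silence
    frame += list(sync) + list(control)
    for symbol in data_symbols:
        frame += list(symbol)
    return frame


def split_frame(frame, preamble_start, sync_len, control_len, symbol_len):
    """Locate the control signal and data symbols from the preamble start position."""
    control_start = preamble_start + sync_len
    data_start = control_start + control_len
    control = frame[control_start:data_start]
    payload = frame[data_start:]
    symbols = [payload[i:i + symbol_len] for i in range(0, len(payload), symbol_len)]
    return control, symbols


# Hypothetical toy values, far shorter than real symbols.
sync = [1.0, -1.0, 1.0, 1.0]
control = [0.5, 0.25]
data = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

frame = build_frame(5, sync, control, data)
got_control, got_symbols = split_frame(
    frame, preamble_start=5, sync_len=4, control_len=2, symbol_len=3
)
print(len(frame), got_control, got_symbols)
```

Once frame synchronization has produced the preamble start position, every other field follows from fixed offsets, which is why the receiver only needs to search for the preamble.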
In the voice-like modulation and demodulation method provided by the invention, the first user terminal and the second user terminal are controlled to perform differential phase modulation on their encrypted voice signals based on different modulation frequency points, generating voice-like signals with the time-frequency characteristics of voice, and each terminal is controlled to transmit its voice-like signal to the opposite communication terminal, where differential phase demodulation is performed. The influence of hardware deviation and of the voice enhancement processing of the communication terminals can thus be effectively avoided, and full-duplex voice communication can be realized based on modulation and demodulation with staggered frequency points.
The following describes the voice-like modem apparatus provided in the present invention, and the voice-like modem apparatus described below and the voice-like modem method described above can be referred to correspondingly.
Fig. 13 is a schematic structural diagram of a speech-like modem apparatus provided in the present invention, and as shown in fig. 13, the apparatus includes: a determination module 1310, a first control module 1320, a second control module 1330, a third control module 1340, and a fourth control module 1350; wherein:
the determining module 1310 is configured to determine a first encrypted voice signal to be sent by a first user terminal and a second encrypted voice signal to be sent by a second user terminal;
the first control module 1320 is configured to control the first user terminal to perform differential phase modulation on the first encrypted voice signal based on a first modulation frequency point and a first modulation order, to obtain a first frequency domain voice signal, perform time domain conversion on the first frequency domain voice signal, obtain a first type of voice data symbol, and transmit the first type of voice data symbol to a second user terminal;
the second control module 1330 is configured to control the second user terminal to perform differential phase modulation on the second encrypted voice signal based on a second modulation frequency point and the first modulation order, to obtain a second frequency domain voice signal, perform time domain conversion on the second frequency domain voice signal, obtain a second voice data symbol, and transmit the second voice data symbol to the first user terminal;
the third control module 1340 is configured to control the second user terminal to perform differential phase demodulation on the first type of voice data symbols based on a first demodulation frequency point and a first demodulation order to obtain first demodulated voice data, when it is determined that the second user terminal receives the first type of voice data symbols;
the fourth control module 1350 is configured to, when it is determined that the second type of voice data symbols is received by the first user terminal, control the first user terminal to perform differential phase demodulation on the second type of voice data symbols based on a second demodulation frequency point and the first demodulation order, so as to obtain second demodulated voice data;
the first modulation frequency point is different from the second modulation frequency point, the first modulation frequency point is the same as the first demodulation frequency point, the second modulation frequency point is the same as the second demodulation frequency point, and the first modulation order is the same as the first demodulation order.
The voice-like modulation and demodulation apparatus provided by the invention controls the first user terminal and the second user terminal to perform differential phase modulation on their encrypted voice signals based on different modulation frequency points, generating voice-like signals with the time-frequency characteristics of voice, and controls each terminal to transmit its voice-like signal to the opposite communication terminal, where differential phase demodulation is performed. The influence of hardware deviation and of the voice enhancement processing of the communication terminals can thus be effectively avoided, and full-duplex voice communication can be realized based on modulation and demodulation with staggered frequency points.
It should be noted that, the speech-like modulation and demodulation apparatus provided in the embodiment of the present invention can implement all the method steps implemented by the speech-like modulation and demodulation method embodiment, and can achieve the same technical effect, and detailed descriptions of the same parts and beneficial effects as those of the method embodiment in this embodiment are omitted here.
Fig. 14 is a schematic physical structure diagram of an electronic device provided in the present invention. As shown in fig. 14, the electronic device may include: a processor 1410, a communication interface 1420, a memory 1430 and a communication bus 1440, wherein the processor 1410, the communication interface 1420 and the memory 1430 communicate with each other via the communication bus 1440. The processor 1410 may invoke logic instructions in the memory 1430 to perform the voice-like modulation and demodulation method provided by the methods described above, the method comprising:
determining a first encrypted voice signal to be sent by a first user terminal and a second encrypted voice signal to be sent by a second user terminal;
controlling the first user terminal to perform differential phase modulation on the first encrypted voice signal based on a first modulation frequency point and a first modulation order to obtain a first frequency domain voice signal, performing time domain conversion on the first frequency domain voice signal to obtain a first type of voice data symbols, and transmitting the first type of voice data symbols to a second user terminal;
controlling the second user terminal to perform differential phase modulation on the second encrypted voice signal based on a second modulation frequency point and a first modulation order to acquire a second frequency domain voice signal, performing time domain conversion on the second frequency domain voice signal to acquire a second voice data symbol, and transmitting the second voice data symbol to the first user terminal;
under the condition that the second user terminal is determined to receive the first type of voice data symbols, controlling the second user terminal to carry out differential phase demodulation on the first type of voice data symbols based on a first demodulation frequency point and a first demodulation order so as to obtain first demodulated voice data;
under the condition that the first user terminal is determined to receive the second type of voice data symbols, controlling the first user terminal to carry out differential phase demodulation on the second type of voice data symbols based on a second demodulation frequency point and a first demodulation order so as to obtain second demodulated voice data;
the first modulation frequency point is different from the second modulation frequency point, the first modulation frequency point is the same as the first demodulation frequency point, the second modulation frequency point is the same as the second demodulation frequency point, and the first modulation order is the same as the first demodulation order.
In addition, the logic instructions in the memory 1430 may be implemented in software functional units and stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform a speech-like modem method provided by the above methods, the method comprising:
determining a first encrypted voice signal to be sent by a first user terminal and a second encrypted voice signal to be sent by a second user terminal;
controlling the first user terminal to perform differential phase modulation on the first encrypted voice signal based on a first modulation frequency point and a first modulation order to acquire a first frequency domain voice signal, performing time domain conversion on the first frequency domain voice signal to acquire a first type of voice data symbol, and transmitting the first type of voice data symbol to a second user terminal;
controlling the second user terminal to perform differential phase modulation on the second encrypted voice signal based on a second modulation frequency point and a first modulation order to obtain a second frequency domain voice signal, performing time domain conversion on the second frequency domain voice signal to obtain a second voice data symbol, and transmitting the second voice data symbol to the first user terminal;
under the condition that the second user terminal is determined to receive the first type of voice data symbols, controlling the second user terminal to carry out differential phase demodulation on the first type of voice data symbols based on a first demodulation frequency point and a first demodulation order so as to obtain first demodulated voice data;
under the condition that the first user terminal is determined to receive the second type of voice data symbols, controlling the first user terminal to carry out differential phase demodulation on the second type of voice data symbols based on a second demodulation frequency point and a first demodulation order so as to obtain second demodulated voice data;
the first modulation frequency point is different from the second modulation frequency point, the first modulation frequency point is the same as the first demodulation frequency point, the second modulation frequency point is the same as the second demodulation frequency point, and the first modulation order is the same as the first demodulation order.
In still another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, performing the voice-like modulation and demodulation method provided by the above methods, the method comprising:
determining a first encrypted voice signal to be sent by a first user terminal and a second encrypted voice signal to be sent by a second user terminal;
controlling the first user terminal to perform differential phase modulation on the first encrypted voice signal based on a first modulation frequency point and a first modulation order to obtain a first frequency domain voice signal, performing time domain conversion on the first frequency domain voice signal to obtain a first type of voice data symbols, and transmitting the first type of voice data symbols to a second user terminal;
controlling the second user terminal to perform differential phase modulation on the second encrypted voice signal based on a second modulation frequency point and a first modulation order to obtain a second frequency domain voice signal, performing time domain conversion on the second frequency domain voice signal to obtain a second voice data symbol, and transmitting the second voice data symbol to the first user terminal;
under the condition that the second user terminal is determined to receive the first type of voice data symbols, controlling the second user terminal to carry out differential phase demodulation on the first type of voice data symbols based on a first demodulation frequency point and a first demodulation order so as to obtain first demodulated voice data;
under the condition that the first user terminal is determined to receive the second type of voice data symbols, controlling the first user terminal to carry out differential phase demodulation on the second type of voice data symbols based on a second demodulation frequency point and a first demodulation order so as to obtain second demodulated voice data;
the first modulation frequency point is different from the second modulation frequency point, the first modulation frequency point is the same as the first demodulation frequency point, the second modulation frequency point is the same as the second demodulation frequency point, and the first modulation order is the same as the first demodulation order.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (11)

1. A speech-like modulation/demodulation method, comprising:
determining a first encrypted voice signal to be sent by a first user terminal and a second encrypted voice signal to be sent by a second user terminal;
controlling the first user terminal to perform differential phase modulation on the first encrypted voice signal based on a first modulation frequency point and a first modulation order to obtain a first frequency domain voice signal, performing time domain conversion on the first frequency domain voice signal to obtain a first type of voice data symbols, and transmitting the first type of voice data symbols to a second user terminal;
controlling the second user terminal to perform differential phase modulation on the second encrypted voice signal based on a second modulation frequency point and a first modulation order to obtain a second frequency domain voice signal, performing time domain conversion on the second frequency domain voice signal to obtain a second voice data symbol, and transmitting the second voice data symbol to the first user terminal;
under the condition that the second user terminal is determined to receive the first type of voice data symbols, controlling the second user terminal to carry out differential phase demodulation on the first type of voice data symbols based on a first demodulation frequency point and a first demodulation order so as to obtain first demodulated voice data;
under the condition that the first user terminal is determined to receive the second type of voice data symbols, controlling the first user terminal to carry out differential phase demodulation on the second type of voice data symbols based on a second demodulation frequency point and a first demodulation order so as to obtain second demodulated voice data;
the first modulation frequency point is different from the second modulation frequency point, the first modulation frequency point is the same as the first demodulation frequency point, the second modulation frequency point is the same as the second demodulation frequency point, and the first modulation order is the same as the first demodulation order.
2. The voice-like modem method of claim 1 wherein transmitting said first voice data symbols to a second user terminal comprises:
controlling the first user terminal to perform differential phase modulation on first control information required by key synchronization based on the first modulation frequency point and the second modulation order, and performing time domain conversion on a first frequency domain control signal obtained after modulation to obtain a first time domain control signal;
forming a first preamble symbol by the first synchronization signal of the first user terminal and the first time domain control signal;
forming a first type voice signal frame from the first blank symbol, the first preamble symbol and the first type voice data symbol;
and controlling the first user terminal to transmit the first type of voice signal frame to a second user terminal.
3. The method of claim 2, wherein controlling the second user terminal to perform differential phase demodulation on the first type voice data symbol based on a first demodulation frequency point and a first demodulation order to obtain first demodulated voice data comprises:
controlling the second user terminal to perform frame synchronization on the first type voice signal frame based on a first blank symbol and a first preamble symbol in the received first type voice signal frame, determining a first initial position of the first preamble symbol, and determining a first type voice data symbol in the first type voice signal frame based on the first initial position;
and controlling the second user terminal to carry out differential phase demodulation on the first type voice data symbols based on a first demodulation frequency point and a first demodulation order so as to obtain first demodulation voice data.
4. The speech-like modem method according to claim 3, further comprising, after determining the first starting position of the first preamble symbol:
controlling the second user terminal to determine a first time domain control signal in the first type of voice signal frame based on the first starting position;
and controlling the second user terminal to perform differential phase demodulation on the first time domain control signal based on a first demodulation frequency point and a second demodulation order to acquire first control information, and performing decryption and compression decoding on the first demodulated voice data based on the first control information to acquire first original voice data.
5. The voice-like modulation and demodulation method according to claim 1, wherein transmitting the second type voice data symbol to the first user terminal comprises:
controlling the second user terminal to perform differential phase modulation on second control information required for key synchronization based on the second modulation frequency point and the second modulation order, and performing time domain conversion on a second frequency domain control signal obtained after modulation to obtain a second time domain control signal;
forming a second preamble symbol by using a second synchronization signal of the second user terminal and the second time domain control signal;
forming a second type voice signal frame from the second blank symbol, the second preamble symbol and the second type voice data symbol;
and controlling the second user terminal to transmit the second type of voice signal frame to the first user terminal.
6. The voice-like modulation and demodulation method according to claim 5, wherein controlling the first user terminal to perform differential phase demodulation on the second type voice data symbol based on a second demodulation frequency point and a first demodulation order to obtain second demodulated voice data comprises:
controlling the first user terminal to perform frame synchronization on the received second type voice signal frame based on the second blank symbol and the second preamble symbol in that frame, determining a second starting position of the second preamble symbol, and determining the second type voice data symbol in the second type voice signal frame based on the second starting position;
and controlling the first user terminal to perform differential phase demodulation on the second type voice data symbol based on the second demodulation frequency point and the first demodulation order to obtain the second demodulated voice data.
7. The voice-like modulation and demodulation method according to claim 6, further comprising, after determining the second starting position of the second preamble symbol:
controlling the first user terminal to determine a second time domain control signal in the second type voice signal frame based on the second starting position;
and controlling the first user terminal to carry out differential phase demodulation on the second time domain control signal based on a second demodulation frequency point and a second demodulation order so as to obtain second control information, and carrying out decryption and compression decoding on the second demodulated voice data based on the second control information so as to obtain second original voice data.
8. The voice-like modulation and demodulation method according to claim 2, wherein the second modulation order is less than or equal to the first modulation order.
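The claim does not state why the control information uses an order no higher than that of the voice data. One common reason, offered here only as an interpretation, is robustness: with M equally spaced phase levels, adjacent levels are 2π/M apart, so a phase decision tolerates an error of at most π/M. The short sketch below tabulates that margin; the function name `phase_margin` and the orders shown are illustrative.

```python
import math


def phase_margin(order: int) -> float:
    """Largest phase error (radians) an M-level phase decision tolerates: pi / M."""
    return math.pi / order


for m in (2, 4, 8):
    print(m, round(math.degrees(phase_margin(m)), 1))
# prints:
# 2 90.0
# 4 45.0
# 8 22.5
```

A lower order therefore gives the key-synchronization control information a wider decision margin than the voice payload, at the cost of fewer bits per symbol.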
9. A voice-like modulation and demodulation device, comprising:
a determining module, configured to determine a first encrypted voice signal to be sent by a first user terminal and a second encrypted voice signal to be sent by a second user terminal;
a first control module, configured to control the first user terminal to perform differential phase modulation on the first encrypted voice signal based on a first modulation frequency point and a first modulation order to obtain a first frequency domain voice signal, perform time domain conversion on the first frequency domain voice signal to obtain a first type voice data symbol, and transmit the first type voice data symbol to the second user terminal;
a second control module, configured to control the second user terminal to perform differential phase modulation on the second encrypted voice signal based on a second modulation frequency point and the first modulation order to obtain a second frequency domain voice signal, perform time domain conversion on the second frequency domain voice signal to obtain a second type voice data symbol, and transmit the second type voice data symbol to the first user terminal;
a third control module, configured to control the second user terminal, when it is determined that the second user terminal has received the first type voice data symbol, to perform differential phase demodulation on the first type voice data symbol based on a first demodulation frequency point and a first demodulation order to obtain first demodulated voice data; and
a fourth control module, configured to control the first user terminal, when it is determined that the first user terminal has received the second type voice data symbol, to perform differential phase demodulation on the second type voice data symbol based on a second demodulation frequency point and the first demodulation order to obtain second demodulated voice data;
wherein the first modulation frequency point is different from the second modulation frequency point, the first modulation frequency point is the same as the first demodulation frequency point, the second modulation frequency point is the same as the second demodulation frequency point, and the first modulation order is the same as the first demodulation order.
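The parameter relationships in the final clause of the claim can be expressed as a small consistency check. The sketch below is illustrative only: the `ModemParams` class, its field names, the `violations` helper and the numeric values are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModemParams:
    first_mod_point: int      # frequency point used by the first terminal to modulate
    first_demod_point: int    # frequency point used by the second terminal to demodulate
    second_mod_point: int     # frequency point used by the second terminal to modulate
    second_demod_point: int   # frequency point used by the first terminal to demodulate
    first_mod_order: int
    first_demod_order: int


def violations(p: ModemParams) -> List[str]:
    """List every relationship from the claim that the parameters break."""
    found = []
    if p.first_mod_point == p.second_mod_point:
        found.append("first and second modulation frequency points must differ")
    if p.first_mod_point != p.first_demod_point:
        found.append("first modulation and demodulation frequency points must match")
    if p.second_mod_point != p.second_demod_point:
        found.append("second modulation and demodulation frequency points must match")
    if p.first_mod_order != p.first_demod_order:
        found.append("first modulation and demodulation orders must match")
    return found


ok = ModemParams(5, 5, 9, 9, 4, 4)    # distinct points per direction, matching orders
bad = ModemParams(5, 5, 5, 5, 4, 2)   # shared point and mismatched order
print(violations(ok))
print(len(violations(bad)))
```

Keeping the two directions on different frequency points lets each terminal separate its own transmission from the signal it receives.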
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the voice-like modulation and demodulation method according to any one of claims 1 to 8.
11. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the voice-like modulation and demodulation method according to any one of claims 1 to 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210956470.XA CN115514603A (en) 2022-08-10 2022-08-10 Voice-like modulation and demodulation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210956470.XA CN115514603A (en) 2022-08-10 2022-08-10 Voice-like modulation and demodulation method and device

Publications (1)

Publication Number Publication Date
CN115514603A (en) 2022-12-23

Family

ID=84501603

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202210956470.XA Pending CN115514603A (en) 2022-08-10 2022-08-10 Voice-like modulation and demodulation method and device

Country Status (1)

Country Link
CN (1) CN115514603A (en)

Similar Documents

Publication Publication Date Title
US11870502B2 (en) Data delivery using acoustic transmissions
RU2624549C2 (en) Watermark signal generation and embedding watermark
DE112007000123B4 (en) A modem for communicating data over a voice channel of a communication system
US7912149B2 (en) Synchronization and segment type detection method for data transmission via an audio communication system
KR100994736B1 (en) Time diversity voice channel data communications
US8472508B2 (en) Data transmission
RU2614855C2 (en) Watermark generator, watermark decoder, method of generating watermark signal, method of generating binary message data depending on watermarked signal and computer program based on improved synchronisation concept
JPH08501665A (en) Discontinuous CDMA reception system
CN103971695B (en) A kind of underwater digital voice communication system of channel self-adapting and its method
JPH0439927B2 (en)
JP5567150B2 (en) Watermark generator using differential encoding, watermark decoder, method for providing watermark signal based on binary message data, method for providing binary message data based on watermarked signal, and computer program
Novak et al. Ultrasound proximity networking on smart mobile devices for IoT applications
CN109690979A (en) The method encoded by random acoustical signal and relevant transmission method
RU2586845C2 (en) Watermark decoder and method of generating binary message data
Krasnowski et al. Introducing a novel data over voice technique for secure voice communication
AU2007286940B2 (en) System and method for terminating a voice call in any burst within a multi-burst superframe
CN115514603A (en) Voice-like modulation and demodulation method and device
EP3729692B1 (en) A method and system for improved acoustic transmission of data
CN108631884B (en) Sound wave communication method based on nonlinear effect
US11244692B2 (en) Audio watermarking via correlation modification using an amplitude and a magnitude modification based on watermark data and to reduce distortion
Pekerti et al. Secure End-to-End Voice Communication: A Comprehensive Review of Steganography, Modem-based Cryptography, and Chaotic Cryptography Techniques
Chen et al. Pilot-subcarrier based impulsive noise mitigation for underwater acoustic OFDM systems
CN105049128A (en) Method for embedding multi-carrier sound wave communication in audio playing
Matsuoka et al. Acoustic OFDM system and its extension: Multiple data frame support
Zhang et al. Towards Faster End-to-End Data Transmission Over Voice Channels

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination