CN111970063B

CN111970063B - Communication method, device, equipment and storage medium

Info

Publication number: CN111970063B
Application number: CN202010873367.XA
Authority: CN
Inventors: 王备
Original assignee: Beijing ByteDance Network Technology Co Ltd
Current assignee: Douyin Vision Co Ltd; Douyin Vision Beijing Co Ltd
Priority date: 2020-08-26
Filing date: 2020-08-26
Publication date: 2022-04-29
Anticipated expiration: 2040-08-26
Also published as: CN111970063A

Abstract

The application relates to a communication method, a device, a storage medium and a system. The method comprises the following steps: a receiving end receives an audio signal, wherein the audio signal is generated after a transmitting end encodes and modulates a target text; the receiving end demodulates the audio signal, divides the audio signal into a plurality of signal intervals and determines a signal-to-noise ratio for each of the signal intervals; the length of each signal interval is 1/N of the length of a signal unit in the audio signal, and N is a positive integer greater than or equal to 2; the receiving end obtains a character sequence contained in the audio signal based on the plurality of signal-to-noise ratios; and the receiving end decodes the character sequence contained in the audio signal to obtain the target text. In this way, the accuracy of the analyzed signal can be improved, which improves the success rate of the handshake connection and the user experience.

Description

Communication method, device, equipment and storage medium

Technical Field

The present application relates to communication technologies, and in particular, to a communication method, apparatus, device, and storage medium.

Background

When an existing receiving end and transmitting end communicate, a synchronization signal, that is, a signal providing a common time reference, is used to ensure that the receiving end receives a stable and reliable signal. For example, the synchronization signal ensures that the starting point of the signal received by the receiving end is consistent with the starting point of the signal sent by the transmitting end. However, in some scenarios, such as a screen projection scenario, there is no standardized synchronization signal. As a result, the starting point of the signal received by a receiving end, such as a microphone, differs from the starting point of the signal sent by the transmitting end by a random delay.

Disclosure of Invention

In order to solve the above problems, the present invention provides a communication method, apparatus, device and storage medium, which can improve the accuracy of the analyzed signal, thereby improving the success rate of handshake connection and improving user experience.

In a first aspect, an embodiment of the present application provides a communication method, including:

a receiving end receives an audio signal, wherein the audio signal is generated after a transmitting end codes and modulates a target text;

the receiving end demodulates the audio signal, divides the audio signal into a plurality of signal intervals and determines a signal-to-noise ratio for each of the signal intervals; the length of each signal interval is 1/N of the length of a signal unit in the audio signal, and N is a positive integer greater than or equal to 2;

the receiving end obtains a character sequence contained in the audio signal based on the plurality of signal-to-noise ratios;

and the receiving end decodes the character sequence contained in the audio signal to obtain the target text so as to complete handshake connection with the transmitting end based on the target text.

In a specific example of the present application, the obtaining, by the receiving end, a character sequence included in the audio signal based on the plurality of signal-to-noise ratios includes:

the receiving end carries out phase classification based on the plurality of signal-to-noise ratios to obtain N signal-to-noise ratio sets, wherein each signal-to-noise ratio set covers S signal units at a different phase; S is related to the number of characters in the audio signal;

and calculating the mean value of the signal-to-noise ratio set, and obtaining the character sequence contained in the audio signal based on the signal-to-noise ratio set corresponding to the maximum mean value.

In a specific example of the present application, the audio signal further includes a separator for dividing two of the target texts; the decoding, by the receiving end, the character sequence included in the audio signal to obtain the target text, including:

and the receiving end determines the separator in the character sequence contained in the audio signal and, starting from the characters before or after the separator, matches and decodes a preset number of characters at a time until the target text is obtained, based on the preset length of the target text.

In a specific example of the present application, the method further comprises:

if the tail of the character sequence contained in the audio signal is reached before decoding is finished, returning from the tail of the character sequence to a first position, and continuing to match and decode a preset number of characters at a time from the first position until the target text is obtained; wherein the first position is determined based on a preset length of the target text.

In a specific example of the present application, the target text indicates a screen-casting code for a screen-casting operation; the method further comprises the following steps:

and the receiving terminal is connected with the transmitting terminal in a handshaking manner based on the screen projecting code indicated by the target text, and performs screen projecting operation after the handshaking connection is completed so as to project the screen projecting content of the receiving terminal to the transmitting terminal for displaying.

In a second aspect, an embodiment of the present application provides a first electronic device, including:

the receiving unit is used for receiving an audio signal, wherein the audio signal is generated after a transmitting end codes and modulates a target text;

the demodulation unit is used for demodulating the audio signal, dividing the audio signal into a plurality of signal intervals and determining a signal-to-noise ratio for each of the signal intervals; the length of each signal interval is 1/N of the length of a signal unit in the audio signal, and N is a positive integer greater than or equal to 2; and obtaining a character sequence contained in the audio signal based on the plurality of signal-to-noise ratios;

and the decoding unit is used for decoding the character sequence contained in the audio signal to obtain the target text so as to complete handshake connection with the transmitting terminal based on the target text.

In a specific example of the present application, the demodulation unit is further configured to:

perform phase classification based on the plurality of signal-to-noise ratios to obtain N signal-to-noise ratio sets, wherein each signal-to-noise ratio set covers S signal units at a different phase; S is related to the number of characters in the audio signal;

In a specific example of the present application, the audio signal further includes a separator for dividing two of the target texts; the decoding unit is further configured to:

determining the separator from the character sequence contained in the audio signal, and decoding a preset number of characters from the characters before or after the separator until the target text is obtained by decoding based on the preset length of the target text.

In a specific example of the present application, the decoding unit is further configured to:

In a specific example of the present application, the target text indicates a screen-casting code for a screen-casting operation; the device further comprises: a screen projection unit; wherein, the screen projecting unit is used for:

and performing handshake connection with the transmitting terminal based on the screen projecting code indicated by the target text, and performing screen projecting operation after the handshake connection is completed so as to project the screen projecting content of the first electronic device to the transmitting terminal for displaying.

In a third aspect, an embodiment of the present application provides a first electronic device, including:

one or more processors;

a memory communicatively coupled to the one or more processors;

one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the methods described above.

In a fourth aspect, the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the method described above.

In a fifth aspect, an embodiment of the present application provides a communication system, including a first electronic device having a signal receiving function and a second electronic device having a signal transmitting function; wherein:

the second electronic device is used for encoding the determined target text, modulating the encoded target text by using a frequency band within a preset range to obtain an audio signal, and broadcasting the audio signal;

the first electronic device is used for receiving an audio signal, wherein the audio signal is generated after a transmitting end encodes and modulates a target text; demodulating the audio signal, dividing the audio signal into a plurality of signal intervals and determining a signal-to-noise ratio for each of the signal intervals, wherein the length of each signal interval is 1/N of the length of a signal unit in the audio signal, and N is a positive integer greater than or equal to 2; obtaining a character sequence contained in the audio signal based on the plurality of signal-to-noise ratios; and decoding the character sequence contained in the audio signal to obtain the target text so as to complete handshake connection with the transmitting end based on the target text.

In this way, the receiving end can analyze the same signal unit multiple times. Even if an incomplete signal unit is received, the character corresponding to it can still be recovered as far as possible because the unit is analyzed several times. This lays a foundation for recovering all characters contained in the audio signal and for obtaining the target text accurately, and in turn for improving the success rate of the handshake connection and the user experience.

Drawings

Fig. 1 is a schematic structural diagram of a communication system according to an embodiment of the present invention;

fig. 2 is a flowchart illustrating a communication method according to an embodiment of the present application;

fig. 3 is a schematic diagram of a communication system in a specific application scenario according to an embodiment of the present application;

fig. 4 is an exemplary diagram of a communication method according to an embodiment of the present invention after a transmitting end modulates a signal;

fig. 5 is a comparison diagram before and after smoothing a signal by a transmitting end in the communication method according to the embodiment of the present invention;

fig. 6 is a diagram illustrating an exemplary processing flow of a receiving end in a communication method according to an embodiment of the present invention;

fig. 7 is a schematic structural diagram of a signal unit in a communication method according to an embodiment of the present invention;

fig. 8 is a schematic diagram of a signal unit after multiple FFTs are performed on the signal unit in the communication method according to the embodiment of the present invention;

fig. 9 is a schematic structural diagram of a logic unit of a first electronic device according to an embodiment of the present invention;

fig. 10 is a schematic diagram of a hardware structure of a first electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

In some of the flows described in the specification and claims of the present application and in the above-described figures, a number of operations are included that occur in a particular order, but it should be clearly understood that the flows may include more or less operations, and that the operations may be performed sequentially or in parallel.

The embodiment of the present application provides a communication method, which is applied to a communication system. The system includes a first electronic device (also referred to as a receiving end) having a signal receiving function and a second electronic device (also referred to as a transmitting end) having a signal transmitting function. The first electronic device may specifically be a notebook computer, a mobile phone, a tablet computer, or the like, and so may the second electronic device. In a specific scenario, as shown in fig. 1, namely a screen-projecting scenario, the first electronic device may be a notebook computer and the second electronic device may be a display screen. That is, the second electronic device is the transmitting end that emits a screen-projecting code, and the first electronic device is the receiving end of the screen-projecting code. The screen-projecting code is the target text of the present application, and the signal emitted by the second electronic device is an audio signal carrying the screen-projecting code. The first electronic device decodes the received audio signal to obtain the screen-projecting code and then performs a handshake connection with the second electronic device, so as to project the screen-projecting content of the first electronic device to the second electronic device for display, thereby completing the screen-projecting operation. Specifically, the method comprises the following steps:

the second electronic device encodes the determined target text, modulates the encoded target text by using a frequency band within a preset range to obtain an audio signal, and broadcasts the audio signal; the audio signal contains at least one complete target text, for example, the screen-projecting code is hello, and at this time, the audio signal may be a portable logohe or a logoheo, so that a receiving end can decode it to obtain the complete target text.

As shown in fig. 2, the first electronic device receives the audio signal; the audio signal is generated after a transmitting end (namely, the second electronic device) encodes and modulates the target text, and contains at least one complete target text. The first electronic device demodulates the audio signal, divides the audio signal into a plurality of signal intervals, and determines a signal-to-noise ratio for each of the signal intervals; the length of each signal interval is 1/N of the length of a signal unit in the audio signal, and N is a positive integer greater than or equal to 2. For example, after the audio signal is demodulated, a short-time Fourier transform is performed on each of the signal intervals obtained by dividing the audio signal, N times per signal unit, so as to obtain a plurality of signal-to-noise ratios; here, the length of the short-time Fourier transform is 1/N of the length of the signal unit. Further, a character sequence contained in the audio signal is obtained based on the plurality of signal-to-noise ratios, and the character sequence is decoded to obtain the target text, so that the first electronic device can complete handshake connection with the second electronic device based on the target text.

In the scheme of the application, N may specifically be an empirical value. Because the transform length is 1/N of the signal unit length, N non-overlapping short-time Fourier transforms can be performed on one complete signal unit, which improves the accuracy of the target text obtained by analysis.

In a specific example of the present application, the target text indicates a screen-casting code for a screen-casting operation; based on the above, the receiving terminal performs handshake connection with the transmitting terminal based on the screen projecting code indicated by the target text, and performs screen projecting operation after the handshake connection is completed, so as to project the screen projecting content of the receiving terminal to the transmitting terminal for display, thus completing the screen projecting operation. In practical application, the screen projection code can be sent after being modulated by frequency which can not be sensed by human ears, for example, the frequency in the 19kHz-20kHz interval is adopted, so that the handshaking connection time is shortened, and the user experience is improved.

In a specific example of the scheme of the present application, the receiving end may obtain the character sequence contained in the audio signal as follows. The receiving end performs phase classification based on the plurality of signal-to-noise ratios to obtain N signal-to-noise ratio sets, where each signal-to-noise ratio set covers S signal units at a different phase, and S is related to the number of characters in the audio signal. The receiving end then calculates the mean value of each signal-to-noise ratio set and obtains the character sequence contained in the audio signal based on the set with the maximum mean value. Here, it should be noted that the character sequence is obtained by the transmitting end encoding the target text, for example in hexadecimal or decimal, so as to obtain a character sequence matching the target text. The number of characters of the character sequence in the audio signal is related to the preset length of the target text, and therefore S is also related to the preset length of the target text.

In a specific example of the present application, the audio signal further includes a separator for separating two target texts. In this case, the receiving end may decode the target text as follows: the receiving end determines the separator in the character sequence contained in the audio signal and, starting from the characters before or after the separator, matches and decodes a preset number of characters at a time until the target text is obtained, based on a preset length of the target text. If the end point (namely the tail) of the character sequence contained in the audio signal is reached before decoding is finished, the receiving end returns from the tail of the character sequence to a first position and continues to match and decode a preset number of characters at a time from the first position until the target text is obtained; the first position is determined based on the preset length of the target text.

In this way, the receiving end of the scheme can perform N short-time Fourier transforms on each signal unit in the audio signal, the length of each transform being 1/N of the length of the signal unit, so that the same signal unit is analyzed multiple times. Even if an incomplete signal unit is received, the character corresponding to it can still be recovered as far as possible because the unit is analyzed several times. This lays a foundation for recovering all characters contained in the audio signal and for obtaining the target text accurately, and in turn for improving the success rate of the handshake connection and the user experience.

The following embodiments of the present application are further described in detail with reference to a specific screen projection scenario, specifically, in this example, an ultrasonic screen projection mode is used to implement the screen projection operation, where the ultrasonic screen projection mode has unique advantages over other wireless transmission methods (such as Wi-Fi or bluetooth, etc.):

First, in the ultrasonic screen projection mode, only common audio equipment such as a loudspeaker or a microphone is needed, and no special communication module is required. Second, establishing a communication connection by ultrasonic waves takes only about one second, whereas Wi-Fi, bluetooth and the like typically take several seconds to establish a connection. Third, ultrasonic waves cannot penetrate walls, so connectable devices are guaranteed to be in the same room, which matters in conference screen projection scenarios; Wi-Fi, bluetooth and the like typically reach multiple rooms.

It should be noted that the frequency band used in this example (19kHz-20kHz) is not strictly in the ultrasonic range, but is close to it. Since most human ears cannot hear sound in this frequency band, this example also refers to it as ultrasonic. Based on this, the present example also has the following advantages: reliable transmission within a distance of 3m, and good immunity to interference from environmental noise, room reflections and the like.

The example can realize rapid screen projection in a conference room, and specifically, the method can quickly establish a handshake connection between a source (e.g., a laptop) and a target (e.g., a large screen), and after the connection is successful, send screen projection content of the source to the target for display, where the screen projection content may specifically be multimedia data, and the multimedia data is not transmitted by an ultrasonic method. Specifically, the present example includes two parts, a transmitting end and a receiving end. The transmitting end may be a device fixedly disposed in a conference room, for example, a large screen, and is used for continuously playing the specifically encoded ultrasonic signal (i.e., the audio signal according to the present embodiment), similar to a broadcast manner. The receiving terminal is installed on the conference participant equipment (such as a notebook computer, a mobile phone and the like) and is presented in a client mode, the receiving terminal is not started at ordinary times, the receiving terminal can operate only when a user initiates a shared screen, for example, after a shared screen button in the client is clicked, the receiving terminal receives the ultrasonic signal transmitted by the transmitting terminal, after decoding is carried out, a corresponding screen projecting code can be obtained, then handshake connection is carried out, and screen projecting operation is completed after the handshake connection is successful. Obviously, compared with the traditional mode of manually inputting the screen projection code, the screen projection mode of the example is faster, more convenient and more reliable.

The present example is further described in detail with reference to the framework shown in fig. 3. As shown in fig. 3, the present example includes a transmitting end and a receiving end. The content to be sent (such as a screen-projecting code) is usually a text, but may also be a number, an IP address, or the like; the scheme of the present application refers to all of these as the target text, for example bytedance. The transmitting end converts the content to be sent (i.e., the target text) into an ultrasonic signal through an encoding module (Encode) and a modulation module (Modulate), and then broadcasts the ultrasonic signal through its speaker. Generally, in a conference screen-projecting scene, the transmitting end works all the time and plays the ultrasonic signal continuously in a loop.

After receiving the corresponding ultrasonic signal, the microphone at the receiving end returns the corresponding screen-projecting code through a demodulation module (Demodulate) and a decoding module (Decode), completes the handshake connection according to the screen-projecting code, and finally completes the screen-projecting operation. The specific process is as follows:

Encoding process at the transmitting end: the present example uses hexadecimal encoding, which covers essentially the whole ASCII character set (excluding the segment 0x00 to 0x0F). Taking the screen-projecting code bytedance as an example, the encoding steps are as follows: according to the ASCII code table, the hexadecimal value of each letter in the screen-projecting code is obtained, which converts bytedance into the character sequence 6279746564616E6365; a separator, for example 0x0, is then added at the end of the screen-projecting code, so that the character sequence from the previous step becomes 6279746564616E636500. In this way, the present example supports transmission of the ASCII character set. Here, it should be noted that the hexadecimal encoding method is only used to explain the present example; in practical applications, other encoding methods may also be used, and the present application does not limit this.
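The encoding steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `encode` and the constant `SEPARATOR` are hypothetical, and the separator 0x0 is assumed to be written as the two hexadecimal symbols 00, as in the character sequence given above.

```python
# Hypothetical names; the separator 0x0 is written as two hex symbols.
SEPARATOR = "00"


def encode(text: str) -> str:
    """Return the hexadecimal symbol sequence of an ASCII text plus separator."""
    data = text.encode("ascii")
    # Codes 0x00-0x0F are excluded so that no character starts with symbol 0.
    if any(b <= 0x0F for b in data):
        raise ValueError("characters 0x00-0x0F cannot be transmitted")
    return data.hex().upper() + SEPARATOR


print(encode("bytedance"))  # 6279746564616E636500
```

Each character contributes two symbols, so a target text of length M yields 2M + 2 symbols including the separator.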

Modulation process at the transmitting end: the present example employs a Multiple Frequency-Shift Keying (MFSK) modulation scheme. Specifically, after encoding is completed, the encoded signal is modulated using sixteen different frequencies, one for each hexadecimal digit. Two points are noted for frequency selection: first, the frequency range must be inaudible to the human ear while the audio components at the receiving end can still respond normally, for example frequencies in the interval 19kHz-20kHz; second, the sixteen frequencies must fall exactly on the analysis frequency bins of the fast Fourier transform (FFT) to reduce the effect of spectral leakage.
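As an illustration of the second point, the following Python sketch picks sixteen modulation frequencies (one per hexadecimal digit) that sit exactly on FFT bin centers inside the 19kHz-20kHz band. The FFT length of 1024 samples is an assumption made for this sketch and is not stated in the text; the function names are hypothetical.

```python
import math


def bin_centered_frequencies(sample_rate, fft_len, f_low, f_high, count=16):
    """Return (bin index, frequency in Hz) pairs lying on FFT bin centers."""
    spacing = sample_rate / fft_len          # width of one FFT bin in Hz
    first_bin = math.ceil(f_low / spacing)   # lowest bin at or above f_low
    bins = range(first_bin, first_bin + count)
    if bins[-1] * spacing > f_high:
        raise ValueError("band too narrow for this FFT length")
    return [(k, k * spacing) for k in bins]


def symbol_tone(freq, sample_rate, num_samples):
    """One unsmoothed sine segment carrying a single hexadecimal symbol."""
    return [math.sin(2 * math.pi * freq * n / sample_rate)
            for n in range(num_samples)]


# Assumed parameters: 48 kHz sampling rate, 1024-point FFT (hypothetical).
table = bin_centered_frequencies(48_000, 1024, 19_000, 20_000)
for digit, (k, f) in enumerate(table):
    print(f"{digit:X}: bin {k}, {f} Hz")
```

With these assumed parameters the bin spacing is 46.875 Hz, so the sixteen tones occupy bins 406 to 421. A tone on a bin center completes a whole number of cycles in one analysis window, which is what keeps its energy out of the neighboring bins.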

In practical application, for example when the sampling rate is 48kHz, the frequency values corresponding to bytedance are as shown in fig. 4. Because the phase changes discontinuously, broadband noise is introduced between two adjacent sine wave segments after modulation, so that a signal that should be inaudible becomes audible to the human ear.

Demodulation process at the receiving end: as shown in fig. 6, after the microphone at the receiving end acquires the ultrasonic signal broadcast by the transmitting end, a Short-Time Fourier Transform (STFT) is used to analyze the ultrasonic signal. The STFT coefficients corresponding to the sixteen frequencies used in the encoding process are extracted, the power spectrum at these sixteen frequencies is calculated, and synchronization processing is then performed. In a practical application scenario, since there is no dedicated synchronization signal, the starting point of the ultrasonic signal collected by the microphone at the receiving end is random. As shown in fig. 7, each polygon represents a complete signal unit carrying one character; because the screen-projecting scenario has no standard synchronization signal, the collected signal has a random delay relative to the starting point of the signal unit. If the delay is not compensated correctly, the Fourier transform analysis will erroneously include parts of two adjacent signal units, which greatly reduces the accuracy of the analysis.
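The effect of a misaligned analysis window can be reproduced with a small Python sketch. It evaluates the power at sixteen candidate bins for one window and derives a signal-to-noise ratio. Several things here are assumptions for illustration: the window length of 1024 samples, the bin range 406 to 421, and in particular the SNR definition (strongest bin power divided by the mean power of the other fifteen bins), which the text does not specify.

```python
import cmath
import math


def bin_power(window, k):
    """Power of DFT bin k, evaluated directly for a single coefficient."""
    n = len(window)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(window))
    return abs(coeff) ** 2


def analyze_window(window, bins):
    """Return (index of strongest bin, SNR) under the assumed SNR definition."""
    powers = [bin_power(window, k) for k in bins]
    best = max(range(len(bins)), key=powers.__getitem__)
    noise = (sum(powers) - powers[best]) / (len(powers) - 1)
    return best, powers[best] / (noise + 1e-12)  # epsilon avoids division by 0


def tone(k, n, count):
    """Sine segment of `count` samples centered on bin k of an n-point DFT."""
    return [math.sin(2 * math.pi * k * i / n) for i in range(count)]


N_FFT = 1024                   # assumed window length
BINS = list(range(406, 422))   # assumed bins for the sixteen frequencies

aligned = tone(BINS[5], N_FFT, N_FFT)
straddling = tone(BINS[5], N_FFT, N_FFT // 2) + tone(BINS[9], N_FFT, N_FFT // 2)

aligned_result = analyze_window(aligned, BINS)
straddling_result = analyze_window(straddling, BINS)
print(aligned_result, straddling_result)
```

The aligned window identifies symbol 5 with a very high SNR, while the window that straddles two signal units spreads its energy over several bins and yields a far lower SNR. That difference is what the synchronization step exploits.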

To solve the above problem, the present example performs signal synchronization processing, for example by using a polyphase decomposition method. Specifically, let the FFT length be 1/4 of the signal unit (symbol) length, where 4 is an empirical value; that is, for a signal of one complete symbol length, 4 non-overlapping FFTs are performed. Assuming that a complete ultrasonic signal contains S symbols, 4S FFTs are needed. The signal-to-noise ratio (SNR) of each FFT is then calculated, giving 4S SNR values. Polyphase classification of the 4S SNR values yields 4 SNR sets, each containing S SNR values; as shown in fig. 8, each SNR set covers the S signal units at a different phase. S is related to the number of characters in the ultrasonic signal. The sets can be written as SNR_p(s) = SNR(4s + p), where p = 0, 1, 2, 3 is the index of the polyphase class and s = 0, 1, …, S-1 is the symbol index. Fig. 7 and fig. 8 show only 4 signal units as an example. Finally, the mean value of each SNR set is calculated, and the set with the largest mean value is used as the data to be decoded in the subsequent decoding processing.
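The polyphase classification SNR_p(s) = SNR(4s + p) and the selection of the set with the largest mean can be sketched as follows. The function name `pick_best_phase` and the sample SNR values are hypothetical; only the grouping rule and the maximum-mean criterion come from the description above.

```python
def pick_best_phase(snr_values, n_phases=4):
    """Group SNR values by phase p (index modulo n_phases), pick the best group.

    Group p holds SNR(n_phases * s + p) for s = 0, 1, ..., S - 1.
    Returns (best phase index, SNR values of that phase).
    """
    usable = len(snr_values) - len(snr_values) % n_phases  # drop partial symbol
    if usable == 0:
        raise ValueError("need at least one complete symbol")
    groups = [snr_values[p:usable:n_phases] for p in range(n_phases)]
    means = [sum(g) / len(g) for g in groups]
    best = max(range(n_phases), key=means.__getitem__)
    return best, groups[best]


# Made-up SNR values for S = 3 symbols, 4 FFT windows per symbol.
snr = [3.0, 20.0, 4.0, 2.0,
       5.0, 18.0, 6.0, 1.0,
       2.0, 22.0, 3.0, 2.0]
phase, chosen = pick_best_phase(snr)
print(phase, chosen)  # 1 [20.0, 18.0, 22.0]
```

In a full receiver, the symbol detected in each window of the chosen phase would then form the data to be decoded.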

Decoding process at the receiving end: for the data to be decoded, the position of the separator 00 is determined first. To prevent confusion with the separator, characters whose ASCII codes lie in the interval 0x00 to 0x0F must not occur in the ultrasonic signal; these are unprintable control characters and generally do not need to be transmitted as screen-projecting codes. The duration of the ultrasonic signal transmitted by the transmitting end needs to be slightly longer (for example, by 10%) than the time corresponding to the complete target text, so that the separator 00 does not fall exactly on the head and tail ends of the received signal sequence. In practical application, if three consecutive 0s appear in the data to be decoded, the last two are taken as the separator. Decoding starts from the first symbol after or before the separator; taking the first symbol after the separator as an example, every two symbols are merged and decoded into one character. If decoding reaches the end of the symbol sequence, it returns to a starting point and continues merging until the separator is encountered, which yields the target text. The length of the screen-projecting code must be known in advance and serves as the end condition. The starting point to return to is not the first symbol of the symbol sequence; instead, decoding backs up 2M+2 symbols from the end of the symbol sequence, where M is the length of the screen-projecting code (i.e. the target text), excluding the separator. The transmitting end and the receiving end use the same endianness setting.

For example, assuming that the length of the screen-projected code is 3 (without a separator), at the large end, the symbol sequence to be decoded is: 06D6170006, the decoding steps are as follows:

First, the position of the separator 00 in the symbol sequence is determined. Since three consecutive 0s appear, the last two are taken as the separator. Decoding starts with the first symbol after the separator, i.e., 6. Because the end of the sequence is then encountered, a return is required. Here, the length of the screen-cast code is 3, so decoding backs up 2M+2 = 8 symbols from the end of the sequence, i.e., to D. Then, every two symbols are combined and decoded into one character until the separator is encountered, which gives 6D 61 70. Since big-endian order is used, these values are decoded according to the ASCII code table into the screen-cast code "map", and the decoding is complete.
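The decoding steps of this example can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the symbols are assumed to be hexadecimal digits supplied as a string, big-endian nibble order is assumed (the first symbol of each pair is the high nibble), and the received sequence is assumed to be long enough that the separator does not straddle its head or tail. The function name `decode_symbols` and its parameters are hypothetical.

```python
def decode_symbols(symbols, text_len, separator="00"):
    """Decode a received hex-symbol string into the target text.

    symbols:  received symbols as a string of hex digits (assumed format).
    text_len: M, the known length of the screen-cast code in characters.
    """
    period = 2 * text_len + len(separator)  # 2M + 2 symbols per repetition
    if len(symbols) < period:
        raise ValueError("need at least 2M+2 symbols")
    start = symbols.find(separator)
    if start < 0:
        raise ValueError("separator not found")
    # If more zeros follow (e.g. three consecutive 0s), the last two zeros of
    # the run are the separator, so decoding starts after the end of the run.
    pos = start + len(separator)
    while pos < len(symbols) and symbols[pos] == "0":
        pos += 1
    payload = []
    for _ in range(2 * text_len):
        if pos >= len(symbols):
            # Reached the end: back up 2M+2 symbols from the end of the
            # sequence, not to the first symbol of the sequence.
            pos = len(symbols) - period
        payload.append(symbols[pos])
        pos += 1
    # Combine every two symbols into one byte and map it through ASCII.
    return bytes.fromhex("".join(payload)).decode("ascii")


print(decode_symbols("06D6170006", 3))  # map
```

Reading exactly 2M symbols after the separator uses the known code length as the end condition, which in this sketch is equivalent to stopping when the separator is encountered again.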

An embodiment of the present application further provides a first electronic device, as shown in fig. 9, including:

a receiving unit 91, configured to receive an audio signal, where the audio signal is generated after a transmitting end encodes and modulates a target text;

a demodulating unit 92, configured to demodulate the audio signal, divide the audio signal into a plurality of signal intervals, and determine a plurality of signal-to-noise ratios corresponding to the plurality of signal intervals, respectively, where the length of each signal interval is 1/N of the length of a signal unit in the audio signal and N is a positive integer greater than or equal to 2; and to obtain a character sequence contained in the audio signal based on the plurality of signal-to-noise ratios;

a decoding unit 93, configured to decode the character sequence included in the audio signal to obtain the target text, so as to complete a handshake connection with the transmitting end based on the target text.

In a specific example of the present application, the demodulation unit 92 is further configured to

In a specific example of the present application, the audio signal further includes a separator for dividing two of the target texts; the decoding unit 93 is further configured to:

In a specific example of the present application, the decoding unit 93 is further configured to:

Here, it should be noted that the description of the electronic device embodiment is similar to the description of the method embodiment and has the same beneficial effects, so it is not repeated. For technical details not disclosed in the electronic device embodiment of the present invention, those skilled in the art should refer to the description of the method embodiment of the present invention; for the sake of brevity, the details are not repeated here.

An embodiment of the present application further provides an electronic device, including: one or more processors; a memory communicatively coupled to the one or more processors; one or more application programs; wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method described above.

In a specific example, the electronic device according to the embodiment of the present application may be embodied as a structure as shown in fig. 10, and the electronic device at least includes a processor 101, a storage medium 102, and at least one external communication interface 103; the processor 101, the storage medium 102, and the external communication interface 103 are all connected by a bus 104. The processor 101 may be a microprocessor, a central processing unit, a digital signal processor, a programmable logic array, or other electronic components with processing functions. The storage medium has stored therein computer executable code capable of performing the method of any of the above embodiments. In practical applications, the receiving unit 91, the demodulating unit 92, and the decoding unit 93 may be implemented by the processor 101.

Embodiments of the present application also provide a computer-readable storage medium, which stores a computer program, and when the program is executed by a processor, the computer program implements the method described above.

A computer-readable storage medium can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable storage medium may even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that all or part of the steps of the methods in the above embodiments can be implemented by a program instructing related hardware; the program can be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.

The embodiments described above are only a part of the embodiments of the present invention, and not all of them. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Claims

1. A communication method, applied to a receiving end, the method comprising:

the receiving end decodes the character sequence contained in the audio signal to obtain the target text so as to complete handshake connection with the transmitting end based on the target text;

the receiving end obtains the character sequence contained in the audio signal based on the plurality of signal-to-noise ratios, and the character sequence comprises the following steps:

the receiving end calculates the SNR of each FFT to obtain N × S SNR values;

performing multiphase classification on the N × S SNR values to obtain N groups of SNR sets, where SNR_p(s) = SNR(N × s + p), p = 0, 1, 2, 3 is the index of the multiphase classification, s is a symbol index, and s = 0, 1, …, S-1 in each group of SNR sets;

and calculating the mean value of each group of SNR sets, and obtaining the character sequence contained in the audio signal based on the SNR set corresponding to the maximum mean value.

2. The method of claim 1, wherein the audio signal further comprises a separator for dividing two of the target texts; the decoding, by the receiving end, the character sequence included in the audio signal to obtain the target text, including:

3. The method of claim 2, further comprising:

4. The method of claim 1, wherein the target text indicates a screen-cast code for a screen-cast operation; the method further comprises the following steps:

5. A first electronic device, comprising:

the decoding unit is used for decoding the character sequence contained in the audio signal to obtain the target text so as to complete handshake connection with the transmitting terminal based on the target text;

the demodulation unit is further configured to calculate an SNR of each FFT to obtain N × S SNR values; performing multiphase classification on the N × S SNR values to obtain N groups of SNR sets, where SNR_p(s) = SNR(N × s + p), p = 0, 1, 2, 3 is the index of the multiphase classification, s is a symbol index, and s = 0, 1, …, S-1 in each group of SNR sets; and calculating the mean value of each group of SNR sets, and obtaining the character sequence contained in the audio signal based on the SNR set corresponding to the maximum mean value.

6. The first electronic device of claim 5, wherein the audio signal further comprises a separator for separating two of the target texts; the decoding unit is further configured to:

7. A first electronic device, comprising:

one or more processors;

a memory communicatively coupled to the one or more processors;

one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of any of claims 1-4.

8. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 4.

9. A communication system comprising a first electronic device having a signal receiving function and a second electronic device having a signal transmitting function; wherein,

the first electronic device is used for receiving an audio signal, wherein the audio signal is generated after a transmitting end codes and modulates a target text; the receiving end demodulates the audio signal, divides the audio signal into a plurality of signal intervals and respectively determines a plurality of signal-to-noise ratios corresponding to the plurality of signal intervals; the length of the signal interval is 1/N of the length of a signal unit in the audio signal, and N is a positive integer greater than or equal to 2; the receiving end calculates the SNR of each FFT to obtain N × S SNR values; performing multiphase classification on the N × S SNR values to obtain N groups of SNR sets, where SNR_p(s) = SNR(N × s + p), p = 0, 1, 2, 3 is the index of the multiphase classification, s is a symbol index, and s = 0, 1, …, S-1 in each group of SNR sets; and calculating the mean value of each group of SNR sets, and obtaining the character sequence contained in the audio signal based on the SNR set corresponding to the maximum mean value.