CN116935870A - Voice transmission method, terminal and computer readable storage medium

Info

Publication number: CN116935870A
Application number: CN202210364802.5A
Authority: CN
Inventors: 颜蓓
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2022-04-08
Filing date: 2022-04-08
Publication date: 2023-10-24
Also published as: WO2023193506A1

Abstract

The invention discloses a voice transmission method, a terminal, and a computer-readable storage medium. A voice signal of a first terminal is acquired and characteristic information of the voice signal is extracted. The characteristic information is sent to a second terminal through a circuit switched domain, and the voice signal is transmitted to the second terminal through a packet switched domain, so that, when network parameters meet a preset condition, the second terminal repairs the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain and outputs the repaired voice signal. The method is therefore flexible: when the network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded network signal on call voice quality.

Description

Voice transmission method, terminal and computer readable storage medium

Technical Field

Embodiments of the present invention relate to, but are not limited to, the field of communications, and in particular, to a voice transmission method, a terminal, and a computer-readable storage medium.

Background

In the early deployment stage of a new generation of mobile communication technology, or in regions where base stations are sparse, voice transmission quality can drop rapidly as network quality worsens. For example, in a VONR (Voice Over New Radio, i.e. 5G voice) call, a common problem is that when 5G (5th Generation Mobile Communication Technology) base stations are sparsely distributed, network loss and a sharp increase in delay cause heavy jitter and intermittent distortion in the voice, and the voice quality can even be far worse than that of the CS domain, resulting in a poor user experience. At present, voice quality degradation caused by poor signals in 4G and 5G networks is handled by forcing the voice call to fall back to 3G or 2G. This avoids voice interruption and similar problems, but it cannot preserve the high bandwidth and high sound quality of VONR, so the user places the call on 5G but actually gets only a 3G or 2G call experience. How to avoid the rapid degradation of voice quality caused by poor conditions in a new-generation communication network is therefore a problem to be solved.

Disclosure of Invention

The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.

Embodiments of the present invention provide a voice transmission method, terminal, and computer-readable storage medium, which can transmit a VONR voice signal with high fidelity and without delay even in a case of poor 5G network conditions.

In a first aspect, an embodiment of the present invention provides a voice transmission method, applied to a first terminal, where the method includes:

acquiring a voice signal of the first terminal; extracting characteristic information of the voice signal; transmitting the characteristic information to a second terminal through a circuit switched domain; and transmitting the voice signal to the second terminal through the packet switching domain, so that the second terminal repairs the voice signal received by the packet switching domain according to the characteristic information received by the circuit switching domain and outputs the repaired voice signal under the condition that the network parameter meets the preset condition.

In a second aspect, an embodiment of the present invention provides a voice transmission method, applied to a second terminal, where the method includes:

receiving characteristic information sent by a first terminal through a circuit switched domain, wherein the characteristic information is extracted from a voice signal of the first terminal; receiving the voice signal transmitted by the first terminal through a packet switched domain; under the condition that the network parameters meet the preset conditions, repairing the voice signals received by the packet switching domain according to the characteristic information received by the circuit switching domain; and outputting the repaired voice signal.

In a third aspect, an embodiment of the present invention provides a terminal, including: a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the voice transmission method according to the first aspect or the voice transmission method according to the second aspect when executing the computer program.

In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer-executable program for causing a computer to execute the voice transmission method according to the first aspect or the voice transmission method according to the second aspect.

The embodiment of the invention comprises: acquiring a voice signal of a first terminal, extracting characteristic information of the voice signal, sending the characteristic information to a second terminal through a circuit switched domain, and transmitting the voice signal to the second terminal through a packet switched domain, so that, when network parameters meet a preset condition, the second terminal repairs the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain and outputs the repaired voice signal. On this basis, the first terminal extracts the characteristic information from the voice signal; the characteristic information is transmitted to the second terminal through the circuit switched domain, and the original voice signal through the packet switched domain. When the network signal is poor, the second terminal compares the characteristic information received through the circuit switched domain with the voice signal received through the packet switched domain and repairs the missing and distorted parts of the voice signal, so that the voice information of the transmitting end is fully restored without loss of sound quality and without excessive delay. The invention is therefore flexible: when the network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded network signal on call voice quality.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

Drawings

The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate and do not limit the invention.

FIG. 1 is a flowchart (first terminal side) of a voice transmission method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of content information, time domain information, and frequency domain characteristic information corresponding to a single speech segment according to an embodiment of the present invention;

FIG. 3 is a sub-flowchart of a voice transmission method according to an embodiment of the present invention;

FIG. 4 is a flowchart (second terminal side) of a voice transmission method according to an embodiment of the present invention;

FIG. 5 is a schematic illustration of speech repair using time domain information according to one embodiment of the present invention;

FIG. 6 is a schematic diagram of speech repair using content information and frequency domain characteristic information according to one embodiment of the present invention;

FIG. 7 is a sub-flowchart of a voice transmission method according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a voice transmission system according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a terminal structure according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

It should be understood that, in the description of the embodiments of the present invention, "plural" (or "multiple") means two or more; "greater than", "less than", "exceeding", and the like are understood to exclude the stated number; and "above", "below", "within", and the like are understood to include the stated number. Where used, the terms "first", "second", and the like only distinguish technical features and should not be construed as indicating or implying relative importance, the number of the technical features indicated, or their order.

The embodiment of the invention provides a voice transmission method, a terminal, and a computer-readable storage medium. A voice signal of a first terminal is acquired, characteristic information of the voice signal is extracted, the characteristic information is sent to a second terminal through a CS (Circuit Switch) domain, and the voice signal is transmitted to the second terminal through a PS (Packet Switch) domain, so that, when network parameters meet a preset condition, the second terminal repairs the voice signal received through the PS domain according to the characteristic information received through the CS domain and outputs the repaired voice signal. On this basis, the first terminal extracts the characteristic information from the voice signal; the CS domain carries the characteristic information and the PS domain carries the original voice signal of the call to the second terminal. When the network signal is poor, the second terminal compares the characteristic information received through the CS domain with the voice signal received through the PS domain and repairs the missing and distorted parts of the voice signal, so that the voice information of the transmitting end is fully restored without loss of sound quality and without excessive delay. The invention is therefore flexible: when the network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded network signal on call voice quality.

As shown in fig. 1, fig. 1 is a flowchart of a voice transmission method according to an embodiment of the present invention. The voice transmission method is applied to the first terminal, and comprises the following steps:

S101, acquiring a voice signal of a first terminal;

S102, extracting characteristic information of the voice signal;

S103, transmitting the characteristic information to a second terminal through a circuit switched domain;

S104, transmitting the voice signal to the second terminal through a packet switched domain, so that, when the network parameter meets the preset condition, the second terminal repairs the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain and outputs the repaired voice signal.
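The first-terminal flow above can be sketched in a few lines. This is only an illustrative sketch: the `Features` container, the `transmit_segment` function, and the `extract`, `send_cs`, and `send_ps` callables are hypothetical names that stand in for the feature extractor and the two transport paths, none of which the text specifies in detail.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Features:
    """Hypothetical container for the extracted characteristic information."""
    content: str               # what was said
    time_index: int            # position of the segment within the call
    freq_profile: List[float]  # coarse frequency-domain (timbre) descriptor


def transmit_segment(
    samples: List[float],
    time_index: int,
    extract: Callable[[List[float], int], Features],
    send_cs: Callable[[Features], None],
    send_ps: Callable[[int, List[float]], None],
) -> None:
    """Run S102 to S104 for one already-acquired (S101) speech segment."""
    features = extract(samples, time_index)  # S102: extract characteristic information
    send_cs(features)                        # S103: characteristic information over the CS domain
    send_ps(time_index, samples)             # S104: original voice over the PS domain
```

Passing the transports in as callables keeps the sketch independent of any particular modem or network API.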

It is understood that the terminal may include, but is not limited to, a mobile phone. The first terminal and the second terminal denote two different terminals.

It is understood that the characteristic information includes, but is not limited to, content information, time domain information, and frequency domain characteristic information of the voice signal. Transmitting the content information keeps the voice free of noticeable loss, interruption, and jitter; transmitting the frequency domain characteristic information reduces voice distortion; and transmitting the time domain information means that, when network delay rises sharply, the call experience of both parties is not degraded by the larger delay.

It can be understood that FIG. 2 is a schematic diagram of the content information, time domain information, and frequency domain characteristic information corresponding to a single speech segment. The speech segment is only an example; in practice, the duration of a speech segment can be chosen flexibly according to the sampling rate of the specific coding mode. As shown in FIG. 2, the content information that the user intends to express can be extracted from the original voice signal in the upper part of the figure, for example: "I come home and eat in the evening". The time axis in the middle of the figure is the time domain part of the speech and indicates the sequential position of the segment within the whole call. The frequency domain characteristic curve in the lower part of the figure represents timbre and can be used to identify each person's speaking characteristics; because each person's frequency domain characteristic curve changes little, the amount of data transmitted for this part is small.

It can be understood that the content information, time domain information, and frequency domain characteristic information of the voice signal are extracted with the number of sampling points matching the VONR voice sampling rate, and that the three parts of information for each sampling point are coded in one-to-one correspondence.

It can be understood that the CS domain is a circuit switched domain, and is mainly responsible for voice services and video telephony services; the PS domain is a packet switched domain and is mainly responsible for data traffic.

It can be understood that the invention can flexibly adjust its strategy according to network signal quality: when the VONR network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded 5G network signal on call sound quality. The quality of the network signal can be judged by a preset condition. For example, when a network parameter of the VONR is greater than a preset threshold, the network signal is considered poor and the defective part of the voice signal needs to be repaired. A network parameter of the VONR being greater than the preset threshold may include: the packet loss rate of the network being greater than 10%, or the network delay having increased by more than 20 ms. If either condition is met, the network signal is considered poor and the defective part of the voice signal needs to be repaired, where the defective part may include a missing part and a damaged part of the voice signal.
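The preset condition can be written as a single predicate. The following is a minimal sketch using the example values from the text (10% packet loss, 20 ms of added delay); the function and constant names are hypothetical, and how the two measurements are obtained is outside its scope.

```python
LOSS_RATE_THRESHOLD = 0.10     # 10% packet loss, the example value in the text
ADDED_DELAY_THRESHOLD_MS = 20  # 20 ms of added delay, the example value in the text


def needs_repair(loss_rate: float, added_delay_ms: float) -> bool:
    """Return True when either network parameter exceeds its preset threshold."""
    return (
        loss_rate > LOSS_RATE_THRESHOLD
        or added_delay_ms > ADDED_DELAY_THRESHOLD_MS
    )
```

The comparisons are strict, so a loss rate of exactly 10% or an added delay of exactly 20 ms does not trigger repair.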

It can be understood that the invention can decide whether to start the repair strategy according to network packet loss and delay. The network packet loss rate can be read in real time from the terminal's log, and the network delay can be read in real time from network downlink information; the defective part of the voice signal is repaired once packet loss or delay reaches a certain level. Any utterance can be decomposed into three parts, namely content information, time domain information, and frequency domain characteristic information, and these three parts alone are enough to restore the utterance faithfully. The content information, time domain information, and frequency domain characteristic information of the voice are converted into binary digital information and transmitted through the CS domain, while the original, very clear, high-sampling-rate voice analog signal is transmitted through the 5G VONR. The peer user's intelligent terminal compares the information received through the CS domain with the voice received through the PS domain and then repairs the missing and distorted parts of the voice received through the PS domain, so that the voice information of the transmitting end is fully restored without loss of sound quality and without excessive delay.

It can be understood that the voice signal of the first terminal is acquired, the characteristic information of the voice signal is extracted, the characteristic information is sent to the second terminal through the CS domain, and the voice signal is transmitted to the second terminal through the PS domain of the VONR, so that, when the network parameter of the VONR is greater than the preset threshold, the second terminal repairs the voice signal received through the PS domain according to the characteristic information received through the CS domain and outputs the repaired voice signal. On this basis, the first terminal extracts the characteristic information from the voice signal; the CS domain carries the characteristic information and the 5G VONR carries the original voice signal to the second terminal. When the network signal is poor, the second terminal compares the characteristic information received through the CS domain with the voice signal received through the PS domain and repairs the missing and distorted parts of the voice signal, so that the voice information of the transmitting end is fully restored without loss of sound quality and without excessive delay. The invention is therefore flexible: when the VONR network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded 5G network signal on call voice quality.

As shown in fig. 3, step S103 may include, but is not limited to, the following sub-steps:

S301, converting the characteristic information into digital information;

S302, sending the digital information to the second terminal through the circuit switched domain.

It can be understood that the first terminal converts the content information, the time domain information and the frequency domain characteristic information of the voice signal into binary digital information, and then sends the digital information to the second terminal through the CS domain, so that the amount of data transmitted can be reduced through digital conversion.
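As an illustration of the conversion to binary digital information, the sketch below packs one record of characteristic information into bytes and unpacks it again. The wire layout (a 32-bit time index, two 16-bit length fields, UTF-8 text, 32-bit floats) is purely an assumption made for this example; the text does not define an encoding, and `pack_features` and `unpack_features` are hypothetical names.

```python
import struct
from typing import List, Tuple

_HEADER = "!IHH"  # assumed layout: time index, text length, number of frequency values


def pack_features(time_index: int, content: str, freq_profile: List[float]) -> bytes:
    """Serialize one record of characteristic information to bytes."""
    text = content.encode("utf-8")
    header = struct.pack(_HEADER, time_index, len(text), len(freq_profile))
    body = struct.pack(f"!{len(freq_profile)}f", *freq_profile)
    return header + text + body


def unpack_features(blob: bytes) -> Tuple[int, str, List[float]]:
    """Inverse of pack_features."""
    time_index, text_len, n_freq = struct.unpack_from(_HEADER, blob, 0)
    offset = struct.calcsize(_HEADER)
    content = blob[offset:offset + text_len].decode("utf-8")
    offset += text_len
    freq_profile = list(struct.unpack_from(f"!{n_freq}f", blob, offset))
    return time_index, content, freq_profile
```

Frequency values are stored as 32-bit floats in this sketch, so values that are not exactly representable in that width come back slightly rounded.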

In summary, the method can adaptively reduce the jitter and distortion of voice caused by network loss and packet loss in a VONR call, can keep the call on VONR throughout rather than forcing it to fall back to 3G or 2G because of poor signal, and can ensure that a mobile phone still meets the GCF (Global Certification Forum) certification requirements for voice quality and overall delay after packet loss and delay are introduced. Because the adjustment is adaptive and flexible, the user hardly perceives the influence of VONR network quality on the voice call. The invention is therefore a good transitional solution for the unstable network conditions caused by insufficient base station coverage in many areas during the initial stage of 5G network construction. It repairs the voice adaptively according to the packet loss or delay of the network, which ensures the timeliness, stability, and consistency of voice information transmission independently of 5G network signal quality.

As shown in FIG. 4, FIG. 4 is a flowchart of a voice transmission method according to an embodiment of the present invention. The voice transmission method is applied to the second terminal and includes, but is not limited to, the following steps:

S401, receiving characteristic information sent by a first terminal through a circuit switched domain, wherein the characteristic information is extracted from a voice signal of the first terminal;

S402, receiving the voice signal transmitted by the first terminal through a packet switched domain;

S403, when the network parameter of the VONR meets the preset condition, repairing the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain;

S404, outputting the repaired voice signal.

It can be understood that the second terminal can decide whether to repair the voice signal according to the packet loss rate and the added delay in the network. After the content information, time domain information, and frequency domain characteristic information of the voice signal are transmitted to the opposite terminal, they can be compared one-to-one with the original voice signal transmitted by the VONR because the sampling rates are identical; when the network condition is poor, the opposite terminal performs one-to-one synchronous synthesis and repair according to the codes.

It will be appreciated that, as shown in FIG. 5, the speech segments are ordered according to the time domain information received through the CS domain, where the voice signal is composed of a plurality of speech segments. Delay and jitter can scramble the order of the speech segments, mix invalid speech segments in among them, and delay valid speech segments. To cope with this, the time domain information transmitted through the CS domain is used to put the out-of-order speech segments back into their correct positions and to remove the invalid segments.
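The reordering step can be sketched as follows, under the assumption that each speech segment carries an integer identifier and that the time domain information received through the CS domain amounts to the list of valid identifiers in playback order. The name `reorder_segments` and the treatment of unknown or duplicate identifiers as invalid are choices made for this example only.

```python
from typing import Dict, Iterable, List, Tuple


def reorder_segments(
    received: Iterable[Tuple[int, bytes]],
    cs_timeline: List[int],
) -> List[Tuple[int, bytes]]:
    """Sort PS-domain segments into CS timeline order and drop invalid ones."""
    valid_ids = set(cs_timeline)
    by_id: Dict[int, bytes] = {}
    for seg_id, payload in received:
        # Keep the first copy of each known segment; unknown IDs and
        # later duplicates are treated as invalid and discarded.
        if seg_id in valid_ids and seg_id not in by_id:
            by_id[seg_id] = payload
    # Segments that never arrived are simply absent from the result.
    return [(seg_id, by_id[seg_id]) for seg_id in cs_timeline if seg_id in by_id]
```

Segments that never arrived are left out of the result, so a later repair step can detect and fill them.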

It can be understood that, as shown in FIG. 6, when network loss leaves speech segment information partially missing or incomplete, the defective part of the voice signal can be repaired well with speech segments synthesized from the content information and frequency domain characteristic information transmitted through the CS domain, where the defective part may include a missing part and a damaged part of the voice signal.
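A minimal sketch of this substitution is given below. It assumes the set of damaged segment identifiers has already been determined and that a `synthesize` callable can produce audio from content information and a frequency domain profile; that callable is a placeholder, since the text does not describe the synthesis itself, and `repair_segments` is a hypothetical name.

```python
from typing import Callable, Dict, List, Set, Tuple


def repair_segments(
    ordered_ids: List[int],
    ps_segments: Dict[int, bytes],
    cs_features: Dict[int, Tuple[str, List[float]]],
    damaged_ids: Set[int],
    synthesize: Callable[[str, List[float]], bytes],
) -> List[bytes]:
    """Replace missing or damaged PS segments with synthesized ones."""
    output: List[bytes] = []
    for seg_id in ordered_ids:
        if seg_id in ps_segments and seg_id not in damaged_ids:
            # Intact segment: pass the original PS-domain audio through.
            output.append(ps_segments[seg_id])
        else:
            # Missing or damaged: rebuild from the CS-domain characteristic information.
            content, freq_profile = cs_features[seg_id]
            output.append(synthesize(content, freq_profile))
    return output
```

Intact segments pass through unchanged, so the original high-quality audio is used wherever it arrived.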

It can be understood that the invention can flexibly adjust its strategy according to network signal quality: when the VONR network signal is good, no repair is needed; when the network signal degrades or is poor, repair starts automatically, so that the user does not perceive the effect of the degraded network signal on call sound quality. The quality of the network signal can be judged by a preset condition. For example, when a network parameter of the VONR is greater than a preset threshold, the network signal is considered poor and the defective part of the voice signal needs to be repaired. A network parameter of the VONR being greater than the preset threshold may include: the packet loss rate of the network being greater than a preset percentage (for example, 10%), or the network delay having increased by more than a preset delay (for example, 20 ms). If either condition is met, the network signal is considered poor and the defective part of the voice signal needs to be repaired.

It can be understood that the second terminal receives the characteristic information sent by the first terminal through the CS domain, the characteristic information having been extracted from the voice signal of the first terminal. At the same time, the second terminal receives the voice signal transmitted by the first terminal through the PS domain of the VONR. When the network parameter of the VONR is greater than a preset threshold, the second terminal repairs the voice signal received through the PS domain according to the characteristic information received through the CS domain and outputs the repaired voice signal. On this basis, the first terminal extracts the characteristic information from the voice signal; the CS domain carries the characteristic information and the 5G VONR carries the original voice signal to the second terminal. When the network signal is poor, the second terminal compares the characteristic information received through the CS domain with the voice signal received through the PS domain and repairs the missing and distorted parts of the voice signal, so that the voice information of the transmitting end is fully restored without loss of sound quality and without excessive delay. The invention is therefore flexible: when the VONR network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded 5G network signal on call voice quality.

As shown in fig. 7, step S403 may include, but is not limited to, the following sub-steps:

S701, comparing the voice signal received through the packet switched domain against the content information, time domain information, and frequency domain characteristic information received through the circuit switched domain to determine the defective part of the voice signal received through the packet switched domain;

S702, repairing the defective part.
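The comparison in S701 can be illustrated with a simple check. The text does not say how a damaged segment is recognized, so this sketch assumes, purely for illustration, that the CS-domain information gives the expected byte length of each segment; a segment that is absent counts as missing and one with a different length counts as damaged. `Defects` and `find_defects` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class Defects(NamedTuple):
    missing: List[int]  # segment IDs that never arrived over the PS domain
    damaged: List[int]  # segment IDs that arrived with an unexpected length


def find_defects(
    expected_lengths: Dict[int, int],
    ps_segments: Dict[int, bytes],
) -> Defects:
    """Compare PS-domain segments against CS-domain expectations (S701)."""
    missing: List[int] = []
    damaged: List[int] = []
    for seg_id in sorted(expected_lengths):
        payload = ps_segments.get(seg_id)
        if payload is None:
            missing.append(seg_id)
        elif len(payload) != expected_lengths[seg_id]:
            # Assumed damage criterion: length mismatch (not from the source text).
            damaged.append(seg_id)
    return Defects(missing, damaged)
```

The two lists together form the defective part that S702 then repairs.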

In summary, when a VONR voice call starts, the invention begins extracting information from the original voice signal, decomposing it into three parts (content information, time domain information, and frequency domain characteristic information) and converting them into digital signals. While the original voice information is transmitted through the PS domain of the VONR, the extracted information is continuously sent to the opposite mobile phone through the CS domain. The condition of the VONR network is then judged: if the packet loss rate is greater than a preset percentage, or the network delay has increased by more than a preset delay, the opposite terminal repairs the original voice signal once it has received both the original voice signal and the extracted information. Because the two were coded in one-to-one correspondence beforehand, no information is lost, the timbre of the original voice signal is restored, and the VONR user experience is not degraded. If the VONR network is good, no repair is started. The invention is therefore flexible and adaptive, and its strategy can be adjusted according to network quality: transmitting the content information keeps the voice free of noticeable loss, interruption, and jitter; transmitting the frequency domain characteristic information reduces voice distortion; and transmitting the time domain information means that, when network delay rises sharply, the call experience of both parties is not degraded by the larger delay.

As shown in fig. 8, the embodiment of the invention further provides a voice transmission system.

The voice transmission system comprises an information extraction module, a network condition judging module, and a voice repair module. The information extraction module extracts the content information, time domain information, and frequency domain characteristic information of the speaker's voice; the number of extracted sampling points matches the VONR voice sampling rate, and the content information, time domain information, and frequency domain characteristic information of each sampling point are coded in one-to-one correspondence. The network condition judging module decides, according to the packet loss rate and the added delay in the network, whether the voice repair module needs to be started. The voice repair module works as follows: after the content information, time domain information, and frequency domain characteristic information are transmitted to the opposite terminal, they can be compared one-to-one with the original voice transmitted by the VONR because the sampling rates are identical, and when the network condition is poor, one-to-one synchronous synthesis and repair is performed according to the codes. When a VONR voice call starts, the voice transmission system begins extracting information from the original voice signal, decomposing it into three parts (content information, time domain information, and frequency domain characteristic information) and converting them into digital signals. While the original voice signal is transmitted through the PS domain of the VONR, the extracted information is continuously sent to the opposite mobile phone through the CS domain. The condition of the VONR network is then judged: if the packet loss rate is greater than a preset percentage, or the network delay has increased by more than a preset delay, the opposite terminal repairs the original voice signal once it has received both the original voice signal and the extracted information. Because the two were coded in one-to-one correspondence beforehand, no information is lost, the timbre of the original voice can be restored, and the VONR user experience is not degraded. If the VONR network condition is good, the repair module is not started. The whole voice transmission system is adaptive, and its strategy can be adjusted according to network quality: transmitting the content information keeps the voice free of noticeable loss, interruption, and jitter; transmitting the frequency domain characteristic information reduces voice distortion; and transmitting the time domain information means that, when network delay rises sharply, the call experience of both parties is not degraded by the larger delay.

As shown in fig. 9, the embodiment of the present invention further provides a terminal, which includes, but is not limited to, a mobile phone.

Specifically, the terminal includes one or more processors and a memory; FIG. 9 takes one processor and one memory as an example. The processor and the memory may be connected by a bus or otherwise; FIG. 9 takes connection by a bus as an example.

As a non-transitory computer-readable storage medium, the memory can store non-transitory software programs and non-transitory computer-executable programs, such as the voice transmission method in the above embodiments of the present invention. The processor implements the voice transmission method in the above embodiments by running the non-transitory software programs and programs stored in the memory.

The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function, and the storage data area may store data and the like required to perform the voice transmission method in the above-described embodiments of the present invention. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, and the remote memory may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The non-transitory software programs and the programs required for implementing the voice transmission method in the embodiments of the present invention are stored in the memory. When executed by the one or more processors, they perform the voice transmission method in the above embodiments, for example method steps S101 to S104 in fig. 1, method steps S301 to S302 in fig. 3, method steps S401 to S404 in fig. 4, or method steps S701 to S702 in fig. 7 described above: the voice signal of the first terminal is acquired, the feature information of the voice signal is extracted, the feature information is sent to the second terminal through the CS domain, and the voice signal is transmitted to the second terminal through the PS domain of VONR, so that, in the case that a network parameter of VONR is greater than the preset threshold, the second terminal performs voice repair on the voice signal received through the PS domain according to the feature information received through the CS domain and outputs the repaired voice signal. With this method, the first terminal extracts the feature information from the voice signal and transmits it through the CS domain, alongside the original voice signal transmitted over 5G VONR, to the second terminal. When the network signal is poor, the second terminal compares the feature information received through the CS domain with the voice signal received through the PS domain and performs loss repair and distortion repair on the voice signal, thereby completely restoring the voice information of the sending terminal without reducing the sound quality or introducing excessive delay.
Therefore, the invention is highly flexible: when the VONR network signal is good, no repair or restoration is needed; when the network signal fades, restoration starts automatically, and the user does not perceive any effect of the degraded 5G network signal on the voice quality of the call.

Furthermore, an embodiment of the present invention provides a computer-readable storage medium storing a computer-executable program. The computer-executable program is executed by one or more control processors, for example by the one processor in fig. 9, and causes the one or more processors to execute the voice transmission method in the above embodiments, for example method steps S101 to S104 in fig. 1, method steps S301 to S302 in fig. 3, method steps S401 to S404 in fig. 4, or method steps S701 to S702 in fig. 7 described above: the voice signal of the first terminal is acquired, the feature information of the voice signal is extracted, the feature information is sent to the second terminal through the CS domain, and the voice signal is transmitted to the second terminal through the PS domain of VONR, so that, in the case that a network parameter of VONR is greater than the preset threshold, the second terminal performs voice repair on the voice signal received through the PS domain according to the feature information received through the CS domain and outputs the repaired voice signal. With this method, the first terminal extracts the feature information from the voice signal and transmits it through the CS domain, alongside the original voice signal transmitted over 5G VONR, to the second terminal. When the network signal is poor, the second terminal compares the feature information received through the CS domain with the voice signal received through the PS domain and performs loss repair and distortion repair on the voice signal, thereby completely restoring the voice information of the sending terminal without reducing the sound quality or introducing excessive delay.
Therefore, the invention is highly flexible: when the VONR network signal is good, no repair or restoration is needed; when the network signal fades, restoration starts automatically, and the user does not perceive any effect of the degraded 5G network signal on the voice quality of the call.

Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those skilled in the art, the term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable programs, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer readable programs, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media.

While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the above embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit and scope of the present invention, and these equivalent modifications or substitutions are included in the scope of the present invention as defined in the appended claims.

Claims

1. A voice transmission method applied to a first terminal, the method comprising:

acquiring a voice signal of the first terminal;

extracting characteristic information of the voice signal;

transmitting the characteristic information to a second terminal through a circuit switched domain;

and transmitting the voice signal to the second terminal through the packet switching domain, so that the second terminal repairs the voice signal received by the packet switching domain according to the characteristic information received by the circuit switching domain and outputs the repaired voice signal under the condition that the network parameter meets the preset condition.

2. The method of claim 1, wherein extracting characteristic information of the speech signal comprises extracting content information, time domain information, and frequency domain characteristic information of the speech signal.

3. The method of claim 2, wherein the extracting content information, time domain information, and frequency domain characteristic information of the speech signal comprises:

extracting the content information, the time domain information and the frequency domain characteristic information of the voice signal at the same sampling rate as the sampling rate of transmitting the voice signal through a packet switched domain; and carrying out one-by-one corresponding coding on the content information, the time domain information and the frequency domain characteristic information of the sampled voice signal.

4. The method of claim 1, wherein the network parameter meeting the preset condition comprises at least one of:

the packet loss rate of the network is larger than a preset percentage;

the delay increase of the network is greater than the preset delay.

5. The method of claim 1, wherein said transmitting the characteristic information to the second terminal through the circuit switched domain comprises:

converting the characteristic information into digital information;

and transmitting the digital information to the second terminal through a circuit switched domain.

6. A voice transmission method applied to a second terminal, the method comprising:

receiving characteristic information sent by a first terminal through a circuit switched domain, wherein the characteristic information is extracted from a voice signal of the first terminal;

receiving the voice signal transmitted by the first terminal through a packet switched domain;

under the condition that the network parameters meet the preset conditions, repairing the voice signals received by the packet switching domain according to the characteristic information received by the circuit switching domain;

and outputting the repaired voice signal.

7. The method of claim 6, wherein the characteristic information includes content information, time domain information, and frequency domain characteristic information of the voice signal, and wherein performing voice repair on the voice signal received in the packet switched domain based on the characteristic information received in the circuit switched domain comprises:

comparing the voice signals received by the packet switching domain according to the content information, the time domain information and the frequency domain characteristic information of the voice signals received by the circuit switching domain to determine the defect part of the voice signals received by the packet switching domain;

repairing the defect part.

8. The method of claim 7, wherein said performing speech repair on the speech signal received by the packet switched domain based on said characteristic information received by the circuit switched domain, further comprises:

and sequencing the voice fragments according to the time domain information of the circuit switched domain, wherein the voice signal is composed of a plurality of voice fragments.

9. A terminal, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the speech transmission method according to any one of claims 1 to 5 or the speech transmission method according to any one of claims 6 to 8 when executing the computer program.

10. A computer-readable storage medium storing a computer-executable program for causing a computer to execute the voice transmission method according to any one of claims 1 to 5 or the voice transmission method according to any one of claims 6 to 8.