CN116935870A - Voice transmission method, terminal and computer readable storage medium - Google Patents

Voice transmission method, terminal and computer readable storage medium

Publication number
CN116935870A
Authority
CN
China
Prior art keywords
voice
domain
terminal
information
voice signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210364802.5A
Other languages
Chinese (zh)
Inventor
颜蓓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN202210364802.5A priority Critical patent/CN116935870A/en
Priority to PCT/CN2023/071976 priority patent/WO2023193506A1/en
Publication of CN116935870A publication Critical patent/CN116935870A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003: Changing voice quality, e.g. pitch or formants
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W36/00: Hand-off or reselection arrangements
    • H04W36/14: Reselecting a network or an air interface

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a voice transmission method, a terminal and a computer-readable storage medium. A voice signal of a first terminal is acquired, characteristic information of the voice signal is extracted, the characteristic information is sent to a second terminal through a circuit switched domain, and the voice signal is transmitted to the second terminal through a packet switched domain, so that, when network parameters meet a preset condition, the second terminal repairs the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain and outputs the repaired voice signal. The scheme is flexible: when the network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded network signal on call voice quality.

Description

Voice transmission method, terminal and computer readable storage medium
Technical Field
Embodiments of the present invention relate to, but are not limited to, the field of communications, and in particular, to a voice transmission method, a terminal, and a computer-readable storage medium.
Background
In the early deployment stage of a new generation of mobile communication technology, or in regions where base stations are sparse, voice transmission quality can degrade rapidly as network quality deteriorates. For example, in a VONR (Voice over New Radio, 5G telephony) call, a common problem is that when 5G (5th Generation Mobile Communication Technology) base stations are sparsely distributed, network loss and a sharp increase in delay cause heavy jitter and intermittent distortion in the voice, and voice quality can be far worse than in the CS domain, resulting in a poor user experience. At present, the way to handle voice quality degradation caused by poor signal in 4G and 5G networks is to force the voice call to fall back to 3G or 2G. This avoids voice interruption, but it cannot preserve the high bandwidth and high audio quality of VONR, so a user who places a call on 5G actually gets only a 3G or 2G call experience. Therefore, how to avoid the rapid degradation of voice quality when the new-generation communication network is in poor condition is a problem to be solved.
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
Embodiments of the present invention provide a voice transmission method, terminal, and computer-readable storage medium, which can transmit a VONR voice signal with high fidelity and without delay even in a case of poor 5G network conditions.
In a first aspect, an embodiment of the present invention provides a voice transmission method, applied to a first terminal, where the method includes:
acquiring a voice signal of the first terminal; extracting characteristic information of the voice signal; transmitting the characteristic information to a second terminal through a circuit switched domain; and transmitting the voice signal to the second terminal through the packet switching domain, so that the second terminal repairs the voice signal received by the packet switching domain according to the characteristic information received by the circuit switching domain and outputs the repaired voice signal under the condition that the network parameter meets the preset condition.
In a second aspect, an embodiment of the present invention provides a voice transmission method, applied to a second terminal, where the method includes:
receiving characteristic information sent by a first terminal through a circuit switched domain, wherein the characteristic information is extracted from a voice signal of the first terminal; receiving the voice signal transmitted by the first terminal through a packet switched domain; under the condition that the network parameters meet the preset conditions, repairing the voice signals received by the packet switching domain according to the characteristic information received by the circuit switching domain; and outputting the repaired voice signal.
In a third aspect, an embodiment of the present invention provides a terminal, including: a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the voice transmission method according to the first aspect or the voice transmission method according to the second aspect when executing the computer program.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer-executable program for causing a computer to execute the voice transmission method according to the first aspect or the voice transmission method according to the second aspect.
The embodiments of the invention comprise: acquiring a voice signal of a first terminal, extracting characteristic information of the voice signal, sending the characteristic information to a second terminal through a circuit switched domain, and transmitting the voice signal to the second terminal through a packet switched domain, so that, when network parameters meet a preset condition, the second terminal repairs the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain and outputs the repaired voice signal. On this basis, the first terminal extracts the characteristic information from the voice signal, the characteristic information is transmitted to the second terminal through the circuit switched domain, and the original voice signal is transmitted to the second terminal through the packet switched domain. When the network signal is poor, the second terminal compares the characteristic information received through the circuit switched domain with the voice signal received through the packet switched domain and performs missing-part repair and distortion repair on the voice signal, so that the voice information of the transmitting end is restored without loss of audio quality and without excessive delay. The invention is therefore flexible: when the network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded network signal on call voice quality.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate and do not limit the invention.
FIG. 1 is a flowchart of a voice transmission method (first terminal side) according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of content information, time domain information, and frequency domain characteristic information corresponding to a single speech segment according to an embodiment of the present invention;
FIG. 3 is a sub-flowchart of a voice transmission method according to an embodiment of the present invention;
FIG. 4 is a flowchart of a voice transmission method (second terminal side) according to an embodiment of the present invention;
FIG. 5 is a schematic illustration of speech repair using time domain information according to one embodiment of the present invention;
FIG. 6 is a schematic diagram of speech repair using content information and frequency domain characteristic information according to one embodiment of the present invention;
FIG. 7 is a sub-flowchart of a voice transmission method according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a voice transmission system according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a terminal structure according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It should be understood that, in the description of the embodiments of the present invention, "plural" (or "multiple") means two or more; "greater than", "less than", "exceeding" and the like are understood to exclude the stated number, while "above", "below", "within" and the like are understood to include the stated number. Where used, the terms "first", "second", etc. serve only to distinguish technical features and should not be construed as indicating or implying relative importance, the number of the indicated technical features, or their order.
The embodiments of the invention provide a voice transmission method, a terminal and a computer-readable storage medium. A voice signal of a first terminal is acquired, characteristic information of the voice signal is extracted, the characteristic information is sent to a second terminal through a CS (Circuit Switched) domain, and the voice signal is transmitted to the second terminal through a PS (Packet Switched) domain, so that, when network parameters meet a preset condition, the second terminal repairs the voice signal received through the PS domain according to the characteristic information received through the CS domain and outputs the repaired voice signal. On this basis, the characteristic information extracted by the first terminal is transmitted through the CS domain while the original voice signal of the call is transmitted through the PS domain. When the network signal is poor, the second terminal compares the characteristic information received through the CS domain with the voice signal received through the PS domain and performs missing-part repair and distortion repair on the voice signal, so that the voice information of the transmitting end is restored without loss of audio quality and without excessive delay. The invention is therefore flexible: when the network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded network signal on call voice quality.
FIG. 1 is a flowchart of a voice transmission method according to an embodiment of the present invention. The voice transmission method is applied to the first terminal and comprises the following steps:
S101, acquiring a voice signal of a first terminal;
S102, extracting characteristic information of the voice signal;
S103, transmitting the characteristic information to a second terminal through a circuit switched domain;
S104, transmitting the voice signal to the second terminal through the packet switched domain, so that, when the network parameter meets the preset condition, the second terminal repairs the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain and outputs the repaired voice signal.
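As a non-limiting illustration, steps S101 to S104 on the first terminal may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names FeatureInfo, transmit_voice, extract_features, send_cs and send_ps are hypothetical, and the CS-domain and PS-domain transports are represented only as callables supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class FeatureInfo:
    """Hypothetical container for the three kinds of characteristic information."""
    content: str           # content information (what was said)
    time_index: int        # time domain information (position of the segment in the call)
    spectrum: List[float]  # frequency domain characteristic information


def transmit_voice(
    segments: Iterable[bytes],
    extract_features: Callable[[int, bytes], FeatureInfo],
    send_cs: Callable[[FeatureInfo], None],
    send_ps: Callable[[int, bytes], None],
) -> None:
    """Send every acquired speech segment over both domains.

    `segments` stands for the acquired voice signal (S101). The extractor and
    the two senders are assumptions supplied by the caller.
    """
    for index, segment in enumerate(segments):
        features = extract_features(index, segment)  # S102: extract characteristic information
        send_cs(features)                            # S103: features over the circuit switched domain
        send_ps(index, segment)                      # S104: voice over the packet switched domain
```

In this sketch both domains are fed for every segment regardless of network quality, which mirrors the description that the decision to repair is taken at the second terminal.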
It is understood that the terminal may include, but is not limited to, a cell phone. Wherein the first terminal and the second terminal are used to represent two different terminals.
It is understood that the characteristic information includes, but is not limited to, content information, time domain information, and frequency domain characteristic information of the voice signal. Transmitting the content information keeps the voice free of noticeable loss, interruption, and jitter; transmitting the frequency domain characteristic information reduces the distortion of the voice; and transmitting the time domain information ensures that, when network delay rises sharply, the call experience of both parties is not degraded by the larger delay.
FIG. 2 is a schematic diagram of the content information, time domain information and frequency domain characteristic information corresponding to a single speech segment. The speech segment is only an example; in practice, the duration of a speech segment can be chosen flexibly according to the sampling rate of the specific coding scheme. As shown in FIG. 2, the content information that the user wants to express can be extracted from the original voice signal in the upper part of the figure, for example: "I will come home for dinner this evening". The time axis in the middle of the figure is the time domain part of the speech and indicates the sequential position of the segment within the whole call. The frequency domain characteristic curve in the lower part of the figure represents timbre and can be used to identify each person's speaking characteristics; because each person's frequency domain characteristic curve changes little, the amount of data transmitted for this part is small.
It can be understood that the content information, the time domain information and the frequency domain characteristic information of the voice signal are extracted at sampling points whose rate matches the VONR voice sampling rate, and the three parts of information at each sampling point are coded in one-to-one correspondence.
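The one-to-one coding of the three parts of information may, for example, be pictured as a table keyed by a shared code. The following Python sketch is only an illustration under assumptions: the function name encode_features and the use of consecutive integers as codes are hypothetical and are not specified by the description.

```python
from typing import Dict, List, Sequence, Tuple


def encode_features(
    contents: Sequence[str],
    time_stamps: Sequence[int],
    spectra: Sequence[List[float]],
) -> Dict[int, Tuple[str, int, List[float]]]:
    """Give each sampling point one code shared by its three parts of information.

    Assumption: the code is simply the position of the sampling point.
    """
    if not (len(contents) == len(time_stamps) == len(spectra)):
        raise ValueError("the three feature sequences must have the same length")
    return {
        code: (content, stamp, spectrum)
        for code, (content, stamp, spectrum) in enumerate(zip(contents, time_stamps, spectra))
    }
```

Because all three parts share one code, the receiving side can look up the content, time position and frequency characteristics of a given sampling point with a single key.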
It can be understood that the CS domain is a circuit switched domain, and is mainly responsible for voice services and video telephony services; the PS domain is a packet switched domain and is mainly responsible for data traffic.
It can be understood that the invention can flexibly adjust its strategy according to network signal quality: when the VONR network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded 5G network signal on call audio quality. The quality of the network signal can be judged by a preset condition. For example, when a network parameter of the VONR is greater than a preset threshold, the network signal is considered poor and the defective part of the voice signal needs to be repaired. The case where the network parameter of the VONR is greater than the preset threshold may include: the packet loss rate of the network is greater than 10%, or the delay of the network has increased by more than 20 ms. If either condition is met, the network signal is considered poor and the defective part of the voice signal needs to be repaired, where the defective part may include missing parts and damaged parts of the voice signal.
It can be understood that the invention can decide whether to start the repair strategy according to network packet loss and delay. The packet loss rate can be read in real time from the terminal log, and the network delay can be read in real time from network downlink information; when packet loss or network delay reaches a certain level, the defective part of the voice signal is repaired. Any utterance can be decomposed into three parts, namely content information, time domain information and frequency domain characteristic information, and these three parts are sufficient to restore the utterance faithfully. The content information, the time domain information and the frequency domain characteristic information of the voice are converted into binary digital information and transmitted through the CS domain, while the original, clear, high-sampling-rate analog voice signal is transmitted through the 5G VONR. The terminal of the opposite user compares the information received through the CS domain with the voice received through the PS domain and performs missing-part repair and distortion repair on the voice received through the PS domain, so that the voice information of the transmitting end is restored without loss of audio quality and without excessive delay.
It can be understood that the voice signal of the first terminal is acquired, the characteristic information of the voice signal is extracted, the characteristic information is sent to the second terminal through the CS domain, and the voice signal is transmitted to the second terminal through the PS domain of the VONR, so that, when the network parameter of the VONR is greater than the preset threshold, the second terminal repairs the voice signal received through the PS domain according to the characteristic information received through the CS domain and outputs the repaired voice signal. On this basis, the characteristic information extracted by the first terminal is transmitted through the CS domain while the original voice signal is transmitted through the 5G VONR. When the network signal is poor, the second terminal compares the characteristic information received through the CS domain with the voice signal received through the PS domain and performs missing-part repair and distortion repair on the voice signal, so that the voice information of the transmitting end is restored without loss of audio quality and without excessive delay. The invention is therefore flexible: when the VONR network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded 5G network signal on call voice quality.
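The preset condition described above may be expressed, for example, as a simple predicate. The sketch below is illustrative only: the function name needs_repair and its parameter names are hypothetical, and the default thresholds are the example values of 10% and 20 ms given in the description.

```python
def needs_repair(
    packet_loss_rate: float,
    added_delay_ms: float,
    loss_threshold: float = 0.10,
    delay_threshold_ms: float = 20.0,
) -> bool:
    """Return True when either network parameter exceeds its preset threshold.

    `packet_loss_rate` is a fraction between 0 and 1; `added_delay_ms` is the
    increase in network delay in milliseconds. Both comparisons are strict
    ("greater than"), so a value equal to the threshold does not trigger repair.
    """
    return packet_loss_rate > loss_threshold or added_delay_ms > delay_threshold_ms
```

For instance, needs_repair(0.12, 0.0) and needs_repair(0.0, 25.0) both evaluate to True, while needs_repair(0.10, 20.0) evaluates to False because neither threshold is exceeded.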
As shown in fig. 3, step S103 may include, but is not limited to, the following sub-steps:
S301, converting the characteristic information into digital information;
S302, sending the digital information to the second terminal through the circuit switched domain.
It can be understood that the first terminal converts the content information, the time domain information and the frequency domain characteristic information of the voice signal into binary digital information, and then sends the digital information to the second terminal through the CS domain, so that the amount of data transmitted can be reduced through digital conversion.
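One possible way to convert the characteristic information into binary digital information (S301) is sketched below in Python using the standard struct module. The byte layout is an assumption made purely for illustration (a header with a 32-bit time index, a 16-bit content length and a 16-bit spectrum count, followed by UTF-8 content and 32-bit floats); the description does not specify any particular format, and the function names are hypothetical.

```python
import struct
from typing import List, Tuple

# Assumed header: time index (uint32), content length in bytes (uint16),
# number of spectrum values (uint16), all in network byte order.
_HEADER = struct.Struct("!IHH")


def pack_features(time_index: int, content: str, spectrum: List[float]) -> bytes:
    """Serialize one set of characteristic information into bytes."""
    text = content.encode("utf-8")
    header = _HEADER.pack(time_index, len(text), len(spectrum))
    body = struct.pack(f"!{len(spectrum)}f", *spectrum)
    return header + text + body


def unpack_features(payload: bytes) -> Tuple[int, str, List[float]]:
    """Inverse of pack_features."""
    time_index, text_len, count = _HEADER.unpack_from(payload, 0)
    offset = _HEADER.size
    content = payload[offset:offset + text_len].decode("utf-8")
    offset += text_len
    spectrum = list(struct.unpack_from(f"!{count}f", payload, offset))
    return time_index, content, spectrum
```

Spectrum values are stored as 32-bit floats in this sketch, so values that are not exactly representable in that width are rounded on the way through.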
In summary, the method can adaptively reduce the jitter and distortion of voice caused by network loss and packet loss in a VONR call, can ensure that the call stays on VONR throughout and is not forced to fall back to 3G or 2G because of poor signal, and can ensure that a mobile phone still meets the GCF (Global Certification Forum) certification requirements on voice quality and overall delay when packet loss and delay are introduced. Because the adjustment is adaptive, the user hardly perceives the influence of VONR network quality on the voice call. The invention is therefore a good transitional solution for the unstable network conditions caused by insufficient base station coverage in many areas during the initial stage of 5G network construction. The invention can adaptively repair the voice according to the packet loss or delay condition of the network, which ensures the timeliness, stability and consistency of voice information transmission and reduces its dependence on 5G network signal quality.
FIG. 4 is a flowchart of a voice transmission method according to an embodiment of the present invention. The voice transmission method is applied to the second terminal and includes, but is not limited to, the following steps:
S401, receiving characteristic information sent by a first terminal through a circuit switched domain, wherein the characteristic information is extracted from a voice signal of the first terminal;
S402, receiving the voice signal transmitted by the first terminal through a packet switched domain;
S403, when the network parameter of the VONR meets the preset condition, repairing the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain;
S404, outputting the repaired voice signal.
It is understood that the terminal may include, but is not limited to, a cell phone. Wherein the first terminal and the second terminal are used to represent two different terminals.
It can be understood that the CS domain is a circuit switched domain, and is mainly responsible for voice services and video telephony services; the PS domain is a packet switched domain and is mainly responsible for data traffic.
It is understood that the characteristic information includes, but is not limited to, content information, time domain information, and frequency domain characteristic information of the voice signal. Transmitting the content information keeps the voice free of noticeable loss, interruption, and jitter; transmitting the frequency domain characteristic information reduces the distortion of the voice; and transmitting the time domain information ensures that, when network delay rises sharply, the call experience of both parties is not degraded by the larger delay.
It can be understood that the second terminal can decide whether to repair the voice signal according to the packet loss rate and the added delay in the network. After the content information, the time domain information and the frequency domain characteristic information of the voice signal are transmitted to the opposite terminal, they can be compared one-to-one with the original voice signal transmitted by the VONR because the sampling rates are identical; when the network condition is poor, the opposite terminal performs synchronous one-to-one synthesis repair according to the codes.
It will be appreciated that, as shown in FIG. 5, the speech segments are ordered according to the time domain information received through the CS domain, where the voice signal is composed of a plurality of speech segments. Delay and jitter can scramble the order of speech segments, mix invalid segments among valid ones, and delay valid segments; to address this, the time domain information transmitted through the CS domain is used to put out-of-order segments back in their original positions and to remove invalid segments.
It can be understood that, as shown in FIG. 6, when network loss leaves part of the speech segment information missing or incomplete, the defective part of the voice signal can be repaired using speech segments synthesized from the content information and the frequency domain characteristic information transmitted through the CS domain, where the defective part may include missing parts and damaged parts of the voice signal.
It can be understood that the invention can flexibly adjust its strategy according to network signal quality: when the VONR network signal is good, no repair is needed; when the network signal degrades or is poor, repair starts automatically, so that the user does not perceive the effect of the degraded network signal on call audio quality. The quality of the network signal can be judged by a preset condition. For example, when a network parameter of the VONR is greater than a preset threshold, the network signal is considered poor and the defective part of the voice signal needs to be repaired. The case where the network parameter of the VONR is greater than the preset threshold may include: the packet loss rate of the network is greater than a preset percentage (for example, 10%), or the delay of the network has increased by more than a preset delay (for example, 20 ms). If either condition is met, the network signal is considered poor and the defective part of the voice signal needs to be repaired.
It can be understood that the second terminal receives, through the CS domain, the characteristic information sent by the first terminal, the characteristic information having been extracted from the voice signal of the first terminal. At the same time, the second terminal receives the voice signal transmitted by the first terminal through the PS domain of the VONR. When the network parameter of the VONR is greater than the preset threshold, the second terminal repairs the voice signal received through the PS domain according to the characteristic information received through the CS domain and outputs the repaired voice signal. On this basis, the characteristic information extracted by the first terminal is transmitted through the CS domain while the original voice signal is transmitted through the 5G VONR. When the network signal is poor, the second terminal compares the characteristic information received through the CS domain with the voice signal received through the PS domain and performs missing-part repair and distortion repair on the voice signal, so that the voice information of the transmitting end is restored without loss of audio quality and without excessive delay. The invention is therefore flexible: when the VONR network signal is good, no repair is needed; when the network signal degrades, repair starts automatically, so that the user does not perceive the effect of the degraded 5G network signal on call voice quality.
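The reordering and removal of invalid segments described with reference to FIG. 5 may be sketched, for example, as follows. This Python sketch rests on assumptions not stated in the description: each PS-domain segment is assumed to carry an integer time index that can be matched against the time domain information received through the CS domain, and the function name reorder_segments is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def reorder_segments(
    received: Iterable[Tuple[int, bytes]],
    valid_indices: Iterable[int],
) -> List[Tuple[int, bytes]]:
    """Restore the original order of speech segments and drop invalid ones.

    `received` lists (time index, payload) pairs in arrival order from the PS
    domain; `valid_indices` are the time indices announced over the CS domain.
    A segment whose index was never announced, or that repeats an index
    already seen, is treated as invalid and removed.
    """
    valid = set(valid_indices)
    kept: Dict[int, bytes] = {}
    for index, payload in received:
        if index in valid and index not in kept:
            kept[index] = payload
    return [(index, kept[index]) for index in sorted(kept)]
```

Segments that were announced over the CS domain but never arrived over the PS domain are simply absent from the result; filling them in is a separate repair step.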
As shown in fig. 7, step S403 may include, but is not limited to, the following sub-steps:
S701, comparing the voice signal received through the packet switched domain against the content information, the time domain information and the frequency domain characteristic information of the voice signal received through the circuit switched domain to determine the defective part of the voice signal received through the packet switched domain;
S702, repairing the defective part.
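Sub-steps S701 and S702 may be illustrated by the following Python sketch. It is a sketch under assumptions, not the disclosed algorithm: the functions find_defects and repair_defects are hypothetical, the damage check is_damaged and the synthesizer synthesize are supplied by the caller, and how a segment is actually compared with its features or synthesized from them is left open.

```python
from typing import Any, Callable, Dict, List


def find_defects(
    features: Dict[int, Any],
    ps_segments: Dict[int, bytes],
    is_damaged: Callable[[bytes, Any], bool],
) -> List[int]:
    """S701: list the time indices whose PS-domain segment is missing or damaged."""
    defects = []
    for index in sorted(features):
        segment = ps_segments.get(index)
        if segment is None or is_damaged(segment, features[index]):
            defects.append(index)
    return defects


def repair_defects(
    features: Dict[int, Any],
    ps_segments: Dict[int, bytes],
    defects: List[int],
    synthesize: Callable[[Any], bytes],
) -> List[bytes]:
    """S702: replace each defective segment with one synthesized from its features.

    `defects` is expected to come from find_defects, so every index absent
    from `ps_segments` is covered. The result is ordered by time index.
    """
    repaired = dict(ps_segments)
    for index in defects:
        repaired[index] = synthesize(features[index])
    return [repaired[index] for index in sorted(features)]
```

Keeping detection and repair as separate functions mirrors the two sub-steps and lets the intact segments pass through unchanged.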
It can be understood that the invention can judge whether to start the repair strategy according to the network packet loss condition, the network packet loss rate can be read in real time in the log of the terminal, the network delay condition can also be read in real time through the network downlink information, and the defect part of the voice signal is repaired when the network packet loss and the network delay reach a certain degree. Because any sentence of voice can be extracted into three parts of content information, time domain information and frequency domain characteristic information, and only the three parts of information are needed, one sentence of voice can be completely restored with fidelity. The content information of the voice, the time domain information of the voice and the binary digital information converted from the frequency domain characteristic information of the voice are transmitted through the CS domain, and meanwhile, the original very clear voice analog signal with high sampling rate is transmitted through the 5G VONR. The intelligent terminal of the opposite user compares the received voice information of the CS domain with the received voice information of the PS domain, and further carries out missing repair and distortion repair on the received voice information of the PS domain, so that all voice information of a transmitting end is completely restored without reducing any tone quality and overlong delay.
In summary, when the VoNR voice call starts, the present invention begins extracting information from the original voice signal, decomposes it into three parts, namely content information, time domain information and frequency domain characteristic information, and converts the three parts into digital signals. While the original voice information is transmitted through the PS domain of VoNR, the extracted information is continuously sent to the peer mobile phone through the CS domain. The condition of the VoNR network is then judged. If the packet loss rate is greater than a preset percentage, or the network delay has increased by more than a preset delay, the peer terminal repairs the original voice signal once it has received both the original voice signal and the extracted information; because the two were coded in one-to-one correspondence beforehand, no information is lost, the timbre of the original voice is restored, and the VoNR user experience is not degraded. If the VoNR network is good, no repair is initiated. The invention is therefore highly flexible and adaptive, and the strategy can be adjusted according to the network quality: transmitting the content information keeps the voice free of noticeable loss, interruption, jitter and the like; transmitting the frequency domain characteristic information reduces the distortion of the voice; and when the network delay rises sharply, the time domain information can still be delivered, so that the larger delay does not degrade the call experience of either party.
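The start condition described above reduces to a two-part threshold test. The sketch below is only an illustration; the function name `should_repair` and the default thresholds (5% packet loss, 100 ms added delay) are assumptions, since the text speaks only of a "preset percentage" and a "preset delay".

```python
def should_repair(loss_rate, delay_increase_ms,
                  max_loss_rate=0.05, max_delay_increase_ms=100.0):
    """Decide whether the peer terminal should start voice repair.

    loss_rate is a fraction in [0, 1]; delay_increase_ms is the delay added
    on top of the normal network delay. Both default thresholds are
    placeholder values chosen for illustration.
    """
    # Either condition alone is enough to trigger repair.
    return loss_rate > max_loss_rate or delay_increase_ms > max_delay_increase_ms
```

Both comparisons are strict, so a network sitting exactly at a threshold is still treated as good and no repair is started.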
As shown in fig. 8, the embodiment of the invention further provides a voice transmission system.
The voice transmission system comprises an information extraction module, a network condition judging module and a voice repair module. The information extraction module extracts the content information, the time domain information and the frequency domain characteristic information of the speaker's voice; the number of extracted sampling points matches the VoNR voice sampling rate, and the content information, the time domain information and the frequency domain characteristic information of each sampling point are coded in one-to-one correspondence. The network condition judging module judges, according to the packet loss rate and the newly added delay in the network, whether the voice repair module needs to be started. The voice repair module works as follows: after the content information, the time domain information and the frequency domain characteristic information reach the peer terminal, they can be compared with the original voice transmitted over VoNR; because the sampling rates are identical, when the network condition is poor the module performs one-to-one synchronized synthesis repair according to the code numbers. When the VoNR voice call starts, the voice transmission system begins extracting information from the original voice signal, decomposes it into three parts, namely content information, time domain information and frequency domain characteristic information, and converts the three parts into digital signals. While the original voice signal is transmitted through the PS domain of VoNR, the extracted information is continuously sent to the peer mobile phone through the CS domain.
The condition of the VoNR network is then judged. If the packet loss rate is greater than a preset percentage, or the network delay has increased by more than a preset delay, the peer terminal repairs the original voice signal once it has received both the original voice signal and the extracted information; because the two were coded in one-to-one correspondence beforehand, no information is lost, the timbre of the original voice can be restored, and the VoNR user experience is not degraded. If the VoNR network condition is good, the repair system is not started. The whole voice transmission system is adaptive, and the strategy can be adjusted according to the network quality: transmitting the content information keeps the voice free of noticeable loss, interruption, jitter and the like; transmitting the frequency domain characteristic information reduces the distortion of the voice; and when the network delay rises sharply, the time domain information can still be delivered, so that the larger delay does not degrade the call experience of either party.
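One way to picture the one-to-one coding performed by the information extraction module is to number fixed-length frames and attach time and frequency features to each number. The sketch below is a simplified illustration under assumptions: `encode_frames` and `dominant_bin` are hypothetical names, the dominant DFT frequency is only a stand-in for the frequency domain characteristic information, frames rather than individual sampling points are numbered, and content information is omitted because extracting it would require a speech recogniser.

```python
import cmath
import math


def dominant_bin(frame):
    """Index of the strongest non-DC DFT bin (naive O(n^2) DFT for short frames)."""
    n = len(frame)
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    return best_k


def encode_frames(samples, sample_rate, frame_len):
    """Produce one numbered feature record per complete frame.

    The same frame numbering is assumed to be applied to the voice sent over
    the PS domain, so that record i describes exactly frame i.
    """
    records = []
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    for seq, start in enumerate(starts):
        frame = samples[start:start + frame_len]
        records.append({
            "seq": seq,                                   # shared code number
            "start_ms": 1000.0 * start / sample_rate,     # time domain information
            "dominant_hz": dominant_bin(frame) * sample_rate / frame_len,
        })
    return records
```

Because the records and the voice frames share the same numbering and sampling rate, the receiving side can line them up by `seq` without any further synchronisation step.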
As shown in fig. 9, the embodiment of the present invention further provides a terminal, which includes, but is not limited to, a mobile phone.
Specifically, the terminal includes one or more processors and a memory; one processor and one memory are taken as an example in fig. 9. The processor and the memory may be connected by a bus or in another manner; connection by a bus is taken as an example in fig. 9.
The memory, as a non-transitory computer readable storage medium, can be used to store non-transitory software programs and non-transitory computer-executable programs, such as those implementing the voice transmission method in the above-described embodiments of the present invention. The processor implements the voice transmission method in the above-described embodiments of the present invention by running the non-transitory software programs and programs stored in the memory.
The memory may include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application program required for a function, and the data storage area may store data and the like required to perform the voice transmission method in the above-described embodiments of the present invention. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software program and the program required to implement the voice transmission method in the embodiments of the present invention are stored in the memory and, when executed by one or more processors, perform the voice transmission method in the embodiments of the present invention, for example method steps S101 to S104 in fig. 1 and method steps S301 to S302 in fig. 3, or method steps S401 to S404 in fig. 4 and method steps S701 to S702 in fig. 7 described above. A voice signal of the first terminal is acquired, characteristic information of the voice signal is extracted, the characteristic information is sent to the second terminal through the CS domain, and the voice signal is transmitted to the second terminal through the PS domain of VoNR, so that, when the network parameter of VoNR is greater than the preset threshold, the second terminal performs voice repair on the voice signal received through the PS domain according to the characteristic information received through the CS domain and outputs the repaired voice signal. On this basis, the first terminal extracts the characteristic information from the voice signal and sends the characteristic information through the CS domain, and the original voice signal through 5G VoNR, to the second terminal. When the network signal is poor, the second terminal compares the characteristic information received through the CS domain with the voice signal received through the PS domain and performs missing-segment repair and distortion repair on the voice signal, so that all voice information of the transmitting end is fully restored without any loss of sound quality or excessive delay.
The invention is therefore highly flexible: when the VoNR network signal is good, no repair or restoration is needed; when the network signal fades, restoration starts automatically, and the user does not perceive any effect of the 5G signal degradation on the voice quality of the call.
Furthermore, an embodiment of the present invention provides a computer-readable storage medium storing a computer-executable program that is executed by one or more control processors, for example by the one processor in fig. 9, and that causes the one or more processors to execute the voice transmission method in the embodiments of the present invention described above, for example method steps S101 to S104 in fig. 1 and method steps S301 to S302 in fig. 3, or method steps S401 to S404 in fig. 4 and method steps S701 to S702 in fig. 7 described above. A voice signal of the first terminal is acquired, characteristic information of the voice signal is extracted, the characteristic information is sent to the second terminal through the CS domain, and the voice signal is transmitted to the second terminal through the PS domain of VoNR, so that, when the network parameter of VoNR is greater than a preset threshold, the second terminal performs voice repair on the voice signal received through the PS domain according to the characteristic information received through the CS domain and outputs the repaired voice signal. On this basis, the first terminal extracts the characteristic information from the voice signal and sends the characteristic information through the CS domain, and the original voice signal through 5G VoNR, to the second terminal. When the network signal is poor, the second terminal compares the characteristic information received through the CS domain with the voice signal received through the PS domain and performs missing-segment repair and distortion repair on the voice signal, so that all voice information of the transmitting end is fully restored without any loss of sound quality or excessive delay.
The invention is therefore highly flexible: when the VoNR network signal is good, no repair or restoration is needed; when the network signal fades, restoration starts automatically, and the user does not perceive any effect of the 5G signal degradation on the voice quality of the call.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those skilled in the art, the term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable programs, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable programs, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
While the preferred embodiment of the present invention has been described in detail, the present invention is not limited to the above embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit and scope of the present invention, and these equivalent modifications or substitutions are included in the scope of the present invention as defined in the appended claims.

Claims (10)

1. A voice transmission method applied to a first terminal, the method comprising:
acquiring a voice signal of the first terminal;
extracting characteristic information of the voice signal;
transmitting the characteristic information to a second terminal through a circuit switched domain;
and transmitting the voice signal to the second terminal through a packet switched domain, so that, when a network parameter satisfies a preset condition, the second terminal repairs the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain and outputs the repaired voice signal.
2. The method of claim 1, wherein the extracting characteristic information of the voice signal comprises extracting content information, time domain information, and frequency domain characteristic information of the voice signal.
3. The method of claim 2, wherein the extracting content information, time domain information, and frequency domain characteristic information of the voice signal comprises:
extracting the content information, the time domain information and the frequency domain characteristic information of the voice signal at a sampling rate equal to the sampling rate at which the voice signal is transmitted through the packet switched domain; and coding the content information, the time domain information and the frequency domain characteristic information of the sampled voice signal in one-to-one correspondence.
4. The method of claim 1, wherein the network parameter satisfying the preset condition comprises at least one of:
the packet loss rate of the network being greater than a preset percentage;
the delay increase of the network being greater than a preset delay.
5. The method of claim 1, wherein the transmitting the characteristic information to the second terminal through the circuit switched domain comprises:
converting the characteristic information into digital information;
and transmitting the digital information to the second terminal through the circuit switched domain.
6. A voice transmission method applied to a second terminal, the method comprising:
receiving characteristic information sent by a first terminal through a circuit switched domain, wherein the characteristic information is extracted from a voice signal of the first terminal;
receiving the voice signal transmitted by the first terminal through a packet switched domain;
when a network parameter satisfies a preset condition, repairing the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain;
and outputting the repaired voice signal.
7. The method of claim 6, wherein the characteristic information includes content information, time domain information, and frequency domain characteristic information of the voice signal, and wherein the repairing the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain comprises:
comparing the voice signal received through the packet switched domain against the content information, the time domain information and the frequency domain characteristic information of the voice signal received through the circuit switched domain, to determine a defective part of the voice signal received through the packet switched domain;
repairing the defective part.
8. The method of claim 7, wherein the repairing the voice signal received through the packet switched domain according to the characteristic information received through the circuit switched domain further comprises:
ordering voice segments according to the time domain information received through the circuit switched domain, wherein the voice signal is composed of a plurality of the voice segments.
9. A terminal, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the voice transmission method according to any one of claims 1 to 5 or the voice transmission method according to any one of claims 6 to 8.
10. A computer-readable storage medium storing a computer-executable program for causing a computer to execute the voice transmission method according to any one of claims 1 to 5 or the voice transmission method according to any one of claims 6 to 8.
CN202210364802.5A 2022-04-08 2022-04-08 Voice transmission method, terminal and computer readable storage medium Pending CN116935870A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210364802.5A CN116935870A (en) 2022-04-08 2022-04-08 Voice transmission method, terminal and computer readable storage medium
PCT/CN2023/071976 WO2023193506A1 (en) 2022-04-08 2023-01-12 Voice transmission method, terminal and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN116935870A true CN116935870A (en) 2023-10-24

Family

ID=88243912

Country Status (2)

Country Link
CN (1) CN116935870A (en)
WO (1) WO2023193506A1 (en)

Also Published As

Publication number Publication date
WO2023193506A1 (en) 2023-10-12

Legal Events

Date Code Title Description
PB01 Publication