CN110267309B - Method and equipment for translating call voice in real time - Google Patents

Method and equipment for translating call voice in real time

Publication number
CN110267309B
Authority
CN
China
Prior art keywords
voice data
call
base station
server
call voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910559064.8A
Other languages
Chinese (zh)
Other versions
CN110267309A (en)
Inventor
陈景郁
成荣飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Guangzhou Mobile R&D Center
Samsung Electronics Co Ltd
Original Assignee
Samsung Guangzhou Mobile R&D Center
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Guangzhou Mobile R&D Center, Samsung Electronics Co Ltd
Priority to CN201910559064.8A
Publication of CN110267309A
Application granted
Publication of CN110267309B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/167 Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/02 Traffic management, e.g. flow control or congestion control
    • H04W28/08 Load balancing or load distribution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/16 Central resource management; Negotiation of resources or communication parameters, e.g. negotiating bandwidth or QoS [Quality of Service]
    • H04W28/18 Negotiating wireless communication parameters
    • H04W28/22 Negotiating communication rate

Abstract

A method and device for real-time translation of call voice are provided. The method comprises the following steps: when the electronic terminal needs to translate call voice in real time, transmitting the call voice data received in real time from the base station to a server for translating the call voice at a first encoding rate, and receiving a corresponding translation result from the server; determining whether the time taken to receive the translation result corresponding to the call voice data is greater than a first time threshold; and, when it is determined that the time is greater than the first time threshold, transmitting the call voice data received in real time from the base station to the server at a second encoding rate, and receiving a corresponding translation result from the server. The method and the device can improve the real-time performance of the call voice translation function.

Description

Method and equipment for translating call voice in real time
Technical Field
The present invention relates generally to the field of electronic terminals, and more particularly, to a method and apparatus for real-time translation of call speech.
Background
With the advent of globalization, cross-regional communication has become more frequent. In cross-regional communication, people can use translation software to overcome language barriers. During a voice call, a function for real-time translation of call voice allows the two parties to communicate without barriers even if they speak different languages. However, the translation delay of current call voice translation functions is large, so the real-time performance of the translation is poor and the user experience suffers.
Disclosure of Invention
An object of exemplary embodiments of the present invention is to provide a method and an apparatus for translating call voice in real time, which can solve the problem of poor real-time performance in call voice translation.
According to an exemplary embodiment of the present invention, a method for translating call voice in real time is provided, wherein the method comprises: when the electronic terminal needs to translate call voice in real time, transmitting the call voice data received in real time from the base station to a server for translating the call voice at a first encoding rate, and receiving a corresponding translation result from the server; determining whether the time taken to receive the translation result corresponding to the call voice data is greater than a first time threshold; and, when it is determined that the time is greater than the first time threshold, transmitting the call voice data received in real time from the base station to the server at a second encoding rate, and receiving a corresponding translation result from the server.
Optionally, the method further comprises: outputting the translation result received from the server.
Optionally, the second encoding rate is lower than the first encoding rate.
Optionally, the method further comprises: after transmitting the call voice data to the server at the second encoding rate, when it is determined that the time taken to receive a translation result corresponding to the call voice data is less than a second time threshold, transmitting the call voice data received in real time from the base station to the server at the first encoding rate, and receiving the corresponding translation result from the server.
Optionally, the first encoding rate is an encoding rate of call voice data determined by the base station when the current voice call is initiated.
Optionally, the step of transmitting the call voice data received in real time from the base station to the server at the second encoding rate comprises: negotiating with the base station to reduce the encoding rate of the call voice data of the current voice call, and, when the negotiation is completed, receiving call voice data with the second encoding rate from the base station in real time and transmitting the call voice data to the server; or converting the call voice data received in real time from the base station into call voice data having the second encoding rate, and transmitting the converted call voice data to the server.
Optionally, the negotiating with the base station to reduce the encoding rate of the call voice data of the current voice call includes: sending, to the base station, a message requesting that the call voice data of the current voice call be encoded with the coding scheme corresponding to the second encoding rate, and receiving a response message returned by the base station; or sending, to the base station, a message requesting that the call voice data of the current voice call be encoded at an encoding rate lower than the first encoding rate, and receiving a message returned by the base station instructing that the call voice data of the current voice call be encoded with the coding scheme corresponding to the second encoding rate.
Optionally, the step of converting the call voice data received in real time from the base station into call voice data having the second encoding rate includes: decoding the call voice data received from the base station in real time, and encoding the decoded data with the coding scheme corresponding to the second encoding rate to obtain the call voice data with the second encoding rate.
Optionally, the coding scheme corresponding to the first encoding rate is the adaptive multi-rate wideband coding scheme, and the coding scheme corresponding to the second encoding rate is the adaptive multi-rate narrowband coding scheme.
According to another exemplary embodiment of the present invention, there is provided an apparatus for translating call voice in real time, wherein the apparatus includes: a communication unit which, when the electronic terminal needs to translate call voice in real time, transmits the call voice data received in real time from the base station to a server for translating the call voice at a first encoding rate and receives a corresponding translation result from the server; and a determination unit that determines whether the time taken to receive a translation result corresponding to the call voice data is greater than a first time threshold, wherein when the determination unit determines that the time is greater than the first time threshold, the communication unit transmits the call voice data received in real time from the base station to the server at a second encoding rate, and receives the corresponding translation result from the server.
Optionally, the apparatus further comprises: a result output unit which outputs the translation result received from the server.
Optionally, the second encoding rate is lower than the first encoding rate.
Optionally, after the call voice data is transmitted to the server at the second encoding rate, when the determination unit determines that the time taken to receive the translation result corresponding to the call voice data is less than a second time threshold, the communication unit transmits the call voice data received in real time from the base station to the server at the first encoding rate and receives the corresponding translation result from the server.
Optionally, the first encoding rate is an encoding rate of the call voice data determined by the base station when the current voice call is initiated.
Optionally, the communication unit negotiates with the base station to reduce the encoding rate of the call voice data of the current voice call; when the negotiation is completed, receiving call voice data with a second encoding rate from the base station in real time and transmitting the call voice data to the server; alternatively, the communication unit converts call voice data received in real time from the base station into call voice data having the second encoding rate, and transmits the converted call voice data to the server.
Optionally, the communication unit sends a message requesting to encode the call voice data of the current voice call according to the encoding mode corresponding to the second encoding rate to the base station, and receives a response message returned by the base station; alternatively, the communication unit transmits a message requesting encoding of call voice data of the current voice call at an encoding rate lower than the first encoding rate to the base station, and receives a message returned from the base station instructing encoding of call voice data of the current voice call in an encoding manner corresponding to the second encoding rate.
Optionally, the communication unit decodes the call voice data received in real time from the base station, and encodes the decoded data according to an encoding method corresponding to the second encoding rate to obtain the call voice data with the second encoding rate.
Optionally, the coding scheme corresponding to the first encoding rate is the adaptive multi-rate wideband coding scheme, and the coding scheme corresponding to the second encoding rate is the adaptive multi-rate narrowband coding scheme.
According to another exemplary embodiment of the present invention, a computer-readable storage medium is provided, in which a computer program is stored, which, when being executed by a processor, implements the method for real-time translation of call speech as described above.
According to another exemplary embodiment of the present invention, there is provided an electronic terminal, wherein the electronic terminal includes: a processor; a memory storing a computer program which, when executed by the processor, implements the method of real-time translation of call speech as described above.
According to the method and the device for translating call voice in real time, when a large translation delay is detected, the encoding rate of the call voice data sent to the translation server is reduced, which reduces both the network transmission load and the amount of data the translation server must process. This shortens the time needed to obtain the translation result of the call voice, improves the real-time performance of the call voice translation function, and improves the user experience.
Additional aspects and/or advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.
Drawings
The above and other objects and features of exemplary embodiments of the present invention will become more apparent from the following description taken in conjunction with the accompanying drawings which illustrate exemplary embodiments, wherein:
Fig. 1 illustrates a flowchart of a method of translating call voice in real time according to an exemplary embodiment of the present invention;
Fig. 2 illustrates a block diagram of an apparatus for real-time translation of call voice according to an exemplary embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
Fig. 1 illustrates a flowchart of a method of translating call speech in real time according to an exemplary embodiment of the present invention. The method may be implemented by a computer program. For example, the method may be performed by a call voice translation application installed in the electronic terminal or by a function program implemented in an operating system of the electronic terminal. As an example, the electronic terminal may be a mobile communication terminal (e.g., a smartphone), a smart wearable device (e.g., a smart watch), or the like capable of voice call.
Referring to fig. 1, when the electronic terminal needs to translate call voice in real time, the electronic terminal transmits the call voice data received in real time from the base station to a server for translating call voice at a first encoding rate and receives a corresponding translation result from the server at step S10.
As an example, when the electronic terminal is in a voice call state and a call voice real-time translation function is turned on, it may be determined that the electronic terminal needs to translate the call voice in real time.
As an example, the first encoding rate may be an encoding rate of call voice data determined by the base station when the current voice call is initiated. For example, the first encoding rate may be the encoding rate of the call voice data that the electronic terminal negotiates with the base station when the current voice call is initiated.
In step S20, it is determined whether the time taken to receive the translation result corresponding to the call voice data is greater than a first time threshold.
Specifically, the time taken for receiving the translation result corresponding to the call voice data is: time taken from the start of transmission of the call voice data to the server to the reception of a translation result corresponding to the call voice data from the server.
As an example, it may be periodically determined whether a time taken to receive a translation result corresponding to call voice data transmitted to the server at a first encoding rate is greater than a first time threshold.
When it is determined at step S20 that the time is greater than the first time threshold, step S30 is performed: the call voice data received in real time from the base station is transmitted to the server at the second encoding rate, and the corresponding translation result is received from the server. Specifically, once it is determined that the time is greater than the first time threshold, the call voice data subsequently received in real time from the base station is transmitted to the server at the second encoding rate.
When it is determined at step S20 that the time is not greater than the first time threshold, the method returns to step S10.
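Steps S10 to S30 can be sketched as a simple loop. This is a minimal illustration only, not the patented implementation: the names run_translation, send_and_receive, FIRST_RATE and SECOND_RATE are hypothetical, the server round trip is abstracted into a caller-supplied function, and the check is made after every chunk rather than periodically.

```python
import time
from typing import Callable, Iterable, List, Tuple

FIRST_RATE = "first"    # hypothetical label for the first encoding rate
SECOND_RATE = "second"  # hypothetical label for the (lower) second encoding rate


def run_translation(
    chunks: Iterable[bytes],
    send_and_receive: Callable[[bytes, str], str],
    first_time_threshold_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[str, str]]:
    """Return (rate_used, translation_result) for each chunk of call voice data."""
    rate = FIRST_RATE
    results: List[Tuple[str, str]] = []
    for chunk in chunks:
        start = clock()                          # transmission to the server starts
        result = send_and_receive(chunk, rate)   # S10 (first rate) or S30 (second rate)
        elapsed = clock() - start                # time taken to receive the result
        results.append((rate, result))
        # S20: compare the elapsed time with the first time threshold.
        if rate == FIRST_RATE and elapsed > first_time_threshold_s:
            rate = SECOND_RATE                   # later chunks are sent at the second rate
    return results
```

The clock parameter is injectable so that the loop can be exercised with simulated delays instead of a real server.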
As an example, the second encoding rate may be lower than the first encoding rate. As an example, the encoding rate of the call voice data transmitted to the server may be reduced by switching the coding scheme of the call voice data, provided that the quality of the call voice is not significantly affected (for example, provided that the signal-to-noise ratio of the call voice does not fall below a certain threshold).
As an example, the encoding rate of the call voice data for the current voice call may be negotiated down with the base station; when the negotiation is completed, the call voice data having the second encoding rate is received in real time from the base station and transmitted to the server.
As an example, a message for requesting to encode call voice data of the current voice call in an encoding manner corresponding to the second encoding rate may be sent to the base station, and a response message returned by the base station may be received; alternatively, a message requesting encoding of call voice data of the current voice call at an encoding rate lower than the first encoding rate may be transmitted to the base station, and a message returned by the base station to instruct encoding of call voice data of the current voice call in an encoding manner corresponding to the second encoding rate may be received.
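The two negotiation variants can be illustrated with a toy message exchange. The text does not specify a message format or signaling procedure, so everything here (RateRequest, RateResponse, FakeBaseStation, and the bit-rate values) is a hypothetical stand-in used only to show the difference between requesting a specific coding scheme and requesting "any lower rate".

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RateRequest:
    # Variant 1: name the coding scheme for the second encoding rate.
    target_scheme: Optional[str] = None
    # Variant 2: only ask for a rate lower than the current (first) rate.
    below_rate_bps: Optional[int] = None


@dataclass(frozen=True)
class RateResponse:
    accepted: bool
    scheme: Optional[str] = None
    rate_bps: Optional[int] = None


class FakeBaseStation:
    """Hypothetical base station that knows a table of scheme -> bit rate."""

    def __init__(self, schemes: Dict[str, int]) -> None:
        self.schemes = schemes

    def handle(self, request: RateRequest) -> RateResponse:
        if request.target_scheme is not None:
            # Variant 1: confirm the requested scheme if it is supported.
            rate = self.schemes.get(request.target_scheme)
            if rate is None:
                return RateResponse(accepted=False)
            return RateResponse(True, request.target_scheme, rate)
        if request.below_rate_bps is None:
            return RateResponse(accepted=False)
        # Variant 2: the base station picks the scheme and tells the terminal.
        lower = {s: r for s, r in self.schemes.items() if r < request.below_rate_bps}
        if not lower:
            return RateResponse(accepted=False)
        best = max(lower, key=lambda s: lower[s])  # highest rate that is still lower
        return RateResponse(True, best, lower[best])
```

In the first variant the terminal chooses the scheme and the base station only acknowledges; in the second the base station makes the choice, which is why its reply carries the scheme to use.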
As another example, call voice data received in real time from the base station may be converted into call voice data having a second encoding rate; and transmitting the converted call voice data to the server.
As an example, the call voice data with the second encoding rate may be obtained by decoding call voice data received in real time from the base station and encoding the decoded data in an encoding manner corresponding to the second encoding rate. For example, it is possible to convert call voice data having a first encoding rate received in real time from a base station into call voice data having a second encoding rate and transmit the converted call voice data to the server.
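The decode-then-re-encode conversion can be sketched with two toy codecs. These are assumptions made for illustration, not real AMR implementations: ToyWideband stores every sample as 16-bit PCM, and ToyNarrowband keeps every second sample (naive decimation without an anti-aliasing filter). Only the transcode function reflects the step described above.

```python
import struct
from typing import List, Protocol


class Codec(Protocol):
    def encode(self, samples: List[int]) -> bytes: ...
    def decode(self, data: bytes) -> List[int]: ...


class ToyWideband:
    """Stand-in for the first-rate coding scheme: every sample as 16-bit PCM."""

    def encode(self, samples: List[int]) -> bytes:
        return struct.pack(f"<{len(samples)}h", *samples)

    def decode(self, data: bytes) -> List[int]:
        return list(struct.unpack(f"<{len(data) // 2}h", data))


class ToyNarrowband:
    """Stand-in for the second-rate coding scheme: keeps every second sample."""

    def encode(self, samples: List[int]) -> bytes:
        kept = samples[::2]  # naive decimation, for illustration only
        return struct.pack(f"<{len(kept)}h", *kept)

    def decode(self, data: bytes) -> List[int]:
        return list(struct.unpack(f"<{len(data) // 2}h", data))


def transcode(data: bytes, source: Codec, target: Codec) -> bytes:
    """Decode data received at the first rate and re-encode it at the second rate."""
    samples = source.decode(data)
    return target.encode(samples)
```

Because the conversion happens on the terminal, this path needs no cooperation from the base station, at the cost of extra decoding and encoding work on the device.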
As an example, the coding scheme corresponding to the first encoding rate may be the Adaptive Multi-Rate Wideband (AMR-WB) coding scheme, and the coding scheme corresponding to the second encoding rate may be the Adaptive Multi-Rate Narrowband (AMR-NB) coding scheme. It should be understood that the coding scheme corresponding to the first encoding rate and the coding scheme corresponding to the second encoding rate may be other suitable coding schemes, and the present invention is not limited thereto.
For example, for the AMR-WB coding scheme, the voice bandwidth range is 50-7000 Hz, the sampling rate is 16 kHz, and the bit depth is 16 bits; for the AMR-NB coding scheme, the voice bandwidth range is 300-3400 Hz, the sampling rate is 8 kHz, and the bit depth is 16 bits. Taking 5 seconds of call voice as an example, the data amount after coding with the AMR-WB coding scheme is: 16000 (sampling rate) × 16 (bit depth) × 5 (time) / 8 (bits per byte) = 160,000 bytes (about 160 KB); the data amount after coding with the AMR-NB coding scheme is: 8000 (sampling rate) × 16 (bit depth) × 5 (time) / 8 (bits per byte) = 80,000 bytes (about 80 KB). It can be seen that the amount of voice data can be reduced by half by switching the coding scheme from AMR-WB to AMR-NB.
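The data-amount formula above (sampling rate × bit depth × time / 8) can be evaluated directly. The helper name data_amount_bytes is hypothetical; the sketch uses the sampling rates from the calculation (16 kHz for AMR-WB, 8 kHz for AMR-NB) and, like the formula, counts sample bits rather than the size of an actual compressed AMR bitstream.

```python
def data_amount_bytes(sample_rate_hz: int, bit_depth: int, seconds: int) -> int:
    """Bytes needed for `seconds` of audio at the given sampling rate and bit depth."""
    return sample_rate_hz * bit_depth * seconds // 8  # 8 bits per byte


wideband = data_amount_bytes(16_000, 16, 5)    # AMR-WB sampling parameters
narrowband = data_amount_bytes(8_000, 16, 5)   # AMR-NB sampling parameters

print(wideband, narrowband, narrowband / wideband)  # prints "160000 80000 0.5"
```

The ratio of 0.5 depends only on the two sampling rates, so the halving holds for any call duration.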
In the prior art, when performing real-time translation of call voice, the extracted downlink call voice data is usually packaged directly and sent to a translation server for processing; when the network transmission quality is poor or the translation server is heavily loaded, a relatively large translation delay occurs. According to the exemplary embodiment of the invention, when a large translation delay is detected, the data volume of the call voice data is reduced by lowering the encoding rate of the call voice data sent to the translation server. This reduces the network transmission load and the amount of data the translation server must process, thereby reducing the delay of real-time call voice translation and improving the user experience.
As an example, the method for translating call voice in real time according to an exemplary embodiment of the present invention may further include: outputting the translation result received from the server. As an example, the translation result may be in voice form and/or text form, and it may be output in any of a variety of suitable ways, for example as voice and/or as text.
As an example, the method for translating call voice in real time according to an exemplary embodiment of the present invention may further include: after transmitting the call voice data to the server at the second encoding rate, when it is determined that the time taken to receive a translation result corresponding to the call voice data is less than a second time threshold, transmitting the call voice data received in real time from the base station to the server at the first encoding rate, and receiving the corresponding translation result from the server. As an example, the encoding rate of the call voice data of the current voice call may be negotiated up with the base station; when the negotiation is completed, the call voice data having the first encoding rate is received in real time from the base station and transmitted to the server.
In other words, after the call voice data has been transmitted to the server at the second encoding rate, if it is determined that the translation delay has become small, transmission may revert to sending the call voice data received in real time from the base station to the server at the higher first encoding rate.
As an example, the second time threshold may be less than the first time threshold.
As an example, it may be periodically determined whether a time taken to receive a translation result corresponding to call voice data transmitted to the server at a second encoding rate is less than a second time threshold.
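Using two thresholds, with the second smaller than the first, gives the rate switch a hysteresis band. A minimal sketch of that decision rule follows; next_rate and the "first"/"second" labels are hypothetical names, and the threshold values in the usage example are arbitrary.

```python
def next_rate(
    current_rate: str,
    elapsed_s: float,
    first_threshold_s: float,
    second_threshold_s: float,
) -> str:
    """Choose the encoding rate for subsequent call voice data.

    current_rate is "first" or "second"; elapsed_s is the time taken to
    receive the most recently measured translation result.
    """
    if current_rate == "first" and elapsed_s > first_threshold_s:
        return "second"   # delay is large: drop to the lower rate
    if current_rate == "second" and elapsed_s < second_threshold_s:
        return "first"    # delay is small again: restore the higher rate
    return current_rate   # otherwise keep the current rate
```

With thresholds of 2.0 s and 1.0 s, a measured delay of 1.5 s changes nothing in either state, so the rate does not flap when the delay hovers near a single cut-off.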
Fig. 2 illustrates a block diagram of an apparatus for real-time translation of call voice according to an exemplary embodiment of the present invention.
As shown in fig. 2, an apparatus for translating call voice in real time according to an exemplary embodiment of the present invention includes: a communication unit 10 and a determination unit 20.
Specifically, the communication unit 10 is configured to, when the electronic terminal needs to translate call voice in real time, transmit the call voice data received in real time from the base station to a server for translating the call voice at the first encoding rate, and receive the corresponding translation result from the server.
The determination unit 20 is configured to determine whether the time taken to receive a translation result corresponding to the call voice data is greater than a first time threshold, wherein when the determination unit 20 determines that the time is greater than the first time threshold, the communication unit 10 transmits the call voice data received in real time from the base station to the server at a second encoding rate, and receives the corresponding translation result from the server.
As an example, the second encoding rate may be lower than the first encoding rate.
As an example, the first encoding rate may be an encoding rate of call voice data determined by the base station when the current voice call is initiated.
As an example, the communication unit 10 may negotiate with the base station to reduce the encoding rate of the call voice data of the current voice call; when the negotiation is completed, the call voice data having the second encoding rate is received in real time from the base station and transmitted to the server.
As an example, the communication unit 10 may transmit a message for requesting encoding of call voice data of the current voice call in an encoding manner corresponding to the second encoding rate to the base station, and receive a response message returned by the base station; alternatively, the communication unit 10 may transmit a message requesting encoding of call voice data of the current voice call at a coding rate lower than the first coding rate to the base station, and receive a message returned by the base station to instruct encoding of call voice data of the current voice call in a coding manner corresponding to the second coding rate.
As another example, the communication unit 10 may convert call voice data received in real time from a base station into call voice data having the second encoding rate, and transmit the converted call voice data to the server.
As an example, the communication unit 10 may decode call voice data received in real time from the base station and encode the decoded data in an encoding manner corresponding to the second encoding rate to obtain the call voice data having the second encoding rate.
As an example, the coding scheme corresponding to the first encoding rate may be the adaptive multi-rate wideband coding scheme, and the coding scheme corresponding to the second encoding rate may be the adaptive multi-rate narrowband coding scheme.
As an example, after the call voice data is transmitted to the server at the second encoding rate, when the determination unit 20 determines that the time taken to receive the translation result corresponding to the call voice data is less than the second time threshold, the communication unit 10 may transmit the call voice data received in real time from the base station to the server at the first encoding rate and receive the corresponding translation result from the server.
As an example, the apparatus for translating call voice in real time according to an exemplary embodiment of the present invention may further include: a result output unit (not shown) for outputting the translation result received from the server.
It should be understood that specific implementations of the apparatus for translating call voice in real time according to the exemplary embodiment of the present invention may be realized with reference to the related implementations described in conjunction with Fig. 1, and details are not repeated here.
Further, it should be understood that each unit in the apparatus for real-time translation of call voice according to an exemplary embodiment of the present invention may be implemented as a hardware component and/or a software component. The individual units may be implemented, for example, using Field Programmable Gate Arrays (FPGAs) or Application Specific Integrated Circuits (ASICs), depending on the processing performed by the individual units as defined by the skilled person.
A computer-readable storage medium according to an exemplary embodiment of the present invention stores a computer program that, when executed by a processor, causes the processor to perform the method of real-time translation of call voice of the above-described exemplary embodiment. The computer readable storage medium is any data storage device that can store data which can be read by a computer system. Examples of computer-readable storage media include: read-only memory, random access memory, compact disc read-only memory, magnetic tape, floppy disk, optical data storage device, and carrier wave (such as data transmission through the internet via a wired or wireless transmission path).
An electronic terminal according to an exemplary embodiment of the present invention includes: a processor (not shown) and a memory (not shown), wherein the memory stores a computer program which, when executed by the processor, implements the method of real-time translation of call speech as in the above exemplary embodiments.
Although a few exemplary embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.

Claims (10)

1. A method for real-time translation of call speech, wherein the method comprises:
when the electronic terminal needs to translate call voice in real time, transmitting the call voice data received in real time from the base station to a server for translating the call voice at a first encoding rate, and receiving a corresponding translation result from the server;
determining whether the time taken to receive the translation result corresponding to the call voice data is greater than a first time threshold;
transmitting the call voice data received in real time from the base station to the server at a second encoding rate and receiving a corresponding translation result from the server when it is determined that the time is greater than the first time threshold,
wherein the time taken to receive the translation result corresponding to the call voice data is: the time taken from the start of transmission of the call voice data to the server to the reception of the translation result corresponding to the call voice data from the server.
2. The method of claim 1, wherein the method further comprises: outputting the translation result received from the server;
and/or the second encoding rate is lower than the first encoding rate;
and/or the method further comprises: after transmitting the call voice data to the server at the second encoding rate, when it is determined that the time taken to receive the translation result corresponding to the call voice data is less than a second time threshold, transmitting the call voice data received in real time from the base station to the server at the first encoding rate, and receiving a corresponding translation result from the server;
and/or the first encoding rate is the encoding rate of the call voice data determined by the base station when the current voice call is initiated;
and/or the step of transmitting the call voice data received in real time from the base station to the server at the second encoding rate comprises:
negotiating with the base station to reduce the encoding rate of the call voice data of the current voice call, and, when the negotiation is completed, receiving call voice data having the second encoding rate from the base station in real time and transmitting the call voice data to the server;
or converting the call voice data received in real time from the base station into call voice data having the second encoding rate, and transmitting the converted call voice data to the server.
3. The method of claim 2, wherein negotiating with the base station to reduce the encoding rate of the call voice data of the current voice call comprises: sending, to the base station, a message requesting that the call voice data of the current voice call be encoded using the encoding scheme corresponding to the second encoding rate, and receiving a response message returned by the base station; or sending, to the base station, a message requesting that the call voice data of the current voice call be encoded at an encoding rate lower than the first encoding rate, and receiving a message returned by the base station instructing that the call voice data of the current voice call be encoded using the encoding scheme corresponding to the second encoding rate;
and/or the step of converting the call voice data received in real time from the base station into call voice data having the second encoding rate comprises: decoding the call voice data received in real time from the base station, and encoding the decoded data using the encoding scheme corresponding to the second encoding rate to obtain the call voice data having the second encoding rate.
4. The method of claim 3, wherein the encoding scheme corresponding to the first encoding rate is an adaptive multi-rate wideband (AMR-WB) encoding scheme, and the encoding scheme corresponding to the second encoding rate is an adaptive multi-rate narrowband (AMR-NB) encoding scheme.
5. An apparatus for real-time translation of call voice, wherein the apparatus comprises:
a communication unit which, when an electronic terminal needs to translate call voice in real time, transmits, at a first encoding rate, call voice data received in real time from a base station to a server for translating the call voice, and receives a corresponding translation result from the server; and
a determination unit which determines whether the time taken to receive the translation result corresponding to the call voice data is greater than a first time threshold,
wherein, when the determination unit determines that the time is greater than the first time threshold, the communication unit transmits the call voice data received in real time from the base station to the server at a second encoding rate and receives a corresponding translation result from the server,
wherein the time taken to receive the translation result corresponding to the call voice data is the time from the start of transmitting the call voice data to the server until the translation result corresponding to the call voice data is received from the server.
6. The apparatus of claim 5, wherein the apparatus further comprises: a result output unit that outputs the translation result received from the server;
and/or the second encoding rate is lower than the first encoding rate;
and/or, after the call voice data has been transmitted to the server at the second encoding rate, when the determination unit determines that the time taken to receive the translation result corresponding to the call voice data is less than a second time threshold, the communication unit transmits the call voice data received in real time from the base station to the server at the first encoding rate and receives a corresponding translation result from the server;
and/or the first encoding rate is the encoding rate of the call voice data determined by the base station when the current voice call is initiated;
and/or the communication unit negotiates with the base station to reduce the encoding rate of the call voice data of the current voice call, and, when the negotiation is completed, receives call voice data having the second encoding rate from the base station in real time and transmits the call voice data to the server; or the communication unit converts the call voice data received in real time from the base station into call voice data having the second encoding rate, and transmits the converted call voice data to the server.
7. The apparatus of claim 6, wherein the communication unit sends, to the base station, a message requesting that the call voice data of the current voice call be encoded using the encoding scheme corresponding to the second encoding rate, and receives a response message returned by the base station; or the communication unit sends, to the base station, a message requesting that the call voice data of the current voice call be encoded at an encoding rate lower than the first encoding rate, and receives a message returned by the base station instructing that the call voice data of the current voice call be encoded using the encoding scheme corresponding to the second encoding rate;
and/or the communication unit decodes the call voice data received in real time from the base station and encodes the decoded data using the encoding scheme corresponding to the second encoding rate to obtain the call voice data having the second encoding rate.
8. The apparatus of claim 7, wherein the encoding scheme corresponding to the first encoding rate is an adaptive multi-rate wideband (AMR-WB) encoding scheme, and the encoding scheme corresponding to the second encoding rate is an adaptive multi-rate narrowband (AMR-NB) encoding scheme.
9. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method for real-time translation of call voice according to any one of claims 1 to 4.
10. An electronic terminal, wherein the electronic terminal comprises:
a processor; and
a memory storing a computer program which, when executed by the processor, implements the method for real-time translation of call voice according to any one of claims 1 to 4.
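
The rate-switching behaviour recited in claims 1 and 2 can be illustrated with a minimal sketch. It is not the patented implementation: the class and method names (`RateController`, `update`), the use of seconds as the time unit, and the threshold values in the usage note are all assumptions made for illustration. The sketch only models the decision logic, that is, measuring the time from the start of transmission to receipt of the translation result, dropping to the second encoding rate when that time exceeds the first time threshold, and returning to the first encoding rate when it falls below the second time threshold.

```python
from dataclasses import dataclass
from enum import Enum


class Rate(Enum):
    # FIRST: rate chosen by the base station at call setup (claim 4 gives AMR-WB as an example).
    FIRST = "first"
    # SECOND: fallback rate (claim 4 gives AMR-NB as an example).
    SECOND = "second"


@dataclass
class RateController:
    """Hypothetical helper that tracks which encoding rate to use next."""

    first_threshold_s: float   # switch to SECOND when the measured time exceeds this
    second_threshold_s: float  # switch back to FIRST when the measured time is below this
    rate: Rate = Rate.FIRST

    def update(self, sent_at_s: float, received_at_s: float) -> Rate:
        # Time from the start of transmission to receipt of the translation result.
        elapsed = received_at_s - sent_at_s
        if self.rate is Rate.FIRST and elapsed > self.first_threshold_s:
            self.rate = Rate.SECOND
        elif self.rate is Rate.SECOND and elapsed < self.second_threshold_s:
            self.rate = Rate.FIRST
        return self.rate
```

For example, with an assumed first time threshold of 2.0 s and second time threshold of 1.0 s, a measured time of 2.5 s moves the controller to the second rate, a later measurement of 1.5 s leaves it there, and a measurement of 0.5 s returns it to the first rate. Choosing the second threshold below the first avoids rapid toggling between rates; the claims themselves do not fix the relationship between the two thresholds.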
CN201910559064.8A 2019-06-26 2019-06-26 Method and equipment for translating call voice in real time Active CN110267309B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910559064.8A CN110267309B (en) 2019-06-26 2019-06-26 Method and equipment for translating call voice in real time

Publications (2)

Publication Number Publication Date
CN110267309A CN110267309A (en) 2019-09-20
CN110267309B true CN110267309B (en) 2022-09-23

Family

ID=67921611

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910559064.8A Active CN110267309B (en) 2019-06-26 2019-06-26 Method and equipment for translating call voice in real time

Country Status (1)

Country Link
CN (1) CN110267309B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325039B (en) * 2020-01-21 2020-12-01 陈刚 Language translation method, system, program and handheld terminal based on real-time call

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105187167A (en) * 2015-09-28 2015-12-23 广州市百果园网络科技有限公司 Voice data communication method and device
CN106033982A (en) * 2015-03-13 2016-10-19 中国移动通信集团公司 Method and device for realizing ultra wide band voice intercommunication and a terminal
CN106294327A (en) * 2015-05-12 2017-01-04 中国移动通信集团公司 The method of real time translation, device and network element device in a kind of mobile communications network
CN107015970A (en) * 2017-01-17 2017-08-04 881飞号通讯有限公司 A kind of method that bilingual intertranslation is realized in network voice communication
CN107343113A (en) * 2017-06-26 2017-11-10 深圳市沃特沃德股份有限公司 Audio communication method and device
CN109582976A (en) * 2018-10-15 2019-04-05 华为技术有限公司 A kind of interpretation method and electronic equipment based on voice communication

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090006076A1 (en) * 2007-06-27 2009-01-01 Jindal Dinesh K Language translation during a voice call
TW201608414A (en) * 2014-08-18 2016-03-01 Richplay Information Co Ltd Speech assistance system in combination with mobile device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
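Research on strategies for improving VoLTE call quality; Liu Hua et al.; Guangxi Communication Technology (《广西通信技术》); 2017-12-15 (No. 04); pp. 44-52 *
A brief discussion of the development and application of speech compression coding; Su Tao; Sci-Tech Information Development & Economy (《科技情报开发与经济》); 2006-11-30 (No. 22); pp. 157-158 *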

Also Published As

Publication number Publication date
CN110267309A (en) 2019-09-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant