WO2020010963A1 - Voice handover method, apparatus, terminal, and computer-readable storage medium - Google Patents
Voice handover method, apparatus, terminal, and computer-readable storage medium
- Publication number
- WO2020010963A1 (application PCT/CN2019/089623)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pcm
- voice data
- speech
- switching
- audio device
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72448—User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/725—Cordless telephones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/39—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech synthesis
Definitions
- This application relates to, but is not limited to, the field of mobile terminals.
- the user will switch from the hand-held state to the hands-free state during the call out of necessity.
- the user cannot normally listen to the downlink pulse code modulation (PCM) voice data.
- the specific switching time will vary depending on the operating speed and operating habits of different users, and is generally greater than 2 seconds.
- downlink PCM voice data may fail to be obtained, or may be lost, during the switching process. The longer the switching time, the more downlink PCM voice data is lost, which impairs the user's reception of the other party's voice information and degrades the user experience.
- embodiments of the present application provide a voice switching method, device, terminal, and computer-readable storage medium for a voice call.
- a voice switching method provided for a mobile terminal includes:
- a voice switching device which is applied to the voice switching method.
- the voice switching device includes: an acquisition module, a cache module, and a processing module, where:
- the acquisition module is configured to obtain first PCM voice data received by the receiving end during switching of an audio device, and second PCM voice data of the same duration as the first PCM voice data, received after the audio device is successfully switched;
- the cache module is configured to buffer the first PCM voice data and the PCM voice data produced during approximation speech processing;
- the processing module is configured to perform approximation speech processing on the first PCM voice data and the second PCM voice data, so that the first PCM voice data and the second PCM voice data are played within the switching time.
- a terminal includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the computer program is executed by the processor, the steps of the voice switching method provided in the embodiments of the present application are implemented.
- a computer-readable storage medium stores a program of the voice switching method; when the program is executed by a processor, the steps of the voice switching method provided in the embodiments of the present application are implemented.
- the method, device, terminal, and computer-readable storage medium for a voice call acquire the first PCM voice data received by the receiving end during audio device switching, store it in the buffer, and apply approximation speech processing to it; they then acquire, and apply approximation speech processing to, the second PCM voice data of the same duration as the first PCM voice data, received by the receiving end after the audio device is successfully switched, so that the first PCM voice data and the second PCM voice data are played within the switching time, achieving seamless switching.
- FIG. 1 is a schematic diagram of a hardware structure of a mobile terminal that implements various embodiments of the present application
- FIG. 2 is a structural diagram of a communication network system according to an embodiment of the present application.
- FIG. 3 is a schematic flowchart of a voice switching method for a voice call according to an embodiment of the present application
- FIG. 4 is a schematic diagram of a method for seamlessly switching voices during an audio device switching process by using an approximation method combining buffering and voice processing according to an embodiment of the present application;
- FIG. 5 is a schematic structural diagram of a voice switching device for a voice call according to an embodiment of the present application.
- FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present application.
- FIG. 7 is a schematic flowchart of a voice switching method for a voice call according to an embodiment of the present application.
- the terminal can be implemented in various forms.
- the terminals described in this application may include mobile terminals such as mobile phones, tablets, laptops, palmtop computers, Personal Digital Assistants (PDAs), Portable Media Players (PMPs), navigation devices, wearable devices, smart bracelets, and pedometers, as well as fixed terminals such as digital TVs and desktop computers.
- a mobile terminal will be taken as an example for explanation.
- the configuration according to the embodiment of the present application can also be applied to a fixed type terminal.
- FIG. 1 is a schematic diagram of a hardware structure of a mobile terminal for implementing the embodiments of the present application.
- the mobile terminal 100 may include a radio frequency (RF) unit 101, a WiFi module 102, an audio output unit 103, an audio/video (A/V) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, and a power supply 111.
- the RF unit 101 may be configured to receive and transmit signals while information is sent and received or during a call. Specifically, downlink information from the base station is received and passed to the processor 110 for processing; in addition, uplink data is transmitted to the base station.
- the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 can also communicate with a network and other devices through wireless communication.
- the above wireless communication can use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access 2000 (CDMA2000), Wideband Code Division Multiple Access (WCDMA), Time Division-Synchronous Code Division Multiple Access (TD-SCDMA), Frequency Division Duplexing Long Term Evolution (FDD-LTE), and Time Division Duplexing Long Term Evolution (TDD-LTE).
- WiFi is a short-range wireless transmission technology.
- the mobile terminal can help users send and receive emails, browse web pages, and access streaming media through the WiFi module 102. It provides users with wireless broadband Internet access.
- although FIG. 1 shows the WiFi module 102, it can be understood that it is not an essential component of the mobile terminal and can be omitted as needed without changing the essence of the invention.
- when the mobile terminal 100 is in a call signal receiving mode, a call mode, a recording mode, a voice recognition mode, a broadcast receiving mode, or the like, the audio output unit 103 may convert audio data received by the RF unit 101 or the WiFi module 102, or stored in the memory 109, into an audio signal and output it as sound.
- the audio output unit 103 may also provide audio output (for example, a call signal receiving sound, a message receiving sound, etc.) related to a specific function performed by the mobile terminal 100.
- the audio output unit 103 may include a speaker, a buzzer, and the like.
- the A / V input unit 104 is configured to receive an audio or video signal.
- the A / V input unit 104 may include a graphics processing unit (Graphics Processing Unit, GPU) 1041 and a microphone 1042.
- the graphics processor 1041 processes image data of still images or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode.
- the processed image frames may be displayed on the display unit 106.
- the image frames processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or transmitted via the RF unit 101 or the WiFi module 102.
- the microphone 1042 can receive sound in an operation mode such as a telephone call mode, a recording mode, or a voice recognition mode, and can process such sound into audio data.
- the processed audio (voice) data can be converted into a format that can be transmitted to a mobile communication base station via the RF unit 101 in the case of a telephone call mode and output.
- the microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to eliminate (or suppress) noise or interference generated during the process of receiving and transmitting audio signals.
- the mobile terminal 100 further includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors.
- the light sensor includes an ambient light sensor and a proximity sensor, wherein the ambient light sensor can adjust the brightness of the display panel 1061 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 1061 when the mobile terminal 100 moves to the ear.
- an accelerometer sensor can detect the magnitude of acceleration in various directions (generally three axes), and can detect the magnitude and direction of gravity when it is stationary.
- it can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait orientation, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection); the mobile phone can also be equipped with other sensors such as a fingerprint sensor, pressure sensor, iris sensor, molecular sensor, gyroscope, barometer, hygrometer, thermometer, and infrared sensor, which are not described further here.
- the display unit 106 is configured to display information input by the user or information provided to the user.
- the display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
- the user input unit 107 may be configured to receive inputted numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal.
- the user input unit 107 may include a touch panel 1071 and other input devices 1072.
- the touch panel 1071, also known as a touch screen, can collect the user's touch operations on or near it (such as operations performed on or near the touch panel 1071 with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection device according to a preset program.
- the touch panel 1071 may include two parts, a touch detection device and a touch controller.
- the touch detection device detects the user's touch position and the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 110, and can receive and execute commands sent by the processor 110.
- various types such as resistive, capacitive, infrared, and surface acoustic wave can be used to implement the touch panel 1071.
- the user input unit 107 may also include other input devices 1072.
- the other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like, which are not limited herein.
- the touch panel 1071 may cover the display panel 1061.
- after the touch panel 1071 detects a touch operation on or near it, it transmits the touch operation to the processor 110 to determine the type of the touch event.
- the processor 110 then provides corresponding visual output on the display panel 1061 according to the type of the touch event.
- although the touch panel 1071 and the display panel 1061 are described as two independent components that implement the input and output functions of the mobile terminal, in some embodiments the touch panel 1071 and the display panel 1061 may be integrated to implement these functions, which is not specifically limited here.
- the interface unit 108 functions as an interface through which at least one external device can connect with the mobile terminal 100.
- the external device may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port configured to connect a device with an identification module, an audio input/output (I/O) port, a video I/O port, a headphone port, and more.
- the interface unit 108 may be configured to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the mobile terminal 100, or may be configured to transfer data between the mobile terminal 100 and the external device.
- the memory 109 may be configured to store software programs and various data.
- the memory 109 may mainly include a storage program area and a storage data area, where the storage program area may store an operating system and at least one application required by a function (such as a sound playback function, an image playback function, etc.), and the storage data area may store data (such as audio data, a phone book, etc.) created through use of the mobile phone.
- the memory 109 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
- the processor 110 is a control center of the mobile terminal, and uses various interfaces and lines to connect various parts of the entire mobile terminal.
- the processor 110 runs or executes software programs and/or modules stored in the memory 109 and calls data stored in the memory 109 to perform various functions of the mobile terminal and process data, thereby monitoring the mobile terminal as a whole.
- the processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 110.
- the mobile terminal 100 may further include a power source 111 (such as a battery) for supplying power to various components.
- the power source 111 may be logically connected to the processor 110 through a power management system, so that functions such as charging, discharging, and power consumption management are handled through the power management system.
- the mobile terminal 100 may further include a Bluetooth module and the like, and details are not described herein again.
- FIG. 2 is a structural diagram of a communication network system according to an embodiment of the present application.
- the communication network system is an LTE system of universal mobile communication technology.
- the LTE system includes User Equipment (UE) 201, an Evolved UMTS Terrestrial Radio Access Network (E-UTRAN) 202, an Evolved Packet Core (EPC) 203, and the operator's IP service 204.
- the UE 201 may be the foregoing terminal 100, and details are not described herein again.
- E-UTRAN 202 includes eNodeB 2021 and other eNodeB 2022.
- the eNodeB 2021 can be connected to the other eNodeB 2022 through a backhaul (such as an X2 interface), the eNodeB 2021 is connected to the EPC 203, and the eNodeB 2021 can provide the UE 201 with access to the EPC 203.
- the EPC 203 may include a Mobility Management Entity (MME) 2031, a Home Subscriber Server (HSS) 2032, other MME 2033, a Serving Gateway (SGW) 2034, a Packet Data Network Gateway (PDN Gateway, PGW) 2035, a Policy and Charging Rules Function (PCRF) 2036, and so on.
- MME2031 is a control node that processes signaling between UE201 and EPC203, and provides bearer and connection management.
- the HSS 2032 is configured to provide registers to manage functions such as the home location register (not shown in the figure), and holds user-specific information about service characteristics, data rates, and so on. All user data can be sent through the SGW 2034.
- PGW2035 can provide UE 201 IP address allocation and other functions.
- the PCRF 2036 is the policy and charging control decision point for service data flows and IP bearer resources; it selects and provides available policy and charging control decisions for the policy and charging enforcement function unit (not shown).
- the IP service 204 may include the Internet, an intranet, an IP Multimedia Subsystem (IMS), or other IP services.
- the embodiment of the present application provides a voice switching method for a voice call, which is applied to a mobile terminal and includes:
- the approximation speech processing is: approximating the playback time of PCM speech data by using an approximation method in combination with speech processing to obtain the approximation PCM speech data.
- before step S1 (acquiring the first PCM voice data received by the receiving end during the audio device switch, storing it in the buffer, and applying approximation speech processing to it), the method further includes: detecting an audio device switch during the voice call.
- the step of detecting the audio device switch includes: the proximity sensor detects movement of the mobile terminal during a voice call, and if it detects that the mobile terminal has moved, it is determined that an audio device switch has occurred.
- the speech processing is to change only the rate of speech and keep the intonation and semantics of the speech unchanged, and the speech processing is divided into two stages of speech decomposition and speech synthesis.
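The application does not name a specific algorithm for changing the speech rate while keeping the intonation. One simple family of techniques that matches the "decomposition and synthesis" description is overlap-add (OLA) time-scale modification: the signal is cut into overlapping windowed frames (decomposition) and the frames are laid down again at a smaller spacing (synthesis). The sketch below is only an illustration under stated assumptions: the function name `ola_time_compress`, the Hann window, and the frame and hop sizes are hypothetical choices, not taken from the application.

```python
import math


def ola_time_compress(samples, speed=2.0, frame_len=320, synth_hop=160):
    """Shorten `samples` by roughly a factor of `speed` using plain overlap-add.

    Assumptions (not from the application): mono PCM given as a list of numbers,
    a Hann window, and fixed sizes (320/160 samples = 20/10 ms at 16 kHz).
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if len(samples) < frame_len:
        return list(samples)  # too short to frame; return unchanged

    # Decomposition: frames are read every `analysis_hop` input samples.
    analysis_hop = int(round(synth_hop * speed))
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_frames = 1 + (len(samples) - frame_len) // analysis_hop

    # Synthesis: frames are written every `synth_hop` output samples.
    out_len = (n_frames - 1) * synth_hop + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len  # accumulated window weight, used for normalisation
    for i in range(n_frames):
        src = i * analysis_hop
        dst = i * synth_hop
        for n in range(frame_len):
            out[dst + n] += samples[src + n] * window[n]
            norm[dst + n] += window[n]
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]
```

Because each frame is copied without resampling, the local waveform, and with it the perceived pitch, is roughly kept while the duration shrinks. Plain OLA produces audible artifacts on real speech; a practical implementation would more likely use a waveform-similarity or phase-vocoder method.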
- in step S1, acquiring the first PCM voice data received by the receiving end during the audio device switch, storing it in the buffer, and applying approximation speech processing to it includes:
- the proximity sensor detects that the mobile terminal is moving and audio device switching occurs, such as switching between the handset and the speaker, and records the first time stamp T1 when the mobile terminal is about to leave the human ear;
- the audio device switching time T is also the playback time of the first PCM voice data stored in the buffer during that period;
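A minimal sketch of the buffering part of step S1, assuming downlink PCM frames keep arriving in real time while the switch is in progress. The class name `SwitchBuffer`, its methods, and the 8 kHz, 16-bit mono format are hypothetical; the application only states that a timestamp T1 is recorded, that the switch spans [T1, T2], and that the buffered data has playback time T.

```python
class SwitchBuffer:
    """Collects downlink PCM frames while the audio device is switching."""

    def __init__(self, sample_rate=8000, bytes_per_sample=2):
        # Assumed PCM format: 8 kHz, 16-bit mono (not specified in the source).
        self.sample_rate = sample_rate
        self.bytes_per_sample = bytes_per_sample
        self.t1 = None
        self.t2 = None
        self.frames = []

    def start(self, timestamp):
        """Record T1 when the switch is detected and begin buffering."""
        self.t1 = timestamp
        self.t2 = None
        self.frames = []

    def push(self, frame):
        """Store a frame only while the switch is in progress."""
        if self.t1 is not None and self.t2 is None:
            self.frames.append(bytes(frame))

    def finish(self, timestamp):
        """Record T2 when the switch succeeds and return T = T2 - T1."""
        self.t2 = timestamp
        return self.t2 - self.t1

    def buffered_seconds(self):
        """Playback time of the buffered first PCM voice data."""
        total_bytes = sum(len(f) for f in self.frames)
        return total_bytes / (self.sample_rate * self.bytes_per_sample)
```

With 20 ms frames of 320 bytes, a 2-second switch buffers 100 frames, and `buffered_seconds()` matches the switching time returned by `finish()`.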
- in step S2, acquiring and applying approximation speech processing to the second PCM voice data, which has the same duration as the first PCM voice data and is received by the receiving end after the audio device is successfully switched, so that the first PCM voice data and the second PCM voice data are played within the switching time, includes:
- steps S21 to S23 are executed in a loop until the approximated playback time is shorter than the preset playback time.
- N is the number of playbacks performed by the time the approximated playback time becomes shorter than the preset playback time.
- playback of the first PCM voice data and of the N batches of second PCM voice data is completed within the switching time, where N is an integer greater than or equal to 1.
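The loop can be sketched as a schedule of playback durations. The individual steps S21 to S23 are not spelled out in this text, so the sketch below assumes what the worked example in the description suggests: each round compresses the pending data to half its original duration, plays it, and the data that arrives during that playback becomes the next round's input. The function name `approximation_schedule`, its parameters, and the choice to include the final short batch in the result are illustrative assumptions.

```python
def approximation_schedule(switch_time, min_play_time=0.05, ratio=0.5):
    """Return the playback durations of successive compressed batches.

    switch_time:   audio device switching time T, in seconds.
    min_play_time: preset playback time Tu; the loop stops once a batch is shorter.
    ratio:         assumed compression factor per round (0.5 in the worked example).
    """
    if switch_time <= 0 or min_play_time <= 0 or not 0 < ratio < 1:
        raise ValueError("expected switch_time > 0, min_play_time > 0, 0 < ratio < 1")
    durations = []
    pending = switch_time  # original duration of the data waiting to be played
    while True:
        play_time = pending * ratio  # approximation processing shortens the batch
        durations.append(play_time)
        if play_time < min_play_time:
            break  # stop condition: playback time is below the preset time
        pending = play_time  # data that arrived while this batch was playing
    return durations
```

For a 2-second switch and a 50 ms threshold, the schedule is 1, 0.5, 0.25, 0.125, 0.0625, and 0.03125 seconds, which sums to 1.96875 seconds and therefore stays within the switching time.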
- FIG. 4 is a schematic diagram of a method for seamlessly switching voices during an audio device switching process by using an approximation method combining buffering and voice processing according to an embodiment of the present application.
- the audio device switch of the mobile terminal occurs during the time T spanning [T1, T2].
- the following period of the same length T is [T2, T3].
- during [T2, T3] the receiving end plays for the same length of time as the device switching time, and the second PCM voice data is obtained during this period.
- the PCM voice data PCM-T/4 is the second batch of third PCM voice data obtained after processing.
- its playback time is half of its original playback time, that is, T/4.
- for example, with a 2-second audio device switch: in step S1 the first PCM voice data received during the 2-second switch has been acquired and buffered, and approximation speech processing is applied to this 2 seconds of PCM voice data.
- the first batch of third PCM voice data, with a playback time of 1 second, is obtained.
- the first batch of second PCM voice data is acquired while the first batch of third PCM voice data is played (until the 1-second playback is complete).
- the original playback time of the first batch of second PCM voice data is 1 second; approximation speech processing shortens it to 0.5 seconds, which yields the second batch of third PCM voice data.
- the second batch of second PCM voice data is acquired while the second batch of third PCM voice data is played (until the 0.5-second playback is complete); its original playback time is 0.5 seconds.
- approximation speech processing shortens the second batch of second PCM voice data to 0.25 seconds, which yields the third batch of third PCM voice data.
- the third batch of second PCM voice data is acquired while the third batch of third PCM voice data is played (until the 0.25-second playback is complete).
- the original playback time of the third batch of second PCM voice data is 0.25 seconds.
- approximation speech processing shortens the third batch of second PCM voice data to 0.125 seconds, which yields the fourth batch of third PCM voice data.
- playback of the first batch of PCM voice data completes at T/2, with playback time Tx = T/2; playback of the second batch completes at T/2 + T/4, with Tx = T/4; playback of the third batch completes at T/2 + T/4 + T/8, with Tx = T/8; and so on, until playback of the Nth batch completes at T/2 + T/4 + T/8 + ... + T/2^N, with Tx = T/2^N.
- the playback times of the batches form a geometric sequence with first term T/2 and common ratio 1/2. By the geometric series summation formula $T_N = \frac{T}{2}\cdot\frac{1-(1/2)^N}{1-1/2} = T\left(1-\frac{1}{2^N}\right)$, the total playback time of the PCM voice data tends to T as N tends to infinity.
- here T_N is the total playing time, 1/2 is the common ratio between successive playing times, and N is the number of playing times.
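As a worked illustration, the sum and the resulting stop count can be written out for the example values used in the description (T = 2 seconds and a preset playback time Tu of 50 milliseconds). The symbols T_N, T_x, and T_u follow the text; the closed form for N is a derivation from them rather than a formula stated in the application.

```latex
\begin{align*}
T_N &= \sum_{k=1}^{N} \frac{T}{2^{k}} = T\left(1 - \frac{1}{2^{N}}\right),
\qquad T - T_N = \frac{T}{2^{N}} = T_x \\
T_x < T_u &\iff N > \log_2\frac{T}{T_u} \\
T = 2\,\text{s},\ T_u = 0.05\,\text{s}: \quad
& \log_2 40 \approx 5.32 \ \Rightarrow\ N = 6,\quad
T_x = \frac{2}{64}\,\text{s} = 31.25\,\text{ms},\quad
T_6 = 1.96875\,\text{s}
\end{align*}
```

Under these values the approximation stops after six rounds, and the total playback time falls short of T by exactly the last playback time, 31.25 milliseconds.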
- when the playback time Tx is shorter than the preset playback time Tu (for example, 50 milliseconds), the approximation is stopped and the approximation process ends.
- once the playback time Tx is less than the preset playback time Tu (such as 50 milliseconds), the user can hardly perceive that any PCM voice data was not acquired or was lost.
- the switching time T of the audio device is required to lie between a preset minimum switching time Tmin (such as 0.5 seconds) and a preset maximum switching time Tmax (such as 5 seconds), that is, Tmin ≤ T ≤ Tmax.
- when the audio device switching time T is greater than Tmax, the switch has taken too long and the amount of PCM voice data to be processed is relatively large, so the user is advised to ask the other party to repeat. When T is shorter than Tmin, the switch is very short and very little PCM voice data arrives during it, so there is no need to perform seamless switching.
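The range check on the switching time can be sketched as a small decision function. The function name `switching_action` and the returned labels are hypothetical; the default thresholds are the example values given in the text (0.5 seconds and 5 seconds).

```python
def switching_action(switch_time, t_min=0.5, t_max=5.0):
    """Decide how to handle a switch of `switch_time` seconds.

    Returns one of three illustrative labels (not from the application):
      "skip"            - T < Tmin, too short to need seamless switching
      "ask_repeat"      - T > Tmax, too much data; advise asking for a repeat
      "seamless_switch" - Tmin <= T <= Tmax, run the approximation processing
    """
    if switch_time < t_min:
        return "skip"
    if switch_time > t_max:
        return "ask_repeat"
    return "seamless_switch"
```

Both boundaries are treated as inside the valid range, matching the condition Tmin ≤ T ≤ Tmax.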
- with the embodiments of the present application, the first PCM voice data with a playing time of T (the audio device switching time) and the second PCM voice data with the same playing time of T are processed by approximation speech processing with buffering into third PCM voice data with a total playback time of T, half the original playback time. This ensures that when the audio device is switched during a call, all downlink PCM voice data is completely delivered and played, so that switching is seamless, the user loses no voice information during the device switch, and the user's understanding of the conversation and the user experience are improved.
- the embodiment of the present application provides a voice switching device for a voice call, which is applied to a mobile terminal.
- the voice switching device 300 includes a detection module 301, an acquisition module 302, a cache module 303, and a processing module 304, of which:
- the detection module 301 is configured to detect an audio device switch during a voice call;
- the acquisition module 302 is configured to obtain first PCM voice data received by the receiving end during switching of an audio device, and second PCM voice data of the same duration as the first PCM voice data, received after the audio device is successfully switched;
- the cache module 303 is configured to buffer the first PCM voice data and the PCM voice data produced during approximation speech processing;
- the processing module 304 is configured to perform approximation speech processing on the buffered first PCM voice data and on the second PCM voice data received after the audio device is successfully switched, so that the first PCM voice data and the second PCM voice data are played within the switching time.
- the detection module 301 is a proximity sensor.
- An application embodiment of the present application provides a voice switching method for a voice call, which is applied to a mobile terminal and includes:
- the proximity sensor detects the movement of the mobile terminal during a voice call. If a movement of the mobile terminal is detected, it is determined that an audio device switching situation has occurred.
- the audio device switching time T is also the playing time of the first PCM voice data stored in the buffer during this period.
- S705: apply approximation speech processing to the first PCM voice data to obtain the first batch of third PCM voice data.
- play the first batch of third PCM voice data, acquire the first batch of second PCM voice data during that playback time, and apply approximation speech processing to the first batch of second PCM voice data to obtain the second batch of third PCM voice data.
- an embodiment of the present application further provides a terminal.
- the terminal 900 includes: a memory 902, a processor 901, and one or more computer programs stored in the memory 902 and executable on the processor 901.
- the memory 902 and the processor 901 are coupled together through a bus system 903, and the one or more computer programs are executed by the processor 901 to implement the method provided by an embodiment of the present application.
- the methods disclosed in the embodiments of the present application may be applied to the processor 901, or implemented by the processor 901.
- the processor 901 may be an integrated circuit chip and has a signal processing capability. In the implementation process, each step of the foregoing method may be completed by using an integrated logic circuit of hardware in the processor 901 or an instruction in the form of software.
- the processor 901 may be a general-purpose processor, a DSP, or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
- the processor 901 may implement or execute various methods, steps, and logic block diagrams disclosed in the embodiments of the present application.
- a general-purpose processor may be a microprocessor or any conventional processor.
- the steps may be directly implemented by a hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
- the software module may be located in a storage medium.
- the storage medium is located in the memory 902.
- the processor 901 reads the information in the memory 902 and completes the steps of the foregoing method in combination with its hardware.
- the memory 902 in the embodiment of the present application may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memory.
- the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a compact disc read-only memory (CD-ROM), or a digital video disk (DVD).
- the volatile memory may be a random access memory (RAM), for example a static random access memory (SRAM), a synchronous static random access memory (SSRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDRSDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a SyncLink dynamic random access memory (SLDRAM), or a direct Rambus random access memory (DRRAM).
- an embodiment of the present application further provides a computer storage medium, specifically a computer-readable storage medium, such as the memory 902 storing a computer program; the computer storage medium stores one or more programs of the voice switching method for a voice call.
- when the one or more programs are executed by the processor 901, they implement the steps of the voice switching method for a voice call provided by the embodiment of the present application.
- the voice switching method, apparatus, terminal, and computer-readable storage medium provided by the present application acquire the first PCM voice data received by the receiving end during audio device switching, apply approximation speech processing to it, and store it in a buffer; they then acquire, and apply approximation speech processing to, the second PCM voice data of the same duration as the first PCM voice data that the receiving end receives after the audio device is successfully switched, so that the first PCM voice data and the second PCM voice data finish playing within the switching time, achieving seamless switching.
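As a small worked example of the buffering step, the sketch below estimates how much PCM data accumulates during the audio device switching time T = T2 - T1 (the two timestamps recited in the claims). The 8 kHz sample rate, 16-bit samples, and single channel are assumptions typical of narrowband telephony; the application does not state them, and the function name `switching_window_bytes` is hypothetical.

```python
def switching_window_bytes(t1_ms, t2_ms, sample_rate_hz=8000,
                           bytes_per_sample=2, channels=1):
    """Return (T in ms, PCM bytes received during T).

    t1_ms: first timestamp T1 (terminal about to leave the ear).
    t2_ms: second timestamp T2 (touch screen tapped to switch device).
    The audio format defaults are assumptions, not from the application.
    """
    if t2_ms < t1_ms:
        raise ValueError("T2 must not precede T1")
    switch_ms = t2_ms - t1_ms                      # T = T2 - T1
    samples = sample_rate_hz * switch_ms // 1000   # samples received in T
    return switch_ms, samples * bytes_per_sample * channels


switch_ms, size = switching_window_bytes(t1_ms=1000, t2_ms=2500)
print(f"T = {switch_ms} ms, buffer at least {size} bytes")
```

A buffer sized this way holds exactly the first PCM voice data for one switching window; a real implementation would likely add headroom.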
Claims (10)
- A voice switching method, comprising: acquiring first PCM voice data received by a receiving end during an audio device switching process, applying approximation speech processing to it, and storing it in a buffer; and acquiring, and applying approximation speech processing to, second PCM voice data of the same duration as the first PCM voice data that is received by the receiving end after the audio device is successfully switched, so that the first PCM voice data and the second PCM voice data finish playing within the switching time.
- The method according to claim 1, wherein before the step of acquiring the first PCM voice data received by the receiving end during the audio device switching process, applying approximation speech processing to it, and storing it in the buffer, the method further comprises: detecting an audio device switch during a voice call.
- The method according to claim 1, wherein the step of acquiring the first PCM voice data received by the receiving end during the audio device switching process, applying approximation speech processing to it, and storing it in the buffer comprises: when an audio device switch occurs, recording a first timestamp T1 at which the mobile terminal is about to leave the human ear; recording a second timestamp T2 at which the touch screen of the mobile terminal is tapped to switch the audio device; acquiring the first PCM voice data within the audio device switching time T and storing it in the buffer, where the audio device switching time T = T2 - T1; and applying approximation speech processing to the first PCM voice data to obtain the 1st third PCM voice data.
- The method according to claim 3, wherein the step of acquiring, and applying approximation speech processing to, the second PCM voice data of the same duration as the first PCM voice data that is received by the receiving end after the audio device is successfully switched, so that the first PCM voice data and the second PCM voice data finish playing within the switching time, comprises: playing the N-th third PCM voice data; within the playback time, acquiring the N-th second PCM voice data received by the receiving end after the audio device is successfully switched; applying approximation speech processing to the N-th second PCM voice data to obtain the (N+1)-th third PCM voice data; and repeating the above steps until N equals the number of playbacks at which the approximated playback time is less than a preset playback time, where N is an integer greater than or equal to 1.
- The method according to any one of claims 1 to 4, wherein the approximation speech processing is: approximating the playback time of PCM voice data by using an approximation method combined with speech processing, to obtain the approximated PCM voice data.
- The method according to claim 5, wherein the speech processing is divided into two stages, speech decomposition and speech synthesis, wherein: in the speech decomposition stage, the original PCM voice data is divided into frames, and the resulting frames are used for speech synthesis, the frame length being N and the frame shift being Sa; and in the speech synthesis stage, the position of the first frame from the speech decomposition stage is kept unchanged and the subsequent frames are moved, the frame shift Sa of the speech decomposition stage being changed, according to the speed factor a = Ss/Sa, to the frame shift Ss = Sa * a of the speech synthesis stage.
- A voice switching apparatus, comprising an acquisition module, a buffer module, and a processing module, wherein: the acquisition module is configured to acquire first PCM voice data received by a receiving end during an audio device switching process, and second PCM voice data of the same duration as the first PCM voice data after the audio device is successfully switched; the buffer module is configured to buffer the first PCM voice data and the PCM voice data produced during approximation speech processing; and the processing module is configured to apply approximation speech processing to the first PCM voice data and the second PCM voice data, so that the first PCM voice data and the second PCM voice data finish playing within the switching time.
- The apparatus according to claim 7, further comprising a detection module configured to detect an audio device switch during a voice call.
- A terminal, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the voice switching method according to any one of claims 1 to 6.
- A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the voice switching method according to any one of claims 1 to 6.
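The decomposition and synthesis stages recited in claim 6 resemble overlap-add (OLA) time-scale modification. The sketch below is a plain OLA illustration under that assumption, not the application's implementation: the Hann window, the gain normalization, and the names `ola_time_scale`, `frame_len`, `analysis_shift`, and `factor` are choices made here for illustration. Frames of length N are taken every Sa samples and re-laid every Ss = Sa * a samples, with the first frame left in place.

```python
import math


def ola_time_scale(samples, frame_len, analysis_shift, factor):
    """Plain overlap-add time scaling of a list of PCM samples.

    frame_len:      frame length N.
    analysis_shift: decomposition-stage frame shift Sa.
    factor:         speed factor a = Ss / Sa (a < 1 shortens playback).
    """
    if frame_len < 2 or analysis_shift < 1 or factor <= 0:
        raise ValueError("invalid framing parameters")
    synthesis_shift = max(1, round(analysis_shift * factor))  # Ss = Sa * a

    # Hann window so that overlapping frames cross-fade (an assumption).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]

    # Decomposition: frame start positions, Sa samples apart.
    starts = range(0, len(samples) - frame_len + 1, analysis_shift)
    if len(starts) == 0:
        return list(samples)  # too short to frame; return unchanged

    out_len = (len(starts) - 1) * synthesis_shift + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len

    # Synthesis: frame k is placed at k * Ss, so frame 0 stays at 0.
    for k, start in enumerate(starts):
        pos = k * synthesis_shift
        for i in range(frame_len):
            out[pos + i] += samples[start + i] * window[i]
            norm[pos + i] += window[i]

    # Divide by the summed window weights to keep the level constant.
    return [o / n if n > 1e-9 else 0.0 for o, n in zip(out, norm)]


shortened = ola_time_scale([1.0] * 1000, frame_len=200,
                           analysis_shift=100, factor=0.5)
print(len(shortened))
```

Plain OLA ignores waveform alignment between frames, so it can sound rough on real speech; variants that search for a better overlap position are commonly used instead, and the claims do not say which variant is intended.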
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810752172.2 | 2018-07-10 | ||
CN201810752172.2A CN110708417B (en) | 2018-07-10 | 2018-07-10 | Voice switching method, device, terminal and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020010963A1 (en) | 2020-01-16 |
Family
ID=69142181
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/089623 WO2020010963A1 (en) | 2018-07-10 | 2019-05-31 | Voice handover method, apparatus, terminal, and computer-readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110708417B (en) |
WO (1) | WO2020010963A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113434309A (en) * | 2021-06-23 | 2021-09-24 | 东风汽车有限公司东风日产乘用车公司 | Message broadcasting method, device and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009224911A (en) * | 2008-03-13 | 2009-10-01 | Onkyo Corp | Headphone |
CN101854425A (en) * | 2009-04-02 | 2010-10-06 | 深圳富泰宏精密工业有限公司 | Mobile device and sound mode switching method thereof |
CN103811033A (en) * | 2012-11-14 | 2014-05-21 | 北京新媒传信科技有限公司 | Method and device for controlling voice playing modes |
CN103984518A (en) * | 2014-04-18 | 2014-08-13 | 青岛尚慧信息技术有限公司 | Intelligent mobile terminal |
CN103985394A (en) * | 2014-04-18 | 2014-08-13 | 青岛尚慧信息技术有限公司 | Audio file playing method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10079030B2 (en) * | 2016-08-09 | 2018-09-18 | Qualcomm Incorporated | System and method to provide an alert using microphone activation |
CN107577447A (en) * | 2017-08-24 | 2018-01-12 | 联想(北京)有限公司 | Control the method and apparatus for media playing of media data output |
CN107872584A (en) * | 2017-11-24 | 2018-04-03 | 维沃移动通信有限公司 | A kind of multi-media processing method, multimedia equipment and terminal |
- 2018-07-10: CN application CN201810752172.2A (patent CN110708417B), status: Active
- 2019-05-31: WO application PCT/CN2019/089623 (publication WO2020010963A1), status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN110708417B (en) | 2021-02-23 |
CN110708417A (en) | 2020-01-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106982286B (en) | Recording method, recording equipment and computer readable storage medium | |
CN108234876B (en) | Tracking focusing method, terminal and computer readable storage medium | |
CN107493497A (en) | A kind of video broadcasting method, terminal and computer-readable recording medium | |
CN109618052A (en) | A kind of call audio switching method and device, mobile terminal and readable storage medium storing program for executing | |
CN112004174B (en) | Noise reduction control method, device and computer readable storage medium | |
CN108076204A (en) | The method and terminal of a kind of call treatment | |
WO2021190545A1 (en) | Call processing method and electronic device | |
WO2021129835A1 (en) | Volume control method and device, and computer-readable storage medium | |
CN109743764A (en) | A kind of identification elevator scene method and device, mobile terminal device and storage medium | |
CN107707728A (en) | A kind of audio transmission switching method, terminal and computer-readable recording medium | |
WO2021129818A1 (en) | Video playback method and electronic device | |
CN107463255A (en) | A kind of video broadcasting method, terminal and computer-readable recording medium | |
CN109660973A (en) | Bluetooth control method, mobile terminal and storage medium | |
CN109711830B (en) | Quick display method and device for two-dimension code payment, mobile terminal and storage medium | |
WO2021098708A1 (en) | Calling method, and terminal apparatus | |
WO2020010963A1 (en) | Voice handover method, apparatus, terminal, and computer-readable storage medium | |
WO2022105874A1 (en) | Image display method, terminal device, and storage medium | |
CN112887776B (en) | Method, equipment and computer readable storage medium for reducing audio delay | |
CN109067976A (en) | desktop automatic switching method, mobile terminal and computer readable storage medium | |
CN109739641B (en) | Self-adaptive CPU frequency modulation acceleration method and device, mobile terminal and storage medium | |
CN110087013B (en) | Video chat method, mobile terminal and computer readable storage medium | |
CN109739642B (en) | CPU frequency modulation method and device, mobile terminal and computer readable storage medium | |
CN107566745B (en) | Shooting method, terminal and computer readable storage medium | |
CN109862163A (en) | A kind of screen sound-emanating areas optimization method and device, mobile terminal and storage medium | |
CN110198378A (en) | It is a kind of using the method and device of data buffer storage, mobile terminal and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19835212; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 19835212; Country of ref document: EP; Kind code of ref document: A1 |
 | 32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04/02/2022) |