WO2020010963A1 - Voice handover method, apparatus, terminal, and computer-readable storage medium - Google Patents

Voice handover method, apparatus, terminal, and computer-readable storage medium

Info

Publication number
WO2020010963A1
Authority
WO
WIPO (PCT)
Prior art keywords
pcm
voice data
speech
switching
audio device
Prior art date
Application number
PCT/CN2019/089623
Other languages
French (fr)
Chinese (zh)
Inventor
杨柳
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2020010963A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72448 User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/725 Cordless telephones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/39 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech synthesis

Definitions

  • This application relates to, but is not limited to, the field of mobile terminals.
  • the user will switch from the hand-held state to the hands-free state during the call out of necessity.
  • the user cannot normally listen to the downlink pulse code modulation (PCM) voice data.
  • the specific switching time will vary depending on the operating speed and operating habits of different users, and is generally greater than 2 seconds.
  • downlink PCM voice data may be missed or lost during the switching process. The longer the switching time, the more downlink PCM voice data is lost, which impairs the user's reception of the other party's speech and degrades the user experience.
  • embodiments of the present application provide a voice switching method, device, terminal, and computer-readable storage medium for a voice call.
  • a voice switching method provided for a mobile terminal includes:
  • a voice switching device which is applied to the voice switching method.
  • the voice switching device includes: an acquisition module, a cache module, and a processing module, where:
  • the acquisition module is configured to obtain first PCM voice data received by a receiving end during a switching process of an audio device, and second PCM voice data having the same duration as the first PCM voice data after the audio device is successfully switched;
  • the cache module is configured to buffer the first PCM voice data and the PCM voice data produced during approximation speech processing;
  • the processing module is configured to perform approximation speech processing on the first PCM voice data and the second PCM voice data, so that the first PCM voice data and the second PCM voice data are played within the switching time.
  • a terminal includes a memory, a processor, and a computer program stored on the memory and executable on the processor; when the computer program is executed by the processor, the steps of the voice switching method provided in the embodiments of the present application are implemented.
  • a computer-readable storage medium stores a program of the voice switching method; when the program is executed by a processor, the steps of the voice switching method provided in the embodiments of the present application are implemented.
  • the method, device, terminal, and computer-readable storage medium for a voice call acquire the first PCM voice data received by the receiving end during audio device switching, apply approximation speech processing to it, and store it in the buffer; they then acquire second PCM voice data of the same duration as the first PCM voice data, received by the receiving end after the audio device is successfully switched, and apply approximation speech processing to it, so that the first PCM voice data and the second PCM voice data are played within the switching time and the switch is seamless.
  • FIG. 1 is a schematic diagram of a hardware structure of a mobile terminal that implements various embodiments of the present application
  • FIG. 2 is a structural diagram of a communication network system according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a voice switching method for a voice call according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of a method for seamlessly switching voices during an audio device switching process by using an approximation method combining buffering and voice processing according to an embodiment of the present application;
  • FIG. 5 is a schematic structural diagram of a voice switching device for a voice call according to an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a voice switching method for a voice call according to an embodiment of the present application.
  • the terminal can be implemented in various forms.
  • the terminals described in this application may include mobile terminals such as mobile phones, tablets, laptops, palmtop computers, Personal Digital Assistants (PDAs), Portable Media Players (PMPs), navigation devices, wearable devices, smart bracelets, and pedometers, as well as fixed terminals such as digital TVs and desktop computers.
  • a mobile terminal will be taken as an example for explanation.
  • the configuration according to the embodiment of the present application can also be applied to a fixed type terminal.
  • FIG. 1 is a schematic diagram of a hardware structure of a mobile terminal for implementing the embodiments of the present application.
  • the mobile terminal 100 may include a radio frequency (RF) unit 101, a WiFi module 102, an audio output unit 103, an audio/video (A/V) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, and a power supply 111.
  • the RF unit 101 may be configured to receive and transmit signals during transmission and reception of information or during a call. Specifically, the downlink information of the base station is received and processed by the processor 110; in addition, uplink data is transmitted to the base station.
  • the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 can also communicate with a network and other devices through wireless communication.
  • the above wireless communication can use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access 2000 (CDMA2000), Wideband Code Division Multiple Access (WCDMA), Time Division-Synchronous Code Division Multiple Access (TD-SCDMA), Frequency Division Duplexing Long-Term Evolution (FDD-LTE), and Time Division Duplexing Long-Term Evolution (TDD-LTE).
  • WiFi is a short-range wireless transmission technology.
  • the mobile terminal can help users send and receive emails, browse web pages, and access streaming media through the WiFi module 102. It provides users with wireless broadband Internet access.
  • although FIG. 1 shows the WiFi module 102, it can be understood that it is not an essential component of the mobile terminal and can be omitted as needed without changing the essence of the invention.
  • when the mobile terminal 100 is in a call signal receiving mode, a call mode, a recording mode, a voice recognition mode, a broadcast receiving mode, or the like, the audio output unit 103 may convert audio data received by the RF unit 101 or the WiFi module 102, or stored in the memory 109, into an audio signal and output it as sound.
  • the audio output unit 103 may also provide audio output (for example, a call signal receiving sound, a message receiving sound, etc.) related to a specific function performed by the mobile terminal 100.
  • the audio output unit 103 may include a speaker, a buzzer, and the like.
  • the A/V input unit 104 is configured to receive an audio or video signal.
  • the A/V input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042.
  • the graphics processor 1041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode.
  • the processed image frames may be displayed on the display unit 106.
  • the image frames processed by the graphics processor 1041 may be stored in the memory 109 (or other storage medium) or transmitted via the RF unit 101 or the WiFi module 102.
  • the microphone 1042 can receive sound (audio data) in an operation mode such as a telephone call mode, a recording mode, or a voice recognition mode, and can process such sound into audio data.
  • the processed audio (voice) data can be converted into a format that can be transmitted to a mobile communication base station via the RF unit 101 in the case of a telephone call mode and output.
  • the microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to eliminate (or suppress) noise or interference generated during the process of receiving and transmitting audio signals.
  • the mobile terminal 100 further includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor includes an ambient light sensor and a proximity sensor, wherein the ambient light sensor can adjust the brightness of the display panel 1061 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 1061 and/or its backlight when the mobile terminal 100 moves to the ear.
  • as a type of motion sensor, an accelerometer can detect the magnitude of acceleration in various directions (generally three axes), and can detect the magnitude and direction of gravity when it is stationary.
  • it can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait orientation, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection). The mobile phone may also be equipped with other sensors, such as a fingerprint sensor, pressure sensor, iris sensor, molecular sensor, gyroscope, barometer, hygrometer, thermometer, and infrared sensor, which are not described further here.
  • the display unit 106 is configured to display information input by the user or information provided to the user.
  • the display unit 106 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
  • the user input unit 107 may be configured to receive inputted numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal.
  • the user input unit 107 may include a touch panel 1071 and other input devices 1072.
  • the touch panel 1071, also known as a touch screen, can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel 1071 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program.
  • the touch panel 1071 may include two parts, a touch detection device and a touch controller.
  • the touch detection device detects the user's touch position, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 110, and can receive and execute commands sent by the processor 110.
  • various types such as resistive, capacitive, infrared, and surface acoustic wave can be used to implement the touch panel 1071.
  • the user input unit 107 may also include other input devices 1072.
  • the other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, a joystick, and the like.
  • the touch panel 1071 may cover the display panel 1061.
  • when the touch panel 1071 detects a touch operation on or near it, it transmits the touch operation to the processor 110 to determine the type of the touch event, and the processor 110 then provides corresponding visual output on the display panel 1061 according to the type of the touch event.
  • although the touch panel 1071 and the display panel 1061 are described as two independent components that implement the input and output functions of the mobile terminal, in some embodiments they may be integrated to implement those functions, which is not specifically limited here.
  • the interface unit 108 functions as an interface through which at least one external device can connect with the mobile terminal 100.
  • the external device may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port configured to connect a device with an identification module, an audio input/output (I/O) port, a video I/O port, a headphone port, and more.
  • the interface unit 108 may be configured to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the mobile terminal 100, or may be configured to transfer data between the mobile terminal 100 and the external device.
  • the memory 109 may be configured to store software programs and various data.
  • the memory 109 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and at least one application required by a function (such as a sound playback function or an image playback function), and the data storage area may store data created through use of the mobile phone (such as audio data or a phone book).
  • the memory 109 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
  • the processor 110 is a control center of the mobile terminal, and uses various interfaces and lines to connect various parts of the entire mobile terminal.
  • by running or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, the processor 110 performs the various functions of the mobile terminal and processes data, thereby monitoring the mobile terminal as a whole.
  • the processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, the user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 110.
  • the mobile terminal 100 may further include a power source 111 (such as a battery) for supplying power to various components.
  • the power source 111 may be logically connected to the processor 110 through a power management system, so that charging, discharging, power consumption management, and other functions are handled through the power management system.
  • the mobile terminal 100 may further include a Bluetooth module and the like, and details are not described herein again.
  • FIG. 2 is a structural diagram of a communication network system according to an embodiment of the present application.
  • the communication network system is an LTE system of universal mobile telecommunications technology.
  • the LTE system includes a user equipment (UE) 201, an Evolved UMTS Terrestrial Radio Access Network (E-UTRAN) 202, an Evolved Packet Core Network (EPC) 203, and the operator's IP service 204.
  • the UE 201 may be the foregoing terminal 100, and details are not described herein again.
  • E-UTRAN 202 includes eNodeB 2021 and other eNodeB 2022.
  • the eNodeB 2021 can be connected to other eNodeB 2022 through a backhaul (such as an X2 interface), the eNodeB 2021 is connected to the EPC 203, and the eNodeB 2021 can provide the UE 201 with access to the EPC 203.
  • the EPC 203 may include a Mobility Management Entity (MME) 2031, a Home Subscriber Server (HSS) 2032, other MMEs 2033, a Serving Gateway (SGW) 2034, a Packet Data Network Gateway (PDN Gateway, PGW) 2035, a Policy and Charging Rules Function (PCRF) 2036, and so on.
  • MME2031 is a control node that processes signaling between UE201 and EPC203, and provides bearer and connection management.
  • the HSS 2032 is configured to provide registers that manage functions such as the home location register (not shown in the figure), and holds user-specific information about service characteristics, data rates, and so on. All user data can be sent through the SGW 2034.
  • the PGW 2035 can provide IP address allocation for the UE 201 and other functions.
  • the PCRF 2036 is the policy decision point of policy and charging control for service data flows and IP bearer resources. It selects and provides available policy and charging control decisions for the policy and charging enforcement function unit (not shown).
  • the IP service 204 may include the Internet, an intranet, an IP Multimedia Subsystem (IMS), or other IP services.
  • the embodiment of the present application provides a voice switching method for a voice call, which is applied to a mobile terminal and includes:
  • the approximation speech processing is: approximating the playback time of PCM speech data by using an approximation method in combination with speech processing to obtain the approximation PCM speech data.
  • before step S1 (acquiring the first PCM voice data received by the receiving end during audio device switching, applying approximation speech processing to it, and storing it in the buffer), the method further includes: detecting audio device switching during the voice call.
  • the step of detecting audio device switching includes: the proximity sensor detects movement of the mobile terminal during a voice call, and if it detects that the mobile terminal has moved, it is determined that the audio device is being switched.
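The detection step can be illustrated with a minimal Python sketch. It assumes the proximity sensor is polled as (timestamp, is_near) pairs and treats the first near-to-far transition as the moment the terminal leaves the ear. The function name and the reading format are hypothetical and do not come from the application.

```python
from typing import Iterable, Optional, Tuple


def first_leave_ear_time(readings: Iterable[Tuple[float, bool]]) -> Optional[float]:
    """Return the timestamp of the first near-to-far transition, or None.

    `readings` is an assumed stream of (timestamp_seconds, is_near) samples
    from a proximity sensor; the format is illustrative only.
    """
    previous_near = None
    for timestamp, is_near in readings:
        # A switch is assumed to begin when the phone was at the ear
        # and is now reported as away from it.
        if previous_near is True and is_near is False:
            return timestamp
        previous_near = is_near
    return None


samples = [(0.0, True), (0.5, True), (1.2, False), (2.0, False)]
leave_time = first_leave_ear_time(samples)
```

A real terminal would receive sensor callbacks from the platform rather than a list, but the transition test would look much the same.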
  • the speech processing is to change only the rate of speech and keep the intonation and semantics of the speech unchanged, and the speech processing is divided into two stages of speech decomposition and speech synthesis.
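The application does not name the algorithm behind the decomposition and synthesis stages. As one possible illustration, the sketch below uses plain overlap-add (OLA) time-scale modification: windowed frames are read from the input at a large hop (decomposition) and written to the output at a smaller hop (synthesis), so the signal gets shorter while each frame keeps its original sample spacing and therefore roughly its pitch. The frame size, the Hann window, and the function names are assumptions; production systems typically use refinements such as WSOLA or a phase vocoder to avoid the artifacts of plain OLA.

```python
import math


def hann_window(size):
    # Periodic Hann window: two copies shifted by size/2 sum to exactly 1.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / size) for i in range(size)]


def ola_speed_up(samples, rate=2.0, frame=256):
    """Shorten `samples` by roughly `rate` using plain overlap-add."""
    hop_out = frame // 2                   # synthesis hop (50% overlap)
    hop_in = int(round(hop_out * rate))    # analysis hop, larger when speeding up
    window = hann_window(frame)
    frame_count = max(1, (len(samples) - frame) // hop_in + 1)
    output = [0.0] * (hop_out * (frame_count - 1) + frame)
    for k in range(frame_count):
        src, dst = k * hop_in, k * hop_out
        for i in range(frame):
            if src + i < len(samples):
                # Decomposition: window a frame; synthesis: add it at the new position.
                output[dst + i] += samples[src + i] * window[i]
    return output


one_second = [1.0] * 8000            # assumed 8 kHz test signal
compressed = ola_speed_up(one_second)
```

With `rate=2.0` the output is about half as long as the input, which matches the halving of playback time described for the approximation speech processing.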
  • in step S1, acquiring the first PCM voice data received by the receiving end during audio device switching, applying approximation speech processing to it, and storing it in the buffer includes:
  • the proximity sensor detects that the mobile terminal is moving and audio device switching occurs, such as switching between the handset and the speaker, and records the first time stamp T1 when the mobile terminal is about to leave the human ear;
  • the audio device switching time T is also the playback time of the first PCM voice data stored in the buffer during that period;
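A minimal sketch of the buffering in step S1 follows. It assumes 8 kHz, 16-bit mono PCM delivered in 20 ms chunks, and an end timestamp taken when the switch succeeds. The class name, the constants, and the chunk size are illustrative assumptions, not values from the application.

```python
from dataclasses import dataclass, field

SAMPLE_RATE_HZ = 8000    # assumed narrowband telephony rate
BYTES_PER_SAMPLE = 2     # assumed 16-bit mono PCM


@dataclass
class SwitchBuffer:
    """Collects downlink PCM received while the audio device is switching."""
    start_time: float = 0.0
    chunks: list = field(default_factory=list)

    def start(self, t1):
        # T1: the terminal is about to leave the ear.
        self.start_time = t1
        self.chunks.clear()

    def push(self, pcm_bytes):
        self.chunks.append(pcm_bytes)

    def stop(self, t2):
        """Return (switching time T, buffered first PCM voice data)."""
        return t2 - self.start_time, b"".join(self.chunks)


def playback_seconds(pcm_bytes):
    return len(pcm_bytes) / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


buffer = SwitchBuffer()
buffer.start(10.0)
for _ in range(100):                 # 100 chunks of 20 ms each
    buffer.push(bytes(320))
switch_time, first_pcm = buffer.stop(12.0)
```

Here the switching time and the playback time of the buffered data are both 2 seconds, which is the equality the text describes.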
  • in step S2, acquiring the second PCM voice data, of the same duration as the first PCM voice data, received by the receiving end after the audio device is successfully switched, and applying approximation speech processing to it so that the first PCM voice data and the second PCM voice data are played within the switching time, includes:
  • Steps S21 to S23 are executed in a loop until the approximated playback time is shorter than the preset playback time.
  • N is the number of playback rounds completed when the approximated playback time becomes shorter than the preset playback time.
  • the first PCM voice data and the N rounds of second PCM voice data are played within the switching time, where N is an integer greater than or equal to 1.
  • FIG. 4 is a schematic diagram of a method for seamlessly switching voices during an audio device switching process by using an approximation method combining buffering and voice processing according to an embodiment of the present application.
  • the switching of the audio device of the mobile terminal occurs during the interval [T1, T2], whose duration is the switching time T.
  • the interval [T2, T3] after the switch also has duration T.
  • during [T2, T3] the receiving end plays for the same length of time as the device switching time, and the second PCM voice data is acquired during this interval.
  • the PCM voice data labelled PCM-T/4 is the second third PCM voice data obtained after processing.
  • its playback time is half of its original playback time, that is, reduced to T/4.
  • for example, in step S1 the first PCM voice data for a 2-second audio device switch has been acquired and buffered. Approximation speech processing is applied to this 2-second PCM voice data, producing the first third PCM voice data, with a playback time of 1 second.
  • the first second PCM voice data is acquired while the first third PCM voice data is played (the 1-second playback completes). Its original playback time is 1 second; approximation speech processing reduces this to 0.5 seconds, producing the second third PCM voice data.
  • the second second PCM voice data is acquired while the second third PCM voice data is played (the 0.5-second playback completes). Its original playback time is 0.5 seconds; approximation speech processing reduces this to 0.25 seconds, producing the third third PCM voice data.
  • the third second PCM voice data is acquired while the third third PCM voice data is played (the 0.25-second playback completes). Its original playback time is 0.25 seconds; approximation speech processing reduces this to 0.125 seconds, producing the fourth third PCM voice data.
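The rounds of this 2-second example can be tabulated with a short sketch. It assumes every round halves the playback time, as in the example; the function name is hypothetical.

```python
def approximation_rounds(switch_time, rounds):
    """Return (round, original_seconds, compressed_seconds) for each round.

    Round 1 compresses the buffered first PCM voice data; each later round
    compresses the audio that arrived while the previous round was playing.
    """
    schedule = []
    original = switch_time
    for n in range(1, rounds + 1):
        compressed = original / 2      # playback time is halved each round
        schedule.append((n, original, compressed))
        original = compressed          # audio received during that playback
    return schedule


schedule = approximation_rounds(2.0, 4)
for n, original, compressed in schedule:
    print(f"round {n}: {original} s of audio played in {compressed} s")
```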
  • measured from the moment the switch succeeds, the first round of PCM voice data playback completes at T/2, with playback time Tx = T/2; the second round completes at T/2 + T/4, with Tx = T/4; the third round completes at T/2 + T/4 + T/8, with Tx = T/8; and so on, so that the Nth round completes at T/2 + T/4 + T/8 + ... + T/2^N, with Tx = T/2^N.
  • the playback times of the successive rounds form a geometric sequence with first term T/2 and common ratio 1/2. By the geometric series summation formula, $T_N = \frac{T}{2}\cdot\frac{1-(1/2)^N}{1-1/2} = T\left(1-\frac{1}{2^N}\right)$, so the total playback time of the PCM voice data tends to T as N tends to infinity.
  • here $T_N$ is the total playback time, 1/2 is the common ratio between successive playback times, and N is the number of playback rounds.
  • when the playback time Tx is shorter than the preset playback time Tu (for example, 50 milliseconds), the approximation stops and the approximation process ends.
  • once Tx is below Tu, the user can hardly perceive that any PCM voice data was missed or lost.
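The stopping rule can be sketched as follows, assuming each round halves the playback time and that N counts the rounds up to and including the first one whose playback time falls below Tu. That reading of N, and the function name, are assumptions.

```python
def rounds_until_threshold(switch_time, threshold=0.050):
    """Return (N, Tx): the round count and the playback time of round N."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    rounds = 0
    playback_time = switch_time
    while True:
        rounds += 1
        playback_time /= 2             # each round plays in half the time
        if playback_time < threshold:
            return rounds, playback_time


rounds, last_playback_time = rounds_until_threshold(2.0)
```

For a 2-second switch and Tu = 50 ms this stops after six rounds, when the playback time is 31.25 ms. Under these assumptions that value is also the lag still remaining behind the live audio.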
  • the switching time T of the audio device is required to lie between a preset minimum switching time Tmin (such as 0.5 seconds) and a preset maximum switching time Tmax (such as 5 seconds).
  • that is, Tmin ≤ T ≤ Tmax. When the switching time T is greater than Tmax, the switch has taken too long and the amount of PCM voice data to be processed is relatively large, so the user is advised to ask the other party to repeat what was said. When the switching time T is shorter than Tmin, the switch is so brief that the user misses hardly any PCM voice data during it, and there is no need to perform seamless switching.
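The three cases can be written as a small decision function. The defaults reuse the example values of 0.5 and 5 seconds from the text; the enum and function names are hypothetical.

```python
from enum import Enum


class SwitchAction(Enum):
    NO_CATCH_UP = "no_catch_up"              # T < Tmin: too little audio to matter
    SEAMLESS_CATCH_UP = "seamless_catch_up"  # Tmin <= T <= Tmax
    ASK_TO_REPEAT = "ask_to_repeat"          # T > Tmax: too much audio to compress


def choose_action(switch_time, t_min=0.5, t_max=5.0):
    """Pick the handling for an audio device switch that lasted `switch_time` seconds."""
    if switch_time < t_min:
        return SwitchAction.NO_CATCH_UP
    if switch_time > t_max:
        return SwitchAction.ASK_TO_REPEAT
    return SwitchAction.SEAMLESS_CATCH_UP


action = choose_action(2.0)
```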
  • with the embodiments of the present application, the first PCM voice data with playback time T (the audio device switching time) and the second PCM voice data with the same playback time T are turned, through buffering and approximation speech processing, into third PCM voice data with a total playback time of T, half of the original 2T.
  • this addresses the case where the audio device is switched during a call: all downlink PCM voice data can be delivered and played, the switch is seamless, the user loses no voice information during the switch, and both the user's understanding of the conversation and the user experience improve.
  • the embodiment of the present application provides a voice switching device for a voice call, which is applied to a mobile terminal.
  • the voice switching device 300 includes a detection module 301, an acquisition module 302, a cache module 303, and a processing module 304, of which:
  • the detection module 301 is configured to detect an audio device switch during a voice call
  • the acquisition module 302 is configured to obtain first PCM voice data received by a receiving end during a switching process of an audio device, and second PCM voice data having the same duration as the first PCM voice data after the audio device is successfully switched;
  • the cache module 303 is configured to buffer the first PCM voice data and the PCM voice data produced during approximation speech processing;
  • the processing module 304 is configured to perform approximation speech processing on the buffered first PCM voice data and on the second PCM voice data received after the audio device is successfully switched, so that the first PCM voice data and the second PCM voice data are played within the switching time.
  • the detection module 301 is a proximity sensor.
  • An application embodiment of the present application provides a voice switching method for a voice call, which is applied to a mobile terminal and includes:
  • the proximity sensor detects the movement of the mobile terminal during a voice call. If a movement of the mobile terminal is detected, it is determined that an audio device switching situation has occurred.
  • the audio device switching time T is also the playing time of the first PCM voice data stored in the buffer during this period.
  • S705: Apply approximation speech processing to the first PCM voice data to obtain the first third PCM voice data.
  • Play the first third PCM voice data, acquire the first second PCM voice data during that playback time, and apply approximation speech processing to the first second PCM voice data to obtain the second third PCM voice data.
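The loop formed by these steps can be simulated end to end on a list of samples. The sketch below is illustrative only: `compress` is a stand-in that keeps every other sample (which halves the duration but does not preserve pitch, unlike the speech processing the application describes), and the 8 kHz rate, the 50 ms threshold, and all names are assumptions.

```python
def catch_up(stream, buffered_len, min_len, compress):
    """Simulate the catch-up loop.

    stream       -- all downlink samples, starting at the beginning of the switch
    buffered_len -- samples received during the switch (first PCM voice data)
    min_len      -- stop once a round's playback is shorter than this many samples
    compress     -- function that shortens a chunk to about half its length
    Returns (played_samples, consumed_source_samples).
    """
    played = []
    start, length = 0, buffered_len
    while length > 0:
        chunk = stream[start:start + length]
        third_pcm = compress(chunk)     # compressed data that is actually played
        played.extend(third_pcm)
        start += length
        length = len(third_pcm)         # live audio that arrived during playback
        if length < min_len:
            break
    return played, start


RATE = 8000                                       # assumed sample rate
stream = list(range(4 * RATE))                    # 4 s of placeholder samples
played, consumed = catch_up(
    stream,
    buffered_len=2 * RATE,                        # 2 s switch
    min_len=RATE * 50 // 1000,                    # 50 ms threshold
    compress=lambda chunk: chunk[::2],            # stand-in, not pitch-preserving
)
```

In this run 31,500 source samples are played back in 15,750 sample periods, leaving a lag of 250 samples (31.25 ms) behind the live audio when the loop stops.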
  • an embodiment of the present application further provides a terminal.
  • the terminal 900 includes: a memory 902, a processor 901, and one or more computer programs stored in the memory 902 and executable on the processor 901.
  • the memory 902 and the processor 901 are coupled together through a bus system 903, and the one or more computer programs are executed by the processor 901 to implement a method provided by an embodiment of the present application.
  • the methods disclosed in the embodiments of the present application may be applied to the processor 901, or implemented by the processor 901.
  • the processor 901 may be an integrated circuit chip and has a signal processing capability. In the implementation process, each step of the foregoing method may be completed by using an integrated logic circuit of hardware in the processor 901 or an instruction in the form of software.
  • the processor 901 may be a general-purpose processor, a DSP, or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like.
  • the processor 901 may implement or execute various methods, steps, and logic block diagrams disclosed in the embodiments of the present application.
  • a general-purpose processor may be a microprocessor or any conventional processor.
  • the steps may be directly implemented by a hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium.
  • the storage medium is located in the memory 902.
  • the processor 901 reads the information in the memory 902 and completes the steps of the foregoing method in combination with its hardware.
  • the memory 902 in the embodiment of the present application may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memory.
  • the non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a compact disc read-only memory (CD-ROM), or a digital video disk (DVD).
  • the volatile memory may be a random access memory (RAM), for example a static random access memory (SRAM), a synchronous static random access memory (SSRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDRSDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a SyncLink dynamic random access memory (SLDRAM), or a direct Rambus random access memory (DRRAM).
  • an embodiment of the present application further provides a computer storage medium, specifically a computer-readable storage medium, such as the memory 902 including a computer program. The computer storage medium stores one or more programs of the voice switching method for a voice call; when executed by the processor 901, the one or more programs implement the steps of the voice switching method for a voice call provided by the embodiment of the present application.
  • the voice switching method, device, terminal, and computer-readable storage medium for a voice call provided by the present application acquire the first PCM voice data received by the receiving end while the audio device is being switched, perform approximation speech processing on it, and store it in a buffer; they then acquire second PCM voice data that is received by the receiving end after the audio device is successfully switched and has the same duration as the first PCM voice data, and perform approximation speech processing on it, so that the first PCM voice data and the second PCM voice data finish playing within the switching time, achieving seamless switching.
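
As a non-limiting illustration, the round-by-round processing summarized above (process the buffered data, play it, then process whatever arrived during that playback) may be sketched as a catch-up schedule. The fixed compression ratio `ratio`, the 20 ms stopping threshold `min_chunk_s`, and the function name are assumptions of this sketch; the present application does not specify them.

```python
def catch_up_schedule(switch_time_s, ratio=0.5, min_chunk_s=0.02):
    """Return (content_s, playback_s) pairs, one per approximation round.

    switch_time_s: duration T of audio buffered while the device switched.
    ratio: hypothetical time-compression factor applied in every round.
    min_chunk_s: stop once the backlog is shorter than one assumed 20 ms frame.
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be strictly between 0 and 1")
    rounds = []
    backlog = switch_time_s           # audio waiting to be played
    while backlog >= min_chunk_s:
        playback = backlog * ratio    # compressed playing time of this round
        rounds.append((backlog, playback))
        backlog = playback            # new audio that arrived during playback
    return rounds


schedule = catch_up_schedule(2.0)     # T = 2 s, as in the background section
total_content = sum(content for content, _ in schedule)
total_playback = sum(playback for _, playback in schedule)
```

Under the assumed ratio of 0.5, the total playback time stays just below T while the audio content played approaches 2T, which is consistent with the statement that the first PCM voice data and an equally long second PCM voice data finish playing within the switching time.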

Abstract

Disclosed by the present application are a voice switching method, apparatus, terminal, and computer-readable storage medium for a voice call. The method comprises: obtaining first PCM voice data received by a receiving terminal while an audio device is being switched, and storing said data in a buffer; after the audio device is successfully switched, obtaining second PCM voice data of the receiving terminal having the same duration as the first PCM voice data; processing the first PCM voice data and the second PCM voice data according to a preset PCM voice data processing strategy to obtain processed third PCM voice data, and transmitting said third PCM voice data to the switched-to audio device.

Description

Voice switching method, device, terminal, and computer-readable storage medium
Cross-reference to related applications
This application is based on, and claims priority to, Chinese patent application No. 201810752172.2, filed on July 10, 2018, the entire content of which is incorporated herein by reference.
Technical field
This application relates to, but is not limited to, the field of mobile terminals.
Background
During an everyday voice call, a user may need to switch from handheld mode to hands-free mode. While the mobile terminal is moving away from the user's ear during this switch, the user cannot properly hear the downlink pulse code modulation (PCM) voice data. The switching time varies with each user's operating speed and habits, and is generally more than 2 seconds. If the other party is speaking continuously, downlink PCM voice data may be missed or lost during the switch; the longer the switch takes, the more downlink PCM voice data is lost, which impairs the user's reception of what the other party says and degrades the user experience.
Therefore, a new voice switching method is needed to solve the above problem.
Summary of the invention
In view of this, embodiments of the present application provide a voice switching method, device, terminal, and computer-readable storage medium for a voice call.
According to one aspect of the embodiments of the present application, a voice switching method applied to a mobile terminal is provided, including:
acquiring the first PCM voice data received by the receiving end while the audio device is being switched, performing approximation speech processing on it, and storing it in a buffer;
acquiring second PCM voice data that is received by the receiving end after the audio device is successfully switched and has the same duration as the first PCM voice data, and performing approximation speech processing on it, so that the first PCM voice data and the second PCM voice data finish playing within the switching time.
According to another aspect of the embodiments of the present application, a voice switching device applied to the voice switching method is provided. The voice switching device includes an acquisition module, a buffer module, and a processing module, where:
the acquisition module is configured to acquire the first PCM voice data received by the receiving end while the audio device is being switched, and the second PCM voice data that is received after the audio device is successfully switched and has the same duration as the first PCM voice data;
the buffer module is configured to buffer the first PCM voice data and the PCM voice data produced during approximation speech processing;
the processing module is configured to perform approximation speech processing on the first PCM voice data and the second PCM voice data, so that the first PCM voice data and the second PCM voice data finish playing within the switching time.
According to another aspect of the present application, a terminal is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the voice switching method provided by the embodiments of the present application.
According to another aspect of the present application, a computer-readable storage medium is provided. The computer-readable storage medium stores a program of the voice switching method, and the program, when executed by a processor, implements the steps of the voice switching method provided by the embodiments of the present application.
Compared with the related art, the voice switching method, device, terminal, and computer-readable storage medium for a voice call provided by the embodiments of the present application acquire the first PCM voice data received by the receiving end while the audio device is being switched, perform approximation speech processing on it, and store it in a buffer; they then acquire second PCM voice data that is received by the receiving end after the audio device is successfully switched and has the same duration as the first PCM voice data, and perform approximation speech processing on it, so that the first PCM voice data and the second PCM voice data finish playing within the switching time, achieving seamless switching. With this approximation speech processing, all downlink PCM voice data can be delivered and played in full when the audio device is switched during a call, so the user loses no voice information during the switch, which improves the user's understanding of the speech and the user experience.
The implementation, functional features, and advantages of the present application are further described below with reference to the embodiments and the accompanying drawings.
Brief description of the drawings
FIG. 1 is a schematic diagram of the hardware structure of a mobile terminal that implements the embodiments of the present application;
FIG. 2 is an architecture diagram of a communication network system according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a voice switching method for a voice call according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a method that uses an approximation method combined with buffering and speech processing to switch voice seamlessly during audio device switching according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a voice switching device for a voice call according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a terminal according to an embodiment of the present application;
FIG. 7 is a schematic flowchart of a voice switching method for a voice call according to an embodiment of the present application.
Detailed description
To make the technical problems to be solved, the technical solutions, and the beneficial effects of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only intended to explain the present application and are not intended to limit it.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements serve only to facilitate the description of the present application and have no specific meaning in themselves. Therefore, "module", "component", and "unit" may be used interchangeably.
It should be noted that the terms "first", "second", and the like in the specification and claims of the present application are used to distinguish similar objects and do not necessarily describe a specific order or sequence.
A terminal can be implemented in various forms. For example, the terminals described in this application may include mobile terminals such as mobile phones, tablet computers, notebook computers, palmtop computers, personal digital assistants (PDAs), portable media players (PMPs), navigation devices, wearable devices, smart bracelets, and pedometers, as well as fixed terminals such as digital TVs and desktop computers.
The following description takes a mobile terminal as an example. Those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the configuration according to the embodiments of the present application can also be applied to fixed terminals.
Referring to FIG. 1, which is a schematic diagram of the hardware structure of a mobile terminal that implements the embodiments of the present application, the mobile terminal 100 may include a radio frequency (RF) unit 101, a WiFi module 102, an audio output unit 103, an audio/video (A/V) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110, a power supply 111, and other components. Those skilled in the art will understand that the structure shown in FIG. 1 does not limit the mobile terminal, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The components of the mobile terminal are described in detail below with reference to FIG. 1:
The RF unit 101 may be configured to receive and transmit signals while sending and receiving information or during a call. Specifically, it receives downlink information from the base station and passes it to the processor 110 for processing, and it sends uplink data to the base station. Generally, the RF unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier, a duplexer, and the like. In addition, the RF unit 101 can communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access 2000 (CDMA2000), Wideband Code Division Multiple Access (WCDMA), Time Division-Synchronous Code Division Multiple Access (TD-SCDMA), Frequency Division Duplexing-Long Term Evolution (FDD-LTE), and Time Division Duplexing-Long Term Evolution (TDD-LTE).
WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the mobile terminal can help users send and receive e-mail, browse web pages, and access streaming media, providing users with wireless broadband Internet access. Although FIG. 1 shows the WiFi module 102, it is not an essential part of the mobile terminal and can be omitted as needed without changing the essence of the invention.
The audio output unit 103 may, when the mobile terminal 100 is in a call signal receiving mode, a call mode, a recording mode, a voice recognition mode, a broadcast receiving mode, or a similar mode, convert audio data received by the RF unit 101 or the WiFi module 102, or stored in the memory 109, into an audio signal and output it as sound. The audio output unit 103 may also provide audio output related to a specific function performed by the mobile terminal 100 (for example, a call signal reception sound or a message reception sound). The audio output unit 103 may include a speaker, a buzzer, and the like.
The A/V input unit 104 is configured to receive audio or video signals. The A/V input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042. The GPU 1041 processes image data of still pictures or video obtained by an image capture device (such as a camera) in a video capture mode or an image capture mode. The processed image frames may be displayed on the display unit 106, stored in the memory 109 (or another storage medium), or transmitted via the RF unit 101 or the WiFi module 102. The microphone 1042 can receive sound in operating modes such as a telephone call mode, a recording mode, and a voice recognition mode, and can process that sound into audio data. In the telephone call mode, the processed audio (voice) data can be converted into a format that can be sent to a mobile communication base station via the RF unit 101 and then output. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated while receiving and transmitting audio signals.
The mobile terminal 100 further includes at least one sensor 105, such as a light sensor, a motion sensor, or another sensor. Specifically, the light sensor includes an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display panel 1061 according to the ambient light, and the proximity sensor can turn off at least one of the display panel 1061 and the backlight when the mobile terminal 100 is moved to the ear. As one type of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity. It can be used in applications that recognize the attitude of the phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection). Other sensors with which the phone may be equipped, such as a fingerprint sensor, pressure sensor, iris sensor, molecular sensor, gyroscope, barometer, hygrometer, thermometer, and infrared sensor, are not described here.
The display unit 106 is configured to display information input by the user or information provided to the user. The display unit 106 may include a display panel 1061, which may be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The user input unit 107 may be configured to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also called a touch screen, can collect the user's touch operations on or near it (such as operations performed on or near the touch panel 1071 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected device according to a preset program. The touch panel 1071 may include two parts: a touch detection device and a touch controller. The touch detection device detects the position of the user's touch and the signal produced by the touch operation, and passes the signal to the touch controller. The touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 110, and can receive and execute commands sent by the processor 110. The touch panel 1071 may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 1071, the user input unit 107 may include other input devices 1072. Specifically, the other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, and a joystick, which are not limited here.
In an embodiment, the touch panel 1071 may cover the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, it passes the operation to the processor 110 to determine the type of touch event, and the processor 110 then provides the corresponding visual output on the display panel 1061 according to that type. Although in FIG. 1 the touch panel 1071 and the display panel 1061 are two separate components that implement the input and output functions of the mobile terminal, in some embodiments the touch panel 1071 and the display panel 1061 may be integrated to implement those functions, which is not limited here.
The interface unit 108 serves as an interface through which at least one external device can be connected to the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device that has an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and so on. The interface unit 108 may be configured to receive input (for example, data or power) from an external device and pass it to one or more elements within the mobile terminal 100, or to transfer data between the mobile terminal 100 and the external device.
The memory 109 may be configured to store software programs and various data. The memory 109 may mainly include a program storage area and a data storage area. The program storage area may store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the phone (such as audio data and a phone book). In addition, the memory 109 may include a high-speed random access memory, and may further include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
The processor 110 is the control center of the mobile terminal. It connects the parts of the entire mobile terminal through various interfaces and lines, and performs the functions of the mobile terminal and processes data by running or executing the software programs and/or modules stored in the memory 109 and calling the data stored in the memory 109, thereby monitoring the mobile terminal as a whole. The processor 110 may include one or more processing units. Preferably, the processor 110 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 110.
The mobile terminal 100 may further include a power supply 111 (such as a battery) that supplies power to the components. Preferably, the power supply 111 may be logically connected to the processor 110 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
Although not shown in FIG. 1, the mobile terminal 100 may further include a Bluetooth module and the like, which are not described here.
To facilitate understanding of the embodiments of the present application, the communication network system on which the mobile terminal of the present application is based is described below.
Referring to FIG. 2, which is an architecture diagram of a communication network system according to an embodiment of the present application, the communication network system is an LTE system of universal mobile communication technology. The LTE system includes, connected in sequence, user equipment (UE) 201, an Evolved UMTS Terrestrial Radio Access Network (E-UTRAN) 202, an Evolved Packet Core (EPC) 203, and the operator's IP services 204.
Specifically, the UE 201 may be the terminal 100 described above, which is not described again here.
The E-UTRAN 202 includes an eNodeB 2021, other eNodeBs 2022, and so on. The eNodeB 2021 can be connected to the other eNodeBs 2022 through a backhaul (such as an X2 interface), the eNodeB 2021 is connected to the EPC 203, and the eNodeB 2021 can provide the UE 201 with access to the EPC 203.
The EPC 203 may include a Mobility Management Entity (MME) 2031, a Home Subscriber Server (HSS) 2032, other MMEs 2033, a Serving Gateway (SGW) 2034, a PDN Gateway (PGW) 2035, a Policy and Charging Rules Function (PCRF) entity 2036, and so on. The MME 2031 is a control node that handles signaling between the UE 201 and the EPC 203 and provides bearer and connection management. The HSS 2032 is configured to provide registers that manage functions such as the home location register (not shown) and holds user-specific information about service characteristics, data rates, and so on. All user data can be sent through the SGW 2034. The PGW 2035 can provide IP address allocation for the UE 201 and other functions. The PCRF 2036 is the policy and charging control decision point for service data flows and IP bearer resources; it selects and provides available policy and charging control decisions for the policy and charging enforcement function unit (not shown).
The IP services 204 may include the Internet, an intranet, an IP Multimedia Subsystem (IMS), or other IP services.
虽然上述以LTE系统为例进行了介绍,但本领域技术人员应当知晓, 本申请实施例不仅仅适用于LTE系统,也可以适用于其他无线通信系统,例如GSM、CDMA2000、WCDMA、TD-SCDMA以及未来新的网络系统等,此处不做限定。Although the above is described by taking the LTE system as an example, those skilled in the art should know that the embodiments of the present application are not only applicable to the LTE system, but also applicable to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA, and New network systems in the future are not limited here.
基于上述移动终端硬件结构以及通信网络系统,提出本申请方法各个实施例。Based on the above-mentioned mobile terminal hardware structure and communication network system, various embodiments of the method of the present application are proposed.
Please refer to FIG. 3. An embodiment of the present application provides a voice switching method for a voice call, applied to a mobile terminal, including:
S1. Acquire the first PCM voice data received by the receiving end during audio device switching, apply approximation speech processing to it, and store it in a buffer;
S2. Acquire second PCM voice data that is received by the receiving end after the audio device switching succeeds and that has the same duration as the first PCM voice data, and apply approximation speech processing to it, so that the first PCM voice data and the second PCM voice data finish playing within the switching time, achieving seamless switching.
In an embodiment, the approximation speech processing is: using an approximation method combined with speech processing to approximate the playback time of PCM voice data, obtaining the approximated PCM voice data.
In an embodiment, before step S1, the method further includes a step of detecting audio device switching during the voice call, which includes: a proximity sensor detects movement of the mobile terminal during the voice call, and if movement of the mobile terminal is detected, it is determined that audio device switching is occurring.
In an embodiment, the approximation speech processing is: using an approximation method combined with speech processing to approximate the playback time of PCM voice data, obtaining the approximated PCM voice data.
The speech processing changes only the speaking rate while keeping the intonation and semantics of the speech unchanged, and is divided into two stages: speech decomposition and speech synthesis. In the speech decomposition stage, the original PCM voice data is divided into frames, and the resulting frames are used for speech synthesis; let the frame length be N and the frame shift (the distance between two adjacent frames) be Sa. In the speech synthesis stage, according to the speed-change factor a = Ss/Sa, the frame shift Sa of the decomposition stage is changed to the frame shift Ss = Sa*a of the synthesis stage. Specifically, the position of the first frame from the decomposition stage is kept unchanged and each subsequent frame is moved so that the frame shift changes from Sa to Ss, which yields the preliminary synthesized frames.
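The decomposition and synthesis stages can be sketched as a plain overlap-add routine. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names, the Hann-style window, and the weight normalisation are additions made so that overlapping frames sum cleanly, and the sketch leaves out any waveform alignment between frames, which the source does not describe.

```python
import math


def split_frames(samples, frame_len, sa):
    """Decomposition stage: cut the signal into frames of length N spaced Sa apart."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, sa)]


def overlap_add(frames, frame_len, ss):
    """Synthesis stage: keep frame 0 in place and re-space later frames by Ss."""
    if not frames:
        return []
    out_len = (len(frames) - 1) * ss + frame_len
    out = [0.0] * out_len
    weight = [0.0] * out_len
    # Assumed window: Hann with a half-sample offset, so no weight is exactly zero.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * (n + 0.5) / frame_len)
              for n in range(frame_len)]
    for k, frame in enumerate(frames):
        start = k * ss
        for n, x in enumerate(frame):
            out[start + n] += window[n] * x
            weight[start + n] += window[n]
    # Normalise by the summed window weights; uncovered gaps (Ss > N) become silence.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, weight)]


def change_speed(samples, frame_len, sa, a):
    """Apply the speed-change factor a = Ss/Sa; a < 1 shortens playback."""
    ss = max(1, round(sa * a))
    return overlap_add(split_frames(samples, frame_len, sa), frame_len, ss)
```

With a = 0.5 the synthesis frame shift is half the decomposition frame shift, so the output is roughly half as long as the input (slightly longer in practice, because the final frame keeps its full length N).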
In an embodiment, in step S1, the step of acquiring the first PCM voice data received by the receiving end during audio device switching, applying approximation speech processing to it, and storing it in a buffer includes:
S11. During the voice call, the proximity sensor detects that the mobile terminal has moved and that audio device switching is occurring, for example switching between the earpiece and the speaker, and a first timestamp T1, at which the mobile terminal is about to leave the user's ear, is recorded;
S12. Record a second timestamp T2 at which the touch screen of the mobile terminal is tapped to switch the audio device, for example the moment the touch screen is tapped to switch from earpiece mode to speaker mode;
S13. Acquire the first PCM voice data received by the receiving end within the audio device switching time T, where the audio device switching time T is the difference between the second timestamp T2 and the first timestamp T1, that is, T = T2 - T1. The audio device switching time T is also the playback time of the first PCM voice data buffered during that period;
S14. Send the acquired first PCM voice data to the buffer and store it there;
S15. Apply approximation speech processing to the first PCM voice data to obtain the 1st third PCM voice data.
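Steps S11 to S14 amount to cutting the interval [T1, T2] out of the downlink stream. The sketch below shows that bookkeeping under simplifying assumptions that are not in the source: the downlink is modelled as an in-memory list of samples indexed from the start of the call, timestamps are seconds since the start of the call, and the names `switching_time` and `capture_first_pcm` are hypothetical.

```python
def switching_time(t1, t2):
    """S13: audio device switching time T = T2 - T1, in seconds."""
    if t2 < t1:
        raise ValueError("T2 must not precede T1")
    return t2 - t1


def capture_first_pcm(downlink, sample_rate, t1, t2):
    """S13/S14: copy the samples received during [T1, T2) into a buffer.

    `downlink` is assumed to hold every received sample since the call began.
    """
    start = int(round(t1 * sample_rate))
    end = int(round(t2 * sample_rate))
    return list(downlink[start:end])


# Example: an 8 kHz stream, terminal leaves the ear at 3 s, screen tapped at 5 s.
RATE = 8000
stream = list(range(10 * RATE))
T = switching_time(3.0, 5.0)
first_pcm = capture_first_pcm(stream, RATE, 3.0, 5.0)
```

The buffered data then lasts `len(first_pcm) / RATE` seconds, which equals T, matching the statement in S13 that T is also the playback time of the buffered first PCM voice data.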
In an embodiment, in step S2, the step of acquiring the second PCM voice data that is received by the receiving end after the audio device switching succeeds and has the same duration as the first PCM voice data, and applying approximation speech processing to it so that the first PCM voice data and the second PCM voice data finish playing within the switching time, includes:
S21. Play the N-th third PCM voice data;
S22. During that playback time, acquire the N-th second PCM voice data received by the receiving end after the audio device switching succeeds;
S23. Apply approximation speech processing to the N-th second PCM voice data to obtain the (N+1)-th third PCM voice data;
Steps S21 to S23 are executed in a loop until the approximated playback time is less than the preset playback time. N then equals the number of playbacks performed by the time the approximated playback time falls below the preset playback time, and the first PCM voice data and the N rounds of second PCM voice data have finished playing within the switching time, where N is an integer greater than or equal to 1.
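The S21 to S23 loop can be sketched with the playback, reception, and speech processing steps injected as callables. Everything named here is hypothetical: `run_approximation` and its parameters are illustrative, the `compress` stand-in simply drops every other sample (it is a placeholder, not the rate-changing speech processing described above), and leaving the final sub-threshold chunk unplayed is one reading of the stopping rule.

```python
from itertools import islice


def run_approximation(first_pcm, receive, compress, play, sample_rate, preset_s):
    """Loop S21-S23 until the pending chunk is shorter than the preset playback time."""
    third = compress(first_pcm)            # S15: 1st third PCM voice data
    rounds = 0
    while len(third) / sample_rate >= preset_s:
        rounds += 1
        play(third)                        # S21: play the N-th third PCM data
        second = receive(len(third))       # S22: data that arrived during playback
        third = compress(second)           # S23: (N+1)-th third PCM data
    return rounds, third                   # leftover chunk is below the threshold


# Demo with a 2-second switch at 8 kHz and a 50 ms threshold.
RATE = 8000
downlink = iter(range(1_000_000))          # stand-in for the live downlink
played = []
rounds, leftover = run_approximation(
    first_pcm=[0] * (2 * RATE),
    receive=lambda k: list(islice(downlink, k)),
    compress=lambda chunk: chunk[::2],     # placeholder: halves the duration
    play=played.append,
    sample_rate=RATE,
    preset_s=0.05,
)
```

In this demo the played chunks last 1 s, 0.5 s, 0.25 s, 0.125 s, and 0.0625 s, and the loop stops when the next chunk would last about 31 ms.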
Please refer to FIG. 4. FIG. 4 is a schematic diagram of a method, according to an embodiment of the present application, that uses the approximation method combined with buffering and speech processing to switch voice seamlessly during audio device switching.
In FIG. 4, the audio device switching of the mobile terminal (for example, from the earpiece to the speaker) takes place during time T, that is, [T1, T2], and the first PCM voice data is acquired during this interval. The following interval of length T, that is, [T2, T3], is the receiving-end playback time immediately after the switching succeeds, equal in length to the device switching time, and the second PCM voice data is acquired during this interval.
The process of approximating the playback time using the approximation method combined with buffering and speech processing is as follows:
Within time T [T1, T2]: the first PCM voice data received during [T1, T2] is buffered and passed through the two stages of the speech processing described above, speech decomposition and speech synthesis. After buffering and speech processing, the PCM voice data whose original playback time is T [T1, T2] (the first PCM voice data) becomes PCM voice data PCM-T/2 with playback time Tx = T/2. PCM-T/2 is the 1st third PCM voice data; its playback time is half the original playback time of this round, that is, T/2.
Within time T/2 [T2, T2+T/2]: PCM-T/2 is played (that is, sent to the switched-to audio device for playback). While it is playing, the PCM voice data received during [T2, T2+T/2] (second PCM voice data of duration T/2) is buffered and passed through the speech decomposition and speech synthesis stages. After buffering and speech processing, the PCM voice data whose original playback time is T/2 becomes PCM voice data PCM-T/4 with playback time Tx = T/4. PCM-T/4 is the 2nd third PCM voice data; its playback time is half the original playback time of this round, that is, T/4.
Within time T/4 [T2+T/2, T2+T/2+T/4]: PCM-T/4 is played (that is, sent to the switched-to audio device for playback). While it is playing, the PCM voice data received during [T2+T/2, T2+T/2+T/4] (second PCM voice data of duration T/4) is buffered and passed through the speech decomposition and speech synthesis stages. After buffering and speech processing, the PCM voice data whose original playback time is T/4 becomes PCM voice data PCM-T/8 with playback time Tx = T/8. PCM-T/8 is the 3rd third PCM voice data; its playback time is half the original playback time of this round, that is, T/8.
... The approximation continues in this way and ends when the playback time Tx is less than the preset playback time Tu.
For example, if the audio device switching time is 2 seconds, then in step S1 the first PCM voice data received during the 2 seconds of audio device switching has been acquired and buffered, and approximation speech processing of this 2-second PCM voice data yields the 1st third PCM voice data, with a playback time of 1 second. After the audio device switching succeeds, the 1st second PCM voice data is acquired while the 1st third PCM voice data is played (playback completes in 1 second). The original playback time of the 1st second PCM voice data is 1 second; approximation speech processing turns it into the 2nd third PCM voice data, with a playback time of 0.5 seconds.
The 2nd second PCM voice data is acquired while the 2nd third PCM voice data is played (playback completes in 0.5 seconds). The original playback time of the 2nd second PCM voice data is 0.5 seconds; approximation speech processing turns it into the 3rd third PCM voice data, with a playback time of 0.25 seconds.
The 3rd second PCM voice data is acquired while the 3rd third PCM voice data is played (playback completes in 0.25 seconds). The original playback time of the 3rd second PCM voice data is 0.25 seconds; approximation speech processing turns it into the 4th third PCM voice data, with a playback time of 0.125 seconds.
... The approximation continues in this way and ends when the playback time Tx is less than the preset playback time Tu (for example, 50 milliseconds).
Measured from T2, the first PCM voice data playback completes at T/2, with playback time Tx = T/2; the second completes at T/2 + T/4, with Tx = T/4; the third completes at T/2 + T/4 + T/8, with Tx = T/8; and so on, so that the N-th PCM voice data playback completes at T/2 + T/4 + T/8 + ... + T/2^N, with Tx = T/2^N. The playback times therefore form a geometric sequence with first term T/2 and common ratio 1/2. By the following geometric series summation formula, as N tends to infinity the total PCM voice data playback time tends to T.
$$T_N = \frac{\frac{T}{2}\left(1-\left(\frac{1}{2}\right)^{N}\right)}{1-\frac{1}{2}} = T\left(1-\frac{1}{2^{N}}\right)$$
In the formula, T_N is the total playback time, 1/2 is the common ratio between successive playback times, and N is the number of playbacks. As N approaches infinity, T_N approaches T.
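As a quick check of the summation, the snippet below compares the term-by-term sum T/2 + T/4 + ... + T/2^N with the standard closed form for a geometric series with first term T/2 and ratio 1/2, namely T(1 - 1/2^N). It uses exact rational arithmetic; the function names are illustrative only.

```python
from fractions import Fraction


def summed_playback_time(T, N):
    """Add the playback times T/2, T/4, ..., T/2**N term by term."""
    return sum(Fraction(T) / 2 ** k for k in range(1, N + 1))


def total_playback_time(T, N):
    """Closed form of the same sum: T_N = T * (1 - (1/2)**N)."""
    return Fraction(T) * (1 - Fraction(1, 2) ** N)


# Shortfall T - T_N for a 2-second switch after 1, 2, 3, and 6 playbacks.
T = 2
shortfall = [Fraction(T) - total_playback_time(T, n) for n in (1, 2, 3, 6)]
```

The shortfall T - T_N equals T/2^N, so it halves with every playback, which is why T_N approaches T as N grows.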
With the above approximation method combined with buffering and speech processing, as the number of playbacks tends to infinity, all of the PCM voice data in the 2T interval [T1, T3] can be played within time T without delay or loss, achieving seamless switching.
In actual playback, the approximation stops once the playback time Tx is less than the preset playback time Tu (for example, 50 milliseconds), and the approximation process ends. When Tx is below Tu, the user can hardly perceive that any PCM voice data was not acquired or was lost.
In an embodiment of the present application, the audio device switching time T is constrained to lie between a preset minimum switching time Tmin (for example, 0.5 seconds) and a preset maximum switching time Tmax (for example, 5 seconds), that is, Tmin < T < Tmax. When T is greater than Tmax, the switching takes too long and the amount of PCM voice data to be processed is relatively large, so the user is advised to ask the other party to repeat what was said. When T is less than Tmin, the switching is very short and the user has probably missed almost no PCM voice data during the switching operation, so seamless switching is not needed.
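The range check on T can be sketched as a small decision function. The enum and function names are hypothetical, the default bounds are the example values given above (0.5 s and 5 s), and treating T exactly equal to a bound as in range is an assumption, since the text only specifies Tmin < T < Tmax.

```python
from enum import Enum


class HandoverAction(Enum):
    SKIP = "skip"              # T < Tmin: too little audio missed to bother
    SEAMLESS = "seamless"      # Tmin..Tmax: run the approximation procedure
    ASK_REPEAT = "ask_repeat"  # T > Tmax: suggest asking the other party to repeat


def choose_action(t, t_min=0.5, t_max=5.0):
    """Pick a handling strategy from the audio device switching time T (seconds)."""
    if t < t_min:
        return HandoverAction.SKIP
    if t > t_max:
        return HandoverAction.ASK_REPEAT
    # Assumption: values exactly equal to a bound are treated as in range.
    return HandoverAction.SEAMLESS
```

For instance, `choose_action(2.0)` selects the seamless procedure, while `choose_action(6.0)` falls back to suggesting a repeat.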
With the above method provided by the embodiments of the present application, the first PCM voice data with playback time T (the audio device switching time) and the second PCM voice data with the same playback time T are processed by the approximation, buffering, and speech processing described above into third PCM voice data with a total playback time of T, half the original playback time. As a result, when the audio device is switched during a call, all downlink PCM voice data can be delivered and played in full and the switch is seamless, so the user loses no voice information during the device switching, which improves the user's understanding of the conversation and the user experience.
Please refer to FIG. 5. An embodiment of the present application provides a voice switching apparatus for a voice call, applied to a mobile terminal. The voice switching apparatus 300 includes a detection module 301, an acquisition module 302, a buffer module 303, and a processing module 304, where:
the detection module 301 is configured to detect audio device switching during a voice call;
the acquisition module 302 is configured to acquire the first PCM voice data received by the receiving end during audio device switching and, after the audio device switching succeeds, the second PCM voice data having the same duration as the first PCM voice data;
the buffer module 303 is configured to buffer the first PCM voice data and the PCM voice data produced during approximation speech processing;
the processing module 304 is configured to apply approximation speech processing to the buffered first PCM voice data and to the second PCM voice data acquired after the audio device switching succeeds, so that the first PCM voice data and the second PCM voice data finish playing within the switching time.
In an embodiment, the detection module 301 is a proximity sensor.
It should be noted that the above apparatus embodiment and the method embodiments belong to the same concept. For the specific implementation, see the method embodiments; the technical features of the method embodiments apply correspondingly to the apparatus embodiment and are not repeated here.
The technical solution of the present application is described in further detail below with reference to an application embodiment.
In this application embodiment, seamlessly switching a voice call from the earpiece to the speaker during a call is taken as an example.
Please refer to FIG. 7. The application embodiment of the present application provides a voice switching method for a voice call, applied to a mobile terminal, including:
S701. Detect audio device switching during the voice call. The proximity sensor detects movement of the mobile terminal during the voice call, and if movement of the mobile terminal is detected, it is determined that audio device switching is occurring.
S702. During the voice call, it is detected that the mobile terminal has moved and that audio device switching is occurring, that is, switching between the earpiece and the speaker, and a first timestamp T1, at which the mobile terminal is about to leave the user's ear, is recorded;
S703. Record a second timestamp T2 at which the touch screen of the mobile terminal is tapped to switch the audio device, that is, the moment the touch screen is tapped to switch from earpiece mode to speaker mode;
S704. Acquire the first PCM voice data within the audio device switching time T and store it in the buffer, where the audio device switching time T is the difference between the second timestamp T2 and the first timestamp T1, that is, T = T2 - T1. The audio device switching time T is also the playback time of the first PCM voice data buffered during that period.
S705. Apply approximation speech processing to the first PCM voice data to obtain the 1st third PCM voice data.
S706. Play the 1st third PCM voice data, acquire the 1st second PCM voice data during that playback time, and apply approximation speech processing to the 1st second PCM voice data to obtain the 2nd third PCM voice data.
S707. Play the N-th third PCM voice data, acquire the N-th second PCM voice data during that playback time, and apply approximation speech processing to the N-th second PCM voice data to obtain the (N+1)-th third PCM voice data, where N >= 2.
S708. After the approximation combined with speech processing, compare the approximated playback time with the preset playback time. If the approximated playback time is less than the preset playback time, stop the approximation and go to S709; otherwise, return to S707 and continue approximating the playback time.
S709. The approximation ends.
In addition, an embodiment of the present application further provides a terminal. As shown in FIG. 6, the terminal 900 includes a memory 902, a processor 901, and one or more computer programs stored in the memory 902 and executable on the processor 901. The memory 902 and the processor 901 are coupled together through a bus system 903. When executed by the processor 901, the one or more computer programs implement the following steps of the voice switching method for a voice call provided by the embodiments of the present application:
S1. Acquire the first PCM voice data received by the receiving end during audio device switching, apply approximation speech processing to it, and store it in a buffer;
S2. Acquire second PCM voice data that is received by the receiving end after the audio device switching succeeds and that has the same duration as the first PCM voice data, and apply approximation speech processing to it, so that the first PCM voice data and the second PCM voice data finish playing within the switching time, achieving seamless switching.
That is, when the processor 901 runs the computer program, the steps of the method of the embodiments of the present application are implemented.
The methods disclosed in the above embodiments of the present application may be applied to, or implemented by, the processor 901. The processor 901 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor 901 or by instructions in the form of software. The processor 901 may be a general-purpose processor, a DSP, another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 901 may implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium, and the storage medium is located in the memory 902; the processor 901 reads the information in the memory 902 and completes the steps of the foregoing method in combination with its hardware.
It can be understood that the memory 902 in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory or other memory technology, a compact disc read-only memory (CD-ROM), a digital versatile disc (DVD) or other optical disc storage, a magnetic cassette, magnetic tape, magnetic disk storage, or another magnetic storage device. The volatile memory may be a random access memory (RAM). By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), synchronous static random access memory (SSRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDRSDRAM), enhanced synchronous dynamic random access memory (ESDRAM), SyncLink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DRRAM). The memory described in the embodiments of the present application is intended to include, without being limited to, these and any other suitable types of memory.
It should be noted that the above terminal embodiment and the method embodiments belong to the same concept. For the specific implementation, see the method embodiments; the technical features of the method embodiments apply correspondingly to the terminal embodiment and are not repeated here.
In addition, in an exemplary embodiment, an embodiment of the present application further provides a computer storage medium, specifically a computer-readable storage medium, for example the memory 902 storing a computer program. One or more programs of the voice switching method for a voice call are stored on the computer storage medium, and when executed by the processor 901, the one or more programs implement the following steps of the voice switching method for a voice call provided by the embodiments of the present application:
S1. Acquire the first PCM voice data received by the receiving end during audio device switching, apply approximation speech processing to it, and store it in a buffer;
S2. Acquire second PCM voice data that is received by the receiving end after the audio device switching succeeds and that has the same duration as the first PCM voice data, and apply approximation speech processing to it, so that the first PCM voice data and the second PCM voice data finish playing within the switching time, achieving seamless switching.
That is, when the one or more programs of the voice switching method for a voice call are executed by the processor 901, the method provided by the embodiments of the present application is implemented.
It should be noted that the embodiment of the program of the voice switching method for a voice call on the above computer-readable storage medium and the method embodiments belong to the same concept. For the specific implementation, see the method embodiments; the technical features of the method embodiments apply correspondingly to the embodiment of the computer-readable storage medium and are not repeated here.
The present application provides a voice switching method, apparatus, terminal, and computer-readable storage medium for a voice call. The first PCM voice data received by the receiving end during the audio device switching process is acquired, subjected to approximation speech processing, and stored in a buffer. The second PCM voice data, of the same duration as the first PCM voice data, received by the receiving end after the audio device has been switched successfully is then acquired and subjected to approximation speech processing, so that playback of the first PCM voice data and the second PCM voice data completes within the switching time, achieving seamless switching. With this approximation speech processing, all downlink PCM voice data can be delivered and played in full when the audio device is switched during a call. The switch is seamless, the user loses no voice information during the switch, the speech is easier to follow, and the user experience is improved.
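The catch-up behaviour summarized above can be illustrated with a small sketch. It assumes, as claim 4 below suggests, that each round plays a time-compressed version of the backlog while new downlink audio accumulates, and that the loop stops once a round's playback time falls below a preset threshold. The function name `catch_up_schedule`, the constant compression factor, and the example values are illustrative assumptions and are not taken from the application.

```python
def catch_up_schedule(switch_time, factor, min_play_time):
    """Return the playback duration (seconds) of each catch-up round.

    switch_time   -- duration T of audio buffered during the device switch
    factor        -- assumed constant time-compression factor, 0 < factor < 1
    min_play_time -- preset threshold; the loop ends once a round is shorter
    """
    if not 0 < factor < 1:
        raise ValueError("factor must be between 0 and 1 (exclusive)")
    if switch_time <= 0 or min_play_time <= 0:
        raise ValueError("durations must be positive")

    durations = []
    backlog = switch_time  # audio waiting to be played, in seconds
    while True:
        play_time = backlog * factor  # compressed playback of the backlog
        durations.append(play_time)
        if play_time < min_play_time:
            break
        # While this round was playing, the same amount of new audio arrived.
        backlog = play_time
    return durations


if __name__ == "__main__":
    print(catch_up_schedule(2.0, 0.5, 0.2))
```

With a 2-second switch, a factor of 0.5, and a 0.2-second threshold, the rounds last 1.0, 0.5, 0.25, and 0.125 seconds, so the backlog shrinks geometrically until it is negligible.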
It should be noted that, in this document, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and can also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described above, which are illustrative rather than restrictive. In light of the present application, those of ordinary skill in the art can devise many other forms without departing from the purpose of the present application and the scope protected by the claims, and all of these fall within the protection of the present application.

Claims (10)

  1. A voice switching method, comprising:
    acquiring first PCM voice data received by a receiving end during an audio device switching process, applying approximation speech processing to the first PCM voice data, and storing it in a buffer;
    acquiring second PCM voice data, of the same duration as the first PCM voice data, received by the receiving end after the audio device has been switched successfully, and applying approximation speech processing to the second PCM voice data, so that playback of the first PCM voice data and the second PCM voice data completes within the switching time.
  2. The method according to claim 1, wherein, before the step of acquiring the first PCM voice data received by the receiving end during the audio device switching process, applying approximation speech processing to it, and storing it in the buffer, the method further comprises: detecting an audio device switch during a voice call.
  3. The method according to claim 1, wherein the step of acquiring the first PCM voice data received by the receiving end during the audio device switching process, applying approximation speech processing to it, and storing it in the buffer comprises:
    when an audio device switch occurs, recording a first timestamp T1 at which the mobile terminal is about to leave the user's ear;
    recording a second timestamp T2 at which the touch screen of the mobile terminal is tapped to switch the audio device;
    acquiring the first PCM voice data within an audio device switching time T and storing it in the buffer, wherein the audio device switching time T = T2 - T1;
    applying approximation speech processing to the first PCM voice data to obtain 1st third PCM voice data.
  4. The method according to claim 3, wherein the step of acquiring the second PCM voice data, of the same duration as the first PCM voice data, received by the receiving end after the audio device has been switched successfully, and applying approximation speech processing to it, so that playback of the first PCM voice data and the second PCM voice data completes within the switching time, comprises:
    playing the N-th third PCM voice data;
    acquiring, within the playback time, the N-th second PCM voice data received by the receiving end after the audio device has been switched successfully;
    applying approximation speech processing to the N-th second PCM voice data to obtain the (N+1)-th third PCM voice data;
    repeating the above steps until N equals the number of playbacks at which the approximated playback time is less than a preset playback time, wherein N is an integer greater than or equal to 1.
  5. The method according to any one of claims 1 to 4, wherein the approximation speech processing is: approximating the playback time of PCM voice data by using an approximation method in combination with speech processing, to obtain approximated PCM voice data.
  6. The method according to claim 5, wherein the speech processing is divided into two stages, speech decomposition and speech synthesis, wherein:
    in the speech decomposition stage, the original PCM voice data is divided into frames, and the resulting frames are used for speech synthesis processing; the frame length is N and the frame shift is Sa;
    in the speech synthesis stage, the position of the first frame from the speech decomposition stage is kept unchanged and the subsequent frames are moved; according to a speed factor a = Ss/Sa, the frame shift Sa of the speech decomposition stage is changed to the frame shift Ss = Sa * a of the speech synthesis stage.
  7. A voice switching apparatus, comprising an acquisition module, a buffer module, and a processing module, wherein:
    the acquisition module is configured to acquire first PCM voice data received by a receiving end during an audio device switching process, and second PCM voice data of the same duration as the first PCM voice data after the audio device has been switched successfully;
    the buffer module is configured to buffer the first PCM voice data and the PCM voice data produced during the approximation speech processing;
    the processing module is configured to apply approximation speech processing to the first PCM voice data and the second PCM voice data, so that playback of the first PCM voice data and the second PCM voice data completes within the switching time.
  8. The apparatus according to claim 7, wherein the apparatus further comprises a detection module configured to detect an audio device switch during a voice call.
  9. A terminal, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the voice switching method according to any one of claims 1 to 6.
  10. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the voice switching method according to any one of claims 1 to 6.
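
The frame-shift scheme of claim 6 resembles overlap-add time-scale modification, and a minimal sketch under that reading follows. Frames of length N are taken every Sa samples and re-laid every Ss = Sa * a samples, with the first frame kept in place. The Hann window, the per-sample normalization, and the name `time_scale_ola` are assumptions added for illustration; the application does not specify windowing or frame alignment.

```python
import math


def time_scale_ola(samples, frame_len, analysis_shift, factor):
    """Overlap-add time scaling: analysis shift Sa becomes synthesis shift Ss = Sa * a."""
    if frame_len <= 0 or analysis_shift <= 0 or factor <= 0:
        raise ValueError("frame_len, analysis_shift and factor must be positive")
    if len(samples) < frame_len:
        return list(samples)  # too short to frame; return unchanged

    synthesis_shift = max(1, round(analysis_shift * factor))  # Ss = Sa * a

    # Decomposition stage: frame start positions, Sa samples apart.
    starts = list(range(0, len(samples) - frame_len + 1, analysis_shift))

    out_len = (len(starts) - 1) * synthesis_shift + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len

    # Assumed Hann window (offset by half a sample so it is never exactly zero).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * (i + 0.5) / frame_len)
              for i in range(frame_len)]

    # Synthesis stage: frame 0 stays at position 0, frame k moves to k * Ss.
    for k, start in enumerate(starts):
        pos = k * synthesis_shift
        for i in range(frame_len):
            out[pos + i] += samples[start + i] * window[i]
            norm[pos + i] += window[i]

    # Normalize by the summed window weight; gaps (if Ss > N) stay silent.
    return [o / n if n > 1e-12 else 0.0 for o, n in zip(out, norm)]
```

With a = 0.5 the output is roughly half as long as the input, which is the kind of shortened playback time the approximation step relies on. Plain overlap-add can introduce audible artifacts on real speech, so a practical implementation would likely add waveform alignment between frames.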
PCT/CN2019/089623 2018-07-10 2019-05-31 Voice handover method, apparatus, terminal, and computer-readable storage medium WO2020010963A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810752172.2 2018-07-10
CN201810752172.2A CN110708417B (en) 2018-07-10 2018-07-10 Voice switching method, device, terminal and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2020010963A1 WO2020010963A1 (en) 2020-01-16

Family

ID=69142181

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089623 WO2020010963A1 (en) 2018-07-10 2019-05-31 Voice handover method, apparatus, terminal, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN110708417B (en)
WO (1) WO2020010963A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113434309A (en) * 2021-06-23 2021-09-24 东风汽车有限公司东风日产乘用车公司 Message broadcasting method, device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009224911A (en) * 2008-03-13 2009-10-01 Onkyo Corp Headphone
CN101854425A (en) * 2009-04-02 2010-10-06 深圳富泰宏精密工业有限公司 Mobile device and sound mode switching method thereof
CN103811033A (en) * 2012-11-14 2014-05-21 北京新媒传信科技有限公司 Method and device for controlling voice playing modes
CN103984518A (en) * 2014-04-18 2014-08-13 青岛尚慧信息技术有限公司 Intelligent mobile terminal
CN103985394A (en) * 2014-04-18 2014-08-13 青岛尚慧信息技术有限公司 Audio file playing method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10079030B2 (en) * 2016-08-09 2018-09-18 Qualcomm Incorporated System and method to provide an alert using microphone activation
CN107577447A (en) * 2017-08-24 2018-01-12 联想(北京)有限公司 Control the method and apparatus for media playing of media data output
CN107872584A (en) * 2017-11-24 2018-04-03 维沃移动通信有限公司 A kind of multi-media processing method, multimedia equipment and terminal

Also Published As

Publication number Publication date
CN110708417B (en) 2021-02-23
CN110708417A (en) 2020-01-17

Similar Documents

Publication Publication Date Title
CN106982286B (en) Recording method, recording equipment and computer readable storage medium
CN108234876B (en) Tracking focusing method, terminal and computer readable storage medium
CN107493497A (en) A kind of video broadcasting method, terminal and computer-readable recording medium
CN109618052A (en) A kind of call audio switching method and device, mobile terminal and readable storage medium storing program for executing
CN112004174B (en) Noise reduction control method, device and computer readable storage medium
CN108076204A (en) The method and terminal of a kind of call treatment
WO2021190545A1 (en) Call processing method and electronic device
WO2021129835A1 (en) Volume control method and device, and computer-readable storage medium
CN109743764A (en) A kind of identification elevator scene method and device, mobile terminal device and storage medium
CN107707728A (en) A kind of audio transmission switching method, terminal and computer-readable recording medium
WO2021129818A1 (en) Video playback method and electronic device
CN107463255A (en) A kind of video broadcasting method, terminal and computer-readable recording medium
CN109660973A (en) Bluetooth control method, mobile terminal and storage medium
CN109711830B (en) Quick display method and device for two-dimension code payment, mobile terminal and storage medium
WO2021098708A1 (en) Calling method, and terminal apparatus
WO2020010963A1 (en) Voice handover method, apparatus, terminal, and computer-readable storage medium
WO2022105874A1 (en) Image display method, terminal device, and storage medium
CN112887776B (en) Method, equipment and computer readable storage medium for reducing audio delay
CN109067976A (en) desktop automatic switching method, mobile terminal and computer readable storage medium
CN109739641B (en) Self-adaptive CPU frequency modulation acceleration method and device, mobile terminal and storage medium
CN110087013B (en) Video chat method, mobile terminal and computer readable storage medium
CN109739642B (en) CPU frequency modulation method and device, mobile terminal and computer readable storage medium
CN107566745B (en) Shooting method, terminal and computer readable storage medium
CN109862163A (en) A kind of screen sound-emanating areas optimization method and device, mobile terminal and storage medium
CN110198378A (en) It is a kind of using the method and device of data buffer storage, mobile terminal and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19835212

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19835212

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04/02/2022)