WO2020073536A1 - Voice switching method, electronic device and system - Google Patents

Voice switching method, electronic device and system

Info

Publication number
WO2020073536A1
Authority
WO
WIPO (PCT)
Prior art keywords
electronic device
voip
server
voice
mobile phone
Prior art date
Application number
PCT/CN2018/125853
Other languages
English (en)
French (fr)
Inventor
蔡双林
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to CN201880098492.9A, patent CN112806067B
Priority to US17/281,514, patent US11838823B2
Priority to CN202211073008.1A, patent CN115665100A
Publication of WO2020073536A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W36/00 Hand-off or reselection arrangements
    • H04W36/34 Reselection control
    • H04W36/36 Reselection control by user or terminal equipment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1083 In-session procedures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/10 Architectures or entities
    • H04L65/1059 End-user terminal functionalities specially adapted for real-time communication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1083 In-session procedures
    • H04L65/1094 Inter-user-equipment sessions transfer or sharing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1101 Session protocols
    • H04L65/1104 Session initiation protocol [SIP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/70 Media network packetisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/42229 Personal communication services, i.e. services related to one subscriber independent of his terminal and/or location
    • H04M3/42263 Personal communication services, i.e. services related to one subscriber independent of his terminal and/or location where the same subscriber uses different terminals, i.e. nomadism
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/58 Arrangements for transferring received calls from one subscriber to another; Arrangements affording interim conversations between either the calling or the called party and a third party
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W36/00 Hand-off or reselection arrangements
    • H04W36/34 Reselection control
    • H04W36/36 Reselection control by user or terminal equipment
    • H04W36/362 Conditional handover
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2250/00 Details of telephonic subscriber devices
    • H04M2250/02 Details of telephonic subscriber devices including a Bluetooth interface

Definitions

  • The present application relates to the field of mobile communications, and in particular to a voice switching method, an electronic device, and a system.
  • a user or family often has multiple electronic devices that can communicate with each other.
  • Users can use the same account to log in to a mobile phone, a home smart speaker, a smart TV, and other electronic devices, so that multiple electronic devices under the same account form a local area network (LAN), and each electronic device in the LAN can communicate with the others through a server.
  • For example, the speaker can be used to obtain audio resources from the server and play them.
  • Likewise, the mobile phone can be used to obtain audio resources from the server and play them.
  • The purpose of the present application is to provide a voice switching method, electronic device, and system, which can improve the efficiency of switching voice services (such as VoIP calls or audio playback) between electronic devices and improve the user experience.
  • In a first aspect, a voice switching method may include: a second electronic device (for example, a smart speaker) detects a user's voice input; in response to the voice input, the second electronic device establishes a VoIP call with a third electronic device through a VoIP server; a first electronic device (for example, a mobile phone) sends first switching request information to the VoIP server, where the first switching request information is used to request the VoIP server to switch the VoIP call being conducted on the second electronic device to the first electronic device, and the first switching request information includes a first account (for example, HUAWEI-01) used to log in to a device management server; the VoIP server receives the first switching request information; in response to the first switching request information, the VoIP server determines that the source device of the VoIP service corresponding to the first account is the second electronic device; and the VoIP server switches the VoIP call being conducted on the second electronic device to the first electronic device.
  • In this scenario, the source device of the VoIP service is the smart speaker, and the target device of the VoIP service is the mobile phone.
  • the mobile phone may send the first switching request information to the VoIP server, so that the VoIP server seamlessly switches the VoIP service being executed on the smart speaker under the same account to the mobile phone.
  • In this way, the VoIP service is not interrupted during the switching process, and the user does not need to operate repeatedly across multiple devices, thereby improving both the efficiency of voice switching between multiple devices and the user experience.
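  • As an illustration only, the following Python sketch models the data carried in the first switching request information and the server-side handling summarized above; all names (SwitchRequest, VoipServer, devices_for_account, add_endpoint, and so on) are hypothetical and do not come from the application.

```python
from dataclasses import dataclass


@dataclass
class SwitchRequest:
    """First switching request sent by the first electronic device (e.g., a mobile phone)."""
    account: str              # first account used to log in to the device management server, e.g. "HUAWEI-01"
    requester_device_id: str  # device identification of the requesting (target) device


class VoipServer:
    def __init__(self, device_mgmt_server, active_calls):
        self.device_mgmt = device_mgmt_server  # knows which devices are logged in under each account
        self.active_calls = active_calls       # maps device_id -> ongoing VoIP call

    def handle_switch_request(self, req: SwitchRequest):
        # 1. Ask the device management server for all devices logged in with the first account.
        device_ids = self.device_mgmt.devices_for_account(req.account)
        # 2. Determine the source device: a device under this account with an ongoing VoIP call.
        sources = [d for d in device_ids
                   if d in self.active_calls and d != req.requester_device_id]
        if not sources:
            return None
        source_id = sources[0]
        # 3. Switch the ongoing call from the source device to the requesting (first) device.
        call = self.active_calls.pop(source_id)
        call.add_endpoint(req.requester_device_id)  # first device joins the call
        call.remove_endpoint(source_id)             # source device's leg is interrupted afterwards
        self.active_calls[req.requester_device_id] = call
        return call
```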
  • The method may further include: both the first electronic device and the second electronic device log in to the device management server using the first account. This indicates that the two electronic devices belong to the same user and that both use the services of the same cloud service provider.
  • The first electronic device sending the first switching request information to the VoIP server may specifically include: when the first electronic device detects a specific operation of the user, the first electronic device sends the first switching request information to the VoIP server in response to the specific operation.
  • The above specific operation is one of the following operations: flipping the phone, tapping the screen with a knuckle, double-clicking the power button, a preset voice input, or a preset slide gesture.
  • Alternatively, the first electronic device sending the first switching request information to the VoIP server may specifically include: when the first electronic device detects a specific condition, the first electronic device sends the first switching request information to the VoIP server in response to the specific condition.
  • The above specific condition may relate to the Wi-Fi signal strength in the WLAN or the Bluetooth signal strength. For example, when the first electronic device detects that the Wi-Fi signal strength is lower than a preset threshold, the first electronic device sends the first switching request information to the VoIP server; or, when the first electronic device detects that the Bluetooth signal strength of the second electronic device is lower than a preset threshold, the first electronic device sends the first switching request information to the VoIP server.
  • In other words, the first electronic device can automatically trigger the sending of the first switching request information according to a detected specific condition. This reduces user involvement and implements a more intelligent voice switching method, which further improves voice switching efficiency.
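  • A minimal sketch of such a condition-based trigger is given below, assuming the device exposes Wi-Fi and Bluetooth signal-strength readings; the threshold values and function names are placeholders, not values from the application.

```python
WIFI_RSSI_THRESHOLD_DBM = -70  # hypothetical preset threshold
BT_RSSI_THRESHOLD_DBM = -80    # hypothetical preset threshold


def maybe_request_switch(phone, voip_server, account, second_device_id):
    """Automatically send the first switching request when a preset condition is met.

    `phone.wifi_rssi()` and `phone.bt_rssi_to()` stand in for whatever
    signal-strength readings the device actually exposes.
    """
    if phone.wifi_rssi() < WIFI_RSSI_THRESHOLD_DBM:
        # Wi-Fi signal strength has dropped below the preset threshold.
        voip_server.send_switch_request(account=account, device_id=phone.device_id)
    elif phone.bt_rssi_to(second_device_id) < BT_RSSI_THRESHOLD_DBM:
        # Bluetooth signal strength of the second electronic device is below the threshold.
        voip_server.send_switch_request(account=account, device_id=phone.device_id)
```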
  • The above method may further include: the first electronic device sends a response message indicating that it has successfully joined the VoIP call to the VoIP server; after receiving the response message, the VoIP server interrupts the VoIP service on the second electronic device.
  • After the first electronic device (such as a mobile phone) joins the VoIP call, there is a three-party VoIP call between the first electronic device, the second electronic device, and the third electronic device. Since the call is to continue on the first electronic device, interrupting the VoIP service on the second electronic device saves network resources.
  • The VoIP server determining that the source device of the VoIP service corresponding to the first account is the second electronic device may specifically include: the VoIP server sends the first account to the device management server; the device management server determines, according to the first account, at least one electronic device logged in using the first account; the device management server sends the device identification of the at least one electronic device to the VoIP server; and the VoIP server determines, according to the device identification, that the source device conducting the VoIP call under the first account is the second electronic device.
  • Determining that the source device conducting the VoIP call under the first account is the second electronic device may specifically include: when the VoIP server determines, according to the device identifications, that at least two electronic devices under the first account are conducting VoIP calls, the VoIP server sends the device identifications of the at least two electronic devices to the first electronic device; the first electronic device displays at least two options, which represent the at least two electronic devices; the first electronic device detects the user's selection of one of the options, where the selected option represents the second electronic device; in response to the selection, the first electronic device sends the device identification of the second electronic device to the VoIP server; and the VoIP server determines, according to the received device identification of the second electronic device, that the source device conducting the VoIP call under the first account is the second electronic device.
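  • The following sketch illustrates this source-device resolution, including the case where the user must choose among at least two candidate devices; the helper names (devices_for_account, has_active_call, show_options_and_wait) are assumptions for illustration.

```python
def resolve_source_device(voip_server, device_mgmt, first_device, account):
    """Determine which device under `account` currently carries the VoIP call.

    If more than one candidate is found, the first electronic device shows the
    candidates to the user and reports the selected one back to the VoIP server.
    """
    device_ids = device_mgmt.devices_for_account(account)
    candidates = [d for d in device_ids if voip_server.has_active_call(d)]

    if len(candidates) == 1:
        return candidates[0]

    # At least two devices under this account are in a call: let the user pick.
    chosen = first_device.show_options_and_wait(candidates)  # displays at least two options
    first_device.send_device_id(voip_server, chosen)         # reports the second electronic device
    return chosen
```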
  • In a second aspect, a voice switching system includes a first electronic device, a second electronic device, a device management server, and a VoIP server. The second electronic device is configured to detect a user's voice input and, in response, establish a VoIP call with a third electronic device through the VoIP server. The first electronic device is configured to send first switching request information to the VoIP server, where the first switching request information is used to request the VoIP server to switch the VoIP call being conducted on the second electronic device to the first electronic device, and the first switching request information includes a first account used to log in to the device management server. The VoIP server is configured to receive the first switching request information and determine that the source device of the VoIP service corresponding to the first account is the second electronic device. The VoIP server is further configured to switch the VoIP call being conducted on the second electronic device to the first electronic device.
  • both the first electronic device and the second electronic device log in to the device management server using the first account.
  • The first electronic device is further configured to send the first switching request information to the VoIP server when a specific operation of the user is detected, where the specific operation is one of the following operations: flipping the phone, tapping the screen with a knuckle, double-clicking the power button, a preset voice input, or a preset slide gesture.
  • The first electronic device sending the first switching request information to the VoIP server specifically includes: when the first electronic device detects a specific condition, the first electronic device sends the first switching request information to the VoIP server.
  • The specific condition relates to the Wi-Fi signal strength in the WLAN or the Bluetooth signal strength: when the first electronic device detects that the Wi-Fi signal strength is lower than a preset threshold, the first electronic device sends the first switching request information to the VoIP server; or, when the first electronic device detects that the Bluetooth signal strength of the second electronic device is lower than a preset threshold, the first electronic device sends the first switching request information to the VoIP server.
  • The first electronic device is further configured to send a response message indicating that it has successfully joined the VoIP call to the VoIP server; the VoIP server is further configured to interrupt the VoIP service on the second electronic device after receiving the response message.
  • The VoIP server determining that the source device of the VoIP service corresponding to the first account is the second electronic device specifically includes: the VoIP server sends the first account to the device management server; the device management server determines, according to the first account, at least one electronic device logged in using the first account; the device management server sends the device identification of the at least one electronic device to the VoIP server; and the VoIP server determines, according to the device identification, that the source device conducting the VoIP call under the first account is the second electronic device.
  • The VoIP server is further configured to, when it determines according to the device identifications that at least two electronic devices under the first account are conducting VoIP calls, send those device identifications to the first electronic device. The first electronic device is further configured to display at least two options representing the at least two electronic devices, and to detect the user's selection of one of the options, where the selected option represents the second electronic device. The first electronic device is further configured to send the device identification of the second electronic device to the VoIP server, and the VoIP server is further configured to determine, according to the device identification of the second electronic device, that the source device conducting the VoIP call under the first account is the second electronic device.
  • For example, the first electronic device is a mobile phone, and the second electronic device is a smart speaker configured with a voice assistant system.
  • An electronic device for voice switching has functions to implement the behavior of the first electronic device in the above method.
  • The functions may be implemented by hardware, or by hardware executing corresponding software.
  • The hardware or software may include one or more modules corresponding to the above functions.
  • An electronic device for voice switching has functions to implement the behavior of the second electronic device in the above method.
  • The functions may be implemented by hardware, or by hardware executing corresponding software.
  • The hardware or software may include one or more modules corresponding to the above functions.
  • FIG. 1 is a schematic diagram of an implementation scenario of a voice switching system in an embodiment
  • FIG. 2 is a schematic structural diagram of a first electronic device (such as a mobile phone) in an embodiment
  • FIG. 3 is a schematic structural diagram of a second electronic device (such as a smart speaker) in an embodiment
  • FIG. 4 is a schematic flowchart of a voice switching method provided in an embodiment
  • FIG. 5 is a schematic diagram of a user interface of the electronic device 101 in an embodiment
  • FIG. 6 is a schematic flowchart of a voice switching method provided in another embodiment
  • FIG. 7 is a schematic diagram of a user interface of the electronic device 102 in another embodiment
  • FIG. 8 is a schematic flowchart of a voice switching method provided in another embodiment
  • FIG. 9 is a schematic structural diagram of a voice switching system in an embodiment.
  • The term "when" may be interpreted to mean "if", "after", "in response to determining", or "in response to detecting".
  • Similarly, the phrase "when it is determined" or "if (the stated condition or event) is detected" may be interpreted to mean "if it is determined", "in response to determining", "upon detecting (the stated condition or event)", or "in response to detecting (the stated condition or event)".
  • Although the terms "first electronic device", "second electronic device", etc. may be used herein to describe various electronic devices, these electronic devices should not be limited by these terms. These terms are only used to distinguish one electronic device from another.
  • the first electronic device may be named the second electronic device, and similarly, the second electronic device may also be named the first electronic device without departing from the scope of the present application.
  • Both the first electronic device and the second electronic device are electronic devices, but they may not be the same electronic device, or may be the same electronic device in some scenarios.
  • The electronic device may be a portable electronic device that also contains other functions, such as personal digital assistant and/or music player functions, for example a mobile phone, a tablet computer, or a wearable electronic device with a wireless communication function (such as a smart watch), etc.
  • Exemplary embodiments of portable electronic devices include, but are not limited to, portable electronic devices running various operating systems.
  • the above portable electronic device may also be other portable electronic devices, such as a laptop computer with a touch panel or a touch-sensitive surface. It should also be understood that, in some other embodiments, the electronic device may not be a portable electronic device, but a desktop computer.
  • The voice switching system 100 may include one or more electronic devices, such as a first electronic device (such as the electronic device 101 in FIG. 1) and a second electronic device (such as the electronic device 102 in FIG. 1).
  • the specific structure of the first electronic device will be described in detail in conjunction with FIG. 2 in subsequent embodiments; the specific structure of the second electronic device will be described in detail in conjunction with FIG. 3 in subsequent embodiments.
  • the electronic device 101 may be connected to the electronic device 102 through one or more networks 109 (eg, wired or wireless).
  • the one or more communication networks 109 may be a local area network or a wide area network (WAN) such as the Internet.
  • The one or more communication networks 109 may be implemented using any known network communication protocol, which may be a wired or wireless communication protocol, such as Ethernet, universal serial bus (USB), FireWire, global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), Bluetooth, wireless fidelity (Wi-Fi), voice over Internet protocol (VoIP), or any other suitable communication protocol.
  • the voice switching system 100 described above may further include a device management server 103, which is used to manage at least one registered electronic device (such as the electronic device 101 and the electronic device 102).
  • The device management server 103 can authenticate the electronic device (for example, verify whether the account and password match); after the authentication is passed, the device management server 103 may allow the electronic device 101 to access the storage space and other data corresponding to it on the device management server 103.
  • The device management server 103 configures a storage space for the electronic devices (101, 102), so that the electronic devices (101, 102) can send data (such as pictures and videos) stored in their memories to the device management server 103 through the network 109; the device management server 103 then saves the received data in the storage space configured for the electronic devices (101, 102).
  • the device management server 103 can also configure the electronic devices (101, 102) through the network 109.
  • the account number may refer to the credential used by the electronic device to log in to the device management server 103.
  • The electronic devices (101, 102) need to use an account to log in to the device management server 103 in order to use some functions. For example, an electronic device needs to be logged in to the account to use fingerprint recognition, contact synchronization, phone retrieval, and other functions; without logging in to the account, the above functions cannot be used.
  • The verification information may be sent to the device management server 103 through the network 109 for verification. It can be understood that, because the device management server is mainly used for authenticating the accounts of electronic devices, it can know which electronic devices have logged in to the same account.
  • the electronic device 101 and the electronic device 102 may be two different electronic devices belonging to the same user 108.
  • For example, the user Thomas has a smartphone and a voice assistant device (such as a smart speaker). The voice assistant device is equipped with a voice assistant system (described in detail in the following embodiments), and can receive the user's voice input, parse the voice input, and perform other functions.
  • Both electronic devices can access the device management server 103 using an account owned by Thomas (for example, HUAWEI-01).
  • the device management server 103 can manage the access rights between each account and the electronic devices accessed using the account; in addition, the same account can also be simultaneously logged on two or more electronic devices managed by the device management server 103.
  • The user 108 can log in to the device management server 103 using the above account through other electronic devices and adjust the access rights stored on it, such as removing the electronic device 101's permission to log in with the account HUAWEI-01, or adding another electronic device that logs in using the above account, etc.
  • Table 1 shows some information related to the registered electronic devices stored in the device management server 103. It can be seen that two electronic devices (with device names mobile phone 101 and smart speaker 102) have logged in to the device management server 103 using the same account (HUAWEI-01). When these two electronic devices log in to the device management server 103, they may carry their respective device identifications (for example, the IMEI in Table 1), or the device management server 103 may request the corresponding device identifications from these electronic devices after they log in, to facilitate subsequent management of these electronic devices. The device identification is used to uniquely identify an electronic device, so that other electronic devices or servers in the network can recognize it.
  • Common device identifications include the international mobile equipment identity (IMEI), international mobile subscriber identity (IMSI), mobile equipment identifier (MEID), serial number (SN), integrated circuit card identity (ICCID), media access control (MAC) address, or any other identifier that uniquely identifies an electronic device.
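  • As a rough illustration of the kind of account-to-device mapping Table 1 describes, the registry below uses placeholder identifier values; it is not data from the application.

```python
# Hypothetical in-memory registry kept by the device management server.
# Keys are login accounts; values map device names to device identifications
# (an IMEI or MAC address is shown here, but an IMSI, MEID, SN, or ICCID would serve equally).
device_registry = {
    "HUAWEI-01": {
        "mobile phone 101": {"id_type": "IMEI", "id_value": "<imei-of-phone-101>"},
        "smart speaker 102": {"id_type": "MAC", "id_value": "<mac-of-speaker-102>"},
    },
    "HUAWEI-02": {
        "mobile phone 107": {"id_type": "IMEI", "id_value": "<imei-of-phone-107>"},
    },
}


def devices_for_account(account: str):
    """Return the device identifications registered under `account`."""
    return [entry["id_value"] for entry in device_registry.get(account, {}).values()]
```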
  • the electronic device 101 (for example, the mobile phone 101) can perform voice communication, such as a VoIP call, with the electronic device 107 (for example, the mobile phone 107) through the network 109 described above.
  • The electronic device 107 may log in to the device management server 103 using a second account (for example, HUAWEI-02 in Table 1).
  • the voice switching system 100 described above may further include a voice assistant server 105.
  • the voice assistant server 105 can communicate with external services (such as streaming services, navigation services, calendar services, telephone services, photo services, etc.) through the network 109 to complete tasks or collect information.
  • the aforementioned voice assistant server 105 may be a part of a voice assistant system (not shown in the figure), and the voice assistant system may be implemented according to a client-server model.
  • The voice assistant system may include a client-side part (e.g., a voice assistant client) executed on an electronic device (e.g., the electronic device 102 in FIG. 1), and a server-side part executed on the voice assistant server 105 (e.g., a voice assistant service).
  • the voice assistant client can communicate with the voice assistant system through the network 109 described above.
  • the voice assistant client provides client-side functions such as user-oriented input and output processing and communication with the server-side voice assistant system.
  • the voice assistant system may provide server-side functions for one or more voice assistant clients, each of which is located on a corresponding electronic device (eg, electronic device 101 and electronic device 102).
  • In some embodiments, the first electronic device (such as the electronic device 101) shown in FIG. 1 may be a mobile phone 101, and the second electronic device (such as the electronic device 102) may be a voice assistant device (such as a smart speaker 102).
  • both the mobile phone 101 and the smart speaker 102 can have a voice communication function.
  • the mobile phone 101 and the smart speaker 102 can provide VoIP services.
  • the voice switching system 100 may further include a VoIP server 104.
  • the VoIP server 104 may be used to implement voice communication related services such as call, answer, three-party call, and call transfer of the VoIP service.
  • the mobile phone 101 (or smart speaker 102) can perform voice communication with other electronic devices having voice communication functions through the VoIP server 104.
  • the voice assistant server 105 may provide the voice recognition result to the VoIP server 104.
  • the smart speaker 102 may send the collected voice signal to the voice assistant server 105 for voice recognition.
  • the voice assistant server 105 recognizes that the control instruction corresponding to the voice signal is: call contact John.
  • the voice assistant server 105 may send an instruction to call the contact John to the VoIP server 104.
  • The VoIP server 104 may initiate a voice call request to John's electronic device (such as a mobile phone), and when John accepts the voice call request, a voice call between the smart speaker 102 and John's mobile phone may be established to implement the VoIP service.
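  • The end-to-end flow for such a spoken call request might be sketched as follows; the intent format and method names are illustrative assumptions.

```python
def handle_voice_call_command(speaker, assistant_server, voip_server):
    """Illustrative flow for a spoken command such as 'call contact John'."""
    audio = speaker.record_voice_input()         # speaker collects the voice signal
    intent = assistant_server.recognize(audio)   # e.g. {"action": "call", "contact": "John"}
    if intent.get("action") == "call":
        # The voice assistant server asks the VoIP server to place the call.
        request = voip_server.place_call(caller=speaker.device_id,
                                         callee=intent["contact"])
        if request.accepted:                     # the callee accepts the voice call request
            voip_server.establish_call(speaker.device_id, request.callee_device_id)
```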
  • the above-mentioned voice switching system 100 may further include a content server 106.
  • the content server 106 may be used to provide music, video, and other streaming media content to the smart speaker 102 (or the mobile phone 101) according to the request of the user 108.
  • the smart speaker 102 may send the collected voice signal to the voice assistant server 105 through the network 109 for voice recognition.
  • the voice assistant server 105 recognizes that the control instruction corresponding to the voice signal is: acquiring media resources of the song Silence.
  • the voice assistant server 105 may send request information to the content server 106 to obtain the media resources of the song Silence.
  • In response to the request information sent by the voice assistant server 105, the content server 106 finds the media resource of the song Silence and returns the playback address to the voice assistant server 105; the voice assistant server 105 then sends the media resource to the smart speaker 102, and the smart speaker 102 obtains the address of the song Silence from the media resource and plays or saves it.
  • the mobile phone 101 can also directly interact with the content server 106 through the network 109.
  • For example, the mobile phone 101 may send request information for playing the song Silence to the content server 106 according to the input of the user 108. After receiving the request information, the content server 106 finds the media resource of the song Silence and returns the media resource to the mobile phone 101 (or the smart speaker 102). The mobile phone 101 (or the smart speaker 102) obtains the address of the song Silence from the received media resource and plays it, or the song can be saved in the memory of the mobile phone 101 or of the smart speaker 102.
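  • A simple sketch of this request/response exchange with the content server, with illustrative method names, is shown below.

```python
def play_song(requesting_device, content_server, title="Silence"):
    """Fetch a media resource from the content server and play or save it (illustrative names only)."""
    media = content_server.find_media(title)   # content server looks up the song
    if media is None:
        return
    url = media["playback_address"]            # playback address contained in the media resource
    requesting_device.play_stream(url)         # play immediately ...
    requesting_device.save(media)              # ... or save it in local memory
```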
  • In some embodiments, when the mobile phone 101 and the smart speaker 102 are logged in to the same account and the mobile phone 101 is performing a voice call service (such as the VoIP service described above), if the user 108 wishes to switch the voice call service to the smart speaker 102, the user 108 can perform a preset specific operation on the mobile phone 101 or the smart speaker 102, such as a specific gesture or a voice input, thereby triggering the VoIP server 104 to switch the voice call service on the mobile phone 101 to the smart speaker 102, so that the voice call service continues on the smart speaker 102.
  • Conversely, the user 108 can also perform a preset specific operation on the mobile phone 101 or the smart speaker 102 to trigger the VoIP server 104 to switch the voice call service on the smart speaker 102 to the mobile phone 101, so that the voice call service continues on the mobile phone 101. In other words, in the above embodiments, the user 108 only needs to perform the specific operation on an electronic device, and the VoIP server 104 can automatically switch the ongoing voice call service from the first electronic device to the second electronic device. In this way, the voice call service is not interrupted during the entire switching process, and the user does not need to operate repeatedly across multiple electronic devices, thereby improving the efficiency of voice switching between multiple electronic devices and the user experience.
  • In other embodiments, when the mobile phone 101 and the smart speaker 102 are logged in to the same account (for example, HUAWEI-01) and the mobile phone 101 is playing audio/video, if the user 108 wishes to switch the audio/video being played (for example, seamlessly) to the smart speaker 102 to continue playing, the user 108 can perform a preset input operation on the mobile phone 101 or the smart speaker 102, thereby triggering the content server 106 to switch the audio/video service being played on the mobile phone 101 to the smart speaker 102 to continue playing.
  • Similarly, the user 108 may also perform a preset input operation on the mobile phone 101 or the smart speaker 102 to trigger the content server 106 to switch the audio/video being played on the smart speaker 102 to the mobile phone 101 to continue playing.
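  • One plausible sketch of such a playback handoff is given below; it assumes the content server tracks the current media resource and playback position, which the text does not specify, and all names are hypothetical.

```python
def switch_playback(content_server, source_device, target_device):
    """Hand an ongoing audio/video stream over between two devices logged in
    with the same account (illustrative only)."""
    session = content_server.session_for(source_device.device_id)
    if session is None:
        return
    # Target device resumes from the same media resource and position.
    target_device.play_stream(session.playback_address, offset=session.position)
    # Source device stops playing so the service continues only on the target.
    source_device.stop_playback()
    content_server.rebind_session(session, target_device.device_id)
```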
  • It should be noted that the first electronic device may also be a tablet computer, a wearable electronic device with a wireless communication function (such as a smart watch), a virtual reality device, or another device that supports audio/video services or voice call services; the specific form of the first electronic device is not particularly limited in the following embodiments.
  • The second electronic device may also be an electronic device that supports audio/video services, such as a smart TV, a tablet computer, a notebook computer, or a desktop computer; the specific form of the second electronic device is not particularly limited in the following embodiments.
  • the first electronic device may be a mobile phone, and then the second electronic device may be a smart speaker or a notebook computer configured with a voice assistant system.
  • FIG. 2 shows a schematic structural diagram of a first electronic device, that is, the electronic device 101 (for example, a mobile phone) in FIG. 1.
  • The electronic device 101 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone jack 170D, a sensor 180, a key 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and so on.
  • the processor 110 may include one or more processing units.
  • The processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • different processing units may be independent devices, or may be integrated in one or more processors.
  • the electronic device 101 may also include one or more processors 110.
  • the controller may be the nerve center and command center of the electronic device 101. The controller can generate the operation control signal according to the instruction operation code and the timing signal to complete the control of fetching instructions and executing instructions.
  • the processor 110 may also be provided with a memory for storing instructions and data.
  • the memory in the processor 110 is a cache memory.
  • The memory may store instructions or data that the processor 110 has just used or used cyclically. If the processor 110 needs to use the instruction or data again, it can be called directly from the memory. This avoids repeated access and reduces the waiting time of the processor 110, thereby improving the efficiency of the electronic device 101 system.
  • the processor 110 may include one or more interfaces.
  • Interfaces can include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the USB interface 130 is an interface that conforms to the USB standard, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, and so on.
  • the USB interface 130 can be used to connect a charger to charge the electronic device 101, and can also be used to transfer data between the electronic device 101 and peripheral devices. It can also be used to connect headphones and play audio through the headphones.
  • the interface connection relationship between the modules illustrated in the embodiments of the present invention is only a schematic description, and does not constitute a limitation on the structure of the electronic device 101.
  • the electronic device 101 may also use different interface connection methods in the foregoing embodiments, or a combination of multiple interface connection methods.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 140 may receive the charging input of the wired charger through the USB interface 130.
  • the charging management module 140 may receive wireless charging input through the wireless charging coil of the electronic device 101. While the charging management module 140 charges the battery 142, it can also supply power to the electronic device 101 through the power management module 141.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and / or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, and the wireless communication module 160.
  • the power management module 141 can also be used to monitor battery capacity, battery cycle times, battery health status (leakage, impedance) and other parameters.
  • the power management module 141 may also be disposed in the processor 110.
  • the power management module 141 and the charging management module 140 may also be set in the same device.
  • the wireless communication function of the electronic device 101 can be realized by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.
  • Antenna 1 and antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in the electronic device 101 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • the antenna 1 can be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 can provide a wireless communication solution including 2G / 3G / 4G / 5G and the like applied to the electronic device 101.
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier, and the like.
  • the mobile communication module 150 can receive the electromagnetic wave from the antenna 1, filter and amplify the received electromagnetic wave, and transmit it to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modulation and demodulation processor and convert it to electromagnetic wave radiation through the antenna 1.
  • at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110.
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
  • the modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low-frequency baseband signal to be transmitted into a high-frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
  • the demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is processed by the baseband processor and then passed to the application processor.
  • the application processor outputs a sound signal through an audio device (not limited to a speaker 170A, a receiver 170B, etc.), or displays an image or video through a display screen 194.
  • the modem processor may be an independent device.
  • the modem processor may be independent of the processor 110, and may be set in the same device as the mobile communication module 150 or other functional modules.
  • The wireless communication module 160 can provide wireless communication solutions applied to the electronic device 101, including wireless local area network (WLAN) (such as a Wi-Fi network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives the electromagnetic wave via the antenna 2, frequency-modulates and filters the electromagnetic wave signal, and sends the processed signal to the processor 110.
  • the wireless communication module 160 may also receive the signal to be transmitted from the processor 110, frequency-modulate it, amplify it, and convert it to electromagnetic waves through the antenna 2 to radiate it out.
  • the wireless communication module 160 may be specifically used to establish a short-range wireless communication link with the second electronic device (for example, the electronic device 102), so as to perform short-range wireless data transmission between the two.
  • the above short-range wireless communication link may be a Bluetooth communication link, a Wi-Fi communication link, an NFC communication link, or the like. Therefore, the wireless communication module 160 may specifically include a Bluetooth communication module, a Wi-Fi communication module, or an NFC communication module.
  • the antenna 1 of the electronic device 101 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 101 can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technology may include GSM, GPRS, CDMA, WCDMA, TD-SCDMA, LTE, GNSS, WLAN, NFC, FM, and / or IR technologies.
  • The above GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or a satellite-based augmentation system (SBAS).
  • the electronic device 101 can realize a display function through a GPU, a display screen 194, and an application processor.
  • the GPU is a microprocessor for image processing, connecting the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations, and is used for graphics rendering.
  • the processor 110 may include one or more GPUs that execute instructions to generate or change display information.
  • the display screen 194 is used to display images, videos and the like.
  • the display screen 194 includes a display panel.
  • the display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active matrix organic light-emitting diode or an active matrix organic light-emitting diode (active-matrix organic light) emitting diode, AMOLED, flexible light-emitting diode (FLED), Miniled, MicroLed, Micro-oLed, quantum dot light emitting diode (QLED), etc.
  • the electronic device 101 may include 1 or N display screens 194, where N is a positive integer greater than 1.
  • the electronic device 101 can realize a shooting function through an ISP, one or more cameras 193, a video codec, a GPU, one or more display screens 194, an application processor, and the like.
  • NPU is a neural-network (NN) computing processor.
  • the NPU can realize applications such as intelligent recognition of the electronic device 101, such as image recognition, face recognition, voice recognition, and text understanding.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 101.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example, save data files such as music, photos, and videos in an external memory card.
  • the internal memory 121 may be used to store one or more computer programs including instructions.
  • the processor 110 can execute the above instructions stored in the internal memory 121, so that the electronic device 101 executes the voice switching method provided in some embodiments of the present application, as well as various functional applications and data processing.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area can store the operating system; the storage program area can also store one or more application programs (such as gallery, contacts, etc.) and so on.
  • the storage data area may store data (such as photos, contacts, etc.) created during the use of the electronic device 101 and the like.
  • The internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (UFS), and so on.
  • The processor 110 may cause the electronic device 101 to execute the voice switching method provided in the embodiments of the present application, as well as various functional applications and data processing, by running the instructions stored in the internal memory 121 and/or the instructions stored in the memory provided in the processor 110.
  • the electronic device 101 may implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone interface 170D, and an application processor.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and also used to convert analog audio input into digital audio signal.
  • the audio module 170 can also be used to encode and decode audio signals.
  • the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.
  • The speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
  • the electronic device 101 can listen to music through the speaker 170A, or listen to a hands-free call.
  • the receiver 170B also known as “handset” is used to convert audio electrical signals into sound signals.
  • the voice can be received by bringing the receiver 170B close to the ear.
  • The microphone 170C, also called a "mic", is used to convert sound signals into electrical signals.
  • The user can input a sound signal to the microphone 170C by speaking close to the microphone 170C.
  • the electronic device 101 may be provided with at least one microphone 170C. In other embodiments, the electronic device 101 may be provided with two microphones 170C.
  • the electronic device 101 may also be provided with three, four, or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and implement directional recording functions.
  • the headset interface 170D is used to connect wired headsets.
  • The headphone jack 170D may be the USB interface 130, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.
  • The sensor 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and so on.
  • the pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
  • the pressure sensor 180A may be provided on the display screen 194.
  • The capacitive pressure sensor may include at least two parallel plates having conductive material.
  • the electronic device 101 determines the intensity of the pressure according to the change in capacitance.
  • the electronic device 101 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the electronic device 101 may also calculate the touched position based on the detection signal of the pressure sensor 180A.
  • touch operations that act on the same touch position but have different touch operation intensities may correspond to different operation instructions. For example, when a touch operation with a touch operation intensity less than the first pressure threshold acts on the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
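  • The mapping from touch intensity to operation instruction in the short-message example can be sketched as follows; the threshold value is an assumption.

```python
FIRST_PRESSURE_THRESHOLD = 0.5  # hypothetical normalized pressure value


def on_touch_message_icon(pressure: float) -> str:
    """Map touch intensity on the short-message application icon to an instruction."""
    if pressure < FIRST_PRESSURE_THRESHOLD:
        return "view_short_message"        # light press: view the short message
    return "create_new_short_message"      # firm press: create a new short message
```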
  • the gyro sensor 180B may be used to determine the movement posture of the electronic device 101.
  • In some embodiments, the angular velocity of the electronic device 101 around three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 180B.
  • the gyro sensor 180B can be used for shooting anti-shake.
  • the gyro sensor 180B detects the jitter angle of the electronic device 101, calculates the distance that the lens module needs to compensate based on the angle, and allows the lens to counteract the jitter of the electronic device 101 through reverse movement to achieve anti-shake.
  • the gyro sensor 180B can also be used for navigation, somatosensory game scenes, and the like.
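  • As a rough illustration of the compensation calculation, a small-angle optical-image-stabilization model is sketched below; this model is an assumption, not the application's own formula.

```python
import math


def lens_compensation_mm(jitter_angle_rad: float, focal_length_mm: float) -> float:
    """Estimate how far the lens module must move to counteract a detected
    jitter angle, assuming a simple pinhole/small-angle model."""
    return focal_length_mm * math.tan(jitter_angle_rad)
```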
  • the acceleration sensor 180E can detect the magnitude of acceleration of the electronic device 101 in various directions (generally three axes). When the electronic device 101 is stationary, the magnitude and direction of gravity can be detected. It can also be used to recognize the posture of electronic devices, and be used in applications such as horizontal and vertical screen switching and pedometers.
  • the distance sensor 180F is used to measure the distance.
  • the electronic device 101 can measure the distance by infrared or laser. In some embodiments, when shooting scenes, the electronic device 101 may use the distance sensor 180F to measure distance to achieve fast focusing.
  • the proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • the electronic device 101 emits infrared light outward through the light emitting diode.
  • the electronic device 101 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the electronic device 101. When insufficient reflected light is detected, the electronic device 101 may determine that there is no object near the electronic device 101.
  • the electronic device 101 can use the proximity light sensor 180G to detect that the user holds the electronic device 101 close to the ear to talk, so as to automatically turn off the screen to save power.
  • the proximity light sensor 180G can also be used in leather case mode, pocket mode automatically unlocks and locks the screen.
  • the ambient light sensor 180L is used to sense the brightness of ambient light.
  • the electronic device 101 can adaptively adjust the brightness of the display screen 194 according to the perceived brightness of the ambient light.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 101 is in a pocket to prevent accidental touch.
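  • A minimal sketch combining the proximity light sensor and the ambient light sensor for the ear-proximity and pocket-mode cases is given below; the thresholds are placeholders, not values from the application.

```python
REFLECTED_LIGHT_THRESHOLD = 100  # hypothetical "sufficient reflected light" reading
POCKET_DARKNESS_LUX = 5          # hypothetical "very dark" ambient brightness


def object_nearby(reflected_light: int) -> bool:
    """Proximity decision: enough infrared light is reflected back by a nearby object."""
    return reflected_light >= REFLECTED_LIGHT_THRESHOLD


def in_pocket(reflected_light: int, ambient_lux: float) -> bool:
    """Pocket detection combining the proximity and ambient light sensors."""
    return object_nearby(reflected_light) and ambient_lux <= POCKET_DARKNESS_LUX
```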
  • the fingerprint sensor 180H (also called fingerprint reader) is used to collect fingerprints.
  • the electronic device 101 can use the collected fingerprint characteristics to realize fingerprint unlocking, access to application lock, fingerprint photo taking, fingerprint answering call, and the like.
  • For other descriptions of the fingerprint sensor, please refer to the international patent application PCT/CN2017/082773 titled "Method and Method for Processing Notifications", the entire contents of which are incorporated by reference in this application.
  • the touch sensor 180K can also be called a touch panel or a touch-sensitive surface.
  • The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 together constitute a touchscreen.
  • the touch sensor 180K is used to detect a touch operation acting on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • the visual output related to the touch operation can be provided through the display screen 194.
  • the touch sensor 180K may also be disposed on the surface of the electronic device 101, which is different from the location where the display screen 194 is located.
  • the bone conduction sensor 180M can acquire vibration signals.
  • the bone conduction sensor 180M can acquire the vibration signal of the vibrating bone mass in the human vocal part.
  • the bone conduction sensor 180M can also contact the human pulse and receive a blood pressure pulse signal.
  • the bone conduction sensor 180M may also be provided in an earphone to form a bone conduction earphone.
  • the audio module 170 may parse out a voice signal based on the vibration signal, acquired by the bone conduction sensor 180M, of the vibrating bone mass in the vocal part, so as to realize the voice function.
  • the application processor may analyze heart rate information based on the blood pressure pulse signal acquired by the bone conduction sensor 180M, so as to implement the heart rate detection function.
  • the key 190 includes a power-on key, a volume key, and the like.
  • the key 190 may be a mechanical key or a touch key.
  • the electronic device 101 can receive key input and generate key signal input related to user settings and function control of the electronic device 101.
  • the SIM card interface 195 is used to connect a SIM card.
  • the SIM card can be inserted into or removed from the SIM card interface 195 to achieve contact and separation with the electronic device 101.
  • the electronic device 101 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
  • the SIM card interface 195 can support Nano SIM cards, Micro SIM cards, SIM cards, etc.
  • multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of the multiple cards may be the same or different.
  • the SIM card interface 195 can also be compatible with different types of SIM cards.
  • the SIM card interface 195 can also be compatible with external memory cards.
  • the electronic device 101 interacts with the network through the SIM card to realize functions such as call and data communication.
  • the electronic device 101 uses eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the electronic device 101 and cannot be separated from the electronic device 101.
  • the structure of the second electronic device may be the same as that of the first electronic device (for example, the electronic device 101 in FIG. 2), so the structure of the second electronic device will not be described here.
  • the second electronic device (for example, the electronic device 102) may be a voice assistant device; therefore, the structure of the second electronic device may also be different from the structure of the first electronic device.
  • FIG. 3 is a schematic structural diagram of a second electronic device (for example, the electronic device 102) in some other embodiments.
  • the electronic device 102 may specifically be a voice assistant device (for example, a smart speaker 102), and the voice assistant device is configured with a voice assistant system.
  • the voice assistant system may refer to any information processing system that interprets natural language input in spoken and/or text form to infer user intent (for example, identifying a task type corresponding to the natural language input) and performs actions based on the inferred user intent (for example, performing a task corresponding to the identified task type).
  • the system can perform one or more of the following operations: identifying a task flow whose steps and parameters are designed to realize the inferred user intent (for example, identifying the task type); inputting specific requirements derived from the inferred user intent into the task flow; executing the task flow (for example, sending a request to a service provider by calling programs, methods, services, application programming interfaces (APIs), and so on); and generating an output response to the user in auditory (for example, speech) and/or visual form.
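  • purely as an illustrative sketch of the intent-to-task-flow pipeline described above, the following Python fragment shows one possible organization; the names used here (infer_intent, TaskFlow, TASK_FLOWS, handle) are assumptions introduced for this example and are not part of the disclosed system.
```python
# Illustrative sketch of the intent -> task flow -> response pipeline; all names are hypothetical.
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class TaskFlow:
    """A programmed sequence of steps that realizes one executable intent."""
    required_params: list
    action: Callable[[Dict], str]

# Toy registry mapping executable intents to task flows.
TASK_FLOWS = {
    "make_call": TaskFlow(
        required_params=["contact"],
        action=lambda p: f"Calling {p['contact']} over VoIP...",
    ),
}

def infer_intent(utterance: str) -> Tuple[str, Dict]:
    """Very crude stand-in for the natural language intent inference step."""
    if utterance.lower().startswith("call "):
        return "make_call", {"contact": utterance[5:].strip()}
    return "unknown", {}

def handle(utterance: str) -> str:
    intent, params = infer_intent(utterance)            # infer the user intent
    flow = TASK_FLOWS.get(intent)
    if flow is None:
        return "Sorry, I did not understand that."      # no executable intent matched
    missing = [p for p in flow.required_params if p not in params]
    if missing:
        return f"Please provide: {', '.join(missing)}"  # elicit missing information
    return flow.action(params)                          # execute the task flow

if __name__ == "__main__":
    print(handle("Call Susan"))  # -> Calling Susan over VoIP...
```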
  • the voice assistant system can accept user requests in the form of at least partially natural language commands, requests, statements, narrations, and / or queries.
  • user requests either seek an informational answer from the voice assistant system or ask the voice assistant system to perform a task.
  • a satisfactory response to a user request is usually to provide the requested informational answer, perform the requested task, or a combination of both.
  • the user may ask the voice assistant system questions such as "Where am I now?". Based on the user's current location, the voice assistant may answer "You are near the west gate of Central Park."
  • the user can also request to perform tasks, for example, by saying "Please invite my friends to my birthday party next week.”
  • the voice assistant system can confirm the request by generating a voice output such as "OK, right away", and then send an appropriate calendar invitation from the user's email address to each of the friends listed in the user's electronic address book or contact list.
  • the voice assistant system can also provide other visual or audio responses (for example, text, alarms, music, video, animation, and the like).
  • the electronic device 102 may specifically include a processor 310, an external memory interface 320, a memory 321, a USB interface 330, a charging management module 340, a power management module 341, a battery 342, an antenna 343, a network communication interface 350, an input/output (I/O) interface 351, a wireless communication module 360, an audio module 370, one or more speaker arrays 370A, one or more microphone arrays 370B, one or more sensors 380, buttons 390, a motor 391, indicators 392, a camera 393, and a display 394. These components communicate with each other through one or more communication buses or signal lines.
  • the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 102.
  • the electronic device 102 may include more or fewer components than shown, or combine certain components, or split certain components, or arrange different components.
  • the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
  • the external memory interface 320, USB interface 330, charging management module 340, power management module 341, battery 342, antenna 343, wireless communication module 360, audio module 370, one or more sensors 380, button 390, motor 391, indicator 392, camera 393, display screen 394, and the like in FIG. 3 may have the same or similar structures and/or functions as the corresponding components of the electronic device 101 in FIG. 2; for specific descriptions of these components in FIG. 3, refer to the corresponding descriptions in FIG. 2 and the related embodiments, which are not repeated here.
  • the network communication interface 350 may include one or more wired communication ports, or, one or more wireless transmission / reception circuits.
  • One or more wired communication ports receive and send communication signals via one or more wired interfaces such as Ethernet, USB, FireWire, and so on.
  • Wireless circuits generally receive RF signals or optical signals from communication networks and other electronic devices, and send RF signals or optical signals to communication networks and other electronic devices.
  • Wireless communication may use any of a variety of communication standards, protocols, and technologies, such as GSM, CDMA, WCDMA, TDMA, Bluetooth, Wi-Fi, VoIP, or any other suitable communication protocol.
  • the network communication interface 350 enables the electronic device 102 to communicate with other electronic devices (such as the mobile phone 101) or a server on the network side through a network, such as the Internet, a wireless network such as a cellular network, a wireless local area network, or the like.
  • the memory 321 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a general-purpose flash memory, and so on.
  • the processor 310 may run the instructions stored in the memory 321, or the instructions stored in the memory provided in the processor 310, to cause the electronic device 102 to perform the voice switching method provided in the embodiments of the present application, as well as various functional applications and data processing.
  • the memory 321 may store programs, modules, instructions, and data structures including all or a subset of the following: an operating system 321A, a communication module 321B, a user interface module 321C, one or more application programs 321D, and a voice assistant module 321E.
  • One or more processors 310 execute these programs, modules, and instructions, and read data from or write data to the data structure.
  • the operating system 321A (for example, Darwin, RTXC, LINUX, UNIX, OS X, WINDOWS, or an embedded operating system) includes various software components or drivers for controlling and managing general system tasks (for example, memory management, storage device control, and power management), and facilitates communication between various hardware, firmware, and software components.
  • the communication module 321B facilitates communication between the voice assistant device 300 and other electronic devices through the network communication interface 350.
  • the communication module 321B can communicate with the electronic device 101 shown in FIG. 2.
  • the communication module 321B may also include various software components that can be used to process data received by the network communication interface 350.
  • the user interface module 321C receives commands or input from the user via the I / O interface 351 (for example, from a keyboard, touch screen, or microphone connected to the I / O interface 351), and displays the user interface on the display.
  • the application program 321D includes a program or module configured to be executed by one or more processors 310.
  • the application 321D may include applications such as games, calendars, navigation, or mail.
  • the applications 321D may include applications such as resource management, diagnosis, or scheduling.
  • the memory 321 also stores a voice assistant module 321E.
  • the voice assistant module 321E may include the following sub-modules, or a subset or superset thereof: an I/O processing module 321F, a speech-to-text (STT) processing module 321G, a natural language processing module 321H, a dialog processing module 321I, a task flow processing module 321J, and a service processing module 321K.
  • the voice assistant module 321E is mainly used to implement the voice assistant system in the embodiment of the present application through the information interaction of the above submodules.
  • the voice assistant device 300 can perform at least one of the following operations: recognizing the user intent expressed in natural language input received from the user; actively eliciting and obtaining the information required to fully infer the user intent (for example, by disambiguating words, names, intentions, and so on); determining the task flow used to achieve the inferred intent; and executing the task flow to achieve the inferred intent.
  • when a satisfactory response is not or cannot be provided to the user for various reasons, the voice assistant device 300 also takes other appropriate actions.
  • the I/O processing module 321F can receive user commands (for example, voice commands) or input (for example, voice input) through one or more microphone arrays 370B; the I/O processing module 321F can also receive commands or input from the user through the I/O interface 351 from other connected devices (such as a microphone, touch screen, or keyboard); the I/O processing module 321F can also provide responses to user input through one or more speaker arrays 370A, the indicators 392, or the display screen 394 to interact with the user; and the I/O processing module 321F can also interact with other electronic devices (such as the electronic device 101 in FIG. 2) through the network communication interface 350 to obtain user input (such as voice input) and provide responses to that user input.
  • the I / O processing module 321F may obtain context information associated with user input from other electronic devices when the user input is received or shortly after the user input is received.
  • Context information includes user-specific data, vocabulary, or preferences related to user input.
  • the context information may also include the software and hardware status of the electronic device (eg, electronic device 101) when the user request is received, or information related to the user's surrounding environment when the user request is received.
  • the I / O processing module 321F also sends follow-up questions related to user requests to the user and receives answers from the user.
  • the I / O processing module 321F may forward the voice input to the STT processing module 321G for voice-to-text conversion.
  • the STT processing module 321G receives voice input through the I / O processing module 321F.
  • the STT processing module 321G can use various sound models and language models to recognize speech input as a sequence of phonemes, and finally recognize it as a sequence of words or tokens written in one or more languages .
  • the STT processing module 321G can use any suitable speech recognition technology, sound model, and language model, such as hidden Markov model, speech recognition based on dynamic time warping, and other statistical or analytical techniques, to implement the embodiments of the present application.
  • speech-to-text processing may be performed at least in part by a third-party service or on the electronic device. Once the STT processing module 321G obtains the result of the speech-to-text processing (for example, a sequence of words or symbols), it transmits the result to the natural language processing module 321H for intent inference.
  • the natural language processing module 321H (also called a natural language processor) obtains the sequence of words or symbols ("symbol sequence") generated by the STT processing module 321G, and attempts to associate the symbol sequence with one or more "executable intents" recognized by the voice assistant module 321E.
  • "Executable intent” means a task that can be executed by the voice assistant module 321E, and has an associated task flow implemented in the task flow model.
  • the associated task flow is a series of programmed actions and steps taken by the voice assistant module 321E to perform the task.
  • the range of capabilities of the voice assistant system depends on the number and types of task flows that have been implemented and stored in the task flow model; or in other words, on the number and types of "executable intentions" recognized by the voice assistant module 321E.
  • the effectiveness of the voice assistant system 300 also depends on the ability of the voice assistant module 321E to infer the correct "one or more executable intentions" from user requests expressed in natural language.
  • the natural language processing module 321H may also receive contextual information associated with user requests (eg, from the I / O processing module 321F) .
  • the natural language processing module 321H may also use the context information to clarify, supplement, or further define the information contained in the symbol sequence received from the STT processing module 321G.
  • Context information includes, for example, user preferences, hardware and / or software status of the user device, sensor information collected before, during, or shortly after the user's request, previous interactions (such as conversations) between the voice assistant system and the user, and so on.
  • the natural language processing module 321H may specifically include an ontology, vocabulary, user data, and classification modules.
  • the knowledge ontology is a hierarchical structure containing multiple nodes, and each node represents an "executable intent" or an "attribute" related to one or more "executable intents" or other "attributes".
  • an "executable intent" means a task that the voice assistant system 300 can perform, that is, a task that is executable or can be acted upon.
  • an "attribute" represents a parameter associated with an executable intent, or a sub-aspect of another attribute.
  • Each node in the ontology of knowledge is associated with a set of words and / or phrases related to the attributes or executable intentions represented by the nodes.
  • the corresponding words and / or phrases associated with each node are the so-called "vocabulary" associated with the node.
  • the corresponding set of words and / or phrases associated with each node may be stored in the vocabulary index associated with the attribute or executable intent represented by the node.
  • the vocabulary index can include words and phrases in different languages.
  • the natural language processor 321H receives the symbol sequence (eg, text string) from the STT processing module 321G and determines which nodes are involved in the words in the symbol sequence.
  • User data includes user-specific information, such as user-specific vocabulary, user preferences, user address, user's default language and second language, user's contact list, and other short-term or long-term information for each user.
  • the natural language processor 321H may use user data to supplement the information contained in the user input to further define the user intent. For example, in response to the user request "invite my friends to my birthday party", the natural language processor 321H can access the user data to determine who the "friends" are and when and where the "birthday party" will be held, without the need for the user to explicitly provide such information in the request.
  • the natural language processor 321H may also include a classification module.
  • the classification module determines, for example, whether each of the one or more words in the text string is one of an entity, activity, or location.
  • the natural language processor 321H generates a structured query to represent the identified executable intent.
  • the natural language processor 321H may populate some parameters of the structured query with the received context information. For example, if the user requests a "near me" sushi restaurant, the natural language processor 321H may use GPS coordinates from the voice assistant device 300 to populate the location parameters in the structured query.
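  • as a hedged illustration of how the location parameter of such a structured query might be filled from context information, consider the following sketch; the dictionary layout and field names (intent, slots, gps) are assumptions made for this example only.
```python
# Hypothetical sketch: filling a structured query's parameters from context information,
# as in the "sushi restaurant near me" example above. Field names are illustrative.

def build_structured_query(intent: str, slots: dict, context: dict) -> dict:
    query = {"intent": intent, **slots}
    # If the user said "near me" without an explicit place, fall back to device GPS.
    if query.get("location") == "near me" and "gps" in context:
        query["location"] = context["gps"]
    return query

context = {"gps": (39.9042, 116.4074)}                 # assumed device-reported coordinates
slots = {"cuisine": "sushi", "location": "near me"}
print(build_structured_query("find_restaurant", slots, context))
# {'intent': 'find_restaurant', 'cuisine': 'sushi', 'location': (39.9042, 116.4074)}
```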
  • the natural language processor 321H transmits the structured query to the task stream processing module 321J (also may be referred to as a task stream processor).
  • the task flow processor 321J is configured to perform one or more of the following: receiving the structured query from the natural language processor 321H, completing the structured query, and performing the actions required by the user's final request.
  • various processes necessary to complete these tasks are provided in the task flow model in the task flow processing module 321J.
  • the task flow model includes a process for acquiring additional information from the user, and a task flow for performing actions associated with executable intent.
  • the task flow processor 321J may need to initiate additional conversations with the user in order to obtain additional information, and / or to disambiguate potentially ambiguous utterances.
  • the task flow processor 321J calls the dialog processing module 321I (also referred to as a dialog processor) to conduct a dialog with the user.
  • the dialog processing module 321I determines how (and / or when) to ask the user for additional information, and receives and processes the user's response.
  • the question is provided to the user through the I/O processing module 321F, and the answer is received from the user.
  • the dialogue processing module 321I presents dialogue output to the user via the speaker array 370A and / or the display screen 394, and receives input from the user.
  • the dialog processor 321I may include a disambiguation module.
  • the disambiguation module is used to disambiguate one or more ambiguous words (for example, one or more ambiguous words in the text string corresponding to the voice output associated with the digital photo).
  • the disambiguation module recognizes that a first word among the one or more words has multiple candidate meanings, prompts the user for additional information about the first word, receives the additional information from the user in response to the prompt, and identifies the entity, activity, or location associated with the first word based on the additional information.
  • the task flow processor 321J continues to execute the final task associated with the executable intent. Therefore, the task flow processor 321J can execute the steps and instructions in the task flow model according to the specific parameters included in the structured query. In some embodiments, the task flow processor 321J completes the task requested in the user input with the assistance of the service processing module 321K (also referred to as a service processor), or provides the informational response requested in the user input.
  • the service processor 321K may replace the task flow processor 321J to initiate phone calls, set calendar entries, call map search, call or interact with other applications installed on the user device, and call third-party services (eg, Restaurant booking portals, social networking sites or services, bank portals, etc.) or interact with these third-party services.
  • the protocol and application programming interface (API) required by each service can be specified by a corresponding service model among the service models in the service processing module 321K.
  • the service processor 321K accesses the appropriate service model for the service, and generates a request for the service according to the protocol and API required by the service, as defined by that service model.
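  • the following is a minimal, assumed sketch of such a service-model lookup and request generation; the "restaurant_booking" entry, its endpoint URL, and the field names are placeholders invented for illustration rather than any real third-party API.
```python
# Illustrative only: a service processor picking a service model and shaping a request
# in the form that the service expects. Endpoint and field names are placeholders.

SERVICE_MODELS = {
    "restaurant_booking": {
        "endpoint": "https://example.com/api/reserve_table",   # placeholder URL
        "method": "POST",
        "fields": ["restaurant_id", "time", "party_size"],
    },
}

def build_service_request(service: str, params: dict) -> dict:
    model = SERVICE_MODELS[service]
    missing = [f for f in model["fields"] if f not in params]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return {
        "url": model["endpoint"],
        "method": model["method"],
        "body": {f: params[f] for f in model["fields"]},
    }

print(build_service_request(
    "restaurant_booking",
    {"restaurant_id": 42, "time": "19:00", "party_size": 2},
))
```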
  • the natural language processor 321H, the dialog processor 321I, and the task flow processor 321J are used jointly and repeatedly to infer and define the user's intent, obtain information to further clarify and refine the user's intent, and finally generate a response (for example, provide an output to the user, or complete a task) to satisfy the user's intent.
  • the voice assistant system formulates a confirmation response and sends the response back to the user through the I / O processing module 321F. If the user requests an informational answer, the confirmation response presents the requested information to the user.
  • the I / O interface 351 may couple the I / O devices of the voice assistant device 300 such as a keyboard, a touch screen, a microphone, and the like to the user interface module 321C.
  • the I / O interface 351 is combined with the user interface module 321C to receive user input (for example, voice input, keyboard input, touch input, etc.) and process these inputs accordingly.
  • the above-mentioned second electronic device may be distributed across multiple computers, thereby forming a client-server voice assistant system.
  • in this voice assistant system, some of the modules and functions of the voice assistant system are divided into a server part and a client part.
  • the client part may be located on the second electronic device (for example, the electronic device 102) and communicate with the server part (for example, the voice assistant server 105) through the network 109, as shown in FIG.
  • the voice assistant system may be the embodiment of the voice assistant server 105 shown in FIG. 1.
  • the voice assistant system may be implemented in the electronic device 102, thereby eliminating the need for a client-server system.
  • the above-mentioned voice assistant system is only an example, and the voice assistant system may have more or fewer components than shown, two or more components may be combined, or may have different configurations or layouts of components .
  • the mobile phone 101 will be used as the first electronic device and the smart speaker 102 will be used as the second electronic device, and the voice switching method provided in the embodiments of the present application will be described in detail with reference to the drawings.
  • the ongoing VoIP service can be switched between the mobile phone 101 and the smart speaker 102.
  • the electronic device that is performing the VoIP service before switching may be referred to as a source device of the VoIP service, and the source device may be a mobile phone 101 or a smart speaker 102.
  • the device that continues to perform the VoIP service after the handover may be called the target device of the VoIP service.
  • when the source device is the mobile phone 101, the target device of the VoIP service may be the smart speaker 102; when the source device is the smart speaker 102, the target device of the VoIP service may be the mobile phone 101.
  • the following describes how to switch the VoIP service between the mobile phone 101 and the smart speaker 102 in conjunction with the specific scenario of the embodiment.
  • the user 108 can perform VoIP services with other electronic devices (such as the electronic device 107 in FIG. 1) through the smart speaker 102. If the user 108 wishes to switch the VoIP service from the smart speaker 102 to the mobile phone 101, the user can perform a preset input operation on the mobile phone 101 to trigger the VoIP server 104 to switch the ongoing VoIP service on the smart speaker 102 to the mobile phone 101 through the network 109.
  • this embodiment provides a voice switching method, which may be implemented in the electronic device or server involved in the foregoing embodiment, and may include the following steps:
  • Step S401 Both the mobile phone 101 and the smart speaker 102 use the first account to log in to the device management server 103.
  • the first account may be an account of an application (for example, Kugou Music), or an account of a service (for example, Huawei Cloud Service).
  • the user 108 can use the account (HUAWEI-01) in the Kugou music APP of the mobile phone 101 to log in to the device management server 103 corresponding to the APP; In addition, the user 108 can also log in to the device management server 103 corresponding to the APP in the Kugou Music APP of the smart speaker 102 using the same account (HUAWEI-01).
  • if the mobile phone 101 and the smart speaker 102 are both Huawei-branded electronic devices, both the mobile phone 101 and the smart speaker 102 can provide the user 108 with Huawei cloud services.
  • the user can use the account (HUAWEI-01) on the mobile phone 101 to log in to the device management server 103 of the Huawei cloud service, and the user 108 can also use the same account (HUAWEI-01) on the smart speaker 102 to log in to the device management server 103 of the Huawei cloud service.
  • the device management server 103 stores an account number and device information of the electronic device logged in with the account number, such as a device identification (as shown in Table 1).
  • the device management server 103 may establish a correspondence between the first account and the electronic device using the first account. In this way, the device management server 103 can inquire which specific electronic devices are logged in under an account.
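  • as a minimal sketch of this account-to-device correspondence (in the spirit of Table 1), the following fragment stores and queries which devices are logged in under an account; the identifiers "phone-101" and "speaker-102" are made up for illustration.
```python
# Minimal sketch of the account-to-device mapping kept by the device management server.
# Identifiers below are illustrative placeholders.

account_devices = {}   # first account -> set of device identifications

def register_login(account: str, device_id: str) -> None:
    """Record that a device has logged in to the server with the given account."""
    account_devices.setdefault(account, set()).add(device_id)

def devices_for(account: str) -> set:
    """Return every device identification currently associated with the account."""
    return account_devices.get(account, set())

register_login("HUAWEI-01", "phone-101")
register_login("HUAWEI-01", "speaker-102")
print(devices_for("HUAWEI-01"))   # e.g. {'phone-101', 'speaker-102'}
```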
  • Step S402 The smart speaker 102 detects an input operation in which the user 108 initiates a voice call.
  • Step S403 In response to the above input operation, the smart speaker 102 establishes a VoIP call with the third electronic device (eg, the mobile phone 107 shown in FIG. 1) through the VoIP server 104.
  • the user 108 may initiate an input operation of a VoIP voice call on the smart speaker 102.
  • the input operation may specifically be an operation in which the user 108 inputs Susan's phone number on the display screen 394 of the smart speaker 102.
  • the input operation may also be a voice input operation performed by the user 108 to the smart speaker 102.
  • the user 108 may say "call Susan” to the smart speaker 102.
  • the voice assistant system in the smart speaker 102 may be used to perform voice recognition on the voice signal to obtain a control instruction corresponding to the voice signal.
  • the voice assistant system recognizes that the control instruction is "call contact Susan” based on the voice signal.
  • the smart speaker 102 may send a call request to the VoIP server 104 according to Susan's phone number in the address book, so that the VoIP server 104 calls the contact Susan's mobile phone 107.
  • after answering, the mobile phone 107 may send an answer-success message to the VoIP server 104, thereby establishing a VoIP call between the smart speaker 102 and the called party (that is, the contact Susan's mobile phone 107). In this way, the user can use the smart speaker 102 to make a VoIP call with the contact.
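  • the following is a hedged sketch of this call setup sequence, in which the call is treated as established once the called party reports a successful answer; the VoipServer class and its method names are hypothetical and only model call membership.
```python
# Hypothetical sketch of the call setup described above; not a real signaling protocol.

class VoipServer:
    def __init__(self):
        self.calls = {}                      # call_id -> set of participant device IDs

    def place_call(self, caller: str, callee_number: str) -> str:
        call_id = f"call-{len(self.calls) + 1}"
        self.calls[call_id] = {caller}       # the caller joins immediately
        # A real server would now ring the device registered for callee_number.
        return call_id

    def on_answer_success(self, call_id: str, callee: str) -> None:
        self.calls[call_id].add(callee)      # answer-success message -> call established

server = VoipServer()
cid = server.place_call("speaker-102", "+86-123456")   # "call Susan"
server.on_answer_success(cid, "phone-107")             # Susan's phone answers
print(server.calls[cid])                                # {'speaker-102', 'phone-107'}
```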
  • the smart speaker 102 can collect the user's voice input (that is, the voice signal) from different directions through one or more microphone arrays 370B, and can play, through one or more speaker arrays 370A, the voice feedback of the voice recognition result from the voice assistant system.
  • the smart speaker 102 may also send the voice signal to the voice assistant server 105, and the voice assistant server 105 performs voice recognition on the voice signal to obtain the control instruction corresponding to the voice signal.
  • the voice assistant server 105 may send the recognized control instruction to the VoIP server 104, and the VoIP server 104 establishes the VoIP service between the smart speaker 102 and the called party according to the above method.
  • the user can also initiate the call operation of the VoIP service by performing preset gestures and the like, which is not limited in this embodiment of the present application.
  • Step S404 The mobile phone 101 sends first switching request information to the VoIP server 104.
  • the first switching request information is used to request the VoIP server 104 to switch the ongoing VoIP service in the smart speaker 102 to the mobile phone 101 to continue.
  • the first switching request information may include a first account number.
  • the user may wish to switch the ongoing VoIP call in the smart speaker 102 to the mobile phone 101 for execution.
  • the smart speaker 102 is located in the home of the user 108, and the user can use the smart speaker 102 to make VoIP calls with other electronic devices (such as a mobile phone 107) while at home.
  • the user needs to use the mobile phone 101 with better portability to continue the VoIP call with the mobile phone 107.
  • the user needs to switch the VoIP service being executed in the smart speaker 102 to the mobile phone 101 to continue to execute.
  • the mobile phone 101 may preset a specific operation for switching the VoIP service.
  • the specific operation may be an input operation such as flipping the mobile phone, tapping the screen with a knuckle, double-clicking the power button, or a sliding operation.
  • the specific operation may also be preset voice input.
  • the user 108 can input a voice command of “switch voice call” to the mobile phone 101 by voice. It can be understood that, those skilled in the art may set the preset operation according to actual application scenarios or actual experience, and this embodiment does not make any limitation on this.
  • the mobile phone 101 may send the first switching request information to the VoIP server 104 .
  • when the mobile phone 101 detects the above specific operation, it means that the user 108 needs to switch the ongoing VoIP service under the first account (for example, HUAWEI-01) to the mobile phone 101 at this time. Furthermore, in response to the specific operation, the mobile phone 101 may send the first handover request information to the VoIP server 104 through the network 109.
  • the above specific operation may be preset as a swipe gesture on the touch screen of the mobile phone 101.
  • if the sliding trajectory of the sliding gesture is X-shaped, the sliding gesture is used to instruct switching of the ongoing VoIP call on the mobile phone 101 to the user's smart speaker 102, and the mobile phone 101 can use the smart speaker 102 as the target device of the VoIP call.
  • if the sliding trajectory of the sliding gesture is Y-shaped, the sliding gesture is used to instruct switching of the ongoing VoIP call on the mobile phone 101 to the user's tablet 111 (not shown). Then, if it is detected that the sliding trajectory of the sliding operation performed by the user is Y-shaped, the mobile phone 101 may use the tablet 111 as the target device of the VoIP call.
  • the first handover request information may further include the VoIP identifier of the mobile phone 101 in the VoIP service (for example, the phone number or IP address of the mobile phone 101), and the VoIP identifier of the smart speaker 102 in the VoIP service .
  • the first handover request information may further include a device identification of the mobile phone 101, so that the VoIP server 104 performs legality verification on the electronic device 101 after receiving the first handover request information. This further improves the security of voice switching.
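  • based on the fields described above (the first account, optionally the VoIP identifiers, and optionally the device identification of the mobile phone 101), a first switching request could be assembled as in the following sketch; the dictionary layout is an assumption for illustration, not a defined message format.
```python
# Sketch of the fields the first switching request information could carry; the layout
# is illustrative only and does not define any wire format.
from typing import Optional

def build_first_switch_request(account: str,
                               target_voip_id: str,
                               source_voip_id: Optional[str] = None,
                               target_device_id: Optional[str] = None) -> dict:
    request = {"account": account, "target_voip_id": target_voip_id}
    if source_voip_id:                      # optional: VoIP identifier of the source device
        request["source_voip_id"] = source_voip_id
    if target_device_id:                    # optional: lets the server verify the device
        request["target_device_id"] = target_device_id
    return request

print(build_first_switch_request("HUAWEI-01", "+86-123456",
                                 source_voip_id="speaker-voip-01",
                                 target_device_id="phone-101"))
```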
  • the mobile phone 101 may also automatically send the above-mentioned first handover request information to the VoIP server 104 through the network 109 according to detected specific conditions, without requiring the user to enter the specific operation described in the above embodiment on the mobile phone 101 as a trigger.
  • the above-mentioned specific condition may be Wi-Fi signal strength in the WLAN network.
  • both the mobile phone 101 and the smart speaker 102 are connected to the same Wi-Fi network, that is, both electronic devices can use the same service set identifier (SSID) in the WLAN network to access the network .
  • since the mobile phone 101 is more portable than the smart speaker 102, when the mobile phone 101 and the smart speaker 102 are connected to the same Wi-Fi network (for example, the Wi-Fi network whose SSID is "123"), the mobile phone 101 can determine, according to the detected change in the Wi-Fi signal strength of that Wi-Fi network, whether to send the first handover request information to the VoIP server 104.
  • for example, when the mobile phone 101 detects that it has been disconnected from the Wi-Fi network, or that the Wi-Fi signal strength of the Wi-Fi network is lower than a preset threshold, the mobile phone 101 may automatically send the first switching request information to the VoIP server 104.
  • the above situation indicates that the user has carried the mobile phone 101 away from the Wi-Fi network and away from the smart speaker 102 at this time.
  • the mobile phone 101 can request the VoIP server 104 to switch the ongoing VoIP call in the smart speaker 102 to the mobile phone 101, so that the user can conveniently continue the VoIP call on the mobile phone 101.
  • the specific condition may be Bluetooth signal strength.
  • a Bluetooth connection can be established between the mobile phone 101 and the smart speaker 102, so that the mobile phone 101 can determine whether to send the first switching request to the VoIP server 104 according to the detected Bluetooth signal strength between the mobile phone 101 and the smart speaker 102 information. For example, when the mobile phone 101 detects that the Bluetooth connection with the smart speaker 102 is disconnected, or when the mobile phone 101 detects that the Bluetooth signal strength of the smart speaker 102 is less than a preset threshold, the mobile phone 101 may automatically send the above to the VoIP server 104 First handover request information. The above situation indicates that the user has carried the mobile phone 101 away from the smart speaker 102 at this time. In this case, the mobile phone 101 can request the VoIP server 104 to seamlessly switch the ongoing VoIP call in the smart speaker 102 to the mobile phone 101, so that the user can conveniently continue the VoIP call on the mobile phone 101.
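  • a minimal sketch of this kind of automatic trigger condition follows, assuming the decision is made from connection state and received signal strength; the threshold values and parameter names are assumptions, not values given in this application.
```python
# Illustrative auto-trigger check: request the handover when the shared Wi-Fi or
# Bluetooth link is lost or its signal falls below a threshold. Thresholds are assumed.

WIFI_RSSI_THRESHOLD_DBM = -75
BT_RSSI_THRESHOLD_DBM = -80

def should_request_handover(wifi_connected: bool, wifi_rssi_dbm: int,
                            bt_connected: bool, bt_rssi_dbm: int) -> bool:
    wifi_lost = (not wifi_connected) or wifi_rssi_dbm < WIFI_RSSI_THRESHOLD_DBM
    bt_lost = (not bt_connected) or bt_rssi_dbm < BT_RSSI_THRESHOLD_DBM
    # Either condition suggests the user has carried the phone away from the speaker.
    return wifi_lost or bt_lost

# Example: still associated with the Wi-Fi network, but the signal is very weak.
print(should_request_handover(True, -82, True, -60))   # True -> send the switch request
```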
  • the mobile phone 101 can be connected to the dock device through a wired connection, and connected to the smart speaker 102 through the dock device.
  • in this case, when a corresponding specific condition is detected, the mobile phone 101 can automatically send the above-mentioned first handover request information to the VoIP server 104 through the network 109.
  • Step S405 The VoIP server 104 receives the first handover request information sent by the mobile phone 101.
  • Step S406 In response to receiving the first switching request information, the VoIP server 104 determines that the source device of the VoIP service corresponding to the first account is the smart speaker 102.
  • the VoIP server 104 may send the first account (that is, HUAWEI-01) carried in the first handover request information to the device management server 103. Since the device management server 103 stores the account number and the device identifications of the electronic devices logged in with the account (as shown in Table 1), the device management server 103 can find, according to the first account sent from the VoIP server 104, each electronic device logged in with the first account, such as the smart speaker 102, which also uses the first account to log in.
  • the device management server 103 may send the device identifications of all the electronic devices logged in using the first account to the VoIP server 104 through the network 109. In other cases, the device management server 103 may also send the device identifications of all electronic devices that support the VoIP service among all electronic devices logged in using the first account to the VoIP server 104.
  • the VoIP server 104 may query the source device that is performing the VoIP service under the first account through the device identifier. For example, the device management server 103 queries the device identification of the smart speaker 102 according to the account HUAWEI-01, and sends the device identification to the VoIP server. The VoIP server 104 can thus determine that the source device performing the VoIP service is the smart speaker 102, that is, the user needs to switch the VoIP service being executed on the smart speaker 102 to the mobile phone 101.
  • if, in addition to the smart speaker 102, the tablet computer 111 is also performing a VoIP service under the first account, the VoIP server 104 can send the device identifications of the two electronic devices to the mobile phone 101 through the network 109.
  • a prompt box 501 may be displayed on the touch screen of the mobile phone 101, and the prompt box 501 contains one or more options listing the multiple source devices on which VoIP services are ongoing. In this way, the user can select, in the prompt box 501, the electronic device whose VoIP service is to be switched to the mobile phone 101.
  • for example, if the user selects the smart speaker 102 in the prompt box 501, the mobile phone 101 may send the device identification of the smart speaker 102 to the VoIP server 104.
  • in this way, the VoIP server 104 can determine that the source device of the VoIP service that the user needs to switch is the smart speaker 102.
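  • the following is an assumed sketch of how the VoIP server could resolve the source device: collect the devices logged in under the account, keep those with an ongoing VoIP call, and fall back to a user prompt (such as prompt box 501) when more than one remains; the helper names are invented for this example.
```python
# Illustrative resolution of the source device; helper names are hypothetical.

def determine_source_device(account: str,
                            logged_in_devices: dict,
                            active_call_devices: set,
                            ask_user) -> str:
    candidates = [d for d in logged_in_devices.get(account, [])
                  if d in active_call_devices]
    if len(candidates) == 1:
        return candidates[0]               # only one device is in a VoIP call
    return ask_user(candidates)            # e.g. prompt box 501 on the mobile phone

logged_in = {"HUAWEI-01": ["speaker-102", "tablet-111", "phone-101"]}
in_call = {"speaker-102", "tablet-111"}
chosen = determine_source_device("HUAWEI-01", logged_in, in_call,
                                 ask_user=lambda c: c[0])   # user picks the speaker
print(chosen)   # speaker-102
```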
  • Step S407 The VoIP server 104 switches the ongoing VoIP call on the smart speaker to the mobile phone 101.
  • the VoIP server 104 may add the mobile phone 101 to the VoIP call between the smart speaker 102 and the mobile phone 107. Specifically, the VoIP server 104 may add the mobile phone 101 to the VoIP service between the smart speaker 102 and the mobile phone 107 according to the VoIP identifier of the mobile phone 101 in the VoIP service. At this time, a multi-party call of the VoIP service is established between the mobile phone 101, the smart speaker 102 and the mobile phone 107.
  • the VoIP identifier of the mobile phone 101 may be carried in the first handover request information of the mobile phone 101.
  • the VoIP server 104 may pre-register the VoIP identification of each electronic device in the VoIP service. In this way, the VoIP server 104 can query the VoIP identifier of the mobile phone 101 in the VoIP service.
  • the mobile phone 101 can send a response message indicating that it has successfully joined the VoIP service to the VoIP server 104 through the network 109.
  • after receiving the above response message, the VoIP server 104 interrupts the VoIP service of the smart speaker 102. After the interruption, only the mobile phone 101 and the mobile phone 107 remain in the VoIP call.
  • in other words, after the VoIP server 104 adds the mobile phone 101 to the VoIP service between the smart speaker 102 and the mobile phone 107, if the mobile phone 101 successfully accesses the VoIP service, it means that the user has been connected, using the mobile phone 101, to the VoIP voice call with the mobile phone 107. Furthermore, the mobile phone 101 can send the response message of successfully joining the VoIP service to the VoIP server 104. After receiving the response message, the VoIP server 104 can remove the smart speaker 102 from the multi-party call composed of the mobile phone 101, the smart speaker 102, and the mobile phone 107, that is, interrupt the VoIP service of the smart speaker 102, thereby switching the VoIP service from the smart speaker 102 to the mobile phone 101 to continue execution.
  • during the above switching process, the VoIP service remains connected on both the smart speaker 102 and the mobile phone 101 until the mobile phone 101 sends the response message to the VoIP server 104, which triggers the VoIP server 104 to interrupt the VoIP service of the smart speaker 102.
  • therefore, the VoIP service is not interrupted when it is switched between the mobile phone 101 and the smart speaker 102, and when the user switches from the smart speaker 102 to the mobile phone 101 to make the VoIP voice call, a seamless connection of the VoIP voice call can be achieved, thereby improving both the efficiency of voice switching between multiple devices and the user experience.
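  • the following is a minimal sketch of this "join first, then drop" order, in which the source device is removed from the call only after the target device confirms that it has joined; the VoipCall class only models call membership and is an assumption made for illustration.
```python
# Minimal sketch of the make-before-break switching order described above.

class VoipCall:
    def __init__(self, participants):
        self.participants = set(participants)

    def add_participant(self, device: str) -> None:
        self.participants.add(device)          # temporary multi-party call

    def on_join_confirmed(self, joined: str, source: str) -> None:
        # Drop the source device only after the target's join response arrives.
        if joined in self.participants:
            self.participants.discard(source)

call = VoipCall({"speaker-102", "phone-107"})
call.add_participant("phone-101")              # phone 101 joins; briefly a three-party call
call.on_join_confirmed("phone-101", "speaker-102")
print(call.participants)                       # {'phone-101', 'phone-107'}
```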
  • the VoIP server 104 may also transfer the VoIP call of the smart speaker 102 to the mobile phone 101 through a call transfer service according to the VoIP identification of the mobile phone 101 in the VoIP service (for example, the phone number of the mobile phone 101), thereby switching the ongoing VoIP call on the smart speaker 102 to the mobile phone 101 to continue.
  • in the above embodiment, the source device of the VoIP service is the smart speaker 102, and the target device of the VoIP service is the mobile phone 101.
  • the mobile phone 101 can recognize the switching requirement of the VoIP service in response to the specific operation of the user.
  • the mobile phone 101 may send the first switching request information to the VoIP server 104, so that the VoIP server 104 seamlessly switches the VoIP service being executed on the smart speaker 102 under the same account to the mobile phone 101.
  • the VoIP service will not be interrupted during the switching process, and users do not need to repeatedly operate between multiple devices, thereby improving the efficiency of voice switching between multiple devices and the user experience.
  • the user may enter a specific operation on the smart speaker 102 to trigger the VoIP server 104 to switch the VoIP call to another electronic device (for example, the mobile phone 101) logged in under the same account.
  • a voice switching method provided in this embodiment includes:
  • Step S601 Both the mobile phone 101 and the smart speaker 102 use the first account to log in to the device management server 103.
  • Step S602 The smart speaker 102 detects an input operation in which the user 108 initiates a voice call.
  • Step S603 In response to the above input operation, the smart speaker 102 establishes a VoIP call with the third electronic device (eg, the mobile phone 107 shown in FIG. 1) through the VoIP server 104.
  • for the specific implementation of steps S601-S603, reference may be made to the relevant descriptions of steps S401-S403 in the foregoing embodiment, so details are not repeated here.
  • Step S604 The smart speaker 102 sends second switching request information to the VoIP server 104.
  • the second switching request information is used to request the VoIP server 104 to switch the ongoing VoIP service in the smart speaker 102 to the mobile phone 101 to continue.
  • the second switching request information may include the above-mentioned first account.
  • the user wishes to switch the ongoing VoIP call in the smart speaker 102 to the mobile phone 101 to continue execution.
  • the user can perform an input operation on the source device (that is, the smart speaker 102) to trigger the VoIP server 104 to switch the VoIP call being executed on the smart speaker 102 to the mobile phone 101 for execution.
  • the above input operation may be a user's voice input.
  • if the voice assistant system of the smart speaker 102 is in an inactive state, the user may first input a wake-up word to the smart speaker 102, for example, "Hello, Little E".
  • in response to the wake-up word, the voice assistant system of the smart speaker 102 is started and collects the user's further voice input, so that the voice assistant system can perform voice recognition processing on that voice input.
  • the user may continue to make voice input to the smart speaker 102.
  • the user's voice input may be: "switch the voice call to my mobile phone", that is, the target device performing the VoIP call is switched to the user's mobile phone 101.
  • the smart speaker 102 may generate second switching request information and send the second switching request information to the VoIP server 104, where the second switching request information is used to request the VoIP server 104 to switch the ongoing VoIP call to the mobile phone 101.
  • the second switching request information may include the first account (for example, HUAWEI-01) currently logged in to the smart speaker 102 and the device identification of the target device (that is, the mobile phone 101).
  • the voice input may also be: "switch voice call".
  • after the smart speaker 102 performs voice recognition on the voice input, it can determine that the user's intention is to switch the VoIP call being performed on the smart speaker 102 to another electronic device of the user, but the voice input does not clearly indicate to which of the user's electronic devices the VoIP call should be switched to continue execution.
  • the second switching request information generated by the smart speaker 102 may include the first account currently logged in by the smart speaker 102, but does not include the device identification of the target device (that is, the mobile phone 101).
  • the smart speaker 102 may be pre-set to switch the VoIP call to another default electronic device of the user, such as the mobile phone 101, to continue execution. Then, when the target device is not indicated in the voice input, the smart speaker 102 may use the user's mobile phone 101 as the target device for performing VoIP calls after switching by default. At this time, the second switching request information generated by the smart speaker 102 may further include the device identification of the default target device (ie, the mobile phone 101).
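  • as a sketch of this behavior under the assumption that a default target device has been preconfigured, the second switching request could be built as follows; the field names and the default device identifier are placeholders for illustration.
```python
# Illustrative assembly of the second switching request: use the target named in the
# voice input when present, otherwise fall back to a preconfigured default device.

DEFAULT_TARGET_DEVICE = "phone-101"     # assumed preconfigured default target

def build_second_switch_request(account: str, spoken_target: str = None) -> dict:
    target = spoken_target or DEFAULT_TARGET_DEVICE
    return {"account": account, "target_device_id": target}

print(build_second_switch_request("HUAWEI-01", "phone-101"))  # "switch ... to my mobile phone"
print(build_second_switch_request("HUAWEI-01"))               # "switch voice call" (no target named)
```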
  • the voice input may also be sent to the voice assistant server 105 through the network 109, and the voice assistant server 105 performs voice recognition on the voice input. Furthermore, the voice assistant server 105 can feed back the voice recognition result to the smart speaker 102, and the smart speaker 102 generates the second switching request information according to the voice recognition result and sends it to the VoIP server 104 through the network 109.
  • the smart speaker 102 may also automatically send the second switching request information to the VoIP server 104 through the network 109 according to the detected specific conditions without requiring the user to input a voice input to the smart speaker 102.
  • the above-mentioned specific condition may be the Wi-Fi signal strength in the WLAN network.
  • Both the mobile phone 101 and the smart speaker 102 are connected to the Wi-Fi network of the same router.
  • if the router detects that the mobile phone 101 has disconnected from the Wi-Fi network at a certain moment, or that the detected Wi-Fi signal of the mobile phone 101 is below a preset threshold, the router can automatically send a notification message to the smart speaker 102, where the notification message indicates that the mobile phone 101 has moved away from the Wi-Fi network.
  • in this case, the smart speaker 102 can automatically send the second switching request information to the VoIP server 104 through the network 109, requesting the VoIP server 104 to switch the ongoing VoIP call on the smart speaker 102 to the mobile phone 101, so that the user can conveniently continue the VoIP call on the mobile phone 101.
  • the user first makes a VoIP call with the mobile phone 107 through the smart speaker 102, and then the user picks up the mobile phone and moves away from the smart speaker 102, while also away from the Wi-Fi network.
  • the above VoIP call is automatically switched to the user's mobile phone to continue the VoIP call, which improves the efficiency of the VoIP call and improves the user's experience.
  • the above specific condition may also be Bluetooth signal strength.
  • a Bluetooth connection can be established between the mobile phone 101 and the smart speaker 102, so that the smart speaker 102 can determine whether to automatically send the above second to the VoIP server 104 according to the detected Bluetooth signal strength between the mobile phone 101 and the smart speaker 102 Switch request information. For example, when the smart speaker 102 detects that the Bluetooth connection with the mobile phone 101 is disconnected, or when the smart speaker 102 detects that the Bluetooth signal strength of the mobile phone 101 is less than a preset threshold, the smart speaker 102 may automatically send to the VoIP server 104 The above second handover request information.
  • the smart speaker 102 may preset a gesture of tapping the smart speaker 102 once for triggering the smart speaker 102 to switch an ongoing VoIP call to the mobile phone, a gesture of tapping the smart speaker 102 twice for triggering the smart speaker 102 to switch the VoIP call to the tablet 111, and so on.
  • Step S605 The VoIP server 104 receives the second switching request information sent by the smart speaker 102.
  • Step S606 In response to receiving the second switching request information, the VoIP server 104 determines that the source device of the VoIP service corresponding to the first account is the smart speaker 102.
  • Step S607 The VoIP server 104 switches the VoIP call to the mobile phone 101 to perform.
  • the VoIP server 104 may determine, from the multiple electronic devices logged in with the first account, that the target device for continuing the VoIP call is the mobile phone 101. After receiving the second switching request information from the smart speaker 102, the VoIP server 104 may send the first account (that is, HUAWEI-01) carried in the second switching request information to the device management server 103. Furthermore, the device management server 103 can query all electronic devices currently logged in with the first account; for example, in addition to the smart speaker 102, the mobile phone 101 and the tablet 111 are also logged in with the first account.
  • the device management server 103 may send the device identifiers of the mobile phone 101 and the tablet 111 to the VoIP server 104 through the network 109, and the VoIP server 104 determines the target devices that will continue to perform the VoIP call instead of the smart speaker 102 in these electronic devices.
  • the VoIP server 104 may query whether the device identifications sent from the device management server 103 include the device identification of the mobile phone 101. If the device identification of the mobile phone 101 is included, the VoIP server 104 may determine that the target device for continuing the VoIP call is the mobile phone 101.
  • the VoIP server 104 may select one of the device identifications sent from the device management server 103 as the device identification of the target device that subsequently performs the VoIP call.
  • the VoIP server 104 may send the device identifications of multiple electronic devices sent from the device management server 103 to the smart speaker 102 through the network 109.
  • the smart speaker 102 may display a prompt box 701 in which options of target devices that can continue to perform VoIP calls under the first account are listed.
  • the user can manually select, in the prompt box 701, the electronic device to which the VoIP call on the smart speaker 102 is to be switched.
  • for example, if the user selects the mobile phone 101, the smart speaker 102 may send the device identification of the mobile phone 101 to the VoIP server 104.
  • the VoIP server 104 can determine that the target device for continuing the VoIP call is the mobile phone 101.
  • the VoIP server 104 may pre-register the VoIP identification of each electronic device in the VoIP service.
  • the VoIP identifier may be a telephone number or IP address used when performing VoIP services. Then, after the VoIP server 104 determines that the mobile phone 101 is the target device for continuing to perform the VoIP call instead of the smart speaker 102, the VoIP identifier of the mobile phone 101 can be queried, for example, the phone number of the mobile phone 101 is 123456. Furthermore, the VoIP server 104 can add the mobile phone 101 to the VoIP call between the smart speaker 102 and the mobile phone 107 according to the phone number. At this time, the VoIP server 104 establishes a VoIP multi-party call between the mobile phone 101, the smart speaker 102, and the mobile phone 107.
  • then, the VoIP server 104 interrupts the VoIP service of the smart speaker 102. After the interruption, only the mobile phone 101 and the mobile phone 107 remain in the VoIP call, that is, the VoIP call has been switched to the mobile phone 101, and the mobile phone 101 and the mobile phone 107 continue the VoIP call.
  • in the above embodiment, the source device of the VoIP service is the smart speaker 102, and the target device of the VoIP service is the mobile phone 101.
  • the smart speaker 102 can recognize the switching requirement of the VoIP service in response to the trigger operation performed by the user.
  • the smart speaker 102 may send second switching request information to the VoIP server 104, so that the VoIP server 104 seamlessly switches the VoIP service executed on the smart speaker 102 under the same account number to the mobile phone 101 to continue execution.
  • the VoIP service will not be interrupted during the switching process, and users do not need to repeatedly operate between multiple devices, thereby improving the efficiency of voice switching between multiple devices and the user experience.
  • the user can use the mobile phone 101 to perform VoIP services with other electronic devices (for example, the mobile phone 107). Subsequently, if the user wishes to switch the VoIP service from the mobile phone 101 to the smart speaker 102, the user can perform a preset input operation on the smart speaker 102 or the mobile phone 101 to trigger the VoIP server 104 to automatically switch the ongoing VoIP call on the mobile phone 101 to the smart speaker 102; or, when the mobile phone 101 or the smart speaker 102 detects a specific condition, the process of switching the VoIP call to the smart speaker 102 is performed automatically.
  • the specific condition detected by the mobile phone 101 or the smart speaker 102 can be one of the following.
  • the specific condition may be the current state information of the mobile phone 101.
  • the mobile phone 101 can collect various kinds of environmental information, phone posture information, and so on through one or more sensors 180. For example, when the mobile phone detects through the acceleration sensor 180E that it has been stationary for more than a preset period of time, and the mobile phone is connected to the same Wi-Fi network as the smart speaker 102, the mobile phone 101 can automatically send switching request information to the VoIP server 104, so that the VoIP server 104 automatically switches the ongoing VoIP call on the mobile phone 101 to the smart speaker 102 (a sketch of this condition check is given after this list).
  • the specific condition may also be a Bluetooth connection established between the mobile phone 101 and the smart speaker 102.
  • the user initially makes a VoIP call with the mobile phone 107 on the mobile phone 101.
  • the mobile phone 101 and the smart speaker 102 can automatically establish a Bluetooth connection.
  • the mobile phone 101 or the smart speaker 102 can automatically send a switching request message to the VoIP server 104, so that the VoIP server 104 automatically switches the ongoing VoIP call on the mobile phone 101 to the smart speaker 102.
  • the user can use the smart speaker 102 to perform audio/video playback services. If the user wishes to switch the audio playback service from the smart speaker 102 to the mobile phone 101, the user can perform a preset specific operation on the mobile phone 101 or the smart speaker 102 to trigger the content server 106 to switch the audio playback service on the smart speaker 102 to the mobile phone 101.
  • a voice switching method provided in this embodiment includes:
  • Step S801 Both the mobile phone 101 and the smart speaker 102 use the same account (for example, the first account) to log in to the device management server 103.
  • the specific method for the mobile phone 101 and the smart speaker 102 to log in to the device management server 103 using the first account can be referred to the above-mentioned related embodiments, so it will not be repeated here.
  • Step S802 the smart speaker 102 receives a user's voice input, and the voice input is used to instruct the smart speaker 102 to play audio B;
  • Step S803 In response to the voice input, the smart speaker 102 determines a playback instruction of audio B.
  • Step S804 According to the above playback instruction, the smart speaker 102 obtains playback information from the content server 106 and plays audio B.
  • when the user wants to use the smart speaker 102 to play a certain audio B (for example, the song Silence), the user can say "I want to listen to the song Silence" to the smart speaker 102. The smart speaker 102 recognizes the voice input as a voice instruction according to the configured voice assistant system, and the voice instruction instructs the speaker to play the song Silence; the smart speaker 102 then sends request information for playing the audio to the content server 106 through the network 109. After receiving the request information, the content server 106 provides the smart speaker 102 with the playback service of the song Silence. The smart speaker 102 plays the song Silence according to the playback information acquired from the content server 106, and the song can be played through one or more speaker arrays 370A of the smart speaker 102 (a sketch of this playback-request flow is given after this list).
  • the smart speaker 102 may carry the voice input in the recognition request and send it to the voice assistant server 105 through the network 109.
  • the voice assistant server 105 can use a voice recognition algorithm to perform voice recognition on the above voice input to obtain a play instruction of the song Silence.
  • the voice assistant server 105 sends the identified playback instruction to the content server 106, and the content server 106 provides the smart speaker 102 with the playback service of the song Silence.
  • the user may also trigger the smart speaker 102 to obtain a certain audio playback instruction through other preset methods.
  • this embodiment of the present application places no restrictions on the triggering manner. For example, when it is detected that the user taps the smart speaker 102, indicating that the user wishes to continue playing the most recently listened-to program (for example, program C), the smart speaker 102 may generate a playback instruction for program C and send it to the content server 106.
  • the user can select the audio to be played on the touch screen, and then trigger the smart speaker 102 to generate a play instruction of the audio and send it to the content server 106.
  • the content server 106 may be used to maintain resource information of audio content such as music and programs. After the voice assistant server 105 sends the recognized audio playback instruction to the content server 106, the content server 106 may search for the resource information of audio B.
  • the resource information may be the audio resource of Audio B, or may be the playback address or download address of Audio B, and so on.
  • the content server 106 sends the resource information of audio B to the smart speaker 102, so that the smart speaker 102 plays audio B according to the resource information. Taking the resource information of Audio B as the playback address of Audio B as an example, the content server 106 may send the playback address of Audio B to the smart speaker 102. In this way, the smart speaker 102 can obtain the audio resource of audio B according to the playback address, and then perform the audio playback service of audio B.
  • the content server 106 may store information such as audio resources and the device identification of the device requesting playback to facilitate further processing.
  • Step S805 The mobile phone 101 sends a playback switching request to the content server 106.
  • Step S806 In response to the playback switching request, the content server 106 determines that the source device for playing audio B is the smart speaker 102.
  • Step S807 The content server 106 switches the audio playback service to the mobile phone 101 to continue.
  • the playback switching request may include a first account, and both the mobile phone 101 and the smart speaker 102 use the first account to log in to the device management server 103.
  • the playback switching request may also include the device identification of the mobile phone 101 and / or the device identification of the smart speaker 102.
  • the user can input a preset specific operation on the mobile phone 101.
  • the mobile phone 101 may send a playback switching request to the content server 106, and the content server 106 determines that the source device that is playing audio B is the smart speaker 102.
  • the content server 106 may send the first account to the device management server 103, and the device management server 103 queries which electronic devices are currently logged in to the first account. Furthermore, the content server 106 may determine the electronic device that is playing audio B (such as the smart speaker 102 described above) as the source device that plays audio content B.
  • of course, if the playback switching request carries a user-specified source device (for example, the smart speaker 102), and the smart speaker 102 is playing audio content B under the first account, the content server 106 may determine that the source device for playing audio B under the first account is the smart speaker 102.
  • after the content server 106 determines that the source device for playing audio B is the smart speaker 102, in order that the mobile phone 101 can continue playing audio B from the current playback position after the subsequent switch, the content server 106 can query the playback progress of audio B on the smart speaker 102.
  • the content server 106 sends audio B resource information and playback progress to the mobile phone 101.
  • the mobile phone 101 continues to play audio B according to the resource information and playback progress of audio B.
  • the content server 106 may send the playback progress and resource information of the audio B to the mobile phone 101.
  • the mobile phone 101 can obtain audio B according to the resource information of audio B, and the mobile phone 101 can continue to play audio B from the position the smart speaker 102 had reached according to the playback progress of audio B, thereby achieving seamless switching of the audio playback service between the smart speaker 102 and the mobile phone 101 (a sketch of this handoff is given after this list).
  • the content server may automatically interrupt the audio playback service of playing the audio B on the smart speaker 102.
  • the mobile phone 101 may send a playback event of audio B to the content server 106 through the network 109; in response to the playback event, the content server 106 interrupts the audio playback service of playing audio B on the smart speaker 102.
  • the audio B originally played on the smart speaker 102 may not be automatically interrupted. Then, when the mobile phone 101 starts playing audio B, the mobile phone 101 can automatically send the audio B playing event to the content server 106. In this way, the content server 106 can stop the audio playback service of the smart speaker 102 after receiving the playback event. For example, in response to the playback event reported by the mobile phone 101, the content server 106 may send a stop playing instruction to the smart speaker 102, so that the smart speaker 102 stops playing audio B on the smart speaker 102 in response to the stop playing instruction.
  • for example, the playback event of audio B may be sent to the content server 106 only after the mobile phone 101 has played audio B for a short time (for example, 3 seconds).
  • during those first few seconds in which the mobile phone 101 plays audio B, the smart speaker 102 also plays audio B at the same time. In this way, even if some audio is missed due to transmission delay when the audio playback service is switched to the mobile phone 101, the user can rely on the audio B played on the smart speaker 102 to obtain the complete audio content, thereby improving voice switching efficiency and the user experience.
  • the source device of the audio playback service is the smart speaker 102
  • the target device of the audio playback service is the mobile phone 101.
  • the mobile phone 101 can recognize the switching requirement of the audio playback service in response to the trigger operation performed by the user.
  • the mobile phone 101 may send a first playback switching request to the content server 106, and the content server 106 seamlessly switches the audio playback service executed on the smart speaker 102 under the same account to the mobile phone 101 to continue execution.
  • the audio playback service will not be interrupted, and the user does not need to repeatedly operate between multiple devices, thereby improving the efficiency of voice switching between multiple devices and the user's experience.
  • the user can use the smart speaker 102 to perform audio playback services. If the user wishes to switch the audio playback service from the smart speaker 102 to the mobile phone 101, the user can also perform a specific preset operation on the smart speaker 102 to trigger the content server 106 to switch the audio playback service on the smart speaker 102 to the mobile phone 101.
  • the specific operation may be a user's voice input to the smart speaker 102, for example, the user says to the smart speaker 102 "switch the song Silence to play on the mobile phone".
  • the manner in which the smart speaker 102 processes this voice input can refer to the related description in the above embodiment, and details are not described herein again.
  • this embodiment provides a voice switching system 900.
  • the system 900 may include: a first electronic device 901 (such as the mobile phone 101 in FIG. 1), a second electronic device 902 (such as the smart speaker 102 in FIG. 1), the device management server 903 (for example, the device management server 103 in FIG. 1), and the VoIP server 904 (for example, the VoIP server 104 in FIG. 1).
  • the system 900 may be used to implement the technical solution of voice switching involved in the foregoing embodiments, which will not be repeated here.
  • the above system 900 may further include a voice assistant server 905 (eg, voice assistant server 105 in FIG. 1) and a content server 906 (eg, content server 106 in FIG. 1).
  • the function of the voice assistant server 905 is the same as the function of the voice assistant server 105 in the above embodiment
  • the function of the content server 906 is the same as the function of the content server 106 in the above embodiment.
  • the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or software function unit.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device including a server, a data center, and the like integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, DVD), or a semiconductor medium (for example, a solid-state hard disk), or the like.
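The following is a minimal Python sketch of the target-device selection logic referenced in the bullets above (querying the device management server for all devices logged in with the first account, then choosing a target). The interfaces, method names such as query_devices_by_account and prompt_user_on_device, and the request field names are illustrative assumptions; the patent does not prescribe an implementation.

    # Hypothetical sketch: how the VoIP server might pick the target device for a
    # switch request sent by the source device (e.g., the smart speaker 102).
    # All names below are illustrative assumptions, not part of the patent.

    def select_target_device(voip_server, device_mgmt_server, switch_request):
        # The request carries the first account (e.g., "HUAWEI-01") and, optionally,
        # the device identifier of the desired target device.
        account = switch_request["account"]
        requested_target = switch_request.get("target_device_id")   # may be absent
        source_id = switch_request["source_device_id"]

        # Ask the device management server for every device logged in with the account.
        candidates = device_mgmt_server.query_devices_by_account(account)
        candidates = [d for d in candidates if d != source_id]      # exclude the source

        if requested_target and requested_target in candidates:
            return requested_target            # explicit target named in the request
        if len(candidates) == 1:
            return candidates[0]               # only one possible target
        # Otherwise, let the user choose on the source device (prompt box 701).
        return voip_server.prompt_user_on_device(source_id, options=candidates)

The same pattern also covers the variant where the first switching request comes from the mobile phone and the server must identify the source device instead.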
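A hedged sketch of the add-then-drop switching step referenced above: the VoIP server looks up the target device's pre-registered VoIP identifier, joins it to the existing call as a temporary multi-party call, and interrupts the source device only after the target confirms it has joined, so the remote party never hears a gap. All server method names are assumptions for illustration.

    # Illustrative only: moving an ongoing call from the source device (smart
    # speaker 102) to the target device (mobile phone 101) without interrupting
    # the remote party (mobile phone 107).

    def switch_call(voip_server, call_id, source_device_id, target_device_id):
        # The server is assumed to have pre-registered each device's VoIP identifier
        # (a phone number or IP address), e.g., "123456" for the mobile phone 101.
        target_voip_id = voip_server.lookup_voip_identifier(target_device_id)

        # 1. Add the target to the existing call, creating a temporary multi-party
        #    call among the source, the target, and the remote party.
        voip_server.add_party(call_id, target_voip_id)

        # 2. Wait for the target to confirm that it has successfully joined.
        if voip_server.wait_for_join_ack(call_id, target_voip_id, timeout_s=10):
            # 3. Only then remove the source device, so the call is never interrupted.
            voip_server.remove_party(call_id, source_device_id)
            return True
        return False  # target never joined; leave the call on the source device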
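A sketch of the phone-side condition detection mentioned above (stationary for more than a preset period on the same Wi-Fi network as the speaker, or a newly established Bluetooth connection to the speaker). The 60-second threshold and all helper method names are assumptions; the patent leaves these choices open.

    import time

    # Hypothetical phone-side logic that decides when to send a switching request
    # automatically, without an explicit user operation.

    STATIONARY_THRESHOLD_S = 60   # assumed preset period; the patent does not fix a value

    def should_auto_switch(phone, speaker):
        # Condition 1: the phone has been stationary longer than the preset period
        # (detected via the acceleration sensor 180E) and is connected to the same
        # Wi-Fi network as the smart speaker 102.
        stationary_for = time.time() - phone.last_motion_timestamp()
        same_wifi = phone.current_wifi_ssid() == speaker.current_wifi_ssid()
        if stationary_for > STATIONARY_THRESHOLD_S and same_wifi:
            return True

        # Condition 2: a Bluetooth connection to the speaker has just been
        # established (for example, the user has arrived home).
        if phone.bluetooth_connected_to(speaker.device_id):
            return True
        return False

    def maybe_request_switch(phone, speaker, voip_server, account):
        if should_auto_switch(phone, speaker):
            voip_server.send_switch_request(account=account,
                                            source_device_id=phone.device_id,
                                            target_device_id=speaker.device_id)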
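A compact sketch of the playback-request flow referenced above: the smart speaker (or the voice assistant server on its behalf) forwards the recognized play instruction, and the content server answers with resource information, assumed here to be a playback address. Field and method names are illustrative only.

    # Illustrative flow: the smart speaker asks the content server for audio B.
    # Field names and method names are assumptions for illustration only.

    def handle_play_instruction(speaker, content_server, audio_title):
        # The speaker (or the voice assistant server on its behalf) sends a play request.
        request = {"audio": audio_title, "device_id": speaker.device_id}
        resource = content_server.lookup_resource(request)
        # The resource information may be the audio itself, a playback address, or a
        # download address; a playback address is assumed here.
        speaker.play_from_url(resource["playback_address"])
        # The content server may record the requesting device to support later switching.
        content_server.record_playback(audio_title, speaker.device_id)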
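A sketch of the playback handoff referenced above: the content server determines the source device under the first account, queries its playback progress, hands the resource information and progress to the mobile phone, and stops the smart speaker only after the phone reports a playback event (optionally delayed a few seconds so the two devices overlap briefly). All server and device method names are assumptions.

    # Hypothetical handoff of the audio playback service from the smart speaker 102
    # to the mobile phone 101. All server/device methods are illustrative assumptions.

    def switch_playback(content_server, device_mgmt_server, switch_request):
        account = switch_request["account"]              # e.g., "HUAWEI-01"
        target = switch_request["target_device_id"]      # the mobile phone 101

        # Determine which device logged in with the account is playing audio B (the source).
        devices = device_mgmt_server.query_devices_by_account(account)
        source = content_server.find_playing_device(devices)   # the smart speaker 102

        # Query the current progress on the source so the target can resume seamlessly,
        # and hand over the resource information (e.g., a playback address) plus progress.
        progress = content_server.query_progress(source)
        resource = content_server.resource_info(source)
        content_server.send_to_device(target, {"resource": resource, "progress": progress})
        return source

    def on_playback_event(content_server, source_device_id):
        # Called when the target reports that it has started playing audio B.
        # In one variant the target delays this report (e.g., by about 3 seconds) so
        # both devices overlap briefly and no audio is lost to transmission delay.
        content_server.stop_playback(source_device_id)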

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

一种语音切换方法、电子设备及系统,其中语音切换方法可以包括手机或语音助理设备根据用户的操作向VoIP服务器发送语音切换请求信息,VoIP服务器根据该请求信息将手机上正在进行的语音通话切换到手机附近的语音助理设备例如智能音箱。基于该方法,手机的用户可以脱离手机,并在不中断语音通话的情况下通过语音助理设备来继续进行语音通话。该方法提高了语音切换的效率,方便了用户,改善了用户体验。

Description

语音切换方法、电子设备及系统
本申请要求于2018年10月09日提交中国国家知识产权局、申请号为201811172286.6、发明名称为“一种在智能音箱与智能手机间无缝切换语音通话和音频播放的方法”的中国专利申请的优先权，其全部内容通过引用结合在本申请中。
技术领域
本申请涉及移动通信领域,尤其涉及一种语音切换方法、电子设备及系统。
背景技术
随着智能家居技术的发展,一个用户或家庭中往往具备多个能够互相通信的电子设备。例如,用户可使用同一个帐号登录手机以及家中的智能音箱、智能电视等电子设备,使得同一帐号下的多个电子设备组成一个局域网(local area network,LAN),该局域网内的各电子设备之间均可通过服务器互相通信。
这样一来,用户可以根据实际需求选择使用局域网中的不同电子设备实现某一功能。例如,用户在家中时可以使用音箱从服务器中获取音频资源并播放,用户离开家时可以使用手机从服务器中获取音频资源并播放。
但是,当用户需要将音箱上播放的音频接续在手机上播放时,通常需要用户在音箱上停止播放正在播放的音频,并重新在手机中打开该音频继续播放。显然,这种在多个电子设备之间进行语音切换时的流程较为繁琐,在切换过程中正在播放的语音会发生中断,这样就使得电子设备之间切换音频播放的效率极大降低。
发明内容
本申请的目的在于提供一种语音切换方法、电子设备及系统，可使得电子设备之间切换语音（例如VoIP通话或者音频播放）的效率提高，并提高用户体验。
上述目标和其他目标将通过独立权利要求中的特征来达成。进一步的实现方式在从属权利要求、说明书和附图中体现。
第一方面,提供一种语音切换的方法,可以包括:第二电子设备(例如智能音箱)检测到用户的语音输入;响应于上述语音输入,上述第二电子设备通过VoIP服务器与第三电子设备建立VoIP通话;第一电子设备(例如手机)向上述VoIP服务器发送第一切换请求信息,上述第一切换请求信息用于请求上述VoIP服务器将在上述第二电子设备上正在进行的上述VoIP通话切换到上述第一电子设备上;其中,上述第一切换请求信息中包括第一账号(例如HUAWEI-01),上述第一账号用于登录到设备管理服务器;上述VoIP服务器接收到上述第一切换请求信息;响应上述第一切换请求信息, 上述VoIP服务器确定与上述第一帐号对应的VoIP业务的源设备为上述第二电子设备;上述VoIP服务器将在上述第二电子设备上正进行的上述VoIP通话切换到上述第一电子设备上。
在上述技术方案中,VoIP业务的源设备为智能音箱,VoIP业务的目标设备为手机。手机可向VoIP服务器发送第一切换请求信息,使得VoIP服务器将同一帐号下在智能音箱上正在执行的VoIP业务无缝切换到手机上。切换过程中不会发生VoIP业务中断的现象,用户也无需在多个设备之间反复操作,从而提高了多设备之间的语音切换效率和用户的使用体验。
在一种可能的实现方式中,在上述方法之前还可以包括:第一电子设备和第二电子设备均使用上述第一帐号登录到设备管理服务器。这样就表明这两个电子设备都属于同一个用户的电子设备,而且也表明这两个电子设备都用了同一个云服务提供商的服务。
在一种可能的实现方式中，第一电子设备向VoIP服务器发送第一切换请求信息，具体可以包括：当第一电子设备检测到用户的特定操作时，则响应于上述特定操作，第一电子设备向VoIP服务器发送上述第一切换请求信息。上述特定操作为以下操作的其中一种：翻转手机、指关节敲击屏幕、双击电源键、预设的语音输入或预设的滑动手势。在本方案下，只有当用户做出一特定操作时，才能触发第一电子设备发送第一切换请求信息，这样就可以基于用户的实际需求来触发语音切换的流程，使得第一电子设备更加智能化，同时也提高了用户体验。
在另外一种可能的实现方式中,第一电子设备向VoIP服务器发送第一切换请求信息,可以具体包括:当第一电子设备检测到特定条件时,则响应于上述特定条件,第一电子设备可以向VoIP服务器发送上述第一切换请求信息。上述特定条件可以为:WLAN网络中的Wi-Fi信号强度或蓝牙信号强度;例如:当第一电子设备检测到上述Wi-Fi信号强度低于一预设阈值时,第一电子设备向VoIP服务器发送上述第一切换请求信息;或者,当第一电子设备检测到第二电子设备的蓝牙信号强度低于一预设阈值时,第一电子设备向VoIP服务器发送上述第一切换请求信息。在本方案下,第一电子设备可以根据检测到的特定条件来自动触发发送第一切换请求信息的流程,这样就减少了用户的参与,更加智能化的实行语音切换的方法,进一步提高了语音切换的效率。
在一种可能的实现方式中,上述方法还可以包括:第一电子设备向VoIP服务器发送成功加入VoIP通话的响应消息;在接到上述响应消息后,VoIP服务器中断在第二电子设备上的VoIP业务。在第一电子设备例如手机加入到该VoIP通话后,此时有第一电子设备、第二电子设备、第三电子设备之间的三方VoIP通话。如果要切换到第一电子设备上继续执行,那么中断第二电子设备的VoIP业务是一个比较节省网络资源的方法。
在一种可能的实现方式中,VoIP服务器确定与第一帐号对应的VoIP业务的源设备为第二电子设备,具体可以包括:VoIP服务器向设备管理服务器发送上述第一帐号;设备管理服务器根据上述第一帐号确定使用该第一帐号登录的至少一个电子设备;设备管理服务器将上述至少一个电子设备的设备标识发送给VoIP服务器;VoIP服务器根据上述设备标识确定在第一帐号下正在进行VoIP通话的源设备为第二电子设备。
在一种可能的实现方式中,上述确定在第一帐号下正在进行VoIP通话的源设备为第二电子设备,具体可以包括:当VoIP服务器根据上述设备标识确定在第一帐号下正在进行VoIP通话的源设备有至少两个电子设备时,VoIP服务器将上述至少两个电子设备的设备标识发送给第一电子设备;第一电子设备上显示至少两个选项;上述至少两个选项用于表示上述至少两个电子设备;第一电子设备检测到用户对其中一个选项的选择操作;其中,上述选项表示第二电子设备;响应于上述选择操作, 第一电子设备将第二电子设备的设备标识发送给VoIP服务器;VoIP服务器根据接收到的第二电子设备的设备标识确定,在第一帐号下正在进行VoIP通话的源设备为第二电子设备。
第二方面,提供一种语音切换系统,该系统包括第一电子设备、第二电子设备、设备管理服务器和VoIP服务器;其中:该第二电子设备用于在检测到用户的语音输入时,通过该VoIP服务器与第三电子设备建立VoIP通话;该第一电子设备用于向该VoIP服务器发送第一切换请求信息,该第一切换请求信息用于请求该VoIP服务器将在该第二电子设备上正在进行的该VoIP通话切换到该第一电子设备上;其中,该第一切换请求信息中包括第一账号,该第一账号用于登录到该设备管理服务器;该VoIP服务器用于接收该第一切换请求信息,并确定与该第一帐号对应的VoIP业务的源设备为该第二电子设备;该VoIP服务器还用于将在该第二电子设备上正进行的该VoIP通话切换到该第一电子设备上。
在一种可能的实现方式中,该第一电子设备和该第二电子设备均使用该第一帐号登录到该设备管理服务器。
在一种可能的实现方式中,该第一电子设备还用于:当检测到用户的特定操作时,向该VoIP服务器发送该第一切换请求信息;该特定操作为以下操作的其中一种:翻转手机、指关节敲击屏幕、双击电源键、预设的语音输入或预设的滑动手势。
在一种可能的实现方式中,该第一电子设备向该VoIP服务器发送第一切换请求信息,具体包括:当该第一电子设备检测到特定条件时,该第一电子设备向该VoIP服务器发送该第一切换请求信息。
在一种可能的实现方式中，该特定条件为：WLAN网络中的Wi-Fi信号强度或蓝牙信号强度；其中：当该第一电子设备检测到该Wi-Fi信号强度低于一预设阈值时，该第一电子设备向该VoIP服务器发送该第一切换请求信息；或者，当该第一电子设备检测到该第二电子设备的蓝牙信号强度低于一预设阈值时，该第一电子设备向该VoIP服务器发送该第一切换请求信息。
在一种可能的实现方式中,该第一电子设备还用于向该VoIP服务器发送成功加入VoIP通话的响应消息;该VoIP服务器还用于在接到该响应消息后,中断在该第二电子设备上的VoIP业务。
在一种可能的实现方式中，该VoIP服务器确定与该第一帐号对应的VoIP业务的源设备为该第二电子设备，具体包括：该VoIP服务器向该设备管理服务器发送该第一帐号；该设备管理服务器根据该第一帐号确定使用该第一帐号登录的至少一个电子设备；该设备管理服务器将该至少一个电子设备的设备标识发送给该VoIP服务器；该VoIP服务器根据该设备标识确定在该第一帐号下正在进行VoIP通话的源设备为该第二电子设备。
在一种可能的实现方式中,该VoIP服务器还用于当根据该设备标识确定在该第一帐号下正在进行VoIP通话的源设备有至少两个电子设备时,将该至少两个电子设备的设备标识发送给该第一电子设备;该第一电子设备还用于显示至少两个选项;该至少两个选项用于表示该至少两个电子设备;该第一电子设备检测到用户对其中一个选项的选择操作;其中,该选项表示该第二电子设备;该第一电子设备还用于将该第二电子设备的设备标识发送给该VoIP服务器;该VoIP服务器还用于根据接收到的该第二电子设备的设备标识确定,在该第一帐号下正在进行VoIP通话的源设备为该第二电子设备。
在一种可能的实现方式中,该第一电子设备为手机,该第二电子设备为配置有语音助理系统的智能音箱。
第三方面，还提供一种语音切换的电子设备，该电子设备具有实现上述方法中第一电子设备行为的功能。所述功能可以通过硬件实现，也可以通过硬件执行相应的软件实现。所述硬件或软件可以包括一个或多个与上述功能相对应的模块。
第四方面，还提供一种语音切换的电子设备，该电子设备具有实现上述方法中第二电子设备行为的功能。所述功能可以通过硬件实现，也可以通过硬件执行相应的软件实现。所述硬件或软件可以包括一个或多个与上述功能相对应的模块。
应当理解的是,说明书中对技术特征、技术方案、优点或类似语言的描述并不是暗示在任意的单个实施例中可以实现所有的特点和优点。相反,可以理解的是对于特征或优点的描述意味着在至少一个实施例中包括特定的技术特征、技术方案或优点。因此,本说明书中对于技术特征、技术方案或优点的描述并不一定是指相同的实施例。进而,还可以任何适当的方式组合以下各个实施例中所描述的技术特征、技术方案和优点。本领域技术人员将会理解,无需特定实施例的一个或多个特定的技术特征、技术方案或优点即可实现实施例。在其他实施例中,还可在没有体现所有实施例的特定实施例中识别出额外的技术特征和优点。
附图说明
图1为一实施例中语音切换系统的实施场景的示意图;
图2为一实施例中第一电子设备(如手机)的结构示意图;
图3为一实施例中第二电子设备(如智能音箱)的结构示意图;
图4为一实施例中提供的语音切换方法的流程示意图;
图5为一实施例中电子设备101的用户界面示意图;
图6为另一实施例中提供的语音切换方法的流程示意图;
图7为另一实施例中电子设备102的用户界面示意图;
图8为另一实施例中提供的语音切换方法的流程示意图;
图9为一实施例中语音切换系统的结构示意图。
具体实施方式
本申请以下实施例中所使用的术语只是为了描述特定实施例的目的,而并非旨在作为对本申请的限制。如在本申请的说明书和所附权利要求书中所使用的那样,单数表达形式“一个”、“一种”、“所述”、“上述”、“该”和“这一”旨在也包括例如“一个或多个”这种复数表达形式,除非其上下文中明确地有相反指示。还应当理解,本申请中使用的术语“和/或”是指并包含一个或多个所列出项目的任何或所有可能组合。
在本说明书中描述的参考“一个实施例”或“一些实施例”等意味着在本申请的至少一个实施例中包括结合该实施例描述的特定特征、结构或特点。由此,在本说明书中的不同之处出现的语句“在一个实施例中”、“在一些实施例中”、“在其他一些实施例中”、“在另外一些实施例中”等不是必然都参 考相同的实施例,而是意味着“一个或多个但不是所有的实施例”,除非是以其他方式另外特别强调。术语“包括”、“包含”、“具有”及它们的变形都意味着“包括但不限于”,除非是以其他方式另外特别强调。
以下实施例中所用,根据上下文,术语“当…时”可以被解释为意思是“如果…”或“在…后”或“响应于确定…”或“响应于检测到…”。类似地,根据上下文,短语“在确定…时”或“如果检测到(所陈述的条件或事件)”可以被解释为意思是“如果确定…”或“响应于确定…”或“在检测到(所陈述的条件或事件)时”或“响应于检测到(所陈述的条件或事件)”。
应当理解的是,虽然术语“第一电子设备”、“第二电子设备”等可能在本文中用来描述各种电子设备,但是这些电子设备不应当被这些术语限定。这些术语只是用来将一个电子设备与另一电子设备区分开。例如,第一电子设备可以被命名为第二电子设备,并且类似地,第二电子设备也可以被命名为第一电子设备,而不背离本申请的范围。第一电子设备和第二电子设备二者都是电子设备,但是它们可以不是同一电子设备,在某些场景下也可以是同一电子设备。
以下介绍了电子设备（例如第一电子设备、第二电子设备等）、用于这样的电子设备的用户界面、和用于使用这样的电子设备的实施例。在一些实施例中，电子设备可以是还包含其它功能诸如个人数字助理和/或音乐播放器功能的便携式电子设备，诸如手机、平板电脑、具备无线通讯功能的可穿戴电子设备（如智能手表）等。便携式电子设备的示例性实施例包括但不限于搭载（此处原文为操作系统商标图片）或者其它操作系统的便携式电子设备。上述便携式电子设备也可以是其它便携式电子设备，诸如具有触控面板或触敏表面的膝上型计算机（Laptop）等。还应当理解的是，在其他一些实施例中，上述电子设备也可以不是便携式电子设备，而是台式计算机。
下面将结合附图对本申请中各种实施例进行详细描述。
如图1所示,本申请一实施例提供一种语音切换系统100,该语音切换系统100可以包括一个或多个电子设备,例如第一电子设备(如图1中的电子设备101)和第二电子设备(如图1中的电子设备102)。其中,第一电子设备的具体结构,在后续实施例中将结合图2进行详细描述;第二电子设备的具体结构,在后续实施例中将结合图3进行详细描述。
如图1所示,电子设备101可以通过一个或多个网络109与电子设备102连接(例如有线或无线)。示例性地,该一个或多个通信网络109可以是局域网,也可以是广域网(wide area networks,WAN)例如互联网。该一个或多个通信网络109可使用任何已知的网络通信协议来实现,上述网络通信协议可以是各种有线或无线通信协议,诸如以太网、通用串行总线(universal serial bus,USB)、火线(FIREWIRE)、全球移动通讯系统(global system for mobile communications,GSM)、通用分组无线服务(general packet radio service,GPRS)、码分多址接入(code division multiple access,CDMA)、宽带码分多址(wideband code division multiple access,WCDMA),时分码分多址(time-division code division multiple access,TD-SCDMA),长期演进(long term evolution,LTE)、蓝牙、无线保真(wireless fidelity,Wi-Fi)、基于互联网协议的语音通话(Voice over Internet Protocol,VoIP)、或任何其他合适的通信协议。
上述语音切换系统100还可以包括设备管理服务器103,该设备管理服务器103用于管理注册的至少一个电子设备(如电子设备101和电子设备102)。示例性地,当电子设备101通过网络109向设备管理服务器103发送访问请求时,该设备管理服务器103可以对该电子设备进行鉴权(例如验证帐号、 密码是否匹配);在鉴权通过后,该设备管理服务器103可以允许电子设备101访问设备管理服务器103上与之对应的数据例如存储空间等。又例如例如设备管理服务器103为电子设备(101,102)配置存储空间,这样电子设备(101,102)可以通过网络109将存储在电子设备(101,102)存储器中的数据(例如图片、视频等)发送给设备管理服务器103,再由设备管理服务器将接收到的数据保存在为电子设备(101,102)配置的储存空间中。示例性地,设备管理服务器103还可以通过网络109对电子设备(101,102)进行参数配置。
其中,帐号可以是指电子设备登录到设备管理服务器103所使用的凭证。电子设备(101,102)需要使用一个帐号登录到设备管理服务器103,才能使用电子设备的一些功能,例如电子设备需要登录帐号,才能使用指纹识别、联系人同步、找回手机等功能,在没有登录帐号的情况下,无法使用上述功能。当用户在电子设备上输入帐号及密码进行登录时,验证信息可以通过网络109发送给设备管理服务器103进行验证。可以理解的是,由于上述设备管理服务器主要是用于对电子设备的帐号进行鉴权,因此该云服务器可以知道哪些电子设备登录了同一个帐号。
在一些实施例中,电子设备101和电子设备102可以是属于同一个用户108的两个不同的电子设备。例如托马斯有一部智能手机,还有一个语音助理设备(例如智能音箱);其中,该智能语音设备配置有语音助理系统(以下实施例中对给语音助理系统有详细说明),该语音助理设备可以接收用户的语音输入并对该语言输入进行分析等功能。这两个电子设备都可以使用托马斯所拥有的帐号(例如HUAWEI-01)访问设备管理服务器103。设备管理服务器103可以管理各个帐号和使用该帐号访问的电子设备之间的访问权限;另外,同一个帐号也可在设备管理服务器103所管理的2个或2个以上的电子设备上同时登录。这样,登录同一帐号的第一电子设备(例如电子设备101)和第二电子设备(例如电子设备102)可通过设备管理服务器103实现数据交互等。可以理解的是,该用户108可以通过其他电子设备使用上述帐号登录到设备管理服务器103中,并对存储在其上的电子设备的访问权限进行调整,例如删除电子设备101使用帐号HUAWEI-01登录的权限,或,增加另一电子设备使用上述帐号登录的权限等。
示例性地,表1中是存储在设备管理服务器103中的一些与登录的电子设备相关的信息。从中可以看到,有两个电子设备(设备名称为手机101、智能音箱102)使用同一帐号(HUAWEI-01)登录到了设备管理服务器103中。这两个电子设备登录设备管理服务器103时可以携带有各自的设备标识(例如表1中的IMEI),或者设备管理服务器103在电子设备登录后,向这些电子设备请求其对应的设备标识,以便后续对这些电子设备进行管理。上述设备标识用于唯一标识电子设备,以便网络中的其他电子设备或者服务器辨识它们。常见的设备标识包括国际移动设备身份码(international mobile equipment identity,IMEI)、国际移动用户识别码(international mobile subscriber identification number,IMSI)、移动设备识别码(mobile equipment identifier,MEID)、序列号(serial number,SN)、集成电路卡识别码(Integrate circuit card identity,ICCID)、以及媒体介入控制层(media access control,MAC)地址或其他能够唯一标识电子设备的标识符等。这样,设备管理服务器103就可以通过不同的设备标识来识别不同的电子设备,尽管这些电子设备是以同一个帐号登录的。
表1（原文中以图片形式给出，列出了使用同一帐号HUAWEI-01登录到设备管理服务器103的电子设备（手机101、智能音箱102）及其设备标识等信息，表格的具体内容无法在文本中还原）
在一些实施例中,电子设备101(例如手机101)可以通过上述网络109与电子设备107(例如手机107)进行语音通信例如VoIP通话。其中,电子设备107可以使用第二账号(例如表1中的HUAWI-02)登录到设备管理服务器103中。
在一些实施例中,上述语音切换系统100还可以包括语音助理服务器105。语音助理服务器105可以通过上述网络109与外部服务(例如流媒体服务、导航服务、日历服务、电话服务、照片服务等)进行通信以完成任务或采集信息。上述语音助理服务器105可以是语音助理系统(图中未示出)的一部分,该语音助理系统可以根据客户端-服务器模型来实现。示例性地,该语音助理系统可以包括在电子设备(例如如图1中的电子设备102)上执行的客户端侧部分(例如语音助理客户端),以及在语音助理服务器105上执行的服务器侧部分(例如语音助理系统)。语音助理客户端可以通过上述网络109与语音助理系统进行通信。语音助理客户端提供客户端侧功能诸如面向用户的输入和输出处理以及与服务器侧的语音助理系统的通信。语音助理系统可以为一个或多个语音助理客户端提供服务器侧功能,该一个或多个的语音助理客户端各自位于相应的电子设备上(例如电子设备101和电子设备102)。
示例性地,如图1所示的第一电子设备(如电子设备101)可以为手机101,第二电子设备(如电子设备102)可以为语音助理设备例如智能音箱102。
在一些实施例中,手机101和智能音箱102均可具有语音通信功能。例如,手机101和智能音箱102可提供VoIP业务。那么,如图1所示,语音切换系统100中还可以包括VoIP服务器104。VoIP服务器104可用于实现VoIP业务的呼叫、接听、三方通话以及呼叫转移等与语音通信相关业务。这样,手机101(或智能音箱102)可通过VoIP服务器104与其他具有语音通信功能的电子设备进行语音通信。
在一些实施例中,语音助理服务器105可以向VoIP服务器104提供语音识别结果。示例性地,用户108向智能音箱102输入“打电话给约翰”的语音信号后,智能音箱102可将采集到的该语音信号发送给语音助理服务器105进行语音识别。语音助理服务器105识别出与上述语音信号对应的控制指令为:呼叫联系人约翰。进而,语音助理服务器105可向VoIP服务器104发送呼叫联系人约翰的指令。响应于该指令,VoIP服务器104可向约翰的电子设备(例如手机)发起语音呼叫请求,当约翰接受该语音呼叫请求后便可建立智能音箱102与约翰的手机之间的语音通话,实现VoIP业务。
在另外一些实施例中,如图1所示,上述语音切换系统100中还可以包括内容服务器106。该内容服务器106可用于根据用户108的请求向智能音箱102(或手机101)提供音乐、视频等流媒体内容。例如,用户108在向智能音箱102发出“播放歌曲Silence”的语音信号后,智能音箱102可将采集到的该语音信号通过网络109发送给语音助理服务器105进行语音识别。语音助理服务器105识别出与上述语音信号对应的控制指令为:获取歌曲Silence的媒体资源。进而,语音助理服务器105可向内容服务器106发送请求信息,以获取歌曲Silence的媒体资源。响应于语音助理服务器105发送的请求信息,内容服务器106查找到歌曲Silence的媒体资源后可将该播放地址返回给语音助理服务器105,并由语音助理服务器105将该媒体资源发送给智能音箱102,使得智能音箱102根据该媒体资源获取歌曲Silence的地址并播放或者保存。
当然,手机101(或智能音箱102)也可以通过网络109与内容服务器106直接进行交互。示例性地,手机101(或智能音箱102)可以根据用户108的输入,向内容服务器106发送播放歌曲Silence的请求信息;内容服务器106在接收到上述请求信息后,查找到歌曲Silence的媒体资源,然后可将该媒体资源返回给手机101(或智能音箱102);手机101(或智能音箱102)根据收到的该媒体资源获取歌曲Silence的地址并播放,也可以将该歌曲保存在手机101的存储器中或智能音箱102的存储器中。
在其他一些实施例中,手机101与智能音箱102登录同一帐号后,当手机101执行语音通话业务(例如上述VoIP业务)时,如果用户108希望将此时的语音通话业务切换到智能音箱102上,则用户108可在手机101或智能音箱102上做一个预设的特定操作,诸如特定手势或语音输入等,从而触发VoIP服务器104将手机101上的语音通话业务切换至智能音箱102,以便该语音通话业务继续在智能音箱102上进行。相应的,当智能音箱102正在执行语音通话时,用户108也可在手机101或智能音箱102上做一个预设的特定操作,以触发VoIP服务器104将智能音箱102上的语音通话业务切换至手机101,以便该语音通话业务继续在手机101上进行。也就是说,在上述实施例中,用户108只需对电子设备做上述特定操作,VoIP服务器104便可将正在进行的语音通话业务从第一电子设备自动切换到第二电子设备中。这样一来,整个切换过程中不会发生语音通话业务中断的现象,用户也无需在多个电子设备之间反复操作,从而提高了多个电子设备之间的语音切换效率和用户的使用体验。
在另一些实施例中,当手机101与智能音箱102登录同一帐号(例如HUAWEI-01)后,当手机101正在播放音频/视频时,如果用户108希望将该播放的音频/视频切换(例如无缝切换)到智能音箱102上继续播放,用户108可在手机101或智能音箱102上做一个预设的输入操作,从而触发内容服务器106将手机101上正在播放的音频/视频业务切换至智能音箱102上继续播放。同样的,当智能音箱102执行音频/视频业务时,用户108也可在手机101或智能音箱102上做一个预设的输入操作,以触发内容服务器106将智能音箱102上正在播放的音频/视频切换至手机101上继续播放。
需要说明的是,第一电子设备除了可以是手机101外,还可以是平板电脑、具备无线通讯功能的可穿戴电子设备(如智能手表)、虚拟现实设备等支持音频/视频业务或语音通话业务的电子设备,以下实施例中对第一电子设备的具体形式不做特殊限制。第二电子设备除了智能音箱102外,还可以为智能电视、平板电脑、笔记本电脑、台式计算机等支持音频/视频业务的电子设备,以下实施例中对第二电子设备的具体形式不做特殊限制。在一些实施例中,第一电子设备可以是手机,那么,第二电子设备可以是配置有语音助理系统的智能音箱或笔记本电脑。
示例性的,图2示出了第一电子设备即图1中的电子设备101(例如手机)的结构示意图。
电子设备101可以包括处理器110,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线1,天线2,移动通信模块150,无线通信模块160,音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,传感器180,按键190,马达191,指示器192,摄像头193,显示屏194,以及用户标识模块(subscriber identification module,SIM)卡接口195等。可以理解的是,本实施例示意的结构并不构成对电子设备101的具体限定。在本申请另一些实施例中,电子设备101可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件,或软件和硬件的组合实现。
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。在一些实施例中,电子设备101也可以包括一个或多个处理器110。其中,控制器可以是电子设备101的神经中枢和指挥中心。控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了电子设备101系统的效率。
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。其中,USB接口130是符合USB标准规范的接口,具体可以是Mini USB接口,Micro USB接口,USB Type C接口等。USB接口130可以用于连接充电器为电子设备101充电,也可以用于电子设备101与外围设备之间传输数据。也可以用于连接耳机,通过耳机播放音频。
可以理解的是,本发明实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对电子设备101的结构限定。在本申请另一些实施例中,电子设备101也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
充电管理模块140用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块140可以通过USB接口130接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块140可以通过电子设备101的无线充电线圈接收无线充电输入。充电管理模块140为电池142充电的同时,还可以通过电源管理模块141为电子设备101供电。
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,显示屏194,摄像头193,和无线通信模块160等供电。电源管理模块141还可以用于监测电池容量,电池循环次数,电池健康状态(漏电,阻抗)等参数。在其他一些实施例中,电源管理模块141也可以设置于处理器110中。在另一些实施例中,电源管理模块141和充电管理模块140也可以设置于同一个器件中。
电子设备101的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。天线1和天线2用于发射和接收电磁波信号。电子设备101中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块150可以提供应用在电子设备101上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一些实施例中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一些实施例中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块被设置在同一个器件中。
调制解调处理器可以包括调制器和解调器。其中,调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处理器处理后,被传递给应用处理器。应用处理器通过音频设备(不限于扬声器170A,受话器170B等)输出声音信号,或通过显示屏194显示图像或视频。在一些实施例中,调制解调处理器可以是独立的器件。在另一些实施例中,调制解调处理器可以独立于处理器110,与移动通信模块150或其他功能模块设置在同一个器件中。
无线通信模块160可以提供应用在电子设备101上的包括无线局域网(wireless local area networks,WLAN)(如Wi-Fi网络),蓝牙(Bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。
无线通信模块160具体可以用于与第二电子设备(例如电子设备102)建立短距离无线通信链路,以便两者之间相互进行短距离无线数据传输。示例性地,上述短距离无线通信链路可以是蓝牙通信链路、Wi-Fi通信链路、NFC通信链路等。因此,无线通信模块160具体可以包括蓝牙通信模块、Wi-Fi通信模块或NFC通信模块。
在一些实施例中,电子设备101的天线1和移动通信模块150耦合,天线2和无线通信模块160耦合,使得电子设备101可以通过无线通信技术与网络以及其他设备通信。所述无线通信技术可以包括GSM,GPRS,CDMA,WCDMA,TD-SCDMA,LTE,GNSS,WLAN,NFC,FM,和/或IR技术等。上述GNSS可以包括全球卫星定位系统(global positioning system,GPS),全球导航卫星系统(global navigation satellite system,GLONASS),北斗卫星导航系统(beidou navigation satellite system,BDS),准天顶卫星系统(quasi-zenith satellite system,QZSS)和/或星基增强系统(satellite based augmentation systems,SBAS)。
电子设备101通过GPU,显示屏194,以及应用处理器等可以实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行指令以生成或改变显示信息。
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode的,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot  light emitting diodes,QLED)等。在一些实施例中,电子设备101可以包括1个或N个显示屏194,N为大于1的正整数。
电子设备101可以通过ISP,一个或多个摄像头193,视频编解码器,GPU,一个或多个显示屏194以及应用处理器等实现拍摄功能。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现电子设备101的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备101的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐、照片、视频等数据文件保存在外部存储卡中。
内部存储器121可以用于存储一个或多个计算机程序,该一个或多个计算机程序包括指令。处理器110可以通过运行存储在内部存储器121的上述指令,从而使得电子设备101执行本申请一些实施例中所提供的语音切换方法,以及各种功能应用以及数据处理等。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统;该存储程序区还可以存储一个或多个应用程序(比如图库、联系人等)等。存储数据区可存储电子设备101使用过程中所创建的数据(比如照片,联系人等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。在一些实施例中,处理器110可以通过运行存储在内部存储器121的指令,和/或存储在设置于处理器110中的存储器的指令,来使得电子设备101执行本申请实施例中所提供的语音切换方法,以及各种功能应用及数据处理。
电子设备101可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。其中,音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。在一些实施例中,音频模块170可以设置于处理器110中,或将音频模块170的部分功能模块设置于处理器110中。扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。电子设备101可以通过扬声器170A收听音乐,或收听免提通话。受话器170B,也称“听筒”,用于将音频电信号转换成声音信号。当电子设备101接听电话或语音信息时,可以通过将受话器170B靠近人耳接听语音。麦克风170C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。当拨打电话或发送语音信息时,用户可以通过人嘴靠近麦克风170C发声,将声音信号输入到麦克风170C。电子设备101可以设置至少一个麦克风170C。在另一些实施例中,电子设备101可以设置两个麦克风170C,除了采集声音信号,还可以实现降噪功能。在另一些实施例中,电子设备101还可以设置三个,四个或更多麦克风170C,实现采集声音信号,降噪,还可以识别声音来源,实现定向录音功能等。耳机接口170D用于连接有线耳机。耳机接口170D可以是USB接口130,也可以是3.5mm的开放移动电子设备平台(open mobile terminal platform,OMTP)标准接口,还可以是美国蜂窝电信工业协会(cellular telecommunications industry association of the USA,CTIA)标准接口。
传感器180可以包括压力传感器180A,陀螺仪传感器180B,气压传感器180C,磁传感器180D,加速度传感器180E,距离传感器180F,接近光传感器180G,指纹传感器180H,温度传感器180J,触摸传感器180K,环境光传感器180L,骨传导传感器180M等。
其中,压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A,电极之间的电容改变。电子设备101根据电容的变化确定压力的强度。当有触摸操作作用于显示屏194,电子设备101根据压力传感器180A检测所述触摸操作强度。电子设备101也可以根据压力传感器180A的检测信号计算触摸的位置。在一些实施例中,作用于相同触摸位置,但不同触摸操作强度的触摸操作,可以对应不同的操作指令。例如:当有触摸操作强度小于第一压力阈值的触摸操作作用于短消息应用图标时,执行查看短消息的指令。当有触摸操作强度大于或等于第一压力阈值的触摸操作作用于短消息应用图标时,执行新建短消息的指令。
陀螺仪传感器180B可以用于确定电子设备101的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定电子设备101围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器180B检测电子设备101抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消电子设备101的抖动,实现防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景等。
加速度传感器180E可检测电子设备101在各个方向上(一般为三轴)加速度的大小。当电子设备101静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。
距离传感器180F,用于测量距离。电子设备101可以通过红外或激光测量距离。在一些实施例中,拍摄场景,电子设备101可以利用距离传感器180F测距以实现快速对焦。
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备101通过发光二极管向外发射红外光。电子设备101使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备101附近有物体。当检测到不充分的反射光时,电子设备101可以确定电子设备101附近没有物体。电子设备101可以利用接近光传感器180G检测用户手持电子设备101贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器180G也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器180L用于感知环境光亮度。电子设备101可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测电子设备101是否在口袋里,以防误触。
指纹传感器180H(也称为指纹识别器),用于采集指纹。电子设备101可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。另外,关于指纹传感器的其他记载可以参见名称为“处理通知的方法及电子设备”的国际专利申请PCT/CN2017/082773,其全部内容通过引用结合在本申请中。
触摸传感器180K,也可称触控面板或触敏表面。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称触控屏。触摸传感器180K用于检测作用于其上或附近 的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器180K也可以设置于电子设备101的表面,与显示屏194所处的位置不同。
骨传导传感器180M可以获取振动信号。在一些实施例中,骨传导传感器180M可以获取人体声部振动骨块的振动信号。骨传导传感器180M也可以接触人体脉搏,接收血压跳动信号。在一些实施例中,骨传导传感器180M也可以设置于耳机中,结合成骨传导耳机。音频模块170可以基于所述骨传导传感器180M获取的声部振动骨块的振动信号,解析出语音信号,实现语音功能。应用处理器可以基于所述骨传导传感器180M获取的血压跳动信号解析心率信息,实现心率检测功能。
按键190包括开机键,音量键等。按键190可以是机械按键,也可以是触摸式按键。电子设备101可以接收按键输入,产生与电子设备101的用户设置以及功能控制有关的键信号输入。
SIM卡接口195用于连接SIM卡。SIM卡可以通过插入SIM卡接口195,或从SIM卡接口195拔出,实现和电子设备101的接触和分离。电子设备101可以支持1个或N个SIM卡接口,N为大于1的正整数。SIM卡接口195可以支持Nano SIM卡,Micro SIM卡,SIM卡等。同一个SIM卡接口195可以同时插入多张卡。所述多张卡的类型可以相同,也可以不同。SIM卡接口195也可以兼容不同类型的SIM卡。SIM卡接口195也可以兼容外部存储卡。电子设备101通过SIM卡和网络交互,实现通话以及数据通信等功能。在一些实施例中,电子设备101采用eSIM,即:嵌入式SIM卡。eSIM卡可以嵌在电子设备101中,不能和电子设备101分离。
下面实施例将详细介绍第二电子设备的结构。应当理解的是,在一些实施例中,上述第二电子设备可以与第一电子设备(例如图2中的电子设备101)的结构相同,因此在此不再赘述第二电子设备的结构。在另外一些实施例中,第二电子设备(例如电子设备102)可以是语音助理设备;因此,第二电子设备的结构也可以与第一电子设备的结构不相同。示例性的,图3示出了在另外一些实施例中的第二电子设备的结构示意图。
图3为一些实施例中的第二电子设备(例如电子设备102)的结构示意图。示例性地,电子设备102具体可以为一语音助理设备(例如智能音箱102),该语音助理设备配置有语音助理系统。
示例性地,该语音助理系统可以是指解译口头和/或文本形式的自然语言输入以推断用户意图(例如,识别对应于自然语言输入的任务类型)并基于推断出的用户意图来执行动作(例如,执行对应于所识别的任务类型的任务)的任何信息处理系统。例如,为遵照推断出的用户意图来执行动作,该系统可执行以下操作中的一个或多个:通过设计用以实现所推断出的用户意图的步骤和参数来识别任务流(例如,识别任务类型),将来自推断出的用户意图的具体要求输入到任务流中,通过调用程序、方法、服务、应用程序编程接口(application programming interface,API)等来执行任务流(例如,发送请求至服务提供方);以及生成对用户的听觉(例如语音)和/或视觉形式的输出响应。具体地讲,该语音助理系统一旦启动,就能够接受至少部分地为自然语言命令、请求、声明、讲述和/或询问的形式的用户请求。通常,用户请求要么寻求语音助理系统作出信息性回答,要么寻求语音助理系统执行任务。对用户请求的令人满意的响应通常是提供所请求的信息性回答、执行所请求的任务、或这两者的组合。例如,用户可向语音助理系统提出诸如“我现在在哪里?”之类的问题。基于用户的当前位置,语音助理可能回答“你在中央公园西门附近。”用户还可请求执行任务,例如通过叙述“请邀请我的朋友下周来参加我的生日聚会。”作为响应,语音助理系统可通过生成语音输出“好 的,马上”来确认请求,并且然后将合适的日历邀请从用户的电子邮件地址发送到用户的电子通讯录或联系人列表中列出的用户的每个朋友。在一些实施例中,存在与语音助理系统进行交互以请求信息或执行各种任务的许多其他方法。除了提供口头应答并进行程序化动作之外,语音助理系统还可提供其他视觉或音频形式的应答(例如,像文本、警报、音乐、视频、动画等)。
如图3所示,电子设备102具体可以包括处理器310,外部存储器接口320,存储器321,USB接口330,充电管理模块340,电源管理模块341,电池342,天线343,网络通信接口350,输入/输出(I/O)接口351,无线通信模块360,音频模块370,一个或多个扬声器阵列370A,一个或多个麦克风阵列370B,一个或多个传感器380,按键390,马达391,指示器392,摄像头393,以及显示屏394等。这些部件通过一条或多条通信总线或信号线彼此通信。
可以理解的是,本实施例示意的结构并不构成对电子设备102的具体限定。在另一些实施例中,电子设备102可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件,或软件和硬件的组合实现。
在一些实施例中,图3中的外部存储器接口320、USB接口330、充电管理模块340、电源管理模块341、电池342、天线343、无线通信模块360、音频模块370、一个或多个传感器380、按键390、马达391、指示器392、摄像头393以及显示屏394等,可以与图2中电子设备101的一些部件具有相同或类似的结构和/或功能,因此对图3中上述部件的具体描述可以参见图2及相关实施例中相应的描述,在此不再赘述。
网络通信接口350可以包括一个或多个有线通信端口,或,一个或多个无线发射/接收电路。一个或多个有线通信端口经由一个或多个有线接口例如以太网、USB、火线等接收和发送通信信号。无线电路通常从通信网络及其他电子设备接收RF信号或光学信号,以及将RF信号或光学信号发送至通信网络及其他电子设备。无线通信可使用多种通信标准、协议和技术中的任一种,诸如GSM、CDMA、WCDMA、TDMA、蓝牙、Wi-Fi、VoIP或任何其他合适的通信协议。网络通信接口350使电子设备102能够通过网络,诸如互联网、无线网络诸如蜂窝网络、无线局域网等,与其他电子设备(例如手机101)或网络侧的服务器进行通信。
存储器321可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器等。在一些实施例中,处理器310可以通过运行存储在存储器321的指令,或存储在设置于处理器310中的存储器的指令,来使得电子设备102执行本申请实施例中所提供的语音切换方法,以及各种功能应用及数据处理。
在一些实施例中,存储器321可以存储程序、模块、指令和数据结构,这些程序、模块、指令和数据结构包括以下中的全部或子集:操作系统321A、通信模块321B、用户界面模块321C、一个或多个应用程序321D、以及语音助理模块321E。一个或多个处理器310执行这些程序、模块和指令,并从数据结构读取数据或将数据写到数据结构。
其中,操作系统321A(例如,Darwin、RTXC,LINUX、UNIX、OS X、WINDOWS、或嵌入式操作系统)包括用于控制和管理一般系统任务(例如,存储器管理、存储设备控制、电力管理等)的各种软件部件或驱动器,并促成各种硬件、固件与软件部件之间的通信。通信模块321B促成语音助理设备300与其他电子设备之间通过网络通信接口350进行的通信。例如,通信模块321B可与图2中所示的电子设备101进行通信。通信模块321B还可以包括各种软件部件,该各种软件部件可以用于处理由 网络通信接口350所接收的数据。用户界面模块321C经由I/O接口351(例如从与I/O接口351连接的键盘、触摸屏或麦克风)接收来自用户的命令或输入,并将用户界面显示在显示器上。应用程序321D包括被配置为由一个或多个处理器310执行的程序或模块。例如,如果语音助理系统在语音助理设备上独立实施,则应用程序321D可包括诸如游戏、日历、导航或邮件等应用程序。如果语音助理系统在服务器上实施,则应用程序321D可包括例如资源管理、诊断或调度等应用程序。
存储器321还存储语音助理模块321E。在一些具体的实施例中,语音助理模块321E可以包括以下子模块、或者它们的子集或超集:I/O处理模块321F、语音转文本(speech to text,STT)处理模块321G、自然语言处理模块321H、对话流处理模块321I、任务流处理模块321J以及服务处理模块321K。该语音助理模块321E主要用于通过上述子模块的信息交互来实现本申请实施例中的语音助理系统。
在一些实施例中,通过使用处理模块(例如I/O处理模块321F、STT处理模块321G、自然语言处理模块321H、对话流处理模块321I、任务流处理模块321J或服务处理模块321K)、数据以及在语音助理模块321E中实现的模型,语音助理设备300可以执行以下操作中的至少一种:识别从用户接收的自然语言输入中所表达的用户的意图;积极地引出并获得用于充分推断用户的意图所需的信息(例如,通过消除字词、名称、意向等的歧义);确定用于实现推断出的意图的任务流(task flow);以及执行任务流以实现推断出的意图。在另外一些实施例中,当由于各种原因而未向或不能向用户提供令人满意的响应时,语音助理设备300也采取其他适当的行动。
在一些实施例中,I/O处理模块321F可以通过一个或多个麦克风阵列370B接收用户的命令(例如语音命令)或者输入(例如语音输入);I/O处理模块321F也可以通过I/O接口351从所连接的其他设备(例如麦克风、触摸屏或键盘等)中接收来自用户的命令或输入;I/O处理模块321F还可以通过一个或多个扬声器阵列370A或者指示器392或显示屏394等提供对用户输入的响应,从而与用户进行交互;I/O处理模块321F还可以通过网络通信接口350与其他电子设备(例如图2中的电子设备101)交互以获得用户输入(例如语音输入)并提供对用户输入的响应。I/O处理模块321F可以在接收到用户输入时或在接收到用户输入之后不久,获得与来自其他电子设备的用户输入相关联的上下文信息。上下文信息包括特定于用户的数据、词汇,或与用户输入相关的偏好。在另一些实施例中,上下文信息还可以包括在接收到用户请求时电子设备(例如电子设备101)的软件和硬件状态,或与在接收到用户请求时用户的周围环境相关的信息。在一些实施例中,I/O处理模块321F还向用户发送与用户请求有关的跟进问题,并从用户接收回答。在一些实施例中,当用户请求被I/O处理模块321F接收到且用户请求包含语音输入时,I/O处理模块321F可以将该语音输入转发至STT处理模块321G以进行语音文本转换。
在一些实施例中,STT处理模块321G通过I/O处理模块321F接收语音输入。STT处理模块321G可以使用各种声音模型和语言模型来将语音输入识别为音素的序列,并最终将其识别为以一种或多种语言书写的字词(words)或符号(tokens)的序列。STT处理模块321G可以使用任何合适的语音识别技术、声音模型以及语言模型,诸如隐马尔可夫模型、基于动态时间规整的语音识别以及其他统计或分析技术,来加以实施本申请实施例。在一些实施例中,语音转文本处理可至少部分地由第三方服务执行或在电子设备上执行。一旦STT处理模块321G获得语音转文本处理的结果(例如字词或符号的序列),它便将结果传送至自然语言处理模块321H以进行意图推断。
自然语言处理模块321H(也可称之为自然语言处理器)得到由STT处理模块321G生成的字词或符号的序列(“符号序列”),并尝试将该符号序列与由语音助理模块321E所识别的一个或多个“可执行 意图(actionable intents)”相关联。“可执行意图”表示可由语音助理模块321E执行的任务,并且具有在任务流模型中实施的相关联任务流。相关联任务流是语音助理模块321E为了执行任务而采取的一系列经编程的动作和步骤。语音助理系统的能力范围取决于已在任务流模型中实现并存储的任务流的数量和种类;或换言之,取决于语音助理模块321E所识别的“可执行意图”的数量和种类。然而,语音助理系统300的有效性还取决于语音助理模块321E从以自然语言表达的用户请求中推断出正确的“一个或多个可执行意图”的能力。
在一些实施例中,除从STT处理模块321G获得的字词或符号的序列之外,自然语言处理模块321H还可以接收与用户请求相关联的上下文信息(例如,来自I/O处理模块321F)。自然语言处理模块321H还可以使用上下文信息来明确、补充和或进一步定义在从STT处理模块321G接收的符号序列中包含的信息。上下文信息包括例如用户偏好、用户设备的硬件和/或软件状态,在用户请求之前、期间或之后不久收集的传感器信息,语音助理系统与用户之间的先前交互(例如对话),等等。
在一些实施例中,自然语言处理模块321H具体可以包括知识本体(ontology)、词汇、用户数据和分类模块。其中,知识本体为包含多个节点的分级结构,每个节点表示与“可执行意图”或其他“属性”中的一者或多者相关的“可执行意图”或“属性”。如上所述,“可执行意图”表示语音助理系统300能够执行的任务(例如可执行或可被进行的任务)。“属性”代表与可执行意图或另一属性的子方面相关联的参数。知识本体中的每个节点,与跟由节点所代表的属性或可执行意图有关的一组字词和/或短语相关联。与每个节点相关联的相应的字词和/或短语是所谓的与节点相关联的“词汇”。与每个节点相关联的相应的一组字词和/或短语,可存储在与由节点所表示的属性或可执行意图相关联的词汇索引中。词汇索引可以包括不同语言的字词和短语。在一些实施例中,自然语言处理器321H接收来自STT处理模块321G的符号序列(例如文本串),并且确定符号序列中的字词牵涉哪些节点。
用户数据包括特定于用户的信息,诸如特定于用户的词汇、用户偏好、用户地址、用户的默认语言和第二语言、用户的联系人列表,以及每位用户的其他短期或长期信息等。自然语言处理器321H可使用用户数据来补充用户输入中所包含的信息以进一步限定用户意图。例如,针对用户请求“邀请我的朋友参加我的生日派对”,自然语言处理器321H能够访问用户数据以确定“朋友”是哪些人以及“生日派对”将于何时何地举行,而不需要用户在其请求中明确地提供此类信息。
自然语言处理器321H还可以包括分类模块。在一些实施例中,分类模块确定例如文本串中一个或多个词语中的每个词语是否均为实体、活动或位置中的一者。一旦基于用户请求识别出可执行意图,自然语言处理器321H便生成结构化查询以表示所识别的可执行意图。在一些实施例中,自然语言处理器321H可以用所接收的上下文信息来填充结构化查询的一些参数。例如,如果用户请求“我附近的”寿司店,则自然语言处理器321H可用来自语音助理设备300的GPS坐标来填充结构化查询中的位置参数。
在一些实施例中,自然语言处理器321H将结构化查询传送至任务流处理模块321J(也可称为任务流处理器)。任务流处理器321J被配置为执行以下中的一个或多个:从自然语言处理器321H接收结构化查询,完成结构化查询,以及执行用户的最终请求所需的动作。在一些实施例中,完成这些任务所必需的各种过程在任务流处理模块321J中的任务流模型中提供。任务流模型包括用于获取来自用户的附加信息的过程,以及用于执行与可执行意图相关联的动作的任务流。如上所述,为了完成结构化查询,任务流处理器321J可能需要发起与用户的附加对话,以便获得附加信息,和/或消除可能模糊的话语的歧义。当此类交互有必要时,任务流处理器321J调用对话处理模块321I(也可称为对话 处理器)以进行与用户的对话。在一些实施例中,对话处理模块321I确定如何(和/或何时)向用户询问附加信息,并接收和处理用户的响应。在一些实施例中,通过I/O处理模块351将问题提供给用户并从用户接收回答。例如,对话处理模块321I经由扬声器阵列370A和/或显示屏394向用户呈现对话输出,并接收来自用户的输入。
在一些实施例中,对话处理器321I可以包括消歧模块。消歧模块用于消除一个或多个模糊词语(例如,对应于与数字照片相关联的语音输出的文本串中的一个或多个模糊词语)的歧义。在一些实施例中,消歧模块识别出一个或多个词语中的第一词语具有多个候选含义,提示用户关于第一词语的附加信息,响应于提示接收来自用户的附加信息并且根据附加信息来识别与第一词语相关联的实体、活动或位置。
一旦任务流处理器321J已针对可执行意图完成结构化查询,任务流处理器321J就继续执行与可执行意图相关联的最终任务。因此,任务流处理器321J可以根据结构化查询中包含的特定参数来执行任务流模型中的步骤和指令。在一些实施例中,任务流处理器321J在服务处理模块321K(也可称为服务处理器)的辅助下完成用户输入中所请求的任务,或者提供用户输入中所请求的信息性回答。例如,服务处理器321K可代替任务流处理器321J发起电话呼叫、设置日历条目、调用地图搜索、调用用户设备上安装的其他应用程序或与这些其他应用程序交互,以及调用第三方服务(例如,餐厅预订门户网站、社交网站或服务、银行门户网站等)或与这些第三方服务交互。在一些实施例中,可通过服务处理模块321K中服务模型中的相应服务模型指定每项服务所需的协议和应用程序编程接口(application programming interface,API)。服务处理器321K针对服务访问适当的服务模型,并依据服务模型根据该服务所需的协议和API生成针对该服务的请求。
在一些实施例中,自然语言处理器321H、对话处理器321K以及任务流处理器321J共同且反复地使用以推断并限定用户的意图、获得信息以进一步明确并提炼用户意图、并最终生成响应(例如,将输出提供至用户,或完成任务)以满足用户的意图。在一些实施例中,在已执行满足用户请求所需的所有任务之后,语音助理系统制定确认响应,并通过I/O处理模块321F将该响应发送回用户。如果用户请求寻求信息性回答,则确认响应向用户呈现所请求的信息。
在另一些实施例中,I/O接口351可以将语音助理设备300的I/O设备诸如键盘、触摸屏、麦克风等耦接至用户界面模块321C。I/O接口351与用户界面模块321C结合接收用户输入(例如,语音输入、键盘输入、触摸输入等)并相应地对这些输入进行处理。
可以理解的是,上述第二电子设备可以跨多个计算机分布,从而形成一个客户端-服务器的语音助理系统。在该语音助理系统中,语音助理系统的模块和功能中的一些被划分成服务器部分和客户端部分。其中客户端部分可以位于第二电子设备(例如电子设备102)上,并通过网络109与服务器部分(例如语音助理服务器105)通信,如图1中所示。在一些实施例中,语音助理系统可以为图1中所示的语音助理服务器105的实施例。在另一些实施例中,语音助理系统可以在电子设备102中实现,从而消除了对客户端-服务器系统的需求。应当指出,上述语音助理系统仅为一个实例,并且该语音助理系统可具有比示出的更多或更少的部件、可组合两个或更多个部件、或可具有部件的不同配置或布局。
以下将以手机101作为第一电子设备,以智能音箱102作为第二电子设备,结合附图详细阐述本申请实施例提供的语音切换方法。
示例性的,以VoIP业务举例,本申请实施例中可在手机101与智能音箱102之间切换正在进行的该VoIP业务。其中,切换前正在进行VoIP业务的电子设备可以称为VoIP业务的源设备,该源设备可以为手机101,也可以为智能音箱102。而切换后继续进行VoIP业务的设备可以称为VoIP业务的目标设备。当源设备为手机101时,VoIP业务的目标设备可以是智能音箱102;当源设备为智能音箱102时,VoIP业务的目标设备可以是手机101。
以下将结合实施例的具体场景阐述如何在手机101和智能音箱102之间切换VoIP业务。
在一些应用场景中,用户108在手机101和智能音箱102上使用同一帐号登录设备管理服务器103后,用户10可通过智能音箱102与其他电子设备(例如图1中的电子设备107)进行VoIP业务。如果用户108希望将该VoIP业务从智能音箱102切换至手机101上,用户可在手机101上执行预设的输入操作,以触发VoIP服务器104通过网络109将智能音箱102上正在进行的VoIP业务切换至手机101上。
示例性的,如图4所示,本实施例提供一种语音切换方法,该方法可以在上述实施例中所涉及的电子设备、服务器中实现,可以包括如下步骤:
步骤S401:手机101和智能音箱102均使用第一帐号登录到设备管理服务器103。
其中,上述第一帐号(例如HUAWEI-01)可以是某一应用(例如酷狗音乐)的帐号,也可以是某一服务(例如华为云服务)的帐号。
例如,如果手机101和智能音箱102中均安装有酷狗音乐这个APP,那么,用户108可在手机101的酷狗音乐APP中使用帐号(HUAWEI-01)登录该APP对应的设备管理服务器103;并且,用户108也可在智能音箱102的酷狗音乐APP中使用同一帐号(HUAWEI-01)登录该APP对应的设备管理服务器103。又例如,如果手机101和智能音箱102均为华为品牌的电子设备,则手机101和智能音箱102均可向用户108提供华为云服务。那么,用户可在手机101中使用帐号(HUAWEI-01)登录华为云服务的设备管理服务器103,并且,用户108也可在智能音箱102中使用同一帐号(HUAWEI-01)登录华为云服务的设备管理服务器103。
设备管理服务器103中保存有帐号以及用该用帐号登录的电子设备的设备信息例如设备标识(如表1中所示)。当手机101和智能音箱102均使用同一帐号登录设备管理服务器103后,设备管理服务器103可建立第一帐号与使用第一帐号的电子设备之间的对应关系。这样,在设备管理服务器103中就可以查询到在某一帐号下登录的电子设备具体有哪些。
步骤S402:智能音箱102检测到用户108发起语音通话的输入操作。
步骤S403:响应于上述输入操作,智能音箱102通过VoIP服务器104建立与第三电子设备(例如图1所示的手机107)之间的VoIP通话。
示例性地,如果用户108需要使用VoIP业务与联系人(例如苏珊)通话,用户108可在智能音箱102上发起VoIP语音通话的输入操作。该输入操作具体可以是用户108在智能音箱102的显示屏394上输入苏珊的电话号码的操作。
在另一些实施例中,该输入操作也可以是用户108向智能音箱102进行的语音输入操作。例如,用户108可以对着智能音箱102说“给苏珊打电话”。智能音箱102采集到该语音信号后,可利用智能音箱102中的语音助理系统对该语音信号进行语音识别,得到与该语音信号对应的控制指令。例如,上述语音助理系统根据上述语音信号识别出控制指令为“呼叫联系人苏珊”。进而,智能音箱102可根 据通讯录中苏珊的电话号码,向VoIP服务器104发送呼叫该电话号码的呼叫请求,使得VoIP服务器104呼叫联系人苏珊的手机107。后续,如果被呼叫的手机107成功接听了智能音箱102发起的呼叫,则手机107可向VoIP服务器104发送成功接听的消息,从而建立智能音箱102与被呼叫方(即联系人苏珊的手机107)之间的VoIP通话。这样,用户可使用智能音箱102与联系人进行VoIP通话。
可以理解的是,本申请各个实施例中,智能音箱102是可以通过一个或多个麦克风阵列370B来从不同方位采集用户的语音输入(也即语音信号),智能音箱102是可以通过一个或多个扬声器阵列370A来播放语音助理系统对语音识别结果的语音反馈。
在一些实施例中,智能音箱102采集到用户的语音信号后,也可以将该语音信号发送给语音助理服务器105,由语音助理服务器105对该语音信号进行语音识别,得到与该语音信号对应的控制指令。当识别出的控制指令与VoIP业务相关时,语音助理服务器105可将识别出的控制指令发送给VoIP服务器104,由VoIP服务器104按照上述方法建立智能音箱102与被呼叫方之间的VoIP业务。当然,用户除了通过输入语音信号发起VoIP业务的呼叫操作外,还可以通过执行预设的手势等方式发起VoIP业务的呼叫操作,本申请实施例对此不做任何限制。
步骤S404:手机101向VoIP服务器104发送第一切换请求信息,该第一切换请求信息用于请求VoIP服务器104将智能音箱102中正在进行的VoIP业务切换到手机101上继续进行。
在一些实施例中,该第一切换请求信息中可以包括第一帐号。
在一些应用场景下,用户可能希望将上述智能音箱102中正在进行的VoIP通话切换到手机101上执行。例如,智能音箱102位于用户108的家中,用户在家中时可以使用智能音箱102与其他电子设备(例如手机107)进行VoIP通话。当用户离开家时,用户需要使用便携性更好的手机101继续与手机107进行VoIP通话。此时,用户需要将智能音箱102中正在执行的VoIP业务切换到手机101上继续执行。
为了实现VoIP业务在手机101和智能音箱102之间的切换功能,手机101可以预先设置一个用于切换VoIP业务的特定操作。示例性地,该特定操作可以为翻转手机、指关节敲击屏幕、双击电源键或滑动操作等输入操作。或者,该特定操作也可以为预设的语音输入。例如,用户108可以向手机101语音输入“切换语音通话”的语音指令。可以理解的是,本领域技术人员可根据实际应用场景或实际经验设置该预设操作,本实施例对此不做任何限制。
那么,当手机101检测到用户对手机101的一输入操作时,手机101如果确定该输入操作是上述特定操作,则响应于确定上述特定操作,手机101可向VoIP服务器104发送第一切换请求信息。
换句话说,当手机101检测到上述特定操作时,说明用户101此时需要将第一帐号(例如HUAWEI-01)下正在进行的VoIP业务切换到手机101上。进而,响应于该特定操作,手机101可通过网络109向VoIP服务器104发送第一切换请求信息。
示例性的,可以预先设置上述特定操作为在手机101的触摸屏上的一滑动手势。例如,当滑动手势的滑动轨迹为X型时,该滑动手势用于指示将手机101上正在进行的VoIP通话切换至用户的智能音箱102上,手机101可将智能音箱102作为VoIP通话的目标设备。当滑动手势的滑动轨迹为Y型时,该滑动手势用于指示将手机101上正在进行的VoIP通话切换至用户的平板电脑111(图中未示出)上。 那么,如果检测到用户执行的滑动操作的滑动轨迹为X形,则手机101可将平板电脑111作为VoIP通话的目标设备。
在另一些实施例中,上述第一切换请求信息中还可以包括手机101在VoIP业务中的VoIP标识(例如手机101的电话号码或IP地址等),和智能音箱102在VoIP业务中的VoIP标识。
在一些实施例中,上述第一切换请求信息中还可以包括手机101的设备标识,以使得VoIP服务器104在接收到上述第一切换请求信息后,对电子设备101进行合法性验证。这样就进一步提高了语音切换的安全性。
在其他一些实施例中,手机101还可以根据检测到的特定条件,自动通过网络109向VoIP服务器104发送上述第一切换请求信息,而不需要上述实施例中用户向手机101输入一个特定操作来触发。
在一些实施例中,上述特定条件可以是WLAN网络中的Wi-Fi信号强度。示例性的,手机101与智能音箱102均接入到了同一个Wi-Fi网络,也即这两个电子设备都可以使用WLAN网络中的同一个服务集标识(service set identifier,SSID)来访问网络。由于手机101相对于智能音箱102的便携性更好,因此,当手机101与智能音箱102接入同一Wi-Fi网络(例如SSID名称为“123”的Wi-Fi网络)时,手机101可以根据检测到的该Wi-Fi网络的Wi-Fi信号强度的变化,确定是否向VoIP服务器104发送上述第一切换请求信息。例如,当手机101无法检测到上述网络的Wi-Fi信号或者检测到的Wi-Fi信号低于一预设阈值时,手机101可自动向VoIP服务器104发送上述第一切换请求信息。上述情况说明用户此时已经携带手机101远离了上述Wi-Fi网络,也远离了智能音箱102。在这种情况下,手机101可以请求VoIP服务器104将在智能音箱102中正在进行的VoIP通话切换至手机101,这样用户就可以方便地在手机101继续进行VoIP通话了。
在另一些实施例中,上述特定条件也可以是蓝牙信号强度。示例性地,手机101与智能音箱102之间可建立蓝牙连接,这样,手机101可以根据检测到的手机101与智能音箱102之间蓝牙信号强度,确定是否向VoIP服务器104发送上述第一切换请求信息。例如,当手机101检测到与智能音箱102之间的蓝牙连接断开,或者,当手机101检测到智能音箱102的蓝牙信号强度小于一预设阈值时,手机101可自动向VoIP服务器104发送上述第一切换请求信息。上述情况说明用户此时已经携带手机101远离了智能音箱102。在这种情况下,手机101可以请求VoIP服务器104将在智能音箱102中正在进行的VoIP通话无缝切换至手机101,这样用户就可以方便地在手机101继续进行VoIP通话了。
需要说明的是,本领域技术人员可以根据实际应用场景或实际经验设置触发手机101向VoIP服务器104发送第一切换请求信息的其他技术方案,本申请实施例对此不做任何限制。例如,手机101可以与dock设备通过有线方式连接,并通过dock设备与智能音箱102连接,当检测到手机101从该dock设备拔出时,响应于该事件,手机101可自动通过网络109向VoIP服务器104发送上述第一切换请求信息。
步骤S405:VoIP服务器104接收到手机101发送的上述第一切换请求信息。
步骤S406:响应于接收到上述第一切换请求信息,VoIP服务器104确定与上述第一帐号对应的VoIP业务的源设备为智能音箱102。
示例性地,VoIP服务器104接收到手机101通过网络109发送来的第一切换请求信息后,可向设备管理服务器103发送该第一切换请求信息中携带的第一帐号(即HUAWEI-01)。由于设备管理服务 器103中存储有帐号、电子设备的设备标识等(如表1所示),因此,设备管理服务器103可以根据VoIP服务器104发来的第一帐号,查找到使用该第一帐号登录的各个电子设备,例如智能音箱102也使用了第一帐号登录了。当然,使用该第一帐号登录的电子设备可能有一个或多个,设备管理服务器103可以将所有使用第一帐号登录的电子设备的设备标识通过网络109发送给VoIP服务器104。在另一些情况下,设备管理服务器103也可以将所有使用第一帐号登录的电子设备中支持VoIP业务的电子设备的设备标识发送给VoIP服务器104。
VoIP服务器104接收到设备管理服务器103发来的设备标识符后,可通过设备标识查询在上述第一帐号下正在进行VoIP业务的源设备。例如,设备管理服务器103根据帐号HUAWEI-01查询到智能音箱102的设备标识,并将该设备标识发送给VoIP服务器。VoIP服务器104可以由此确定出执行VoIP业务的源设备为智能音箱102,也即用户需要将智能音箱102上正在执行的VoIP业务切换至手机101。
在另外一些实施例中,如果使用第一帐号登录的有两个或两个以上的电子设备都在执行VoIP业务,例如,除了智能音箱102外,平板电脑111上也正在执行VoIP业务。那么,VoIP服务器104可将这两个电子设备的设备标识均通过网络109发送给手机101。此时,如图5所示,手机101的触摸屏上可显示提示框501,在提示框501中有一个或多个选项,该选项列出了在第一帐号(例如HUAWEI-01)下正在进行VoIP业务的多个源设备的列表。这样,用户可以在提示框501中选择将哪一个电子设备上的VoIP业务切换到手机101中。例如,当手机101检测到用户选择了表示智能音箱102的选项后,手机101可将智能音箱102的标识发送给VoIP服务器104。这样,VoIP服务器104便可确定出用户需要切换的VoIP业务的源设备为智能音箱102。
步骤S407:VoIP服务器104将在智能音箱上正进行的VoIP通话切换到手机101上。
示例性地,首先,VoIP服务器104可以将手机101添加到智能音箱102与手机107之间的VoIP通话中。具体的,VoIP服务器104可以根据手机101在VoIP业务中的VoIP标识,将手机101添加到智能音箱102与手机107之间的VoIP业务中。此时,就在手机101、智能音箱102与手机107之间建立了VoIP业务的多方通话。
手机101的VoIP标识可以携带在手机101第一切换请求信息中。或者,VoIP服务器104中可以预先注册各个电子设备在VoIP业务中的VoIP标识。这样,在VoIP服务器104可以查询到手机101在VoIP业务中的VoIP标识。
在手机101、智能音箱102与手机107之间建立了VoIP业务的多方通话后,手机101可以通过网络109向VoIP服务器104发送成功加入VoIP业务的响应消息。
在接到上述响应消息后,VoIP服务器104中断智能音箱102的VoIP业务。中断后,此时只有手机101与手机107在进行VoIP通话。
在一些实施例中,当VoIP服务器104将手机101添加到智能音箱102与手机107之间的VoIP业务后,如果手机101成功接入了该VoIP业务,说明用户已经使用手机101接通了与手机107之间的VoIP语音通话。进而,手机101可向VoIP服务器104发送成功加入该VoIP业务的响应消息。接收到该响应消息后,VoIP服务器104可在手机101、智能音箱102与手机107组成的多方通话中将智能音箱102去除,即中断智能音箱102的VoIP业务,从而使得该VoIP业务从智能音箱102切换到手机101上继续执行。
可以看出,在手机101向VoIP服务器104发送成功加入VoIP业务的响应消息之前,智能音箱102和手机101上均接入了VoIP业务。当用户在手机101上接听了该VoIP语音通话后,手机101才会向VoIP服务器104发送上述响应消息,以触发VoIP服务器104中断智能音箱102的VoIP业务。这样一来,VoIP业务在手机101和智能音箱102之间切换时不会发生中断,且用户从智能音箱102切换到手机101上进行VoIP语音通话时可实现VoIP语音通话的无缝衔接,从而提高了多设备之间的语音切换效率和用户的使用体验。
在另一些实施例中,VoIP服务器104也可以根据手机101在VoIP业务中的VoIP标识(例如手机101的电话号码),通过呼叫转移业务将智能音箱102的VoIP通话转移至手机101中,从而将智能音箱102上正在进行的VoIP通话切换至手机101中继续进行。
在上述实施例提供的技术方案中,VoIP业务的源设备为智能音箱102,VoIP业务的目标设备为手机101。手机101可以响应用户的特定操作识别出VoIP业务的切换需求。进而,手机101可向VoIP服务器104发送第一切换请求信息,使得VoIP服务器104将同一帐号下在智能音箱102上正在执行的VoIP业务无缝切换到手机101上。切换过程中不会发生VoIP业务中断的现象,用户也无需在多个设备之间反复操作,从而提高了多设备之间的语音切换效率和用户的使用体验。
在另一些应用场景中,如果用户希望将正在智能音箱102上进行的VoIP通话切换至手机101上,则用户可对智能音箱102输入一个的特定操作,以触发VoIP服务器104将智能音箱102上的VoIP通话切换至在该同一帐号下登录的其他电子设备(例如手机101)上。
示例性的,如图6所示,本实施例提供的一种语音切换方法包括:
步骤S601、手机101和智能音箱102均使用第一帐号登录到设备管理服务器103。
步骤S602:智能音箱102检测到用户108发起语音通话的输入操作。
步骤S603:响应于上述输入操作,智能音箱102通过VoIP服务器104建立与第三电子设备(例如图1所示的手机107)之间的VoIP通话。
其中，步骤S601-S603的具体实现方法可参见上述实施例中步骤S401-S403的相关描述，故此处不再赘述。
步骤S604、智能音箱102向VoIP服务器104发送第二切换请求信息,该第二切换请求信息用于请求VoIP服务器104将智能音箱102中正在进行的VoIP业务切换到手机101上继续进行。
其中,该第二切换请求信息中可以包括上述第一帐号。
在该应用场景下,用户希望将智能音箱102中正在进行的VoIP通话切换到手机101上继续执行。用户可以通过向源设备(即智能音箱102)上执行一输入操作,触发VoIP服务器104将上述智能音箱102中正在执行的VoIP通话切换到手机101上执行。
示例性的,上述输入操作可以为用户的语音输入。例如,当用户希望将智能音箱102中正在执行的VoIP通话切换到手机101上执行时,智能音箱102的语音助理系统可能处理未激活状态,那么用户可以先向智能音箱102语音输入唤醒词,例如“你好,小E”。当智能音箱102检测到该唤醒词时,智能音箱102的上述语音助理系统被启动,并采集用户进一步的语音输入,以便上述语音助理系统对该语音输入进行语音识别处理。
在一些实施例中，在上述语音助理系统被唤醒后，用户可以继续向智能音箱102进行语音输入。例如，用户的语音输入可以为：“将语音通话切换到我的手机”，即执行VoIP通话的目标设备切换为用户的手机101。智能音箱102对该语音输入进行语音识别后，智能音箱102可生成第二切换请求信息，并将该第二切换请求信息发送给VoIP服务器104，其中，该第二切换请求信息用于请求VoIP服务器104将正在进行的VoIP通话切换到手机101上。该第二切换请求信息中可以包括智能音箱102当前登录的第一帐号（例如HUAWEI-01）以及目标设备（即手机101）的设备标识。
在其他一些实施例中,上述语音输入也可以为:“切换语音通话”。智能音箱102对该语音输入进行语音识别后,可以确定出用户的操作意图为将智能音箱102中正在执行的VoIP通话切换到用户的其他电子设备中,但该语音输入中并没有明确指出将上述VoIP通话切换到用户的哪个电子设备继续执行。此时,智能音箱102生成的第二切换请求信息中可包括智能音箱102当前登录的第一帐号,但不包括目标设备(即手机101)的设备标识。
在另一些实施例中,可以在智能音箱102中预先设置将VoIP通话切换至用户的另一默认的电子设备例如手机101上继续执行。那么,当上述语音输入中没有指出目标设备时,智能音箱102可默认将用户的手机101作为切换后执行VoIP通话的目标设备。此时,智能音箱102生成的第二切换请求信息中还可以包括默认的目标设备(即手机101)的设备标识。
在另一些实施例中,智能音箱102采集到用户的语音输入后,也可以将该语音输入通过网络109发送给语音助理服务器105,由语音助理服务器105对该语音输入进行语音识别。进而,语音助理服务器105将语音识别结果可以反馈给智能音箱102,由智能音箱102根据语音识别结果生成上述第二切换请求信息,并通过网络109发送给VoIP服务器104。
在其他一些实施例中,智能音箱102还可以根据检测到的特定条件,自动通过网络109向VoIP服务器104发送上述第二切换请求信息,而不需要用户向智能音箱102输入一个语音输入。
示例性的,上述特定条件可以是WLAN网络中的Wi-Fi信号强度。手机101与智能音箱102均接入到了同一个路由器的Wi-Fi网络。当路由器检测到手机101在某一个时刻断开了Wi-Fi网络连接,或检测到的手机101的Wi-Fi信号低于一预设阈值时,路由器可自动向智能音箱102发送一条通知信息,该通知信息指示手机101已远离Wi-Fi网络。在这种情况下,智能音箱102可以自动通过网络109发送上述第二切换请求信息给VoIP服务器104,请求VoIP服务器104在智能音箱102中正在进行的VoIP通话切换至手机101,这样用户就可以方便地在手机101继续进行VoIP通话了。在该场景中,用户首先通过智能音箱102与手机107正在进行VoIP通话,然后,用户拿起手机远离智能音箱102,也同时远离了上述Wi-Fi网络。这个时候,将上述VoIP通话自动切换至用户的手机,以继续进行该VoIP通话,这样就提高了VoIP通话的效率,也提高了用户的体验。
示例性的,上述特定条件也可以是蓝牙信号强度。示例性地,手机101与智能音箱102之间可建立蓝牙连接,这样,智能音箱102可以根据检测到的手机101与智能音箱102之间蓝牙信号强度,确定是否自动向VoIP服务器104发送上述第二切换请求信息。例如,当智能音箱102检测到与手机101之间的蓝牙连接断开,或者,当智能音箱102检测到手机101的蓝牙信号强度小于一预设阈值时,智能音箱102可自动向VoIP服务器104发送上述第二切换请求信息。
需要说明的是,本领域技术人员可以根据实际应用场景或实际经验设置智能音箱102向VoIP服务器104发送第二切换请求信息的具体技术方案,本实施例对此不做任何限制。例如,智能音箱102可 以预先设置敲击智能音箱102一次的手势用于触发智能音箱102将正在进行的VoIP通话切换至手机,敲击智能音箱102两次的手势用于触发智能音箱102将上述VoIP通话切换至平板电脑111等。
步骤S605:VoIP服务器104接收到智能音箱102发送的上述第二切换请求信息。
步骤S606:响应于接收到上述第二切换请求信息,VoIP服务器104确定与上述第一帐号对应的VoIP业务的源设备为智能音箱102。
步骤S607:VoIP服务器104将上述VoIP通话切换至手机101上进行。
首先,VoIP服务器104可以从使用第一帐号登录的多个电子设备中确定执行VoIP通话的目标设备为手机101。VoIP服务器104接收到智能音箱102发来的第二切换请求信息后,可将第二切换请求信息中携带的第一帐号(即HUAWEI-01)发送给设备管理服务器103。进而,设备管理服务器103可查询到当前使用第一帐号的所有电子设备。例如,除了智能音箱102外,还有手机101和平板电脑111也使用了第一登录。那么,设备管理服务器103可将手机101和平板电脑111的设备标识通过网络109发送给VoIP服务器104,由VoIP服务器104在这些电子设备中确定后续代替智能音箱102继续执行VoIP通话的目标设备。
示例性的,如果上述第二切换请求信息中携带有目标设备的设备标识(例如手机101的设备标识),则VoIP服务器104可在设备管理服务器103发来的设备标识中查询是否包含手机101的设备标识。若包含手机101的设备标识,则VoIP服务器104可确定执行VoIP通话的目标设备为手机101。
或者,如果上述第二切换请求信息中没有携带目标设备的标识,则VoIP服务器104可以从设备管理服务器103发来的设备标识中选择一个作为后续执行VoIP通话的目标设备的设备标识。
又或者，如果上述第二切换请求信息中没有携带目标设备的设备标识，VoIP服务器104可以将设备管理服务器103发来的多个电子设备的设备标识通过网络109发送给智能音箱102。如图7所示，智能音箱102可显示提示框701，在提示框701中列出在第一帐号下能够继续执行VoIP通话的目标设备的选项。这样，用户可以在提示框701中手动选择将智能音箱102上的VoIP通话切换到哪一个电子设备上执行。例如，当检测到用户选中了提示框701中的手机101后，智能音箱102可将手机101的设备标识发送给VoIP服务器104。这样，VoIP服务器104便可确定出后续继续执行VoIP通话的目标设备为手机101。
VoIP服务器104中可以预先注册各个电子设备在VoIP业务中的VoIP标识。该VoIP标识可以是执行VoIP业务时使用的电话号码或IP地址等。那么,VoIP服务器104将手机101确定为后续代替智能音箱102继续执行VoIP通话的目标设备后,可以查询到手机101的VoIP标识,例如手机101的电话号码为123456。进而,VoIP服务器104可按照该电话号码将手机101添加到智能音箱102与手机107之间的VoIP通话中。此时,VoIP服务器104在手机101、智能音箱102与手机107之间建立了VoIP多方通话。然后,VoIP服务器104中断智能音箱102的VoIP业务。中断后,此时只有手机101与手机107在进行VoIP通话,也即VoIP通话切换到手机101上。此时,手机101与手机107进行VoIP通话。
在本实施例提供的语音切换方法中,VoIP业务的源设备为智能音箱102,VoIP业务的目标设备为手机101。智能音箱102可以响应用户执行的触发操作识别出VoIP业务的切换需求。进而,智能音箱102可向VoIP服务器104发送第二切换请求信息,使得VoIP服务器104将同一帐号下在智能音箱102 上执行的VoIP业务无缝切换到手机101上继续执行。切换过程中不会发生VoIP业务中断的现象,用户也无需在多个设备之间反复操作,从而提高了多设备之间的语音切换效率和用户的使用体验。
在另一些应用场景中,用户在手机101和智能音箱102上使用同一帐号登录设备管理服务器103后,与上述实施例中所涉及的应用场景不同的是,用户可通过手机101与其他电子设备(例如手机107)进行VoIP业务。后续,如果用户希望将该VoIP业务从手机101切换至智能音箱102上,用户可在智能音箱102上或者手机101上做预设的输入操作,以触发VoIP服务器104将手机101上正在进行的VoIP通话自动切换至智能音箱102上;或者,手机101或智能音箱102检测到一特定条件,而自动进行切换VoIP通话到智能音箱102的流程。在该应用场景下的具体技术方案与上述实施例中涉及的技术方案类似,在此不再赘述。需要详细说明的是,在本应用场景中,手机101或智能音箱102检测到一特定条件可以是以下几种。
示例性地,该特定条件可以是手机101当前的状态信息。手机101可以通过一个或多个传感器180采集各种环境信息、手机姿态信息等。例如,当手机通过加速度传感器180E检测手机当前静止了超过一预设时间段了,而且手机已经接入了与智能音箱102同一个Wi-Fi网络,则手机101根据检测到的上述状态信息,可以自动向VoIP服务器104发送切换请求信息,以便VoIP服务器104将手机101上正在进行的VoIP通话自动切换至智能音箱102上。
示例性地,该特定条件也可以是手机101与智能音箱102建立的蓝牙连接。例如,用户最初是在手机101上与手机107进行VoIP通话,当用户到家时,手机101与智能音箱102可以自动建立蓝牙连接。在这两个设备建立了蓝牙连接后,手机101或者智能音箱102可以自动向VoIP服务器104发送切换请求信息,以便VoIP服务器104将手机101上正在进行的VoIP通话自动切换至智能音箱102上。
在其他一些应用场景中,用户在手机101和智能音箱102上使用同一帐号登录设备管理服务器103后,用户可使用智能音箱102执行音频/视频播放业务。如果用户希望将音频播放业务从智能音箱102切换至手机101上,用户可在手机101或智能音箱102上执行预设的特定操作,以触发内容服务器106将智能音箱102上的音频播放业务切换至手机101上。
示例性的,如图8所示,本实施例提供的一种语音切换方法,包括:
步骤S801:手机101和智能音箱102均使用同一帐号(例如第一帐号)登录设备管理服务器103。
其中,手机101和智能音箱102使用第一帐号登录设备管理服务器103的具体方法可参见上述相关实施例,故此处不再赘述。
步骤S802:智能音箱102接收用户的语音输入,该语音输入用于指示智能音箱102播放音频B;
步骤S803:响应于该语音输入,智能音箱102确定音频B的播放指令。
步骤S804:根据上述播放指令,智能音箱102从内容服务器106中获取播放信息,并播放音频B。
当用户希望使用智能音箱102播放某一音频B（例如歌曲Silence）时，用户可向智能音箱102说“我要收听Silence这首歌”。进而，智能音箱102根据其配置的语音助理系统将该语音输入识别为语音指令，该语音指令用于指示播放歌曲Silence；然后，智能音箱102将播放音频的请求信息通过网络109发送给内容服务器106，内容服务器106在接收到上述请求信息后，向智能音箱102提供歌曲Silence的播放业务。进而，智能音箱102根据从内容服务器106获取的播放信息播放歌曲Silence，可以通过智能音箱102的一个或多个扬声器阵列370A来播放歌曲。
在另一些实施例中,采集到用户的上述语音输入后,智能音箱102可将该语音输入携带在识别请求中通过网络109发送给语音助理服务器105。语音助理服务器105可使用语音识别算法对上述语音输入进行语音识别,得到歌曲Silence的播放指令。语音助理服务器105将识别出的播放指令发送给内容服务器106,由内容服务器106向智能音箱102提供歌曲Silence的播放业务。
在另一些实施例中,用户除了通过输入语音的方式触发智能音箱102执行音频播放业务外,还可以通过其他预设的方式触发智能音箱102得到某一音频的播放指令,本申请实施例对此不做任何限制。例如,当检测到用户敲击智能音箱102后,说明用户希望继续播放最近一次收听的节目(例如节目C),则智能音箱102可生成节目C的播放指令发送给内容服务器106。又例如,如果智能音箱中设置有触摸屏,则用户可以在触摸屏中选择需要播放的音频,进而触发智能音箱102生成该音频的播放指令并发送给内容服务器106。
内容服务器106可用于维护音乐、节目等音频内容的资源信息。语音助理服务器105将识别出的音频播放指令发送给内容服务器106后,内容服务器106可查找音频B的资源信息。其中,该资源信息可以是音频B的音频资源,也可以是音频B的播放地址或下载地址等。内容服务器106将音频B的资源信息发送给智能音箱102,以使得智能音箱102按照该资源信息播放音频B。以音频B的资源信息为音频B的播放地址举例,内容服务器106可将音频B的播放地址发送给智能音箱102。这样,智能音箱102根据该播放地址可以获取到音频B的音频资源,进而执行音频B的音频播放业务。
可以理解的是,内容服务器106中可以存储音频资源、请求播放的设备的设备标识等信息,以便后续对进一步处理。
步骤S805:手机101向内容服务器106发送播放切换请求。
步骤S806:响应于上述播放切换请求,内容服务器106确定播放音频B的源设备为智能音箱102。
步骤S807:内容服务器106将上述音频播放业务切换至手机101上继续进行。
在一些实施例中,上述播放切换请求中可以包括第一帐号,手机101和智能音箱102均使用该第一帐号登录到了设备管理服务器103中。在另一些实施例中,上述播放切换请求也可以包括手机101的设备标识和/或智能音箱102的设备标识。
当用户希望将智能音箱102中正在播放的音频B切换到手机101上继续播放时,用户可以在手机101上输入一预设的特定操作。响应于该特定操作,手机101可向内容服务器106发送播放切换请求,由内容服务器106确定正在播放音频B的源设备即智能音箱102。
内容服务器106可向设备管理服务器103发送第一帐号,由设备管理服务器103查询当前登录该第一帐号的电子设备有哪些。进而,内容服务器106可将正在播放音频B的电子设备(例如上述智能音箱102)确定为播放音频内容B的源设备。
当然,如果第一播放切换请求中携带有用户指定的源设备(例如智能音箱102),且智能音箱102正在帐号A下播放音频内容B(即执行音频播放业务),则内容服务器106可确定在第一帐号下播放音频B的源设备为智能音箱102。
内容服务器106确定出播放音频B的源设备为智能音箱102后，为了使得后续将音频B切换到手机101时，手机101能够从当前的播放位置继续播放该音频B，内容服务器106可查询音频B在智能音箱102上的播放进度。
内容服务器106向手机101发送音频B的资源信息和播放进度。手机101按照音频B的资源信息和播放进度继续播放音频B。
内容服务器106得到音频内容B在智能音箱102中的播放进度后，可以将音频B的播放进度和资源信息发送给手机101。这样，手机101可以按照音频B的资源信息获取到音频B，并且，手机101可以按照音频B的播放进度从智能音箱102当前播放的位置继续播放音频B，实现音频播放业务在智能音箱102和手机101之间的无缝切换。在手机101接收到该音频B后，内容服务器可以自动中断在智能音箱102上播放音频B的音频播放业务。
在另一些实施例中,手机101可以通过网络109向内容服务器106发送音频B的播放事件;响应于上述播放事件,内容服务器106中断在智能音箱102上播放音频B的音频播放业务。
当手机101开始播放音频B后,原本在智能音箱102上播放的音频B可能并未自动中断。那么,当手机101开始播放音频B后,手机101可自动向内容服务器106发送音频B的播放事件。这样,内容服务器106接收到该播放事件后便可停止智能音箱102的音频播放业务。例如,响应于手机101上报的播放事件,内容服务器106可向智能音箱102发送停止播放指令,以使得智能音箱102响应该停止播放指令停止在智能音箱102上播放音频B。
在另一些实施例中,可以设置当智能音箱102和手机101同时播放一段时间的音频B后,再停止在智能音箱102上播放音频B。例如,可以在手机101播放了3s的音频B后,再向内容服务器106发送音频B的播放事件。那么,在手机101刚开始播放音频内容B的3秒内,智能音箱102也在同时播放音频B。这样,即使音频播放业务切换至手机101时因传输时延导致一些音频遗漏,用户也可结合智能音箱102上播放的音频B得到完整的音频内容,从而提高语音切换效率和使用体验。
In the voice switchover method provided in the foregoing embodiment, the source device of the audio playback service is the smart speaker 102 and the target device is the mobile phone 101. The mobile phone 101 can identify the need to switch the audio playback service in response to a trigger operation performed by the user. The mobile phone 101 then sends the playback switch request to the content server 106, and the content server 106 seamlessly switches the audio playback service running on the smart speaker 102 under the same account to the mobile phone 101 to continue there. The audio playback service is not interrupted during the switchover, and the user does not need to operate back and forth on multiple devices, which improves the efficiency of voice switchover between multiple devices and the user experience.
In some other application scenarios, after the user logs in to the device management server 103 with the same account on the mobile phone 101 and the smart speaker 102, the user may use the smart speaker 102 to run an audio playback service. If the user wishes to switch the audio playback service from the smart speaker 102 to the mobile phone 101, the user may also perform a preset specific operation on the smart speaker 102 to trigger the content server 106 to switch the audio playback service on the smart speaker 102 to the mobile phone 101. The specific operation may be a voice input to the smart speaker 102, for example the user saying to the smart speaker 102, "Switch the song Silence to the mobile phone." For the way the smart speaker 102 handles this voice input, refer to the related description in the foregoing embodiments; details are not repeated here.
In some other application scenarios, when the user is running an audio playback service on the mobile phone 101 and later wishes to switch it from the mobile phone 101 to the smart speaker 102, the user may perform a preset input operation on the smart speaker 102 and/or the mobile phone 101 to trigger the content server 106 to switch the audio playback service on the mobile phone 101 to the smart speaker 102. For the specific technical solution, refer to the related description in the foregoing embodiments; details are not repeated here.
As shown in FIG. 9, this embodiment provides a voice switchover system 900. The system 900 may include a first electronic device 901 (for example, the mobile phone 101 in FIG. 1), a second electronic device 902 (for example, the smart speaker 102 in FIG. 1), a device management server 903 (for example, the device management server 103 in FIG. 1), and a VoIP server 904 (for example, the VoIP server 104 in FIG. 1). The system 900 may be used to implement the voice switchover solutions in the foregoing embodiments, which are not repeated here.
In some other embodiments, the system 900 may further include a voice assistant server 905 (for example, the voice assistant server 105 in FIG. 1) and a content server 906 (for example, the content server 106 in FIG. 1). The functions of the voice assistant server 905 are the same as those of the voice assistant server 105 in the foregoing embodiments, and the functions of the content server 906 are the same as those of the content server 106 in the foregoing embodiments.
From the description of the foregoing implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, the division into the foregoing functional modules is used only as an example for illustration. In actual applications, the foregoing functions may be allocated to different functional modules as required, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above. For the specific working processes of the system, apparatus, and units described above, refer to the corresponding processes in the foregoing method embodiments; details are not repeated here.
The functional units in the embodiments of this application may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software functional unit.
All or some of the foregoing embodiments may be implemented by software, hardware, firmware, or any combination thereof. When software is used for implementation, the embodiments may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium accessible by the computer, or a data storage device such as a server or a data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid-state drive), or the like.
The foregoing descriptions are merely implementations of the embodiments of this application, but the protection scope of the embodiments of this application is not limited thereto. Any variation or replacement within the technical scope disclosed in the embodiments of this application shall fall within the protection scope of the embodiments of this application. Therefore, the protection scope of the embodiments of this application shall be subject to the protection scope of the claims.

Claims (18)

  1. A voice switchover method, comprising:
    a second electronic device detecting a voice input of a user;
    in response to the voice input, the second electronic device establishing a VoIP call with a third electronic device through a VoIP server;
    a first electronic device sending first switch request information to the VoIP server, wherein the first switch request information is used to request the VoIP server to switch the VoIP call ongoing on the second electronic device to the first electronic device, and the first switch request information comprises a first account used to log in to a device management server;
    the VoIP server receiving the first switch request information;
    in response to the first switch request information, the VoIP server determining that a source device of a VoIP service corresponding to the first account is the second electronic device; and
    the VoIP server switching the VoIP call ongoing on the second electronic device to the first electronic device.
  2. The voice switchover method according to claim 1, wherein, before the foregoing steps, the method further comprises: the first electronic device and the second electronic device both logging in to the device management server with the first account.
  3. The voice switchover method according to claim 1, wherein the first electronic device sending the first switch request information to the VoIP server specifically comprises:
    when the first electronic device detects a specific operation of the user, in response to the specific operation, the first electronic device sending the first switch request information to the VoIP server.
  4. The voice switchover method according to claim 3, wherein the specific operation is one of the following: flipping the mobile phone, tapping the screen with a knuckle, double-pressing the power button, a preset voice input, or a preset slide gesture.
  5. The voice switchover method according to claim 1, wherein the first electronic device sending the first switch request information to the VoIP server specifically comprises:
    when the first electronic device detects a specific condition, in response to the specific condition, the first electronic device sending the first switch request information to the VoIP server.
  6. The voice switchover method according to claim 5, wherein the specific condition is a Wi-Fi signal strength in a WLAN network or a Bluetooth signal strength, wherein:
    when the first electronic device detects that the Wi-Fi signal strength is lower than a preset threshold, the first electronic device sends the first switch request information to the VoIP server; or
    when the first electronic device detects that the Bluetooth signal strength of the second electronic device is lower than a preset threshold, the first electronic device sends the first switch request information to the VoIP server.
  7. The voice switchover method according to claim 1, wherein the method further comprises:
    the first electronic device sending, to the VoIP server, a response message indicating that it has successfully joined the VoIP call; and
    after receiving the response message, the VoIP server interrupting the VoIP service on the second electronic device.
  8. The voice switchover method according to claim 1, wherein the VoIP server determining that the source device of the VoIP service corresponding to the first account is the second electronic device specifically comprises:
    the VoIP server sending the first account to the device management server;
    the device management server determining, based on the first account, at least one electronic device logged in with the first account;
    the device management server sending a device identifier of the at least one electronic device to the VoIP server; and
    the VoIP server determining, based on the device identifier, that the source device conducting the VoIP call under the first account is the second electronic device.
  9. The voice switchover method according to claim 8, wherein the determining that the source device conducting the VoIP call under the first account is the second electronic device specifically comprises:
    when the VoIP server determines, based on the device identifiers, that there are at least two electronic devices conducting VoIP calls under the first account, the VoIP server sending the device identifiers of the at least two electronic devices to the first electronic device;
    the first electronic device displaying at least two options, wherein the at least two options represent the at least two electronic devices;
    the first electronic device detecting a selection operation of the user on one of the options, wherein the selected option represents the second electronic device;
    in response to the selection operation, the first electronic device sending the device identifier of the second electronic device to the VoIP server; and
    the VoIP server determining, based on the received device identifier of the second electronic device, that the source device conducting the VoIP call under the first account is the second electronic device.
  10. A voice switchover system, comprising a first electronic device, a second electronic device, a device management server, and a VoIP server, wherein:
    the second electronic device is configured to, when detecting a voice input of a user, establish a VoIP call with a third electronic device through the VoIP server;
    the first electronic device is configured to send first switch request information to the VoIP server, wherein the first switch request information is used to request the VoIP server to switch the VoIP call ongoing on the second electronic device to the first electronic device, and the first switch request information comprises a first account used to log in to the device management server;
    the VoIP server is configured to receive the first switch request information and determine that a source device of a VoIP service corresponding to the first account is the second electronic device; and
    the VoIP server is further configured to switch the VoIP call ongoing on the second electronic device to the first electronic device.
  11. The voice switchover system according to claim 10, wherein the first electronic device and the second electronic device both log in to the device management server with the first account.
  12. The voice switchover system according to claim 10, wherein the first electronic device is further configured to:
    send the first switch request information to the VoIP server when a specific operation of the user is detected;
    wherein the specific operation is one of the following: flipping the mobile phone, tapping the screen with a knuckle, double-pressing the power button, a preset voice input, or a preset slide gesture.
  13. The voice switchover system according to claim 10, wherein the first electronic device sending the first switch request information to the VoIP server specifically comprises:
    when the first electronic device detects a specific condition, the first electronic device sending the first switch request information to the VoIP server.
  14. The voice switchover system according to claim 13, wherein the specific condition is a Wi-Fi signal strength in a WLAN network or a Bluetooth signal strength, wherein:
    when the first electronic device detects that the Wi-Fi signal strength is lower than a preset threshold, the first electronic device sends the first switch request information to the VoIP server; or
    when the first electronic device detects that the Bluetooth signal strength of the second electronic device is lower than a preset threshold, the first electronic device sends the first switch request information to the VoIP server.
  15. The voice switchover system according to claim 10, wherein the first electronic device is further configured to send, to the VoIP server, a response message indicating that it has successfully joined the VoIP call; and the VoIP server is further configured to interrupt the VoIP service on the second electronic device after receiving the response message.
  16. The voice switchover system according to claim 10, wherein the VoIP server determining that the source device of the VoIP service corresponding to the first account is the second electronic device specifically comprises:
    the VoIP server sending the first account to the device management server;
    the device management server determining, based on the first account, at least one electronic device logged in with the first account;
    the device management server sending a device identifier of the at least one electronic device to the VoIP server; and
    the VoIP server determining, based on the device identifier, that the source device conducting the VoIP call under the first account is the second electronic device.
  17. The voice switchover system according to claim 16, wherein:
    the VoIP server is further configured to, when determining based on the device identifiers that there are at least two electronic devices conducting VoIP calls under the first account, send the device identifiers of the at least two electronic devices to the first electronic device;
    the first electronic device is further configured to display at least two options, wherein the at least two options represent the at least two electronic devices;
    the first electronic device detects a selection operation of the user on one of the options, wherein the selected option represents the second electronic device;
    the first electronic device is further configured to send the device identifier of the second electronic device to the VoIP server; and
    the VoIP server is further configured to determine, based on the received device identifier of the second electronic device, that the source device conducting the VoIP call under the first account is the second electronic device.
  18. The voice switchover system according to claim X, wherein the first electronic device is a mobile phone, and the second electronic device is a smart speaker provided with a voice assistant system.
PCT/CN2018/125853 2018-10-09 2018-12-29 语音切换方法、电子设备及系统 WO2020073536A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201880098492.9A CN112806067B (zh) 2018-10-09 2018-12-29 语音切换方法、电子设备及系统
US17/281,514 US11838823B2 (en) 2018-10-09 2018-12-29 Voice switchover method and system, and electronic device
CN202211073008.1A CN115665100A (zh) 2018-10-09 2018-12-29 语音切换方法、电子设备及系统

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811172286.6 2018-10-09
CN201811172286 2018-10-09

Publications (1)

Publication Number Publication Date
WO2020073536A1 (zh)

Family

ID=70164145

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/125853 WO2020073536A1 (zh) 2018-10-09 2018-12-29 语音切换方法、电子设备及系统

Country Status (3)

Country Link
US (1) US11838823B2 (zh)
CN (2) CN115665100A (zh)
WO (1) WO2020073536A1 (zh)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11638172B2 (en) * 2021-06-02 2023-04-25 Meta Platforms Technologies, Llc Intralink based session negotiation and media bit rate adaptation
CN116266855A (zh) * 2021-12-17 2023-06-20 华为技术有限公司 电子设备及其语音传输方法、介质


Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8761153B2 (en) * 2005-06-21 2014-06-24 Michael D. Metcalf Remote configuration of a voice over internet protocol telephone for smart dial tone
CN101227526B (zh) 2007-01-19 2011-03-02 中兴通讯股份有限公司 盲转业务实现方法和装置
KR100872234B1 (ko) 2007-04-18 2008-12-05 엘지전자 주식회사 건조기의 막힘감지방법
CN101227482B (zh) 2008-02-02 2011-11-30 中兴通讯股份有限公司 一种网络电话通话中媒体协商方法、装置及系统
CN101340722A (zh) 2008-08-12 2009-01-07 中兴通讯股份有限公司 一种通话无缝切换方法和移动终端
CN101854598B (zh) 2009-04-01 2013-02-13 中国电信股份有限公司 即时通信业务在客户端之间切换的方法和系统
CN102739885A (zh) * 2011-04-15 2012-10-17 鸿富锦精密工业(深圳)有限公司 Pstn通话与voip通话切换系统及方法
CN102891886B (zh) 2012-09-14 2015-05-20 吉视传媒股份有限公司 基于云计算的多屏互动方法及系统
US20150189426A1 (en) 2013-01-01 2015-07-02 Aliphcom Mobile device speaker control
US9077785B2 (en) * 2013-02-07 2015-07-07 Qualcomm Incorporated Originator mobile device assisted voice call technology selection
CN104348989B (zh) 2013-08-05 2017-11-21 中国移动通信集团公司 机顶盒与通话终端切换通话的方法及应用服务器
CN105516791A (zh) 2014-09-29 2016-04-20 宇龙计算机通信科技(深圳)有限公司 一种智能家居中流媒体数据无缝连接实现方法及系统
FR3030979A1 (fr) * 2014-12-17 2016-06-24 Orange Procede de controle de la restitution d' un flux media lors d' un appel telephonique
CN104506523B (zh) 2014-12-22 2018-05-04 迈普通信技术股份有限公司 一种智能终端VoIP的呼叫转接方法
US10104235B2 (en) * 2015-04-16 2018-10-16 Algoblu Holdings Limited Forwarding roaming call to VoIP system
CN105389118B (zh) 2015-12-10 2018-12-11 广东欧珀移动通信有限公司 一种音频文件的切换方法及用户终端
CN105657767A (zh) * 2016-03-04 2016-06-08 努比亚技术有限公司 一种控制语音切换的方法及移动终端
US10965800B2 (en) * 2016-05-20 2021-03-30 Huawei Technologies Co., Ltd. Interaction method in call and device
CN106126182B (zh) 2016-06-30 2022-06-24 联想(北京)有限公司 数据输出方法及电子设备
CN108257590B (zh) * 2018-01-05 2020-10-02 携程旅游信息技术(上海)有限公司 语音交互方法、装置、电子设备、存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101242663A (zh) * 2008-03-20 2008-08-13 华为技术有限公司 基于同号移动终端与软终端通话切换方法、系统及设备
CN104427288A (zh) * 2013-08-26 2015-03-18 联想(北京)有限公司 一种信息处理的方法及服务器
WO2016160558A1 (en) * 2015-03-27 2016-10-06 Qualcomm Incorporated Techniques for maintaining data continuity in offloading wireless communications
CN105338425A (zh) * 2015-10-29 2016-02-17 深圳云聚汇数码有限公司 一种实现多屏间视频无缝切换的系统及方法
CN105872439A (zh) * 2015-12-15 2016-08-17 乐视致新电子科技(天津)有限公司 一种多设备视频通话方法、装置及服务器

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111885447A (zh) * 2020-07-16 2020-11-03 歌尔科技有限公司 电子设备及其控制方法、控制装置和可读存储介质
CN112291430A (zh) * 2020-10-23 2021-01-29 北京蓦然认知科技有限公司 一种基于身份确认的智能应答方法、装置
WO2022089660A1 (zh) * 2020-10-27 2022-05-05 中国移动通信有限公司研究院 切换方法、终端和网络侧设备
CN114501556A (zh) * 2020-10-27 2022-05-13 中国移动通信有限公司研究院 一种切换方法、终端和网络侧设备
CN114216228A (zh) * 2021-11-11 2022-03-22 青岛海尔空调器有限总公司 空调的控制方法、控制系统、电子设备和存储介质

Also Published As

Publication number Publication date
US11838823B2 (en) 2023-12-05
US20210352560A1 (en) 2021-11-11
CN112806067B (zh) 2022-09-09
CN115665100A (zh) 2023-01-31
CN112806067A (zh) 2021-05-14

Similar Documents

Publication Publication Date Title
WO2020073536A1 (zh) 语音切换方法、电子设备及系统
WO2021052282A1 (zh) 数据处理方法、蓝牙模块、电子设备与可读存储介质
WO2020155014A1 (zh) 智能家居设备分享系统、方法及电子设备
EP3826280B1 (en) Method for generating speech control command, and terminal
US11968058B2 (en) Method for adding smart home device to contacts and system
WO2014008843A1 (zh) 一种声纹特征模型更新方法及终端
US20180034772A1 (en) Method and apparatus for bluetooth-based identity recognition
WO2021037146A1 (zh) 一种移动终端的文件共享方法及设备
US11949805B2 (en) Call method and apparatus
WO2022068513A1 (zh) 无线通信方法和终端设备
WO2020078330A1 (zh) 一种基于语音通话的翻译方法及电子设备
CN111696553B (zh) 一种语音处理方法、装置及可读介质
CN112334978A (zh) 支持个性化装置连接的电子装置及其方法
US10236016B1 (en) Peripheral-based selection of audio sources
WO2022161077A1 (zh) 语音控制方法和电子设备
EP4171135A1 (en) Device control method, and related apparatus
CN113365274B (zh) 一种网络接入方法和电子设备
WO2022247455A1 (zh) 一种音频分流的方法及电子设备
WO2022127670A1 (zh) 一种通话方法、相关设备和系统
CN111245629B (zh) 会议控制方法、装置、设备及存储介质
CN116708674B (zh) 通信方法及电子设备
WO2022143048A1 (zh) 对话任务管理方法、装置及电子设备
WO2022135201A1 (zh) 音效调节方法及电子设备
CN116668586B (zh) 私享彩铃业务状态的控制方法、终端、服务器及存储介质
WO2024002137A1 (zh) 通信方法、通信系统及电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18936596

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18936596

Country of ref document: EP

Kind code of ref document: A1