WO2020073288A1 - Method for triggering an electronic device to perform a function, and electronic device - Google Patents
- Publication number
- WO2020073288A1 (application PCT/CN2018/109888)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- wake
- electronic device
- voice data
- text
- word
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
Definitions
- This embodiment relates to the field of electronic devices, and in particular, to a method and electronic device for triggering an electronic device to perform a function.
- The user can operate the physical keys of the mobile phone (such as the volume "+" key or the power key) or touch the display of the mobile phone to trigger the mobile phone to perform the corresponding function.
- When it is inconvenient for the user to operate the mobile phone by hand, the user often chooses to control the mobile phone by voice to perform the corresponding function.
- Voice assistants provide users with a voice control service, so that the mobile phone performs the corresponding functions under voice control.
- Voice assistant is an important application of artificial intelligence on mobile phones. The voice assistant can recognize the voice command input by the user and trigger the mobile phone to execute the function corresponding to the voice command, thereby realizing the intelligent interaction between the user and the mobile phone.
- The voice assistant is in a dormant state, and the user needs to wake it up before using it. Only after the voice assistant is woken up can it receive and recognize the voice command input by the user.
- The voice data used to wake up the voice assistant may be called a wake-up word (or wake-up voice).
- The wake-up word may be registered in the mobile phone by the user in advance. For example, the wake-up word registered in the mobile phone in advance is "Hello little E". If the user wants to use the voice assistant to turn down the volume of the phone, the user first needs to say "Hello little E" to wake up the voice assistant. After the voice assistant is woken up, the user then says "turn down the phone volume". The voice assistant then receives and recognizes the voice command "turn down the phone volume" and triggers the phone to turn down the volume.
- This embodiment provides a method and an electronic device for triggering an electronic device to perform a function. The user does not need to input voice data multiple times to trigger the electronic device to perform the corresponding function, which improves the use efficiency of the electronic device and makes the interaction efficient.
- A first aspect of this embodiment provides a method for triggering an electronic device to perform a function.
- The electronic device is provided with at least two first wake-up words, and each of the at least two first wake-up words corresponds to a first instruction.
- The electronic device performs different functions in response to the first instructions corresponding to different first wake-up words.
- The electronic device may include a main processor, and the main processor is in a sleep state.
- The method may include: the electronic device receives first voice data input by the user.
- The electronic device determines whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data.
- If such a wake-up word exists, the electronic device may wake up the main processor from the sleep state, determine the first instruction corresponding to the first voice data, and execute the function corresponding to the first instruction through the main processor.
- When the text corresponding to the input voice data matches one of the first wake-up words, the electronic device can wake up its main processor from the sleep state and determine the instruction corresponding to the input voice data, so that the main processor performs the function corresponding to the instruction. It can be seen that, as long as no other software or hardware of the electronic device is using the microphone to collect voice data (even if the screen is off and the AP is in a sleep state), the user does not need to first speak a wake-up word to start the voice assistant and then speak a voice command. A single piece of voice data wakes up the main processor of the electronic device and triggers the electronic device to perform the corresponding function. This improves the use efficiency of the electronic device, makes the interaction between the electronic device and the user efficient, and improves the user experience.
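The single-utterance flow described above can be sketched as follows. This is a minimal illustration and not the patented implementation: it assumes speech has already been converted to text, and the wake-up words, instruction names, `MainProcessor` class, and `handle_first_voice_data` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical table: each first wake-up word corresponds to one instruction.
FIRST_WAKE_WORDS = {
    "turn down the phone volume": "VOLUME_DOWN",
    "open the camera": "OPEN_CAMERA",
}


@dataclass
class MainProcessor:
    """Stands in for the main processor, which starts in the sleep state."""
    asleep: bool = True
    executed: list = field(default_factory=list)

    def wake(self) -> None:
        self.asleep = False

    def execute(self, instruction: str) -> None:
        if self.asleep:
            raise RuntimeError("main processor is still asleep")
        self.executed.append(instruction)


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace before comparing texts."""
    return " ".join(text.lower().split())


def handle_first_voice_data(text: str, main: MainProcessor) -> bool:
    """Return True if the text matched a first wake-up word and a function ran."""
    instruction = FIRST_WAKE_WORDS.get(normalize(text))
    if instruction is None:
        return False           # no match: the main processor stays asleep
    main.wake()                # match: wake the main processor from sleep
    main.execute(instruction)  # then execute the corresponding function
    return True
```

Exact text comparison is used here only to keep the sketch short; the source leaves the matching technique open.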
- The method may further include: the electronic device determines that the voiceprint feature of the first voice data matches the voiceprint feature corresponding to the at least two first wake-up words. In this way, only the person who has registered the wake-up words in the electronic device can trigger the electronic device to perform the corresponding function by inputting voice data, which improves the security of the voice control service.
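The voiceprint condition can be illustrated as an extra gate on top of the text match. The source does not say how voiceprint features are represented or compared, so everything here is an assumption: features are modeled as plain numeric vectors, similarity is cosine similarity, and `VOICEPRINT_THRESHOLD` is an arbitrary illustrative value.

```python
import math

VOICEPRINT_THRESHOLD = 0.8  # hypothetical similarity threshold


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def voiceprint_matches(utterance_print, registered_print,
                       threshold=VOICEPRINT_THRESHOLD):
    """Assumed check: the speaker matches if the vectors are similar enough."""
    return cosine_similarity(utterance_print, registered_print) >= threshold


def should_trigger(text_matches, utterance_print, registered_print):
    """Trigger only if both the text and the voiceprint match."""
    return text_matches and voiceprint_matches(utterance_print, registered_print)
```

With this gate, a correct phrase spoken by an unregistered speaker does not wake the main processor.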
- The electronic device waking up the main processor from the sleep state, determining the first instruction corresponding to the first voice data, and executing the function corresponding to the first instruction through the main processor may specifically be: the electronic device wakes up the main processor from the sleep state, starts the voice assistant through the main processor, determines the first instruction corresponding to the first voice data through the voice assistant, and executes the function corresponding to the first instruction through the main processor.
- The voice assistant performs the analysis of the detected first voice data.
- The electronic device further includes a first coprocessor. The electronic device receiving the first voice data input by the user may specifically be: the electronic device uses the first coprocessor to monitor the first voice data input by the user. The electronic device determining whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data, and waking up the main processor from the sleep state if such a wake-up word exists, may specifically be: the electronic device uses the first coprocessor to determine whether such a wake-up word exists among the at least two first wake-up words; if it exists, the first coprocessor wakes up the main processor from the sleep state.
- The main processor is an AP, and the first coprocessor is a DSP.
- The electronic device further includes a first coprocessor. The electronic device receiving the first voice data input by the user may specifically be: the electronic device uses the first coprocessor to monitor the first voice data input by the user. The electronic device determining whether a matching wake-up word exists among the at least two first wake-up words, and waking up the main processor from the sleep state if one exists, may specifically be: the electronic device uses the first coprocessor to determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data to a degree that satisfies a first precision; if so, the first coprocessor wakes up the main processor from the sleep state.
- Determining the first instruction corresponding to the first voice data and executing the function corresponding to the first instruction through the main processor may specifically be: the electronic device uses the main processor to determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data to a degree that satisfies a second precision; if so, the first instruction corresponding to the first voice data is determined, and the function corresponding to the first instruction is executed through the main processor. The first precision is less than the second precision.
- the main processor is an AP, and the
- Before the electronic device receives the first voice data input by the user, the method may further include: the electronic device enters a predetermined mode. In this way, after the electronic device enters the predetermined mode, the user can directly speak a first wake-up word to trigger the electronic device to perform the corresponding function. This makes the interaction between the electronic device and the user efficient while saving the power consumption of the electronic device as much as possible.
- The electronic device further includes a second coprocessor. Before the electronic device enters the predetermined mode, the method may further include: the electronic device uses the second coprocessor to monitor voice data. Before the electronic device enters the predetermined mode, the second coprocessor, which has lower power consumption, is used to monitor the voice data, which ensures the normal use of the voice assistant and saves the power consumption of the electronic device.
- The second coprocessor is a DSP whose processing performance is lower than that of the first coprocessor and whose memory is smaller than that of the first coprocessor.
- A second wake-up word is also set in the electronic device.
- The electronic device entering the predetermined mode may specifically be: the electronic device receives second voice data input by the user; the electronic device determines whether the second voice data matches the second wake-up word; if the second voice data matches the second wake-up word, the electronic device wakes up the main processor from the sleep state and starts the voice assistant through the main processor; the electronic device receives third voice data input by the user through the voice assistant, determines the second instruction corresponding to the third voice data, and executes the function corresponding to the second instruction through the main processor.
- The second instruction is used to instruct the electronic device to enter the predetermined mode.
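The sequence for entering the predetermined mode can be read as a small state machine. The sketch below is one possible reading under stated assumptions: the state names, the `Device` class, the command phrase "enter driving mode", and the instruction name are hypothetical, texts are compared exactly, and falling back to the dormant state after any other command is a simplification that the source does not describe.

```python
from enum import Enum, auto


class State(Enum):
    DORMANT = auto()             # waiting for the second wake-up word
    ASSISTANT_ACTIVE = auto()    # voice assistant started, waiting for a command
    PREDETERMINED_MODE = auto()  # first wake-up words trigger functions directly


SECOND_WAKE_WORD = "hello little e"                        # illustrative
ENTER_MODE_COMMAND = "enter driving mode"                  # hypothetical command
FIRST_WAKE_WORDS = {"turn down the phone volume": "VOLUME_DOWN"}  # hypothetical


class Device:
    def __init__(self):
        self.state = State.DORMANT
        self.executed = []

    def hear(self, text):
        """Feed one recognized utterance to the device and return the new state."""
        text = " ".join(text.lower().split())
        if self.state is State.DORMANT:
            # Second voice data: only the second wake-up word starts the assistant.
            if text == SECOND_WAKE_WORD:
                self.state = State.ASSISTANT_ACTIVE
        elif self.state is State.ASSISTANT_ACTIVE:
            # Third voice data: the second instruction enters the predetermined mode.
            if text == ENTER_MODE_COMMAND:
                self.state = State.PREDETERMINED_MODE
            else:
                self.state = State.DORMANT  # simplification, see lead-in
        else:
            # Predetermined mode: a first wake-up word maps straight to a function.
            instruction = FIRST_WAKE_WORDS.get(text)
            if instruction is not None:
                self.executed.append(instruction)
        return self.state
```

Before the mode is entered, a first wake-up word alone does nothing; afterwards the same phrase triggers its function without a separate wake-up step.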
- The electronic device determining whether the second voice data matches the second wake-up word may specifically be: the electronic device determines whether the text corresponding to the second voice data matches the text of the second wake-up word; if the text corresponding to the second voice data matches the text of the second wake-up word, the second voice data matches the second wake-up word.
- Alternatively, the electronic device determining whether the second voice data matches the second wake-up word may specifically be: the electronic device determines whether the text corresponding to the second voice data matches the text of the second wake-up word, and determines whether the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word; if the text corresponding to the second voice data matches the text of the second wake-up word, and the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word, the second voice data matches the second wake-up word.
- The electronic device further includes a second coprocessor. The electronic device receiving the second voice data input by the user may specifically be: the electronic device uses the second coprocessor to monitor the second voice data input by the user. The electronic device determining whether the second voice data matches the second wake-up word and, if it matches, waking up the main processor from the sleep state and starting the voice assistant through the main processor may specifically be: the electronic device uses the second coprocessor to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets a third precision; if it meets the third precision, the second coprocessor wakes up the main processor from the sleep state; the electronic device uses the main processor to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets a fourth precision; if it meets the fourth precision, the voice assistant is started through the main processor. The third precision is less than the fourth precision.
- An electronic device is provided.
- The electronic device is provided with at least two first wake-up words.
- Each of the at least two first wake-up words corresponds to a first instruction.
- The electronic device performs different functions in response to the first instructions corresponding to different first wake-up words.
- The electronic device may include: an input unit, configured to receive first voice data input by a user; a verification unit, configured to determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data;
- a wake-up unit, configured to wake up the main processor of the electronic device from the sleep state if the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data;
- and a determination execution unit, configured to determine the first instruction corresponding to the first voice data and execute the function corresponding to the first instruction through the main processor.
- The verification unit may be further configured to determine that the voiceprint feature of the first voice data matches the voiceprint feature corresponding to the at least two first wake-up words.
- The determination execution unit is specifically configured to: start the voice assistant through the main processor, determine the first instruction corresponding to the first voice data through the voice assistant, and execute the function corresponding to the first instruction through the main processor.
- The verification unit is specifically configured to determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data to a degree that satisfies the first precision; the wake-up unit is specifically configured to wake up the main processor of the electronic device from the sleep state if such a wake-up word exists; the verification unit is further configured to determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data to a degree that satisfies the second precision; the determination execution unit is specifically configured to, if such a wake-up word exists, determine the first instruction corresponding to the first voice data and execute the function corresponding to the first instruction through the main processor.
- The electronic device may further include a trigger unit, configured to trigger the electronic device to enter a predetermined mode.
- A second wake-up word is also set in the electronic device; the electronic device may further include a startup unit.
- The input unit is also used to receive second voice data input by the user; the verification unit is also used to determine whether the second voice data matches the second wake-up word; the wake-up unit is also used to wake up the main processor from the sleep state if the second voice data matches the second wake-up word.
- The startup unit is used to start the voice assistant through the main processor.
- The input unit is also used to receive third voice data input by the user through the voice assistant.
- The determination execution unit is also used to determine the second instruction corresponding to the third voice data and execute the function corresponding to the second instruction through the main processor; the second instruction is used to instruct the electronic device to enter the predetermined mode.
- The matching unit is specifically configured to determine whether the text corresponding to the second voice data matches the text of the second wake-up word; if the text corresponding to the second voice data matches the text of the second wake-up word, the second voice data matches the second wake-up word.
- Alternatively, the matching unit is specifically configured to: determine whether the text corresponding to the second voice data matches the text of the second wake-up word, and determine whether the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word; if the text corresponding to the second voice data matches the text of the second wake-up word, and the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word, the second voice data matches the second wake-up word.
- The matching unit is specifically configured to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets the third precision.
- The wake-up unit is specifically used to wake up the main processor from the sleep state if the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets the third precision.
- The matching unit is also specifically used to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets the fourth precision.
- The startup unit is used to start the voice assistant through the main processor if the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets the fourth precision.
- The third precision is less than the fourth precision.
- An electronic device may include a processor, a memory, and a display; the memory, the display, and the processor are coupled. The display is used to display images generated by the processor, and the memory is used to store computer program code. The processor may include a main processor, and the main processor is in a sleep state. At least two first wake-up words are provided in the electronic device, each of the at least two first wake-up words corresponds to a first instruction, and the electronic device performs different functions in response to the first instructions corresponding to different first wake-up words. The computer program code includes computer instructions; when the processor executes the computer instructions, the processor is configured to: receive first voice data input by the user; determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data; and, if such a wake-up word exists, wake up the main processor from the sleep state, determine the first instruction corresponding to the first voice data, and execute the function corresponding to the first instruction through the main processor.
- The processor is further configured to determine that the voiceprint feature of the first voice data matches the voiceprint feature corresponding to the at least two first wake-up words.
- The processor being configured to wake up the main processor from the sleep state, determine the first instruction corresponding to the first voice data, and execute the function corresponding to the first instruction through the main processor is specifically: the processor is configured to wake up the main processor from the sleep state, start the voice assistant through the main processor, determine the first instruction corresponding to the first voice data through the voice assistant, and execute the function corresponding to the first instruction through the main processor.
- The processor further includes a first coprocessor. The processor being configured to receive the first voice data input by the user is specifically: the first coprocessor is configured to monitor the first voice data input by the user.
- The processor being configured to determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data, and to wake up the main processor from the sleep state if such a wake-up word exists, is specifically: the first coprocessor is configured to determine whether such a wake-up word exists among the at least two first wake-up words and, if it exists, to wake up the main processor from the sleep state.
- The processor further includes a first coprocessor. The processor being configured to receive the first voice data input by the user is specifically: the first coprocessor is configured to monitor the first voice data input by the user.
- The processor being configured to determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data, and to wake up the main processor from the sleep state if such a wake-up word exists, is specifically: the first coprocessor is configured to determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data to a degree that satisfies the first precision and, if such a wake-up word exists, to wake up the main processor from the sleep state.
- The processor being configured to determine the first instruction corresponding to the first voice data and execute the function corresponding to the first instruction through the main processor is specifically: the main processor is configured to determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data to a degree that satisfies the second precision and, if such a wake-up word exists, to determine the first instruction corresponding to the first voice data and execute the function corresponding to the first instruction. The first precision is less than the second precision.
- The processor is further configured to trigger the electronic device to enter a predetermined mode.
- The processor further includes a second coprocessor; the second coprocessor is used to monitor voice data before the electronic device enters the predetermined mode.
- A second wake-up word is also set in the electronic device. The processor being further configured to trigger the electronic device to enter the predetermined mode is specifically: the processor is further configured to receive second voice data input by the user; determine whether the second voice data matches the second wake-up word; if the second voice data matches the second wake-up word, wake up the main processor from the sleep state and start the voice assistant through the main processor; receive third voice data input by the user through the voice assistant, determine the second instruction corresponding to the third voice data, and execute the function corresponding to the second instruction through the main processor, where the second instruction is used to instruct the electronic device to enter the predetermined mode.
- The processor being configured to determine whether the second voice data matches the second wake-up word is specifically: the processor is configured to determine whether the text corresponding to the second voice data matches the text of the second wake-up word; if the text corresponding to the second voice data matches the text of the second wake-up word, the second voice data matches the second wake-up word.
- Alternatively, the processor being configured to determine whether the second voice data matches the second wake-up word is specifically: the processor is configured to determine whether the text corresponding to the second voice data matches the text of the second wake-up word, and determine whether the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word; if the text corresponding to the second voice data matches the text of the second wake-up word, and the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word, the second voice data matches the second wake-up word.
- The processor further includes a second coprocessor. The processor being further configured to receive the second voice data input by the user is specifically: the second coprocessor is configured to monitor the second voice data input by the user. The processor is also used to determine whether the second voice data matches the second wake-up word and, if the second voice data matches the second wake-up word, to wake up the main processor from the sleep state and start the voice assistant through the main processor.
- Specifically, the second coprocessor is also used to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets the third precision and, if it does, to wake up the main processor from the sleep state; the main processor is also used to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets the fourth precision and, if it does, to start the voice assistant. The third precision is less than the fourth precision.
- A computer storage medium includes computer instructions.
- When the computer instructions run on an electronic device, the electronic device executes the method for triggering an electronic device to perform a function according to the first aspect or any possible implementation of the first aspect.
- A fifth aspect of this embodiment provides a computer program product that, when run on a computer, causes the computer to execute the method for triggering an electronic device to perform a function according to the first aspect or any possible implementation of the first aspect.
- FIG. 1 is a schematic structural diagram of an electronic device according to this embodiment
- FIG. 2 is a block diagram of a software structure of an electronic device provided by this embodiment
- FIG. 3 is a schematic flowchart of a method for triggering an electronic device to perform a function according to this embodiment
- FIG. 4 is a schematic diagram of some graphical user interfaces displayed on the electronic device in this embodiment.
- FIG. 5 is a schematic flowchart of another method for triggering an electronic device to perform a function according to this embodiment
- FIG. 6 is a schematic diagram of some other graphical user interfaces displayed on the electronic device in this embodiment.
- FIG. 7 is a schematic diagram of some other graphical user interfaces displayed on the electronic device in this embodiment.
- FIG. 8 is a schematic diagram of some other graphical user interfaces displayed on the electronic device in this embodiment.
- FIG. 9 is a schematic structural diagram of another electronic device according to this embodiment.
- FIG. 10 is a schematic structural diagram of yet another electronic device provided by this embodiment.
- The terms "first" and "second" are used for description purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features.
- A feature defined as "first" or "second" may explicitly or implicitly include one or more of such features.
- "Plurality" means two or more.
- The method for triggering an electronic device to perform a function enables the electronic device to provide a voice control service without requiring the user to input voice data multiple times: the electronic device can be triggered to perform the corresponding function by a single voice input.
- This improves the use efficiency of the electronic device, makes the interaction between the electronic device and the user efficient, and improves the user experience.
- The electronic device described in this embodiment may be a mobile phone, a tablet computer, a desktop, a laptop, a handheld computer, a notebook computer, a personal computer (PC), a netbook, a cellular phone, a personal digital assistant (PDA), a wearable device (such as a smart watch), a smart home device, an in-vehicle computer, or the like. This embodiment does not limit the specific form of the device.
- FIG. 1 shows a schematic structural diagram of an electronic device 100 provided by this embodiment.
- The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone jack 170D, a sensor module 180, a key 190, a motor 191, an indicator 192, a camera 193, a display 194, a subscriber identification module (SIM) card interface 195, etc.
- The sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
- the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 100.
- the electronic device 100 may include more or fewer components than illustrated, or combine certain components, or split certain components, or arrange different components.
- the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
- the processor 110 may include one or more processing units.
- The processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU).
- different processing units may be independent devices, or may be integrated in one or more processors.
- the processor 110 may include a DSP, such as the first DSP.
- One or more first wake-up words may be set in the electronic device 100.
- each of the multiple first wake-up words corresponds to one instruction, and the electronic device 100 performs different functions according to instructions corresponding to different first wake-up words.
- the first DSP can monitor the voice data in real time through the microphone 170C of the electronic device 100.
- The first DSP can perform text verification of a first precision on the monitored voice data. If the text verification of the first precision passes, the first DSP can wake up the AP and notify the AP to perform text verification of a second precision on the received voice data.
- The first precision is lower than the second precision.
- the AP determines that the voice data matches the first wake-up word.
- the electronic device 100 may determine the instruction corresponding to the voice data, and trigger the electronic device 100 to execute the corresponding function according to the instruction through the AP.
- the electronic device may determine the instruction corresponding to the voice data according to the predefined correspondence between the first wake-up word and the instruction.
- the electronic device 100 can wake up the voice assistant in the electronic device 100, and perform semantic analysis on the text of the voice data through the voice assistant to determine the instruction corresponding to the voice data, thereby triggering the electronic device 100 to perform the function corresponding to the instruction.
- Only one of the text verification of the first precision and the text verification of the second precision may be performed, or both may be performed.
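- The two-stage flow described above (a coarse check on the always-on first DSP, a stricter check on the AP, then a lookup of the instruction bound to the matched first wake-up word) can be illustrated with a minimal Python sketch. This is not the embodiment's implementation: real text verification operates on acoustic features, whereas the sketch substitutes string similarity on already-recognized text. The wake words, instruction names, and both thresholds are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical table: each first wake-up word corresponds to one instruction.
WAKE_WORD_INSTRUCTIONS = {
    "play music": "START_MUSIC_PLAYER",
    "take a photo": "OPEN_CAMERA",
}

FIRST_PRECISION = 0.6   # assumed coarse threshold used by the first DSP
SECOND_PRECISION = 0.9  # assumed strict threshold used by the AP


def similarity(text, wake_word):
    """Return a 0..1 similarity score between recognized text and a wake word."""
    return SequenceMatcher(None, text.lower(), wake_word.lower()).ratio()


def dsp_text_check(text):
    """Low-precision check, standing in for the first DSP's verification."""
    return any(similarity(text, w) >= FIRST_PRECISION
               for w in WAKE_WORD_INSTRUCTIONS)


def ap_text_check(text):
    """High-precision check on the AP; returns the matched wake word or None."""
    best = max(WAKE_WORD_INSTRUCTIONS, key=lambda w: similarity(text, w))
    return best if similarity(text, best) >= SECOND_PRECISION else None


def handle_voice_text(text):
    """Return the instruction to execute, or None if verification fails."""
    if not dsp_text_check(text):
        return None  # the AP is never woken up
    wake_word = ap_text_check(text)
    if wake_word is None:
        return None  # the AP rejects the utterance
    return WAKE_WORD_INSTRUCTIONS[wake_word]
```

- Because the first check is cheap and permissive, the AP is woken only for plausible candidates; the stricter second check then filters out near misses before any function is triggered.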
- the first wake word may be a predefined wake word.
- the first wake-up word may also be a user-defined wake-up word. If the first wake-up word is a user-defined wake-up word, optionally, after the AP receives the voice data, it can also perform voiceprint verification on the voice data. When both the text verification and the voiceprint verification pass, the AP determines that the voice data matches the first wake-up word.
- the above-mentioned first DSP may be a DSP with larger memory and higher processing performance.
- the first DSP may be a high-fidelity (HIFI) DSP provided in a system on chip (System On Chip, SOC).
- the first DSP may also be a codec DSP (codec DSP) provided outside the SOC.
- the processor 110 may further include another DSP, such as a second DSP.
- a second wake word can also be set in the electronic device.
- the second DSP can monitor the voice data in real time through the microphone 170C of the electronic device 100.
- The second DSP can perform text verification of a third precision on the monitored voice data. If the text verification of the third precision passes, the second DSP can transmit the voice data to the AP.
- the AP may perform fourth-precision text verification and voiceprint verification on the voice data (where voiceprint verification is an optional verification operation).
- the third precision is less than the fourth precision.
- the third precision may be the same as the first precision, or may be different from the first precision.
- the fourth precision may be the same as the second precision, or may be different from the second precision.
- the AP determines that the voice data matches the second wake-up word.
- the electronic device 100 can turn on the voice assistant. After the voice assistant is turned on, the electronic device 100 can receive a voice command input by the user through the voice assistant, so as to trigger the electronic device 100 to perform the corresponding function.
- the above-mentioned second DSP may be a DSP with smaller memory and lower processing performance.
- the processor 110 includes two DSPs, such as the above-mentioned first DSP and second DSP.
- the microphone 170C of the electronic device 100 only establishes a path with one of the DSPs at the same time, so as to transmit the received voice data to the corresponding DSP.
- the microphone 170C of the electronic device 100 only establishes a path with the first DSP. If the microphone 170C collects voice data input by the user, the collected voice data is transmitted to the first DSP through the established channel, so that the first DSP can perform subsequent processing.
- the electronic device 100 switches the microphone 170C path from the first DSP to the second DSP, that is, the microphone 170C only establishes a path with the second DSP. If the microphone 170C collects voice data input by the user, the collected voice data is transmitted to the second DSP through the established channel, so that the second DSP can perform subsequent processing.
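- The exclusive microphone path described above can be modeled with a short Python sketch. The class and the DSP names are hypothetical; the sketch only shows that voice data reaches exactly one DSP at a time and that switching the path redirects later data. It does not model what triggers the switch.

```python
class MicrophoneRouter:
    """Routes microphone frames to exactly one DSP at a time (hypothetical model)."""

    def __init__(self):
        # Initially the microphone path is established with the first DSP only.
        self.active = "first_dsp"
        self.received = {"first_dsp": [], "second_dsp": []}

    def switch_to(self, dsp_name):
        """Tear down the current path and establish it with another DSP."""
        if dsp_name not in self.received:
            raise ValueError(f"unknown DSP: {dsp_name}")
        self.active = dsp_name

    def on_voice_data(self, frame):
        """Deliver a collected frame to the DSP that currently owns the path."""
        self.received[self.active].append(frame)
        return self.active
```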
- the controller may be the nerve center and command center of the electronic device 100.
- the controller can generate the operation control signal according to the instruction operation code and the timing signal to complete the control of fetching instructions and executing instructions.
- the processor 110 may also be provided with a memory for storing instructions and data.
- the memory in the processor 110 is a cache memory.
- the memory may store instructions or data that the processor 110 has just used or recycled. If the processor 110 needs to use the instruction or data again, it can be directly called from the memory. The repeated access is avoided, and the waiting time of the processor 110 is reduced, thereby improving the efficiency of the system.
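- The benefit of the cache memory can be sketched in Python as follows. The class name and the dictionary used as slower backing storage are assumptions for illustration; the point is only that a repeated access is served from the cache without another slow read.

```python
class CachedMemory:
    """Toy model of a processor-side cache in front of slower storage."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # slower memory, modeled as a dict
        self.cache = {}                     # recently used instructions or data
        self.slow_reads = 0                 # number of accesses that missed the cache

    def read(self, address):
        if address in self.cache:
            return self.cache[address]      # hit: no waiting on slow memory
        self.slow_reads += 1                # miss: fetch once, then keep a copy
        value = self.backing_store[address]
        self.cache[address] = value
        return value
```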
- the processor 110 may include one or more interfaces.
- The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
- The I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL).
- the processor 110 may include multiple sets of I2C buses.
- the processor 110 may be coupled to the touch sensor 180K, the charger, the flash, the camera 193, etc. through different I2C bus interfaces.
- the processor 110 may couple the touch sensor 180K through the I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to realize the touch function of the electronic device 100.
- the I2S interface can be used for audio communication.
- the processor 110 may include multiple sets of I2S buses.
- the processor 110 may be coupled to the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170.
- the audio module 170 can transmit audio signals to the wireless communication module 160 through the I2S interface, to realize the function of answering the phone call through the Bluetooth headset.
- the PCM interface can also be used for audio communication, sampling, quantizing and encoding analog signals.
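- The three PCM steps named above (sampling, quantizing, and encoding) can be sketched in Python. The function name, the 16-bit sample width, and the little-endian byte order are assumptions made for illustration; the embodiment does not specify them.

```python
import struct

MAX_LEVEL = 32767  # largest magnitude used for a signed 16-bit sample (assumed width)


def pcm_encode(analog_signal, sample_rate_hz, num_samples):
    """Sample a continuous signal, quantize each sample, and pack it as 16-bit PCM.

    analog_signal is a function of time in seconds returning a value in [-1, 1].
    """
    quantized = []
    for i in range(num_samples):
        t = i / sample_rate_hz                         # sampling instant
        value = max(-1.0, min(1.0, analog_signal(t)))  # clip out-of-range input
        quantized.append(round(value * MAX_LEVEL))     # quantize to an integer level
    # Encode: little-endian signed 16-bit integers.
    return struct.pack(f"<{len(quantized)}h", *quantized)
```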
- the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface.
- the audio module 170 can also transmit audio signals to the wireless communication module 160 through the PCM interface to realize the function of answering the call through the Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
- the UART interface is a universal serial data bus used for asynchronous communication.
- the bus may be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication.
- the UART interface is generally used to connect the processor 110 and the wireless communication module 160.
- the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to implement the Bluetooth function.
- the audio module 170 may transmit audio signals to the wireless communication module 160 through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
- the MIPI interface can be used to connect the processor 110 to peripheral devices such as the display screen 194 and the camera 193.
- MIPI interface includes camera serial interface (camera serial interface, CSI), display serial interface (display serial interface, DSI) and so on.
- the processor 110 and the camera 193 communicate through a CSI interface to implement the shooting function of the electronic device 100.
- the processor 110 and the display screen 194 communicate through the DSI interface to realize the display function of the electronic device 100.
- the GPIO interface can be configured via software.
- the GPIO interface can be configured as a control signal or a data signal.
- the GPIO interface may be used to connect the processor 110 to the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and the like.
- GPIO interface can also be configured as I2C interface, I2S interface, UART interface, MIPI interface, etc.
- the USB interface 130 is an interface that conforms to the USB standard, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, etc.
- the USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transfer data between the electronic device 100 and peripheral devices. It can also be used to connect headphones and play audio through the headphones.
- the interface can also be used to connect other electronic devices, such as AR devices.
- the interface connection relationship between the modules illustrated in this embodiment is only a schematic description, and does not constitute a limitation on the structure of the electronic device 100.
- the electronic device 100 may also use different interface connection methods in the foregoing embodiments, or a combination of multiple interface connection methods.
- the charging management module 140 is used to receive charging input from the charger.
- the charger can be a wireless charger or a wired charger.
- the charging management module 140 may receive the charging input of the wired charger through the USB interface 130.
- the charging management module 140 may receive wireless charging input through the wireless charging coil of the electronic device 100. While the charging management module 140 charges the battery 142, it can also supply power to the electronic device through the power management module 141.
- the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
- the power management module 141 receives input from the battery 142 and / or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160.
- the power management module 141 can also be used to monitor battery capacity, battery cycle times, battery health status (leakage, impedance) and other parameters.
- the power management module 141 may also be disposed in the processor 110.
- the power management module 141 and the charging management module 140 may also be set in the same device.
- the wireless communication function of the electronic device 100 can be realized by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, and the baseband processor.
- Antenna 1 and antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in the electronic device 100 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
- the antenna 1 can be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
- the mobile communication module 150 can provide a wireless communication solution including 2G / 3G / 4G / 5G and the like applied to the electronic device 100.
- the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), and the like.
- the mobile communication module 150 can receive the electromagnetic wave from the antenna 1, filter and amplify the received electromagnetic wave, and transmit it to the modem processor for demodulation.
- the mobile communication module 150 can also amplify the signal modulated by the modulation and demodulation processor and convert it to electromagnetic wave radiation through the antenna 1.
- at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110.
- at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be provided in the same device.
- the modem processor may include a modulator and a demodulator.
- the modulator is used to modulate the low-frequency baseband signal to be transmitted into a high-frequency signal.
- the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal.
- the demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
- the low-frequency baseband signal is processed by the baseband processor and then passed to the application processor.
- the application processor outputs a sound signal through an audio device (not limited to a speaker 170A, a receiver 170B, etc.), or displays an image or video through a display screen 194.
- the modem processor may be an independent device.
- the modem processor may be independent of the processor 110, and may be set in the same device as the mobile communication module 150 or other functional modules.
- The wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology.
- the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
- the wireless communication module 160 receives the electromagnetic wave via the antenna 2, frequency-modulates and filters the electromagnetic wave signal, and sends the processed signal to the processor 110.
- the wireless communication module 160 may also receive the signal to be transmitted from the processor 110, frequency-modulate it, amplify it, and convert it to electromagnetic waves through the antenna 2 to radiate it out.
- the antenna 1 of the electronic device 100 and the mobile communication module 150 are coupled, and the antenna 2 and the wireless communication module 160 are coupled so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
- The wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology.
- The GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or a satellite-based augmentation system (SBAS).
- the electronic device 100 realizes a display function through a GPU, a display screen 194, and an application processor.
- the GPU is a microprocessor for image processing, connecting the display screen 194 and the application processor.
- the GPU is used to perform mathematical and geometric calculations, and is used for graphics rendering.
- the processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
- the display screen 194 is used to display images, videos and the like.
- the display screen 194 includes a display panel.
- The display panel may use a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro OLED, a quantum dot light-emitting diode (QLED), or the like.
- the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
- the electronic device 100 can realize a shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
- the ISP processes the data fed back by the camera 193. For example, when taking a picture, the shutter is opened, the light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing, and converts it into an image visible to the naked eye.
- The ISP can also algorithmically optimize the noise, brightness, and skin tone of the image. The ISP can also optimize parameters of the shooting scene such as exposure and color temperature. In some embodiments, the ISP may be set in the camera 193.
- the camera 193 is used to capture still images or videos.
- the object generates an optical image through the lens and projects it onto the photosensitive element.
- The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
- the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
- the ISP outputs the digital image signal to the DSP for processing.
- DSP converts digital image signals into standard RGB, YUV and other image signals.
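- As an illustration of converting between the image formats mentioned above, the following Python sketch converts one RGB pixel to YUV. The embodiment does not state which conversion matrix is used; the sketch assumes the widely used BT.601 luma weights and analog YUV scaling factors, and the function name is hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using BT.601 weights (an assumed choice)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luma: weighted sum of R, G, B
    u = 0.492 * (b - y)                    # blue-difference chroma
    v = 0.877 * (r - y)                    # red-difference chroma
    return y, u, v
```

- A gray pixel (equal R, G, and B) yields chroma values close to zero, which is a quick sanity check for any such conversion.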
- the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
- The digital signal processor is used to process digital signals. In addition to digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform on the energy at that frequency point.
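- The idea of evaluating the energy at one frequency point with a Fourier transform can be sketched in Python by computing a single term of the discrete Fourier transform. The function name and the normalization by the number of samples are assumptions; a real DSP would typically use an optimized routine.

```python
import cmath
import math


def energy_at_frequency(samples, sample_rate_hz, frequency_hz):
    """Return the energy of `samples` at one frequency via a single DFT term."""
    n = len(samples)
    coefficient = sum(
        x * cmath.exp(-2j * math.pi * frequency_hz * i / sample_rate_hz)
        for i, x in enumerate(samples)
    )
    return abs(coefficient) ** 2 / n


# Example input: a 1 kHz cosine sampled at 8 kHz for 64 samples (8 full cycles).
tone = [math.cos(2 * math.pi * 1000 * i / 8000) for i in range(64)]
```

- For this example tone, the energy at 1000 Hz is large, while the energy at 2000 Hz is close to zero.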
- Video codec is used to compress or decompress digital video.
- the electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record videos in various encoding formats, for example: moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, etc.
- NPU is a neural-network (NN) computing processor.
- the NPU can realize applications such as intelligent recognition of the electronic device 100, such as image recognition, face recognition, voice recognition, and text understanding.
- the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
- the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function. For example, save music, video and other files in an external memory card.
- the internal memory 121 may be used to store computer executable program code, where the executable program code includes instructions.
- the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121.
- the internal memory 121 may include a storage program area and a storage data area.
- The storage program area may store an operating system, application programs required for at least one function (such as a sound playback function or an image playback function), and so on.
- the storage data area may store data (such as audio data, phone book, etc.) created during use of the electronic device 100 and the like.
- the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one disk storage device, a flash memory device, a universal flash memory (universal flash storage, UFS), and so on.
- the electronic device 100 may implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone interface 170D, and an application processor. For example, music playback, recording, etc.
- the audio module 170 is used to convert digital audio information into analog audio signal output, and also used to convert analog audio input into digital audio signal.
- the audio module 170 can also be used to encode and decode audio signals.
- the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.
- The speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
- The user can listen to music or to a hands-free call through the speaker 170A of the electronic device 100.
- The receiver 170B, also known as an "earpiece", is used to convert audio electrical signals into sound signals.
- The voice can be heard by bringing the receiver 170B close to the ear.
- The microphone 170C, also known as a "mic", is used to convert sound signals into electrical signals.
- The user can speak with the mouth close to the microphone 170C to input a sound signal to the microphone 170C.
- the electronic device 100 may be provided with at least one microphone 170C. In other embodiments, the electronic device 100 may be provided with two microphones 170C. In addition to collecting sound signals, it may also implement a noise reduction function. In other embodiments, the electronic device 100 may also be provided with three, four, or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and implement directional recording functions.
- The headphone interface 170D is used to connect wired headphones.
- The headphone interface 170D may be the USB interface 130, a 3.5mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
- the pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
- the pressure sensor 180A may be provided on the display screen 194.
- The capacitive pressure sensor may include at least two parallel plates made of conductive material. When force is applied to the pressure sensor 180A, the capacitance between the electrodes changes.
- the electronic device 100 determines the intensity of the pressure according to the change in capacitance.
- the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
- the electronic device 100 may also calculate the touched position based on the detection signal of the pressure sensor 180A.
- touch operations that act on the same touch position but have different touch operation intensities may correspond to different operation instructions. For example, when a touch operation with a touch operation intensity less than the first pressure threshold acts on the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
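- The short message example above can be written as a small Python sketch. The threshold value, the target name, and the instruction names are hypothetical; only the comparison logic follows the text.

```python
FIRST_PRESSURE_THRESHOLD = 0.5  # hypothetical value in normalized pressure units


def instruction_for_touch(target, intensity):
    """Map a touch on the short message icon to an instruction by its intensity."""
    if target != "sms_icon":
        return None  # other targets are outside the scope of this sketch
    if intensity < FIRST_PRESSURE_THRESHOLD:
        return "VIEW_SMS"    # light press: view the short message
    return "CREATE_SMS"      # press at or above the threshold: create a new one
```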
- the gyro sensor 180B may be used to determine the movement posture of the electronic device 100.
- The angular velocity of the electronic device 100 around three axes (i.e., the x, y, and z axes) can be determined through the gyro sensor 180B.
- the gyro sensor 180B can be used for shooting anti-shake.
- the gyro sensor 180B detects the shaking angle of the electronic device 100, calculates the distance that the lens module needs to compensate based on the angle, and allows the lens to counteract the shaking of the electronic device 100 through reverse movement to achieve anti-shake.
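- The compensation step can be illustrated with a Python sketch. The embodiment does not give a formula; the sketch assumes a simple pinhole camera model, in which a rotation by a small angle shifts the image by roughly the focal length times the tangent of that angle, and moves the lens by the same amount in the opposite direction. The function and parameter names are hypothetical.

```python
import math


def lens_compensation_mm(shake_angle_deg, focal_length_mm):
    """Return the signed lens shift (mm) that counteracts a detected shake angle.

    Assumes a pinhole model: image shift = focal length * tan(angle).
    The result has the opposite sign to the shake (reverse movement).
    """
    image_shift = focal_length_mm * math.tan(math.radians(shake_angle_deg))
    return -image_shift
```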
- the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
- the air pressure sensor 180C is used to measure air pressure.
- the electronic device 100 calculates the altitude by using the air pressure value measured by the air pressure sensor 180C to assist positioning and navigation.
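- The altitude calculation can be sketched in Python with the commonly used international barometric formula. The embodiment does not state which formula is applied; the standard sea-level pressure of 1013.25 hPa and the function name are assumptions, and the result is only approximate when local weather deviates from the standard atmosphere.

```python
SEA_LEVEL_PRESSURE_HPA = 1013.25  # standard atmosphere reference (assumed)


def altitude_m(pressure_hpa, sea_level_hpa=SEA_LEVEL_PRESSURE_HPA):
    """Estimate altitude in meters from a measured air pressure value."""
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))
```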
- the magnetic sensor 180D includes a Hall sensor.
- The electronic device 100 can use the magnetic sensor 180D to detect the opening and closing of a flip holster.
- When the electronic device 100 has a clamshell (flip) body, it may detect the opening and closing of the flip cover according to the magnetic sensor 180D.
- Features such as automatic unlocking when the flip cover is opened can then be set according to the detected open or closed state.
- the acceleration sensor 180E can detect the magnitude of acceleration of the electronic device 100 in various directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to recognize the posture of electronic devices, and be used in applications such as horizontal and vertical screen switching and pedometers.
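- How accelerometer readings can drive horizontal and vertical screen switching can be sketched in Python. The axis convention (y along the long edge of the device) and the function names are assumptions; a real implementation would also filter noise and add hysteresis.

```python
import math


def gravity_magnitude(ax, ay, az):
    """Magnitude of the measured acceleration; about 9.81 m/s^2 when stationary."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def screen_orientation(ax, ay):
    """Portrait when gravity lies mostly along the long (y) axis, else landscape."""
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```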
- the distance sensor 180F is used to measure the distance.
- the electronic device 100 can measure the distance by infrared or laser. In some embodiments, when shooting scenes, the electronic device 100 may use the distance sensor 180F to measure distance to achieve fast focusing.
- the proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode.
- the light emitting diode may be an infrared light emitting diode.
- the electronic device 100 emits infrared light outward through the light emitting diode.
- the electronic device 100 uses a photodiode to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it may be determined that there is an object near the electronic device 100. When insufficient reflected light is detected, the electronic device 100 may determine that there is no object near the electronic device 100.
- the electronic device 100 can use the proximity light sensor 180G to detect that the user holds the electronic device 100 close to the ear to talk, so as to automatically turn off the screen to save power.
- The proximity light sensor 180G can also be used in holster mode and pocket mode to automatically unlock and lock the screen.
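- The proximity decision and the screen-off behavior during a call can be sketched in Python. The threshold value, its units, and the function names are hypothetical; the source only states that "sufficient" reflected light indicates a nearby object.

```python
REFLECTION_THRESHOLD = 100  # hypothetical photodiode reading treated as "sufficient"


def object_nearby(reflected_light):
    """True when enough infrared light is reflected back to the photodiode."""
    return reflected_light >= REFLECTION_THRESHOLD


def screen_should_be_on(in_call, reflected_light):
    """Turn the screen off only while in a call with the device held to the ear."""
    return not (in_call and object_nearby(reflected_light))
```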
- the ambient light sensor 180L is used to sense the brightness of ambient light.
- the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived brightness of the ambient light.
- the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
- the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket to prevent accidental touch.
- the fingerprint sensor 180H is used to collect fingerprints.
- The electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, application lock access, fingerprint-triggered photographing, fingerprint-based call answering, and the like.
- the temperature sensor 180J is used to detect the temperature.
- The electronic device 100 uses the temperature detected by the temperature sensor 180J to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of the processor located near the temperature sensor 180J in order to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid abnormal shutdown of the electronic device 100 due to low temperature. In some other embodiments, when the temperature is lower than yet another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown due to low temperature.
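- The temperature processing strategy can be summarized in a Python sketch. All three threshold values and the action names are hypothetical, and the sketch combines in one function behaviors that the text describes as belonging to different embodiments.

```python
HIGH_TEMP_C = 45.0        # hypothetical threshold for thermal protection
LOW_TEMP_HEAT_C = 0.0     # hypothetical threshold for heating the battery
LOW_TEMP_BOOST_C = -10.0  # hypothetical threshold for boosting the output voltage


def temperature_actions(temp_c):
    """Return the list of actions to take for a reported temperature."""
    actions = []
    if temp_c > HIGH_TEMP_C:
        actions.append("REDUCE_PROCESSOR_PERFORMANCE")
    if temp_c < LOW_TEMP_HEAT_C:
        actions.append("HEAT_BATTERY")
    if temp_c < LOW_TEMP_BOOST_C:
        actions.append("BOOST_BATTERY_OUTPUT_VOLTAGE")
    return actions
```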
- The touch sensor 180K is also known as a "touch panel".
- The touch sensor 180K may be provided on the display screen 194; together, the touch sensor 180K and the display screen 194 constitute a touchscreen.
- the touch sensor 180K is used to detect a touch operation acting on or near it.
- the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
- the visual output related to the touch operation can be provided through the display screen 194.
- the touch sensor 180K may also be disposed on the surface of the electronic device 100, which is different from the location where the display screen 194 is located.
- the bone conduction sensor 180M can acquire vibration signals.
- the bone conduction sensor 180M can acquire the vibration signal of the vibrating bone mass of the human body part.
- the bone conduction sensor 180M can also contact the pulse of the human body and receive a blood pressure beating signal.
- the bone conduction sensor 180M may also be provided in the earphone and combined into a bone conduction earphone.
- the audio module 170 may parse out a voice signal from the vibration signal of the vocal-part bone mass acquired by the bone conduction sensor 180M, so as to realize the voice function.
- the application processor may analyze the heart rate information based on the blood pressure beating signal acquired by the bone conduction sensor 180M to implement the heart rate detection function.
- the key 190 includes a power-on key, a volume key, and the like.
- the key 190 may be a mechanical key. It can also be a touch button.
- the electronic device 100 can receive key input and generate key signal input related to user settings and function control of the electronic device 100.
- the motor 191 may generate a vibration prompt.
- the motor 191 can be used for vibration notification of incoming calls and can also be used for touch vibration feedback.
- touch operations applied to different applications may correspond to different vibration feedback effects.
- the motor 191 can also produce different vibration feedback effects for different application scenarios (for example: time reminder, receiving information, alarm clock, game, etc.).
- Touch vibration feedback effect can also support customization.
- the indicator 192 may be an indicator light, which may be used to indicate a charging state, a power change, and may also be used to indicate a message, a missed call, a notification, and the like.
- the SIM card interface 195 is used to connect a SIM card.
- the SIM card can be inserted into or removed from the SIM card interface 195 to achieve contact and separation with the electronic device 100.
- the electronic device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
- the SIM card interface 195 can support Nano SIM cards, Micro SIM cards, SIM cards, etc.
- multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of the multiple cards may be the same or different.
- the SIM card interface 195 can also be compatible with different types of SIM cards.
- the SIM card interface 195 can also be compatible with external memory cards.
- the electronic device 100 interacts with the network through a SIM card to realize functions such as call and data communication.
- the electronic device 100 uses eSIM, that is, an embedded SIM card.
- the eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100.
- the software system of the electronic device 100 may adopt a layered architecture, event-driven architecture, micro-core architecture, micro-service architecture, or cloud architecture.
- This embodiment takes a layered architecture Android system as an example to exemplarily explain the software structure of the electronic device 100.
- FIG. 2 is a software structural block diagram of an electronic device 100 provided by this embodiment.
- the layered architecture divides the software into several layers, and each layer has a clear role and division of labor.
- the layers communicate with each other through software interfaces.
- the Android system is divided into four layers, from top to bottom are the application layer, the application framework layer, the Android runtime and the system library, and the kernel layer.
- the application layer may include a series of application packages.
- the application package may include applications such as voice assistant, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, and short message.
- the application framework layer provides an application programming interface (API) and a programming framework for applications at the application layer.
- the application framework layer includes some predefined functions.
- the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and so on.
- the window manager is used to manage window programs.
- the window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, take screenshots, etc.
- Content providers are used to store and retrieve data and make it accessible to applications.
- the data may include videos, images, audio, calls made and received, browsing history and bookmarks, phone book, etc.
- the view system includes visual controls, such as controls for displaying text and controls for displaying pictures.
- the view system can be used to build applications.
- the display interface can be composed of one or more views.
- a display interface including an SMS notification icon may include a view that displays text and a view that displays pictures.
- the phone manager is used to provide the communication function of the electronic device 100. For example, the management of the call state (including connection, hang up, etc.).
- the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.
- the notification manager enables applications to display notification information in the status bar, which can be used to convey notification-type messages, and can disappear after a short stay without user interaction.
- the notification manager is used to notify the completion of downloading, message reminders, etc.
- the notification manager can also be a notification that appears in the status bar at the top of the system in the form of a chart or scroll bar text, such as a notification of an application running in the background, or a notification that appears on the screen in the form of a dialog window.
- the text message is displayed in the status bar, a prompt sound is emitted, the electronic device vibrates, and the indicator light flashes.
- Android Runtime includes core library and virtual machine. Android runtime is responsible for the scheduling and management of the Android system.
- the core library contains two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
- the application layer and the application framework layer run in the virtual machine.
- the virtual machine executes the Java files of the application layer and the application framework layer as binary files.
- the virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
- the system library may include multiple functional modules, for example: a surface manager, media libraries, a 3D graphics processing library (for example, OpenGL ES), and a 2D graphics engine (for example, SGL).
- the surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
- the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
- the media library can support multiple audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
- the 3D graphics processing library is used to realize 3D graphics drawing, image rendering, synthesis, and layer processing.
- the 2D graphics engine is a drawing engine for 2D drawing.
- the kernel layer is the layer between hardware and software.
- the kernel layer contains at least the display driver, camera driver, audio driver, and sensor driver.
- FIG. 3 is a schematic flowchart of a method for triggering an electronic device to perform a function according to this embodiment. As shown in FIG. 3, the method may include the following S301-S304.
- first wake-up words are set in the electronic device.
- the first wake-up word may be predefined or user-defined. If the first wake-up word is user-defined, then the process of registering the first wake-up word in the electronic device can refer to the specific description in the conventional technology, which will not be repeated here in this embodiment.
- the first wake-up word may be a voice command frequently used by the user, such as "mute", "answer a call", and so on.
- each first wake-up word set in the electronic device may correspond to a first instruction.
- the functions performed by the electronic device in response to the first instructions corresponding to different first wake-up words are different.
- the following description takes the case where at least two first wake-up words are set in the electronic device as an example.
- S301: the electronic device receives the first voice data input by the user.
- the first DSP of the electronic device can monitor in real time whether the user has voice data input through the microphone.
- when the user wants to trigger the electronic device to perform a certain function by inputting voice data, the user can speak near the microphone of the mobile phone so that the sound is input to the microphone.
- the first DSP of the electronic device can then monitor the corresponding voice data, such as the first voice data, through the microphone.
- the first DSP of the electronic device can monitor the voice data input by the user.
- the AP of the electronic device may be in a sleep state or in a non-sleep state.
- S302: the electronic device determines whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data.
- after the electronic device receives the first voice data, it can perform text verification on the first voice data; that is, it determines whether the at least two first wake-up words set in the electronic device include a wake-up word whose text matches the text corresponding to the first voice data, so as to determine whether the received first voice data is a first wake-up word set in the electronic device.
- if the text verification passes, it indicates that the received first voice data is a first wake-up word set in the electronic device, and the electronic device may execute S303. If the text verification fails, it indicates that the received first voice data is not a first wake-up word set in the electronic device, and the electronic device may execute S304.
- the electronic device performing text verification on the first voice data may specifically include: the first DSP of the electronic device performs text verification on the first voice data, and/or the AP of the electronic device performs text verification on the first voice data.
- if both the first DSP and the AP perform text verification on the first voice data, the precision of the text verification performed by the first DSP and that performed by the AP may differ.
- the first DSP performs text verification on the first voice data with a first precision
- the AP performs text verification on the first voice data with a second precision
- the first precision is less than the second precision.
- the following takes the case where the first DSP performs text verification on the first voice data with the first precision, the AP performs text verification on the first voice data with the second precision, and the AP of the electronic device is in a sleep state as an example to describe the text verification process in detail.
- after the first DSP of the electronic device detects the first voice data, the first DSP may perform text verification on the first voice data with the first precision (low precision). That is, the first DSP can determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data to a degree that satisfies the first precision.
- if the first DSP determines that such a wake-up word exists, that is, the first-precision text verification passes, the first DSP can wake up the AP of the electronic device and transmit the monitored first voice data to the AP. If the first DSP determines that no such wake-up word exists, it indicates that the received first voice data is not a first wake-up word set in the electronic device, and the electronic device may execute S304.
- after receiving the first voice data, the AP of the electronic device may perform text verification with the second precision (high precision) on the first voice data.
- that is, the AP can determine whether the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data to a degree that satisfies the second precision. If so, that is, the second-precision text verification passes, it indicates that the received first voice data is a first wake-up word set in the electronic device, and the electronic device may execute S303.
- if the AP determines that no wake-up word among the at least two first wake-up words matches the text corresponding to the first voice data to a degree that satisfies the second precision, that is, the second-precision text verification fails, it indicates that the received first voice data is not a first wake-up word set in the electronic device, and the electronic device may execute S304.
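The two-stage text verification (low precision on the first DSP, then high precision on the AP) can be illustrated with a short sketch. Everything concrete here is an assumption made for illustration: the source does not say how the matching degree is computed, so `difflib.SequenceMatcher` stands in for it, and the two threshold values are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple

FIRST_WAKE_WORDS = ["mute", "answer a call", "turn down the volume"]
FIRST_PRECISION = 0.6   # hypothetical low-precision threshold (first DSP)
SECOND_PRECISION = 0.9  # hypothetical high-precision threshold (AP)


def best_match(text: str, wake_words) -> Tuple[str, float]:
    """Return the wake-up word with the highest matching degree and its score."""
    scored = [(SequenceMatcher(None, text.lower(), w).ratio(), w) for w in wake_words]
    score, word = max(scored)
    return word, score


def verify_text(text: str, wake_words=FIRST_WAKE_WORDS) -> Tuple[Optional[str], bool]:
    """Return (matched wake-up word or None, whether the AP was woken up)."""
    # Stage 1: the first DSP checks with the first (low) precision.
    _, score = best_match(text, wake_words)
    if score < FIRST_PRECISION:
        return None, False  # AP stays asleep; the device would go to S304
    # Stage 2: the AP is woken up and checks with the second (high) precision.
    word, score = best_match(text, wake_words)
    if score < SECOND_PRECISION:
        return None, True   # AP was woken, but verification failed (S304)
    return word, True       # verification passed (S303)
```

In this sketch both stages share one scoring function, so the benefit of the cascade is only structural; on a real device the DSP stage would presumably run a smaller, cheaper model than the AP stage, which is what lets the AP remain asleep for most non-matching audio.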
- S303: the electronic device determines the first instruction corresponding to the first voice data, and executes the function corresponding to the first instruction through the main processor of the electronic device.
- the electronic device may determine the first instruction corresponding to the first voice data and execute the corresponding function, thereby achieving the purpose of triggering the electronic device to perform a corresponding function by inputting voice data.
- the electronic device determining the first instruction corresponding to the first voice data may specifically be: the electronic device stores a correspondence between at least two first wake-up words set in the electronic device and the instruction. After the electronic device passes the text verification of the first voice data, the electronic device may look up the corresponding relationship to determine the first instruction corresponding to the first voice data.
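The stored correspondence between first wake-up words and instructions can be pictured as a lookup table. The following is a minimal sketch; the instruction names are hypothetical placeholders, and only the wake-up word phrases come from the examples in this description.

```python
from typing import Optional

# Hypothetical instruction identifiers; two phrasings may map to one instruction.
WAKE_WORD_TO_INSTRUCTION = {
    "mute": "SET_MUTE",
    "unmute": "CLEAR_MUTE",
    "turn down the volume": "VOLUME_DOWN",
    "lower volume": "VOLUME_DOWN",
    "answer the phone": "ANSWER_CALL",
}


def instruction_for(wake_word: str) -> Optional[str]:
    """Look up the first instruction for a wake-up word that passed verification."""
    return WAKE_WORD_TO_INSTRUCTION.get(wake_word)
```

A table lookup like this avoids running semantic analysis for fixed phrases; the semantic-analysis alternative described in this embodiment would replace the dictionary with a call into the analysis module.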
- the electronic device may perform semantic analysis on the text of the first voice data to determine the first instruction corresponding to the first voice data.
- the semantic analysis function of the electronic device may be implemented in the electronic device as a separate module, or may be implemented by a module integrated in an application program.
- for example, the semantic analysis function is implemented by a module integrated in the voice assistant. Then, when the electronic device determines that the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data, the electronic device may start the voice assistant, perform semantic analysis on the text corresponding to the first voice data through the voice assistant to determine the first instruction corresponding to the first voice data, and trigger the electronic device to execute the function corresponding to the first instruction through the main processor.
- the first voice data can not only be used as a wake-up word to wake up the voice assistant, but also as a voice command to trigger the electronic device to perform the corresponding function.
- the above voice assistant may be an application (Application, APP) installed in the electronic device.
- the voice assistant may be a system application or a third-party application.
- system applications, also known as embedded applications, are application programs provided as part of the implementation of the electronic device.
- a third-party application, also known as a downloadable application, is an application that can provide its own Internet Protocol Multimedia Subsystem (IMS) connection. It can be pre-installed in the electronic device, or downloaded and installed in the electronic device by the user.
- S304: the electronic device deletes the first voice data.
- the electronic device may determine the instruction corresponding to the input voice data after the received voice data passes the verification, so as to trigger the electronic device to perform the function corresponding to the instruction. It can be seen that, as long as no other software or hardware of the electronic device is using the microphone to collect voice data (even if the electronic device is in a black-screen state and the AP is in a sleep state), the user does not need to first input a wake-up word to start the voice assistant and then input a voice command; instead, inputting a single piece of voice data is enough to trigger the electronic device to perform the corresponding function. In this way, the use efficiency of the electronic device is improved, efficient interaction between the electronic device and the user is realized, and the user experience is improved.
- after the text verification of the first voice data passes, voiceprint verification may also be performed. That is, after the text verification is passed, the AP can determine whether the voiceprint feature of the first voice data matches the voiceprint feature corresponding to the at least two first wake-up words set in the electronic device. If it matches, the voiceprint verification of the first voice data passes, and the electronic device may execute the above S303.
- if it does not match, the voiceprint verification of the first voice data fails, and the electronic device may perform the above S304. That is to say, after receiving the first voice data, the electronic device may execute the above S303 only after both the text verification and the voiceprint verification of the first voice data pass. In this way, only the person who has registered the wake-up words in the electronic device can trigger the electronic device to perform the corresponding function by inputting voice data, which improves the security of the voice control service.
- the electronic device may generate the first voiceprint model according to the voice data input by the user when setting the at least two first wake-up words.
- the first voiceprint model may be used to characterize the voiceprint features of the at least two first wake words. If the electronic device determines that there is a wake-up word whose text matches the text corresponding to the first voice data in at least two first wake-up words, that is, the text verification is passed, the AP may continue to respond to the first voice according to the first voiceprint model Perform voiceprint verification on the data.
- the voice data input by the user when setting the at least two first wake-up words may be used as an input value and input into the first voiceprint model to obtain the first voiceprint threshold.
- when the electronic device determines that the monitored first voice data has passed the text verification, the first voice data may be used as an input value and input into the first voiceprint model to obtain a voiceprint value, such as a second voiceprint value.
- the electronic device can determine whether the difference between the second voiceprint value and the first voiceprint threshold is less than a preset threshold. If the difference between the second voiceprint value and the first voiceprint threshold is less than a preset threshold, the voiceprint verification is passed. If the difference between the second voiceprint value and the first voiceprint threshold is greater than or equal to the preset threshold, the voiceprint verification fails.
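The voiceprint comparison can be sketched as follows. This is a toy illustration under stated assumptions: the real voiceprint model is not specified in the source, so `voiceprint_model` simply averages a feature vector, the "difference" is taken as an absolute difference, and `PRESET_THRESHOLD` is a hypothetical value.

```python
from statistics import mean

PRESET_THRESHOLD = 0.1  # hypothetical preset threshold


def voiceprint_model(features) -> float:
    """Toy stand-in for the first voiceprint model: feature vector -> one value."""
    return mean(features)


def enroll(enrollment_samples) -> float:
    """Derive the first voiceprint threshold from the registration recordings."""
    return mean(voiceprint_model(sample) for sample in enrollment_samples)


def voiceprint_passes(first_threshold: float, features,
                      preset: float = PRESET_THRESHOLD) -> bool:
    """Pass when the second voiceprint value is close enough to the first threshold."""
    second_value = voiceprint_model(features)
    return abs(second_value - first_threshold) < preset
```

For example, a speaker enrolled with `enroll([[0.5, 0.7], [0.6, 0.6]])` passes with features `[0.62, 0.64]` and fails with `[0.9, 0.9]` under these assumed values.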
- the first wake-up words set in the electronic device include the following commonly used voice commands: system setting commands, such as "mute", "unmute", "turn up the volume" (or "a little louder"), "turn down the volume" (or "lower volume"), "lock screen", etc.
- navigation setting commands, such as "exit navigation", "stop navigation", "switch route" (or "change to another route"), "navigate home", "navigate to company", etc.
- music setting commands, such as "previous song", "next song" (or "switch song"), "pause music" (or "pause play"), "start music" (or "start play"), "stop music" (or "stop playing"), etc.
- communication setting commands, such as "hang up the phone", "answer the phone", "view SMS", "reply SMS", "read SMS aloud", "read WeChat", "reply WeChat", etc.
- the electronic device performs semantic analysis through the voice assistant to obtain instructions corresponding to the voice data.
- for example, no other software or hardware of the electronic device is currently using the microphone to collect voice data.
- the electronic device is in a black screen state, and the AP of the electronic device is also in a sleep state.
- the user wants to trigger the electronic device to lower the volume of the electronic device by inputting voice data.
- the user can say "turn down the volume” near the phone's microphone.
- the first DSP of the electronic device can monitor the corresponding voice data "turn down the volume” through the microphone. After the first DSP detects the voice data "turn down the volume", it can perform text verification of the first precision on the voice data "turn down the volume”.
- if the first-precision text verification passes, the first DSP can wake up the AP and transmit the voice data "turn down the volume" to the AP.
- the AP can perform second-precision text verification on the voice data "turn down the volume".
- if the second-precision text verification passes, the voice assistant can be started, and the voice assistant can perform semantic analysis on the voice data "turn down the volume" to determine the instruction corresponding to it.
- the AP triggers the electronic device to lower the volume of the system according to the instruction.
- the electronic device may light the screen and display a prompt message to remind the user that the corresponding response has been made.
- the electronic device may light up the screen and display the voice assistant interface 401.
- the voice assistant interface 401 may include the text “turn down the volume” 402 corresponding to the recognized voice data input by the user.
- prompt information 403 may be displayed in the voice assistant interface 401. The prompt information 403 is used to prompt the user that the volume of the system has been turned down.
- since the first DSP has a higher processing capability, its power consumption is relatively higher. In this embodiment, in order to save the power consumption of the electronic device as much as possible while still achieving efficient interaction between the electronic device and the user, the voice data collected by the microphone may be processed using the first DSP only in a specific scenario.
- that is to say, the above S301-S304 can be executed only in a specific scenario to achieve the purpose of triggering the electronic device to perform the corresponding function by inputting voice data. That is, as shown in FIG. 5, before the above S301, the method of triggering the electronic device to perform a function may further include S501.
- S501: the electronic device enters a predetermined mode.
- the predetermined mode may be a driving mode, a home mode, and the like.
- after the electronic device enters the predetermined mode, the microphone of the electronic device can transmit the collected voice data to the first DSP of the electronic device, so that the first DSP can process the voice data.
- the electronic device may automatically enter a predetermined mode under certain specific circumstances.
- the electronic device may automatically enter the predetermined mode at certain specific time periods or at certain specific locations.
- the electronic device can use the historical usage record to obtain the user's intention to use the electronic device at the current location or the current time, such as whether to trigger the electronic device to perform a corresponding function by inputting voice data.
- the electronic device may automatically enter a predetermined mode. That is to say, the electronic device can use the historical usage record to determine the time and / or location where the user frequently triggers the electronic device to perform the corresponding function by inputting voice data.
- the electronic device can automatically enter the predetermined mode.
- for example, the electronic device learns that the user often triggers the electronic device to perform the corresponding function by inputting voice data within the time period 19:00-20:30. Then, the electronic device can automatically enter the predetermined mode when it determines that the current system time is within 19:00-20:30. For another example, if the electronic device learns that the user often triggers the electronic device to perform the corresponding function by inputting voice data within a certain geographic location range (for example, the user's home), then the electronic device may automatically enter a predetermined mode (such as a home mode) when it determines that its current geographic location is within that range.
- the electronic device may automatically enter a predetermined mode (such as a driving mode) when it is determined that the moving speed of the electronic device is greater than a certain value.
- while driving, the user may frequently input voice data to trigger the electronic device to perform a corresponding function, such as using the voice assistant to trigger the map application of the electronic device to navigate. Therefore, when the electronic device detects that its current moving speed is greater than a certain value, it automatically enters a predetermined mode (such as a driving mode).
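The automatic triggers described above (moving speed, location, and time period) can be combined into one decision function. This is a minimal sketch: the speed threshold, the priority order among the triggers, and the returned mode names are assumptions for illustration; only the 19:00-20:30 window comes from the example in the text.

```python
from datetime import time
from typing import Optional

SPEED_THRESHOLD_KMH = 20.0                  # hypothetical "certain value"
VOICE_WINDOW = (time(19, 0), time(20, 30))  # time period from the example


def mode_to_enter(now: time, at_home: bool, speed_kmh: float) -> Optional[str]:
    """Return the predetermined mode to enter automatically, or None."""
    if speed_kmh > SPEED_THRESHOLD_KMH:
        return "driving"        # moving fast: driving mode
    if at_home:
        return "home"           # inside the learned geographic range: home mode
    start, end = VOICE_WINDOW
    if start <= now <= end:
        return "predetermined"  # inside the learned time period
    return None
```

Here `at_home` is assumed to be computed elsewhere from the device's geographic position and the range learned from the historical usage record.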
- the electronic device may enter a predetermined mode in response to the user's specific input.
- the specific input may be a user's trigger operation on a specific virtual button or physical key.
- the following takes the case where the specific input is a user's trigger operation on a specific virtual button (such as the switch button of a "predetermined mode" option) as an example.
- the electronic device includes a Settings application. As shown in FIG. 6, the electronic device can receive the user's click operation on the Settings icon. In response to that click operation, the electronic device may display the setting interface 601 shown in FIG. 6.
- the setting interface 601 may include an "airplane mode" setting option, a "WLAN" setting option, a "Bluetooth" setting option, a "mobile network" setting option, and a "predetermined mode" setting option (shown in FIG. 6 as a "driving mode" setting option by way of example).
- for the "airplane mode" option, the "WLAN" option, the "Bluetooth" option, and the "mobile network" option, please refer to the specific description in the conventional technology, which will not be repeated here in this embodiment.
- the electronic device may enter a predetermined mode, such as a driving mode, in response to a user's click operation on the switch button 602 of the "driving mode” option.
- the electronic device can exit the driving mode.
- the display effect of the switch button 602 of the “driving mode” option shown in FIG. 6 is used to indicate that the driving mode is not turned on, and the user can perform a click operation on the switch button 602 at this time to make the electronic device enter the driving mode.
- the above specific input may also be a voice command input by the user.
- the voice command may be input through a voice assistant of the electronic device.
- the second wake-up word may also be set in the electronic device.
- the second wake-up word can be used to wake up the voice assistant in the electronic device. After the user wakes up the voice assistant through the second wake-up word, he can input a voice command through the voice assistant to trigger the electronic device to enter the predetermined mode. Before the electronic device enters a predetermined mode, the electronic device can use another DSP, such as a second DSP, to monitor voice data so that the voice assistant can be woken up by the second wake-up word.
- the electronic device may use the second DSP to process the voice data collected by the microphone. Because the processing performance of the second DSP is lower than that of the first DSP and its memory is smaller than that of the first DSP, its power consumption is lower than that of the first DSP. In this way, not only can efficient interaction between the electronic device and the user be ensured when the user uses the voice assistant in a predetermined scenario, but the power consumption of the electronic device can also be saved. Moreover, when the electronic device is not in a specific scenario, the voice assistant can still be used to trigger the electronic device to perform the corresponding function, such as triggering the electronic device to enter a predetermined mode.
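The division of work between the two DSPs can be summarized in a small sketch: outside the predetermined mode, the lower-power second DSP listens only for the second wake-up word, and inside the predetermined mode the first DSP listens for the first wake-up words. The class name, the DSP labels, and the second wake-up word phrase below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

FIRST_WAKE_WORDS = ("mute", "answer a call")  # example commands from the text
SECOND_WAKE_WORD = "hello assistant"          # hypothetical phrase


@dataclass
class VoiceFrontEnd:
    in_predetermined_mode: bool = False

    def active_dsp(self) -> str:
        """Which DSP receives the voice data collected by the microphone."""
        return "first_dsp" if self.in_predetermined_mode else "second_dsp"

    def monitored_phrases(self) -> tuple:
        """Which wake-up words the active DSP is listening for."""
        if self.in_predetermined_mode:
            return FIRST_WAKE_WORDS
        return (SECOND_WAKE_WORD,)
```

Routing by mode keeps the higher-power first DSP idle unless the user is in a scenario where direct voice commands are likely.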
- the following takes the case where the voice assistant is in a sleep state as an example.
- when the user wants to input a voice command through the voice assistant to trigger the electronic device to enter a predetermined mode, the user can speak near the microphone of the mobile phone so that the sound is input to the microphone.
- the second DSP of the electronic device can monitor the voice data input by the user through the microphone, such as the second voice data.
- the electronic device can determine whether the second voice data matches the second wake-up word set in the electronic device.
- the voice assistant may be activated.
- the user may input voice data for triggering the electronic device to enter a predetermined mode through the voice assistant, such as third voice data.
- the electronic device may receive the third voice data input by the user through the voice assistant, and determine the second instruction corresponding to the third voice data.
- the second instruction may be used to instruct the electronic device to enter a predetermined mode.
- the electronic device can execute the function corresponding to the second instruction, that is, enter the predetermined mode.
- the electronic device determining whether the second voice data matches the second wake-up word set in the electronic device may specifically be: the electronic device performs text verification on the second voice data. If the text corresponding to the second voice data matches the text of the second wake-up word, the text verification passes, and the second voice data matches the second wake-up word. If the text corresponding to the second voice data does not match the text of the second wake-up word, the text verification fails, and the second voice data does not match the second wake-up word.
- alternatively, the electronic device determining whether the second voice data matches the second wake-up word set in the electronic device may specifically be: the electronic device performs text verification and voiceprint verification on the second voice data.
- if the text corresponding to the second voice data matches the text of the second wake-up word and the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word, the text verification and the voiceprint verification pass, and the second voice data matches the second wake-up word. If the text corresponding to the second voice data does not match the text of the second wake-up word, or the voiceprint feature of the second voice data does not match the voiceprint feature corresponding to the second wake-up word, the verification fails, and the second voice data does not match the second wake-up word.
- the electronic device performing text verification on the second voice data may include: the second DSP of the electronic device performing text verification on the second voice data with a third precision, and / or the AP of the electronic device performing text verification on the second voice data with a fourth precision.
- the third precision is less than the fourth precision.
- the third precision may be the same as the first precision described above, or may be different from the first precision.
- the fourth precision may be the same as the above-mentioned second precision, or may be different from the above-mentioned second precision.
- The following describes this in detail, taking as an example the case where the electronic device determining whether the second voice data matches the second wake-up word set in the electronic device specifically includes: the electronic device performs text verification and voiceprint verification on the second voice data.
- In this example, the electronic device performing text verification on the second voice data includes: the second DSP performing text verification on the second voice data with the third precision, and the AP performing text verification on the second voice data with the fourth precision.
- the second DSP of the electronic device can determine whether the matching degree of the text corresponding to the second voice data and the text of the second wake-up word meets the third precision.
- If the matching degree meets the third precision, the second voice data may be transmitted to the AP.
- the AP may determine whether the matching degree between the text corresponding to the second voice data and the text of the second wake-up word meets the fourth precision. If the matching degree between the text corresponding to the second voice data and the text of the second wake-up word meets the fourth precision, the AP of the electronic device may determine whether the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word. If the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word, it indicates that the second voice data is the second wake-up word set in the electronic device, and the electronic device may start the voice assistant at this time.
- After the electronic device enters the predetermined mode, a user who wants to trigger the electronic device to perform a function by voice only needs to input the first wake-up word. That is, after the electronic device enters the predetermined mode, the user does not need to input voice data multiple times; directly inputting the first wake-up word triggers the electronic device to perform the corresponding function. This achieves efficient interaction between the electronic device and the user while saving as much of the electronic device's power as possible.
- the electronic device is a mobile phone.
- the phone includes a voice assistant.
- the mobile phone performs a semantic analysis on the text corresponding to the voice data input by the user through a voice assistant to obtain the corresponding instruction.
- the phone includes two DSPs, DSP1 and DSP2.
- the microphone included in the mobile phone establishes a channel with only one of the two DSPs at any given time.
- the microphone establishes a channel with DSP2 by default. After the mobile phone enters the driving mode, the mobile phone switches the microphone channel from DSP2 to DSP1.
- the mobile phone is in a black screen state.
- the AP is in a sleep state.
- the voice assistant is in a sleep state.
- the user says "Hello Little E" near the microphone of the phone.
- the microphone of the mobile phone collects voice data 1 corresponding to "Hello Little E".
- the microphone of the mobile phone transmits the collected voice data 1 to DSP 2.
- DSP 2 determines whether the matching degree between the text corresponding to voice data 1 and the text of the second wake-up word "Hello Little E" set in the mobile phone meets accuracy 1, so as to determine whether voice data 1 is suspected to be the second wake-up word "Hello Little E" set in the mobile phone.
- If DSP 2 determines that the received voice data 1 is suspected to be the second wake-up word "Hello Little E" set in the mobile phone, it wakes the AP from the sleep state and transmits voice data 1 to the mobile phone's AP. After receiving voice data 1, the AP determines whether the matching degree between the text corresponding to voice data 1 and the text of the second wake-up word "Hello Little E" set in the mobile phone meets accuracy 2. Accuracy 2 is greater than accuracy 1.
- If accuracy 2 is met, the AP determines whether the voiceprint feature of voice data 1 matches the voiceprint feature corresponding to the second wake-up word "Hello Little E", that is, it performs voiceprint verification.
- If the voiceprint verification passes, the AP of the mobile phone can wake up the voice assistant. As shown in (b) of FIG. 7, the mobile phone lights up the screen and displays a voice assistant interface 701.
- the voice assistant interface 701 may include prompt information 702.
- the prompt information 702 is used to prompt the user to input a voice command at this time to trigger the mobile phone to perform the corresponding function.
- the mobile phone can receive the corresponding voice data 2 through the voice assistant.
- the mobile phone may display the text “enter driving mode” 704 corresponding to the voice data 2 in the voice assistant interface 703.
- the mobile phone performs a semantic analysis on the voice data 2 through a voice assistant to determine the instruction corresponding to the voice data 2.
- the mobile phone can trigger the mobile phone to enter the driving mode according to the instruction. And, the mobile phone switches the microphone channel from DSP2 to DSP1.
- the mobile phone may display prompt information 705 in the voice assistant interface 703.
- the prompt information 705 is used to inform the user that the mobile phone has entered the driving mode and that the first wake-up word can now be spoken directly to trigger the mobile phone to perform the corresponding function.
- the voice assistant will enter the idle state. Or the mobile phone will control the voice assistant to enter the sleep state again when it is determined that the user operation is not received within a predetermined time. In addition, if no user operation is received within a certain period of time, the AP may also re-enter the sleep state.
- After the mobile phone enters the driving mode, even if the voice assistant is in the idle state or has entered the sleep state (or the voice assistant is in the idle state or has entered the sleep state, and the AP is also in the sleep state), the user only needs to input the above first wake-up word to achieve the intended function.
- For example, after the mobile phone enters the driving mode, the voice assistant enters the sleep state again and, as shown in (a) of FIG. 8, the mobile phone is in the black screen state again and the AP is in the sleep state.
- the user says "navigation home" near the phone's microphone.
- the microphone of the mobile phone collects voice data 3 corresponding to "navigation home".
- the microphone of the mobile phone transmits the collected voice data 3 to DSP 1.
- DSP 1 can perform lower-precision text matching on voice data 3, that is, determine whether, among the 5 first wake-up words set in the mobile phone, there is a wake-up word whose text matches the text corresponding to voice data 3 with a matching degree that meets accuracy 3.
- DSP 1 wakes up the AP when the lower-precision text matching passes, and transmits voice data 3 to the AP.
- the AP of the mobile phone can perform higher-precision text matching on voice data 3, that is, determine whether, among the 5 first wake-up words set in the mobile phone, there is a wake-up word whose text matches the text corresponding to voice data 3 with a matching degree that meets accuracy 4.
- Accuracy 4 is greater than accuracy 3.
- the mobile phone can wake up the voice assistant.
- the mobile phone can also perform a semantic analysis on the voice data 3 through the voice assistant to determine the instruction corresponding to the voice data 3.
- the mobile phone can call the corresponding interface according to the instruction to trigger the map application to display the corresponding navigation route to the user (or the mobile phone can also simulate the user's click operation according to the instruction to display the corresponding navigation route to the user in the map application).
- the mobile phone can also broadcast navigation route information through the speaker. For example, as shown in (b) of FIG. 8, the mobile phone may light up the screen and display the voice assistant interface 801.
- the voice assistant interface 801 may include the recognized text "navigation home" 802 corresponding to the voice data 3 input by the user.
- the voice assistant interface 801 is a jump interface, that is, after the voice assistant interface 801 is displayed on the display screen of the mobile phone, it immediately jumps to the navigation interface 803 shown in (c) in FIG. 8.
- Alternatively, the voice assistant interface 801 may not be displayed on the display screen of the mobile phone; instead, after the user inputs "navigation home", the mobile phone directly lights up the screen and displays the navigation interface 803. In this way, users do not need to input voice data multiple times to achieve their intentions, which improves the efficiency of human-computer interaction and the user experience.
- the above-mentioned electronic device includes a hardware structure and / or a software module corresponding to each function.
- this embodiment can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed by hardware or computer software driven hardware depends on the specific application and design constraints of the technical solution. Professional technicians can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the embodiments of the present application.
- This embodiment also provides an electronic device that implements the foregoing method embodiments.
- the electronic device may be divided into functional modules, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
- the above integrated modules can be implemented in the form of hardware or software function modules. It should be noted that the division of the modules in the embodiments of the present application is schematic, and is only a division of logical functions. In actual implementation, there may be another division manner.
- FIG. 9 shows a possible structural schematic diagram of the electronic device 900 involved in the above embodiment.
- the electronic device 900 may include: an input unit 901, a verification unit 902, a wake-up unit 903, and a determination execution unit 904.
- the input unit 901 is used to support the electronic device 900 to perform S301 in the above method embodiments and / or other processes used in the technology described herein.
- the input unit 901 supports the electronic device 900 to perform receiving the second voice data and the third voice data input by the user in the above method embodiments.
- the verification unit 902 is used to support the electronic device 900 to perform S302 in the foregoing method embodiments and / or other processes for the technology described herein.
- the verification unit 902 supports the electronic device 900 to perform voiceprint verification in the above method embodiment.
- the wake-up unit 903 is used to support the electronic device 900 to perform the operation of waking up the main processor (such as an AP) in the above method embodiments and / or other processes used in the technology described herein.
- the determination execution unit 904 is used to support the electronic device 900 to execute S303 in the foregoing method embodiments and / or other processes for the technology described herein.
- the electronic device 900 may further include: a trigger unit 905 and a start unit 906.
- the trigger unit 905 is used to support the electronic device 900 to execute S501 in the above method embodiments and / or other processes used in the technology described herein.
- the starting unit 906 is used to support the electronic device 900 to perform the operation of starting the voice assistant in the above method embodiment and / or other processes for the technology described herein.
- the electronic device 900 may further include a deletion unit.
- the deleting unit may be used to support the electronic device to perform S304 in the above method embodiments and / or other processes for the technology described herein.
- the electronic device 900 includes but is not limited to the above listed unit modules.
- the electronic device 900 may further include a receiving unit for receiving data or signals sent by other devices, a display unit for displaying content, and the like.
- the specific functions that can be achieved by the above functional units also include, but are not limited to, functions corresponding to the method steps described in the above examples.
- a detailed description of other units of the electronic device 900 refer to the detailed description of the corresponding method steps. The embodiments are not repeated here.
- the electronic device may include: a processing module, a storage module, and a display module.
- the processing module is used to control and manage the actions of the electronic device.
- the display module is used to display content according to the instructions of the processing module.
- the storage module is used to save the program code and data of the electronic device.
- the storage module may also be used to save the text corresponding to the first wake-up word and / or the corresponding voiceprint feature information in the above embodiments, and the text corresponding to the second wake-up word and / or the corresponding voiceprint feature information, and so on.
- the electronic device may further include an input module and a communication module.
- the communication module is used to support communication between the electronic device and other network entities, so as to realize functions such as communication, data interaction, and Internet access of the electronic device.
- the processing module may be a processor or a controller.
- the communication module may be a transceiver, an RF circuit, or a communication interface.
- the storage module may be a memory.
- the display module may be a screen or a display.
- the input module may be a touch screen, a voice input device, or a fingerprint sensor.
- When the processing module is a processor, the communication module is a circuit (for example, an RF circuit), the storage module is a memory, and the display module is a touch screen, the electronic device provided in this embodiment may be the electronic device shown in FIG. 1.
- the above communication module may include not only an RF circuit, but also a Wi-Fi module, an NFC module, and a Bluetooth module.
- Communication modules such as RF circuits, NFC modules, Wi-Fi modules, and Bluetooth modules can be collectively referred to as communication interfaces.
- the above processor, RF circuit, touch screen and memory may be coupled together through a bus.
- an electronic device 1000 which may include: a display 1001; one or more processors 1002; a memory 1003; and one or more computer program codes 1004.
- the above devices may be connected through one or more communication buses 1005.
- the one or more computer program codes 1004 are stored in the above-mentioned memory 1003 and are configured to be executed by the one or more processors 1002.
- the electronic device 1000 is provided with at least two first wake-up words, each of the at least two first wake-up words corresponds to a first instruction, and the functions that the electronic device 1000 performs in response to the first instructions corresponding to different first wake-up words are different.
- the one or more computer program codes 1004 include computer instructions.
- the above computer instructions may be used to perform various steps performed by the electronic device in FIG. 3 or FIG. 5 and the corresponding embodiments.
- the electronic device 1000 includes, but is not limited to, the devices listed above.
- the electronic device 1000 may further include a radio frequency circuit, a positioning device, a sensor, etc.
- the electronic device 1000 may be the electronic device shown in FIG. 1.
- the processor 1002 may include an AP 1006 and a first DSP 1007. Further, the processor 1002 may also include a second DSP 1008.
- Other embodiments of the present application also provide a computer storage medium. The computer storage medium includes computer instructions, and when the computer instructions run on an electronic device, the electronic device is caused to perform the relevant method steps as shown in FIG. 3 or FIG.
- Other embodiments of the present application also provide a computer program product containing instructions.
- When the computer program product runs on a computer, the computer is caused to perform the relevant method steps shown in any of the drawings in FIG. 3 or FIG. 5, such as S301, S302, S303, S304, and S501, to implement the method for triggering an electronic device to perform a function in the foregoing embodiments.
- control device includes a processor and a memory.
- the memory is used to store computer program code.
- the computer program code includes computer instructions.
- the control device executes the relevant method steps as shown in any one of the drawings in FIG. 3 or FIG. method.
- the control device may be an integrated circuit IC or a system-on-chip SOC.
- the integrated circuit may be a general-purpose integrated circuit, a field programmable gate array FPGA, or an application-specific integrated circuit ASIC.
- Other embodiments of the present application also provide an apparatus that triggers an electronic device to perform a function, and the apparatus has the function of implementing the behavior of the electronic device in the above method.
- the functions can be realized by hardware, or can also be realized by hardware executing corresponding software.
- the hardware or software includes one or more modules corresponding to the above functions.
- the electronic devices, computer storage media, computer program products, and control devices provided in the embodiments of the present application are used to perform the corresponding methods provided above. Therefore, for the beneficial effects that they can achieve, refer to the beneficial effects of the corresponding methods, which are not repeated here.
- the disclosed system, device, and method may be implemented in other ways.
- the device embodiments described above are only schematic.
- the division of the modules or units is only a division of logical functions.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- the functional units in the embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
- If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
- The technical solution of this embodiment, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, server, or network device, etc.) or a processor to perform all or part of the steps of the methods described in the various embodiments.
- the foregoing storage media include: flash memory, mobile hard disk, read-only memory, random access memory, magnetic disk or optical disk, and other media that can store program codes.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Telephone Function (AREA)
Abstract
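A method for triggering an electronic device to perform a function, and an electronic device, relating to the field of electronic devices. The electronic device can be triggered to perform a corresponding function without the user inputting voice data multiple times, which improves the usage efficiency of the electronic device and enables efficient interaction between the electronic device and the user. At least two first wake-up words are set in the electronic device, each of the at least two first wake-up words corresponds to one first instruction, and the functions that the electronic device performs in response to the first instructions corresponding to different first wake-up words are different. The electronic device includes a main processor, which is in a sleep state. The method includes: the electronic device receives first voice data input by a user (S301); determines whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data (S302); and, if so, wakes the main processor from the sleep state, determines the first instruction corresponding to the first voice data, and performs the function corresponding to the first instruction through the main processor (S303).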
Description
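This embodiment relates to the field of electronic devices, and in particular to a method for triggering an electronic device to perform a function, and an electronic device.
Typically, a user can trigger a mobile phone to perform a function by operating its physical keys (such as the volume "+" key or the power key) or by touching its display. When it is inconvenient to operate the phone by hand, users often choose to control the phone by voice instead. Currently, voice assistants provide users with voice control services for this purpose. The voice assistant is an important application of artificial intelligence on mobile phones. It can recognize a voice command input by the user and trigger the phone to perform the function corresponding to that command, thereby enabling intelligent interaction between the user and the phone.
However, the voice assistant is normally in a sleep state, and the user must wake it by voice before using it. Only after the voice assistant has been woken up can it receive and recognize voice commands input by the user. The voice data used to wake the voice assistant may be called a wake-up word (or wake-up voice). The wake-up word may be registered in the phone in advance by the user. For example, suppose the wake-up word registered in the phone is "Hello Little E". If the user wants to use the voice assistant to turn down the phone's volume, the user must first say "Hello Little E" to wake the voice assistant and then say "turn down the phone volume". Only then can the voice assistant receive and recognize the voice command "turn down the phone volume" and trigger the phone to lower the volume.
As can be seen, the user has to input voice data multiple times to trigger the phone to perform the corresponding function, which greatly reduces the usage efficiency of the phone.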
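Summary of the Invention
This embodiment provides a method for triggering an electronic device to perform a function, and an electronic device. The electronic device can be triggered to perform a corresponding function without the user inputting voice data multiple times, which improves the usage efficiency of the electronic device and enables efficient interaction between the electronic device and the user.
To achieve the above objective, this embodiment provides the following technical solutions: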
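A first aspect of this embodiment provides a method for triggering an electronic device to perform a function. At least two first wake-up words are set in the electronic device, each of the at least two first wake-up words corresponds to one first instruction, and the functions that the electronic device performs in response to the first instructions corresponding to different first wake-up words are different. The electronic device may include a main processor, and the main processor is in a sleep state. The method may include: the electronic device receives first voice data input by a user. The electronic device determines whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data. If the electronic device determines that such a wake-up word exists, it may wake the main processor from the sleep state, determine the first instruction corresponding to the first voice data, and perform the function corresponding to the first instruction through the main processor.
In the technical solution provided by this embodiment, after the received voice data passes verification, the electronic device can wake its main processor from the sleep state and determine the instruction corresponding to the input voice data, so as to trigger the electronic device to perform, through the main processor, the function corresponding to that instruction. As can be seen, as long as no other software or hardware of the electronic device is using the microphone to collect voice data (even if the device is in a black-screen state and the AP is in a sleep state), the user does not need to first input a wake-up word so that the electronic device starts the voice assistant and then input a voice command. Instead, a single piece of voice data wakes the main processor and triggers the electronic device to perform the corresponding function. This improves the usage efficiency of the electronic device, enables efficient interaction between the electronic device and the user, and improves the user experience.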
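With reference to the first aspect, in a possible implementation, before the electronic device wakes the main processor from the sleep state, the method may further include: the electronic device determines that the voiceprint feature of the first voice data matches the voiceprint feature corresponding to the at least two first wake-up words. In this way, only a person who has registered the wake-up words in the electronic device can trigger it to perform the corresponding function by inputting voice data, which improves the security of the voice control service.
With reference to the first aspect or the foregoing possible implementation, in another possible implementation, the electronic device waking the main processor from the sleep state, determining the first instruction corresponding to the first voice data, and performing the function corresponding to the first instruction through the main processor is specifically: the electronic device wakes the main processor from the sleep state, starts the voice assistant through the main processor, determines the first instruction corresponding to the first voice data through the voice assistant, and performs the function corresponding to the first instruction through the main processor. The monitored first voice data is thus parsed by the voice assistant.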
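With reference to the first aspect or the foregoing possible implementations, in another possible implementation, the electronic device further includes a first coprocessor. The electronic device receiving the first voice data input by the user may specifically be: the electronic device uses the first coprocessor to monitor the first voice data input by the user. The electronic device determining whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data and, if so, waking the main processor from the sleep state may specifically be: the electronic device uses the first coprocessor to determine whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data; if so, the first coprocessor wakes the main processor from the sleep state. In some embodiments, the main processor is an AP and the first coprocessor is a DSP.
With reference to the first aspect or the foregoing possible implementations, in another possible implementation, the electronic device further includes a first coprocessor. The electronic device receiving the first voice data input by the user may specifically be: the electronic device uses the first coprocessor to monitor the first voice data input by the user. The electronic device determining whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data and, if so, waking the main processor from the sleep state may specifically be: the electronic device uses the first coprocessor to determine whether, among the at least two first wake-up words, there is a wake-up word for which the matching degree between its text and the text corresponding to the first voice data meets a first precision; if so, the first coprocessor wakes the main processor from the sleep state. Determining the first instruction corresponding to the first voice data and performing the function corresponding to the first instruction through the main processor may specifically be: the electronic device uses the main processor to determine whether, among the at least two first wake-up words, there is a wake-up word for which the matching degree between its text and the text corresponding to the first voice data meets a second precision; if so, the electronic device determines the first instruction corresponding to the first voice data and performs the function corresponding to the first instruction through the main processor. The first precision is less than the second precision. In some embodiments, the main processor is an AP and the first coprocessor is a DSP.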
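With reference to the first aspect or the foregoing possible implementations, in another possible implementation, before the electronic device receives the first voice data input by the user, the method may further include: the electronic device enters a predetermined mode. In this way, after the electronic device enters the predetermined mode, the user can trigger the electronic device to perform the corresponding function by directly inputting the first wake-up word. This achieves efficient interaction between the electronic device and the user while saving as much of the electronic device's power as possible.
With reference to the first aspect or the foregoing possible implementations, in another possible implementation, the electronic device further includes a second coprocessor. Before the electronic device enters the predetermined mode, the method may further include: the electronic device uses the second coprocessor to monitor voice data. Monitoring voice data with the lower-power second coprocessor before the electronic device enters the predetermined mode ensures normal use of the voice assistant and saves the electronic device's power. In some embodiments, the second coprocessor is a DSP whose processing performance is lower than that of the first coprocessor and whose memory is smaller than that of the first coprocessor.
With reference to the first aspect or the foregoing possible implementations, in another possible implementation, a second wake-up word is also set in the electronic device. The electronic device entering the predetermined mode may specifically be: the electronic device receives second voice data input by the user; the electronic device determines whether the second voice data matches the second wake-up word; if the second voice data matches the second wake-up word, the electronic device wakes the main processor from the sleep state and starts the voice assistant through the main processor; the electronic device receives, through the voice assistant, third voice data input by the user, determines a second instruction corresponding to the third voice data, and performs the function corresponding to the second instruction through the main processor, where the second instruction is used to instruct the electronic device to enter the predetermined mode.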
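The sequence for entering the predetermined mode can be pictured as a small state machine. The following is only a minimal sketch under stated assumptions: the state names, the `step` function, and the `MODE_COMMANDS` table are hypothetical, exact string comparison stands in for the text (and optional voiceprint) verification, and the phrases "Hello Little E" and "enter driving mode" are borrowed from the driving-mode example elsewhere in this document.

```python
from enum import Enum, auto


class State(Enum):
    ASLEEP = auto()              # main processor and voice assistant asleep
    ASSISTANT_ACTIVE = auto()    # second wake-up word matched, assistant started
    PREDETERMINED_MODE = auto()  # second instruction executed


# Hypothetical values for illustration only.
SECOND_WAKE_WORD = "hello little e"
MODE_COMMANDS = {"enter driving mode": "ENTER_PREDETERMINED_MODE"}


def step(state: State, utterance: str) -> State:
    """Advance the state machine by one piece of recognized voice data."""
    text = utterance.strip().lower()
    # Second voice data: only the second wake-up word starts the assistant.
    if state is State.ASLEEP and text == SECOND_WAKE_WORD:
        return State.ASSISTANT_ACTIVE
    # Third voice data: the assistant maps it to the second instruction.
    if (state is State.ASSISTANT_ACTIVE
            and MODE_COMMANDS.get(text) == "ENTER_PREDETERMINED_MODE"):
        return State.PREDETERMINED_MODE
    # Anything else leaves the state unchanged.
    return state
```

In this sketch a mode command spoken while the device is still asleep is ignored, which mirrors the requirement that the second wake-up word must be matched before the third voice data is accepted.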
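With reference to the first aspect or the foregoing possible implementations, in another possible implementation, the electronic device determining whether the second voice data matches the second wake-up word may specifically be: the electronic device determines whether the text corresponding to the second voice data matches the text of the second wake-up word; if it does, the second voice data matches the second wake-up word.
With reference to the first aspect or the foregoing possible implementations, in another possible implementation, the electronic device determining whether the second voice data matches the second wake-up word may specifically be: the electronic device determines whether the text corresponding to the second voice data matches the text of the second wake-up word, and determines whether the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word; if the text corresponding to the second voice data matches the text of the second wake-up word, and the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word, the second voice data matches the second wake-up word. In this way, the second voice data is determined to match the second wake-up word only after both the voiceprint verification and the text verification pass, so that the main processor is woken from the sleep state and the voice assistant is then started, which improves the security of the voice control service.
With reference to the first aspect or the foregoing possible implementations, in another possible implementation, the electronic device further includes a second coprocessor. The electronic device receiving the second voice data input by the user may specifically be: the electronic device uses the second coprocessor to monitor the second voice data input by the user. The electronic device determining whether the second voice data matches the second wake-up word and, if so, waking the main processor from the sleep state and starting the voice assistant through the main processor may specifically be: the electronic device uses the second coprocessor to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets a third precision; if it does, the second coprocessor wakes the main processor from the sleep state; the electronic device uses the main processor to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets a fourth precision; if it does, the electronic device starts the voice assistant through the main processor. The third precision is less than the fourth precision.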
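A second aspect of this embodiment provides an electronic device. At least two first wake-up words are set in the electronic device, each of the at least two first wake-up words corresponds to one first instruction, and the functions that the electronic device performs in response to the first instructions corresponding to different first wake-up words are different. The electronic device may include: an input unit, configured to receive first voice data input by a user; a verification unit, configured to determine whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data; a wake-up unit, configured to wake the main processor of the electronic device from the sleep state if such a wake-up word exists; and a determination execution unit, configured to determine the first instruction corresponding to the first voice data and perform the function corresponding to the first instruction through the main processor.
With reference to the second aspect, in a possible implementation, the verification unit may be further configured to determine that the voiceprint feature of the first voice data matches the voiceprint feature corresponding to the at least two first wake-up words.
With reference to the second aspect or the foregoing possible implementation, in another possible implementation, the determination execution unit is specifically configured to: start the voice assistant through the main processor, determine the first instruction corresponding to the first voice data through the voice assistant, and perform the function corresponding to the first instruction through the main processor.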
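With reference to the second aspect or the foregoing possible implementations, in another possible implementation, the verification unit is specifically configured to determine whether, among the at least two first wake-up words, there is a wake-up word for which the matching degree between its text and the text corresponding to the first voice data meets a first precision; the wake-up unit is specifically configured to wake the main processor of the electronic device from the sleep state if such a wake-up word exists; the verification unit is further configured to determine whether, among the at least two first wake-up words, there is a wake-up word for which the matching degree between its text and the text corresponding to the first voice data meets a second precision; and the determination execution unit is specifically configured to, if such a wake-up word exists, determine the first instruction corresponding to the first voice data and perform the function corresponding to the first instruction through the main processor. The first precision is less than the second precision.
With reference to the second aspect or the foregoing possible implementations, in another possible implementation, the electronic device may further include: a trigger unit, configured to trigger the electronic device to enter a predetermined mode.
With reference to the second aspect or the foregoing possible implementations, in another possible implementation, a second wake-up word is also set in the electronic device, and the electronic device may further include a start unit. The input unit is further configured to receive second voice data input by the user; the verification unit is further configured to determine whether the second voice data matches the second wake-up word; the wake-up unit is further configured to wake the main processor from the sleep state if the second voice data matches the second wake-up word; the start unit is configured to start the voice assistant through the main processor; the input unit is further configured to receive, through the voice assistant, third voice data input by the user; and the determination execution unit is further configured to determine a second instruction corresponding to the third voice data and perform the function corresponding to the second instruction through the main processor, where the second instruction is used to instruct the electronic device to enter the predetermined mode.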
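With reference to the second aspect or the foregoing possible implementations, in another possible implementation, the matching unit is specifically configured to determine whether the text corresponding to the second voice data matches the text of the second wake-up word; if it does, the second voice data matches the second wake-up word.
With reference to the second aspect or the foregoing possible implementations, in another possible implementation, the matching unit is specifically configured to: determine whether the text corresponding to the second voice data matches the text of the second wake-up word, and determine whether the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word; if the text corresponding to the second voice data matches the text of the second wake-up word, and the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word, the second voice data matches the second wake-up word.
With reference to the second aspect or the foregoing possible implementations, in another possible implementation, the matching unit is specifically configured to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets a third precision; the wake-up unit is specifically configured to wake the main processor from the sleep state if the matching degree meets the third precision; the matching unit is further specifically configured to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets a fourth precision; and the start unit is configured to start the voice assistant through the main processor if the matching degree meets the fourth precision. The third precision is less than the fourth precision.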
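A third aspect of this embodiment provides an electronic device. The electronic device may include a processor, a memory, and a display; the memory and the display are coupled to the processor; the display is configured to display images generated by the processor; the memory is configured to store computer program code; the processor may include a main processor, and the main processor is in a sleep state. At least two first wake-up words are set in the electronic device, each of the at least two first wake-up words corresponds to one first instruction, and the functions that the electronic device performs in response to the first instructions corresponding to different first wake-up words are different. The computer program code includes computer instructions. When the processor executes the computer instructions, the processor is configured to: receive first voice data input by a user; determine whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data; and, if such a wake-up word exists, wake the main processor from the sleep state, determine the first instruction corresponding to the first voice data, and perform the function corresponding to the first instruction through the main processor.
With reference to the third aspect, in a possible implementation, the processor is further configured to determine that the voiceprint feature of the first voice data matches the voiceprint feature corresponding to the at least two first wake-up words.
With reference to the third aspect or the foregoing possible implementation, in another possible implementation, the processor being configured to wake the main processor from the sleep state, determine the first instruction corresponding to the first voice data, and perform the function corresponding to the first instruction through the main processor is specifically: the processor is configured to wake the main processor from the sleep state, start the voice assistant through the main processor, determine the first instruction corresponding to the first voice data through the voice assistant, and perform the function corresponding to the first instruction through the main processor.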
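With reference to the third aspect or the foregoing possible implementations, in another possible implementation, the processor further includes a first coprocessor. The processor being configured to receive the first voice data input by the user is specifically: the first coprocessor is configured to monitor the first voice data input by the user. The processor being configured to determine whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data and, if so, to wake the main processor from the sleep state is specifically: the first coprocessor is configured to determine whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data and, if so, to wake the main processor from the sleep state.
With reference to the third aspect or the foregoing possible implementations, in another possible implementation, the processor further includes a first coprocessor. The processor being configured to receive the first voice data input by the user is specifically: the first coprocessor is configured to monitor the first voice data input by the user. The processor being configured to determine whether, among the at least two first wake-up words, there is a wake-up word whose text matches the text corresponding to the first voice data and, if so, to wake the main processor from the sleep state is specifically: the first coprocessor is configured to determine whether, among the at least two first wake-up words, there is a wake-up word for which the matching degree between its text and the text corresponding to the first voice data meets a first precision and, if so, to wake the main processor from the sleep state. The processor being configured to determine the first instruction corresponding to the first voice data and perform the function corresponding to the first instruction through the main processor is specifically: the main processor is configured to determine whether, among the at least two first wake-up words, there is a wake-up word for which the matching degree between its text and the text corresponding to the first voice data meets a second precision and, if so, to determine the first instruction corresponding to the first voice data and perform the function corresponding to the first instruction. The first precision is less than the second precision.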
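With reference to the third aspect or the foregoing possible implementations, in another possible implementation, the processor is further configured to trigger the electronic device to enter a predetermined mode.
With reference to the third aspect or the foregoing possible implementations, in another possible implementation, the processor further includes a second coprocessor; the second coprocessor is configured to monitor voice data before the electronic device enters the predetermined mode.
With reference to the third aspect or the foregoing possible implementations, in another possible implementation, a second wake-up word is also set in the electronic device. The processor being further configured to trigger the electronic device to enter the predetermined mode is specifically: the processor is further configured to receive second voice data input by the user; determine whether the second voice data matches the second wake-up word; if the second voice data matches the second wake-up word, wake the main processor from the sleep state and start the voice assistant through the main processor; receive, through the voice assistant, third voice data input by the user, determine a second instruction corresponding to the third voice data, and perform the function corresponding to the second instruction through the main processor, where the second instruction is used to instruct the electronic device to enter the predetermined mode.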
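With reference to the third aspect or the foregoing possible implementations, in another possible implementation, the processor being configured to determine whether the second voice data matches the second wake-up word is specifically: the processor is configured to determine whether the text corresponding to the second voice data matches the text of the second wake-up word; if it does, the second voice data matches the second wake-up word.
With reference to the third aspect or the foregoing possible implementations, in another possible implementation, the processor being configured to determine whether the second voice data matches the second wake-up word is specifically: the processor is configured to determine whether the text corresponding to the second voice data matches the text of the second wake-up word, and determine whether the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word; if the text corresponding to the second voice data matches the text of the second wake-up word, and the voiceprint feature of the second voice data matches the voiceprint feature corresponding to the second wake-up word, the second voice data matches the second wake-up word.
With reference to the third aspect or the foregoing possible implementations, in another possible implementation, the processor further includes a second coprocessor. The processor being further configured to receive the second voice data input by the user is specifically: the second coprocessor is configured to monitor the second voice data input by the user. The processor being further configured to determine whether the second voice data matches the second wake-up word and, if so, to wake the main processor from the sleep state and start the voice assistant through the main processor is specifically: the second coprocessor is further configured to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets a third precision and, if so, to wake the main processor from the sleep state; the main processor is further configured to determine whether the matching degree between the text of the second wake-up word and the text corresponding to the second voice data meets a fourth precision and, if so, to start the voice assistant. The third precision is less than the fourth precision.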
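A fourth aspect of this embodiment provides a computer storage medium. The computer storage medium includes computer instructions, and when the computer instructions run on an electronic device, the electronic device is caused to perform the method for triggering an electronic device to perform a function according to the first aspect or any one of the possible implementations of the first aspect.
A fifth aspect of this embodiment provides a computer program product. When the computer program product runs on a computer, the computer is caused to perform the method for triggering an electronic device to perform a function according to the first aspect or any one of the possible implementations of the first aspect.
It should be understood that descriptions of technical features, technical solutions, beneficial effects, or similar language in this embodiment do not imply that all features and advantages can be realized in any single embodiment. Rather, a description of a feature or beneficial effect means that at least one embodiment includes the specific technical feature, technical solution, or beneficial effect. Therefore, descriptions of technical features, technical solutions, or beneficial effects in this specification do not necessarily refer to the same embodiment. Furthermore, the technical features, technical solutions, and beneficial effects described in this embodiment may be combined in any suitable manner. Those skilled in the art will understand that an embodiment can be implemented without one or more of the specific technical features, technical solutions, or beneficial effects of a particular embodiment. In other embodiments, additional technical features and beneficial effects may be identified in particular embodiments that do not reflect all embodiments.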
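FIG. 1 is a schematic structural diagram of an electronic device provided by this embodiment;
FIG. 2 is a block diagram of the software structure of an electronic device provided by this embodiment;
FIG. 3 is a schematic flowchart of a method for triggering an electronic device to perform a function provided by this embodiment;
FIG. 4 is a schematic diagram of some graphical user interfaces displayed on the electronic device in this embodiment;
FIG. 5 is a schematic flowchart of another method for triggering an electronic device to perform a function provided by this embodiment;
FIG. 6 is a schematic diagram of some other graphical user interfaces displayed on the electronic device in this embodiment;
FIG. 7 is a schematic diagram of further graphical user interfaces displayed on the electronic device in this embodiment;
FIG. 8 is a schematic diagram of further graphical user interfaces displayed on the electronic device in this embodiment;
FIG. 9 is a schematic structural diagram of another electronic device provided by this embodiment;
FIG. 10 is a schematic structural diagram of yet another electronic device provided by this embodiment.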
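In the following, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more of that feature. In the description of this embodiment, unless otherwise stated, "multiple" means two or more.
With the method for triggering an electronic device to perform a function provided by this embodiment, the electronic device can provide the user with voice control services without the user inputting voice data multiple times. That is, the electronic device can be triggered to perform the corresponding function without the user inputting voice data multiple times. This improves the usage efficiency of the electronic device, enables efficient interaction between the electronic device and the user, and improves the user experience.
It should be noted that the electronic device described in this embodiment may be a mobile phone, a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, a personal computer (PC), a netbook, a cellular phone, a personal digital assistant (PDA), a wearable device (such as a smart watch), a smart home device, an in-vehicle computer, or the like. This embodiment places no particular limitation on the specific form of the device.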
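Refer to FIG. 1, which shows a schematic structural diagram of an electronic device 100 provided by this embodiment. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 100. In other embodiments, the electronic device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and / or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated into one or more processors.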
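In this embodiment, the processor 110 may include a DSP, referred to here as the first DSP. One or more first wake-up words may be set in the electronic device 100. When multiple first wake-up words are set in the electronic device 100, each of them corresponds to one instruction, and the electronic device 100 performs a different function for the instruction corresponding to each first wake-up word. The first DSP may monitor voice data in real time through the microphone 170C of the electronic device 100. When the first DSP detects voice data, it may perform text verification of a first precision on that voice data. If the first-precision text verification passes, the first DSP may wake up the AP and notify the AP to perform text verification of a second precision on the received voice data. The first precision is lower than the second precision. When the second-precision text verification passes, the AP determines that the voice data matches a first wake-up word. The electronic device 100 may then determine the instruction corresponding to the voice data and, through the AP, trigger the electronic device 100 to perform the corresponding function according to that instruction. For example, the electronic device may determine the instruction corresponding to the voice data according to a predefined correspondence between first wake-up words and instructions. As another example, the electronic device 100 may wake up the voice assistant in the electronic device 100 and have the voice assistant perform semantic analysis on the text of the voice data to determine the corresponding instruction, thereby triggering the electronic device 100 to perform the function corresponding to the instruction.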
在一些实施例中,上述第一精度的文本校验操作与第二精度的文本校验操作可以仅执行一个,也可以都执行。第一唤醒词可以是预定义的唤醒词。第一唤醒词也可以是用户自定义的唤醒词。若第一唤醒词是用户自定义的唤醒词,可选的,AP接收到语音数据后,还可以对该语音数据进行声纹校验。在文本校验与声纹校验均通过时,AP确定该语音数据与第一唤醒词匹配。
上述第一DSP可以是具有较大内存以及较高处理性能的DSP。例如,第一DSP可以是设置在片上系统(System On Chip,SOC)内的高保真度(High Fidelity,HIFI)DSP。第一DSP还可以是设置在SOC之外的编解码器DSP(codec DSP)。
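In some embodiments, the processor 110 may further include another DSP, referred to here as the second DSP. A second wake-up word may also be set in the electronic device. The second DSP may monitor voice data in real time through the microphone 170C of the electronic device 100. When the second DSP detects voice data, it may perform text verification of a third precision on that voice data. If the third-precision text verification passes, the second DSP may transmit the voice data to the AP. After receiving the voice data, the AP may perform text verification of a fourth precision and voiceprint verification on it (the voiceprint verification is optional). The third precision is lower than the fourth precision. The third precision may be the same as or different from the first precision, and the fourth precision may be the same as or different from the second precision. When both the text verification and the voiceprint verification pass, the AP determines that the voice data matches the second wake-up word. The electronic device 100 may then start the voice assistant. Once the voice assistant is started, the electronic device 100 may receive voice commands input by the user through the voice assistant and trigger the electronic device 100 to perform the corresponding functions.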
上述第二DSP可以是具有较小内存以及较低处理性能的DSP。
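It should be noted that, in this embodiment, if the processor 110 includes two DSPs, such as the first DSP and the second DSP described above, the microphone 170C of the electronic device 100 establishes a path with only one of them at any given time and transmits the received voice data to that DSP. For example, when the electronic device 100 is in the predetermined mode, the microphone 170C establishes a path only with the first DSP. If the microphone 170C collects voice data input by the user, it transmits the collected voice data to the first DSP over that path for subsequent processing. As another example, when the electronic device exits the predetermined mode, or is not in the predetermined mode, the electronic device 100 switches the path of the microphone 170C from the first DSP to the second DSP, so that the microphone 170C establishes a path only with the second DSP. If the microphone 170C then collects voice data input by the user, it transmits the collected voice data to the second DSP over that path for subsequent processing.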
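The single-path rule between the microphone and the two DSPs can be illustrated with a minimal sketch. The class and method names below (`MicRouter`, `enter_predetermined_mode`, and so on) are hypothetical and are not taken from the patent; the sketch only models which DSP receives the captured voice data.

```python
from enum import Enum


class Dsp(Enum):
    FIRST = "first_dsp"    # higher performance, used in the predetermined mode
    SECOND = "second_dsp"  # lower power, used outside the predetermined mode


class MicRouter:
    """Hypothetical model: the microphone is connected to exactly one DSP."""

    def __init__(self):
        self.in_predetermined_mode = False

    @property
    def active_dsp(self):
        # Only one path exists at a time; the mode decides which one.
        return Dsp.FIRST if self.in_predetermined_mode else Dsp.SECOND

    def enter_predetermined_mode(self):
        self.in_predetermined_mode = True

    def exit_predetermined_mode(self):
        self.in_predetermined_mode = False

    def route(self, voice_data):
        """Return the DSP that receives the captured voice data."""
        return self.active_dsp, voice_data
```

By default the router sends data to the second DSP, which mirrors the later example in which the microphone path is switched to the first DSP only after the driving mode is entered.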
控制器可以是电子设备100的神经中枢和指挥中心。控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
处理器110中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。
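The I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 110 may include multiple sets of I2C buses. The processor 110 may be coupled to the touch sensor 180K, a charger, a flash, the camera 193, and so on through different I2C bus interfaces. For example, the processor 110 may be coupled to the touch sensor 180K through an I2C interface so that the processor 110 and the touch sensor 180K communicate over the I2C bus interface to implement the touch function of the electronic device 100.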
I2S接口可以用于音频通信。在一些实施例中,处理器110可以包含多组I2S总线。处理器110可以通过I2S总线与音频模块170耦合,实现处理器110与音频模块170之间的通信。在一些实施例中,音频模块170可以通过I2S接口向无线通信模块160传递音频信号,实现通过蓝牙耳机接听电话的功能。
PCM接口也可以用于音频通信,将模拟信号抽样,量化和编码。在一些实施例中,音频模块170与无线通信模块160可以通过PCM总线接口耦合。在一些实施例中,音频模块170也可以通过PCM接口向无线通信模块160传递音频信号,实现通过蓝牙耳机接听电话的功能。所述I2S接口和所述PCM接口都可以用于音频通信。
UART接口是一种通用串行数据总线,用于异步通信。该总线可以为双向通信总线。它将要传输的数据在串行通信与并行通信之间转换。在一些实施例中,UART接口通常被用于连接处理器110与无线通信模块160。例如:处理器110通过UART接口与无线通信模块160中的蓝牙模块通信,实现蓝牙功能。在一些实施例中,音频模块170可以通过UART接口向无线通信模块160传递音频信号,实现通过蓝牙耳机播放音乐的功能。
MIPI接口可以被用于连接处理器110与显示屏194,摄像头193等外围器件。MIPI接口包括摄像头串行接口(camera serial interface,CSI),显示屏串行接口(display serial interface,DSI)等。在一些实施例中,处理器110和摄像头193通过CSI接口通信,实现电子设备100的拍摄功能。处理器110和显示屏194通过DSI接口通信,实现电子设备100的显示功能。
GPIO接口可以通过软件配置。GPIO接口可以被配置为控制信号,也可被配置为数据信号。在一些实施例中,GPIO接口可以用于连接处理器110与摄像头193,显示屏194,无线通信模块160,音频模块170,传感器模块180等。GPIO接口还可以被配置为I2C接口,I2S接口,UART接口,MIPI接口等。
USB接口130是符合USB标准规范的接口,具体可以是Mini USB接口,Micro USB接口,USB Type C接口等。USB接口130可以用于连接充电器为电子设备100充电,也可以用于电子设备100与外围设备之间传输数据。也可以用于连接耳机,通过耳机播放音频。该接口还可以用于连接其他电子设备,例如AR设备等。
可以理解的是,本实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对电子设备100的结构限定。在另一些实施例中,电子设备100也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
充电管理模块140用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块140可以通过USB接口130接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块140可以通过电子设备100的无线充电线圈接收无线充电输入。充电管理模块140为电池142充电的同时,还可以通过电源管理模块141为电子设备供电。
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110,内部存储器121,外部存储器,显示屏194,摄像头193,和无线通信模块160等供电。电源管理模块141还可以用于监测电池容量,电池循环次数,电池健康状态(漏电,阻抗)等参数。 在其他一些实施例中,电源管理模块141也可以设置于处理器110中。在另一些实施例中,电源管理模块141和充电管理模块140也可以设置于同一个器件中。
电子设备100的无线通信功能可以通过天线1,天线2,移动通信模块150,无线通信模块160,调制解调处理器以及基带处理器等实现。
天线1和天线2用于发射和接收电磁波信号。电子设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块150可以提供应用在电子设备100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一些实施例中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一些实施例中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块被设置在同一个器件中。
调制解调处理器可以包括调制器和解调器。其中,调制器用于将待发送的低频基带信号调制成中高频信号。解调器用于将接收的电磁波信号解调为低频基带信号。随后解调器将解调得到的低频基带信号传送至基带处理器处理。低频基带信号经基带处理器处理后,被传递给应用处理器。应用处理器通过音频设备(不限于扬声器170A,受话器170B等)输出声音信号,或通过显示屏194显示图像或视频。在一些实施例中,调制解调处理器可以是独立的器件。在另一些实施例中,调制解调处理器可以独立于处理器110,与移动通信模块150或其他功能模块设置在同一个器件中。
无线通信模块160可以提供应用在电子设备100上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),调频(frequency modulation,FM),近距离无线通信技术(near field communication,NFC),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。
In some embodiments, the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
电子设备100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏194用于显示图像,视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD),有机发光二极管(organic light-emitting diode,OLED),有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode,AMOLED),柔性发光二极管(flex light-emitting diode,FLED),Miniled,MicroLed,Micro-oLed,量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一些实施例中,电子设备100可以包括1个或N个显示屏194,N为大于1的正整数。
电子设备100可以通过ISP,摄像头193,视频编解码器,GPU,显示屏194以及应用处理器等实现拍摄功能。
ISP用于处理摄像头193反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将所述电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点,亮度,肤色进行算法优化。ISP还可以对拍摄场景的曝光,色温等参数优化。在一些实施例中,ISP可以设置在摄像头193中。
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一些实施例中,电子设备100可以包括1个或N个摄像头193,N为大于1的正整数。
数字信号处理器用于处理数字信号,除了可以处理数字图像信号,还可以处理其他数字信号。例如,当电子设备100在频点选择时,数字信号处理器用于对频点能量进行傅里叶变换等。
视频编解码器用于对数字视频压缩或解压缩。电子设备100可以支持一种或多种视频编解码器。这样,电子设备100可以播放或录制多种编码格式的视频,例如:动态图像专家组(moving picture experts group,MPEG)1,MPEG2,MPEG3,MPEG4等。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。 通过NPU可以实现电子设备100的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备100的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐,视频等文件保存在外部存储卡中。
内部存储器121可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。处理器110通过运行存储在内部存储器121的指令,从而执行电子设备100的各种功能应用以及数据处理。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,至少一个功能所需的应用程序(比如声音播放功能,图像播放功能等)等。存储数据区可存储电子设备100使用过程中所创建的数据(比如音频数据,电话本等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。
电子设备100可以通过音频模块170,扬声器170A,受话器170B,麦克风170C,耳机接口170D,以及应用处理器等实现音频功能。例如音乐播放,录音等。
音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。在一些实施例中,音频模块170可以设置于处理器110中,或将音频模块170的部分功能模块设置于处理器110中。
扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。电子设备100可以通过扬声器170A收听音乐,或收听免提通话。
受话器170B,也称“听筒”,用于将音频电信号转换成声音信号。当电子设备100接听电话或语音信息时,可以通过将受话器170B靠近人耳接听语音。
麦克风170C,也称“话筒”,“传声器”,用于将声音信号转换为电信号。当拨打电话或发送语音信息或需要通过语音助手触发电子设备100执行某些功能时,用户可以通过人嘴靠近麦克风170C发声,将声音信号输入到麦克风170C。电子设备100可以设置至少一个麦克风170C。在另一些实施例中,电子设备100可以设置两个麦克风170C,除了采集声音信号,还可以实现降噪功能。在另一些实施例中,电子设备100还可以设置三个,四个或更多麦克风170C,实现采集声音信号,降噪,还可以识别声音来源,实现定向录音功能等。
耳机接口170D用于连接有线耳机。耳机接口170D可以是USB接口130,也可以是3.5mm的开放移动电子设备平台(open mobile terminal platform,OMTP)标准接口,美国蜂窝电信工业协会(cellular telecommunications industry association of the USA,CTIA)标准接口。
压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A,电极之间的电容改变。电子设备100根据电容的变化确定压力的强度。当有触摸操作作用于显 示屏194,电子设备100根据压力传感器180A检测所述触摸操作强度。电子设备100也可以根据压力传感器180A的检测信号计算触摸的位置。在一些实施例中,作用于相同触摸位置,但不同触摸操作强度的触摸操作,可以对应不同的操作指令。例如:当有触摸操作强度小于第一压力阈值的触摸操作作用于短消息应用图标时,执行查看短消息的指令。当有触摸操作强度大于或等于第一压力阈值的触摸操作作用于短消息应用图标时,执行新建短消息的指令。
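The pressure-dependent behavior described above (same touch position, different touch intensity, different instruction) can be sketched as a simple threshold comparison. The threshold value, its units, and the instruction names are assumptions made for illustration only.

```python
# Assumed first pressure threshold in arbitrary normalized units.
FIRST_PRESSURE_THRESHOLD = 0.5


def sms_icon_instruction(touch_intensity, threshold=FIRST_PRESSURE_THRESHOLD):
    """Map a touch on the SMS application icon to an instruction.

    Intensity below the first pressure threshold: view the message.
    Intensity at or above the threshold: create a new message.
    """
    if touch_intensity < threshold:
        return "view_sms"
    return "new_sms"
```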
陀螺仪传感器180B可以用于确定电子设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定电子设备100围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器180B检测电子设备100抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消电子设备100的抖动,实现防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景。
气压传感器180C用于测量气压。在一些实施例中,电子设备100通过气压传感器180C测得的气压值计算海拔高度,辅助定位和导航。
磁传感器180D包括霍尔传感器。电子设备100可以利用磁传感器180D检测翻盖皮套的开合。在一些实施例中,当电子设备100是翻盖机时,电子设备100可以根据磁传感器180D检测翻盖的开合。进而根据检测到的皮套的开合状态或翻盖的开合状态,设置翻盖自动解锁等特性。
加速度传感器180E可检测电子设备100在各个方向上(一般为三轴)加速度的大小。当电子设备100静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。
距离传感器180F,用于测量距离。电子设备100可以通过红外或激光测量距离。在一些实施例中,拍摄场景,电子设备100可以利用距离传感器180F测距以实现快速对焦。
接近光传感器180G可以包括例如发光二极管(LED)和光检测器,例如光电二极管。发光二极管可以是红外发光二极管。电子设备100通过发光二极管向外发射红外光。电子设备100使用光电二极管检测来自附近物体的红外反射光。当检测到充分的反射光时,可以确定电子设备100附近有物体。当检测到不充分的反射光时,电子设备100可以确定电子设备100附近没有物体。电子设备100可以利用接近光传感器180G检测用户手持电子设备100贴近耳朵通话,以便自动熄灭屏幕达到省电的目的。接近光传感器180G也可用于皮套模式,口袋模式自动解锁与锁屏。
环境光传感器180L用于感知环境光亮度。电子设备100可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测电子设备100是否在口袋里,以防误触。
指纹传感器180H用于采集指纹。电子设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。
温度传感器180J用于检测温度。在一些实施例中,电子设备100利用温度传感器180J检测的温度,执行温度处理策略。例如,当温度传感器180J上报的温度超过阈值, 电子设备100执行降低位于温度传感器180J附近的处理器的性能,以便降低功耗实施热保护。在另一些实施例中,当温度低于另一阈值时,电子设备100对电池142加热,以避免低温导致电子设备100异常关机。在其他一些实施例中,当温度低于又一阈值时,电子设备100对电池142的输出电压执行升压,以避免低温导致的异常关机。
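As a rough illustration of the temperature handling policy, the sketch below combines the three threshold checks in one function. The patent describes them as separate embodiments and gives no numeric values, so the thresholds and action names here are assumptions.

```python
def thermal_actions(temp_c, high=45.0, low_heat=0.0, low_boost=-10.0):
    """Return the protective actions for a reported temperature.

    All threshold values are placeholders chosen for this sketch.
    """
    actions = []
    if temp_c > high:
        # Reduce performance of the processor near the sensor to cut power.
        actions.append("throttle_nearby_processor")
    if temp_c < low_heat:
        # Heat the battery to avoid an abnormal shutdown in the cold.
        actions.append("heat_battery")
    if temp_c < low_boost:
        # Boost the battery output voltage at even lower temperatures.
        actions.append("boost_battery_output_voltage")
    return actions
```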
触摸传感器180K,也称“触控面板”。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器180K也可以设置于电子设备100的表面,与显示屏194所处的位置不同。
骨传导传感器180M可以获取振动信号。在一些实施例中,骨传导传感器180M可以获取人体声部振动骨块的振动信号。骨传导传感器180M也可以接触人体脉搏,接收血压跳动信号。在一些实施例中,骨传导传感器180M也可以设置于耳机中,结合成骨传导耳机。音频模块170可以基于所述骨传导传感器180M获取的声部振动骨块的振动信号,解析出语音信号,实现语音功能。应用处理器可以基于所述骨传导传感器180M获取的血压跳动信号解析心率信息,实现心率检测功能。
按键190包括开机键,音量键等。按键190可以是机械按键。也可以是触摸式按键。电子设备100可以接收按键输入,产生与电子设备100的用户设置以及功能控制有关的键信号输入。
马达191可以产生振动提示。马达191可以用于来电振动提示,也可以用于触摸振动反馈。例如,作用于不同应用(例如拍照,音频播放等)的触摸操作,可以对应不同的振动反馈效果。作用于显示屏194不同区域的触摸操作,马达191也可对应不同的振动反馈效果。不同的应用场景(例如:时间提醒,接收信息,闹钟,游戏等)也可以对应不同的振动反馈效果。触摸振动反馈效果还可以支持自定义。
指示器192可以是指示灯,可以用于指示充电状态,电量变化,也可以用于指示消息,未接来电,通知等。
SIM卡接口195用于连接SIM卡。SIM卡可以通过插入SIM卡接口195,或从SIM卡接口195拔出,实现和电子设备100的接触和分离。电子设备100可以支持1个或N个SIM卡接口,N为大于1的正整数。SIM卡接口195可以支持Nano SIM卡,Micro SIM卡,SIM卡等。同一个SIM卡接口195可以同时插入多张卡。所述多张卡的类型可以相同,也可以不同。SIM卡接口195也可以兼容不同类型的SIM卡。SIM卡接口195也可以兼容外部存储卡。电子设备100通过SIM卡和网络交互,实现通话以及数据通信等功能。在一些实施例中,电子设备100采用eSIM,即:嵌入式SIM卡。eSIM卡可以嵌在电子设备100中,不能和电子设备100分离。
电子设备100的软件系统可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。本实施例以分层架构的Android系统为例,示例性说明电子设备100的软件结构。
请参考图2,其是本实施例提供的一种电子设备100的软件结构框图。其中,分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接 口通信。在一些实施例中,将Android系统分为四层,从上至下分别为应用程序层,应用程序框架层,安卓运行时(Android runtime)和系统库,以及内核层。
应用程序层可以包括一系列应用程序包。
如图2所示,应用程序包可以包括语音助手,图库,日历,通话,地图,导航,WLAN,蓝牙,音乐,视频,短信息等应用程序。
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。
如图2所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器等。
窗口管理器用于管理窗口程序。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。
内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问。所述数据可以包括视频,图像,音频,拨打和接听的电话,浏览历史和书签,电话簿等。
视图系统包括可视控件,例如显示文字的控件,显示图片的控件等。视图系统可用于构建应用程序。显示界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。
电话管理器用于提供电子设备100的通信功能。例如通话状态的管理(包括接通,挂断等)。
资源管理器为应用程序提供各种资源,比如本地化字符串,图标,图片,布局文件,视频文件等等。
通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。比如通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或者滚动条文本形式出现在系统顶部状态栏的通知,例如后台运行的应用程序的通知,还可以是以对话窗口形式出现在屏幕上的通知。例如在状态栏提示文本信息,发出提示音,电子设备振动,指示灯闪烁等。
Android Runtime包括核心库和虚拟机。Android runtime负责安卓系统的调度和管理。
核心库包含两部分:一部分是java语言需要调用的功能函数,另一部分是安卓的核心库。
应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。
系统库可以包括多个功能模块。例如:表面管理器(surface manager),媒体库(Media Libraries),三维图形处理库(例如:OpenGL ES),2D图形引擎(例如:SGL)等。
表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层的融合。
媒体库支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4,H.264,MP3,AAC,AMR,JPG, PNG等。
三维图形处理库用于实现三维图形绘图,图像渲染,合成,和图层处理等。
2D图形引擎是2D绘图的绘图引擎。
内核层是硬件和软件之间的层。内核层至少包含显示驱动,摄像头驱动,音频驱动,传感器驱动。
示例性的,以下实施例中所涉及的技术方案均可以在具有上述硬件架构和软件架构的电子设备100中实现。以下结合附图和应用场景对本实施例提供的触发电子设备执行功能的方法进行详细介绍。
图3为本实施例提供的一种触发电子设备执行功能的方法的流程示意图。如图3所示,该方法可以包括以下S301-S304。
其中,电子设备中设置有一个或多个第一唤醒词。该第一唤醒词可以是预定义的,也可以是用户自定义的。如果第一唤醒词是用户自定义的,那么在电子设备中注册第一唤醒词的过程可以参考常规技术中的具体描述,本实施例此处不予赘述。该第一唤醒词可以是用户会经常用到的语音命令,如“静音”,“接听电话”等。电子设备中设置的第一唤醒词可以对应一个第一指令。在本实施例中,当电子设备中设置有多个第一唤醒词(或者,称为至少两个第一唤醒词)时,电子设备响应不同第一唤醒词对应的第一指令所执行的功能是不同的。以下实施例中,以电子设备中设置有至少两个第一唤醒词为例进行详细说明。
S301、电子设备接收用户输入的第一语音数据。
其中,在电子设备没有其他软硬件使用麦克风采集语音数据的情况下,电子设备的第一DSP可以通过麦克风实时监测用户是否有语音数据输入。一般情况下,在用户想要通过输入语音数据触发电子设备执行某些功能时,可以靠近手机的麦克风发声,以将发出的声音输入到麦克风。此时若电子设备没有其他软硬件正在使用麦克风采集语音数据,则电子设备的第一DSP可以通过麦克风监测到对应的语音数据,如第一语音数据。
需要说明的是,在本实施例中,电子设备处于黑屏,锁屏,亮屏等任何状态时,只要电子设备没有其他软硬件正在使用麦克风采集语音数据,电子设备的第一DSP均可以监测用户的语音数据。其中,在黑屏状态下,电子设备的AP可以处于休眠状态,也可以处于非休眠状态。
S302、电子设备判断至少两个第一唤醒词中是否存在文本与第一语音数据对应的文本匹配的唤醒词。
其中,在电子设备接收到上述第一语音数据后,可以对该第一语音数据进行文本校验,即判断设置在电子设备中的至少两个第一唤醒词中,是否存在文本与第一语音数据对应的文本匹配的唤醒词,以确定接收到的第一语音数据是否是设置在电子设备中的第一唤醒词。
如果文本校验通过,则表明接收到的第一语音数据是设置在电子设备中的第一唤醒词,此时电子设备可以执行S303。如果文本校验未通过,则表明接收到的第一语音数据不是设置在电子设备中的第一唤醒词,此时电子设备可以执行S304。
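In some embodiments, the text verification performed by the electronic device on the first voice data may specifically include: the first DSP of the electronic device performs text verification on the first voice data, and/or the AP of the electronic device performs text verification on the first voice data. If both the first DSP and the AP perform text verification on the first voice data, the two may verify at different precisions. For example, the first DSP performs text verification of a first precision on the first voice data, the AP performs text verification of a second precision on the first voice data, and the first precision is lower than the second precision.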
例如,以电子设备对第一语音数据进行文本校验具体包括:第一DSP对该第一语音数据进行第一精度的文本校验,以及AP对该第一语音数据进行第二精度的文本校验,电子设备的AP处于休眠状态为例详细介绍对第一语音数据进行文本校验的过程。在电子设备的第一DSP监测到第一语音数据后,第一DSP可以对该第一语音数据进行第一精度(或称为低精度)的文本校验。也就是说,第一DSP可以判断至少两个第一唤醒词中,是否存在文本与第一语音数据对应的文本的匹配度满足第一精度的唤醒词。若第一DSP确定至少两个第一唤醒词中,存在文本与第一语音数据对应的文本的匹配度满足第一精度的唤醒词,也就是说,第一DSP对第一语音数据的第一精度的文本校验通过,则第一DSP可以唤醒电子设备的AP,并将监测到的第一语音数据传输至AP。如果第一DSP确定至少两个第一唤醒词中,不存在文本与第一语音数据对应的文本的匹配度满足第一精度的唤醒词,则表明接收到的第一语音数据不是设置在电子设备中的第一唤醒词,电子设备可以执行S304。
电子设备的AP接收到第一语音数据后,可以对该第一语音数据进行第二精度(或称为高精度)的文本校验。也就是说,AP可以判断至少两个第一唤醒词中,是否存在文本与第一语音数据对应的文本的匹配度满足第二精度的唤醒词。如果AP确定至少两个第一唤醒词中存在文本与第一语音数据对应的文本的匹配度满足第二精度的唤醒词,即AP对第一语音数据的第二精度的文本校验通过,则表明接收到的第一语音数据是设置在电子设备中的第一唤醒词,电子设备可以执行S303。如果电子设备的AP确定至少两个第一唤醒词中不存在文本与第一语音数据对应的文本的匹配度满足第二精度的唤醒词,即AP对第一语音数据的第二精度的文本校验未通过,则表明接收到的第一语音数据不是设置在电子设备中的第一唤醒词,电子设备可以执行S304。
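The two-stage check described above (a low-precision pass on the first DSP, then a high-precision pass on the AP) can be sketched as follows. This is a simplified illustration, not the patented implementation: it assumes the voice data has already been converted to text, uses `difflib.SequenceMatcher` as a stand-in for the unspecified matching-degree measure, and uses assumed threshold values for the first and second precision. The English wake-up words are placeholders.

```python
from difflib import SequenceMatcher

FIRST_PRECISION = 0.6   # assumed low-precision threshold (first DSP)
SECOND_PRECISION = 0.9  # assumed high-precision threshold (AP)


def best_match(text, wake_words):
    """Return (wake_word, score) for the wake-up word closest to text."""
    return max(
        ((word, SequenceMatcher(None, text, word).ratio()) for word in wake_words),
        key=lambda pair: pair[1],
    )


def verify_first_voice_data(text, wake_words):
    """Return (matched_wake_word_or_None, list_of_steps_taken)."""
    steps = []
    # Stage 1: low-precision check, modeled as running on the first DSP.
    _, score = best_match(text, wake_words)
    if score < FIRST_PRECISION:
        steps.append("S304: delete voice data")
        return None, steps
    steps.append("first DSP wakes AP")
    # Stage 2: high-precision check, modeled as running on the AP.
    word, score = best_match(text, wake_words)
    if score < SECOND_PRECISION:
        steps.append("S304: delete voice data")
        return None, steps
    steps.append("S303: execute instruction")
    return word, steps
```

In this sketch both stages share one scorer and differ only in threshold. A real device would more plausibly run a small model on the DSP and a larger one on the AP, which is what lets the AP stay asleep for most non-matching audio.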
S303、电子设备确定第一语音数据对应的第一指令,通过电子设备的主处理器执行第一指令对应的功能。
如果电子设备确定出至少两个第一唤醒词中存在文本与第一语音数据对应的文本匹配的唤醒词,则电子设备可以确定第一语音数据对应的第一指令,并执行第一指令对应的功能,以实现通过输入语音数据触发电子设备执行对应功能的目的。
例如,电子设备确定第一语音数据对应的第一指令,具体可以为:电子设备中存储有设置在电子设备中的至少两个第一唤醒词与指令的对应关系。在电子设备对第一语音数据的文本校验通过后,电子设备可以查找该对应关系,以确定出第一语音数据对应的第一指令。
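The stored correspondence between first wake-up words and instructions amounts to a lookup table. A minimal sketch follows; the instruction identifiers are hypothetical and the wake-up words are English stand-ins for the Chinese phrases used in the examples.

```python
# Hypothetical correspondence table: first wake-up word -> first instruction.
WAKE_WORD_TO_INSTRUCTION = {
    "mute": "SYSTEM_MUTE",
    "unmute": "SYSTEM_UNMUTE",
    "navigate home": "NAV_ROUTE_HOME",
    "answer the call": "CALL_ANSWER",
}


def instruction_for(wake_word):
    """Look up the first instruction for a verified wake-up word.

    Returns None when the word is not in the table.
    """
    return WAKE_WORD_TO_INSTRUCTION.get(wake_word)
```

A table lookup avoids starting the voice assistant for semantic analysis, at the cost of handling only the exact phrases that were registered.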
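As another example, the electronic device may perform semantic analysis on the text of the first voice data to determine the first instruction corresponding to the first voice data. The semantic analysis function may be implemented as a separate module in the electronic device, or by a module integrated into an application. Suppose the semantic analysis function is implemented by a module integrated into the voice assistant. In that case, when the electronic device determines that the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data, the electronic device may start the voice assistant, have the voice assistant perform semantic analysis on the text corresponding to the first voice data to determine the first instruction, and trigger, through the main processor, the electronic device to perform the function corresponding to the first instruction. In this implementation, the first voice data serves not only as a wake-up word that wakes up the voice assistant but also as a voice command that triggers the electronic device to perform the corresponding function. The voice assistant may be an application (APP) installed in the electronic device. In this embodiment, the voice assistant may be a system application or a third-party application. A system application, also called an embedded application, is an application provided as part of the implementation of the electronic device. A third-party application, also called a downloadable application, is an application that can provide its own Internet Protocol Multimedia Subsystem (IMS) connection; it may be pre-installed in the electronic device, or downloaded and installed by the user.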
S304、电子设备删除第一语音数据。
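With the method for triggering an electronic device to perform a function provided in this embodiment, after the received voice data passes verification, the electronic device can determine the instruction corresponding to the input voice data and perform the function corresponding to that instruction. As long as no other software or hardware of the electronic device is using the microphone to collect voice data (even if the screen is off and the AP is in the sleep state), the user does not need to first input a wake-up word so that the electronic device starts the voice assistant and then input a voice command; a single piece of voice data is enough to trigger the electronic device to perform the corresponding function. This improves the usage efficiency of the electronic device, enables efficient interaction between the electronic device and the user, and improves the user experience.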
在一些实施例中,如果上述第一唤醒词是用户自定义的唤醒词,那么,在对第一语音数据进行文本校验通过后,可以继续对其进行声纹校验。也就是说,在文本校验通过后,AP可以判断第一语音数据的声纹特征与设置在电子设备中的至少两个第一唤醒词对应的声纹特征是否匹配。如果第一语音数据的声纹特征与设置在电子设备中的至少两个第一唤醒词对应的声纹特征匹配,则对第一语音数据的声纹校验通过,电子设备可以执行上述S303。如果第一语音数据的声纹特征与设置在电子设备中的至少两个第一唤醒词对应的声纹特征不匹配,则对第一语音数据的声纹校验未通过,电子设备可以执行上述S304。也就是说,在电子设备接收到上述第一语音数据后,可以在对该第一语音数据进行文本校验和声纹校验均通过后,才可执行上述S303。这样,只有在电子设备中注册过唤醒词的人才能够通过输入语音数据触发电子设备执行对应功能,提高了语音控制服务的使用安全性。
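For example, when the at least two first wake-up words are set in the electronic device, the electronic device may generate a first voiceprint model from the voice data input by the user during that setup. The first voiceprint model may be used to characterize the voiceprint features of the at least two first wake-up words. If the electronic device determines that the at least two first wake-up words include a wake-up word whose text matches the text corresponding to the first voice data, that is, the text verification passes, the AP may go on to perform voiceprint verification on the first voice data according to the first voiceprint model. Specifically, after generating the first voiceprint model, the electronic device may feed the voice data input by the user during setup into the first voiceprint model to obtain a first voiceprint threshold. After determining that the detected first voice data passes the text verification, the electronic device may feed the first voice data into the first voiceprint model to obtain a voiceprint value, referred to here as the second voiceprint value. The electronic device may then determine whether the difference between the second voiceprint value and the first voiceprint threshold is less than a preset threshold. If the difference is less than the preset threshold, the voiceprint verification passes. If the difference is greater than or equal to the preset threshold, the voiceprint verification fails.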
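The threshold comparison in the voiceprint check can be sketched as below. The voiceprint model itself is not specified in the text, so it is represented here by any callable that maps audio samples to a single number; the function names are hypothetical, and taking the absolute difference is an assumption, since the text only says "difference".

```python
def enroll_voiceprint_threshold(model, enrollment_audio):
    """Feed the enrollment audio into the model to get the first threshold."""
    return model(enrollment_audio)


def voiceprint_passes(model, first_threshold, audio, preset_threshold):
    """Pass when the new value is within preset_threshold of the enrolled one."""
    second_value = model(audio)
    return abs(second_value - first_threshold) < preset_threshold


def toy_model(samples):
    """Placeholder model for illustration only: the mean of the samples."""
    return sum(samples) / len(samples)
```

With `toy_model`, audio whose mean is close to the enrollment mean passes and audio far from it fails; a real voiceprint model would produce speaker-dependent scores instead.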
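As an illustration, take the case where multiple predefined first wake-up words are set in the electronic device. For example, as shown in Table 1, the first wake-up words set in the electronic device include the following commonly used voice commands. System setting commands, such as "mute", "unmute", "turn up the volume" (or "louder"), "turn down the volume" (or "quieter"), and "lock the screen". Navigation setting commands, such as "exit navigation", "stop navigation", "switch route" (or "take another route"), "navigate home", and "navigate to the office". Music setting commands, such as "previous song", "next song" (or "skip song"), "pause music" (or "pause playback"), "start music" (or "start playback"), and "stop music" (or "stop playback"). Communication setting commands, such as "hang up", "answer the call", "view text messages", "reply to text message", "read text messages aloud", "read WeChat", and "reply to WeChat".
Table 1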
例如,电子设备通过语音助手进行语义分析获得语音数据对应的指令。电子设备当前没有其他软硬件使用麦克风采集语音数据。且如图4中的(a)所示,电子设备处于黑屏状态,电子设备的AP也处于休眠状态。用户想要通过输入语音数据触发电子设备将电子设备的音量调小。用户可以靠近手机的麦克风说出“调低音量”。电子设备的第一DSP可以通过麦克风监测到对应的语音数据“调低音量”。第一DSP监测到语音数据“调低音量”后,可以对该语音数据“调低音量”进行第一精度的文本校验。在第一DSP对语音数据“调低音量”的第一精度的文本校验通过时,第一DSP可以唤醒AP,并将语音数据“调低音量”传输至AP。AP可以对该语音数据“调低音量”进行第二精度的文本校验。在AP对语音数据“调低音量”的第二精度的文本校验通过时,可以启动语音助手,并通过语音助手对语音数据“调低音量”进行语义分析,以确定出该语音数据“调低音量”对应的指令。此时AP根据该指令触发电子设备将系统的音量调低。
为了方便用户获知输入语音数据后,电子设备是否已经进行了相应的响应。电子设备可以在进行了相应的响应后,点亮屏幕并显示提示信息,以提示用户已进行了相应的响应。例如,结合上述示例,如图4中的(b)所示,在语音助手启动后,电子设备可以点亮屏幕,并显示语音助手界面401。该语音助手界面401中可以包括识别出的用户输入的语音数据对应的文字“调低音量”402。在电子设备将系统的音量调低后,在语音助手界面401中可以显示提示信息403。该提示信息403用于向用户提示已将系统的音量调低。可以看到的是,在电子设备处于黑屏状态,且AP处于休眠状态的情况下,无需用户先输入语音数据“你好小E”,来唤醒语音助手,再输入语音数据“调 小手机音量(或调低音量)”,而是只需输入一条语音数据“调低音量”,便可触发电子设备将音量调低,达成意图。提高了人机交互的效率,用户体验有了很大的提升。
可以理解的是,由于上述第一DSP具有较高的处理能力,因此其功耗也相对较高。在本实施例中,为了在实现电子设备与用户之间高效互动的同时,尽可能的节省电子设备的功耗。可以仅在特定的场景下使用上述第一DSP对麦克风采集到的语音数据进行处理。也就是说,可以仅在特定的场景下通过执行以上S301-S304,以达到通过输入语音数据触发电子设备执行对应功能的目的。即如图5所示,在上述S301之前,该触发电子设备执行功能的方法还可以包括S501。
S501、电子设备进入预定模式。
其中,该预定模式可以是驾驶模式,居家模式等。在该预定模式下,当电子设备没有其他软硬件使用麦克风采集语音数据的情况下,电子设备的麦克风可以将采集到的语音数据传输至电子设备的第一DSP,以便第一DSP对语音数据进行处理。
在一些实施例中,电子设备可以在某些特定情况下自动进入预定模式。示例性的,电子设备可以在某些特定的时间段,或某些特定的地点自动进入预定模式。
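Generally, the electronic device may use historical usage records to infer the user's intent in using the electronic device at the current place or time, for example whether the user triggers the electronic device to perform functions by inputting voice data. When the intent is to trigger functions by inputting voice data, the electronic device may automatically enter the predetermined mode. In other words, the electronic device may use historical usage records to determine the times and/or places at which the user frequently triggers functions by voice, and may automatically enter the predetermined mode when the current time or current place matches those records. For example, if the electronic device learns that the user often triggers functions by voice between 19:00 and 20:30, it may automatically enter the predetermined mode when the current system time falls within 19:00-20:30. As another example, if the electronic device learns that the user often triggers functions by voice within a certain geographic range (for example, the user's home), it may automatically enter the predetermined mode (such as a home mode) when it determines that its current geographic location is within that range.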
在其他一些实施例中,电子设备可以在确定电子设备的移动速度大于特定值时自动进入预定模式(如驾驶模式)。一般的,当用户在驾驶汽车时,可能会较频繁的通过输入语音数据触发电子设备执行对应功能,如通过语音助手触发电子设备的地图应用进行导航等。因此,当电子设备检测到电子设备当前的移动速度大于特定值时,自动进入预定模式(如驾驶模式)。
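The automatic triggers described above (time window, place, and movement speed) can be combined in a small decision function. The 19:00-20:30 window comes from the example in the text; the speed limit, the priority order of the checks, and the mode names are assumptions for this sketch.

```python
from datetime import time


def mode_to_enter(now, speed_kmh, in_home_area,
                  window=(time(19, 0), time(20, 30)),
                  speed_limit_kmh=30.0):
    """Return the predetermined mode to enter automatically, or None.

    now: current system time as datetime.time
    speed_kmh: current movement speed of the device
    in_home_area: True if the device is inside the learned home range
    """
    if speed_kmh > speed_limit_kmh:
        return "driving"          # moving faster than the assumed limit
    if in_home_area:
        return "home"             # inside the learned geographic range
    if window[0] <= now <= window[1]:
        return "predetermined"    # inside the learned time window
    return None
```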
在另外一些实施例中,电子设备可以响应于用户的特定输入,进入预定模式。该特定输入可以是用户对某特定虚拟按钮或物理按键的触发操作。例如,以特定输入是用户对特定虚拟按钮(如“预定模式”选项的开关按钮)的触发操作为例。电子设备包括设置(Settings)。如图6所示,电子设备可以接收用户对设置的图标的点击操作。响应于用户对设置的图标的点击操作,电子设备可以显示图6所示的设置界面601。该设置界面601中可以包括“飞行模式”设置选项、“WLAN”设置选项、“蓝牙”设置选项、“移动网络”设置选项、“预定模式”设置选项(如图6中以“驾驶模式”设置选项为例示出)等。其中,“飞行模式”选项、“WLAN”选项、“蓝牙”选项 和“移动网络”选项的具体功能可以参考常规技术中的具体描述,本实施例这里不予赘述。用户在想要使用上述第一唤醒词触发电子设备执行对应功能时,可以对“驾驶模式”选项的开关按钮602进行点击操作。电子设备响应于用户对“驾驶模式”选项的开关按钮602的点击操作,可以进入预定模式,如驾驶模式。当用户再次对“驾驶模式”选项的开关按钮602执行点击操作后,电子设备可以退出驾驶模式。其中,图6所示的“驾驶模式”选项的开关按钮602的显示效果,用于指示驾驶模式未打开,用户此时可以对该开关按钮602执行点击操作,以便该电子设备进入驾驶模式。
上述特定输入也可以是用户输入的语音命令。该语音命令可以是通过电子设备的语音助手输入的。例如,电子设备中还可以设置有第二唤醒词。该第二唤醒词可以用于唤醒电子设备中的语音助手。在用户通过第二唤醒词唤醒语音助手后,便可以通过语音助手输入语音命令,以触发电子设备进入上述预定模式。在电子设备进入预定模式之前,电子设备可以使用另一DSP,如第二DSP监测语音数据,以便可以通过第二唤醒词唤醒语音助手。在电子设备退出上述预定模式后,或者未处于上述预定模式下时,电子设备可以使用该第二DSP对麦克风采集到的语音数据进行处理。由于第二DSP的处理性能相较于第一DSP的处理性能低,内存相较于第一DSP的内存小。因此,其功耗相较于第一DSP的功耗低。这样,不仅可以保证在预定场景下用户使用语音助手时电子设备与用户之间的高效互动,同时节省了电子设备的功耗。而且,在电子设备处于非特定的场景下时,仍可以通过语音助手触发电子设备执行对应功能,如触发电子设备进入预定模式。
示例性的,以语音助手处于休眠状态为例。在用户想要通过语音助手输入语音命令触发电子设备进入预定模式时,用户可以靠近手机的麦克风发声,以将发出的声音输入到麦克风。此时电子设备的第二DSP可以通过麦克风监测到用户输入的语音数据,如第二语音数据。在电子设备的第二DSP监测到第二语音数据后,电子设备可以判断该第二语音数据与设置在电子设备中的第二唤醒词是否匹配。当第二语音数据与第二唤醒词匹配时,可以启动语音助手。此时用户可以通过语音助手输入用于触发电子设备进入预定模式的语音数据,如第三语音数据。电子设备通过语音助手可以接收到用户输入的第三语音数据,并确定该第三语音数据对应的第二指令。该第二指令可以用于指示电子设备进入预定模式。电子设备可以执行第二指令对应的功能,即进入预定模式。
其中,电子设备判断第二语音数据与设置在电子设备中的第二唤醒词是否匹配具体的可以是:电子设备对第二语音数据进行文本校验,如果第二语音数据对应的文本与第二唤醒词的文本匹配,则文本校验通过,第二语音数据与第二唤醒词匹配。如果第二语音数据对应的文本与第二唤醒词的文本不匹配,则文本校验未通过,第二语音数据与第二唤醒词不匹配。或者,电子设备判断第二语音数据与设置在电子设备中的第二唤醒词是否匹配具体的可以是:电子设备对第二语音数据进行文本校验和声纹校验。如果第二语音数据对应的文本与第二唤醒词的文本匹配,且第二语音数据的声纹特征与第二唤醒词对应的声纹特征匹配,则文本校验和声纹校验通过,第二语音数据与第二唤醒词匹配。如果第二语音数据对应的文本与第二唤醒词的文本不匹配,或第二语音数据的声纹特征与第二唤醒词对应的声纹特征不匹配,则文本校验和声纹校验 未通过,第二语音数据与第二唤醒词不匹配。
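In some embodiments, the text verification performed by the electronic device on the second voice data may include: the second DSP of the electronic device performs text verification of a third precision on the second voice data, and/or the AP of the electronic device performs text verification of a fourth precision on the second voice data. The third precision is lower than the fourth precision. The third precision may be the same as or different from the first precision described above, and the fourth precision may be the same as or different from the second precision described above.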
例如,以电子设备判断第二语音数据与设置在电子设备中的第二唤醒词是否匹配具体包括:电子设备对第二语音数据进行文本校验和声纹校验。电子设备对第二语音数据进行文本校验包括:第二DSP对第二语音数据进行第三精度的文本校验,以及AP对第二语音数据进行第四精度的文本校验为例,详细说明电子设备判断第二语音数据与第二唤醒词是否匹配的过程。电子设备的第二DSP监测到第二语音数据后,可以判断第二语音数据对应的文本与第二唤醒词的文本的匹配度是否满足第三精度。如果第二语音数据对应的文本与第二唤醒词的文本的匹配度满足第三精度,则可以将第二语音数据传输至AP。AP可以判断第二语音数据对应的文本与第二唤醒词的文本的匹配度是否满足第四精度。如果第二语音数据对应的文本与第二唤醒词的文本的匹配度满足第四精度,电子设备的AP可以判断第二语音数据的声纹特征与第二唤醒词对应的声纹特征是否匹配。如果第二语音数据的声纹特征与第二唤醒词对应的声纹特征匹配,则表明第二语音数据是设置在电子设备中的第二唤醒词,此时电子设备可以启动语音助手。
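After the electronic device enters the predetermined mode, a user who wants to trigger the electronic device to perform a function by voice only needs to input one of the first wake-up words to achieve that intent. That is, after the electronic device enters the predetermined mode, the user does not need to input voice data multiple times; directly inputting a first wake-up word triggers the electronic device to perform the corresponding function. This enables efficient interaction between the electronic device and the user while saving as much power as possible.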
以下结合图7、图8和具体实例,对本实施例提供的触发电子设备执行功能的方法进行具体介绍。例如,电子设备为手机。手机中包括语音助手。手机通过语音助手对用户输入的语音数据对应的文本进行语义分析以获得对应指令。该手机包括两个DSP,分别为DSP 1和DSP 2。手机包括的麦克风同一时间仅与这两个DSP中的其中一个DSP建立通路。麦克风默认与DSP2建立了通路。在手机进入驾驶模式后,将麦克风通路由DSP 2切换到DSP 1。
手机中设置有5个预定义的第一唤醒词,分别为:“退出导航”,“停止导航”,“切换路线”,“导航回家”,“导航去公司”。手机中还设置有1个第二唤醒词,“你好小E”。
如图7中的(a)所示,手机处于黑屏状态。AP处于休眠状态。语音助手处于休眠状态。用户在驾驶汽车时,靠近手机的麦克风说出“你好小E”。手机的麦克风采集到“你好小E”对应的语音数据1。手机的麦克风将采集到的语音数据1传输至DSP 2。DSP 2判别该语音数据1对应的文本与设置到手机中的第二唤醒词“你好小E”的文本的匹配度是否满足精度1,以确定语音数据1是否疑似是设置在手机中的第二唤醒词“你好小E”。DSP 2在确定接收到的语音数据1疑似是设置在手机中的第二唤醒 词“你好小E”时,将AP从休眠状态唤醒,并将语音数据1传输手机的AP。AP接收到语音数据1后,判断该语音数据1对应的文本与设置到手机中的第二唤醒词“你好小E”的文本的匹配度是否满足精度2。精度2大于精度1。手机的AP在确定该语音数据1对应的文本与设置到手机中的第二唤醒词“你好小E”的文本的匹配度满足精度2时,判断该语音数据1的声纹特征与第二唤醒词“你好小E”对应的声纹特征是否匹配,即进行声纹校验。如果声纹校验通过。手机的AP可以唤醒语音助手。如图7中的(b)所示,手机点亮屏幕,并显示语音助手界面701。该语音助手界面701中可以包括提示信息702。该提示信息702用于提示用户此时可输入语音命令,以触发手机执行对应功能。
用户靠近手机的麦克风说出“进入驾驶模式”。手机可以通过语音助手接收到对应的语音数据2。如图7中的(c)所示,手机可以在语音助手界面703中显示语音数据2对应的文字“进入驾驶模式”704。手机通过语音助手对该语音数据2进行语义分析,以确定出该语音数据2对应的指令。手机可以根据该指令触发手机进入驾驶模式。并且,手机将麦克风通路由DSP 2切换到DSP 1。在手机进入驾驶模式后,如图7中的(c)所示,手机可以在语音助手界面703中显示提示信息705。该提示信息705用于提示用户已进入驾驶模式,后续可直接说出第一唤醒词以触发手机执行对应功能。
一般的,手机根据语音助手接收到的语音数据执行了对应功能后,语音助手会进入空闲态。或者手机会在确定预定时间内没有接收到用户操作时,控制语音助手重新进入休眠状态。另外,在一定时间内若没有接收到用户操作,AP也可能会重新进入休眠状态。在现有技术中,语音助手在空闲态或者进入休眠状态(或者语音助手在空闲态或者进入休眠状态,且AP也处于休眠状态)后,如果用户想要再次使用语音助手触发手机执行对应功能,则需要重新输入唤醒词,然后再输入语音命令,才能达成意图。但是,在本实施例中,在手机进入驾驶模式后,即使语音助手在空闲态或者进入休眠状态(或者语音助手在空闲态或者进入休眠状态,且AP处于休眠状态),用户只需输入上述第一唤醒词便可达成意图。
例如,以手机进入驾驶模式后,语音助手重新进入休眠状态,且如图8中的(a)所示,手机重新处于黑屏状态,AP处于休眠状态。用户靠近手机的麦克风说出“导航回家”。手机的麦克风采集到“导航回家”对应的语音数据3。手机的麦克风将采集到的语音数据3传输至DSP 1。DSP 1可以对语音数据3进行较低精度的文本匹配,即判别设置在手机中的5个第一唤醒词中,是否存在文本与该语音数据3对应的文本的匹配度满足精度3的唤醒词。DSP 1在较低精度的文本匹配通过时,唤醒AP,并将语音数据3传输至AP。此时手机的AP可以对语音数据3进行较高精度的文本匹配,即判别设置在手机中的5个第一唤醒词中,是否存在文本与该语音数据3对应的文本的匹配度满足精度4的唤醒词。精度4大于精度3。在较高精度的文本校验通过时,手机可以唤醒语音助手。手机还可以通过语音助手对语音数据3进行语义分析,以确定出该语音数据3对应的指令。手机可以根据该指令调用对应的接口,以触发地图应用展示对应的导航路线给用户(或者手机也可以根据该指令模拟用户的点击操作,以便在地图应用中展示对应的导航路线给用户)。手机还可以通过扬声器播报导航路线信息。例如,如图8中的(b)所示,手机可以点亮屏幕,并显示语音助手界面801。该 语音助手界面801中可以包括识别出的用户输入的语音数据3对应的文字“导航回家”802。其中,该语音助手界面801是一个跳转界面,即手机的显示屏上显示语音助手界面801后立即跳转到图8中的(c)所示的导航界面803。在一些实施例中,手机的显示屏上也可以不显示语音助手界面801,而是在用户输入“导航回家”后,手机直接点亮屏幕,显示导航界面803。这样,用户无需多次输入语音数据,便可达成意图。提高了人机交互效率,提高了用户体验。
可以理解的是,上述电子设备为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,本实施例能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请实施例的范围。
本实施例还提供一种实现上述各方法实施例的电子设备。具体的,可以对该电子设备进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
在一些实施例中,在采用对应各个功能划分各个功能模块的情况下,图9示出了上述实施例中所涉及的电子设备900的一种可能的结构示意图,该电子设备900可以包括:输入单元901、验证单元902、唤醒单元903以及确定执行单元904。
其中,输入单元901,用于支持电子设备900执行上述方法实施例中的S301和/或用于本文所描述的技术的其它过程。如,输入单元901支持电子设备900执行上述方法实施例中的接收用户输入的第二语音数据、第三语音数据等。
验证单元902,用于支持电子设备900执行上述方法实施例中的S302和/或用于本文所描述的技术的其它过程。例如,验证单元902支持电子设备900执行上述方法实施例中的声纹验证。
唤醒单元903,用于支持电子设备900执行上述方法实施例中的唤醒主处理器(如AP)的操作和/或用于本文所描述的技术的其它过程。
确定执行单元904,用于支持电子设备900执行上述方法实施例中的S303和/或用于本文所描述的技术的其它过程。
在本申请实施例中,进一步的,如图9所示,该电子设备900还可以包括:触发单元905以及启动单元906。
触发单元905,用于支持电子设备900执行上述方法实施例中的S501和/或用于本文所描述的技术的其它过程。
启动单元906,用于支持电子设备900执行上述方法实施例中启动语音助手的操作和/或用于本文所描述的技术的其它过程。
进一步的,该电子设备900还可以包括删除单元。该删除单元可以用于支持电子设备执行上述方法实施例中的S304和/或用于本文所描述的技术的其它过程。
其中,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块 的功能描述,在此不再赘述。
当然,电子设备900包括但不限于上述所列举的单元模块,例如,电子设备900还可以包括用于接收其他设备发送数据或者信号的接收单元、用于显示内容的显示单元等。并且,上述功能单元的具体所能够实现的功能也包括但不限于上述实例所述的方法步骤对应的功能,电子设备900的其他单元的详细描述可以参考其所对应方法步骤的详细描述,本申请实施例这里不予赘述。
在其他一些实施例中,在采用集成的单元的情况下,该电子设备可以包括:处理模块、存储模块和显示模块。处理模块用于对电子设备的动作进行控制管理。显示模块用于根据处理模块的指示进行内容显示。存储模块,用于保存电子设备的程序代码和数据。在一些实施例中,存储模块还可以用于保存上述实施例中的第一唤醒词对应的文本和/或对应的声纹特征信息,第二唤醒词对应的文本和/或对应的声纹特征信息等。进一步的,该电子设备还可以包括输入模块,通信模块,该通信模块用于支持电子设备与其他网络实体的通信,以实现电子设备的通话,数据交互,Internet访问等功能。
其中,处理模块可以是处理器或控制器。通信模块可以是收发器、RF电路或通信接口等。存储模块可以是存储器。显示模块可以是屏幕或显示器。输入模块可以是触摸屏,语音输入装置,或指纹传感器等。
当处理模块为处理器,通信模块为电路,存储模块为存储器,显示模块为触摸屏时,本实施例所提供的电子设备可以为图1所示的电子设备。其中,上述通信模块不仅可以包括RF电路,还可以包括Wi-Fi模块、NFC模块和蓝牙模块。RF电路、NFC模块、Wi-Fi模块和蓝牙模块等通信模块可以统称为通信接口。其中,上述处理器、RF电路、触摸屏和存储器可以通过总线耦合在一起。
如图10所示,本申请另外一些实施例还提供了一种电子设备1000,该电子设备1000可以包括:显示器1001;一个或多个处理器1002;存储器1003;以及一个或多个计算机程序代码1004,上述各器件可以通过一个或多个通信总线1005连接。其中该一个或多个计算机程序代码1004被存储在上述存储器1003中,并被配置为被该一个或多个处理器1002执行。该电子设备1000中设置有至少两个第一唤醒词,所述至少两个第一唤醒词中的每个对应一个第一指令,电子设备1000响应不同第一唤醒词对应的第一指令所执行的功能不同。该一个或多个计算机程序代码1004包括计算机指令,在本申请一些实施例中,上述计算机指令可以用于执行如图3或图5及相应实施例中电子设备执行的各个步骤。当然,电子设备1000包括但不限于上述所列举的器件,例如,上述电子设备1000还可以包括射频电路、定位装置、传感器等等,当电子设备1000包含有其他的器件时,上述电子设备1000可以为图1所示的电子设备。其中,上述处理器1002可以包括AP 1006以及第一DSP 1007。进一步的,处理器1002还可以包括第二DSP 1008。
本申请另外一些实施例还提供了一种计算机存储介质,该计算机存储介质中包括计算机指令,当上述计算机指令在电子设备上运行时,使得该电子设备执行如图3或图5中任一附图中的相关方法步骤,如S301、S302、S303、S304、S501,实现上述实施例中的触发电子设备执行功能的方法。
本申请另外一些实施例还提供了一种包含指令的计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行如图3或图5中任一附图中的相关方法步骤,如S301、S302、S303、S304、S501,实现上述实施例中的触发电子设备执行功能的方法。
本申请另外一些实施例还提供了一种控制设备,所述控制设备包括处理器和存储器,所述存储器用于存储计算机程序代码,所述计算机程序代码包括计算机指令,当所述处理器执行所述计算机指令时,所述控制设备执行如图3或图5中任一附图中的相关方法步骤,如S301、S302、S303、S304、S501,实现上述实施例中的触发电子设备执行功能的方法。该控制设备可以是一个集成电路IC,也可以是一个片上系统SOC。其中集成电路可以是通用集成电路,也可以是一个现场可编程门阵列FPGA,也可以是一个专用集成电路ASIC。
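Some other embodiments of this application further provide an apparatus for triggering an electronic device to perform a function. The apparatus has the function of implementing the behavior of the electronic device in the foregoing method embodiments. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the foregoing function.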
其中,本申请实施例提供的电子设备、计算机存储介质、计算机程序产品或控制设备均用于执行上文所提供的对应的方法,因此,其所能达到的有益效果可参考上文所提供的对应的方法中的有益效果,此处不再赘述。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本实施例所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述模块或单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本实施例各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以 使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或处理器执行各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:快闪存储器、移动硬盘、只读存储器、随机存取存储器、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本实施例的具体实施方式,但本实施例的保护范围并不局限于此,任何在本实施例揭露的技术范围内的变化或替换,都应涵盖在本实施例的保护范围之内。因此,本实施例的保护范围应以所述权利要求的保护范围为准。
Claims (24)
- 一种触发电子设备执行功能的方法,其特征在于,所述电子设备中设置有至少两个第一唤醒词,所述至少两个第一唤醒词中的每个对应一个第一指令,所述电子设备响应不同第一唤醒词对应的第一指令所执行的功能不同;所述电子设备包括主处理器,所述主处理器处于休眠状态;所述方法包括:所述电子设备接收用户输入的第一语音数据;所述电子设备判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本匹配的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语音数据对应的文本匹配的唤醒词,则所述电子设备将所述主处理器从休眠状态唤醒,确定所述第一语音数据对应的第一指令,并通过所述主处理器执行所述第一指令对应的功能。
- 根据权利要求1所述的方法,其特征在于,在所述电子设备将所述主处理器从休眠状态唤醒之前,还包括:所述电子设备确定所述第一语音数据的声纹特征与所述至少两个第一唤醒词对应的声纹特征匹配。
- 根据权利要求1或2所述的方法,其特征在于,所述电子设备将所述主处理器从休眠状态唤醒,确定所述第一语音数据对应的第一指令,并通过所述主处理器执行所述第一指令对应的功能具体为:所述电子设备将所述主处理器从休眠状态唤醒,通过所述主处理器启动语音助手,通过所述语音助手确定所述第一语音数据对应的第一指令,并通过所述主处理器执行所述第一指令对应的功能。
- 根据权利要求1-3中任一项所述的方法,其特征在于,所述电子设备还包括第一协处理器;所述电子设备接收用户输入的第一语音数据具体为:所述电子设备使用所述第一协处理器监测用户输入的所述第一语音数据;所述电子设备判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本匹配的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语音数据对应的文本匹配的唤醒词,则所述电子设备将所述主处理器从休眠状态唤醒具体为:所述电子设备使用所述第一协处理器判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本匹配的唤醒词;若存在,则所述第一协处理器将所述主处理器从休眠状态唤醒。
- 根据权利要求1-3中任一项所述的方法,其特征在于,所述电子设备还包括第一协处理器;所述电子设备接收用户输入的第一语音数据具体为:所述电子设备使用所述第一协处理器监测用户输入的所述第一语音数据;所述电子设备判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本匹配的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语音数据对应的文本匹配的唤醒词,则所述电子设备将所述主处理器从休眠状态唤醒具体为:所述电子设备使用所述第一协处理器判断所述至少两个第一唤醒词中是否存在文 本与所述第一语音数据对应的文本的匹配度满足第一精度的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语音数据对应的文本的匹配度满足所述第一精度的唤醒词,则所述第一协处理器将所述主处理器从休眠状态唤醒;所述确定所述第一语音数据对应的第一指令,并通过所述主处理器执行所述第一指令对应的功能具体为:所述电子设备使用所述主处理器判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本的匹配度满足第二精度的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语音数据对应的文本的匹配度满足所述第二精度的唤醒词,则确定所述第一语音数据对应的第一指令,并通过所述主处理器执行所述第一指令对应的功能;所述第一精度小于所述第二精度。
- 根据权利要求1-5中任一项所述的方法,其特征在于,在所述电子设备接收用户输入的第一语音数据之前,所述方法还包括:所述电子设备进入预定模式。
- 根据权利要求6所述的方法,其特征在于,所述电子设备还包括第二协处理器;在所述电子设备进入所述预定模式之前,所述方法还包括:所述电子设备使用所述第二协处理器监测语音数据。
- 根据权利要求6所述的方法,其特征在于,所述电子设备中还设置有第二唤醒词;所述电子设备进入预定模式具体为:所述电子设备接收用户输入的第二语音数据;所述电子设备判断所述第二语音数据与所述第二唤醒词是否匹配;若所述第二语音数据与所述第二唤醒词匹配,则所述电子设备将所述主处理器从休眠状态唤醒,通过所述主处理器启动语音助手;所述电子设备通过所述语音助手接收用户输入的第三语音数据,并确定所述第三语音数据对应的第二指令,通过所述主处理器执行所述第二指令对应的功能,所述第二指令用于指示所述电子设备进入所述预定模式。
- 根据权利要求8所述的方法,其特征在于,所述电子设备判断所述第二语音数据与所述第二唤醒词是否匹配具体为:所述电子设备判断所述第二语音数据对应的文本与所述第二唤醒词的文本是否匹配,若所述第二语音数据对应的文本与所述第二唤醒词的文本匹配,则所述第二语音数据与所述第二唤醒词匹配。
- 根据权利要求8所述的方法,其特征在于,所述电子设备判断所述第二语音数据与所述第二唤醒词是否匹配具体为:所述电子设备判断所述第二语音数据对应的文本与所述第二唤醒词的文本是否匹配,判断所述第二语音数据的声纹特征与所述第二唤醒词对应的声纹特征是否匹配;若所述第二语音数据对应的文本与所述第二唤醒词的文本匹配,且所述第二语音数据的声纹特征与所述第二唤醒词对应的声纹特征匹配,则所述第二语音数据与所述第二唤醒词匹配。
- 根据权利要求8所述的方法,其特征在于,所述电子设备还包括第二协处理 器;所述电子设备接收用户输入的第二语音数据具体为:所述电子设备使用所述第二协处理器监测用户输入的所述第二语音数据;所述电子设备判断所述第二语音数据与所述第二唤醒词是否匹配;若所述第二语音数据与所述第二唤醒词匹配,则所述电子设备将所述主处理器从休眠状态唤醒,通过所述主处理器启动语音助手具体为:所述电子设备使用所述第二协处理器判断所述第二唤醒词的文本与所述第二语音数据对应的文本的匹配度是否满足第三精度;若所述第二唤醒词的文本与所述第二语音数据对应的文本的匹配度满足所述第三精度,则所述第二协处理器将所述主处理器从休眠状态唤醒;所述电子设备使用所述主处理器判断所述第二唤醒词的文本与所述第二语音数据对应的文本的匹配度是否满足第四精度;若所述第二唤醒词的文本与所述第二语音数据对应的文本的匹配度满足所述第四精度,则所述电子设备通过所述主处理器启动所述语音助手;所述第三精度小于所述第四精度。
- 一种电子设备,其特征在于,所述电子设备包括:处理器、存储器和显示器;所述存储器、所述显示器与所述处理器耦合;所述显示器用于显示所述处理器生成的图像;所述存储器用于存储计算机程序代码;所述处理器包括主处理器,所述主处理器处于休眠状态;所述电子设备中设置有至少两个第一唤醒词,所述至少两个第一唤醒词中的每个对应一个第一指令,所述电子设备响应不同第一唤醒词对应的第一指令所执行的功能不同;所述计算机程序代码包括计算机指令,当所述处理器执行上述计算机指令时,所述处理器,用于接收用户输入的第一语音数据;判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本匹配的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语音数据对应的文本匹配的唤醒词,则将所述主处理器从休眠状态唤醒,确定所述第一语音数据对应的第一指令,并通过所述主处理器执行所述第一指令对应的功能。
- 根据权利要求12所述的电子设备,其特征在于,所述处理器,还用于确定所述第一语音数据的声纹特征与所述至少两个第一唤醒词对应的声纹特征匹配。
- 根据权利要求12或13所述的电子设备,其特征在于,所述处理器,用于将所述主处理器从休眠状态唤醒,确定所述第一语音数据对应的第一指令,并通过所述主处理器执行所述第一指令对应的功能具体为:所述处理器,用于将所述主处理器从休眠状态唤醒,通过所述主处理器启动语音助手,通过所述语音助手确定所述第一语音数据对应的第一指令,并通过所述主处理器执行所述第一指令对应的功能。
- 根据权利要求12-14中任一项所述的电子设备,其特征在于,所述处理器还包括第一协处理器;所述处理器,用于接收用户输入的第一语音数据具体为:所述第一协处理器,用于监测用户输入的所述第一语音数据;所述处理器,用于判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本匹配的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语 音数据对应的文本匹配的唤醒词,则将所述主处理器从休眠状态唤醒具体为:所述第一协处理器,用于判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本匹配的唤醒词;若存在,则将所述主处理器从休眠状态唤醒。
- 根据权利要求12-14中任一项所述的电子设备,其特征在于,所述处理器还包括第一协处理器;所述处理器,用于接收用户输入的第一语音数据具体为:所述第一协处理器,用于监测用户输入的所述第一语音数据;所述处理器,用于判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本匹配的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语音数据对应的文本匹配的唤醒词,则将所述主处理器从休眠状态唤醒具体为:所述第一协处理器,用于判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本的匹配度满足第一精度的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语音数据对应的文本的匹配度满足所述第一精度的唤醒词,则将所述主处理器从休眠状态唤醒;所述处理器,用于确定所述第一语音数据对应的第一指令,并通过所述主处理器执行所述第一指令对应的功能具体为:所述主处理器,用于判断所述至少两个第一唤醒词中是否存在文本与所述第一语音数据对应的文本的匹配度满足第二精度的唤醒词;若所述至少两个第一唤醒词中存在文本与所述第一语音数据对应的文本的匹配度满足所述第二精度的唤醒词,则确定所述第一语音数据对应的第一指令,并执行所述第一指令对应的功能;所述第一精度小于所述第二精度。
- 根据权利要求12-16中任一项所述的电子设备,其特征在于,所述处理器,还用于触发所述电子设备进入预定模式。
- 根据权利要求17所述的电子设备,其特征在于,所述处理器还包括第二协处理器;所述第二协处理器,用于在所述电子设备进入所述预定模式之前,监测语音数据。
- 根据权利要求17所述的电子设备,其特征在于,所述电子设备中还设置有第二唤醒词;所述处理器,还用于触发所述电子设备进入预定模式具体为:所述处理器,还用于接收用户输入的第二语音数据;判断所述第二语音数据与所述第二唤醒词是否匹配;若所述第二语音数据与所述第二唤醒词匹配,则将所述主处理器从休眠状态唤醒,通过所述主处理器启动语音助手;通过所述语音助手接收用户输入的第三语音数据,并确定所述第三语音数据对应的第二指令,通过所述主处理器执行所述第二指令对应的功能,所述第二指令用于指示所述电子设备进入所述预定模式。
- 根据权利要求19所述的电子设备,其特征在于,所述处理器,用于判断所述第二语音数据与所述第二唤醒词是否匹配具体为:所述处理器,用于判断所述第二语音数据对应的文本与所述第二唤醒词的文本是否匹配,若所述第二语音数据对应的文本与所述第二唤醒词的文本匹配,则所述第二 语音数据与所述第二唤醒词匹配。
- 根据权利要求19所述的电子设备,其特征在于,所述处理器,用于判断所述第二语音数据与所述第二唤醒词是否匹配具体为:所述处理器,用于判断所述第二语音数据对应的文本与所述第二唤醒词的文本是否匹配,判断所述第二语音数据的声纹特征与所述第二唤醒词对应的声纹特征是否匹配;若所述第二语音数据对应的文本与所述第二唤醒词的文本匹配,且所述第二语音数据的声纹特征与所述第二唤醒词对应的声纹特征匹配,则所述第二语音数据与所述第二唤醒词匹配。
- 根据权利要求19所述的电子设备,其特征在于,所述处理器还包括第二协处理器;所述处理器,还用于接收用户输入的第二语音数据具体为:所述第二协处理器,用于监测用户输入的所述第二语音数据;所述处理器,还用于判断所述第二语音数据与所述第二唤醒词是否匹配;若所述第二语音数据与所述第二唤醒词匹配,则将所述主处理器从休眠状态唤醒,通过所述主处理器启动语音助手具体为:所述第二协处理器,还用于判断所述第二唤醒词的文本与所述第二语音数据对应的文本的匹配度是否满足第三精度;若所述第二唤醒词的文本与所述第二语音数据对应的文本的匹配度满足所述第三精度,则将所述主处理器从休眠状态唤醒;所述主处理器,还用于判断所述第二唤醒词的文本与所述第二语音数据对应的文本的匹配度是否满足第四精度;若所述第二唤醒词的文本与所述第二语音数据对应的文本的匹配度满足所述第四精度,则启动所述语音助手;所述第三精度小于所述第四精度。
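- A computer storage medium, wherein the computer storage medium comprises computer instructions that, when run on an electronic device, cause the electronic device to perform the method for triggering an electronic device to perform a function according to any one of claims 1-11.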
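- A computer program product, wherein the computer program product, when run on a computer, causes the computer to perform the method for triggering an electronic device to perform a function according to any one of claims 1-11.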
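The claims above that distinguish a first precision from a second precision describe a two-stage check: a coprocessor applies a coarse match to decide whether to wake the main processor, and the main processor then applies a stricter match before an instruction is determined and executed. The following is only an illustrative sketch of that idea, not the implementation described in the application. It assumes the voice data has already been converted to text, and it uses Python's `difflib.SequenceMatcher` as a stand-in for the unspecified "degree of matching". The wake-up words, instruction names, threshold values, and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical table: each "first wake-up word" maps to an instruction.
WAKE_WORDS = {
    "play music": "PLAY_MUSIC",
    "next song": "NEXT_TRACK",
}

# Assumed thresholds; the claims only require first < second.
FIRST_PRECISION = 0.6   # coarse check, coprocessor stage
SECOND_PRECISION = 0.9  # strict check, main-processor stage


def match_degree(wake_word: str, text: str) -> float:
    """Similarity in [0, 1]; a stand-in for the unspecified matching degree."""
    return SequenceMatcher(None, wake_word.lower(), text.lower()).ratio()


def best_match(text: str, precision: float) -> Optional[str]:
    """Return the best-matching wake-up word if it satisfies the precision."""
    best = max(WAKE_WORDS, key=lambda w: match_degree(w, text))
    return best if match_degree(best, text) >= precision else None


def coprocessor_should_wake(text: str) -> bool:
    """Stage 1: coarse check that decides whether to wake the main processor."""
    return best_match(text, FIRST_PRECISION) is not None


def main_processor_instruction(text: str) -> Optional[str]:
    """Stage 2: strict check that resolves the instruction to execute."""
    word = best_match(text, SECOND_PRECISION)
    return WAKE_WORDS[word] if word is not None else None


def handle_voice_text(text: str) -> Optional[str]:
    """Run both stages; None means no function is triggered."""
    if not coprocessor_should_wake(text):
        return None  # main processor stays asleep
    return main_processor_instruction(text)


if __name__ == "__main__":
    for utterance in ("play music", "xyz"):
        print(utterance, "->", handle_voice_text(utterance))
```

The cheaper first stage filters out most unrelated audio so that the main processor can stay asleep. The stricter second stage limits false triggers among the inputs that do wake it.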
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
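PCT/CN2018/109888 WO2020073288A1 (zh) | 2018-10-11 | 2018-10-11 | Method and electronic device for triggering an electronic device to perform a function |
CN201880090844.6A CN111819533B (zh) | 2018-10-11 | 2018-10-11 | Method and electronic device for triggering an electronic device to perform a function |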
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/109888 WO2020073288A1 (zh) | 2018-10-11 | 2018-10-11 | Method and electronic device for triggering an electronic device to perform a function |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020073288A1 (zh) | 2020-04-16 |
Family
ID=70164164
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/109888 WO2020073288A1 (zh) | Method and electronic device for triggering an electronic device to perform a function | 2018-10-11 | 2018-10-11 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111819533B (zh) |
WO (1) | WO2020073288A1 (zh) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
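WO2022042274A1 (zh) * | 2020-08-31 | 2022-03-03 | 华为技术有限公司 | Voice interaction method and electronic device |
CN114697438A (zh) * | 2020-12-29 | 2022-07-01 | 华为技术有限公司 | Method and device for making a call using a smart device |
CN115019835A (zh) * | 2022-05-27 | 2022-09-06 | 江西省天轴通讯有限公司 | Intelligent device management method, system, storage medium, and device |
CN115376524A (zh) * | 2022-07-15 | 2022-11-22 | 荣耀终端有限公司 | Voice wake-up method, electronic device, and chip system |
CN115734323A (zh) * | 2020-09-25 | 2023-03-03 | 华为技术有限公司 | Power consumption optimization method and apparatus |
CN116069818A (zh) * | 2023-01-05 | 2023-05-05 | 广州市华势信息科技有限公司 | Application processing method and system based on zero-code development |
WO2023246894A1 (zh) * | 2022-06-25 | 2023-12-28 | 华为技术有限公司 | Voice interaction method and related apparatus |
WO2024051611A1 (zh) * | 2022-09-05 | 2024-03-14 | 华为技术有限公司 | Human-computer interaction method and related apparatus |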
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113066490B (zh) * | 2021-03-16 | 2022-10-14 | 海信视像科技股份有限公司 | Prompting method for wake-up response, and display device |
CN116456441B (zh) * | 2023-06-16 | 2023-10-31 | 荣耀终端有限公司 | Sound processing apparatus and method, and electronic device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103095911A (zh) * | 2012-12-18 | 2013-05-08 | 苏州思必驰信息科技有限公司 | Method and system for finding a mobile phone through voice wake-up |
CN103197571A (zh) * | 2013-03-15 | 2013-07-10 | 张春鹏 | Control method, apparatus, and system |
CN104866274A (zh) * | 2014-12-01 | 2015-08-26 | 联想(北京)有限公司 | Information processing method and electronic device |
US20170133012A1 (en) * | 2015-11-05 | 2017-05-11 | Acer Incorporated | Voice control method and voice control system |
CN107315561A (zh) * | 2017-06-30 | 2017-11-03 | 联想(北京)有限公司 | Data processing method and electronic device |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
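CN102929390A (zh) * | 2012-10-16 | 2013-02-13 | 广东欧珀移动通信有限公司 | Method and apparatus for starting an application in standby state |
CN102999161B (zh) * | 2012-11-13 | 2016-03-02 | 科大讯飞股份有限公司 | Implementation method and application of a voice wake-up module |
CN103634480B (zh) * | 2013-12-17 | 2017-03-01 | 百度在线网络技术(北京)有限公司 | Method and apparatus for communication in a mobile communication terminal |
CN105323386A (zh) * | 2015-12-03 | 2016-02-10 | 上海卓易科技股份有限公司 | Method and system for automatically switching mobile phone profile modes |
CN107277904A (zh) * | 2017-07-03 | 2017-10-20 | 上海斐讯数据通信技术有限公司 | Terminal and voice wake-up method |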
- 2018
- 2018-10-11 CN CN201880090844.6A patent/CN111819533B/zh active Active
- 2018-10-11 WO PCT/CN2018/109888 patent/WO2020073288A1/zh active Application Filing
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
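WO2022042274A1 (zh) * | 2020-08-31 | 2022-03-03 | 华为技术有限公司 | Voice interaction method and electronic device |
CN115734323A (zh) * | 2020-09-25 | 2023-03-03 | 华为技术有限公司 | Power consumption optimization method and apparatus |
CN115734323B (zh) * | 2020-09-25 | 2024-01-30 | 华为技术有限公司 | Power consumption optimization method and apparatus |
CN114697438A (zh) * | 2020-12-29 | 2022-07-01 | 华为技术有限公司 | Method and device for making a call using a smart device |
CN114697438B (zh) * | 2020-12-29 | 2023-06-27 | 华为技术有限公司 | Method, apparatus, device, and storage medium for making a call using a smart device |
CN115019835A (zh) * | 2022-05-27 | 2022-09-06 | 江西省天轴通讯有限公司 | Intelligent device management method, system, storage medium, and device |
WO2023246894A1 (zh) * | 2022-06-25 | 2023-12-28 | 华为技术有限公司 | Voice interaction method and related apparatus |
CN115376524A (zh) * | 2022-07-15 | 2022-11-22 | 荣耀终端有限公司 | Voice wake-up method, electronic device, and chip system |
WO2024051611A1 (zh) * | 2022-09-05 | 2024-03-14 | 华为技术有限公司 | Human-computer interaction method and related apparatus |
CN116069818A (zh) * | 2023-01-05 | 2023-05-05 | 广州市华势信息科技有限公司 | Application processing method and system based on zero-code development |
CN116069818B (zh) * | 2023-01-05 | 2023-09-12 | 广州市华势信息科技有限公司 | Application processing method and system based on zero-code development |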
Also Published As
Publication number | Publication date |
---|---|
CN111819533B (zh) | 2022-06-14 |
CN111819533A (zh) | 2020-10-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
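WO2021052263A1 (zh) | | Voice assistant display method and apparatus |
RU2766255C1 (ru) | | Voice control method and electronic device |
CN110910872B (zh) | | Voice interaction method and apparatus |
WO2020073288A1 (zh) | | Method and electronic device for triggering an electronic device to perform a function |
CN110134316B (zh) | | Model training method, emotion recognition method, and related apparatus and device |
WO2020182065A1 (zh) | | Method for starting a shortcut function, and electronic device |
WO2021213164A1 (zh) | | Application interface interaction method, electronic device, and computer-readable storage medium |
WO2020134869A1 (zh) | | Operation method of an electronic device, and electronic device |
JP7280005B2 (ja) | | Wireless charging method and electronic device |
WO2021063237A1 (zh) | | Control method of an electronic device, and electronic device |
WO2020077540A1 (zh) | | Information processing method and electronic device |
WO2021052139A1 (zh) | | Gesture input method and electronic device |
WO2021218429A1 (zh) | | Application window management method, terminal device, and computer-readable storage medium |
WO2020056778A1 (zh) | | Method for shielding touch events, and electronic device |
WO2022042766A1 (zh) | | Information display method, terminal device, and computer-readable storage medium |
CN115206308A (zh) | | Human-computer interaction method and electronic device |
CN113380240B (zh) | | Voice interaction method and electronic device |
WO2021129453A1 (zh) | | Screenshot method and related device |
CN116723384B (zh) | | Process control method, electronic device, and readable storage medium |
CN117271170B (zh) | | Activity event processing method and related device |
WO2024114493A1 (zh) | | Human-computer interaction method and apparatus |
WO2024012346A1 (zh) | | Task migration method, electronic device, and system |
WO2023124829A1 (zh) | | Collaborative voice input method, electronic device, and computer-readable storage medium |
CN117119102A (zh) | | Method for waking up a voice interaction function, and electronic device |
CN117687814A (zh) | | Exception handling method, system, and storage medium |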
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 18936482; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 18936482; Country of ref document: EP; Kind code of ref document: A1 |