WO2022188511A1 - Voice assistant wake-up method and apparatus - Google Patents

Voice assistant wake-up method and apparatus

Info

Publication number
WO2022188511A1
WO2022188511A1 (PCT/CN2021/141207; CN2021141207W)
Authority
WO
WIPO (PCT)
Prior art keywords
value
state
data
electronic device
sensor data
Prior art date
Application number
PCT/CN2021/141207
Other languages
English (en)
French (fr)
Inventor
向肖
张晓帆
唐成戬
曾理
王佩玲
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2022188511A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 Sound input; Sound output
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M 1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M 1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M 1/72433 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M 1/72448 User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions
    • H04M 1/72454 User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions according to context-related or environment-related conditions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/725 Cordless telephones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/225 Feedback of the input speech

Definitions

  • the present application relates to the field of computer technologies, and in particular, to a method and device for waking up a voice assistant.
  • the voice assistant in a mobile terminal has gradually become a commonly used function. By issuing voice commands to the voice assistant, a user can direct it to complete various operations on the mobile terminal.
  • Embodiments of the present application provide a method and device for waking up a voice assistant.
  • an embodiment of the present application provides a voice assistant wake-up method, which is applied to a first electronic device, and the method includes:
  • obtaining sensor data when the user's voice data is received, and determining state perception data according to the sensor data.
  • an embodiment of the present application provides a voice assistant wake-up device, which is applied to an electronic device, and the device includes:
  • a state perception module configured to obtain sensor data when the user's voice data is received;
  • the state perception module is further configured to determine state perception data according to the sensor data;
  • a communication module configured to receive state perception data from at least one second electronic device;
  • a decision module configured to determine a target wake-up device according to the state perception data of the first electronic device and the state perception data of the at least one second electronic device;
  • a wake-up module configured to wake up the voice assistant if the target wake-up device is the first electronic device.
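The decision flow formed by these modules can be sketched as follows. This is a minimal, hypothetical Python sketch: the scoring function, the feature weights, and the tie-breaking rule are illustrative assumptions, not taken from the publication.

```python
# Hypothetical sketch of the wake-up decision flow formed by the modules
# above: each device scores its own state from sensor data, exchanges
# scores with nearby devices, and wakes its assistant only if it wins.

def state_score(sensor_data: dict) -> float:
    """Combine boolean sensor readings into a state perception value.
    The feature names and weights here are illustrative assumptions."""
    weights = {"held": 3.0, "screen_on": 2.0, "proximity": 1.0}
    return sum(w for key, w in weights.items() if sensor_data.get(key))

def choose_target(own_id: str, own_score: float, peer_scores: dict) -> str:
    """Return the device that should respond: the highest score,
    with ties broken deterministically by device ID."""
    scores = dict(peer_scores)
    scores[own_id] = own_score
    return max(scores, key=lambda dev: (scores[dev], dev))

def should_wake(own_id: str, sensor_data: dict, peer_scores: dict) -> bool:
    return choose_target(own_id, state_score(sensor_data), peer_scores) == own_id

phone = {"held": True, "screen_on": True, "proximity": True}
print(should_wake("phone", phone, {"watch": 2.0, "tv": 1.0}))  # True
```

In the publication's terms, `peer_scores` stands in for the state perception data received from the at least one second electronic device; because every device applies the same deterministic rule, exactly one device in the group wakes up.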
  • embodiments of the present application provide an electronic device, including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the above-mentioned processor.
  • the above program includes instructions for executing the steps in any method of the first aspect of the embodiments of the present application.
  • an embodiment of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute some or all of the steps described in any method of the first aspect of the embodiments of the present application.
  • an embodiment of the present application provides a computer program product, wherein the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps described in any method of the first aspect of the embodiments of the present application.
  • the computer program product may be a software installation package.
  • FIG. 1 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a software structure of an electronic device provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an application scenario of a voice assistant wake-up provided by an embodiment of the present application
  • FIG. 4 is a schematic structural diagram of another electronic device provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a method for waking up a voice assistant provided by an embodiment of the present application
  • FIG. 5a is a schematic diagram of a multi-device scenario provided by an embodiment of the present application.
  • 5b is a schematic diagram of a holding state provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a voice assistant wake-up device provided by an embodiment of the present application.
  • the electronic devices involved in the embodiments of the present application may be portable electronic devices that also include other functions, such as personal digital assistant and/or music player functions, for example, mobile phones, tablet computers, and wearable electronic devices with wireless communication functions (such as smart watches), and the like.
  • portable electronic devices include, but are not limited to, portable electronic devices running iOS, Android, Microsoft, or other operating systems.
  • the above-mentioned portable electronic device may also be other portable electronic devices, such as a laptop computer (Laptop) or the like. It should also be understood that, in some other embodiments, the above-mentioned electronic device may not be a portable electronic device, but a desktop computer.
  • FIG. 1 shows a schematic structural diagram of an electronic device 100 .
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone jack 170D, a sensor module 180, a compass 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identity module (SIM) card interface 195, and so on.
  • the structures illustrated in the embodiments of the present application do not constitute a specific limitation on the electronic device 100 .
  • the electronic device 100 may include more or less components than shown, or combine some components, or separate some components, or arrange different components.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units, for example, the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), controller, video codec, digital signal processor (digital signal processor, DSP), baseband processor, and/or neural-network processing unit (neural-network processing unit, NPU), etc.
  • different processing units can be independent components, and can also be integrated in one or more processors.
  • electronic device 100 may also include one or more processors 110 .
  • the controller can generate an operation control signal according to the instruction operation code and the timing signal, and complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in the processor 110 may be a cache memory. This memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instruction or data again, it can be fetched directly from this memory, which avoids repeated accesses, reduces the waiting time of the processor 110, and thereby improves the efficiency of the electronic device 100 in processing data or executing instructions.
  • the processor 110 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM card interface, and/or a USB interface, etc.
  • the USB interface 130 is an interface that conforms to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
  • the USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transmit data between the electronic device 100 and peripheral devices.
  • the USB interface 130 can also be used to connect an earphone, and play audio through the earphone.
  • the interface connection relationship between the modules illustrated in the embodiments of the present application is only a schematic illustration, and does not constitute a structural limitation of the electronic device 100 .
  • the electronic device 100 may also adopt different interface connection manners in the foregoing embodiments, or a combination of multiple interface connection manners.
  • the charging management module 140 is used to receive charging input from the charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
  • the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100 . While the charging management module 140 charges the battery 142 , it can also supply power to the electronic device through the power management module 141 .
  • the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, the wireless communication module 160, and the like.
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, battery health status (leakage, impedance).
  • the power management module 141 may also be provided in the processor 110 .
  • the power management module 141 and the charging management module 140 may also be provided in the same device.
  • the wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modulation and demodulation processor, the baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • the antenna 1 can be multiplexed as a diversity antenna of the wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 may provide wireless communication solutions including 2G/3G/4G/5G etc. applied on the electronic device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves from the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modulation and demodulation processor for demodulation.
  • the mobile communication module 150 can also amplify the signal modulated by the modulation and demodulation processor, and then turn it into an electromagnetic wave for radiation through the antenna 1 .
  • at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110 .
  • at least part of the functional modules of the mobile communication module 150 may be provided in the same device as at least part of the modules of the processor 110 .
  • the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area network (WLAN) (such as wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), ultra-wideband (UWB), and the like.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110, perform frequency modulation and amplification on it, and convert it into electromagnetic waves for radiation via the antenna 2.
  • the electronic device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • the display screen 194 is used to display images, videos, and the like.
  • Display screen 194 includes a display panel.
  • the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a mini light-emitting diode (Mini LED), a MicroLED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like.
  • electronic device 100 may include one or more display screens 194 .
  • the electronic device 100 may implement a shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
  • the ISP is used to process the data fed back by the camera 193 .
  • when the shutter is opened, light is transmitted through the lens to the camera photosensitive element, where the optical signal is converted into an electrical signal; the photosensitive element transmits the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye.
  • ISP can also perform algorithm optimization on image noise, brightness, and skin tone. ISP can also optimize parameters such as exposure and color temperature of the shooting scene.
  • the ISP may be provided in the camera 193 .
  • Camera 193 is used to capture still images or video.
  • an optical image of the object is generated through the lens and projected onto the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
  • the electronic device 100 may include one or more cameras 193 .
  • a digital signal processor is used to process digital signals, in addition to processing digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform on the frequency point energy and so on.
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 100 may support one or more video codecs.
  • the electronic device 100 can play or record videos in various encoding formats, such as: Moving Picture Experts Group (moving picture experts group, MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.
  • the NPU is a neural-network (NN) computing processor.
  • Applications such as intelligent cognition of the electronic device 100 can be implemented through the NPU, such as image recognition, face recognition, speech recognition, text understanding, and the like.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100 .
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example, saving files such as music and videos in the external memory card.
  • Internal memory 121 may be used to store one or more computer programs including instructions.
  • the processor 110 may execute the above-mentioned instructions stored in the internal memory 121, thereby causing the electronic device 100 to execute the method for displaying page elements, various applications and data processing provided in some embodiments of the present application.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the stored program area may store the operating system; the stored program area may also store one or more applications (such as gallery, contacts, etc.) and the like.
  • the storage data area may store data (such as photos, contacts, etc.) created during the use of the electronic device 100 and the like.
  • the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic disk storage components, flash memory components, universal flash storage (UFS), and the like.
  • the processor 110 may cause the electronic device 100 to execute the methods provided in the embodiments of the present application, as well as other applications and data processing, by executing the instructions stored in the internal memory 121 and/or the instructions stored in the memory provided in the processor 110.
  • the electronic device 100 may implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like. Such as music playback, recording, etc.
  • the sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, and an ambient light sensor 180L, bone conduction sensor 180M, etc.
  • the pressure sensor 180A is used to sense pressure signals, and can convert the pressure signals into electrical signals.
  • the pressure sensor 180A may be provided on the display screen 194 .
  • the capacitive pressure sensor may consist of at least two parallel plates of conductive material. When a force is applied to the pressure sensor 180A, the capacitance between the electrodes changes.
  • the electronic device 100 determines the intensity of the pressure according to the change in capacitance. When a touch operation acts on the display screen 194, the electronic device 100 detects the intensity of the touch operation according to the pressure sensor 180A.
  • the electronic device 100 may also calculate the touched position according to the detection signal of the pressure sensor 180A.
  • touch operations acting on the same touch position but with different touch operation intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than the first pressure threshold acts on the short message application icon, the instruction for viewing the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, the instruction to create a new short message is executed.
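As a toy illustration of this threshold dispatch, the sketch below maps touch intensity to an instruction; the threshold value, its units, and the action names are hypothetical, not taken from the publication.

```python
# Hypothetical pressure-dependent dispatch on the short message icon:
# a light press views messages, a firm press creates a new one.
FIRST_PRESSURE_THRESHOLD = 0.5  # illustrative, unitless

def dispatch_touch(icon: str, pressure: float) -> str:
    if icon != "messages":
        return "open_app"
    if pressure < FIRST_PRESSURE_THRESHOLD:
        return "view_messages"   # intensity below the first pressure threshold
    return "new_message"         # intensity at or above the threshold

print(dispatch_touch("messages", 0.2))  # view_messages
print(dispatch_touch("messages", 0.9))  # new_message
```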
  • the gyro sensor 180B may be used to determine the motion attitude of the electronic device 100 .
  • the gyro sensor 180B may determine the angular velocity of the electronic device 100 about three axes (i.e., the X, Y, and Z axes).
  • the gyro sensor 180B can be used for image stabilization. Exemplarily, when the shutter is pressed, the gyro sensor 180B detects the shaking angle of the electronic device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to offset the shaking of the electronic device 100 through reverse motion to achieve anti-shake.
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenarios.
  • the acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in various directions (generally three axes).
  • the magnitude and direction of gravity can be detected when the electronic device 100 is stationary. The sensor can also be used to identify the posture of the electronic device, and is applicable to scenarios such as landscape/portrait switching and pedometers.
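A rough sketch of posture recognition from three-axis acceleration follows; the axis convention and the dominant-axis rule are simplifying assumptions for illustration only.

```python
def orientation(ax: float, ay: float, az: float) -> str:
    """Classify device posture by which axis carries most of gravity:
    Y for portrait, X for landscape, Z for lying flat. Simplified
    illustration; real devices filter and hysterese these readings."""
    components = {"landscape": abs(ax), "portrait": abs(ay), "flat": abs(az)}
    return max(components, key=components.get)

print(orientation(0.1, 9.7, 0.5))  # portrait
print(orientation(9.8, 0.2, 0.3))  # landscape
```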
  • the ambient light sensor 180L is used to sense ambient light brightness.
  • the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket, so as to prevent accidental touch.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to realize fingerprint unlocking, accessing application locks, taking pictures with fingerprints, answering incoming calls with fingerprints, and the like.
  • the temperature sensor 180J is used to detect the temperature.
  • the electronic device 100 uses the temperature detected by the temperature sensor 180J to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold value, the electronic device 100 reduces the performance of the processor located near the temperature sensor 180J in order to reduce power consumption and implement thermal protection.
  • when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid abnormal shutdown of the electronic device 100 caused by the low temperature.
  • the electronic device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
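The temperature-processing strategy above can be summarized in a small sketch; the threshold values and action names are illustrative assumptions, not from the publication.

```python
# Hypothetical thermal policy mirroring the strategy described above:
# throttle when hot, warm and stabilize the battery when cold.
HIGH_TEMP_C = 45.0  # illustrative upper threshold
LOW_TEMP_C = 0.0    # illustrative lower threshold

def thermal_policy(temp_c: float) -> list:
    actions = []
    if temp_c > HIGH_TEMP_C:
        actions.append("throttle_cpu")            # reduce processor performance
    elif temp_c < LOW_TEMP_C:
        actions.append("heat_battery")            # warm battery against shutdown
        actions.append("boost_battery_voltage")   # keep output voltage stable
    return actions

print(thermal_policy(50.0))   # ['throttle_cpu']
print(thermal_policy(-5.0))   # ['heat_battery', 'boost_battery_voltage']
```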
  • the touch sensor 180K is also called a "touch panel".
  • the touch sensor 180K can be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, also called “touch screen”.
  • the touch sensor 180K is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to touch operations may be provided through display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100 , which is different from the location where the display screen 194 is located.
  • FIG. 2 shows a software structural block diagram of the electronic device 100 .
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate with each other through software interfaces.
  • the Android system is divided into four layers, which are, from top to bottom, an application layer, an application framework layer, an Android runtime (Android runtime) and a system library, and a kernel layer.
  • the application layer can include a series of application packages.
  • the application package can include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message and so on.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer may include window managers, content providers, view systems, telephony managers, resource managers, notification managers, and the like.
  • a window manager is used to manage window programs.
  • the window manager can get the size of the display screen, determine whether there is a status bar, lock the screen, take screenshots, etc.
  • Content providers are used to store and retrieve data and make these data accessible to applications.
  • the data may include video, images, audio, calls made and received, browsing history and bookmarks, phone book, etc.
  • the view system includes visual controls, such as controls for displaying text, controls for displaying pictures, and so on. View systems can be used to build applications.
  • a display interface can consist of one or more views.
  • the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.
  • the phone manager is used to provide the communication function of the electronic device 100 .
  • for example, the management of call status (including connecting, hanging up, etc.).
  • the resource manager provides various resources for the application, such as localization strings, icons, pictures, layout files, video files and so on.
  • the notification manager enables applications to display notification information in the status bar, which can be used to convey notification-type messages, and can disappear automatically after a brief pause without user interaction. For example, the notification manager is used to notify download completion, message reminders, etc.
  • the notification manager can also display notifications in the status bar at the top of the system in the form of graphs or scroll bar text, such as notifications of applications running in the background, and notifications on the screen in the form of dialog windows. For example, text information is prompted in the status bar, a prompt sound is issued, the electronic device vibrates, and the indicator light flashes.
  • Android Runtime includes core libraries and a virtual machine. Android runtime is responsible for scheduling and management of the Android system.
  • the core library consists of two parts: one is the functions that the Java language needs to call, and the other is the core library of Android.
  • the application layer and the application framework layer run in virtual machines.
  • the virtual machine executes the java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, safety and exception management, and garbage collection.
  • a system library can include multiple functional modules. For example: surface manager (surface manager), media library (media library), 3D graphics processing library (eg: OpenGL ES), 2D graphics engine (eg: SGL), etc.
  • the Surface Manager is used to manage the display subsystem and provides a fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing.
  • 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer contains at least display drivers, camera drivers, audio drivers, and sensor drivers.
  • the voice assistant wake-up method provided by the embodiment of the present application can be applied to the application scenario shown in FIG. 3 .
  • the application scenario includes multiple electronic devices with voice assistants, and the wake-up words are the same.
  • such an application scenario may be referred to as a multi-device scenario.
  • with the method of the embodiment of the present application, the scenario in which each electronic device is located is determined according to multiple sensor data, and the most appropriate electronic device to respond is selected to be woken up according to the priority of the scenario.
  • waking up the voice assistant on that device allows the electronic devices to better perceive the user's intention, giving the user a smarter product experience.
  • a voice assistant may be installed in the electronic device to enable the electronic device to implement a voice control function.
  • Voice assistants are generally dormant. Before using the voice control function of the electronic device, the user needs to voice wake up the voice assistant. Among them, the voice data for waking up the voice assistant may be called a wake-up word (or a wake-up voice). The wake word may be pre-registered in the electronic device.
  • the wake-up voice assistant in this embodiment may refer to that the electronic device starts the voice assistant in response to the wake-up word spoken by the user.
  • the voice control function may refer to: after the voice assistant of the electronic device is activated, the user can trigger the electronic device to automatically execute an event corresponding to the voice command by speaking a voice command (eg, a piece of voice data).
  • the above-mentioned voice assistant may be an embedded application in the electronic device (ie, a system application of the electronic device), or may be a downloadable application.
  • Embedded applications are applications provided as part of the implementation of an electronic device such as a cell phone.
  • a downloadable application is an application that can provide its own Internet Protocol Multimedia Subsystem (IMS) connection.
  • the downloadable application may be pre-installed in the electronic device, or may be a third-party application downloaded by the user and installed in the electronic device.
  • the electronic device may include a state sensing module, a communication module, a decision-making module and a wake-up module.
  • the wake-up module is mainly used to wake up the voice assistant;
  • the state perception module is mainly used to calculate the state perception data of the electronic device;
  • the decision module is used to decide which electronic device to wake up the voice assistant.
  • the decision-making module of the electronic device can decide the target to wake up the electronic device according to the preset rules, and notify the wake-up module;
  • the communication module is responsible for sending and receiving messages between electronic devices.
  • FIG. 5 is a schematic flowchart of a voice assistant wake-up method provided by an embodiment of the present application, which is applied to any electronic device shown in FIG. 3. As shown in the figure, the voice assistant wake-up method includes the following operations.
  • the DSP of the first electronic device can monitor in real time, through the microphone, whether the user has input voice data.
  • when a user wants to use the voice control function of the electronic device, he or she can emit a sound within the sound pickup distance of the electronic device, so that the emitted sound is input into the microphone.
  • the DSP of the first electronic device can monitor the corresponding voice data through the microphone and cache it.
  • the user is sitting on the sofa in the living room, and when he wants to use the voice control function to play music, he can say the wake-up word "Xiaobu Xiaobu".
  • if the mobile phone, tablet, and notebook are all around the user, that is, within the pickup distance, and no other software or hardware is using the microphone to collect voice data, the DSPs of the mobile phone, tablet, and notebook can each detect, through their respective microphones, the voice data corresponding to the wake-up word "Xiaobu Xiaobu".
  • the voice data may be checked, that is, it is determined whether the received voice data is a wake-up word registered in the first electronic device. If the verification is passed, it indicates that the received voice data is a wake-up word, and sensor data can be obtained. If the verification fails, it indicates that the received voice data is not a wake-up word, and the first electronic device can delete the buffered voice data at this time.
  • when the voice data received by the first electronic device is a wake-up word, the first electronic device can start the service of the decision-making system, which is integrated into the operating system in the form of a service.
  • the decision-making system service starts to register initialization tasks such as sensor monitoring, that is, the operating system starts the sensor to monitor the state of the first electronic device, and transmits sensor data back through a callback function. After all the sensor data are transmitted back, the state sensing module is notified to read the sensor data and calculate the state of the first electronic device.
  • transmitting sensor data by means of a callback function is an asynchronous operation, which can improve system efficiency.
  • the sensor data includes at least one of the following: acceleration sensor data, angular velocity sensor data, Z-axis acceleration data, distance sensor data, and light sensor data.
  • the state sensing data includes: first state data, second state data, third state data, and a device identification.
  • the first state data may be a value within the numerical range corresponding to the first electronic device being in a holding state or a flat state;
  • the second state data may be a value within the numerical range corresponding to the first electronic device being in an inverted state or a non-inverted state;
  • the third state data may be a value within the numerical range corresponding to any one of the first electronic device being in a trouser pocket state, a blocked state, a daytime state, or a nighttime state.
  • the determining the state perception data according to the sensor data includes: determining the first state data according to the acceleration sensor data and the angular velocity sensor data; determining the second state data according to the Z-axis acceleration data; and determining the third state data according to the distance sensor data and the light sensor data.
  • the state perception module can use the acceleration sensor and the angular velocity sensor to determine whether the electronic device is within the numerical range corresponding to the "holding" or "flat" (non-holding) state, and can use the acceleration sensor, the proximity sensor, and the light sensor to determine whether the electronic device is within the numerical range corresponding to any of the five states "inverted", "trouser pocket", "blocked", "night", and "day".
  • the determining the first state data according to the acceleration sensor data and the angular velocity sensor data includes:
  • calculating a first included angle according to the value of the acceleration sensor data, where the first included angle is the included angle between the first electronic device and the horizontal plane; if the first included angle is greater than or equal to a preset included angle, the value of the first state data is set to a first numerical value, the first numerical value being a value within the numerical range to which the first state belongs; if the first included angle is smaller than the preset included angle, determining, according to the value of the angular velocity sensor data, whether the first electronic device shakes; if the first electronic device shakes, the value of the first state data is set to the first numerical value; otherwise, it is set to a second numerical value, the second numerical value being a value not within the numerical range to which the first state belongs.
  • the first state may be a holding state, and the non-first state may be a flat state; the second state may be an inverted state, and the non-second state may be a non-inverted state; and the third state is a trouser pocket state.
  • the above-mentioned first numerical value may be any numerical value within the numerical range to which the first state belongs, and the second numerical value may be any numerical value not within the numerical range to which the first state belongs.
  • it can be understood that the numerical range to which the first state belongs does not overlap with the numerical range to which the non-first state belongs.
  • it can be determined whether the first electronic device is currently being used by the user by judging whether the first electronic device is in a holding state.
  • by means of the acceleration sensor data, whether the first electronic device is in the horizontal-screen or vertical-screen state, as long as the included angle between the first electronic device and the horizontal plane exceeds the preset included angle, the value of the first state data is set to the first numerical value, so that the decision module can determine that the first electronic device is in a holding state.
  • the holding state includes but is not limited to the following scenarios.
  • in these scenarios, the first electronic device will detect that it is in the holding state.
  • acc_x and acc_y are the gravitational acceleration components on the x-axis and the y-axis, respectively; g is the gravitational acceleration, with a value of 9.81.
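As an illustrative sketch (not necessarily the patent's exact formula), the first included angle can be derived from the horizontal gravity components acc_x and acc_y described above: the flatter the device lies, the smaller these components, so sin(angle) = sqrt(acc_x² + acc_y²)/g is one plausible formulation. The arcsin form and the 15-degree default below are assumptions consistent with the text.

```python
import math

def first_included_angle(acc_x: float, acc_y: float, g: float = 9.81) -> float:
    """Estimate the angle (degrees) between the device and the horizontal plane.

    Assumed formulation: the horizontal-plane gravity components acc_x and
    acc_y grow as the device tilts, so sin(angle) = sqrt(acc_x^2 + acc_y^2) / g.
    """
    ratio = min(1.0, math.sqrt(acc_x ** 2 + acc_y ** 2) / g)
    return math.degrees(math.asin(ratio))

def is_holding(acc_x: float, acc_y: float, shaking: bool,
               preset_angle: float = 15.0) -> bool:
    # Held if tilted past the preset angle, or nearly flat but shaking
    # (e.g. lying in the user's palm), per the fallback described in the text.
    return first_included_angle(acc_x, acc_y) >= preset_angle or shaking
```

A device lying flat and still (`acc_x = acc_y = 0`, no shake) would thus be classified as not held, while a vertically held device (one horizontal component near g) clears the angle test regardless of shaking.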
  • the preset included angle can be configured by the system, for example, 15 degrees; it can also be set according to the user's usage habits.
  • the present application does not limit other ways of setting the preset angle.
  • when the first included angle is smaller than the preset included angle, the first electronic device may still be in a holding state, for example, when the user uses the first electronic device lying flat in the palm of his hand. Therefore, when the first included angle is smaller than the preset included angle, the first electronic device can start shake detection. Through shake detection, a first electronic device that is in the holding state but whose first included angle is smaller than the preset included angle can be detected.
  • the determining whether the first electronic device shakes according to the value of the angular velocity sensor data includes:
  • calculating a first angular velocity according to the value of the angular velocity sensor data, where the first angular velocity is the angular velocity of the first electronic device relative to the horizontal plane; if the first angular velocity is less than a first preset angular velocity and greater than a second preset angular velocity, storing the first angular velocity in a cache, otherwise clearing the cache; when the number of first angular velocities in the cache is greater than or equal to the statistical window, calculating, at the sampling period, the average value of the first angular velocities in the statistical window; if the average value is greater than or equal to a third preset angular velocity, or the first angular velocity is greater than the first preset angular velocity, determining that the first electronic device is shaking, otherwise determining that the first electronic device is stationary.
  • the first angular velocity is the modulus of the angular velocity vector of the first electronic device, and the modulus of the angular velocity vector can be expressed as |ω| = √(axisX² + axisY² + axisZ²), where axisX, axisY, and axisZ are the angular velocities about the x-axis, y-axis, and z-axis of the angular velocity sensor, respectively.
  • the above-mentioned first preset angular velocity, second preset angular velocity, third preset angular velocity, sampling period, and statistical window can be set by the system, for example, the first preset angular velocity is set to 0.030 rad/s, the second preset angular velocity is set to 0.002 rad/s, the third preset angular velocity is set to 0.010 rad/s, the sampling period is set to 4, and the statistical window is set to 20; they can also be set according to the specific actual application scenario, which is not limited in this embodiment of the present application.
  • the period of angular velocity data reported by angular velocity sensors is very short (2.5 ms per packet), but the fluctuation of angular velocity data measured by gyroscopes may last for a period of time, so such fluctuations are filtered out by enlarging the statistical window and sampling at intervals.
  • after the state perception module receives the instantaneous angular velocity value reported by the gyroscope, it calculates the modulus of the reported angular velocity vector; if the modulus of the angular velocity vector is greater than the first preset angular velocity v1 or less than the second preset angular velocity v2, the state perception module clears the moduli stored in the memory buffer; when the modulus is greater than v1, the first electronic device is judged to be shaking, and when the modulus is less than v2, the first electronic device is judged to be static.
  • if the modulus of the angular velocity vector is less than the first preset angular velocity v1 and greater than the second preset angular velocity v2, the modulus is put into the buffer; if the number of moduli stored in the buffer is less than the statistical window Wstatistics_window, the state perception module performs the above calculation on the next received instantaneous angular velocity value; if the number of moduli stored in the buffer is greater than or equal to Wstatistics_window, the angular velocities in the statistical window are sampled at the sampling period Isampling_interval, the number of samples being Wstatistics_window/Isampling_interval, and then the average angular velocity Avg of these Wstatistics_window/Isampling_interval angular velocities is calculated, thereby reducing the fluctuation of the angular velocity data.
  • if the average angular velocity value Avg is less than the third preset angular velocity, it is judged that the first electronic device is stationary; if the average angular velocity value Avg is greater than or equal to the third preset angular velocity, it is judged that the first electronic device shakes.
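The buffered shake detection described above can be sketched as follows. The thresholds v1 = 0.030 rad/s, v2 = 0.002 rad/s, v3 = 0.010 rad/s, the sampling period 4, and the statistical window 20 are the example values from the text; the class structure and return convention are assumptions.

```python
import math

V1, V2, V3 = 0.030, 0.002, 0.010  # first/second/third preset angular velocities (rad/s)
SAMPLING_PERIOD = 4               # interval between sampled entries in the window
STATS_WINDOW = 20                 # number of buffered moduli before averaging

class ShakeDetector:
    """Sketch of the buffered shake detection (thresholds per the example values)."""

    def __init__(self):
        self.buffer = []

    def feed(self, axis_x: float, axis_y: float, axis_z: float):
        """Feed one gyroscope sample; return 'shaking', 'static', or None (undecided)."""
        omega = math.sqrt(axis_x**2 + axis_y**2 + axis_z**2)  # modulus of angular velocity vector
        if omega > V1:                       # strong instantaneous rotation: shaking
            self.buffer.clear()
            return "shaking"
        if omega < V2:                       # essentially no rotation: static
            self.buffer.clear()
            return "static"
        self.buffer.append(omega)            # ambiguous band: accumulate and average
        if len(self.buffer) < STATS_WINDOW:
            return None
        sampled = self.buffer[::SAMPLING_PERIOD]  # interval sampling inside the window
        avg = sum(sampled) / len(sampled)         # Wstatistics_window/Isampling_interval samples
        self.buffer.clear()
        return "shaking" if avg >= V3 else "static"
```

The interval sampling (`buffer[::SAMPLING_PERIOD]`) keeps 20/4 = 5 samples per window, mirroring the Wstatistics_window/Isampling_interval count in the text.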
  • the determining the second state data according to the Z-axis acceleration data includes: if the value of the Z-axis acceleration data is less than an acceleration threshold, setting the value of the second state data to a third numerical value, the third numerical value being a value within the numerical range to which the second state belongs; otherwise, setting the value of the second state data to a fourth numerical value, the fourth numerical value being a value not within the numerical range to which the second state belongs.
  • the state sensing module sets the second state data according to the z-axis acceleration reported by the acceleration sensor, so as to detect whether the first electronic device is in an upside-down state. Specifically, if the value of the z-axis acceleration data is less than the acceleration threshold, the value of the second state data is set to the third numerical value, and the decision-making module can then determine that the first electronic device is in an inverted state; if the value of the z-axis acceleration data is greater than or equal to the acceleration threshold, the value of the second state data is set to the fourth numerical value, so that the decision module can determine that the first electronic device is in a non-inverted state.
  • the acceleration threshold may be set to -9 m/s².
  • when the first electronic device is inverted, the Z-axis points opposite to the direction of the gravitational acceleration, and the Z-axis acceleration will be smaller than the acceleration threshold.
  • the above-mentioned third numerical value may be any value within the numerical range to which the second state belongs.
  • the value of the Z-axis acceleration data may also have a mapping relationship with the values within the numerical range to which the second state belongs.
  • for example, the numerical range to which the second state belongs is [-2, -1].
  • the second state data can be set to -1; when the Z-axis acceleration is in the range of -9.5 m/s² to -9.8 m/s², the second state data may be set to -2.
  • the above-mentioned fourth numerical value may be any value within the numerical range to which the non-second state belongs.
  • the value of the Z-axis acceleration data may also have a mapping relationship with the values within the numerical range to which the non-second state belongs.
  • for example, the value range to which the non-second state belongs is [1, 4].
  • the second state data can be set to 1; when the Z-axis acceleration is in the range of -4.9 m/s² to -1.0 m/s², the second state data can be set to 2; when the Z-axis acceleration is in the range of -0.9 m/s² to 4.0 m/s², the second state data can be set to 3; when the Z-axis acceleration is in the range of 4.1 m/s² to 9.8 m/s², the second state data can be set to 4.
  • the embodiments of the present application also do not limit other manners of mapping the third numerical value to values within the numerical range of the second state, and the fourth numerical value to values not within the numerical range of the second state.
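The simple threshold version of the inverted-state check described above can be sketched as follows; the concrete return values -1 and 1 are illustrative picks from the [-2, -1] and [1, 4] ranges, and the finer banded mapping in the text could replace them.

```python
ACCEL_THRESHOLD = -9.0  # m/s^2, example threshold from the text

def second_state_data(acc_z: float) -> int:
    """Map the z-axis acceleration to second state data.

    Returns -1 (a value in the inverted-state range) when the z-axis
    acceleration falls below the threshold, otherwise 1 (a value in the
    non-inverted range). Both concrete values are illustrative.
    """
    return -1 if acc_z < ACCEL_THRESHOLD else 1
```

A device lying face-down reports a z-axis acceleration near -9.8 m/s², so it falls below the -9 m/s² threshold and is classified as inverted.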
  • the determining the third state data according to the distance sensor data and the light sensor data includes: if the value of the distance sensor data is less than an occlusion distance threshold and the value of the light sensor data is less than a light intensity threshold, setting the value of the third state data to a fifth numerical value, the fifth numerical value being a value within the numerical range to which the third state belongs; otherwise, setting the value of the third state data to a sixth numerical value, the sixth numerical value being a value not within the numerical range to which the third state belongs.
  • the third state is the trouser pocket state.
  • when the first electronic device is in the trouser pocket state, it means that the user currently has little willingness to use the first electronic device, that is, the probability that the user chooses to wake up the voice assistant on the first electronic device is very small. Since the trouser pocket scene is dark and blocked at the same time, the value of the third state data can be set through the distance sensor and the light sensor, so that the decision module can detect whether the first electronic device is in the trouser pocket state.
  • if the value of the distance sensor data is less than the occlusion distance threshold and the value of the light sensor data is less than the light intensity threshold, the value of the third state data is set to the fifth numerical value, and the decision-making module can determine, according to the fifth numerical value, that the first electronic device is in the trouser pocket state; otherwise, the value of the third state data is set to the sixth numerical value, and the decision module determines, according to the sixth numerical value, that the first electronic device is in a non-trouser-pocket state.
  • the occlusion distance threshold may be set to 0, 0.1 cm, 0.2 cm, etc.
  • the light intensity threshold may be set to 10 lux.
  • the occlusion distance threshold and the light intensity threshold may also be set according to the user's habit, or set according to a specific actual scene, which is not limited in this embodiment of the present application.
  • the non-third state includes a fourth state, a fifth state and a sixth state.
  • the fourth state may be a blocking state
  • the fifth state may be a night state
  • the sixth state may be a daytime state.
  • the setting the value of the third state data to the sixth numerical value includes: if the value of the distance sensor data is less than the occlusion distance threshold and the value of the light sensor data is greater than or equal to the light intensity threshold, setting the value of the third state data to a seventh numerical value, the seventh numerical value being a value within the numerical range to which the fourth state belongs; if the value of the distance sensor data is greater than or equal to the occlusion distance threshold and the value of the light sensor data is less than the light intensity threshold, setting the value of the third state data to an eighth numerical value, the eighth numerical value being a value within the numerical range to which the fifth state belongs; if the value of the distance sensor data is greater than or equal to the occlusion distance threshold and the value of the light sensor data is greater than or equal to the light intensity threshold, setting the value of the third state data to a ninth numerical value, the ninth numerical value being a value within the numerical range to which the sixth state belongs.
  • the numerical range to which the non-third state belongs may be the union of the numerical range to which the fourth state belongs, the numerical range to which the fifth state belongs, and the numerical range to which the sixth state belongs; that is, the sixth numerical value may be any value in the union of these three numerical ranges.
  • the distance between the electronic device and a target object can be detected by the distance sensor, so the value of the distance sensor data can also be used to detect that the first electronic device is not placed in a trouser pocket but is nevertheless blocked, for example, scenarios where the phone is placed next to the ear (for example, when answering a call).
  • the light level of the current environment of the electronic device can be detected by the light sensor, so whether the first electronic device is in the night state or the daytime state can be detected through the value of the light sensor data.
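The four-way classification described above (trouser pocket, blocked, night, daytime) can be sketched as follows. The threshold values follow the examples in the text; the concrete return values 5, 7, 8, 9 are illustrative stand-ins for the fifth, seventh, eighth, and ninth numerical values (one value per state's numerical range).

```python
OCCLUSION_DISTANCE_THRESHOLD = 0.1  # cm (the text allows 0, 0.1, 0.2, etc.)
LIGHT_INTENSITY_THRESHOLD = 10.0    # lux (example value from the text)

def third_state_data(distance: float, light: float) -> int:
    """Map distance/light sensor readings to third state data.

    Illustrative return values: 5 = trouser pocket, 7 = blocked,
    8 = night, 9 = daytime.
    """
    blocked = distance < OCCLUSION_DISTANCE_THRESHOLD
    dark = light < LIGHT_INTENSITY_THRESHOLD
    if blocked and dark:
        return 5   # trouser pocket: blocked AND dark
    if blocked:
        return 7   # blocked but lit (e.g. phone held to the ear)
    if dark:
        return 8   # unobstructed but dark: night
    return 9       # unobstructed and lit: daytime
```

Only the combination "blocked and dark" maps to the trouser pocket state; the remaining three combinations map to the fourth, fifth, and sixth states respectively.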
  • the method further includes: sending the state awareness data of the first electronic device to the at least one second electronic device.
  • after the first electronic device calculates the state sensing data, the state sensing data can be sent to the surrounding second electronic devices, so that each second electronic device can make a decision according to the state sensing data of the surrounding multiple devices and choose a target wake-up device.
  • the first electronic device may set a timer in the communication module; when the voice data received by the first electronic device is a wake-up word, the timer is started, and the state-aware data calculated by the state perception module is then broadcast within the timer period.
  • the electronic device broadcasts the state-aware data to synchronize the states of multiple devices, which can reduce the time overhead caused by establishing connections; at the same time, processing the sensor data in real time means there is no need to collect sensor data multiple times for calculation, reducing the time spent collecting sensor data. This brings new features to the user while keeping the wake-up time of the voice assistant well under control.
  • S530 Receive state sensing data from at least one second electronic device.
  • when the voice data received by the first electronic device is a wake-up word, the first electronic device can use the communication module to monitor the state awareness data sent by the surrounding second electronic devices.
  • the first electronic device may set another timer in the communication module; when the voice data received by the first electronic device is a wake-up word, the timer is started, and the state-aware data sent by the second electronic devices is received within the timer period.
  • exemplarily, the first electronic device may set a timer in the communication module to start broadcasting and receiving state-aware data, and automatically stop broadcasting and receiving state-aware data when the timer expires.
  • the above timer may be set by the system, for example, set to a timeout of 300 ms; it may also be set according to a specific actual scenario, which is not limited in this embodiment of the present application.
  • the decision-making delay can be effectively reduced, and the user experience can be improved.
  • the communication module can encapsulate a variety of near-field communication methods, such as Bluetooth (BT), Bluetooth Low Energy (BLE), WIFI Direct, wireless local area network (WLAN), etc.
  • the BLE broadcast packet will transmit a series of data including Universal Unique Identifier (UUID), MAC address, Bluetooth name, etc., but there is still a part of the remaining space that can be used to transmit custom data.
  • the remaining space of a broadcast packet can be up to 23 bytes.
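As a sketch of how the state-aware data could be packed into the spare space of a broadcast packet, the following uses three signed bytes for the state values and a 32-bit device ID, for 7 bytes total, well under the 23-byte budget mentioned above. The field widths and byte order are assumptions, not the patent's actual on-air format.

```python
import struct

def pack_state(is_hold: int, is_upside_down: int, is_in_pocket: int,
               device_id: int) -> bytes:
    """Pack state-aware data into a compact custom payload (assumed layout).

    Layout: three signed bytes for the state values, then an unsigned
    little-endian 32-bit device ID — 7 bytes total.
    """
    return struct.pack("<bbbI", is_hold, is_upside_down, is_in_pocket, device_id)

def unpack_state(payload: bytes):
    """Inverse of pack_state; returns (is_hold, is_upside_down, is_in_pocket, device_id)."""
    return struct.unpack("<bbbI", payload)
```

Keeping the payload this small leaves room in the same advertisement for the UUID and other mandatory fields the text mentions.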
  • the first electronic device and/or the second electronic device can reuse an established connection to directly send and receive data packets, so connection-oriented communication is more reliable, more secure, and has lower latency.
  • this embodiment of the present application does not limit the execution order of S520 and S530; that is, the first electronic device may execute S520 and S530 at the same time, execute S530 first and then S520, or execute S520 first and then S530.
  • the communication module can send the state sensing data of the second electronic device and the state sensing data of the first electronic device to the decision-making module, and the decision-making module can determine the target wake-up device according to the state sensing data of the multiple devices.
  • the determining a target wake-up device according to the state sensing data of the first electronic device and the state sensing data of the at least one second electronic device includes:
  • determining, according to the second state data, a first candidate device, where the first candidate device is the first electronic device and/or the second electronic device in the non-second state; if the number of first candidate devices is 1, the first candidate device is determined as the target wake-up device; if the number of first candidate devices is greater than 1, a selection is made from the first candidate devices according to the third state data.
  • the data structure of the state-aware data can be expressed as ⁇ isHoldInHand; isUpsideDown; isInPocket; deviceId ⁇ .
  • isHoldInHand is the value of the first state data
  • isUpsideDown is the value of the second state data
  • isInPocket is the value of the third state data
  • deviceId is the device ID.
  • a chain decision-making method is adopted, that is, selection is made in sequence according to the state-aware data; as soon as a unique target wake-up device is selected, the process exits immediately and does not continue down the chain. If the last rule is reached and still no unique target wake-up device has been selected, the target wake-up device is selected through the device ID.
  • the decision-making module selects, from the plurality of devices according to the second state data in the state perception data, the first electronic device and/or the second electronic device in the non-second state (ie, the electronic devices in the non-inverted state); that is, when the value of the second state data of an electronic device is within the value range of the second state, it is determined that the electronic device is in the second state.
  • if only one electronic device is in the non-third state, that electronic device is directly determined as the target wake-up device; if there are multiple electronic devices in the non-third state, the electronic device in the first state is selected from the multiple electronic devices in the non-third state.
  • specifically, if the value of the first state data of an electronic device is within the value range of the first state, it is determined that the electronic device is in the first state. If no electronic device is in the non-third state (that is, all are in the trouser pocket state), the electronic device in the first state is selected from the plurality of devices, or from the plurality of electronic devices not in the second state. If only one electronic device is in the first state, that electronic device is directly determined as the target wake-up device; if multiple electronic devices, or none, are in the first state, the target wake-up device is selected according to the device ID.
  • the method of selecting the unique device by the device ID includes but is not limited to selecting the smallest device ID or the largest device ID, or randomly selecting a device ID through a random algorithm.
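The chain decision described above can be sketched as follows. Each device carries the {isHoldInHand; isUpsideDown; isInPocket; deviceId} fields from the state-aware structure; the convention that a value of 1 means "in that state" is an assumed simplification of the numerical-range membership tests, and the smallest-device-ID tie-breaker is one of the options the text allows.

```python
def pick_target(devices):
    """Chain decision sketch over state-aware data dicts.

    Rules fire in order and exit as soon as a unique candidate remains:
    not upside down -> not in pocket -> held in hand -> smallest device ID.
    The `or <previous list>` fallback keeps the chain going when a rule
    would eliminate every candidate.
    """
    # Rule 1: keep devices that are NOT upside down.
    candidates = [d for d in devices if d["isUpsideDown"] != 1] or devices
    if len(candidates) == 1:
        return candidates[0]
    # Rule 2: keep devices that are NOT in a trouser pocket.
    not_pocket = [d for d in candidates if d["isInPocket"] != 1] or candidates
    if len(not_pocket) == 1:
        return not_pocket[0]
    # Rule 3: keep devices that ARE held in hand.
    held = [d for d in not_pocket if d["isHoldInHand"] == 1] or not_pocket
    if len(held) == 1:
        return held[0]
    # Fallback: smallest device ID (an assumed tie-breaker choice).
    return min(held, key=lambda d: d["deviceId"])
```

For example, among an inverted phone, a pocketed tablet, and a held notebook, the held notebook survives the chain; when every remaining device is in the same state, the device ID decides.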
  • the electronic device determines, according to the data detected by multiple sensors, the various scenarios in which each electronic device is located, and selects the most appropriate device to wake up according to the priority of the scenarios, so that the electronic devices can better perceive the user's intention and the user can enjoy a more intelligent product experience.
  • the state perception data structure of the target wake-up device selected by the decision-making module is passed to the wake-up module, and the wake-up module judges, through the deviceId, whether the target is itself.
  • in the voice assistant wake-up method proposed in the embodiments of the present application, when the first electronic device receives the user's voice data, it acquires sensor data and determines state perception data according to the sensor data, and then receives state perception data from at least one second electronic device.
  • according to the state perception data of the first electronic device and the state perception data of the at least one second electronic device, the target wake-up device is determined, and the voice assistant is woken up if the target wake-up device is the first electronic device.
  • the present application determines, according to the state perception data of multiple electronic devices, the voice assistant on the electronic device that the user intends to wake, so that electronic devices can better perceive the user's intention; thus, among multiple electronic devices, the voice assistant on the device the user wants is woken up, improving the user experience.
  • the electronic device includes corresponding hardware and/or software modules for executing each function.
  • the present application can be implemented in hardware, or in the form of a combination of hardware and computer software, in conjunction with the algorithm steps of each example described in the embodiments disclosed herein. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functionality for each particular application, but such implementations should not be considered beyond the scope of this application.
  • the electronic device can be divided into functional modules according to the above method examples.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware. It should be noted that the division of modules in this embodiment is schematic and is only a logical function division; there may be other division manners in actual implementation.
  • FIG. 6 shows a schematic structural diagram of a voice assistant wake-up device.
  • the voice assistant wake-up device 600 is applied to electronic equipment, and the voice assistant wake-up device 600 can be It includes: a state perception module 601 , a communication module 602 , a decision module 603 and a wake-up module 604 .
  • the state perception module 601 may be used to support the electronic device to perform the above S510, S520, etc., and/or other processes for the techniques described herein.
  • the communication module 602 may be used to support the electronic device to perform the above-described S530, etc., and/or other processes for the techniques described herein.
  • the decision module 603 may be used to support the electronic device to perform the above-described S540, etc., and/or other processes for the techniques described herein.
  • the wake-up module 604 may be used to support the electronic device to perform S550, etc. above, and/or other processes for the techniques described herein.
  • the electronic device provided in this embodiment is used to execute the above-mentioned voice assistant wake-up method, so it can achieve the same effect as the above-mentioned implementation method.
  • the electronic device may include a processing module, a memory module and a communication module.
  • the processing module can be used to control and manage the actions of the electronic device, for example, can be used to support the electronic device to perform the steps performed by the state sensing module 601 , the communication module 602 , the decision module 603 and the wake-up module 604 .
  • the storage module may be used to support the electronic device to execute stored program codes and data, and the like.
  • the communication module can be used to support the communication between the electronic device and other devices.
  • the processing module may be a processor or a controller. It may implement or execute the various exemplary logical blocks, modules and circuits described in connection with this disclosure.
  • the processor may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of digital signal processing (DSP) and a microprocessor, and the like.
  • the storage module may be a memory.
  • the communication module may specifically be a device that interacts with other electronic devices, such as a radio frequency circuit, a Bluetooth chip, and a Wi-Fi chip.
  • the electronic device involved in this embodiment may be a device having the structure shown in FIG. 1 .
  • This embodiment also provides a computer storage medium, in which computer instructions are stored; when the computer instructions are executed on the electronic device, the electronic device executes the above-mentioned relevant method steps to realize the voice assistant wake-up method in the above-mentioned embodiments.
  • This embodiment also provides a computer program product, when the computer program product runs on the computer, the computer executes the above-mentioned relevant steps, so as to realize the voice assistant wake-up method in the above-mentioned embodiment.
  • the embodiments of the present application also provide an apparatus, which may specifically be a chip, a component, or a module; the apparatus may include a processor and a memory connected to each other, where the memory is used for storing computer-executable instructions. When the apparatus is running, the processor can execute the computer-executable instructions stored in the memory, so that the chip executes the voice assistant wake-up method in the above method embodiments.
  • the electronic device, computer storage medium, computer program product, and chip provided in this embodiment are all used to execute the corresponding method provided above; therefore, for the beneficial effects they can achieve, reference can be made to the beneficial effects of the corresponding method provided above, which will not be repeated here.
  • the disclosed apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of modules or units is only a logical function division; in actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another device, or some features may be omitted or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • Units described as separate components may or may not be physically separated, and components shown as units may be one physical unit or multiple physical units, that is, may be located in one place, or may be distributed in multiple different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium.
  • a readable storage medium includes several instructions to make a device (which may be a single-chip microcomputer, a chip, etc.) or a processor execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program codes.

Abstract

A voice assistant wake-up method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises: when a first electronic device receives voice data of a user, acquiring sensor data (S510); determining state perception data according to the sensor data (S520); then receiving state perception data from at least one second electronic device (S530); determining a target wake-up device according to the state perception data of the first electronic device and the state perception data of the at least one second electronic device (S540); and if the target wake-up device is the first electronic device, waking up the voice assistant (S550). The method determines, according to the state perception data of multiple electronic devices, the voice assistant on the electronic device that the user intends to wake, allowing electronic devices to better perceive the user's intention, so that among multiple electronic devices the voice assistant on the device the user wants is woken up, improving the user experience.

Description

语音助手唤醒方法及装置 技术领域
本申请涉及计算机技术领域,尤其涉及一种语音助手唤醒方法及装置。
背景技术
随着语音识别技术的广泛应用,移动终端内的语音助手逐渐成为一种人们经常使用的功能,用户可以通过向语音助手发出一些语音指令,以控制语音助手来完成对于移动终端的各种操作。
发明内容
本申请实施例提供一种语音助手唤醒方法及装置。
第一方面,本申请实施例提供一种语音助手唤醒方法,应用于第一电子设备,所述方法包括:
当接收到用户的语音数据时,获取传感器数据;
根据所述传感器数据确定状态感知数据;
接收来自至少一个第二电子设备的状态感知数据;
根据所述第一电子设备的状态感知数据和所述至少一个第二电子设备的状态感知数据,确定目标唤醒设备;
若所述目标唤醒设备为所述第一电子设备,唤醒语音助手。
第二方面,本申请实施例提供一种语音助手唤醒装置,应用于电子设备,所述装置包括:
状态感知模块,用于当接收到用户的语音数据时,获取传感器数据;
所述状态感知模块,还用于根据所述传感器数据确定状态感知数据;
通信模块,用于接收来自至少一个第二电子设备的状态感知数据;
决策模块,用于根据所述第一电子设备的状态感知数据和所述至少一个第二电子设备的状态感知数据,确定目标唤醒设备;
唤醒模块,用于若所述目标唤醒设备为所述第一设备,唤醒语音助手。
第三方面,本申请实施例提供一种电子设备,包括处理器、存储器、通信接口以及一个或多个程序,其中,上述一个或多个程序被存储在上述存储器中,并且被配置由上述处理器执行,上述程序包括用于执行本申请实施例第一方面任一方法中的步骤的指令。
第四方面,本申请实施例提供了一种计算机可读存储介质,其中,上述计算机可读存储介质存储用于电子数据交换的计算机程序,其中,上述计算机程序使得计算机执行如本申请实施例第一方面任一方法中所描述的部分或全部步骤。
第五方面,本申请实施例提供了一种计算机程序产品,其中,上述计算机程序产品包括存储了计算机程序的非瞬时性计算机可读存储介质,上述计算机程序可操作来使计算机执行如本申请实施例第一方面任一方法中所描述的部分或全部步骤。该计算机程序产品可以为一个软件安装包。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是本申请实施例提供的一种电子设备的结构示意图;
图2是本申请实施例提供的一种电子设备的软件结构示意图;
图3是本申请实施例提供的一种语音助手唤醒的应用场景示意图;
图4是本申请实施例提供的另一种电子设备的结构示意图;
图5是本申请实施例提供的一种语音助手唤醒方法的流程示意图;
图5a是本申请实施例提供的一种多设备的场景示意图;
图5b是本申请实施例提供的一种持握状态的示意图;
图6是本申请实施例提供的一种语音助手唤醒装置的结构示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别不同对象,而不是用于描述特定顺序。此外,术语“包括”和“具有”以及它们任何变形,意图在于覆盖不排他的包含。例如包含了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出的步骤或单元,而是可选地还包括没有列出的步骤或单元,或可选地还包括对于这些过程、方法、产品或设备固有的其他步骤或单元。
在本申请中提及“实施例”意味着,结合实施例描述的特定特征、结构或特性可以包含在本申请的至少一个实施例中。在说明书中的各个位置出现该短语并不一定均是指相同的实施例,也不是与其它实施例互斥的独立的或备选的实施例。本领域技术人员显式地和隐式地理解的是,本申请所描述的实施例可以与其它实施例相结合。
本申请实施例涉及的电子设备可以是还包含其它功能诸如个人数字助理和/或音乐播放器功能的便携式电子设备,诸如手机、平板电脑、具备无线通讯功能的可穿戴电子设备(如智能手表)等。便携式电子设备的示例性实施例包括但不限于搭载IOS系统、Android系统、Microsoft系统或者其它操作系统的便携式电子设备。上述便携式电子设备也可以是其它便携式电子设备,诸如膝上型计算机(Laptop)等。还应当理解的是,在其他一些实施例中,上述电子设备也可以不是便携式电子设备,而是台式计算机。
第一部分,本申请所公开的技术方案的软硬件运行环境介绍如下。
示例性的,图1示出了电子设备100的结构示意图。电子设备100可以包括处理器110、外部存储器接口120、内部存储器121、通用串行总线(universal serial bus,USB)接口130、充电管理模块140、电源管理模块141、电池142、天线1、天线2、移动通信模块150、无线通信模块160、音频模块170、扬声器170A、受话器170B、麦克风170C、耳机接口170D、传感器模块180、指南针190、马达191、指示器192、摄像头193、显示屏194以及用户标识模块(subscriber identification module,SIM)卡接口195等。
可以理解的是,本申请实施例示意的结构并不构成对电子设备100的具体限定。在本申请另一些实施例中,电子设备100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的部件,也可以集成在一个 或多个处理器中。在一些实施例中,电子设备100也可以包括一个或多个处理器110。其中,控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。在其他一些实施例中,处理器110中还可以设置存储器,用于存储指令和数据。示例性地,处理器110中的存储器可以为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。这样就避免了重复存取,减少了处理器110的等待时间,因而提高了电子设备100处理数据或执行指令的效率。
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路间(inter-integrated circuit,I2C)接口、集成电路间音频(inter-integrated circuit sound,I2S)接口、脉冲编码调制(pulse code modulation,PCM)接口、通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口、移动产业处理器接口(mobile industry processor interface,MIPI)、用输入输出(general-purpose input/output,GPIO)接口、SIM卡接口和/或USB接口等。其中,USB接口130是符合USB标准规范的接口,具体可以是Mini USB接口、Micro USB接口、USB Type C接口等。USB接口130可以用于连接充电器为电子设备100充电,也可以用于电子设备100与外围设备之间传输数据。该USB接口130也可以用于连接耳机,通过耳机播放音频。
可以理解的是,本申请实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对电子设备100的结构限定。在本申请另一些实施例中,电子设备100也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
充电管理模块140用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块140可以通过USB接口130接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块140可以通过电子设备100的无线充电线圈接收无线充电输入。充电管理模块140为电池142充电的同时,还可以通过电源管理模块141为电子设备供电。
电源管理模块141用于连接电池142,充电管理模块140与处理器110。电源管理模块141接收电池142和/或充电管理模块140的输入,为处理器110、内部存储器121、外部存储器、显示屏194、摄像头193和无线通信模块160等供电。电源管理模块141还可以用于监测电池容量、电池循环次数、电池健康状态(漏电,阻抗)等参数。在其他一些实施例中,电源管理模块141也可以设置于处理器110中。在另一些实施例中,电源管理模块141和充电管理模块140也可以设置于同一个器件中。
电子设备100的无线通信功能可以通过天线1、天线2、移动通信模块150、无线通信模块160、调制解调处理器以及基带处理器等实现。
天线1和天线2用于发射和接收电磁波信号。电子设备100中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。
移动通信模块150可以提供应用在电子设备100上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块150可以包括至少一个滤波器,开关,功率放大器,低噪声放大器(low noise amplifier,LNA)等。移动通信模块150可以由天线1接收电磁波,并对接收的电磁波进行滤波,放大等处理,传送至调制解调处理器进行解调。移动通信模块150还可以对经调制解调处理器调制后的信号放大,经天线1转为电磁波辐射出去。在一些实施例中,移动通信模块150的至少部分功能模块可以被设置于处理器110中。在一些实施例中,移动通信模块150的至少部分功能模块可以与处理器110的至少部分模块被设置在同一个器件中。
无线通信模块160可以提供应用在电子设备100上的包括无线局域网(wireless local  area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络)、蓝牙(blue tooth,BT),全球导航卫星系统(global navigation satellite system,GNSS)、调频(frequency modulation,FM)、近距离无线通信技术(near field communication,NFC)、红外技术(infrared,IR)、UWB等无线通信的解决方案。无线通信模块160可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块160经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器110。无线通信模块160还可以从处理器110接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。
电子设备100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏194用于显示图像、视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD)、有机发光二极管(organic light-emitting diode,OLED)、有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode的,AMOLED)、柔性发光二极管(flex light-emitting diode,FLED)、迷你发光二极管(mini light-emitting diode,miniled)、MicroLed、Micro-oLed、量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一些实施例中,电子设备100可以包括1个或多个显示屏194。
电子设备100可以通过ISP、摄像头193、视频编解码器、GPU、显示屏194以及应用处理器等实现拍摄功能。
ISP用于处理摄像头193反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将所述电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点、亮度、肤色进行算法优化。ISP还可以对拍摄场景的曝光、色温等参数优化。在一些实施例中,ISP可以设置在摄像头193中。
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一些实施例中,电子设备100可以包括1个或多个摄像头193。
数字信号处理器用于处理数字信号,除了可以处理数字图像信号,还可以处理其他数字信号。例如,当电子设备100在频点选择时,数字信号处理器用于对频点能量进行傅里叶变换等。
视频编解码器用于对数字视频压缩或解压缩。电子设备100可以支持一种或多种视频编解码器。这样,电子设备100可以播放或录制多种编码格式的视频,例如:动态图像专家组(moving picture experts group,MPEG)1、MPEG2、MPEG3、MPEG4等。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现电子设备100的智能认知等应用,例如:图像识别、人脸识别、语音识别、文本理解等。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备100的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐,视频等文件保存在外部存储卡中。
内部存储器121可以用于存储一个或多个计算机程序,该一个或多个计算机程序包括 指令。处理器110可以通过运行存储在内部存储器121的上述指令,从而使得电子设备100执行本申请一些实施例中所提供的显示页面元素的方法,以及各种应用以及数据处理等。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统;该存储程序区还可以存储一个或多个应用(比如图库、联系人等)等。存储数据区可存储电子设备100使用过程中所创建的数据(比如照片,联系人等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如一个或多个磁盘存储部件,闪存部件,通用闪存存储器(universal flash storage,UFS)等。在一些实施例中,处理器110可以通过运行存储在内部存储器121的指令,和/或存储在设置于处理器110中的存储器的指令,来使得电子设备100执行本申请实施例中所提供的显示页面元素的方法,以及其他应用及数据处理。电子设备100可以通过音频模块170、扬声器170A、受话器170B、麦克风170C、耳机接口170D、以及应用处理器等实现音频功能。例如音乐播放、录音等。
传感器模块180可以包括压力传感器180A、陀螺仪传感器180B、气压传感器180C、磁传感器180D、加速度传感器180E、距离传感器180F、接近光传感器180G、指纹传感器180H、温度传感器180J、触摸传感器180K、环境光传感器180L、骨传导传感器180M等。
其中,压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A,电极之间的电容改变。电子设备100根据电容的变化确定压力的强度。当有触摸操作作用于显示屏194,电子设备100根据压力传感器180A检测所述触摸操作强度。电子设备100也可以根据压力传感器180A的检测信号计算触摸的位置。在一些实施例中,作用于相同触摸位置,但不同触摸操作强度的触摸操作,可以对应不同的操作指令。例如:当有触摸操作强度小于第一压力阈值的触摸操作作用于短消息应用图标时,执行查看短消息的指令。当有触摸操作强度大于或等于第一压力阈值的触摸操作作用于短消息应用图标时,执行新建短消息的指令。
陀螺仪传感器180B可以用于确定电子设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定电子设备100围绕三个轴(即X、Y和Z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器180B检测电子设备100抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消电子设备100的抖动,实现防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景。
加速度传感器180E可检测电子设备100在各个方向上(一般为三轴)加速度的大小。当电子设备100静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。
环境光传感器180L用于感知环境光亮度。电子设备100可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测电子设备100是否在口袋里,以防误触。
指纹传感器180H用于采集指纹。电子设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。
温度传感器180J用于检测温度。在一些实施例中,电子设备100利用温度传感器180J检测的温度,执行温度处理策略。例如,当温度传感器180J上报的温度超过阈值,电子设备100执行降低位于温度传感器180J附近的处理器的性能,以便降低功耗实施热保护。在另一些实施例中,当温度低于另一阈值时,电子设备100对电池142加热,以避免低温导致电子设备100异常关机。在其他一些实施例中,当温度低于又一阈值时,电子设备100对电池142的输出电压执行升压,以避免低温导致的异常关机。
触摸传感器180K,也称“触控面板”。触摸传感器180K可以设置于显示屏194,由 触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器180K也可以设置于电子设备100的表面,与显示屏194所处的位置不同。
示例性的,图2示出了电子设备100的软件结构框图。分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android系统分为四层,从上至下分别为应用程序层,应用程序框架层,安卓运行时(Android runtime)和系统库,以及内核层。应用程序层可以包括一系列应用程序包。
如图2所示,应用程序包可以包括相机,图库,日历,通话,地图,导航,WLAN,蓝牙,音乐,视频,短信息等应用程序。
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架。应用程序框架层包括一些预先定义的函数。
如图2所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器等。
窗口管理器用于管理窗口程序。窗口管理器可以获取显示屏大小,判断是否有状态栏,锁定屏幕,截取屏幕等。
内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问。所述数据可以包括视频,图像,音频,拨打和接听的电话,浏览历史和书签,电话簿等。
视图系统包括可视控件,例如显示文字的控件,显示图片的控件等。视图系统可用于构建应用程序。显示界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。
电话管理器用于提供电子设备100的通信功能。例如通话状态的管理(包括接通,挂断等)。
资源管理器为应用程序提供各种资源,比如本地化字符串,图标,图片,布局文件,视频文件等等。
通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。比如通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或者滚动条文本形式出现在系统顶部状态栏的通知,例如后台运行的应用程序的通知,还可以是以对话窗口形式出现在屏幕上的通知。例如在状态栏提示文本信息,发出提示音,电子设备振动,指示灯闪烁等。
Android Runtime包括核心库和虚拟机。Android runtime负责安卓系统的调度和管理。
核心库包含两部分:一部分是java语言需要调用的功能函数,另一部分是安卓的核心库。
应用程序层和应用程序框架层运行在虚拟机中。虚拟机将应用程序层和应用程序框架层的java文件执行为二进制文件。虚拟机用于执行对象生命周期的管理,堆栈管理,线程管理,安全和异常的管理,以及垃圾回收等功能。
系统库可以包括多个功能模块。例如:表面管理器(surface manager),媒体库(media libraries),三维图形处理库(例如:OpenGL ES),2D图形引擎(例如:SGL)等。
表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层的融合。
媒体库支持多种常用的音频,视频格式回放和录制,以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4,H.264,MP3,AAC,AMR,JPG,PNG等。
三维图形处理库用于实现三维图形绘图,图像渲染,合成,和图层处理等。
2D图形引擎是2D绘图的绘图引擎。
内核层是硬件和软件之间的层。内核层至少包含显示驱动,摄像头驱动,音频驱动,传感器驱动。
请参阅图3,本申请实施例提供的语音助手唤醒方法可以应用于如图3所示的应用场景。如图3所示,该应用场景包括多个具备语音助手的电子设备,且唤醒词相同。在本申请实施例中,可将这种应用场景称为多设备场景。在该多设备场景下,用户在说出唤醒词后,采用本申请实施例的方法,即根据多个传感器数据判断电子设备所处的场景,并根据场景优先级选择唤醒最合适应答的电子设备上的语音助手,可以让电子设备更好地感知用户意图,进而让使用者感受到更智能的产品体验。
在一些实施例中,可以通过在电子设备中安装语音助手,以使该电子设备实现语音控制功能。语音助手一般情况下是处于休眠状态的。用户在使用电子设备的语音控制功能之前,需要对语音助手进行语音唤醒。其中,唤醒语音助手的语音数据可以称为唤醒词(或唤醒语音)。该唤醒词可以预先注册在电子设备中。本实施例中所述的唤醒语音助手可以是指,电子设备响应于用户说出的唤醒词,启动语音助手。语音控制功能可以是指:电子设备的语音助手启动后,用户通过说出语音命令(如,一段语音数据),可以触发电子设备自动执行该语音命令对应的事件。
另外,上述语音助手可以是电子设备中的嵌入式应用(即电子设备的系统应用),也可以是可下载应用。嵌入式应用是作为电子设备(如手机)实现的一部分提供的应用程序。可下载应用是一个可以提供自己的因特网协议多媒体子系统(Internet Protocol Multimedia Subsystem,IMS)连接的应用程序。可下载应用可以预先安装在电子设备中,也可是由用户下载并安装在电子设备中的第三方应用。
示例性地,如图4所示,所述电子设备可以包括状态感知模块、通信模块、决策模块和唤醒模块。唤醒模块主要用于唤醒语音助手;状态感知模块主要用于计算本电子设备的状态感知数据;决策模块用于负责决策去唤醒哪个电子设备上的语音助手,该模块采用分布式决策方式,每个电子设备的决策模块在收到其它电子设备的状态感知数据后,可根据预设的规则决策出目标唤醒电子设备,并通知唤醒模块;通信模块负责电子设备之间的消息发送和接收。
另外，以下实施例结合图3，以多设备场景为例进行说明，该多个电子设备均安装有语音助手，且唤醒词均为“小布小布”。
请参阅图5,图5是本申请实施例提供了一种语音助手唤醒方法的流程示意图,应用于如图3所示的任一电子设备,如图所示,本语音助手唤醒方法包括以下操作。
S510、当接收到用户的语音数据时,获取传感器数据。
在本申请实施例中,对于安装有语音助手的第一电子设备,在该电子设备没有其他软硬件使用麦克风采集语音数据的情况下,第一电子设备的DSP可以通过麦克风实时监测用户是否有语音数据输入。一般情况下,在用户想要使用电子设备的语音控制功能时,可以在电子设备的拾音距离内发声,以将发出的声音输入到麦克风。此时,若第一电子设备没有其他软硬件正在使用麦克风采集语音数据,则第一电子设备的DSP可以通过麦克风监测到对应的语音数据,如语音数据,并进行缓存。
例如,图5a所示,用户坐在客厅的沙发上,在想要使用语音控制功能播放音乐时,可以说出唤醒词“小布小布”。如手机、平板和笔记本均在用户的周围,即用户都在其拾音距离内,且均没有其他软硬件正在使用麦克风采集语音数据,则手机、平板和智能手表的DSP便可通过各自的麦克风检测到唤醒词“小布小布”对应的语音数据。
示例性地,在第一电子设备接收到上述语音数据后,可以对该语音数据进行校验,即判断接收到的该语音数据是否是注册在第一电子设备中的唤醒词。如果校验通过,则表明 接收到的语音数据是唤醒词,可获取传感器数据。如果校验未通过,则表明接收到的语音数据不是唤醒词,此时第一电子设备可以删除缓存的语音数据。
其中,当第一电子设备接收的语音数据是唤醒词时,第一电子设备可拉起决策系统服务,决策系统以服务的形式集成到操作系统中。当操作系统检测到唤醒词,决策系统服务开始注册传感器监听等初始化任务,即操作系统启动传感器监听第一电子设备所处的状态,并通过回调函数的方式将传感器数据传递回来。当所有传感器数据都传递回来后,才通知状态感知模块去读取传感器数据,计算第一电子设备所处的状态。
在本申请实施例中，通过回调函数的方式传递传感器数据是一种异步操作，可以提高系统效率。
S520、根据所述传感器数据确定状态感知数据。
其中,所述传感器数据包括以下至少一项:加速度传感器数据、角速度传感器数据、Z轴加速度数据、距离传感器数据和光线传感器数据,所述状态感知数据包括:第一状态数据、第二状态数据、第三状态数据和设备标识。
示例性地,第一状态数据可以是第一电子设备处于持握状态或平放状态所属数值范围内的数据,第二状态数据可以是第一电子设备处于倒置状态或非倒置状态所属数值范围内的数值,第三状态数据可以是第一电子设备处于裤兜状态、遮挡状态、白天状态、黑夜状态中的任一种所属数值范围内的数值。
可选的,所述根据所述传感器数据确定状态感知数据,包括:根据所述加速度传感器数据和所述角速度传感器数据确定所述第一状态数据;根据所述Z轴加速度数据确定所述第二状态数据;根据所述距离传感器数据和光线传感器数据确定所述第三状态数据。
其中,状态感知模块可以利用加速度传感器和角速度传感器确定电子设备是否处于“持握”或“平放”(非持握)状态对应的数值范围,通过利用加速度传感器,接近传感器和光线传感器确定电子设备是否处于“倒置”,“裤兜”,“遮挡”,“夜晚”和“白天”等五种状态对应的数值范围。
可选的,所述根据所述加速度传感器数据和所述角速度传感器数据确定所述第一状态数据,包括:
根据所述加速度传感器数据的值计算第一夹角,所述第一夹角为所述第一电子设备与水平面的夹角;若所述第一夹角大于或等于预设夹角,将所述第一状态数据的值设置为第一数值,所述第一数值为所述第一状态所属数值范围内的数值;若所述第一夹角小于所述预设夹角,根据所述角速度传感器数据的值判断所述第一电子设备是否抖动;若所述第一电子设备抖动,将所述第一状态数据的值设置为所述第一数值;否则将所述第一状态数据的值设置为第二数值,所述第二数值为非第一状态所属数值范围内的数值。
其中,上述第一状态可以为持握状态,非第一状态可以为平放状态;第二状态可以为倒置状态,非第二状态为非倒置状态;第三状态为裤兜状态。上述第一数值可以是第一状态所属数值范围内的任一数值,第二数值可以是非第一状态所属数值范围内的任一数值。示例性地,还可以构建第一夹角与第一状态所属数值范围内数值的映射关系、第一夹角与非第一状态所属数值范围内数值的映射关系,从而可根据映射关系来确定第一数值或第二数值。可以理解的是,第一状态所属数值范围与非第一状态所属数值范围并不重合。
在实际应用中,可通过判断第一电子设备是否处于持握状态,来确定第一电子设备当前是否被用户使用。通过使用加速度传感器数据来计算第一夹角,可以使得第一电子设备在横屏或竖屏的状态下,只要第一电子设备与水平面的夹角超过预设夹角时,将第一状态数据的值设置为第一数值,可使得决策模块将第一电子设备判定为处于持握状态。
示例性地,如图5b所示,所述持握状态包括但不限于以下几种场景,在如图5b所示的每种场景,第一电子设备都会检测到处于持握状态。
示例性地，上述使用加速度传感器数据计算手机与水平面夹角的公式可表示为：

夹角 θ = arcsin( √(acc_x² + acc_y²) / g )

其中，所述acc_x和acc_y分别为x轴和y轴的重力加速度分量；g为重力加速度，取值为9.81。
示例性地,所述预设夹角可以由系统配置,例如15度;也可以根据用户的使用习惯进行设置。当然本申请对于其他设置预设夹角的方式也不进行限制。
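As an illustration of the tilt computation above, the following Python sketch derives the device-to-horizontal angle from the x- and y-axis gravity components and compares it with the example 15-degree preset angle. The arcsin form of the formula is reconstructed from the variables defined in the text, and the function names are illustrative assumptions rather than the patent's reference implementation.

```python
import math

def tilt_angle_deg(acc_x: float, acc_y: float, g: float = 9.81) -> float:
    """Angle between the device and the horizontal plane, from the x- and
    y-axis gravity components reported by the accelerometer."""
    # Clamp so sensor noise slightly above g cannot take asin out of range.
    ratio = min(math.sqrt(acc_x ** 2 + acc_y ** 2) / g, 1.0)
    return math.degrees(math.asin(ratio))

def in_first_state(acc_x: float, acc_y: float, preset_deg: float = 15.0) -> bool:
    """First (held) state when the tilt reaches the preset angle,
    regardless of portrait or landscape orientation."""
    return tilt_angle_deg(acc_x, acc_y) >= preset_deg
```

Because only the magnitude of the horizontal gravity components is used, the check behaves identically in portrait and landscape, matching the behaviour described above.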
进一步地,在一些场景中,当第一夹角小于预设夹角时,第一电子设备可能仍处于持握状态,例如,用户将第一电子设备放于掌心使用。因此,在第一夹角小于预设夹角时,第一电子设备可以启动抖动检测。通过抖动检测,可以将处于持握状态但第一夹角小于预设夹角的场景下的第一电子设备检测出来。
可选的,所述根据所述角速度传感器数据的值判断所述第一电子设备是否抖动,包括:
根据所述角速度传感器数据的值计算第一角速度,所述第一角速度为所述第一电子设备与所述水平面的角速度;若所述第一角速度小于第一预设角速度且大于第二预设角速度,将所述第一角速度存储于缓存中,否则,清空所述缓存;在所述缓存中的第一角速度的数量大于或等于统计窗口时,计算在采样周期内统计窗口内的第一角速度的平均值;若所述平均值大于或等于第三预设角速度,或者所述第一角速度大于第一预设角速度,确定所述第一电子设备抖动,否则确定所述第一电子设备静止。
示例性地，所述第一角速度为第一电子设备的角速度向量的模，该角速度向量的模计算公式可表示为：

|ω| = √(axisX² + axisY² + axisZ²)

其中，所述axisX、axisY、axisZ分别为角速度传感器x轴、y轴、z轴的角速度。
示例性地,上述第一预设角速度、第二预设角速度、第三预设角速度、采用周期、统计窗口可以由系统设置,例如,第一预设角速度设置为0.030rad/s、第二预设角速度设置为0.002rad/s、第三预设角速度设置为0.010rad/s、采样周期设置为4、统计窗口设置为20;也可以根据具体实际应用场景进行设置,本申请实施例对此不做限定。
在实际应用中,角速度传感器(例如陀螺仪)上报角速度数据的周期非常短(2.5ms一包),但是在陀螺仪测量的角速度数据的波动可能会持续一段时间,因此通过拉大统计窗口和间隔采样来过滤掉这种波动。
具体地，状态感知模块接收到陀螺仪上报的角速度瞬时值时，计算上报的角速度向量的模；若角速度向量的模大于第一预设角速度v1或小于第二预设角速度v2，则状态感知模块清空存储器buffer中存储的角速度向量的模，并在角速度向量的模大于v1时，将第一电子设备判定为抖动；在角速度向量的模小于v2时，将第一电子设备判定为静止。若角速度向量的模小于第一预设角速度v1且大于第二预设角速度v2，将该角速度向量的模输入buffer中进行缓存；若buffer中存储的角速度向量的模的数量小于统计窗口Wstatistics_window，则状态感知模块对下一个接收的角速度瞬时值进行上述计算；若buffer中存储的角速度向量的模的数量大于或等于Wstatistics_window，使用采样周期Isamping_interval间隔采样统计窗口中的角速度，采样的角速度数量为Wstatistics_window/Isamping_interval个，然后计算该Wstatistics_window/Isamping_interval个角速度的平均角速度值Avg，从而减少角速度数据的波动。若平均角速度值Avg小于第三预设角速度，则判定第一电子设备静止；若平均角速度值Avg大于或等于第三预设角速度，则判定第一电子设备抖动。
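The buffered shake-detection procedure above can be sketched as follows. The thresholds and window sizes use the example values from the text (v1 = 0.030 rad/s, v2 = 0.002 rad/s, v3 = 0.010 rad/s, sampling period 4, statistics window 20); the class and method names are illustrative assumptions.

```python
import math
from collections import deque

V1, V2, V3 = 0.030, 0.002, 0.010   # example preset angular velocities (rad/s)
WINDOW, INTERVAL = 20, 4           # example statistics window and sampling period

class ShakeDetector:
    """Buffered shake detection over gyroscope magnitudes."""

    def __init__(self):
        self.buf = deque(maxlen=WINDOW)

    def feed(self, ax: float, ay: float, az: float):
        """Process one gyroscope sample; return 'shaking', 'static',
        or None while the statistics window is still filling."""
        mag = math.sqrt(ax * ax + ay * ay + az * az)
        if mag > V1:             # clearly moving: report shaking, reset buffer
            self.buf.clear()
            return "shaking"
        if mag < V2:             # clearly still: report static, reset buffer
            self.buf.clear()
            return "static"
        self.buf.append(mag)     # ambiguous magnitudes are buffered
        if len(self.buf) < WINDOW:
            return None
        # Sample every INTERVAL-th buffered value to filter short fluctuations.
        samples = list(self.buf)[::INTERVAL]
        avg = sum(samples) / len(samples)
        return "shaking" if avg >= V3 else "static"
```

Interval sampling over the widened window averages out the short-lived fluctuations that the 2.5 ms gyroscope reporting period would otherwise amplify.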
可选的,所述根据所述Z轴加速度数据确定所述第二状态数据,包括:若所述Z轴加速度数据的值小于加速度阈值,则将所述第二状态数据的值设置为第三数值,所述第三数值为所述第二状态所属数值范围内的数值,否则将所述第二状态数据的值设置为第四数值,所述第四数值为非第二状态所属数值范围内的数值。
其中,状态感知模块根据加速度传感器上报的z轴上的加速度来设置第二状态数据,从而检测第一电子设备是否处于倒置状态。具体为,若z轴加速度数据的值小于加速度阈值,则将第二状态数据的值设置为第三数值,进而决策模块可将第一电子设备判定为处于倒置状态;若z轴加速度数据的值大于或等于加速度阈值,则将第二状态数据的值设置为第四数值,使得决策模块可将第一电子设备判定为处于非倒置状态。
示例性地，上述加速度阈值可以设置为-9m/s²。将电子设备倒置时，Z轴的重力加速度分量方向相反，Z轴的加速度会小于加速度阈值。
需要说明的是，上述第三数值可以是第二状态所属数值范围内的任一数值。示例性地，所述Z轴加速度数据的值还可以与第二状态所属数值范围内的数值呈映射关系，例如，第二状态所属数值范围为[-1,-2]，当Z轴加速度处于-9.1m/s²至-9.4m/s²范围内时，可将第二状态数据设置为-1；当Z轴加速度处于-9.5m/s²至-9.8m/s²范围内时，可将第二状态数据设置为-2。上述第四数值可以是非第二状态所属数值范围内的任一数值。示例性地，所述Z轴加速度数据的值还可以与非第二状态所属数值范围内的数值呈映射关系，例如，非第二状态所属数值范围为[1,4]，当Z轴加速度处于-9.0m/s²至-5.0m/s²范围内时，可将第二状态数据设置为1；当Z轴加速度处于-4.9m/s²至-1.0m/s²范围内时，可将第二状态数据设置为2；当Z轴加速度处于-0.9m/s²至4.0m/s²范围内时，可将第二状态数据设置为3；当Z轴加速度处于4.1m/s²至9.8m/s²范围内时，可将第二状态数据设置为4。当然，本申请实施例也不限制其他将第三数值与第二状态所属数值范围内的数值、以及第四数值与非第二状态所属数值范围内的数值建立映射的方法。
可选的,所述根据所述距离传感器数据和光线传感器数据确定所述第三状态数据,包括:若所述距离传感器数据的值小于遮挡距离阈值,且所述光线传感器数据的值小于光照强度阈值,则将所述第三状态数据的值设置为第五数值,所述第五数值为所述第三状态所属数值范围内的数值;否则,将所述第三状态数据的值设置为第六数值,所述第六数值为非第三状态所属数值范围内的数值。
其中,所述第三状态为裤兜状态,当第一电子设备处于裤兜状态时,表示用户当前使用第一电子设备的意愿很小,即用户选择唤醒第一电子设备上的语音助手的概率很小。由于裤兜场景中的光线暗同时又有遮挡,可以通过距离传感器和光线传感器来设置第三状态数据的值,进而使得决策模块检测第一电子设备是否处于裤兜状态。具体为当距离传感器上报的数据值小于或等于遮挡距离阈值且光线传感器上报的数据值小于或等于光照强度阈值时,将第三状态数据的值设置为第五数值,决策模块根据第五数值可以判定第一电子设备处于裤兜状态,否则将第三状态数据的值设置为第六数值,决策模块根据第六数值判定第一电子设备处于非裤兜状态。
示例性地,所述遮挡距离阈值可以设置为0、0.1cm、0.2cm等等,所述光照强度阈值可以设置为10勒克斯。
示例性地,所述遮挡距离阈值和光照强度阈值也可以根据用户的习惯进行设置,或者根据具体实际场景进行设置,本申请实施例对此不做限定。
可选的,所述非第三状态包括第四状态、第五状态和第六状态。
示例性地,所述第四状态可以是遮挡状态,所述第五状态可以是夜晚状态,第六状态可以是白天状态。
可选的,所述将所述第三状态数据的值设置为第六数值,包括:若所述距离传感器数据的值小于遮挡距离阈值,且所述光线传感器数据的值大于或等于光照强度阈值,则将所述第三状态数据的值设置为第七数值,所述第七数值为所述第四状态所属数值范围内的数值;若所述距离传感器数据的值大于或等于遮挡距离阈值,且所述光线传感器数据的值小于光照强度阈值,则将所述第三状态数据的值设置为第八数值,所述第八数值为所述第五状态所属数值范围内的数值;若所述距离传感器数据的值大于或等于遮挡距离阈值,且所述光线传感器数据的值大于或等于光照强度阈值,则将所述第三状态数据的值设置为第九数值,所述第九数值为所述第六状态所属数值范围内的数值。
其中,上述非第三状态所属数值范围可以是第四状态所属数值范围、第五状态所属数值范围和第六状态所属数值范围的并集,即第六数值可以是第四状态所属数值范围、第五状态所属数值范围和第六状态所属数值范围中任一数值。
在实际应用中,根据距离传感器可以检测电子设备与目标物体之间的距离,因此通过距离传感器数据的值还可以检测出第一电子设备未放置于裤兜但又有遮挡物的遮挡状态,例如,将手机放于耳侧(如接电听话)的场景。根据光线传感器可以检测出电子设备当前所处环境的光亮程度,因此通过光线传感器数据的值可以检测出第一电子设备是处于黑夜状态还是白天状态。
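A minimal sketch of the threshold checks described above, assuming the example thresholds from the text (-9 m/s² for inversion, a 0.1 cm occlusion distance, 10 lux light intensity). The string labels returned here stand in for the numeric value ranges, which the text leaves implementation-defined.

```python
def is_inverted(acc_z: float, threshold: float = -9.0) -> bool:
    """Second (inverted) state: the Z-axis gravity component opposes the
    normal direction, so the reading drops below the threshold."""
    return acc_z < threshold

def third_state(distance_cm: float, lux: float,
                occlusion_cm: float = 0.1, light_lux: float = 10.0) -> str:
    """Classify the pocket-related states from distance and light sensors."""
    near = distance_cm < occlusion_cm   # something is covering the sensor
    dark = lux < light_lux              # low ambient light
    if near and dark:
        return "pocket"     # third state: occluded and dark
    if near:
        return "occluded"   # fourth state: covered but lit (e.g. beside the ear)
    if dark:
        return "night"      # fifth state
    return "daytime"        # sixth state
```

The four branches correspond one-to-one to the fourth, fifth, and sixth states enumerated above, with the combined near-and-dark case identifying the trouser-pocket (third) state.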
可选的,所述方法还包括:向所述至少一个第二电子设备发送所述第一电子设备的状态感知数据。
其中,当第一电子设备中的状态感知模块根据传感器数据确定出状态感知数据后,可以将该状态感知数据发送给周围的第二电子设备,以使得该第二电子设备根据周围的多设备的状态感知数据进行决策来选择目标唤醒设备。
示例性地,第一电子设备可在通信模块内设置一个定时器,当第一电子设备接收的语音数据是唤醒词时,开启该定时器,然后在该定时器内广播状态感知模块计算出的状态感知数据。
在本申请实施例中,电子设备将状态感知数据广播以进行多设备状态的同步,可以减少因建立连接带来的时间开销;同时,传感器数据的实时处理,可不需要收集多次传感器数据再做计算,减少了收集传感器数据的时间。这样给用户带来新增功能的同时可以很好地控制语音助手的唤醒时间。
S530、接收来自至少一个第二电子设备的状态感知数据。
在本实施例中，当第一电子设备接收的语音数据是唤醒词时，第一电子设备可使用通信模块监听周围第二电子设备发送的状态感知数据。示例性地，第一电子设备可在通信模块内设置另一个定时器，当第一电子设备接收的语音数据是唤醒词时，开启该定时器，然后在该定时器内接收第二电子设备发送的状态感知数据。示例性地，第一电子设备也可在通信模块内只设置一个定时器，用于开启广播和接收状态感知数据，定时器超时后自动停止广播和接收状态感知数据。
示例性地,上述定时器可由系统设置,例如设置为300ms超时;也可以根据具体实际场景进行设置,本申请实施例对此不做限定。
在本申请实施例中,通过采用广播和接收的方式来同步多设备的状态感知数据,同时采用实时传感器数据处理的方式,可以有效地降低决策时延,提高用户体验。
示例性地，通信模块可封装多种近场通信方式，例如蓝牙(Bluetooth,BT)，低功耗蓝牙(Bluetooth Low Energy,BLE)，WIFI直连(direct)，无线局域网(Wireless Local Area Network,WLAN)等。调用通信模块时可通过参数或者配置文件指定具体选择哪一种通信方式。以BLE为例，BLE广播包会传输包括通用唯一识别码(Universally Unique Identifier,UUID)，MAC地址，蓝牙名字等一系列数据，但是仍然有一部分剩余空间可用于传输自定义数据，通过计算可知每一条广播包剩余空间最多可以放23个byte，通过将状态感知模块计算出的状态感知数据放置于剩余空间中发送给第二电子设备，可以节省BLE GATT建连的时间，进而减少唤醒所需延迟。
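To illustrate fitting the state perception data into the roughly 23 spare bytes of a single BLE advertisement, the sketch below packs the three boolean states and a device ID into five bytes. The bit layout and field widths are assumptions for illustration, not a format defined by the text.

```python
import struct

def pack_state(is_hold: bool, is_upside_down: bool,
               is_in_pocket: bool, device_id: int) -> bytes:
    """Pack {isHoldInHand; isUpsideDown; isInPocket; deviceId} into 5 bytes."""
    flags = (int(is_hold) << 2) | (int(is_upside_down) << 1) | int(is_in_pocket)
    return struct.pack("<BI", flags, device_id)  # 1 flag byte + 4-byte ID

def unpack_state(payload: bytes):
    """Inverse of pack_state: recover the three booleans and the device ID."""
    flags, device_id = struct.unpack("<BI", payload)
    return bool(flags & 4), bool(flags & 2), bool(flags & 1), device_id
```

Keeping the payload this small leaves most of the spare advertisement space free, so the state can be broadcast without establishing a GATT connection, which is the latency saving described above.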
在本申请实施例中,若第一电子设备与第二电子设备之间已经建立了连接,则第一电子设备和/或第二电子设备可以复用这条已经建立好的连接,直接进行数据包发送与接收,这样面向已连接的通信方式更可靠,也更安全,同时时延也更低。
需要说明的是,本申请实施例对S520和S530的执行顺序不做限定,即第一电子设备可以同时执行S520和S530;也可先执行S530,在执行S520;还可以先执行S520,在执行S530。
S540、根据所述第一电子设备的状态感知数据和所述至少一个第二电子设备的状态感知数据,确定目标唤醒设备。
在本实施例中,通信模块接收到第二电子设备发送的状态感知数据后,可将第二电子设备的状态感知数据和第一电子设备的状态感知数据一并发送给决策模块,决策模块根据多个设备的状态感知模块确定出目标唤醒设备。
可选的,所述根据所述第一电子设备的状态感知数据和所述至少一个第二电子设备的状态感知数据,确定目标唤醒设备,包括:
根据所述第二状态数据，确定第一候选设备的数量，所述第一候选设备为处于非第二状态的所述第一电子设备和/或所述第二电子设备；若所述第一候选设备的数量为1，则将所述第一候选设备确定为所述目标唤醒设备；若所述第一候选设备的数量大于1，则根据所述第三状态数据，从所述第一候选设备中确定第二候选设备的数量，所述第二候选设备为处于非第三状态的所述第一候选设备；若所述第二候选设备的数量为1，则将所述第二候选设备确定为所述目标唤醒设备；若所述第二候选设备的数量大于1，则根据所述第一状态数据，从所述第二候选设备中确定第三候选设备的数量，所述第三候选设备为处于第一状态的所述第二候选设备；若所述第三候选设备的数量为1，则将所述第三候选设备确定为所述目标唤醒设备；若所述第三候选设备的数量大于1，则根据所述设备标识从所述第三候选设备中确定所述目标唤醒设备。
具体地,状态感知数据的数据结构可以表示为{isHoldInHand;isUpsideDown;isInPocket;deviceId}。其中isHoldInHand为第一状态数据的值,isUpsideDown为第二状态数据的值,isInPocket为第三状态数据的值,deviceId为设备ID。
在本实施例中,采用链式决策方法,即根据状态感知数据依次进行选择,当选出了唯一的目标唤醒设备时就立即退出,不会往后继续传递,如果传递到最后一条规则,仍没有选择出目标唤醒设备时,则通过设备ID来选择目标唤醒设备。
具体地，首先，决策模块根据状态感知数据中的第二状态数据，从多个设备中选择出处于非第二状态的第一电子设备和/或第二电子设备（即处于非倒置状态下的电子设备），即当电子设备的第二状态数据的值处于第二状态所属数值范围内，则判定该电子设备处于第二状态。若只有一个电子设备处于非第二状态，则将该电子设备确定为目标唤醒设备；若有多个电子设备处于非第二状态下，则从该多个处于非第二状态下的电子设备中选择处于非第三状态下的电子设备（处于第四状态、第五状态和第六状态的电子设备），具体为若电子设备的第三状态数据的值处于非第三状态所属数值范围内，则判定该电子设备处于非第三状态；若没有一个电子设备处于非第二状态（都处于倒置状态），则从多个设备中选择出非第三状态下的电子设备。若只有一个电子设备处于非第三状态，则直接将该电子设备确定为目标唤醒设备；若有多个电子设备处于非第三状态，则从该多个处于非第三状态下的电子设备中选择处于第一状态下的电子设备，具体为若电子设备的第一状态数据的值处于第一状态所属数值范围内，则判定该电子设备处于第一状态；若没有一个电子设备处于非第三状态（都处于裤兜状态），则从多个设备或多个处于非第二状态下的电子设备中选择出第一状态下的电子设备。若只有一个电子设备处于第一状态，则直接将该电子设备确定为目标唤醒设备；若有多个或没有一个电子设备处于第一状态，则根据设备ID选择目标唤醒设备。
其中,通过设备ID选择唯一的设备的方式包括但不局限于选择设备ID最小或者设备ID最大或者通过随机算法随机选择一个设备ID。
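The chained decision above can be sketched as follows: filter by non-inverted, then non-pocket, then held, stopping as soon as a unique device remains, and falling back to the smallest device ID (one of the tie-breaking options just mentioned). The dict keys follow the {isHoldInHand; isUpsideDown; isInPocket; deviceId} structure, while the function itself is an illustrative assumption.

```python
def pick_target(devices):
    """Select the target wake-up device from a list of state perception
    dicts, applying the chained rules in priority order."""
    pool = devices
    rules = (
        lambda d: not d["isUpsideDown"],  # prefer non-inverted devices
        lambda d: not d["isInPocket"],    # then devices not in a pocket
        lambda d: d["isHoldInHand"],      # then devices being held
    )
    for rule in rules:
        matched = [d for d in pool if rule(d)]
        if len(matched) == 1:             # a unique device: stop immediately
            return matched[0]
        if matched:                       # narrow the pool, try the next rule
            pool = matched
        # if nothing matched, keep the current pool for the next rule
    return min(pool, key=lambda d: d["deviceId"])  # tie-break by smallest ID
```

The early return implements the chain's "exit as soon as a unique device is chosen" behaviour, and the final line is only reached when every rule left a tie or matched nothing.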
在本申请实施例中,电子设备根据多个传感器检测的数据来判断电子设备所处的多种场景,并根据场景优先级选择唤醒最合适应答的设备,可以让电子设备更好地感知用户意图,进而让使用者感受到更智能的产品体验。
S550、若所述目标唤醒设备为所述第一电子设备,唤醒语音助手。
其中，决策模块将选择出的目标唤醒设备的状态感知数据结构传递给唤醒模块，唤醒模块通过deviceId来判断是否是自己，如果不是则不拉起语音助手，反之则拉起语音助手进行语音应答。
可以看出，本申请实施例提出的语音助手唤醒方法，当第一电子设备接收到用户的语音数据时，获取传感器数据，根据传感器数据确定状态感知数据，然后接收来自至少一个第二电子设备的状态感知数据；根据第一电子设备的状态感知数据和至少一个第二电子设备的状态感知数据，确定目标唤醒设备，在目标唤醒设备为第一电子设备时，唤醒语音助手。本申请根据多个电子设备的状态感知数据确定用户需要唤醒的电子设备上的语音助手，可以让电子设备更好地感知用户意图，从而可以从多个电子设备中唤醒用户所需的电子设备上的语音助手，提高用户体验。
可以理解的是,电子设备为了实现上述功能,其包含了执行各个功能相应的硬件和/或软件模块。结合本文中所公开的实施例描述的各示例的算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。本领域技术人员可以结合实施例对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本实施例可以根据上述方法示例对电子设备进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块可以采用硬件的形式实现。需要说明的是,本实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
在采用对应各个功能划分各个功能模块的情况下,图6示出了语音助手唤醒装置的结构示意图,如图6所示,该语音助手唤醒装置600应用于电子设备,该语音助手唤醒装置600可以包括:状态感知模块601、通信模块602、决策模块603和唤醒模块604。
其中,状态感知模块601可以用于支持电子设备执行上S510、S520等,和/或用于本文所描述的技术的其他过程。
通信模块602可以用于支持电子设备执行上述S530等,和/或用于本文所描述的技术的其他过程。
决策模块603可以用于支持电子设备执行上述S540等,和/或用于本文所描述的技术的其他过程。
唤醒模块604可以用于支持电子设备执行上述S550等,和/或用于本文所描述的技术的其他过程。
需要说明的是,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
本实施例提供的电子设备,用于执行上述语音助手唤醒方法,因此可以达到与上述实 现方法相同的效果。
在采用集成的单元的情况下,电子设备可以包括处理模块、存储模块和通信模块。其中,处理模块可以用于对电子设备的动作进行控制管理,例如,可以用于支持电子设备执行上述状态感知模块601、通信模块602、决策模块603和唤醒模块604执行的步骤。存储模块可以用于支持电子设备执行存储程序代码和数据等。通信模块,可以用于支持电子设备与其他设备的通信。
其中,处理模块可以是处理器或控制器。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,数字信号处理(digital signal processing,DSP)和微处理器的组合等等。存储模块可以是存储器。通信模块具体可以为射频电路、蓝牙芯片、Wi-Fi芯片等与其他电子设备交互的设备。
在一个实施例中,当处理模块为处理器,存储模块为存储器时,本实施例所涉及的电子设备可以为具有图1所示结构的设备。
本实施例还提供一种计算机存储介质,该计算机存储介质中存储有计算机指令,当该计算机指令在电子设备上运行时,使得电子设备执行上述相关方法步骤实现上述实施例中的语音助手唤醒方法。
本实施例还提供了一种计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行上述相关步骤,以实现上述实施例中的语音助手唤醒方法。
另外,本申请的实施例还提供一种装置,这个装置具体可以是芯片,组件或模块,该装置可包括相连的处理器和存储器;其中,存储器用于存储计算机执行指令,当装置运行时,处理器可执行存储器存储的计算机执行指令,以使芯片执行上述各方法实施例中的语音助手唤醒方法。
其中,本实施例提供的电子设备、计算机存储介质、计算机程序产品或芯片均用于执行上文所提供的对应的方法,因此,其所能达到的有益效果可参考上文所提供的对应的方法中的有益效果,此处不再赘述。
通过以上实施方式的描述,所属领域的技术人员可以了解到,为描述的方便和简洁,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。
在本申请所提供的几个实施例中,应该理解到,所揭露的装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,模块或单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个装置,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是一个物理单元或多个物理单元,即可以位于一个地方,或者也可以分布到多个不同地方。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来, 该软件产品存储在一个存储介质中,包括若干指令用以使得一个设备(可以是单片机,芯片等)或处理器(processor)执行本申请各个实施例方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上内容,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。

Claims (20)

  1. 一种语音助手唤醒方法,应用于第一电子设备,所述方法包括:
    当接收到用户的语音数据时,获取传感器数据;
    根据所述传感器数据确定状态感知数据;
    接收来自至少一个第二电子设备的状态感知数据;
    根据所述第一电子设备的状态感知数据和所述至少一个第二电子设备的状态感知数据,确定目标唤醒设备;
    若所述目标唤醒设备为所述第一电子设备,唤醒语音助手。
  2. 根据权利要求1所述的方法,其中,所述传感器数据包括以下至少一项:加速度传感器数据、角速度传感器数据、Z轴加速度数据、距离传感器数据和光线传感器数据,所述状态感知数据包括:第一状态数据、第二状态数据和第三状态数据;
    所述根据所述传感器数据确定状态感知数据,包括:
    根据所述加速度传感器数据和所述角速度传感器数据确定所述第一状态数据;
    根据所述Z轴加速度数据确定所述第二状态数据;
    根据所述距离传感器数据和光线传感器数据确定所述第三状态数据。
  3. 根据权利要求2所述的方法,其中,所述状态感知数据还包括设备标识;
    所述根据所述第一电子设备的状态感知数据和所述至少一个第二电子设备的状态感知数据,确定目标唤醒设备,包括:
    根据所述第二状态数据,确定第一候选设备的数量,所述第一候选设备为处于非第二状态的所述第一电子设备和/或所述第二电子设备;
    若所述第一候选设备的数量为1,则将所述第一候选设备确定为所述目标唤醒设备,若所述第一候选设备的数量大于1,则根据所述第三状态数据,从所述第一候选设备中确定第二候选设备的数量,所述第二候选设备为处于非第三状态的所述第一候选设备;
    若所述第二候选设备的数量为1,则将所述第二候选设备确定为所述目标唤醒设备,若所述第二候选设备的数量大于1,则根据所述第一状态数据,从所述第二候选设备中确定第三候选设备的数量,所述第三候选设备为处于第一状态的所述第二候选设备;
    若所述第三候选设备的数量为1,则将所述第三候选设备确定为所述目标唤醒设备,若所述第二候选设备的数量大于1,则根据所述设备标识从所述第三候选设备中确定所述目标唤醒设备。
  4. The method according to claim 2 or 3, wherein the determining the first state data according to the acceleration sensor data and the angular velocity sensor data comprises:
    calculating a first angle according to a value of the acceleration sensor data, the first angle being an angle between the first electronic device and a horizontal plane;
    if the first angle is greater than or equal to a preset angle, setting a value of the first state data to a first value, the first value being a value within a value range to which the first state belongs;
    if the first angle is less than the preset angle, determining, according to a value of the angular velocity sensor data, whether the first electronic device is shaking; and
    if the first electronic device is shaking, setting the value of the first state data to the first value; otherwise, setting the value of the first state data to a second value, the second value being a value within a value range to which a non-first state belongs.
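For illustration, one common way to derive the claimed first angle (device versus horizontal plane) from 3-axis accelerometer readings is shown below. The claim does not prescribe a formula, so this particular formulation is an assumption.

```python
import math

def first_angle_deg(ax, ay, az):
    """Angle between the device and the horizontal plane, in degrees,
    derived from the gravity components reported by the accelerometer.
    (One common formulation; the claim does not fix a formula.)"""
    g_xy = math.hypot(ax, ay)                 # gravity in the device's XY plane
    return math.degrees(math.atan2(g_xy, abs(az)))
```

With the device lying flat, gravity falls entirely on the Z axis and the angle is 0 degrees; with the device upright, it is 90 degrees.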
  5. The method according to claim 4, wherein the determining, according to a value of the angular velocity sensor data, whether the first electronic device is shaking comprises:
    calculating a first angular velocity according to the value of the angular velocity sensor data, the first angular velocity being an angular velocity of the first electronic device relative to the horizontal plane;
    if the first angular velocity is less than a first preset angular velocity and greater than a second preset angular velocity, storing the first angular velocity in a buffer; otherwise, clearing the buffer;
    when the number of first angular velocities in the buffer is greater than or equal to a statistical window, calculating an average of the first angular velocities within the statistical window in a sampling period; and
    if the average is greater than or equal to a third preset angular velocity, or the first angular velocity is greater than the first preset angular velocity, determining that the first electronic device is shaking; otherwise, determining that the first electronic device is stationary.
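The shake test of claim 5 can be sketched as follows. The threshold values, the window size, and the window-reset policy are assumptions for illustration; the claim leaves the concrete figures unspecified.

```python
# Sketch of the shake test in claim 5. W1/W2/W3 are the first/second/third
# preset angular velocities; their values and the window size are assumed.
from collections import deque

W1 = 5.0    # first preset angular velocity (assumed, e.g. rad/s)
W2 = 0.2    # second preset angular velocity
W3 = 1.0    # third preset angular velocity
WINDOW = 4  # statistical window (number of buffered samples)

class ShakeDetector:
    def __init__(self):
        self.buf = deque()

    def update(self, w):
        """w: current first angular velocity; returns True if shaking."""
        if W2 < w < W1:
            self.buf.append(w)   # sample within the band: buffer it
        else:
            self.buf.clear()     # outside the band: clear the buffer
        shaking = w > W1         # a single large spike counts as shaking
        if len(self.buf) >= WINDOW:
            avg = sum(self.buf) / len(self.buf)
            shaking = shaking or avg >= W3
            self.buf.clear()     # start a new window (one possible policy)
        return shaking
```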
  6. The method according to claim 2 or 3, wherein the determining the second state data according to the Z-axis acceleration data comprises:
    if a value of the Z-axis acceleration data is less than an acceleration threshold, setting a value of the second state data to a third value, the third value being a value within a value range to which the second state belongs; otherwise, setting the value of the second state data to a fourth value, the fourth value being a value within a value range to which a non-second state belongs.
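Claim 6 reduces to a single threshold test on the Z-axis reading (a negative Z value typically indicates a screen-down device). A minimal sketch, in which the threshold and the numeric state codes are assumptions:

```python
# Minimal sketch of claim 6. The threshold (0, i.e. gravity pointing away
# from the screen face) and the numeric codes are illustrative assumptions.

ACCEL_THRESHOLD = 0.0
THIRD_VALUE = 3    # stands for the second state (e.g. screen facing down)
FOURTH_VALUE = 4   # stands for the non-second state

def second_state_value(z_accel):
    return THIRD_VALUE if z_accel < ACCEL_THRESHOLD else FOURTH_VALUE
```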
  7. The method according to claim 2 or 3, wherein the determining the third state data according to the distance sensor data and the light sensor data comprises:
    if a value of the distance sensor data is less than an occlusion distance threshold and a value of the light sensor data is less than a light intensity threshold, setting a value of the third state data to a fifth value, the fifth value being a value within a value range to which the third state belongs; otherwise, setting the value of the third state data to a sixth value, the sixth value being a value within a value range to which a non-third state belongs.
  8. The method according to claim 7, wherein the non-third state comprises a fourth state, a fifth state, and a sixth state;
    the setting the value of the third state data to a sixth value comprises:
    if the value of the distance sensor data is less than the occlusion distance threshold and the value of the light sensor data is greater than or equal to the light intensity threshold, setting the value of the third state data to a seventh value, the seventh value being a value within a value range to which the fourth state belongs;
    if the value of the distance sensor data is greater than or equal to the occlusion distance threshold and the value of the light sensor data is less than the light intensity threshold, setting the value of the third state data to an eighth value, the eighth value being a value within a value range to which the fifth state belongs; and
    if the value of the distance sensor data is greater than or equal to the occlusion distance threshold and the value of the light sensor data is greater than or equal to the light intensity threshold, setting the value of the third state data to a ninth value, the ninth value being a value within a value range to which the sixth state belongs.
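Claims 7 and 8 together partition the (distance, light) readings into four states via two threshold tests. A compact sketch, in which the threshold values, the numeric codes, and the state interpretations in the comments are assumptions for illustration:

```python
# Sketch of the four-way classification in claims 7-8. Threshold values
# and numeric codes are illustrative assumptions, not figures from the patent.

DIST_THRESHOLD = 2.0    # occlusion distance threshold (cm, assumed)
LIGHT_THRESHOLD = 10.0  # light intensity threshold (lux, assumed)

# fifth/seventh/eighth/ninth values -> third/fourth/fifth/sixth states
V5, V7, V8, V9 = 5, 7, 8, 9

def third_state_value(distance, light):
    near = distance < DIST_THRESHOLD    # proximity sensor sees an obstacle
    dark = light < LIGHT_THRESHOLD      # ambient light below threshold
    if near and dark:
        return V5   # third state, e.g. in a pocket or bag
    if near:
        return V7   # fourth state: covered but lit
    if dark:
        return V8   # fifth state: uncovered in the dark
    return V9       # sixth state: uncovered and lit
```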
  9. The method according to any one of claims 1-8, wherein the method further comprises:
    sending the state awareness data of the first electronic device to the at least one second electronic device.
  10. A voice assistant wakeup apparatus, applied to a first electronic device, the apparatus comprising:
    a state awareness module, configured to acquire sensor data when voice data of a user is received;
    the state awareness module being further configured to determine state awareness data according to the sensor data;
    a communication module, configured to receive state awareness data from at least one second electronic device;
    a decision module, configured to determine a target wakeup device according to the state awareness data of the first electronic device and the state awareness data of the at least one second electronic device; and
    a wakeup module, configured to wake up a voice assistant if the target wakeup device is the first electronic device.
  11. The apparatus according to claim 10, wherein the sensor data comprises at least one of: acceleration sensor data, angular velocity sensor data, Z-axis acceleration data, distance sensor data, and light sensor data, and the state awareness data comprises first state data, second state data, and third state data;
    in terms of determining state awareness data according to the sensor data, the state awareness module is specifically configured to:
    determine the first state data according to the acceleration sensor data and the angular velocity sensor data;
    determine the second state data according to the Z-axis acceleration data; and
    determine the third state data according to the distance sensor data and the light sensor data.
  12. The apparatus according to claim 11, wherein the state awareness data further comprises a device identifier;
    in terms of determining a target wakeup device according to the state awareness data of the first electronic device and the state awareness data of the at least one second electronic device, the decision module is specifically configured to:
    determine, according to the second state data, the number of first candidate devices, the first candidate devices being the first electronic device and/or the second electronic devices that are not in a second state;
    if the number of first candidate devices is 1, determine the first candidate device as the target wakeup device; if the number of first candidate devices is greater than 1, determine, according to the third state data, the number of second candidate devices from among the first candidate devices, the second candidate devices being the first candidate devices that are not in a third state;
    if the number of second candidate devices is 1, determine the second candidate device as the target wakeup device; if the number of second candidate devices is greater than 1, determine, according to the first state data, the number of third candidate devices from among the second candidate devices, the third candidate devices being the second candidate devices that are in a first state; and
    if the number of third candidate devices is 1, determine the third candidate device as the target wakeup device; if the number of third candidate devices is greater than 1, determine the target wakeup device from among the third candidate devices according to the device identifiers.
  13. The apparatus according to claim 11 or 12, wherein, in terms of determining the first state data according to the acceleration sensor data and the angular velocity sensor data, the state awareness module is specifically configured to:
    calculate a first angle according to a value of the acceleration sensor data, the first angle being an angle between the first electronic device and a horizontal plane;
    if the first angle is greater than or equal to a preset angle, set a value of the first state data to a first value, the first value being a value within a value range to which the first state belongs;
    if the first angle is less than the preset angle, determine, according to a value of the angular velocity sensor data, whether the first electronic device is shaking; and
    if the first electronic device is shaking, set the value of the first state data to the first value; otherwise, set the value of the first state data to a second value, the second value being a value within a value range to which a non-first state belongs.
  14. The apparatus according to claim 13, wherein, in terms of determining, according to the value of the angular velocity sensor data, whether the first electronic device is shaking, the state awareness module is specifically configured to:
    calculate a first angular velocity according to the value of the angular velocity sensor data, the first angular velocity being an angular velocity of the first electronic device relative to the horizontal plane;
    if the first angular velocity is less than a first preset angular velocity and greater than a second preset angular velocity, store the first angular velocity in a buffer; otherwise, clear the buffer;
    when the number of first angular velocities in the buffer is greater than or equal to a statistical window, calculate an average of the first angular velocities within the statistical window in a sampling period; and
    if the average is greater than or equal to a third preset angular velocity, or the first angular velocity is greater than the first preset angular velocity, determine that the first electronic device is shaking; otherwise, determine that the first electronic device is stationary.
  15. The apparatus according to claim 11 or 12, wherein, in terms of determining the second state data according to the Z-axis acceleration data, the state awareness module is specifically configured to:
    if a value of the Z-axis acceleration data is less than an acceleration threshold, set a value of the second state data to a third value, the third value being a value within a value range to which the second state belongs; otherwise, set the value of the second state data to a fourth value, the fourth value being a value within a value range to which a non-second state belongs.
  16. The apparatus according to claim 11 or 12, wherein, in terms of determining the third state data according to the distance sensor data and the light sensor data, the state awareness module is specifically configured to:
    if a value of the distance sensor data is less than an occlusion distance threshold and a value of the light sensor data is less than a light intensity threshold, set a value of the third state data to a fifth value, the fifth value being a value within a value range to which the third state belongs; otherwise, set the value of the third state data to a sixth value, the sixth value being a value within a value range to which a non-third state belongs.
  17. The apparatus according to claim 16, wherein the non-third state comprises a fourth state, a fifth state, and a sixth state;
    in terms of setting the value of the third state data to a sixth value, the state awareness module is specifically configured to:
    if the value of the distance sensor data is less than the occlusion distance threshold and the value of the light sensor data is greater than or equal to the light intensity threshold, set the value of the third state data to a seventh value, the seventh value being a value within a value range to which the fourth state belongs;
    if the value of the distance sensor data is greater than or equal to the occlusion distance threshold and the value of the light sensor data is less than the light intensity threshold, set the value of the third state data to an eighth value, the eighth value being a value within a value range to which the fifth state belongs; and
    if the value of the distance sensor data is greater than or equal to the occlusion distance threshold and the value of the light sensor data is greater than or equal to the light intensity threshold, set the value of the third state data to a ninth value, the ninth value being a value within a value range to which the sixth state belongs.
  18. The apparatus according to any one of claims 10-17, wherein the communication module is further configured to:
    send the state awareness data of the first electronic device to the at least one second electronic device.
  19. An electronic device, comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the programs comprise instructions for performing the steps in the method according to any one of claims 1-9.
  20. A computer-readable storage medium, storing a computer program for electronic data interchange, wherein the computer program causes a computer to perform the method according to any one of claims 1-9.
PCT/CN2021/141207 2021-03-10 2021-12-24 Voice assistant wakeup method and apparatus WO2022188511A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110261894.XA CN115083400A (zh) 2021-03-10 2021-03-10 Voice assistant wakeup method and apparatus
CN202110261894.X 2021-03-10

Publications (1)

Publication Number Publication Date
WO2022188511A1 true WO2022188511A1 (zh) 2022-09-15

Family

ID=83227373

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/141207 WO2022188511A1 (zh) 2021-03-10 2021-12-24 Voice assistant wakeup method and apparatus

Country Status (2)

Country Link
CN (1) CN115083400A (zh)
WO (1) WO2022188511A1 (zh)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107004412A (zh) * 2014-11-28 2017-08-01 Microsoft Technology Licensing, LLC Device arbitration for listening devices
CN107430501A (zh) * 2015-03-08 2017-12-01 Apple Inc. Competing devices responding to voice triggers
US20170357478A1 (en) * 2016-06-11 2017-12-14 Apple Inc. Intelligent device arbitration and control
CN107506166A (zh) * 2017-08-04 2017-12-22 Zhuhai Meizu Technology Co., Ltd. Information prompting method and apparatus, computer apparatus, and readable storage medium
CN108196819A (zh) * 2018-01-30 2018-06-22 Guangdong Genius Technology Co., Ltd. Working-mode switching method and apparatus applied to a terminal, and electronic device
CN109391528A (zh) * 2018-08-31 2019-02-26 Baidu Online Network Technology (Beijing) Co., Ltd. Wakeup method, apparatus, device, and storage medium for a smart voice device
CN110322878A (zh) * 2019-07-01 2019-10-11 Huawei Technologies Co., Ltd. Voice control method, electronic device, and system
CN110335601A (zh) * 2019-07-10 2019-10-15 Samsung Electronics (China) R&D Center Voice assistant device and voice wakeup method thereof
CN111276139A (zh) * 2020-01-07 2020-06-12 Baidu Online Network Technology (Beijing) Co., Ltd. Voice wakeup method and apparatus

Also Published As

Publication number Publication date
CN115083400A (zh) 2022-09-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21929973; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21929973; Country of ref document: EP; Kind code of ref document: A1)